Theory of Computation

Prerequisite Definitions

Alphabets \(\Sigma\) and \(\Gamma\) are finite nonempty sets of symbols.

A string is a finite sequence of zero or more symbols from an alphabet.

\(\Sigma^\star\) is the set of all strings over alphabet \(\Sigma\).

\(\epsilon\) denotes the empty string; it is not a symbol, so \(\epsilon \notin \Sigma\).

A problem is a mapping from strings to strings.

A decision problem is a problem whose output is yes/no (or often accept/reject).

A decision problem can be thought of as the set of all strings for which the function outputs “accept”.

A language is a set of strings, so any set \(S \subseteq \Sigma^\star\) is a language, even \(\emptyset\). Thus, decision problems are equivalent to languages.

Regular Languages

\(L(M)\) is the language accepted by machine \(M\).

A deterministic finite automaton is a 5-tuple \(M = (Q, \Sigma, \delta, q_0, F)\), where \(Q\) is a finite set of states, \(\Sigma\) is the input alphabet, \(\delta : Q \times \Sigma \rightarrow Q\) is the transition function, \(q_0 \in Q\) is the start state, and \(F \subseteq Q\) is the set of accept states.

If \(\delta\) is not fully specified, we assume an implicit transition to an error state.

A deterministic finite automaton \(M\) accepts input string \(w = w_1w_2 \dots w_n\) (\(w_i \in \Sigma\)) if there exists a sequence of states \(r_0, r_1, r_2, \dots, r_n\) (\(r_i \in Q\)) such that (1) \(r_0 = q_0\), (2) \(\delta(r_i, w_{i+1}) = r_{i+1}\) for \(i = 0, \dots, n-1\), and (3) \(r_n \in F\).

\(r_0, r_1, r_2, \dots, r_n\) is the sequence of states visited during the machine’s computation.

A non-deterministic finite automaton is a 5-tuple \(M = (Q, \Sigma, \delta, q_0, F)\), where \(Q\), \(\Sigma\), \(q_0\), and \(F\) are as in a deterministic finite automaton, and \(\delta : Q \times (\Sigma \cup \{\epsilon\}) \rightarrow 2^Q\) is the transition function.

A non-deterministic finite automaton accepts the string \(w = w_1w_2 \dots w_n\) (\(w_i \in \Sigma\)) if there exist a string \(y = y_1y_2 \dots y_m\) (\(y_i \in \Sigma \cup \{\epsilon\}\)) obtained from \(w\) by inserting zero or more \(\epsilon\)s, and a sequence \(r = r_0, r_1, \dots, r_m\) (\(r_i \in Q\)) such that (1) \(r_0 = q_0\), (2) \(r_{i+1} \in \delta(r_i, y_{i+1})\) for \(i = 0, \dots, m-1\), and (3) \(r_m \in F\).

The \(\epsilon\)-closure of any set \(S \subseteq Q\), denoted \(E(S)\), is the set of all states in \(Q\) reachable from some state in \(S\) by following zero or more \(\epsilon\)-transitions.

A non-deterministic finite automaton can be converted to an equivalent deterministic finite automaton.
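The subset construction behind this conversion can be sketched in Python; the encoding below (a dict-of-dicts transition table, with `""` standing for \(\epsilon\)) is an illustrative choice, not a fixed convention.

```python
from itertools import chain

def eps_closure(delta, states):
    """E(S): all states reachable via zero or more epsilon-transitions."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get(q, {}).get("", set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def nfa_to_dfa(delta, q0, finals, sigma):
    """Subset construction: each DFA state is a set of NFA states."""
    start = eps_closure(delta, {q0})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for a in sigma:
            moved = set(chain.from_iterable(
                delta.get(q, {}).get(a, set()) for q in S))
            T = eps_closure(delta, moved)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start, dfa_finals

# Illustrative NFA over {0,1} accepting strings ending in "01".
nfa_delta = {"q0": {"0": {"q0", "q1"}, "1": {"q0"}}, "q1": {"1": {"q2"}}}
```

Running `nfa_to_dfa(nfa_delta, "q0", {"q2"}, "01")` produces a DFA whose states are subsets such as \(\{q_0\}, \{q_0, q_1\}, \{q_0, q_2\}\); a subset is accepting iff it contains an NFA accept state.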

A regular language is any language accepted by some finite automaton. The set of all regular languages is called the class of regular languages.

Regular languages are closed under union, intersection, complement, concatenation, and star.

\(R\) is a regular expression if \(R\) is \(a\) for some \(a \in \Sigma\), \(\epsilon\), \(\emptyset\), \(R_1 \cup R_2\), \(R_1 R_2\), or \(R_1^\star\),

where each \(R_i\) is a regular expression.

Identities of Regular Languages

Languages accepted by DFAs = languages accepted by NFAs = regular languages

If \(L\) is a finite language, \(L\) is regular.

If a computation path of any finite automaton is longer than the number of states it has, there must be a cycle in that computation path.

Every regular language satisfies the pumping condition.

Pumping condition: There exists an integer \(p\) such that for every string \(w \in L\), with \(|w| \geq p\), there exist strings \(x, y, z \in \Sigma^\star\) with \(w = xyz, y \neq \epsilon, |xy| \leq p\) such that for all \(i \geq 0\), \(xy^iz \in L\).

Negation of pumping condition: For all integers \(p\), there exists a string \(w \in L\), with \(|w| \geq p\), for all \(x, y, z \in \Sigma^\star\) with \(w = xyz, y \neq \epsilon, |xy| \leq p\), there exists \(i \geq 0, i \neq 1\) such that \(xy^iz \notin L\).
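A standard application of the negation: \(L = \{0^n1^n : n \geq 0\}\) is not regular. \[\begin{gathered} \text{Given any } p, \text{ choose } w = 0^p1^p \in L, \text{ so } |w| \geq p. \\ \text{For any split } w = xyz \text{ with } y \neq \epsilon \text{ and } |xy| \leq p, \text{ we have } y = 0^j \text{ for some } j \geq 1. \\ \text{Taking } i = 2 \text{ gives } xy^2z = 0^{p+j}1^p \notin L, \\ \text{so } L \text{ fails the pumping condition and cannot be regular.} \end{gathered}\]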

Limitations of finite automata: they have only finitely many states, so they cannot count unboundedly or remember arbitrarily long prefixes, e.g. no finite automaton accepts \(\{0^n1^n : n \geq 0\}\).

Context-Free Languages

A pushdown automaton is a 6-tuple \(M = (Q, \Sigma, \Gamma, \delta, q_0, F)\), where \(Q\) is a finite set of states, \(\Sigma\) is the input alphabet, \(\Gamma\) is the stack alphabet, \(\delta : Q \times (\Sigma \cup \{\epsilon\}) \times (\Gamma \cup \{\epsilon\}) \rightarrow 2^{Q \times (\Gamma \cup \{\epsilon\})}\) is the transition function, \(q_0 \in Q\) is the start state, and \(F \subseteq Q\) is the set of accept states.

Labels: \(a, b \rightarrow c\): if input symbol is \(a\), and top of stack is \(b\), pop it and push \(c\). In other words, input symbol read, stack symbol popped \(\rightarrow\) stack symbol pushed, e.g. \(0, \epsilon \rightarrow \$\).

Suppose \(u, v, w\) are strings of variables and terminals, and there is a rule \(A \rightarrow w\). From the string \(uAv\), we can obtain \(uwv\). We write \(uAv \rightarrow uwv\), and say \(uAv\) yields \(uwv\).

If \(u_1 \rightarrow u_2 \rightarrow \dots \rightarrow u_k\), then \(u_1 \rightarrow^\star u_k\), or \(u_1\) derives \(u_k\). There must be a finite number of arrows between \(u_1\) and \(u_k\).

Given a grammar \(G\), the language derived by the grammar is \(L(G) = \{w \in \Sigma^\star : S \rightarrow^\star w \text{ and } S \text{ is the start variable}\}\)

Context-free grammar: the left-hand side of each rule is a single variable; the right-hand side is any string of variables and terminals. A context-free language is one that can be derived from a context-free grammar. An example context-free grammar is \(G = (V, \Sigma, R, \langle \texttt{EXPR} \rangle)\), where \(V = \{\langle \texttt{EXPR} \rangle, \langle \texttt{TERM} \rangle, \langle \texttt{FACTOR} \rangle\}\), \(\Sigma = \{a, +, \times, (, )\}\), and \(R = \{\langle \texttt{EXPR} \rangle \rightarrow \langle \texttt{EXPR} \rangle + \langle \texttt{TERM} \rangle\ |\ \langle \texttt{TERM} \rangle, \langle \texttt{TERM} \rangle \rightarrow \langle \texttt{TERM} \rangle \times \langle \texttt{FACTOR} \rangle\ |\ \langle \texttt{FACTOR} \rangle, \langle \texttt{FACTOR} \rangle \rightarrow (\langle \texttt{EXPR} \rangle)\ |\ a\}\).

A left-most derivation is a sequence \(S \rightarrow u_1 \rightarrow u_2 \rightarrow \dots \rightarrow u_k \rightarrow w\) where each step applies a rule to the left-most variable. A grammar is ambiguous when it has multiple left-most derivations for the same string.
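For instance, in the arithmetic-expression grammar above (assuming the usual rule \(\langle \texttt{FACTOR} \rangle \rightarrow a\)), a left-most derivation of \(a + a \times a\) is: \[\begin{gathered} \langle \texttt{EXPR} \rangle \rightarrow \langle \texttt{EXPR} \rangle + \langle \texttt{TERM} \rangle \rightarrow \langle \texttt{TERM} \rangle + \langle \texttt{TERM} \rangle \rightarrow \langle \texttt{FACTOR} \rangle + \langle \texttt{TERM} \rangle \rightarrow a + \langle \texttt{TERM} \rangle \\ \rightarrow a + \langle \texttt{TERM} \rangle \times \langle \texttt{FACTOR} \rangle \rightarrow a + \langle \texttt{FACTOR} \rangle \times \langle \texttt{FACTOR} \rangle \rightarrow a + a \times \langle \texttt{FACTOR} \rangle \rightarrow a + a \times a \end{gathered}\]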

A language \(L\) is recognized by a pushdown automaton iff \(L\) is described by a context-free grammar.

Context-free languages are closed under union, concatenation, star.

Recognizable Languages

Differences from previous models: a Turing machine can both read and write the tape, the head can move both left and right, the tape is infinite, and accepting/rejecting takes effect immediately (the machine halts as soon as it enters \(q_{accept}\) or \(q_{reject}\)).

A deterministic Turing machine is a 7-tuple \(M = (Q, \Sigma, \Gamma, \delta, q_0, q_{accept}, q_{reject})\), where \(Q\) is a finite set of states, \(\Sigma\) is the input alphabet not containing the blank symbol \(\text{\textvisiblespace}\), \(\Gamma \supseteq \Sigma \cup \{\text{\textvisiblespace}\}\) is the tape alphabet, \(\delta : Q \times \Gamma \rightarrow Q \times \Gamma \times \{L, R\}\) is the transition function, \(q_0 \in Q\) is the start state, and \(q_{accept}, q_{reject} \in Q\) (\(q_{accept} \neq q_{reject}\)) are the accept and reject states.

Labels: \(a \rightarrow b, R\): if tape symbol is \(a\), write \(b\) and move head right. \(a \rightarrow R\): if tape symbol is \(a\), move head right. \(a, b, c \rightarrow R\): if tape symbol is \(a, b\), or \(c\), move head right.

On input \(x\), a Turing machine can (1) accept, (2) reject, or (3) run in an infinite loop.

The language recognized by a Turing machine \(M\) is \(L(M) = \{x : \text{on input } x, M \text{ halts in } q_{accept}\}\). A language is recognizable if there exists a Turing machine which recognizes it.

Regular languages \(\subseteq\) context-free languages \(\subseteq\) decidable languages \(\subseteq\) recognizable languages

A configuration is a way to describe the entire state of the Turing machine. It is a string \(aqb\) where \(a \in \Gamma^\star, q \in Q, b \in \Gamma^\star\), which indicates that \(q\) is the current state of the Turing machine, the tape content currently is \(ab\) and its head is currently pointing at the first symbol of \(b\). Any Turing machine halts if its configuration is of the form \(aq_{accept}b\), or \(aq_{reject}b\) for any \(ab\). Config(\(i\)) uniquely determines Config(\(i + 1\)).
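The step from Config(\(i\)) to Config(\(i + 1\)) can be sketched in Python; the tuple encoding of configurations and the tiny machine below (accepting strings of 0s of even length) are illustrative choices.

```python
def step(delta, config, blank="_"):
    """One move: config is (state, tape list, head index); returns the successor.

    (state, tape, head) plays the role of the string aqb: tape is ab and
    head points at the first symbol of b.
    """
    state, tape, head = config
    if head == len(tape):
        tape = tape + [blank]          # the tape is unbounded to the right
    new_state, write, move = delta[(state, tape[head])]
    tape = tape[:head] + [write] + tape[head + 1:]
    head = head + 1 if move == "R" else max(head - 1, 0)
    return (new_state, tape, head)

def run(delta, w, q0="q0", blank="_", limit=1000):
    """Iterate Config(i) -> Config(i+1) until q_accept / q_reject (or give up)."""
    config = (q0, list(w) or [blank], 0)
    for _ in range(limit):
        if config[0] in ("q_accept", "q_reject"):
            return config[0]
        config = step(delta, config, blank)
    return "loop?"

# Illustrative machine over {0}: accept iff the number of 0s is even.
delta = {
    ("q0", "0"): ("q1", "0", "R"),
    ("q1", "0"): ("q0", "0", "R"),
    ("q0", "_"): ("q_accept", "_", "R"),
    ("q1", "_"): ("q_reject", "_", "R"),
}
```

Because each configuration uniquely determines the next, `run` is just repeated application of `step` from the start configuration \(q_0w\).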

Every \(k\)-tape Turing machine has an equivalent single tape Turing machine.

If the alphabet of the multitape Turing machine is \(\Gamma\), we can make the single tape Turing machine’s alphabet \((\Gamma \cup \{\#\}) \times \{\texttt{normal}, \texttt{bold}\}\), where \(\#\) separates the contents of the individual tapes and the \(\texttt{bold}\) copy of a symbol marks a head position.

A non-deterministic Turing machine is a 7-tuple \(M = (Q, \Sigma, \Gamma, \delta, q_0, q_{accept}, q_{reject})\), where the only difference from a deterministic Turing machine is the transition function \(\delta : Q \times \Gamma \rightarrow 2^{Q \times \Gamma \times \{L, R\}}\).

A non-deterministic Turing machine accepts its input iff some node in the configuration tree has \(q_{accept}\). It does not accept its input iff no node in the tree has \(q_{accept}\), which includes the case where the tree grows forever (infinite loop) without reaching \(q_{accept}\).

Acceptance of a non-deterministic Turing machine: input \(w\) is accepted if there exist configurations \(c_0, c_1, \dots, c_k\) where (1) \(c_0\) is the start configuration \(q_0w\), (2) each \(c_{i+1}\) follows from \(c_i\) by one legal transition, and (3) \(c_k\) is an accepting configuration.

The outcomes could be accept (some branch accepts), reject (all branches halt and reject), or loop (no branch accepts and some branch runs forever).

A Turing machine is a decider if it halts on all inputs, i.e. on every input it either accepts or rejects.

Every non-deterministic Turing machine has an equivalent deterministic Turing machine. If that non-deterministic Turing machine is a decider, there is an equivalent deterministic Turing machine decider.

Recognizable languages are closed under union, intersection, concatenation, star.

Implementation level description of a multitape Turing machine for \(L = \{x\#x : x \in \{0, 1\}^\star\}\): on input \(w\), (1) scan tape 1 to check that \(w\) contains exactly one \(\#\); reject otherwise; (2) copy the portion of \(w\) before the \(\#\) onto tape 2; (3) compare tape 2, symbol by symbol, against the portion of \(w\) after the \(\#\); (4) accept if all symbols match and both ends are reached together; reject otherwise.

\(\langle O \rangle\) is a string encoding for the object \(O\).

Cardinality of Sets: two sets \(A\) and \(B\) have the same cardinality if there exists a bijection \(f : A \rightarrow B\).

\(\mathbb{N} = \{1, 2, 3, \dots\}\) is the set of all natural numbers. A set is finite if it has a bijection to {1..n} for some natural number \(n\). A set is countably infinite if it has the same cardinality as \(\mathbb{N}\). A set is countable or at most countable if it is finite or countably infinite.
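The countability of \(\Sigma^\star\) can be made concrete by enumerating strings in length-lexicographic order, which pairs them with \(1, 2, 3, \dots\); the sketch below assumes \(\Sigma = \{0, 1\}\).

```python
from itertools import count, product

def strings(sigma=("0", "1")):
    """Yield all of sigma* in length-lexicographic order: e, 0, 1, 00, 01, ..."""
    for n in count(0):
        for tup in product(sigma, repeat=n):
            yield "".join(tup)

# Pairing the i-th string with i gives a bijection between sigma* and N.
```

The first seven strings produced are \(\epsilon\), 0, 1, 00, 01, 10, 11; since every string appears exactly once at a finite position, \(\Sigma^\star\) is countably infinite.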

Any language \(L\) is countable.

The set of all Turing machines is countable.

The set \(\mathcal{B}\) of all infinite bit-sequences is not countable.
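The proof is Cantor’s diagonal argument: given any list \(b^{(1)}, b^{(2)}, \dots\) of infinite bit-sequences, define \[d_i = 1 - b^{(i)}_i \quad \text{for each } i \in \mathbb{N}.\] Then \(d\) differs from \(b^{(i)}\) in position \(i\) for every \(i\), so \(d\) is not on the list; no enumeration of \(\mathcal{B}\) can be complete.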

\(2^{\Sigma^\star}\) is uncountable.


\(A_{TM} = \{\langle M, w \rangle : M \text{ accepts } w\}\) and \(HALT_{TM} = \{\langle M, w \rangle : M \text{ halts on input } w\}\) are recognizable but not decidable.

If \(L\) and \(\overline{L}\) are recognizable, then \(L\) is decidable (and so is \(\overline{L}\)).

\(\overline{A_{TM}}\) is unrecognizable.

Proof template for undecidability via Turing reduction: Reduce a problem known to be undecidable to that language \(L\), usually \(A_{TM}\), i.e. \(A_{TM} \leq_T L\). Assume a Turing machine decider \(R\) for \(L\). Construct \(S\) that decides \(A_{TM}\) using \(R\).
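As a concrete instance of this template, \(A_{TM} \leq_T HALT_{TM}\) shows \(HALT_{TM}\) undecidable. The sketch below is idealized Python in which `R` and `simulate` are hypothetical stand-ins (no actual decider `R` can exist, which is the point of the contradiction).

```python
def S(M, w, R, simulate):
    """Would decide A_TM = {<M, w> : M accepts w}, given a decider R for HALT_TM."""
    if not R(M, w):        # M runs forever on w, so it certainly does not accept
        return False       # reject
    return simulate(M, w)  # M halts on w: the simulation terminates; answer as M does
```

Since \(A_{TM}\) is undecidable, no such \(R\) exists, so \(HALT_{TM}\) is undecidable.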

Runtime of a deterministic Turing machine is a function \(f : \mathbb{N} \rightarrow \mathbb{N}\) given by \(f(n) = \max\limits_{x \in \Sigma^\star, |x| = n} (\text{no. of steps of } M \text{ on input } x)\).

\(TIME(t(n)) = \{\text{language } L : \exists \text{ deterministic Turing machine that decides } L \text{ in time } O(t(n))\}\).

\(P = \bigcup_{c \geq 0} TIME(n^c)\)

\(EXP = \bigcup_{k \geq 0} TIME(2^{n^k})\)

If \(f : \mathbb{N} \rightarrow \mathbb{N}\) is reasonable (time-constructible) and \(f = \Omega(n\log n)\), then \(TIME(f(n)) \subset TIME(f(n)^2)\) (a consequence of the time hierarchy theorem).

\(P \subset EXP\)

Runtime of a non-deterministic Turing machine is the height of the configuration tree.

\(NTIME(t(n)) = \{\text{language } L : \exists \text{ non-deterministic Turing machine that decides } L \text{ in time } t(n)\}\)

\(NP = \bigcup_{c > 0} NTIME(n^c)\), i.e. languages for which it is easy to verify membership.

\(P \subseteq NP\)

\(NP \subseteq EXP\)

Verifier-based definition for \(L \in NP\): there exists a deterministic polytime Turing machine \(V\) and a constant \(c\) such that \(L = \{x \in \Sigma^\star : \exists y \in \Sigma^\star, |y| \leq |x|^c, V \text{ accepts } (x, y)\}\).

A function \(f : \Sigma^\star \rightarrow \Sigma^\star\) is polytime computable if there exists a Turing machine \(M\) that, on input \(x\), runs for time poly(\(|x|\)) and halts with \(f(x)\) written on the tape.

\(f\) is a polytime reduction from language \(A\) to language \(B\), denoted \(A \leq_P B\) if (1) \(f(A) \subseteq B\), (2) \(f(\overline{A}) \subseteq \overline{B}\), and (3) \(f\) is a polytime computable function.

If \(A \leq_P B\) and \(B \in P\) then \(A \in P\).

A language \(L\) is \(NP\)-hard if \(A \leq_P L\) for all \(A \in NP\). A language \(L\) is \(NP\)-complete if \(L\) is \(NP\)-hard and \(L \in NP\).

If \(P \neq NP\), then there exists a language \(L \in NP\) such that \(L \notin P\) and \(L\) is not \(NP\)-complete (Ladner’s theorem).

If (1) \(B\) is \(NP\)-complete, (2) \(C \in NP\), and (3) \(B \leq_P C\), then \(C\) is \(NP\)-complete.

\(CLIQUE\) is a language whose strings are of the form \(\langle G, k \rangle\), where \(G = (V, E)\) is a graph and \(k \in \mathbb{N}\), for which there exists \(U \subseteq V\) with \(|U| \geq k\) such that \(\{u, v\} \in E\) for all distinct vertices \(u, v \in U\).
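The verifier-based view of \(NP\) is easy to instantiate for \(CLIQUE\): the certificate \(y\) is the claimed vertex set \(U\), and checking it takes polynomial time. The encoding below (edge set as a set of frozensets) is an illustrative choice.

```python
from itertools import combinations

def verify_clique(edges, k, candidate):
    """V accepts (<G, k>, U) iff |U| >= k and U is pairwise adjacent in G."""
    if len(candidate) < k:
        return False
    return all(frozenset((u, v)) in edges
               for u, v in combinations(candidate, 2))

# Illustrative graph: a triangle {1, 2, 3} plus a pendant vertex 4.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
```

The check runs over \(\binom{|U|}{2}\) pairs, so it is polynomial in the input size, as the verifier definition requires.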

\(CLIQUE\) is \(NP\)-complete

\(3SAT \leq_P CLIQUE\)


Reductions from \(3SAT\) often involve gadgets: pieces of the constructed instance that encode the truth assignment of each variable and the satisfaction of each clause. In \(3SAT \leq_P CLIQUE\), for example, each clause contributes a triple of vertices (one per literal), with edges between every pair of non-contradictory literals in different clauses.

\(INDSET\) is a language whose strings are of the form \(\langle H, k \rangle\), where \(H = (V, E)\) is a graph and \(k \in \mathbb{N}\), for which there exists \(U \subseteq V\) with \(|U| = k\) such that \(\forall u, v \in U, \{u, v\} \notin E\).

\(VERTEX-COVER\) is a language whose strings are of the form \(\langle H, t \rangle\), where \(H = (V, E)\) is a graph and \(t \in \mathbb{N}\), for which there exists a set \(C \subseteq V\) with \(|C| \leq t\) such that \(\forall \{u, v\} \in E\), at least one of \(u\) and \(v\) is in \(C\).

Let \(G = (V, E)\) be a graph. Then \(\overline{G} = (V, \overline{E})\) where \(\overline{E} = \{\{u, v\} : u \neq v, \{u, v\} \notin E\}\).

\(U\) is a clique in \(G\) iff \(\overline{U} = V \setminus U\) is a vertex cover in \(\overline{G}\). This implies \(G\) has a clique of size \(\geq k\) iff \(\overline{G}\) has a vertex cover of size \(\leq n - k\), where \(|V| = n\).
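This duality is easy to check exhaustively on a small graph; the sketch below brute-forces every vertex subset of an illustrative 4-vertex graph.

```python
from itertools import combinations

def is_clique(edges, U):
    """U is a clique iff every pair of its vertices is an edge."""
    return all(frozenset(p) in edges for p in combinations(U, 2))

def is_vertex_cover(edges, C):
    """C is a vertex cover iff every edge meets C."""
    return all(e & C for e in edges)

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
E_bar = {frozenset(p) for p in combinations(V, 2)} - E

# U is a clique in G  iff  V \ U is a vertex cover in G-bar.
for r in range(len(V) + 1):
    for U in combinations(V, r):
        U = set(U)
        assert is_clique(E, U) == is_vertex_cover(E_bar, V - U)
```

The loop checks all \(2^{|V|}\) subsets, confirming the equivalence on this instance.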



\(SAT\) is \(NP\)-complete via the Cook–Levin tableau construction, where \[\begin{gathered} C = Q \cup \Gamma \cup \{\#\} \\ x_{i, j, s} = true \Leftrightarrow cell[i, j] = s \\ \Phi_{start} = x_{1, 1, \#} \land x_{1, 2, q_{start}} \land x_{1, 3, w_1} \land \dots \\ \land x_{1, n^k-1, \text{\textvisiblespace}} \land x_{1, n^k, \#} \\ \Phi_{cell} = \bigwedge\limits_{1 \leq i, j \leq n^k} \left[\left(\bigvee\limits_{s \in C} x_{i, j, s}\right) \land \right. \\ \left. \bigwedge\limits_{s, t \in C, s \neq t} \lnot (x_{i, j, s} \land x_{i, j, t})\right] \\ \Phi_{moves} = \bigwedge\limits_{1 \leq i < n^k, 1 < j < n^k} \text{(window}[i, j] \text{ is valid)} \\ \Phi_{accept} = \bigvee\limits_{1 \leq i, j \leq n^k} x_{i, j, q_{accept}} \end{gathered}\]

\(coNP = \{\text{language } L : \overline{L} \in NP\}\), i.e. languages for which it is easy to verify non-membership. Machine model for \(L \in coNP\) is when \(x \in L\), all leaves are accepting configurations; otherwise, when \(x \notin L\), there exists one leaf which is a rejecting configuration.

\(coNP\)-complete \(= \{\text{language } B : B \in coNP, \forall A \in coNP, A \leq_P B\}\).

\(NOSAT\) is \(coNP\)-complete.

\(L\) is \(NP\)-complete iff \(\overline{L}\) is \(coNP\)-complete.

Probabilistic Turing Machines

\(RP\), or randomized polynomial time, are the languages \(L\) for which there is a probabilistic Turing machine that, on input \(x\), runs in poly(\(|x|\)) and when \(x \in L\), Pr[reaching accept] \(\geq \frac{1}{2}\); otherwise, when \(x \notin L\), Pr[reaching reject] \(= 1\).

Second definition for \(RP\): it contains languages \(L\) for which there exists a deterministic polytime Turing machine \(V\) such that when \(x \in L\), for at least half of all \(y\) with \(|y| \leq \text{poly}(|x|), V\) accepts \((x, y)\); when \(x \notin L\), for all \(y\) with \(|y| \leq \text{poly}(|x|), V\) rejects \((x, y)\).

Contrast with \(NP\), where \(\forall x \in L\), Pr[reaching accept] \(> 0\), \(\forall x \notin L\), Pr[reaching reject] \(= 1\).

\(RP \subseteq NP\)

\(coRP\): \(\forall x \in L\), Pr[reaching accept] \(= 1\), \(\forall x \notin L\), Pr[reaching reject] \(\geq \frac{1}{2}\).

\(coNP\): \(\forall x \in L\), Pr[reaching accept] \(= 1\), \(\forall x \notin L\), Pr[reaching reject] \(> 0\).

\(BPP\), or bounded error probabilistic polynomial time: \(\forall x \in L\), Pr[reaching accept] \(\geq \frac{2}{3}\), \(\forall x \notin L\), Pr[reaching reject] \(\geq \frac{2}{3}\).

\(RP \subseteq BPP\)

\(coRP \subseteq BPP\)

\(RP(\frac{1}{2}) = RP(\frac{3}{4})\) (proof via amplification)
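Amplification sketch: run the \(RP(\frac{1}{2})\) machine twice independently and accept iff at least one run accepts. When \(x \in L\), \[\Pr[\text{at least one run accepts}] \geq 1 - \left(1 - \tfrac{1}{2}\right)^2 = \tfrac{3}{4},\] and when \(x \notin L\) both runs reject with probability \(1\), so the one-sided error is preserved. Repeating \(t\) times drives the error below \((\frac{1}{2})^t\).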

\(RP\) is closed under composition.

Communication Complexity


The trivial protocol: Alice sends her entire input \(x\) to Bob (\(\log |X|\) bits); Bob computes \(f(x, y)\) and sends back the answer (\(\log |Z|\) bits). Symmetrically, Bob can instead send \(y\) to Alice.

Total: \(\log |X| + \log |Z|\) or \(\log |Y| + \log |Z|\)

A communication protocol is a binary tree where each node is labelled by either \(a_v : X \rightarrow \{L, R\}\) or \(b_v : Y \rightarrow \{L, R\}\) and each leaf is labelled by an element of \(Z\). The depth of the protocol tree is the maximum number of bits sent by the protocol.

The deterministic communication complexity of a function \(f\) is \[\begin{aligned} D(f) & = \min\limits_{\text{tree for } f} \left(\max\limits_{(x, y)}(\text{number of bits}) \right) \\ & = \min\limits_{\text{tree for } f} (\text{depth of tree})\end{aligned}\]

\(D(EQ_n) \leq n + 1\)
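The bound \(D(EQ_n) \leq n + 1\) is just the trivial protocol; a sketch, with inputs as \(n\)-bit strings:

```python
def eq_protocol(x, y):
    """Alice sends all n bits of x; Bob replies with 1 bit: whether x == y."""
    transcript = list(x)             # n bits from Alice
    answer = "1" if x == y else "0"  # 1 bit from Bob, which is also the output
    transcript.append(answer)
    return answer == "1", len(transcript)
```

On \(n\)-bit inputs, exactly \(n + 1\) bits are exchanged, matching the upper bound.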

A rectangle in \(X \times Y\) is a set of the form \(R = A \times B\) where \(A \subseteq X\) and \(B \subseteq Y\). Equivalently, \(R\) is a rectangle iff whenever \((x, y) \in R\) and \((x', y') \in R\), also \((x, y') \in R\) and \((x', y) \in R\).

Let \(T\) be a protocol tree and \(R_v\) be the set of inputs that cause the protocol to arrive at node \(v\). Then \(R_v\) is a rectangle.

A rectangle \(R\) is called \(f\)-monochromatic if \(f(x, y)\) is the same for all \((x, y) \in R\).

Let \(R_i \subset X \times Y\) be a rectangle for \(i = 1, \dots, k\). The set \(\mathcal{R} = \{R_1, \dots, R_k\}\) is called an \(f\)-monochromatic partition (into rectangles) if each \(R_i\) is \(f\)-monochromatic, and each \((x, y) \in X \times Y\) is contained in exactly one \(R_i\).

\(C^{\text{partition}}(f) = \min \{|\mathcal{R}| : \mathcal{R} \text{ is an }f\text{-monochromatic partition}\}\)

For any protocol tree \(T\) computing \(f\), the rectangles \(\{R_v : v \text{ is a leaf in } T\}\) form an \(f\)-monochromatic partition.

\(C^{\text{partition}}(f) \leq \min\limits_{\text{protocol tree } T} (\text{number of leaves in } T)\)

\(D(f) \geq \left \lceil \log_2 C^{\text{partition}}(f) \right \rceil\)

A fooling set \(S \subseteq X \times Y\) is a set where all points \((x, y) \in S\) have the same value \(f(x, y) = z\), and for any distinct points \((x, y)\) and \((x', y')\) in \(S\), either \(f(x, y') \neq z\) or \(f(x', y) \neq z\).

\(C^{\text{partition}}(f) \geq |S| + 1\), where \(S\) is a fooling set for \(f\)

\(D(f) \geq \left \lceil \log_2(|S| + 1) \right \rceil\), where \(S\) is a fooling set for \(f\)
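For example, \(S = \{(x, x) : x \in \{0, 1\}^n\}\) is a fooling set for \(EQ_n\): every point has value \(1\), and for distinct \((x, x)\) and \((x', x')\) we have \(EQ_n(x, x') = 0\). Hence \[D(EQ_n) \geq \left\lceil \log_2(2^n + 1) \right\rceil = n + 1.\]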

\(D(EQ_n) = D(GTE_n) = D(DISJ_n) = n + 1\)

Model for non-deterministic communication complexity: a prover sends both parties a certificate \(z \in \{0, 1\}^k\); Alice computes \(A(x, z) \in \{0, 1\}\), Bob computes \(B(y, z) \in \{0, 1\}\), and the protocol accepts iff both accept.

\(N(f) = \min\limits_{\text{nondet. protocol}}(\text{length of certificate})\). Or, \(N(f) = \min\{k\}\) such that there exist \(A\) and \(B\), for all \(x \in X\), \(y \in Y\), \(f(x, y) = 1 \Rightarrow \exists z \in \{0, 1\}^k, A(x, z) = 1 \land B(y, z) = 1\), \(f(x, y) = 0 \Rightarrow \forall z \in \{0, 1\}^k, A(x, z) = 0 \lor B(y, z) = 0\).

\(N(\lnot DISJ_n) \leq \log n\)
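The certificate for \(\lnot DISJ_n\) is a \(\log n\)-bit index \(i\) at which the two sets intersect; both local checks are trivial. A sketch, with inputs as \(n\)-bit characteristic vectors:

```python
def A(x, z):
    """Alice accepts iff her set contains the certified index."""
    return x[int(z, 2)] == "1"

def B(y, z):
    """Bob accepts iff his set contains the certified index."""
    return y[int(z, 2)] == "1"

def accepts(x, y, z):
    """The protocol accepts iff both parties accept the certificate."""
    return A(x, z) and B(y, z)
```

For intersecting inputs some index is accepted by both parties; for disjoint inputs every certificate is rejected by at least one party, matching the definition of \(N(f)\).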

For all \(f\), \(D(f) = D(\lnot f)\).

\(N(\lnot EQ_n) \leq \log(n) + 1\)

Let \(S\) be a fooling set where \(f(x, y) = 1\) for all \((x, y) \in S\). Then \(N(f) \geq \left \lceil \log_2(|S|) \right \rceil\).

\(N(EQ_n) \geq n\)

The set \(\mathcal{R} = \{R_1, \dots, R_k\}\) is a cover of the \(1\)-entries (by rectangles) if (1) each \(R_i\) is a rectangle containing only \(1\)s, and (2) every \((x, y) \in X \times Y\) with \(f(x, y) = 1\) is contained in at least one \(R_i\).

\(C^{1\text{-cover}}(f) = \min \{|\mathcal{R}| : \mathcal{R} \text{ is a cover of the } 1\text{-entries}\}\)

\(C^{0\text{-cover}}(f) = C^{1\text{-cover}}(\lnot f)\).

\(C^{\text{partition}}(f) \geq C^{1\text{-cover}}(f) + C^{0\text{-cover}}(f)\), since the \(1\)-rectangles and \(0\)-rectangles of a partition form covers of the \(1\)-entries and \(0\)-entries respectively.

\(N(f) = \left \lceil \log_2(C^{1\text{-cover}}(f) ) \right \rceil\)

\(D(f) \geq N(f)\)

\(D(\lnot f) \geq N(f)\), since \(D(\lnot f) = D(f) \geq N(f)\).

Let \(f : X \times Y \rightarrow \{0, 1\}\) be arbitrary, \(C_0\) be a cover of the \(0\)-entries, and \(C_1\) be a cover of the \(1\)-entries. Then \(D(f) = O(\log |C_0| \cdot \log |C_1|)\).

\(D(f) = O(N(f) \cdot N(\lnot f))\)