Locally Testable Codes and PCPs of Almost-Linear Length
ODED GOLDREICH
Weizmann Institute of Science
AND
MADHU SUDAN
Massachusetts Institute of Technology
Abstract. We initiate a systematic study of locally testable codes; that is, error-correcting codes that
admit very efficient membership tests. Specifically, these are codes accompanied with tests that make
a constant number of (random) queries into any given word and reject non-codewords with probability
proportional to their distance from the code.
Locally testable codes are believed to be the combinatorial core of PCPs. However, the relation
is less immediate than commonly believed. Nevertheless, we show that certain PCP systems can be
modified to yield locally testable codes. On the other hand, we adapt techniques that we develop for
the construction of the latter to yield new PCPs.
Our main results are locally testable codes and PCPs of almost-linear length. Specifically, we prove
the existence of the following constructs:
Locally testable binary (linear) codes in which k information bits are encoded by a codeword of
length k · exp(Õ(√log k)). This improves over previous results that either yield codewords of exponential length or obtain almost-quadratic-length codewords for sufficiently large nonbinary alphabets.
PCP systems of almost-linear length for SAT. The length of the proof is n · exp(Õ(√log n)) and verification is performed by a constant number (i.e., 19) of queries, as opposed to previous results that used proof length n^{1+O(1/q)} for verification by q queries.
The novel techniques in use include a random projection of certain codewords and PCP-oracles that preserves local testability, an adaptation of PCP constructions to obtain “linear PCP-oracles” for proving conjunctions of linear conditions, a design of PCPs with some new soundness properties, and a direct construction of locally testable (linear) codes of subexponential length.
An extended abstract of this article appeared in Proceedings of 43rd Symposium on Foundations of
Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 2002.
A preliminary full version [Goldreich and Sudan 2002] appeared on ECCC.
The research of O. Goldreich was supported by the MINERVA Foundation, Germany.
The research of M. Sudan was supported in part by National Science Foundation (NSF) awards CCR-9875511 and CCR-9912342, and by MIT-NTT Award MIT 2001–04.
Authors’ addresses: O. Goldreich, Department of Computer Science, Weizmann Institute of Science,
Rehovot, Israel, e-mail: [email protected]; M. Sudan, MIT CSAIL, 32 Vassar Street,
Cambridge, MA 02139, e-mail: [email protected].
© 2006 ACM 0004-5411/06/0700-0558 $5.00
Journal of the ACM, Vol. 53, No. 4, July 2006, pp. 558–655.
Categories and Subject Descriptors: F.1.3 [Computation by Abstract Devices]: Complexity Mea-
sures and Classes
General Terms: Algorithms, Theory, Verification
Additional Key Words and Phrases: Proof verification, probabilistically checkable proofs, error-
correcting codes, derandomization
1. Introduction
Locally testable codes are (good) error-correcting codes that admit very efficient
codeword tests. Specifically, the testing procedure makes only a constant number of
(random) queries, and should reject non-codewords with probability proportional
to their distance from the code.
Locally testable codes are related to Probabilistically Checkable Proofs (PCPs,
cf. Arora et al. [1998], Arora and Safra [1998], Babai et al. [1991a], and Feige et al.
[1996]) and to Property Testing (cf. Goldreich et al. [1998] and Rubinfeld and Sudan
[1996]). Specifically, locally testable codes can be thought of as combinatorial
counterparts of the complexity-theoretic notion of PCPs, and in fact the use of codes
with related features is implicit in known PCP constructions. Local testability of
codes is also a special case of property testing, and indeed the first collection of
properties that were shown to be testable also yield constructions of locally testable
codes [Blum et al. 1993].
Locally testable codes were introduced in passing, by Friedl and Sudan [1995]
and Rubinfeld and Sudan [1996]. However, despite the central role of locally testable
codes in complexity theoretic and algorithmic research, they have received little
explicit attention so far. The primary goal of this work is to initiate a systematic
study of locally testable codes. In particular, we focus on the construction of locally
testable codes over a binary alphabet and on the development of techniques to
reduce the alphabet size of locally testable codes. Studying the length of locally
testable codes, we obtain for the first time (even for nonbinary alphabets), codes of
almost-linear length.
Some Well-Known Examples. To motivate some of the parameters of concern,
we start by considering some “trivial codes” that are easily testable. For example,
the “code” that contains all strings of a given length is trivially testable (by accepting
each string without even looking at it). It is also easy to test the “code” that consists
of a single codeword (e.g., given an arbitrary string w , pick a random index i and
verify that w and the single codeword agree at the ith coordinate). Thus, the
concept of locally testable codes is interesting mainly in the case of “good” codes;
that is, codes that have “many” codewords that are pairwise at “large” distance from
each other.
One nontrivial code allowing efficient testing is the Hadamard code: the code-
words are linear functions represented by their values on all possible evaluation
points. The number of codewords in Hadamard codes grows with the length of
the code, and the pairwise distance between codewords is half of the length of the
code. So this code does not admit trivial tests as above. It turns out that in this case
codeword testing amounts to linearity testing [Blum et al. 1993], and this can be
performed efficiently, though the analysis is quite nontrivial.
The drawback of the Hadamard code is that k bits of information are encoded using a codeword of length 2^k. (The k information bits represent the k coefficients of a linear function {0,1}^k → {0,1}, and the bits in the codeword correspond to all possible evaluation points.)
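The linearity test of Blum et al. [1993] mentioned above can be transcribed as a short sketch (ours, for illustration; function names are not from the paper). It checks the identity w(x) + w(y) = w(x ⊕ y) at random points, which every Hadamard codeword satisfies with probability 1:

```python
import random

def hadamard_encode(msg_bits):
    """Hadamard codeword: the 2^k evaluations of the linear map
    x -> <a, x> mod 2 defined by the k message bits a."""
    k = len(msg_bits)
    return [sum(a * ((x >> i) & 1) for i, a in enumerate(msg_bits)) % 2
            for x in range(2 ** k)]

def blr_linearity_test(w, k, trials=1):
    """BLR test: pick random points x, y and check w(x) + w(y) = w(x XOR y).
    Codewords pass every trial; the (nontrivial) analysis shows that words
    far from all linear functions are rejected with constant probability."""
    for _ in range(trials):
        x = random.randrange(2 ** k)
        y = random.randrange(2 ** k)
        if (w[x] + w[y]) % 2 != w[x ^ y]:
            return False  # reject
    return True  # accept
```

For example, `hadamard_encode([1, 0, 1])` passes every trial, whereas the truth table of a nonlinear function such as AND is rejected with constant probability per trial.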
A Basic Question. The question addressed in this work is whether one can
hope for a better relation between the number of information bits, denoted k, and
the length of the codeword, denoted n. Specifically, can n be polynomial or even
linear in k? For a sufficiently large nonbinary alphabet, Friedl and Sudan [1995]
showed that n can be made nearly quadratic in k. The main contribution of this
article is the demonstration of the existence of locally testable codes in which n is
almost-linear in k (i.e., n = k^{1+o(1)}), even for the binary alphabet.
In Section 2.1, we provide a precise definition of locally testable codes and state
our main result regarding them. But before doing so, we discuss the relation between
locally testable codes and three other notions (i.e., PCP, property testing and locally
decodable codes).
1.1. RELATION TO PCP. As mentioned earlier, locally testable codes are closely
related to Probabilistically Checkable Proofs (PCPs). Recall that a PCP system is
defined by a (probabilistic) verifier that is given a pair of strings—a purported
theorem (assertion) and a claimed proof (evidence)—such that if the theorem is
true, then there exists a proof such that the verifier accepts; and if the assertion
is not true then no evidence causes the verifier to accept (with high probability).
Furthermore, PCP verifiers achieve their goals by making only a small number of
queries to the proof, which is given as an oracle. The PCP Theorem [Arora et al.
1998; Arora and Safra 1998] shows how to construct PCP verifiers that make only
a constant number of queries to the proof-oracle.
PCPs achieve their strong features by implicitly relying on objects related to
locally testable codes. Indeed the construction of codes over large alphabets that
are testable via a small (yet not necessarily constant) number of queries lies at the
heart of many PCPs. It is a common belief, among PCP enthusiasts, that the PCP
Theorem [Arora et al. 1998; Arora and Safra 1998] already provides (binary) locally
testable codes. This belief relates to a stronger property of the proof of the PCP
theorem which actually provides a transformation from standard witnesses for, say
SAT, to PCP-proof-oracles, such that transformed strings are accepted with proba-
bility one by the PCP verifier. When applied to an instance of SAT that is a tautology,
the map typically induces a good error-correcting code mapping k information bits
to codewords of length poly(k) (or almost linear in k, when using Polishchuk and
Spielman [1994]), which are pairwise quite far from each other. The common belief
is that the PCP-verifier also yields a codeword test. However, this is not quite true:
typically, the analysis only guarantees that each passing oracle can be “decoded”
to a corresponding NP-witness, but encoding the decoded NP-witness does not
necessarily yield a string that is close to the oracle. In particular, this allows for
oracles that are accepted with high probability to be far from any valid codeword.
Furthermore, it is not necessarily the case that only codewords pass the test with
probability one. For example, part of the proof-oracle (in typical PCPs) is supposed
to encode an m-variate polynomial of individual degree d, yet the (standard) PCP-
verifier will also accept the encoding of any m-variate polynomial of total degree
m · d (and the “decoding” procedure will work in this case too).
We conclude that the known constructions of PCPs as such do not yield locally
testable codes. However, we show that many known PCP constructions can be
modified to yield good codes with efficient codeword tests. We stress that these
modifications are nontrivial and furthermore are unnatural in the context of PCP.
Yet, they do yield coding results of the type we seek (e.g., see Theorem 2.3).
On the other hand, a technique that emerges naturally in the context of our study
of efficient codeword tests yields improved results on the length of efficient PCPs.
Specifically, we obtain (constant-query) PCP systems that utilize oracles that are
shorter than known before (see Theorem 2.5).
1.2. RELATION TO PROPERTY TESTING. Property testing is the study of highly efficient approximation algorithms (tests) for determining whether an input is close to satisfying a fixed property. Specifically, for a property (Boolean function) Π, a test may query an oracle x at few positions and accept if Π(x) is true, and reject with high probability if Π(x̃) is not true for every x̃ that is “close” to x. Property testing
was defined in Rubinfeld and Sudan [1996] (where the focus was on algebraic
properties) and studied systematically in Goldreich et al. [1998] (where the focus
was on combinatorial properties).
Viewed from the perspective of property testing, the tester of a locally testable
code is a tester for the property of being a member of the code, where the notion of
“closeness” is based on Hamming distance. Furthermore, in the coding setting, it is
especially natural that one is not interested in exactly deciding whether or not the
input is a codeword, but rather in the “approximate” distance of the input from
the code (i.e., whether it is a codeword or far from any codeword). Thus, lo-
cally testable codes are especially well-connected to the theme of property testing.
Indeed the first property tests in the literature (e.g., linearity tests [Blum et al. 1993],
low-degree tests [Babai et al. 1991b, 1991a; Rubinfeld and Sudan 1996; Friedl and
Sudan 1995]) can be interpreted as yielding some forms of locally testable codes.
More recent works on algebraic testing [Bellare et al. 1996; Alon et al. 2003] high-
light the connections to codes more explicitly. Our work also uses the results and
techniques developed in the context of low-degree testing. However, by focusing
on the codes explicitly, we highlight some missing connections. In particular, most
of the prior work focussed on codes over large alphabets and did not show how to
go from testable codes over large alphabets to codes over small alphabets. In this
work, we address such issues explicitly and resolve them to derive our main results.
Furthermore, we focus on codes that can be tested by making a constant number
of queries.
1.3. RELATION TO LOCALLY DECODABLE CODES. A task that is somewhat com-
plementary to the task investigated in this article, is the task of local decoding. That
is, we refer to the project of constructing codes that have very efficient (sublinear
time) implicit decoding algorithms. Specifically, given oracle access to a string that
is close to some unknown codeword, the decoding procedure should recover any
desired bit of the corresponding message while making, say, a constant number of
queries to the input oracle. Codes that have such decoding algorithms are called lo-
cally decodable codes. While local testability and local decodability appear related,
no general theorems linking the two tasks are known. In fact, gaps in the perfor-
mance of known constructions for the two tasks suggest that local decodability is
“harder” to achieve than local testability. Our results confirm this intuition:
We show the existence of almost-linear (i.e., n = k^{1+o(1)}) length (binary) codes having codeword tests that make a constant number of queries. In contrast, it was shown that locally decodable codes cannot have almost-linear length [Katz and Trevisan 2000]: that is, if q queries are used for recovery, then n = Ω(k^{1+(1/(q−1))}).
For a (large) alphabet Σ that can be viewed as a vector space over some field F, we show almost-linear length F-linear codes having codeword tests that make only two queries. In contrast, it was shown that F-linear codes that allow for local decodability by two queries require exponential length [Goldreich et al. 2002]. Specifically, an F-linear code over the alphabet Σ = F^ℓ is a linear space over F (but not necessarily over F^ℓ). In our codes (which support two-query tests) it holds that ℓ = exp(√log k) and |F| = O(ℓ), while n < k^{1+(log k)^{−0.4999}} = k^{1+o(1)}. In contrast, the lower bound on n (for two-query decoding) established in Goldreich et al. [2002] asserts that n > exp(Ω(k/(ℓ · f)²)) in case F = GF(2^f), which yields n > exp(k^{1−o(1)}) for the relevant values of ℓ = exp(√log k) = k^{o(1)} and f = log₂ O(ℓ).
1.4. ORGANIZATION AND PREVIOUS VERSIONS. Section 2 provides a formal
treatment of locally testable codes and PCPs. It also contains a (formal) statement
of our main results as well as a high-level discussion of our main techniques (Sec-
tion 2.3). In Section 3, we present direct and self-contained constructions of locally
testable codes (albeit not achieving the best results). We stress that these construc-
tions make no reference to PCP, although they do use low-degree tests. Sections 1–3
occupy less than a third of the length of the article.
Our best constructions of locally testable codes are presented in Section 5, where
we adapt standard PCP constructions and combine them with the construction
presented in Section 3.2. This section takes about half of the length of the article.
In Section 4, we adapt some of the ideas presented in Section 3.2 in order to derive
improved PCPs. We stress that Sections 4 and 5 can be read independently of one
another, whereas they both depend on Section 3.2.
Subsequent works and open problems are discussed in Section 6. In particular,
we mention that the subsequent works of Ben-Sasson et al. [2004], Ben-Sasson and
Sudan [2005], and Dinur [2006] do not provide strong codeword tests (but rather
only weak ones).
The current version differs from our preliminary report [Goldreich and Sudan
2002] in several aspects, the most important of which are discussed next.
In Section 2.1, we present two definitions of locally testable codes, whereas only
the weaker one has appeared in Goldreich and Sudan [2002]. Furthermore, in
order to obtain locally testable codes under the stronger definition, we use a
different analysis of the constructions presented in Section 3.2.
Section 5 has been extensively revised, while narrowing the scope of some of
the secondary results (e.g., the two composition theorems (i.e., Theorems 5.13
and 5.16)). These modifications do not affect our main results.
In addition, the presentation has been expanded and high-level overviews (most
notably Sections 2.3 and 5.3.1) were added.
2. Formal Setting
Throughout this work, all oracle machines (i.e., codeword testers and PCP verifiers)
are nonadaptive; that is, they determine their queries based solely on their input and
random choices. This is in contrast to adaptive oracle machines that may determine
their queries based on answers obtained to prior queries. Since our focus is on
positive results, this only makes our results stronger.
Throughout this work, all logarithms are to base 2, and for a natural number n
we denote [n] ≝ {1, ..., n}. We often use an arbitrary finite set, other than [n], as an index set to some sequence. For any finite set S, we denote by ⟨e_i : i ∈ S⟩ the sequence of e_i's, where the order in the sequence is induced by an (often unspecified) total order of the set S.
2.1. CODES. We consider codes mapping a sequence of k input symbols into a sequence of n ≥ k symbols over the same alphabet, denoted Σ, which may (but need not) be the binary alphabet. Such a generic code is denoted by C : Σ^k → Σ^n, and the elements of {C(a) : a ∈ Σ^k} ⊆ Σ^n are called codewords (of C). Sometimes, it will be convenient to view such codes as maps C : Σ^k × [n] → Σ.
Throughout this article, the integers k and n are to be thought of as parameters, and Σ may depend on them. Thus, we actually discuss infinite families of codes (which are associated with infinite sets of possible k's), and whenever we say that some quantity of the code is a constant we mean that this quantity is constant for the entire family (of codes). In particular, the rate of a code is the functional dependence of n on k, which we wish to be almost-linear. Typically, we seek to have Σ as small as possible, desire that |Σ| be a constant (i.e., Σ does not depend on k), and are most content when Σ = {0, 1} (i.e., a binary code).
The distance between n-long sequences over Σ is defined in the natural manner; that is, for u, v ∈ Σ^n, the distance Δ(u, v) is defined as the number of locations on which u and v differ (i.e., Δ(u, v) ≝ |{i : u_i ≠ v_i}|, where u = u_1 ··· u_n ∈ Σ^n and v = v_1 ··· v_n ∈ Σ^n). The relative distance between u and v, denoted δ(u, v), is the ratio Δ(u, v)/n. To avoid technical difficulties, we define the distance between sequences of different length to equal the length of the longer sequence.
The distance of a code C : Σ^k → Σ^n is the minimum distance between its codewords; that is, min_{a≠b} {Δ(C(a), C(b))}. Throughout this work, we focus on codes of “large distance”; specifically, codes C : Σ^k → Σ^n of distance Ω(n).
The distance of w ∈ Σ^n from a code C : Σ^k → Σ^n, denoted Δ_C(w), is the minimum distance between w and the codewords; that is, Δ_C(w) ≝ min_a {Δ(w, C(a))}. An interesting case is that of noncodewords that are “relatively far from the code”, which may mean that their distance from the code is greater than (say) a third of the distance of the code.
We will sometimes say that w ∈ Σ^n is ε-far from v (respectively, from the code C), meaning that Δ(w, v) ≥ ε · n (respectively, Δ_C(w) ≥ ε · n). Similarly, we say that w is ε-close to v (respectively, to C) if Δ(w, v) ≤ ε · n (respectively, Δ_C(w) ≤ ε · n). Note that we have allowed w to be both ε-far and ε-close to v (respectively, C) in case its relative distance to v (respectively, C) is exactly ε.
2.1.1. Codeword Tests: Weak and Strong Versions. Loosely speaking, by a codeword test (for the code C : Σ^k → Σ^n) we mean a randomized (nonadaptive) oracle machine, called a tester, that is given oracle access to w ∈ Σ^n (viewed as a function w : [n] → Σ). The tester is required to (always) accept every codeword and reject with (relatively) high probability every oracle that is “far” from the code. Indeed, since our focus is on positive results, we use a strict formulation
in which the tester is required to accept each codeword with probability 1. (This
corresponds to “perfect completeness” in the PCP setting.)
The following two definitions differ by what is required from the tester in case
the oracle is not a codeword. The weaker definition (which is the one that appears in
our preliminary report [Goldreich and Sudan 2002]) requires that, for every w ∈ Σ^n, given oracle access to w, the tester rejects with probability Ω(Δ_C(w)/n) − o(1). An alternative formulation (of the same notion) is that, for some function f(n) = o(n), every w ∈ Σ^n that is at distance greater than f(n) from C is rejected with probability Ω(Δ_C(w)/n). Either way, this definition (i.e., Definition 2.1) effectively requires nothing with respect to noncodewords that are relatively close to the code (i.e., are (f(n)/n)-close to C). A stronger and smoother definition (i.e., Definition 2.2) requires that every noncodeword w is rejected with probability Ω(Δ_C(w)/n).
Definition 2.1 (Codeword Tests, Weak Definition). A randomized (nonadaptive) oracle machine M is called a weak codeword test for C : Σ^k → Σ^n if it satisfies the following two conditions:

(1) Accepting codewords: For any a ∈ Σ^k, given oracle access to w = C(a), machine M accepts with probability 1. That is, Pr[M^{C(a)}(k, n, Σ) = 1] = 1, for any a ∈ Σ^k.

(2) Rejection of noncodewords: For some constant c > 0 and function f(n) = o(n), for every w ∈ Σ^n, given oracle access to w, machine M rejects with probability at least (c · Δ_C(w) − f(n))/n. That is, Pr[M^w(k, n, Σ) ≠ 1] ≥ (c · Δ_C(w) − f(n))/n, for any w ∈ Σ^n.

We say that the code C : Σ^k → Σ^n is weakly locally testable if it has a weak codeword test that makes a constant number of queries.
Definition 2.2 (Codeword Tests, Strong Definition). A randomized (nonadaptive) oracle machine M is called a strong codeword test for C : Σ^k → Σ^n (or just a codeword test for C : Σ^k → Σ^n) if it satisfies the following two conditions:

(1) Accepting codewords: As in Definition 2.1, for any a ∈ Σ^k, given oracle access to w = C(a), machine M accepts with probability 1.

(2) Rejection of noncodewords: For some constant c > 0 and for every w ∈ Σ^n, given oracle access to w, machine M rejects with probability at least c · Δ_C(w)/n. That is, Pr[M^w(k, n, Σ) ≠ 1] ≥ c · Δ_C(w)/n, for any w ∈ Σ^n.

We say that the code C : Σ^k → Σ^n is locally testable if it has a strong codeword test that makes a constant number of queries.
Our constructions satisfy the stronger definition (i.e., Definition 2.2), but we
consider the weaker definition (i.e., Definition 2.1) to be of sufficient interest to
warrant presentation here. Furthermore, in two cases (i.e., in the proof of Claim 3.3.2
and in Section 5.1), we find it instructive to establish the weak definition before
turning to the strong one.
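As a toy illustration of Definition 2.2 (ours, not from the paper): the repetition code C : {0,1} → {0,1}^n, whose only codewords are 0^n and 1^n, has a two-query strong codeword test. A word with a p = d/n fraction of minority symbols is rejected with probability 2p(1 − p) ≥ p per trial (since p ≤ 1/2), so condition (2) holds with c = 1:

```python
import random

def repetition_test(w, trials=1):
    """Two-query codeword test for the repetition code {0^n, 1^n}:
    read two independently random positions and accept iff they agree.
    Codewords are accepted with probability 1; a word at distance d from
    the code is rejected with probability 2(d/n)(1 - d/n) >= d/n per trial."""
    n = len(w)
    for _ in range(trials):
        i = random.randrange(n)
        j = random.randrange(n)
        if w[i] != w[j]:
            return False  # reject
    return True  # accept
```

Of course, this code carries only one information bit; the point of the definitions above is to achieve such testability for good codes.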
We comment that one may consider various natural variants on the two defini-
tions. For example, in both cases, we have required that the rejection probability
grows linearly with the distance of the oracle from the code. More generally, one
may consider requiring a slower (e.g., polynomial) growth rate. Another example is
relaxing our requirement that every codeword is accepted with probability 1. More
generally, one may allow codewords to be rejected with some small probability.
(Note that this relaxation (with respect to codewords) may be odd if coupled with
the stronger definition regarding non-codewords (i.e., the one in Definition 2.2).)
Relation to Property Testing. Codeword tests are indeed a special type of prop-
erty testers (as defined in Rubinfeld and Sudan [1996] and Goldreich et al. [1998]).
However, in the “property testing” literature one typically prefers providing the
tester with a distance parameter and requiring that the tester rejects all objects that
are that far from the property with probability at least 2/3 (rather than with probabil-
ity proportional to their distance). In such a case, the query complexity is measured
as a function of the distance parameter and is constant only when the latter pa-
rameter is a constant fraction of the maximum possible distance. Strong codeword
testers yield property testers with complexity that is inversely proportional to the
distance parameter, whereas the complexity of testers derived from weak codeword
tests is “well behaved” only for large values of the distance parameter.
2.1.2. Our Main Results. Our main result regarding codes is the following:
THEOREM 2.3 (LOCALLY TESTABLE BINARY CODES OF k^{1+o(1)} LENGTH). For infinitely many k's, there exist locally testable codes with binary alphabet such that n = exp(Õ(√log k)) · k = k^{1+o(1)}. Furthermore, these codes are linear and have distance Ω(n).
Theorem 2.3 (as well as Part 2 of Theorem 2.4) vastly improves over the Hadamard code (in which n = 2^k), which is the only locally testable binary code
previously known. Theorem 2.3 is proven by combining Part (1) of the following
Theorem 2.4 with nonstandard modifications of standard PCP constructions. We
emphasize the fact that Theorem 2.4, which is weaker than Theorem 2.3, is proven
without relying on any PCP construction.
THEOREM 2.4 (WEAKER RESULTS PROVED BY DIRECT/SELF-CONTAINED CONSTRUCTIONS).

(1) For infinitely many k's, there exist locally testable codes with nonbinary alphabet Σ such that n = exp(Õ(√log k)) · k = k^{1+o(1)} and log |Σ| = exp(Õ(√log k)) = k^{o(1)}. Furthermore, the tester makes two queries and the code is F-linear,¹ where Σ = F^ℓ.

(2) For every c > 1 and infinitely many k's, there exist locally testable codes over the binary alphabet such that n < k^c. Furthermore, the code is linear.

In both cases, the codes have distance Ω(n).

Part (1) improves over the work of Friedl and Sudan [1995], which only yields n = k^{2+o(1)}.
The set of k's for which Theorems 2.3 and 2.4 hold is reasonably dense; in all cases, if k is in the set, then the next integer in the set is smaller than k^{1+o(1)}. Specifically, in Part (1) (respectively, Part (2)) of Theorem 2.4, if k is in the set, then the next integer in the set is smaller than exp((log k)^{0.51}) · k (respectively, O(poly(log k) · k)).
¹ A code over the alphabet Σ = F^ℓ is called an F-linear code if its codewords form a linear space over F (but not necessarily over F^ℓ).
Caveat. Theorems 2.3 and 2.4 are proven via the probabilistic method, and
thus do not yield an explicit construction. Such a construction has been found
subsequently by Ben-Sasson et al. [2003b]. (See further discussion in Section 6.)
Comment. The result of Theorem 2.3 holds also when using testers that make
three queries. On the other hand, (good) binary codes cannot be tested using two
queries (cf. Ben-Sasson et al. [2003a]).
2.2. PCP: STANDARD DEFINITIONS AND NEW RESULTS. Following Bellare et al. [1998], we consider PCP systems for promise problems (cf. Even et al. [1984]). (Recall that a promise problem is a pair of nonintersecting subsets of {0, 1}*, which do not necessarily cover {0, 1}*.) A probabilistically checkable proof (PCP) system for a promise problem Π = (Π_YES, Π_NO) is a probabilistic polynomial-time (nonadaptive) oracle machine (called a verifier), denoted V, satisfying

Completeness. For every x ∈ Π_YES, there exists an oracle π_x such that V, on input x and access to oracle π_x, always accepts x.

Soundness. For every x ∈ Π_NO and every oracle π, machine V, on input x and access to oracle π, rejects x with probability at least 1/2.

Actually, we will allow the soundness error to be a constant that is arbitrarily close to 1/2.
As usual, we focus on PCP systems with logarithmic randomness complexity and
constant query complexity. This means that, without loss of generality, the length of
the oracle is polynomial in the length of the input. However, we aim at PCP systems
that utilize oracles that are of almost-linear length. Our main result regarding such
PCP systems is the following:
THEOREM 2.5. There exists an almost-linear time randomized reduction of SAT to a promise problem that has a 19-query PCP system that utilizes oracles of length exp(Õ(√log n)) · n = n^{1+o(1)}, where n is the length of the input. Furthermore, the reduction maps k-bit inputs to n-bit inputs such that n = exp(Õ(√log k)) · k = k^{1+o(1)}.
This should be compared to the PCP system for SAT of Polishchuk and Spielman [1994], which, when utilizing oracles of length n^{1+ε}, makes O(1/ε) queries. In contrast, our PCP system utilizes oracles of length n^{1+o(1)} while making 19 queries.
Caveat. Theorem 2.5 does not yield a PCP for SAT, but rather a PCP for a
promise problem to which SAT can be reduced (via a randomized reduction that runs
in almost-linear time). (The reduction merely augments the input by a random string
of an adequate length; thus allowing the application of a probabilistic argument
analogous to the one underlying the proof of Part (1) of Theorem 2.4.) A PCP for
SAT itself has been found subsequently by Ben-Sasson et al. [2003b].
2.3. OUR TECHNIQUES. In this section, we highlight some of the techniques
used in this article.
Random Projection of Codes and PCPs. We derive locally testable codes
(respectively, PCPs) of shorter length by showing that a random projection of
the original codewords (respectively, proofs) on a smaller number of coordinates
maintains the testability of the original construct. In Section 3.2, this process is
applied to a specific code; in Section 4.1, it is applied to any two-prover one-round
proof system; and in Section 4.2, it is applied to certain three-prover proof systems.
In retrospect, one may say that in all cases we show that in certain multiprover
proof systems one may randomly “trim” all the provers to the “size” of the smallest
one, where the size of a prover is defined as the length of (the explicit description
of) its strategy (which, in turn, is exponential in the length of the queries that the
prover answers).
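In coding terms, such a projection is just a random puncturing: keep a uniformly chosen subset of the coordinates. The mechanical step can be sketched as follows (ours, for illustration; the substance of Sections 3.2 and 4 is proving that local testability survives this trimming):

```python
import random

def random_projection(n, m, seed=None):
    """Pick m of the n coordinate positions uniformly at random
    (without repetition), defining the punctured code/proof-oracle."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), m))

def project(word, coords):
    """Restrict a word (codeword or proof-oracle) to the chosen coordinates."""
    return [word[i] for i in coords]
```

A tester for the projected object then simulates the original tester, answering its queries from the surviving coordinates.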
Extending the Paradigm of Code-Concatenation to Codeword Testing. In the
1960s, the notion of concatenated codes was introduced by Forney [1966] as a
technique for reducing the alphabet size of codes. Our constructions of locally
testable codes extend this technique by showing that in certain cases codeword
testers for the “outer” and “inner” codes yield a codeword tester for the concatenated
code. Specifically, we refer to cases where the inner code allows direct access to
the values checked by the tester of the outer code, and furthermore that this direct
access supports self-correction (cf. Blum et al. [1993]). Two examples appear in
Sections 3.3 and 3.4, respectively. We also explore the related composition of locally
testable codes with (inner-verifier) PCPs; see Section 5.3.
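The encoding side of Forney's concatenation is simple: each symbol of the outer codeword, viewed as an element of a large alphabet, is replaced by its inner encoding. A minimal sketch (ours, using a Hadamard inner code for concreteness; the paper's contribution is the testing side, which additionally requires direct access and self-correction):

```python
def hadamard_bits(sym, ell):
    """Inner code: encode an ell-bit symbol by the 2^ell parities <sym, x>,
    i.e., the parity of (sym AND x) over all x in {0,1}^ell."""
    return [bin(sym & x).count("1") % 2 for x in range(2 ** ell)]

def concatenate(outer_codeword, inner_encode):
    """Forney concatenation: replace every outer symbol (over a large
    alphabet) by its binary inner encoding."""
    bits = []
    for sym in outer_codeword:
        bits.extend(inner_encode(sym))
    return bits
```

For example, `concatenate([1, 2], lambda s: hadamard_bits(s, 2))` turns a 2-symbol outer word over a 4-letter alphabet into an 8-bit binary word.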
Developing a Theory of PCPs for Linear Assertions. When composing a
locally-testable code with an inner-verifier, we may obtain a locally-testable code
over a smaller alphabet, but will this code preserve the linearity of the original
code? This seems to require that the inner-verifier uses proof-oracles that are linear
in the main input, which seems plausible when the assertion itself is linear (i.e.,
asserts that the input resides in some linear subspace). The suitable definitions and
composition results are developed in Section 5.3, while, in Section 5.4, we show
that known PCP constructions can be modified to maintain the linearity of the
assertions.
Two Notions of Strong Soundness for PCP. When composing a locally-testable
code with an inner-verifier, we may preserve the strong testability of the original
codeword test if the inner-verifier satisfies two (adequate) “strong” soundness con-
ditions. The first condition requires the rejection of “noncanonical” proofs, whereas
the second condition requires the rejection of nonproofs with probability propor-
tional to their distance from a valid proof. We believe that these notions may be of
independent interest, and refer the reader to Section 5.3.1 for a general presentation
of these notions.
We comment that unexpected technical problems arise when composing such
PCPs with themselves (respectively, with locally testable codes): the issue is the
preservation of strong soundness (rather than standard soundness) by the compo-
sition. This issue is addressed in Section 5.3.4 (respectively, Section 5.3.3).
3. Direct Constructions of Short Locally Testable Codes
In this section, we prove Theorem 2.4. In particular, for every c > 1, we present locally testable codes that map k bits of information to codewords of length k^c.
These codes are presented in a direct and self-contained manner (without using
any general PCPs). Although we do not use any variant of the PCP Theorem,
our constructions are somewhat related to known PCP constructions in the sense
that we use constructs and analyses that appear, at least implicitly, in the “PCP
literature” (e.g., in Arora et al. [1998] and Arora and Safra [1998]). Specifically,
we will use results regarding “low-degree tests” that were proven for deriving the
PCP Theorem [Arora et al. 1998; Arora and Safra 1998]. We stress that we do not
use the more complex ingredients of the proof of the PCP Theorem; that is, we
neither use the (complex) parallelization procedure nor the “proof-composition”
paradigm of Arora and Safra [1998] and Arora et al. [1998]. We note that the proof-
composition paradigm is more complex than the classical notion of concatenated
codes [Forney 1966] used below.
We start by describing (in Section 3.1) a code over a large alphabet, which we
refer to as the FS/RS code. This code, which is a direct interpretation of low-
degree tests, was proposed by Friedl and Sudan [1995] and Rubinfeld and Sudan
[1996]. The length of codewords (in this code) turns out to be nearly quadratic in
the length of the encoded information (even when using the best possible analysis
of low-degree tests). To reduce the length of the code to being nearly linear, we
introduce (in Section 3.2) a “random projection” technique. This establishes Part (1)
of Theorem 2.4 (which refers to codes over large alphabets, and will be used to
establish Theorem 2.3). In Sections 3.3 and 3.4, we apply the “code concatenation”
technique to reduce the alphabet size of the codes, while preserving local testability.
Specifically, in Section 3.3 we obtain locally testable codes over a much smaller
(albeit nonbinary) alphabet, whereas in Section 3.4 we obtain a binary code, thus
establishing Part 2 of Theorem 2.4.
3.1. THE BASIC CODE (FS/RS-CODE). The FS/RS code is based on low-degree multivariate polynomials over finite fields. We thus start with the relevant preliminaries. Let F be a finite field, and m, d be integer parameters such that m ≤ d < |F|. Denote by P_{m,d} the set of m-variate polynomials of total degree d over F. We represent each p ∈ P_{m,d} by the list of its C(m+d, m) coefficients; thus,

    |P_{m,d}| = |F|^{C(m+d, m)} < |F|^{O(d/m)^m},     (1)

where the inequality holds because m ≤ d and C(2d, m) < (2d)^m/(m!) = O(d/m)^m.
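As a quick numeric sanity check of Eq. (1), the coefficient count C(m+d, m) and the bounding chain can be evaluated directly; the toy parameters below are our own illustrative choices:

```python
from math import comb, factorial

# Toy parameters: m = 2 variables, total degree d = 3, field F_5.
m, d, q = 2, 3, 5

# Number of coefficients of an m-variate polynomial of total degree d.
coeffs = comb(m + d, m)
assert coeffs == 10

# |P_{m,d}| = |F|^C(m+d, m), the left-hand side of Eq. (1).
assert q ** coeffs == 5 ** 10

# The chain behind the inequality: C(m+d, m) <= C(2d, m) <= (2d)^m / m!.
assert comb(m + d, m) <= comb(2 * d, m) <= (2 * d) ** m // factorial(m)
```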
Denote by L_m the set of lines over F^m, where each line is defined by two points a, b ∈ F^m; that is, for a = (a_1, ..., a_m) and b = (b_1, ..., b_m), the line ℓ_{a,b} consists of the set of |F| points {ℓ_{a,b}(t) := ((a_1 + t·b_1), ..., (a_m + t·b_m)) : t ∈ F}.
The Code. We consider a code C : P_{m,d} → Σ^{|L_m|}, where Σ = F^{d+1}; that is, C assigns each p ∈ P_{m,d} a (|L_m|-long) sequence of Σ-values. For every p ∈ P_{m,d}, the codeword C(p) is a sequence of |L_m| univariate polynomials, each of degree d, such that the element in the sequence associated with ℓ ∈ L_m is the univariate polynomial that represents the values of the polynomial p : F^m → F on the line ℓ. We view L_m as the set of indices (or coordinates) in any w ∈ Σ^{|L_m|}; that is, we view w as a function from L_m to Σ. Thus, for any ℓ ∈ L_m, we denote by w(ℓ) the symbol in w having index ℓ. Viewing the code C as a mapping C : P_{m,d} × L_m → Σ such that C(p, ·) is the encoding (or codeword) of p ∈ P_{m,d}, we have that for every ℓ_{a,b} ∈ L_m the univariate polynomial q_{a,b} = C(p, ℓ_{a,b}) satisfies q_{a,b}(z) = p(ℓ_{a,b}(z)), where p(ℓ_{a,b}(z)) = p((a_1 + b_1·z), ..., (a_m + b_m·z)). Note that, indeed, if p has total degree d then, for every ℓ_{a,b} ∈ L_m, the univariate polynomial q_{a,b} = C(p, ℓ_{a,b}) has degree at most d.
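The encoding of a single coordinate (the restriction of p to a line, as a degree-d univariate polynomial) can be sketched for toy parameters. The concrete polynomial `poly`, the prime p = 13, and the interpolation-based recovery of coefficients are our own illustrative choices, not part of the paper's formal construction:

```python
p, d = 13, 2   # assumption: a small prime field, so arithmetic is mod p

def poly(x, y):
    """A fixed bivariate polynomial of total degree d = 2 over F_13."""
    return (3 * x * x + 5 * x * y + 7 * y + 1) % p

def restrict_to_line(a, b):
    """Coefficients (c_0, ..., c_d) of q(t) = poly(l_{a,b}(t)), recovered by
    Lagrange interpolation through the d+1 points t = 0, ..., d."""
    ts = list(range(d + 1))
    vals = [poly((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p) for t in ts]
    coeffs = [0] * (d + 1)
    for i, ti in enumerate(ts):
        num, denom = [1], 1                # build the basis polynomial L_i(t)
        for j, tj in enumerate(ts):
            if j == i:
                continue
            new = [0] * (len(num) + 1)     # multiply num by (t - tj)
            for k, c in enumerate(num):
                new[k] = (new[k] - tj * c) % p
                new[k + 1] = (new[k + 1] + c) % p
            num = new
            denom = (denom * (ti - tj)) % p
        inv = pow(denom, p - 2, p)         # modular inverse via Fermat
        for k in range(d + 1):
            coeffs[k] = (coeffs[k] + vals[i] * num[k] * inv) % p
    return coeffs

# The codeword symbol for the line l_{a,b} agrees with poly on the whole line.
a, b = (2, 5), (1, 3)
q_coeffs = restrict_to_line(a, b)
for t in range(p):
    q_t = sum(c * pow(t, k, p) for k, c in enumerate(q_coeffs)) % p
    assert q_t == poly((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p)
```

Since p has total degree d, its restriction to a line is a univariate polynomial of degree at most d, so interpolation through d+1 points recovers it exactly.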
Parameters. To evaluate the basic parameters of the code C, let us consider it as mapping Σ^k → Σ^n, where indeed n = |L_m| = |F|^{2m} and k = ⌊log|P_{m,d}| / log|Σ|⌋. Note that

    k = ⌊log|P_{m,d}| / log|Σ|⌋ = ⌊C(m+d, d)·log|F| / ((d+1)·log|F|)⌋ = ⌊C(m+d, m)/(d+1)⌋,     (2)

which, for m ≪ d, is approximated by (d/m)^m/d ≈ (d/m)^m (the factor of d is immaterial here). Using |F| = poly(d), we have n = |F|^{2m} = poly(d^m), and so k ≈ (d/m)^m is polynomially related to n = |F|^{2m} (provided, say, that m < √d). Note that the code has large distance, because the different C(p)'s tend to disagree on most lines.
The Codeword Test. The test consists of selecting two random lines that share a random point, and checking that the univariate polynomials associated with these lines yield the same value for the shared point. That is, to check whether w : L_m → Σ is a codeword, we select a random point r ∈ F^m, and two random lines ℓ, ℓ′ going through r (i.e., ℓ(t) = r and ℓ′(t′) = r for some t, t′ ∈ F), obtain the answer polynomials q and q′ (i.e., q = w(ℓ) and q′ = w(ℓ′)), and check whether they agree on the shared point (i.e., whether q(t) = q′(t′)). This test is essentially the one analyzed in Arora et al. [1998], where it is shown that (for |F| = poly(d)) if the oracle is ε-far from the code then this fact is detected with probability Ω(ε).

We comment that in Arora et al. [1998] the test is described in terms of two oracles: a point oracle f : F^m → F (viewed as the primary or "real" input) and a line oracle g : L_m → F^{d+1} (viewed as an auxiliary or additional oracle). Indeed, we will also revert to this view in our analysis. Unfortunately, using oracles having different ranges would complicate the code-concatenation (presented in Section 3.3), and this is the reason that we maintain explicitly only the line-oracle (and refer to the point-oracle only in the analysis). Note that a line-oracle can be used to define a corresponding point-oracle in several natural ways. For example, we may consider the (random) value given to each point by a random line passing through this point, or consider the value given to each point by a canonical line passing through this point.
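The two-line consistency test can be sketched as follows; the specific polynomial f and the lazy (callable) representation of the line-oracle are illustrative assumptions. For a valid codeword, every trial must accept:

```python
import random

p, d = 13, 2   # assumption: small prime field; f below has total degree d

def f(x, y):
    return (4 * x * y + 2 * x + 9) % p

def line_point(a, b, t):
    return ((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p)

def codeword(a, b):
    """Honest line-oracle: f restricted to l_{a,b}, as a callable on t."""
    return lambda t: f(*line_point(a, b, t))

def random_line_through(r):
    """A random line l with l(t0) = r, for a random t0."""
    b = (random.randrange(p), random.randrange(p))
    t0 = random.randrange(p)
    a = ((r[0] - t0 * b[0]) % p, (r[1] - t0 * b[1]) % p)
    return a, b, t0

def two_line_test(word):
    """One round: two random lines through a random shared point must agree."""
    r = (random.randrange(p), random.randrange(p))
    a1, b1, t1 = random_line_through(r)
    a2, b2, t2 = random_line_through(r)
    return word(a1, b1)(t1) == word(a2, b2)(t2)

assert all(two_line_test(codeword) for _ in range(200))  # valid codewords pass
```

Detecting non-codewords with probability Ω(ε) is the nontrivial part of the analysis and is not exhibited by this sketch.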
3.2. RANDOM PROJECTION OF THE FS/RS-CODE. Our aim in this section is to tighten the relationship between k and n in locally testable codes. Starting with the FS/RS-Code, in order to get the best possible relation between n and k, one needs to use an analysis (of the low-degree test) that allows for |F| to be as small as possible when compared to d. Based on the analysis of Polishchuk and Spielman [1994], it was shown in Friedl and Sudan [1995] that it suffices to use |F| = Θ(d). However, even with this (best possible) analysis, we are still left with n that is quadratic in |F|^m, whereas k = o(d^m) = o(|F|^m). This quadratic blowup comes from the fact that the number of lines (over F^m) is quadratic in the number of points, which in turn upper-bounds the number of coefficients of a (generic) m-variate polynomial (over F). Thus, to obtain n almost-linear in k, we must use a different code.
Overview of Our Construction. Our main idea here is to project the FS/RS code to a randomly chosen subset of the coordinates. Thus, our code is essentially just a projection of the FS/RS code to a random subset of lines over F^m. This subset will have size that is almost-linear in |F|^m, and consequently the code will have almost-linear length. We note that, with overwhelmingly high probability (over the choice of this random subset), approximately the same number of selected lines pass through each point of F^m. It is also easy to see that, with overwhelmingly high probability, the resulting code maintains the distance properties of the basic FS/RS-Code. Most of this subsection will be devoted to proving that the resulting code also maintains the local testability properties of the FS/RS-Code.
The Projected Code. In what follows, we will fix positive integers m, d and a field F. We will assume log log d ≤ m ≤ d and |F| = Θ(d). Our code will be over the alphabet Σ = F^{d+1}, corresponding to the vector space of univariate polynomials of degree at most d over F. For the sake of concreteness, we will assume that the univariate polynomial p(x) = Σ_{i=0}^{d} c_i·x^i is represented by the vector (c_0, ..., c_d).
Let L = L_m denote the collection of all lines in F^m. For a (multi-)set R ⊆ L, we define the code C_R : P_{m,d} → Σ^R such that, for every p ∈ P_{m,d} and ℓ ∈ R, the ℓth symbol in the encoding C_R(p) is the polynomial obtained by restricting p to the line ℓ. In the following definition, we view the code as a mapping from P_{m,d} × R to Σ.
Construction 3.1. Let F be a finite field, m ≤ d be integers, and Σ = F^{d+1}. We define C_R : P_{m,d} × R → Σ such that, for every p ∈ P_{m,d} and ℓ ∈ R ⊆ L_m, it holds that C(p, ℓ) is the univariate polynomial that represents the values of the m-variate polynomial p on the line ℓ. That is, for every e ∈ F, the polynomial C(p, ℓ) evaluated at e yields the value p(ℓ(e)).
Thus, our encoding is simply a projection of the FS/RS code to the coordinates in R, where R is an arbitrary subset of L.

In what follows, we will show that if R is chosen uniformly at random (with repetitions) from L and |R| = Θ(m·|F|^m·log|F|), then the code C_R is locally testable. (To shorten our sentences, we will simply say "R is chosen randomly" and mean that the elements of the multi-set R are chosen uniformly at random from L.) We next describe the parameters of the code, and then describe the codeword test.
The Basic Parameters. We consider the information length k, the block length n, and the relative distance of the code. To compare k with n, let us consider the code C_R as a mapping Σ^k → Σ^n, where n = |R| = O(m·|F|^m·log|F|) and k = ⌊log|P_{m,d}| / log|Σ|⌋ (as in Eq. (2)). Then, k ≈ (d/m)^m/d = Ω(d)^{m−1}/m^m and, for |F| = O(d), we have n = O(m·|F|^m·log|F|) = Õ(O(d)^m). In this case, log|Σ| = log|F|^{d+1} = Õ(d). We highlight two possible settings of the parameters:
(1) Using d = m^m, we get k = Ω(d)^{m−2} = m^{m²−2m−o(m)} and n = Õ(O(d)^m) = m^{m²+o(m)}, which yields

    n = exp(Õ(√log k))·k and log|Σ| = exp(Õ(√log k)).     (3)

(2) For any constant e > 1, letting d = m^e, we get k = Ω(m^e)^{m−1}/m^m = m^{(e−1−o(1))·m} and n = Õ(O(d)^m) = m^{(e+o(1))·m}, which yields

    n = k^{(e+o(1))/(e−1)} and log|Σ| < (log k)^e.     (4)
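The formulas for k and n can be evaluated numerically; the concrete choice |F| = 2d and the use of log base 2 are our own illustrative assumptions, and the asymptotic exponent (e+o(1))/(e−1) emerges only for large m:

```python
from math import comb, log

def params(m: int, d: int, field_size: int):
    """k = floor(C(m+d, m)/(d+1)) as in Eq. (2), and n of order m*|F|^m*log|F|."""
    k = comb(m + d, m) // (d + 1)
    n = m * field_size ** m * round(log(field_size, 2))
    return k, n

# Setting (2) with e = 2: d = m^2 and |F| = Theta(d) (here, simply |F| = 2d).
for m in (3, 5, 8):
    d = m ** 2
    k, n = params(m, d, 2 * d)
    print(m, k, n, round(log(n) / log(k), 2))  # ratio tends to (e+o(1))/(e-1) = 2
```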
We next show that when |F| = Θ(d) and R is a randomly chosen set of size Θ(m·|F|^m·log|F|), the code has constant relative distance, with overwhelmingly high probability. This can be proven by upper-bounding the probability that the distance between any two codewords is too small. However, it is somewhat less cumbersome to first prove that the code is "linear" in an adequate sense (as defined below), and next to upper-bound the probability that any (nonzero) codeword has too small weight. (Furthermore, for sake of later use, we need to establish this linearity property anyhow.)
F-Linearity. We say that a subset C ⊆ Σ^n, where Σ = F^{d+1}, is F-linear if C is a linear subspace of (F^{d+1})^n when viewed as a vector space over F. In other words, for every x, y ∈ C and α, β ∈ F, it is the case that αx + βy ∈ C, where αx = (αx_1, ..., αx_n) for x = (x_1, ..., x_n) ∈ (F^{d+1})^n and αx_i denotes the usual scalar-vector product.
PROPOSITION 3.1. For every R, the code C_R is F-linear. That is, for every p′, p″ ∈ P_{m,d} and α, β ∈ F, it holds that α·C_R(p′, ·) + β·C_R(p″, ·) equals C_R(q, ·), for some q ∈ P_{m,d}.
PROOF. Letting p(ℓ) denote the univariate polynomial representing the values of the polynomial p when restricted to the line ℓ, we have C_R(p′, ℓ) = p′(ℓ) and C_R(p″, ℓ) = p″(ℓ). Thus, for every ℓ ∈ R, it holds that

    C_R(αp′ + βp″, ℓ) = (αp′ + βp″)(ℓ) = α·p′(ℓ) + β·p″(ℓ) = α·C_R(p′, ℓ) + β·C_R(p″, ℓ),

where the second equality follows from the fact that (αp′ + βp″)(x) = α·p′(x) + β·p″(x) for every x ∈ F^m. Hence, C_R(αp′ + βp″, ·) = α·C_R(p′, ·) + β·C_R(p″, ·), and the proposition follows (indeed, with q = αp′ + βp″).
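Proposition 3.1 can be checked numerically on a toy instance; the two polynomials, scalars, and line below are arbitrary illustrative choices, and we compare value tables rather than coefficient vectors (which suffices, since a degree-d univariate polynomial is determined by its values):

```python
q, d = 13, 2   # assumption: a small prime field F_13 and degree-2 polynomials

def p1(x, y): return (x * x + 2 * y) % q
def p2(x, y): return (3 * x * y + 7) % q
alpha, beta = 4, 9

def restrict_vals(poly, a, b):
    """Value table of poly on the line l_{a,b} (determines the symbol)."""
    return [poly((a[0] + t * b[0]) % q, (a[1] + t * b[1]) % q) for t in range(q)]

a, b = (1, 2), (3, 4)
# alpha*C_R(p1, l) + beta*C_R(p2, l) ...
lhs = [(alpha * u + beta * v) % q
       for u, v in zip(restrict_vals(p1, a, b), restrict_vals(p2, a, b))]
# ... equals C_R(alpha*p1 + beta*p2, l), i.e., F-linearity on this line.
combo = lambda x, y: (alpha * p1(x, y) + beta * p2(x, y)) % q
assert lhs == restrict_vals(combo, a, b)
```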
The Relative Distance of C_R. We now turn back to analyze the relative distance of C_R.
PROPOSITION 3.2. With probability 1 − o(1), for a randomly chosen R, the code C_R has relative distance at least δ = Ω(1 − d/|F|) > 0.

We mention that the error probability in this proposition is exponentially vanishing (as a function of |F|^m).
PROOF. Intuitively, the code C_L has relative distance at least 1 − d/|F|, and so projection on a random subset of coordinates should leave it with relative distance at least δ = Ω(1 − d/|F|). Below, we formally prove this assertion for δ = (1/2)·(1 − d/|F|), but the same argument can be used to establish δ = c·(1 − d/|F|), for any constant c < 1.

Since the code C_R is F-linear (see Proposition 3.1), the distance between any two different codewords is captured by the weight of some nonzero codeword. Thus, it suffices to lower-bound the weight of all nonzero codewords in C_R. We fix a non-zero polynomial p ∈ P_{m,d}, and consider the corresponding codeword C_R(p). Our aim is to prove that the probability that C_R(p) has relative weight less than δ is at most o(|P_{m,d}|^{−1}).
We first consider C_L(p). By the well-known property of multivariate polynomials, we have that p evaluates to nonzero values on at least a 1 − d/|F| fraction of the points in F^m. Extending this fact to lines, we can infer immediately that the restriction of p to a 1 − d/|F| = 2δ fraction of the lines is nonzero. (This is true since one can sample a random line by picking a random point x and picking a random line through x, and if p is nonzero at x, it must be nonzero on the line.) So in order for C_R(p) to have fewer than a δ fraction of nonzero coordinates, it must be that p is non-zero on fewer than a δ fraction of the lines in R. But we also have that the expected fraction of lines in R where p is non-zero, when R is chosen at random, is at least 2δ. Applying the (multiplicative) Chernoff Bound,² we get that the probability that this fraction turns out to be less than δ, when R is chosen at random, is at most exp(−Ω(δ·|R|)) = o(|F|^{−|F|^m}) = o(|P_{m,d}|^{−1}). Thus, the probability that C_R(p) has relative weight less than δ is at most o(|P_{m,d}|^{−1}). Taking the union bound over all possible polynomials p, we conclude that the probability that C_R has a codeword of relative weight less than δ is at most o(1).
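The two facts used in this proof, that a nonzero degree-d polynomial is nonzero on at least a 1 − d/|F| fraction of the points, and hence on at least that fraction of the lines, can be verified exhaustively for a toy polynomial (our own illustrative choice):

```python
p, d = 13, 2   # assumption: small prime field; g below is a nonzero degree-2 poly

def g(x, y):
    return (x * x + 3 * y + 5) % p

# Schwartz-Zippel on points: at least a 1 - d/p fraction of points are nonzero.
points = [(x, y) for x in range(p) for y in range(p)]
nonzero_pts = sum(1 for (x, y) in points if g(x, y) != 0)
assert nonzero_pts >= (1 - d / p) * len(points)

# Hence at least the same fraction of lines has a nonzero restriction of g.
def line_nonzero(a, b):
    return any(g((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p) for t in range(p))

lines = [((a0, a1), (b0, b1)) for a0 in range(p) for a1 in range(p)
         for b0 in range(p) for b1 in range(p)]
frac = sum(line_nonzero(a, b) for a, b in lines) / len(lines)
assert frac >= 1 - d / p
```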
We now move to describing the codeword test.
The Codeword Test. The test for the code C_R is a variant of the points-vs-lines test (cf. Arora et al. [1998]) that accesses two oracles, one giving the value of a function f : F^m → F and the other supposedly giving the restriction of f to lines in F^m. The original test picks a random point x ∈ F^m and a random line ℓ ∈ L passing through x and verifies that f(x) agrees with the supposed value of the restriction of f to the line ℓ. In implementing this test, we modify it in two ways: Firstly, we do not have the value of the restriction of f to each line, but rather only to lines in R. So we modify the above test by picking a random ℓ ∈ R that passes through x. Secondly, we do not (actually) have oracle access to the value of f on individual points, but rather to the value of the restriction of f to various lines (i.e., those in R). So we use the values assigned to these lines in order to define such a point oracle. This can be done in various ways, and we use one of them.³
Specifically, given a set of lines R, we associate with each point x ∈ F^m some fixed line (in R), denoted ℓ_x, that passes through x. Note that we do not assume that these lines are distinct (i.e., that ℓ_x ≠ ℓ_y for x ≠ y). Also, we do not assume that such lines exist for each point (i.e., that for every point there are lines passing through it). Still, with overwhelmingly high probability, over the choice of R, the set R covers all points (i.e., each point resides on some line in R). This discussion leads to the following codeword test.
Construction 3.2. Given oracle access to w : R → Σ, which is supposedly a codeword of C_R, the test proceeds as follows:

(1) Pick x ∈ F^m uniformly at random, and let ℓ_x ∈ R be an arbitrary line that passes through x. If no such line exists, halt with output 1 (representing accept).

(2) Pick ℓ ∈ R uniformly among the lines that pass through x. That is, select ℓ ∈ R with probability m_x(ℓ)/t_x, where m_x(ℓ′) denotes the number of occurrences of x on the line ℓ′, and t_x = Σ_{ℓ′∈R} m_x(ℓ′).

(3) Query w at ℓ_x and ℓ, and denote the answers by h_x = w(ℓ_x) and h = w(ℓ). (Recall that h_x and h are univariate polynomials of degree d.)
² The (multiplicative) Chernoff Bound (see, e.g., Motwani and Raghavan [1995]) is extensively used in this work. It refers to independent random variables, denoted ζ_1, ..., ζ_n, where each ζ_i ∈ [0, 1]. Letting ζ := (1/n)·Σ_{i=1}^{n} ζ_i denote the average of these random variables, and p := E[ζ] = (1/n)·Σ_i E[ζ_i] denote its expectation, the bound asserts that, for every γ ∈ [0, 1], the probability that ζ is not (1 ± γ)·p is exponentially vanishing in γ²·p·n. That is, Pr[|ζ − p| ≥ γ·p] < 2·exp(−γ²·p·n/3).
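The bound in Footnote 2 can be illustrated empirically for Bernoulli variables; the parameters below are arbitrary illustrative choices, and the simulation merely checks that the observed deviation rate does not exceed the stated bound:

```python
import random
from math import exp

def violation_rate(n, prob, gamma, trials=2000, seed=0):
    """Fraction of trials where the mean of n Bernoulli(prob) samples
    falls outside (1 +/- gamma) * prob."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        mean = sum(rng.random() < prob for _ in range(n)) / n
        if abs(mean - prob) >= gamma * prob:
            bad += 1
    return bad / trials

n, prob, gamma = 500, 0.3, 0.5
observed = violation_rate(n, prob, gamma)
bound = 2 * exp(-gamma ** 2 * prob * n / 3)   # the bound from Footnote 2
assert observed <= bound
```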
³ An alternative to the fixed canonical lines used below is to use a random line passing through the point. This defines a randomized function, but the analysis can be applied to it just as well. Indeed, this would coincide with the "two-line test" analyzed (differently) in our preliminary report [Goldreich and Sudan 2002].
(4) Point-vs-Line Test: Let α, β ∈ F be such that ℓ_x(α) = ℓ(β) = x. If h_x(α) = h(β), then halt with output 1. Otherwise halt with output 0.

(Note that ℓ_x is effectively a query to a point oracle, whereas ℓ is indeed a query to a line oracle.)

The line ℓ_x will be called the canonical line of x.
Note that the codeword test makes two queries to w (i.e., for w(ℓ_x) and w(ℓ)). We analyze the codeword test next.
Analysis. It is obvious that the test accepts a valid codeword with probability 1. Below, we give a lower bound on the rejection probability of noncodewords. As in Proposition 3.2, the lower bound holds for almost all choices of R of size n = Θ(m·|F|^m·log|F|).
LEMMA 3.3. The following holds, for some n = O(m·|F|^m·log|F|): For at least a 1 − o(1) fraction of the possible choices of R of size n, every w ∈ Σ^n is rejected by the codeword test (of Construction 3.2) with probability Ω(δ_{C_R}(w)), where δ_{C_R}(w) is the relative distance of w from the code C_R (i.e., δ_{C_R}(w) = Δ_{C_R}(w)/n).

The above lemma improves over the probability bound Ω(δ_{C_R}(w)) − o(1) that was established in our preliminary report [Goldreich and Sudan 2002] (for a related test). We mention that the fraction of exceptional sets in Lemma 3.3 can be bounded by |F|^{−tm}, where t = n/(m·|F|^m·log|F|).
PROOF. We start with an overview of the proof. We consider two cases regarding a point function, denoted f_w : F^m → F, determined by (some of) the entries of w (which are univariate polynomials supposedly representing the values of some polynomial when restricted to the corresponding lines). Specifically, we refer to the function f_w : F^m → F defined by setting f_w(x) according to the value assigned by w to the canonical line (ℓ_x ∈ R) that passes through x. We consider two cases regarding the distance of f_w from P_{m,d}:
(1) The first case is that this relative distance is large (e.g., larger than one-third of δ_{C_R}(w)). This case is handled by proving that, for all but an o(1) fraction of the choices of R, the Point-vs-Line Test rejects with probability that is linearly related to the distance of f_w from P_{m,d}. (Note that the claim refers only to the portion of w that is used to define f_w, and holds regardless of the rest of w.)

The proof of this claim (Claim 3.3.2) is the most interesting part of the current analysis. It amounts to showing that, for most choices of R, the modified (Point-vs-Line) low-degree test that selects lines in R performs as well as the original low-degree test (which selects lines in L). The proof relies on the following observations:

(a) Each possible function f : F^m → F determines an optimal answer (i.e., a univariate polynomial) for each possible line-query, which in turn assigns each possible line-query a "rejection value" that is merely the fraction of points on the line for which the (optimal) answer disagrees with the value of f.

(b) The rejection probability of the original low-degree test is linearly related to the average of these rejection values, where the average is taken over all lines.

(c) The modified test refers to a (random) set of line-queries, and so its rejection probability is linearly related to the average of the aforementioned rejection values, where the average is taken over the said set.

The punch-line is that, for a random set (of adequate size), with overwhelmingly high probability, the average of the values assigned to elements in the set approximates the average of all values. The error probability is sufficiently small to allow for the application of a (non-straightforward) union bound on all possible w's; see Step (2) in the proof of Claim 3.3.2.
(2) The second case is that f_w is relatively close to P_{m,d} (e.g., f_w is δ_{C_R}(w)/3-close to P_{m,d}). Suppose that the function f_w is actually a low-degree polynomial (i.e., f_w ∈ P_{m,d}). Still, the sequence of univariate polynomials representing the values of f_w on the lines in R may be different from the sequence w. This distance is "accounted for" by the fact that, for all but an o(1) fraction of the choices of R, the Point-vs-Line Test will cause rejection with probability that is linearly related to the distance of C_R(f_w) from w. The claim can be extended to the general case in which f_w is only close to P_{m,d}; for details see Claim 3.3.3.
We comment that, in (the first case of) our analysis, the function f_w is viewed as the primary object, and w is viewed as a potential proof of the claim f_w ∈ P_{m,d}. This perspective is not natural in the context of testing whether w is a codeword, because in the latter context w is the primary object and f_w is an auxiliary object. Still, this is a legitimate mental experiment. As for the analysis itself, we note that in the first case the testing features of the low-degree test are used in a natural way (because f_w is "far" from being a low-degree polynomial). Indeed, in this case we refer to the standard analysis of low-degree tests. In contrast, in the second case, the low-degree test is invoked in a nonstandard situation (i.e., f_w is "close" to being a low-degree polynomial), and a straightforward analysis shows that the test will reject when the proof oracle (i.e., the line oracle) is "far" from being correct (i.e., being the restriction of f_w to R).
Turning to the actual proof, we present some notation first. As above, we view w as a function from R to Σ. We denote by f_w : F^m → F the function defined by the values assigned to points by their canonical lines; that is, f_w(x) = v if the polynomial h_x = w(ℓ_x) assigns the value v to x, where ℓ_x is the canonical line passing through x (i.e., if ℓ_x(α) = x then v = h_x(α)). Let p_w ∈ P_{m,d} denote the m-variate degree d polynomial closest to f_w (breaking ties arbitrarily). Let δ(w) = δ_{C_R}(w) be the (relative) distance of w from the code C_R. In accordance with the motivational discussion, we consider the following auxiliary distances:
(1) δ_ldp(w) denotes the relative distance of f_w from p_w (or, equivalently, from P_{m,d}).

(2) δ_agr(w) denotes the relative distance between the values assigned by p_w to lines in R and w itself; that is, δ_agr(w) = Pr_{ℓ∈R}[p_w(ℓ) ≠ w(ℓ)], where (as above) p_w(ℓ_{a,b}) denotes the univariate polynomial in z ∈ F that represents p_w(a + z·b).
Using this notation, we have

    δ_ldp(w) = Δ(f_w, p_w)/|F^m| and δ_agr(w) = Δ(w, C_R(p_w))/|R|.     (5)
Clearly, Δ_{C_R}(w) ≤ Δ(w, C_R(p_w)), and so δ_agr(w) ≥ δ(w). In Claim 3.3.3, we will show that the tester (of Construction 3.2) rejects w with probability at least (δ_agr(w)/2) − δ_ldp(w), which establishes the lemma in case δ_ldp(w) ≤ δ(w)/3. On the other hand, in Claim 3.3.2, we will show that the tester (of Construction 3.2) rejects w with probability at least Ω(δ_ldp(w)), which will take care of the case δ_ldp(w) ≥ δ(w)/3. Thus, either way, the lemma follows.
Before proving the aforementioned claims, we establish a useful fact regarding typical sets R. Specifically, we show that they cover all points almost-uniformly (see Claim 3.3.1). In particular, such sets will contain canonical lines for all points.

A Tedious Comment. Throughout this work, when we talk about the number of lines (respectively, selecting a random line) in R that pass through a specific point x, we actually mean the number of pairs (respectively, selecting a random pair) of the form (ℓ, e) ∈ R × F such that ℓ(e) = x. Thus, lines that contain multiple occurrences of a point are counted multiple times and are selected with greater probability. Indeed, the only lines containing multiple occurrences of a point are the constant lines, and the reader can safely ignore them (because R is unlikely to contain more than a few such lines). Still, the rest of the analysis (like Step (2) of Construction 3.2) does refer to the general case (where constant lines occur and are dealt with using the above convention).
CLAIM 3.3.1. For all but at most an o(1) fraction of the possible choices of R, it holds that, for each point x ∈ F^m, there are (1 ± 0.1)·|R|/|F|^{m−1} lines in R that pass through x.

We mention that the constant 0.1 is quite arbitrary, and can be replaced by any other constant > 0 (while affecting the hidden constant in |R| = O(m·|F|^m·log|F|)).
PROOF. For every fixed x ∈ F^m and e ∈ F, we consider the number of lines ℓ ∈ R satisfying ℓ(e) = x. The expected number of such lines, for a random R, is exactly |R|/|F|^m. Using the Chernoff Bound (see Footnote 2), we infer that the probability that the number of such lines deviates from (1 ± 0.1)·|R|/|F|^m is exponentially vanishing in |R|/|F|^m = Ω(m·log|F|). Thus, by a suitable choice of the latter constant, the aforementioned probability is o(|F|^{−(m+1)}), and using a union bound on all possible x ∈ F^m and e ∈ F, the claim follows.
For the next claim, we rephrase the Point-vs-Line test in terms of the associated functions f : F^m → F and g : R → Σ, where in our application f = f_w and g(ℓ) = w(ℓ) (for every ℓ ∈ R). The test picks x ∈ F^m uniformly at random and ℓ ∈ R uniformly among the lines passing through x. For β such that ℓ(β) = x, it verifies that h(β) = f(x), where h is the univariate polynomial g(ℓ). Let δ_ld(f) = δ_{P_{m,d}}(f) denote the relative distance of f from P_{m,d}. Indeed, δ_ld(f_w) = δ_ldp(w).
CLAIM 3.3.2. For all but at most an o(1) fraction of the possible choices of R, the following holds: For every f : F^m → F and g : R → Σ, the probability that the Point-vs-Line Test rejects the oracle pair (f, g) is at least Ω(δ_ld(f)).

In particular, we may conclude that our codeword test rejects any w with probability at least Ω(δ_ldp(w)). Note that Claim 3.3.2 does not refer to the distance of g from being a "consistent" line-oracle (let alone one that corresponds to f). Thus, Claim 3.3.2 effectively refers to all possible g's (or rather to the best possible g) that may be paired with f.
PROOF. We prove the claim in two steps. First, we fix f : F^m → F and prove that for all but an exp(−Ω(δ_ld(f)·|R|)) fraction of the R's, the rejection probability of the test on input f and any g : R → Σ is Ω(δ_ld(f)). Next, we use a union bound over an appropriate collection of functions, to prove that no function f is rejected with probability less than Ω(δ_ld(f)). An interesting aspect of the second step is that we analyze the performance of the test on all functions by using a union bound only on a small fraction of the possible functions.
Step 1—Overview. Following Rubinfeld and Sudan [1996], Arora et al. [1998], Arora and Safra [1998], Polishchuk and Spielman [1994], and Friedl and Sudan [1995], we observe that for each possible function f : F^m → F there exists an optimal strategy for answering all possible line-queries such that the acceptance probability of the point-vs-line test for oracle pairs (f, ·) is maximized. Specifically, for a fixed function f, and each line ℓ, the optimal way to answer the line-query ℓ is given by the degree d univariate polynomial that agrees with the value of f on the maximum number of points of ℓ. Thus, the optimal strategy for fooling the point-vs-line test, when the point-oracle equals f, depends only on f and not on the set of lines that may serve as possible queries. Furthermore, the rejection probability of the point-vs-line test is the average of quantities (i.e., the agreements of f with the best univariate polynomials) that f associates with each of the possible lines. The latter fact holds not only when the test operates with the set of all lines, but also when it operates with any set of lines R (as in the claim).⁴ The key observation is that for a random set R, with overwhelmingly high probability, the average over R of the quantities associated with lines in R approximates the average over L of the same quantities.
Step 1—Details. Fix $f : F^m \to F$ and let $\delta = \delta_{\rm ld}(f)$ denote its distance to the nearest low-degree polynomial. Let us denote by $D_\ell(f)$ the fractional disagreement of $f$, when restricted to the line $\ell$, with the best univariate polynomial (i.e., the univariate polynomial of degree $d$ that is nearest to $f|_\ell$, the restriction of $f$ to $\ell$). That is,

$$D_\ell(f) \stackrel{\rm def}{=} \min_{p \in P_{1,d}} \left\{ \Pr_{e \in F}[\,f(\ell(e)) \neq p(e)\,] \right\}. \qquad (6)$$

Indeed, a polynomial $p$ achieving the minimum in Eq. (6) is an optimal answer to the line-query $\ell$. Note that, on input oracles $f$ and $g$, the rejection probability of the standard point-vs-line test (which refers to all possible lines), denoted $p_L(f,g)$, is lower-bounded by the average of the $D_\ell(f)$'s over all $\ell \in L$ (with equality holding if, for every line $\ell \in L$, it holds that $g(\ell)$ is a polynomial with maximal agreement with $f|_\ell$). A similar observation holds for the Point-vs-Line Test that refers to the set of lines $R$, except that now the average is taken over the lines in $R$. Actually, the average is weighted according to the probability that the test inspects the different lines (because a line is selected by uniformly selecting a point and then selecting a random line that passes through this point). Thus, the rejection probability of the Point-vs-Line Test that refers to the set $R$, denoted $p_R(f,g)$, is lower-bounded by the weighted average of the corresponding $D_\ell(f)$'s. Denoting the Point-vs-Line Test that selects lines in $R$ by $T_R$, we state the above fact for future reference:

$$p_R(f,g) \;\geq\; \sum_{\ell \in R} \Pr[\ell \mbox{ is selected by } T_R] \cdot D_\ell(f). \qquad (7)$$
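To make Eq. (6) concrete, the following sketch (our own illustration, not part of the paper's construction) computes $D_\ell(f)$ by brute force over a tiny field, enumerating all univariate polynomials of degree at most $d$; the field size, degree bound, and the specific point-oracle $f$ are arbitrary choices for the example.

```python
from itertools import product

P = 5          # a tiny prime field F_5 (illustrative choice)
d = 1          # degree bound for the univariate polynomials in P_{1,d}
m = 2          # dimension of the domain F^m

def f(pt):
    # an arbitrary point-oracle f : F^m -> F (not a low-degree polynomial)
    x, y = pt
    return (x * x * y + 3 * x) % P

def line(a, b):
    # the line through a with direction b: e |-> a + e*b, parameterized by e in F
    return [tuple((a[i] + e * b[i]) % P for i in range(m)) for e in range(P)]

def D(f, ell):
    # Eq. (6): fractional disagreement of f|_ell with the best degree-<=d polynomial
    best = 1.0
    for coeffs in product(range(P), repeat=d + 1):   # all polynomials in P_{1,d}
        disagree = sum(1 for e in range(P)
                       if f(ell[e]) != sum(c * e**j for j, c in enumerate(coeffs)) % P)
        best = min(best, disagree / P)
    return best

ell = line((0, 1), (1, 2))
print(D(f, ell))   # a multiple of 1/5; it is 0 iff f|_ell is itself in P_{1,d}
```

A function that is linear on the line yields $D_\ell = 0$, while the $f$ above disagrees with every linear polynomial on at least two of the five points of the line.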
$^4$ In the latter case, the average is taken according to the distribution on $R$ that is induced by the test. Note that this distribution is not necessarily uniform over $R$.
Indeed, $p_L(f,g) \geq \sum_{\ell \in L} |L|^{-1} \cdot D_\ell(f)$ follows as a special case. Using the best-known analysis of the standard low-degree test (in particular, using Friedl and Sudan [1995, Thm. 7] to support the case that $|F| = O(d)$), we obtain that$^5$

$$p_L(f,g) \;\geq\; \tau(f) \stackrel{\rm def}{=} |L|^{-1} \cdot \sum_{\ell \in L} D_\ell(f) \;=\; \Omega(\delta). \qquad (8)$$

Actually, we only care about the second inequality (i.e., $\tau(f) = \Omega(\delta)$, where $\delta = \delta_{\rm ld}(f)$). Now, when $R$ is chosen at random (as a set of $n$ lines from $L$), the expected value of

$$\tau_R(f) \stackrel{\rm def}{=} |R|^{-1} \cdot \sum_{\ell \in R} D_\ell(f) \qquad (9)$$

equals $E_{\ell \in L}[D_\ell(f)] = \tau(f)$. By the Chernoff Bound (see Footnote 2), the probability that $R$ is such that $\tau_R(f) < \tau(f)/2$ is exponentially small in $\delta|R|$. That is, for a random set $R$ of $n$ lines, it holds that

$$(\forall f)\quad \Pr_R[\tau_R(f) < \tau(f)/2] \;<\; \exp(-\Omega(\delta_{\rm ld}(f) \cdot |R|)). \qquad (10)$$
In the following two paragraphs, we assume that $R$ is such that $\tau_R(f) \geq \tau(f)/2$.

Let us first assume that $R$ covers all points uniformly; that is, each point resides on the same number of lines in $R$ (where several appearances on the same line are counted several times). This implies that our test selects lines uniformly in $R$. Then, the rejection probability of our test (i.e., the point-vs-line test for lines uniformly selected in $R$), when applied to $f$ and any $g$, is lower-bounded by the (unweighted) average of the $D_\ell(f)$'s over the lines in $R$ (rather than over the set of all lines, $L$). It follows that $p_R(f,g) \geq \tau_R(f) \geq \tau(f)/2 = \Omega(\delta_{\rm ld}(f))$. (Recall that $p_R(f,g)$ denotes the rejection probability of the test that selects lines in $R$.)

In the previous paragraph, we have assumed that $R$ covers all points uniformly (i.e., each point resides on the same number of lines in $R$). In general, this may not be the case. Yet, with very high probability, a random set $R$ covers all points in an almost uniform manner, and this “almost uniformity” suffices for extending the above analysis. Specifically, we first note that, with overwhelmingly high probability, each point in $F^m$ resides on $(1 \pm 0.1) \cdot |R|/|F|^{m-1}$ lines (see Claim 3.3.1). Next, observe that in the above analysis we assumed that the test selects lines uniformly in $R$, whereas our test selects lines in $R$ by selecting uniformly a point and then selecting a random line passing through this point. However, as formally shown in the next paragraph, for $R$ as above (i.e., that covers all points “almost uniformly”), the distribution induced on the selected lines assigns each line in $R$ a probability of $(1 \pm 0.1)^{-1}/|R|$. Thus, the rejection probability may be skewed by a factor of $(1 \pm 0.1)^{-1} = (1 \pm 0.2)$ from the value $|R|^{-1} \cdot \sum_{\ell \in R} D_\ell(f) = \tau_R(f)$, which is analyzed above. We get $p_R(f,g) \geq 0.8 \cdot \tau_R(f) \geq 0.4 \cdot \tau(f) = \Omega(\delta_{\rm ld}(f))$. Using
$^5$ The inequality $|L|^{-1} \cdot \sum_{\ell \in L} D_\ell(f) = \Omega(\delta_{\rm ld}(f))$ is only implicit in most prior works, but it can also be inferred from the results that are stated explicitly in them. Specifically, these works only refer to the rejection probability of the standard test (for the best possible $g$), showing that $\min_{g : L \to P_{1,d}} \{p_L(f,g)\} = \Omega(\delta_{\rm ld}(f))$. (For example, Friedl and Sudan [1995, Thm. 7] assert that $p_L(f,g) \geq \min(1/9, \delta_{\rm ld}(f)/2)$ for every $f$ and $g$, provided $|F| = \Omega(d)$.) However, by the above discussion, it is clear that, for the optimal line oracle, the rejection probability of the standard point-vs-line test equals the average of the $D_\ell(f)$'s; that is, for some $g_{\rm opt}$, which depends on $f$, it holds that $p_L(f, g_{\rm opt}) = |L|^{-1} \cdot \sum_{\ell \in L} D_\ell(f)$.
only the second inequality (which holds whenever $R$ covers all points “almost uniformly”) and referring to Claim 3.3.1, we state the following fact for future reference:

$$\Pr_R[(\forall f,g)\ \ p_R(f,g) \geq 0.8 \cdot \tau_R(f)] \;=\; 1 - o(1). \qquad (11)$$
It is left to analyze the distribution induced on lines selected from a fixed $R$ (by the aforementioned process), when $R$ covers all points “almost uniformly”. Recall that, for a point $x$, we denote by $m_x(\ell)$ the number of occurrences of $x$ on the line $\ell$, and let $t_x = \sum_{\ell \in R} m_x(\ell)$. Then, the probability that the non-constant line $\ell = (x_1, \ldots, x_{|F|}) \in R$ is selected equals

$$\sum_{i=1}^{|F|} \Pr[x_i \mbox{ is selected}] \cdot \frac{m_{x_i}(\ell)}{t_{x_i}} \;=\; |F| \cdot \frac{1}{|F|^m} \cdot \frac{1}{(1 \pm 0.1) \cdot |R|/|F|^{m-1}},$$

which equals $(1 \pm 0.1)^{-1} \cdot |R|^{-1}$, as claimed. Similarly, a constant line $\ell = (x, \ldots, x) \in R$ is selected with probability $\frac{1}{|F|^m} \cdot \frac{|F|}{(1 \pm 0.1) \cdot |R|/|F|^{m-1}}$, which also satisfies the claim.
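As a sanity check on the above calculation (our own illustration, not part of the paper), one can compute the induced line distribution exactly for a small concrete line family: select a uniform point, then a line through it with probability proportional to its multiplicity at that point. When every point is covered by the same number of lines, the induced distribution is exactly uniform over the family. The toy family below is the set of all $12$ lines of $F_3^2$, where each point lies on exactly $4$ lines.

```python
from fractions import Fraction

P = 3
points = [(x, y) for x in range(P) for y in range(P)]

def line(a, b):
    # the (unordered) point set of the line through a with direction b
    return tuple(sorted({((a[0] + e*b[0]) % P, (a[1] + e*b[1]) % P)
                         for e in range(P)}))

# all lines of F_3^2: four directions, three parallel lines each
directions = [(1, 0), (0, 1), (1, 1), (1, 2)]
R = sorted({line(a, b) for b in directions for a in points})

def selection_prob(R):
    # Pr[line] = sum_x Pr[x selected] * m_x(line)/t_x  (here every m_x is 0 or 1)
    t = {x: sum(1 for ell in R if x in ell) for x in points}
    return {ell: sum(Fraction(1, len(points)) * Fraction(1, t[x]) for x in ell)
            for ell in R}

probs = selection_prob(R)
assert sum(probs.values()) == 1                               # a distribution
assert all(p == Fraction(1, len(R)) for p in probs.values())  # uniform coverage => uniform
print(len(R))  # 12
```

With almost-uniform coverage $t_x = (1 \pm 0.1)\cdot|R|/|F|^{m-1}$, the same computation yields probabilities within a $(1 \pm 0.1)^{-1}$ factor of $1/|R|$, as used in the text.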
Step 2—Overview. Recall that we have bounded (in Eq. (10)) the fraction of $R$'s for which $\tau_R(f) \geq \tau(f)/2$ does not hold for (any) fixed $f$. Our current goal is to show that, for most $R$'s, it is the case that $\tau_R(f) \geq \tau(f)/2$ holds for every $f$. This suffices to complete the proof of the current claim, because we have shown in Step 1 (see Eq. (8) and Eq. (11), respectively) that $\tau(f) = \Omega(\delta_{\rm ld}(f))$ holds for all $f$ and that (for most choices of $R$) it holds that $p_R(f,g) = \Omega(\tau_R(f))$ for every pair $(f,g)$. The natural approach towards meeting our goal is taking a union bound over all $f$'s that are $\delta$-far from $P_{m,d}$ in order to upper-bound the fraction of $R$'s such that there exists a function $f$ that is $\delta$-far from $P_{m,d}$ for which $\tau_R(f) < \tau(f)/2$. The problem is that the number of such functions is certainly greater than $|P_{m,d}| > \exp((d/m)^m)$, whereas (for a random $R$) we only have $\Pr_R[\tau_R(f) < \tau(f)/2] < \exp(-\Omega(\delta|R|))$ (and in fact $\Pr_R[\tau_R(f) < \tau(f)/2] > \exp(-O(\delta|R|))$). This is not a problem in case $\delta$ is any positive constant (or, more generally, if $\delta|R| > H_2(\delta) \cdot |F|^m + O(d/m)^m$), which in turn suffices to establish weak testability (as per Definition 2.1),$^6$ but we wish to handle the general case (in order to establish strong testability as per Definition 2.2). Thus, we cluster these functions according to the low-degree function that is closest to them, and show that it is enough to analyze one cluster (e.g., the one of the zero polynomial). The validity of the latter observation relies on properties of the set $P_{m,d}$ that imply that $D_\ell(f) = D_\ell(f + p)$ holds for every function $f$, polynomial $p \in P_{m,d}$ and line $\ell$. The benefit in the said observation is that we need only consider the functions that are closest to some fixed polynomial and are $\delta$-far from it (rather than all functions $\delta$-far from $P_{m,d}$). Thus, we get an upper bound of $|F|^{\delta|F|^m} \cdot \binom{|F|^m}{\delta|F|^m} \cdot \exp(-\Omega(\delta|R|))$, which is negligible (because $|R| \geq |F|^m \log |F|^m$).
Step 2—Details. For any fixed $\delta_0 > 0$, we start by considering the functions that are at relative distance exactly $\delta_0$ from the zero polynomial. The number of
$^6$ Weak testability is all that was established in our preliminary report [Goldreich and Sudan 2002], and the stronger analysis that follows is new.
such functions is at most

$$(|F|-1)^{\delta_0|F|^m} \cdot \binom{|F|^m}{\delta_0|F|^m} \;<\; \left(|F|^{m+1}\right)^{\delta_0|F|^m} \;=\; \exp(\delta_0 \cdot (m+1)|F|^m \log |F|). \qquad (12)$$
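The counting bound of Eq. (12) can be sanity-checked numerically. The sketch below (our own illustration, with arbitrary small parameters) verifies $(|F|-1)^{t}\binom{|F|^m}{t} < (|F|^{m+1})^{t}$ for $t = \delta_0|F|^m$ ranging over all integer values; the inequality follows from $\binom{N}{t} \leq N^t$ and $(|F|-1)^t < |F|^t$.

```python
from math import comb

def check(F, m, t):
    # compare the two sides of Eq. (12), with t = delta_0 * |F|^m an integer >= 1
    N = F ** m
    lhs = (F - 1) ** t * comb(N, t)
    rhs = (F ** (m + 1)) ** t
    return lhs < rhs

# arbitrary small parameter choices; t ranges over all meaningful values
assert all(check(F, m, t)
           for F in (2, 3, 5) for m in (1, 2, 3) for t in range(1, F**m + 1))
print("Eq. (12) bound holds on all sampled parameters")
```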
On the other hand, by Eq. (10), for any function $f$, it holds that $\Pr_R[\tau_R(f) < \tau(f)/2] = \exp(-\Omega(\delta_{\rm ld}(f) \cdot |R|))$, and if this function is closest to the zero polynomial (i.e., $\Delta(f, 0) = \Delta_{P_{m,d}}(f)$) then $\delta_{\rm ld}(f) = \delta_0$. Thus, using $|R| = c \cdot |F|^m \log |F|^m$ (for an adequate constant $c$), the probability (over the choice of $R$) that there exists a function $f$ that is closest to the zero polynomial and is at relative distance exactly $\delta$ from it such that $\tau_R(f) < \tau(f)/2$ is upper-bounded by

$$\exp(\delta \cdot (m+1)|F|^m \log |F|) \cdot \exp(-\Omega(\delta \cdot |R|)) \;\leq\; \exp(-2\delta|F|^m \log |F|^m) \;<\; o(|F|^{-m}),$$

where the last inequality uses $\delta \geq 1/|F|^m$. Summing over all (the $|F|^m$) possible values of $\delta$, we see that the probability, over the choice of $R$, that there exists a function $f$ that is closest to the zero polynomial (among all polynomials in $P_{m,d}$) such that $\tau_R(f) < \tau(f)/2$ is $o(1)$. Thus, we have

$$\Pr_R[\mbox{for every } f \mbox{ such that } \Delta(f,0) = \Delta_{P_{m,d}}(f) \mbox{ it holds that } \tau_R(f) \geq \tau(f)/2] = 1 - o(1). \qquad (13)$$
To conclude the argument, we use properties of the set $P_{m,d}$. Specifically, suppose that $R$ is such that for every function $f'$ that is closest to the zero polynomial it holds that $\tau_R(f') \geq \tau(f')/2$. Now, consider an arbitrary function $f$ and let $p \in P_{m,d}$ be the polynomial closest to $f$. Then, the function $f' = f - p$ is closest to the zero polynomial, and we claim that $\tau(f') = \tau(f)$ and $\tau_R(f') = \tau_R(f)$. These claims follow from the fact that, for every function $f$ and every polynomial $p \in P_{m,d}$ and for every line $\ell$, it holds that $D_\ell(f) = D_\ell(f+p)$ (although the polynomials selected to achieve the maximum agreement with $f$ and $f+p$, over the line $\ell$, may be different). Indeed, if $q$ is used to achieve the maximum agreement with $f$ over the line $\ell$, then $q + (p|_\ell)$ achieves the maximum agreement with $f+p$, where $p|_\ell$ is the univariate polynomial obtained by restricting the polynomial $p$ to the line $\ell$. Thus, for every function $f$ and $p \in P_{m,d}$ that is closest to $f$, it holds that $\tau_R(f) = \tau_R(f-p)$ and $\tau(f-p) = \tau(f)$. Using Eq. (13), we get

$$\Pr_R[\tau_R(f) \geq \tau(f)/2 \mbox{ for every } f] \;=\; 1 - o(1). \qquad (14)$$
Combining Eq. (11) and Eq. (14), we get

$$\Pr_R[(\forall f,g)\ \ p_R(f,g) \geq 0.8\,\tau_R(f) \geq 0.4\,\tau(f)] \;=\; 1 - o(1).$$

Recalling Eq. (8), which asserts $\tau(f) = \Omega(\delta_{\rm ld}(f))$ for every $f$, the claim follows.
The last claim, which also relates to the Point-vs-Line Test, is also phrased in terms of the associated functions $f : F^m \to F$ and $g : R \to \Sigma$, where in our application $f = f_w$ and $g(\ell) = w(\ell)$ (for every $\ell \in R$). (When applied outside the context of this work, one should note that $C_R(p)$ is the sequence of univariate polynomials representing the restriction of the polynomial $p$ to all lines in $R$.)
CLAIM 3.3.3. Let $R$ be such that, for each point $x \in F^m$, there are $(1 \pm 0.1) \cdot |R|/|F|^{m-1}$ lines in $R$ that pass through $x$. Then, for every $f : F^m \to F$ and $g : R \to \Sigma$, the probability that the Point-vs-Line Test rejects the oracle pair $(f,g)$ is at least

$$\frac{1}{2} \cdot \left( \frac{\Delta(g, C_R(p))}{|R|} - \frac{\Delta(f, p)}{|F^m|} \right),$$

where $p$ is the polynomial in $P_{m,d}$ that is closest to $f$.
Claim 3.3.3 will be applied to pairs $(f_w, w)$, in which case $\Delta(w, C_R(p_w)) = \delta_{\rm agr}(w) \cdot |R|$ and $\Delta(f_w, p_w) = \delta_{\rm ldp}(w) \cdot |F^m|$ (recalling that $p_w$ is the polynomial closest to $f_w$). Consequently, we will infer that the codeword test rejects any $w$ with probability at least $(\delta_{\rm agr}(w)/2) - \delta_{\rm ldp}(w)$. Needless to say, Claim 3.3.3 will be invoked only in case $\delta_{\rm ldp}(w) < \delta_{\rm agr}(w)/2$.
PROOF. We will first consider what happens when the test is invoked with oracle access to the pair $(p, g)$, rather than to the pair $(f, g)$. The claim will follow by observing that the test queries the point-oracle on a single uniformly distributed point, and so replacing $p$ by $f$ may reduce the rejection probability by at most the relative distance between $f$ and $p$.

As in the proof of Claim 3.3.2, we start by assuming that $R$ covers all points uniformly (i.e., each point resides on the same number of lines in $R$). In this case, the test selects lines uniformly in $R$. Thus, with probability $\delta \stackrel{\rm def}{=} \Delta(g, C_R(p))/|R|$, the test selects a line $\ell$ such that $h \stackrel{\rm def}{=} g(\ell)$ does not agree with $p$ on $\ell$. Now, since both $h$ and $p|_\ell$ (i.e., the values of $p$ restricted to the line $\ell$) are degree-$d$ univariate polynomials (and since they disagree), they disagree on at least $|F| - d > 2|F|/3$ of the points on $\ell$. Thus, the test will reject the oracle pair $(p, g)$ with probability at least $(2/3) \cdot \delta$.

However, in general, $R$ may not cover all points uniformly. Yet, the claim's hypothesis, by which $R$ covers all points “almost uniformly”, suffices for extending the above analysis. Specifically (as shown in the proof of Claim 3.3.2), in this case each line is selected (by the test) with probability $(1 \pm 0.1)^{-1}/|R|$, and so the test rejects the oracle pair $(p, g)$ with probability at least $0.8 \cdot (2\delta/3) \geq \delta/2$.
So far, we have analyzed the behavior of the test with respect to the oracle pair $(p, g)$, whereas we need to analyze its behavior with respect to the oracle pair $(f, g)$. Recalling that the test makes a single uniformly distributed query to the point-oracle, it follows that the test rejects the oracle pair $(f, g)$ with probability at least $(\delta/2) - (\Delta(f, p)/|F|^m)$. The claim follows.
Completing the Proof of Lemma 3.3. We call the set $R$ good if it satisfies the conclusions of Claims 3.3.1 and 3.3.2. Thus, these claims assert that a $1 - o(1)$ fraction of the possible choices of $R$ are good, and we fix such a good $R$ for the rest of the discussion. Considering any $w \in \Sigma^n$, recall that $\delta_{\rm agr}(w) \geq \delta(w) \stackrel{\rm def}{=} \Delta_{C_R}(w)/n$ and $\delta_{\rm ld}(f_w) = \delta_{\rm ldp}(w)$. If $\delta_{\rm ld}(f_w) \geq \delta(w)/3$, then invoking Claim 3.3.2 (with $f = f_w$ and $g = w$) we are done, because (for a good $R$) the test rejects with probability $\Omega(\delta_{\rm ld}(f))$, which in this case is $\Omega(\delta(w))$. Otherwise (i.e., $\delta_{\rm ld}(f_w) < \delta(w)/3$), invoking Claim 3.3.3 (with $f = f_w$ and $g = w$), we conclude that (for a good $R$) the test rejects with probability at least $(\delta_{\rm agr}(w)/2) - \delta_{\rm ld}(f_w) \geq \delta(w)/6$, because $\delta_{\rm agr}(w) \geq \delta(w)$. The lemma follows.
Remark 3.4. In continuation to Footnote 3, we note that the proof of Lemma 3.3 holds for any choice of a line $\ell_x$ that passes through $x$, including a probabilistic choice. In particular, Lemma 3.3 holds also in the case that Construction 3.2 is modified such that $\ell_x$ is selected uniformly among all lines that pass through $x$; that is, $\ell_x$ is selected identically to the way $\ell$ is selected in Step (2), which means that we select independently and uniformly two lines that pass through the random point $x$. The important fact about this modification is that both lines (i.e., the queries of the tester) are almost uniformly distributed in $R$, provided that $R$ covers all points almost uniformly (which we assume and establish anyhow; see Claim 3.3.1). Specifically, each line in $R$ is selected (as a query) with probability $(1 \pm 0.2)/|R|$. The constant $0.2$ is rather arbitrary, and by using $|R| = O(\epsilon^{-2} m |F|^m \log |F|)$, we can ensure that each line in $R$ is selected (as a query) with probability $(1 \pm \epsilon)/|R|$.
Corollary: Part 1 of Theorem 2.4. By the above, with probability $1 - o(1)$ over the choice of $R$, the code $C_R : \Sigma^k \to \Sigma^n$ has constant relative distance and is locally testable (using two queries). Furthermore, by Proposition 3.1, the code is $F$-linear, where $\Sigma = F^{d+1}$. Using the first parameter-setting (i.e., $d = m^m$), we establish Part 1 of Theorem 2.4 (see Eq. (3)). In particular, we establish that for infinitely many $k$'s, there exist two-query testable codes of constant relative distance over a nonbinary alphabet such that $n = \exp(\tilde{O}(\sqrt{\log k})) \cdot k = k^{1+o(1)}$ and $\log |\Sigma| = \exp(\tilde{O}(\sqrt{\log k})) = k^{o(1)}$.
Remark 3.5. The above code $C_R : \Sigma^k \to \Sigma^n$, where $\Sigma = F^{d+1}$, can be constructed only for specific values of $k$; that is, those given in Eq. (2) as a function of the parameters $m$ and $d$. Furthermore, using $d = m^m$, we get a construction for any $k$ that satisfies $k = k_1(m) \stackrel{\rm def}{=} \binom{m+m^m}{m}/(m^m+1) \approx m^{m^2-1}$ for some $m$. In this case, $k_1(m+1) \leq \exp(\sqrt{\log k_1(m)}) \cdot k_1(m)$. Using $d = m^e$ for some $e > 1$, we get a construction for any $k$ that satisfies $k = k_2(m) \stackrel{\rm def}{=} \binom{m+m^e}{m}/(m^e+1) \approx m^{(e-1)m}$ for some $m$. In this case, $k_2(m+1) < (\log k_2(m))^e \cdot k_2(m)$.
3.3. DECREASING THE ALPHABET SIZE. The code $C_R$ presented in Construction 3.1 uses quite a big alphabet (i.e., $\Sigma = F^{d+1}$, where $|F| = \Theta(d)$). Our aim, in this subsection, is to maintain the local-testability properties of $C_R$ while using a smaller alphabet (i.e., $F$ rather than $F^{d+1}$). This is achieved by concatenating $C_R$ (which encodes information by a sequence of $n$ univariate polynomials over $F$, each of degree $d$) with the following inner-code $C'$ that maps $F^{d+1}$ to $F^{n'}$, where $n'$ is sub-exponential in $k' \stackrel{\rm def}{=} d+1$.
The Inner-Code. For a (suitable) constant integer $d'$, let $k' = h^{d'}$. As a warm-up, consider the special case of $d' = 2$. In this case, the code $C'$ maps bilinear forms in the $x_i$'s and $y_i$'s (with coefficients $c_{i,j} : i,j \in [h]$) to the values of these forms under all possible assignments. That is, $C' : F^{h^2} \to F^{|F|^{2h}}$ maps the sequence of coefficients $(c_{i,j} : i,j \in [h])$ to the sequence of values $(v_{a_1,\ldots,a_h,b_1,\ldots,b_h} : a_1,\ldots,a_h,b_1,\ldots,b_h \in F)$, where $v_{a_1,\ldots,a_h,b_1,\ldots,b_h} = \sum_{i,j \in [h]} c_{i,j} \cdot a_i b_j$. Viewing $C'$ as a mapping from $F^{h^2} \times F^{2h}$ to $F$, we have $C'((c_{1,1},\ldots,c_{h,h}), (a_1,\ldots,a_h,b_1,\ldots,b_h)) = \sum_{i,j \in [h]} c_{i,j} \cdot a_i b_j$. In general (i.e., for an arbitrary integer $d' \geq 1$), the inner-code $C' : F^{k'} \to F^{n'}$ maps $d'$-multilinear forms in the variable sets $\{z_i^{(1)} : i \in [h]\}, \ldots, \{z_i^{(d')} : i \in [h]\}$ to the values of these $d'$-multilinear forms under all possible assignments to these $d'h$ variables. That is, $C'$ maps the sequence of coefficients $(c_{i_1,\ldots,i_{d'}} : i_1,\ldots,i_{d'} \in [h])$ to the sequence of values
$\left(v_{a_1^{(1)},\ldots,a_h^{(1)},\ldots,a_1^{(d')},\ldots,a_h^{(d')}} : a_1^{(1)},\ldots,a_h^{(1)},\ldots,a_1^{(d')},\ldots,a_h^{(d')} \in F\right)$, where $v_{a_1^{(1)},\ldots,a_h^{(d')}} = \sum_{i_1,\ldots,i_{d'} \in [h]} c_{i_1,\ldots,i_{d'}} \cdot \prod_{j=1}^{d'} a_{i_j}^{(j)}$. Viewing $C'$ as a mapping from $F^{h^{d'}} \times F^{d'h}$ to $F$, we have

$$C'\!\left((c_{1,\ldots,1},\ldots,c_{h,\ldots,h}),\, \left(a_1^{(1)},\ldots,a_h^{(1)},\ldots,a_1^{(d')},\ldots,a_h^{(d')}\right)\right) \;=\; \sum_{i_1,\ldots,i_{d'} \in [h]} c_{i_1,\ldots,i_{d'}} \cdot \prod_{j=1}^{d'} a_{i_j}^{(j)}. \qquad (15)$$

Thus, $k' = h^{d'}$ and $n' = |F|^{d'h} = \exp(d' \cdot (k')^{1/d'} \cdot \log |F|)$. Using $|F| = O(k')$ and $d' = O(1)$, we have $n' = \exp(\tilde{O}((k')^{1/d'}))$. Note that the inner-code has relative distance $1 - (d'/|F|) > 3/4$, assuming $|F| > 4d'$.
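To illustrate the inner-code of Eq. (15), the following sketch (our own illustration, with arbitrary toy parameters that do not satisfy the paper's actual setting of $|F| > 4d'$ and large $h$) encodes a $d'$-multilinear form by its values under all possible assignments; indices are 0-based here, whereas the text uses $[h] = \{1,\ldots,h\}$.

```python
from itertools import product

P, h, dp = 5, 2, 2          # F_5, h = 2, d' = 2  (so k' = h**dp = 4)

def encode(coeffs):
    # coeffs: dict mapping (i_1,...,i_{d'}) in [h]^{d'} (0-indexed) to an F-element.
    # Returns the C'-codeword: the value of the multilinear form on every
    # assignment a = (a^{(1)}, ..., a^{(d')}) in (F^h)^{d'}.
    word = {}
    for a in product(product(range(P), repeat=h), repeat=dp):
        val = 0
        for idx, c in coeffs.items():
            term = c
            for j, i in enumerate(idx):
                term = term * a[j][i] % P
            val = (val + term) % P
        word[a] = val
    return word

# example: the bilinear form  1*x_0*y_0 + 3*x_1*y_1
w = encode({(0, 0): 1, (1, 1): 3})
a = ((1, 2), (4, 1))         # x = (1,2), y = (4,1)
print(w[a])                  # 1*1*4 + 3*2*1 = 10 = 0 mod 5
```

The codeword length is $|F|^{d'h} = 5^4 = 625$, matching $n' = |F|^{d'h}$ in the text.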
Testing the Inner-Code. A valid codeword (viewed as a function from $F^{d'h}$ to $F$) is a multilinear function (in the variable sets $\{z_i^{(1)} : i \in [h]\}, \ldots, \{z_i^{(d')} : i \in [h]\}$); that is, for each $j$, a valid codeword is linear in the variables $z_i^{(j)}$'s. Thus, testing whether $w : F^{d'h} \to F$ belongs to the inner-code reduces to $d'$ linearity checks. Specifically, for each $j$, we randomly select $r = (r_1^{(1)},\ldots,r_h^{(1)},\ldots,r_1^{(d')},\ldots,r_h^{(d')}) \in F^{d'h}$, $s^{(j)} = (s_1^{(j)},\ldots,s_h^{(j)}) \in F^h$ and $e \in F$, and check whether or not $w(r', r^{(j)}, r'') + e \cdot w(r', s^{(j)}, r'') = w(r', r^{(j)} + e \cdot s^{(j)}, r'')$, where $r' = (r_1^{(1)},\ldots,r_h^{(1)},\ldots,r_1^{(j-1)},\ldots,r_h^{(j-1)})$, $r^{(j)} = (r_1^{(j)},\ldots,r_h^{(j)})$, and $r'' = (r_1^{(j+1)},\ldots,r_h^{(j+1)},\ldots,r_1^{(d')},\ldots,r_h^{(d')})$. In addition, we also let the test employ a total low-degree test (to verify that the codeword is a multivariate polynomial of total degree $d'$).$^7$ The total-low-degree test uses $d'+2$ queries, and so our codeword test uses $3d' + (d'+2) = O(d')$ queries. For the sake of clarity, we provide an explicit statement of the resulting test.
Construction 3.6. Given oracle access to $w : F^{d'h} \to F$, which is supposedly a codeword of $C'$, the test proceeds as follows:

The Linearity Tests. For $j = 1,\ldots,d'$, we test whether $w$ is linear in the $j$th block of variables. That is, for uniformly selected $r' = (r_1^{(1)},\ldots,r_h^{(1)},\ldots,r_1^{(j-1)},\ldots,r_h^{(j-1)}) \in F^{(j-1)h}$ and $r'' = (r_1^{(j+1)},\ldots,r_h^{(j+1)},\ldots,r_1^{(d')},\ldots,r_h^{(d')}) \in F^{(d'-j)h}$, we test whether the resulting function $w_{r',r''}(z_1,\ldots,z_h) \stackrel{\rm def}{=} w(r', z_1,\ldots,z_h, r'')$ is linear (in $z_1,\ldots,z_h$).

The linearity of $w_{r',r''} : F^h \to F$ is tested using a BLR-type test [Blum et al. 1993], specifically the Extended Linearity Test of Kiwi [2003, p. 10]. We select uniformly $r_1,\ldots,r_h, s_1,\ldots,s_h, e, f \in F$, and accept if and only if $e \cdot w_{r',r''}(r_1,\ldots,r_h) + f \cdot w_{r',r''}(s_1,\ldots,s_h) = w_{r',r''}(e \cdot r_1 + f \cdot s_1, \ldots, e \cdot r_h + f \cdot s_h)$.
The (Auxiliary) Low-Degree Test. Following the low-degree test of Arora et al. [1998], we select uniformly $a = (a_1^{(1)},\ldots,a_h^{(1)},\ldots,a_1^{(d')},\ldots,a_h^{(d')}) \in F^{d'h}$ and $b = (b_1^{(1)},\ldots,b_h^{(1)},\ldots,b_1^{(d')},\ldots,b_h^{(d')}) \in F^{d'h}$, and test whether there exists a univariate polynomial $p$ of degree $d'$ such that $w(a + e \cdot b)$ agrees with $p(e)$ on every
$^7$ We believe that the codeword test operates well also without employing the total-degree test, but the augmented codeword test is certainly easier to analyze.
$e \in F$, where $a + e \cdot b = (a_1^{(1)} + e b_1^{(1)}, \ldots, a_h^{(1)} + e b_h^{(1)}, \ldots, a_1^{(d')} + e b_1^{(d')}, \ldots, a_h^{(d')} + e b_h^{(d')})$. Specifically, for fixed and distinct $\eta_0,\ldots,\eta_{d'} \in F$, we check whether the univariate degree-$d'$ polynomial $p(\zeta)$ defined such that $p(\eta_i) = w(a + \eta_i \cdot b)$, for $i = 0,1,\ldots,d'$, agrees with $w(a + \zeta \cdot b)$ on a random point; that is, we uniformly select $\eta \in F$ and accept if and only if $p(\eta) = w(a + \eta \cdot b)$.

We accept if and only if all $d' + 1$ tests accept.
We note that the interpolation condition used for low-degree testing is linear in the recovered values. Thus, Construction 3.6 checks $d' + 1$ linear conditions, where each of the first $d'$ conditions involves three values of $w$, and the remaining condition involves $d' + 2$ such values. Clearly, Construction 3.6 accepts every codeword of $C'$ with probability 1. The following lemma provides a lower bound on the rejection probability of noncodewords.
LEMMA 3.7. Every $w' \in F^{n'}$ is rejected by the codeword test of Construction 3.6 with probability $\Omega(\delta_{C'}(w'))$, where $\delta_{C'}(w')$ is the relative distance of $w'$ from the code $C'$ (i.e., $\delta_{C'}(w') = \Delta_{C'}(w')/n'$).
PROOF. Let $\delta = \delta_{C'}(w')$, and let $\delta'$ denote the relative distance of $w'$ (viewed as a function $w' : F^{d'h} \to F$) from the set of $d'h$-variate polynomials of total degree $d'$. (Indeed, $\delta' \leq \delta$, because every codeword (i.e., a $d'$-multilinear function) is a polynomial of total degree $d'$.) If $\delta' \geq \min(\delta, 0.4)$, then $w'$ is rejected with probability $\Omega(\delta')$ by the total-degree test (cf., e.g., Arora et al. [1998, Lem. 7.2.1.4]), and the lemma follows (because $\delta' = \Omega(\delta)$ in this case). Specifically, the analysis in Arora et al. [1998] shows that the restriction of $w'$ to a random line is expected to be $\Omega(\delta')$-far from any univariate polynomial of degree $d'$, which in particular applies to the polynomial obtained by interpolation based on the points $\eta_0,\ldots,\eta_{d'}$.
Otherwise (i.e., $\delta' < \min(\delta, 0.4)$), let $p'$ denote the degree-$d'$ polynomial closest to $w'$. By the case hypothesis (i.e., $\delta' < \delta$), this $p'$ must be nonlinear in some block of variables (otherwise $\delta = \delta_{C'}(w') \leq \Delta(w', p')/n' = \delta'$); that is, for some $j$, the polynomial $p'$ is nonlinear in $\{z_i^{(j)} : i \in [h]\}$. We claim that, with probability at least $1 - (d'/|F|) > 0.9$, this nonlinearity is preserved when assigning random values to the variables of all the other blocks; that is, for a random $r = (r^{(1)},\ldots,r^{(d')}) \in (F^h)^{d'}$, with probability at least $0.9$, the polynomial $p'_r(z^{(j)}) \stackrel{\rm def}{=} p'(r^{(1)},\ldots,r^{(j-1)}, z^{(j)}, r^{(j+1)},\ldots,r^{(d')})$ is not linear in $z^{(j)} = (z_1^{(j)},\ldots,z_h^{(j)})$. The claim is proved by writing $p'$ as the sum of monomials in $z^{(j)}$ with coefficients being functions of the other variables. Consider any nonlinear monomial in $z^{(j)}$ having a non-zero coefficient. This non-zero coefficient is a polynomial of degree at most $d' - 2$ in the other variables (because $p'$ has total degree $d'$ and the said monomial is nonlinear in $z^{(j)}$). Then, by the Schwartz–Zippel Lemma, with probability at least $1 - ((d'-2)/|F|)$, a random assignment $r$ to the other variables will yield a non-zero value, and thus this (nonlinear) monomial in $z^{(j)}$ will appear in $p'_r$ (with a non-zero coefficient).
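The Schwartz–Zippel step can be checked exhaustively in a toy case (our own illustration): for the polynomial $u \cdot v^2$ of total degree $3$, the coefficient of the nonlinear monomial $v^2$ is the degree-$1$ polynomial $u$, and the fraction of assignments to $u$ that kill the nonlinearity is exactly $1/|F| \leq (d'-2)/|F|$.

```python
P = 5     # F_5;  p'(u, v) = u * v^2 has total degree 3, i.e., d' = 3
dp = 3

def restricted_is_linear(u):
    # after fixing u, the residual polynomial in v is u * v^2,
    # which has degree <= 1 in v iff the coefficient u vanishes
    vals = [(u * v * v) % P for v in range(P)]
    # compare against every degree-<=1 function a*v + b over F_5
    return any(all((a*v + b) % P == vals[v] for v in range(P))
               for a in range(P) for b in range(P))

bad = sum(1 for u in range(P) if restricted_is_linear(u))
print(bad / P)   # 1/5 = 0.2, matching the (d'-2)/|F| = 1/5 bound
```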
Furthermore, in this case, the nonlinear polynomial $p'_r$ (which also has degree at most $d'$) is at distance $1 - (d'/|F|) > 0.9$ from any linear function (in $z^{(j)}$). Thus, for a random $r$, the expected relative distance between $p'_r$ and the set of linear functions is greater than $0.9 \cdot 0.9 > 0.8$. On the other hand, the expected relative distance between the residual $w'$ and $p'$ (i.e., between $w'_r$ and $p'_r$) under the random assignment $r$ is $\delta' < 0.4$ (where the inequality is due to the case hypothesis). Thus, under such a random assignment, the expected fractional distance of the residual $w'$ (i.e., $w'_r$) from the set of linear functions (in $\{z_i^{(j)} : i \in [h]\}$) is greater than $0.8 - 0.4 = 0.4$. It follows that $w'$ is rejected with constant probability by the $j$th linearity test (because, with probability at least $0.2$, the residual $w'$ is at least $0.2$-far from being linear, and so is rejected with constant probability [Kiwi 2003, Lem. 4.4]).
The Concatenated-Code. We apply the code-concatenation paradigm (cf. Forney [1966]) to the codes $C = C_R$ (of Construction 3.1) and $C'$ (of Eq. (15)). The concatenated-code obtained by composing the outer-code $C : \Sigma^k \to \Sigma^n$ with the inner-code $C' : F^{k'} \to F^{n'}$, where $\Sigma = F^{d+1} = F^{k'}$, maps $(x_1,\ldots,x_k)$ to $(C'(y_1),\ldots,C'(y_n))$, where $(y_1,\ldots,y_n) \stackrel{\rm def}{=} C(x_1,\ldots,x_k)$. In other words, for $x \in \Sigma^k \equiv F^{k \cdot k'}$, we have:

$$\mbox{concatenated-code}(x) \;=\; \left(C'(C(x,1)),\ldots,C'(C(x,n))\right), \qquad (16)$$

where here we view $C$ as a mapping from $\Sigma^k \times [n]$ to $\Sigma$. Thus, the concatenated-code maps $k \cdot k'$-long sequences over $F$ to $n \cdot n'$-long sequences over $F$. Furthermore, since both $C$ and $C'$ are $F$-linear, the concatenated-code is $F$-linear; that is, for each $i$, each $F$-symbol in the sequence $C'(y_i)$ is a linear combination of the $F$-symbols in $y_i = C(x,i) \in F^{k'}$, which in turn are linear combinations of the $F$-symbols in $(x_1,\ldots,x_k) \in F^{k \cdot k'}$.
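The concatenation step of Eq. (16) is mechanical; the following generic sketch (our own illustration, using toy stand-in codes rather than $C_R$ and $C'$) shows the composition: encode with the outer code over the big alphabet, then encode each resulting symbol with the inner code.

```python
P = 5

# toy outer code over Sigma = F^2 (a 3-fold repetition code; stand-in for C_R)
def outer(msg):              # msg: list of Sigma-symbols, each a pair over F
    return msg * 3

# toy inner code F^2 -> F^4 (a simple F-linear map; stand-in for C')
def inner(sym):
    a, b = sym
    return [a, b, (a + b) % P, (a + 2*b) % P]

def concatenated(msg):
    # Eq. (16): inner-encode every symbol of the outer codeword
    return [f_sym for sigma_sym in outer(msg) for f_sym in inner(sigma_sym)]

cw = concatenated([(1, 2), (3, 0)])
print(len(cw))   # n * n' = 6 * 4 = 24
```

Since both stand-in maps are $F$-linear, so is their composition, mirroring the $F$-linearity argument in the text.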
Testing the Concatenated-Code. Loosely speaking, in order to test the concatenated code, we first test whether its $n$ blocks are codewords of the inner-code, and next use “self-correction” (cf. Blum et al. [1993]) on these blocks to emulate the testing of the outer-code. Specifically, the tester for the concatenated code first selects at random (as the tester of the outer-code) two intersecting lines $\ell$ and $\ell'$, and applies the inner-code tester (of Construction 3.6) to the inner-encoding of the polynomials associated with these two lines (by the outer-code). Next, to emulate the actual check of the outer-code test (of Construction 3.2), the current tester needs to obtain the values of these two polynomials at some elements of $F$ (which are determined by the outer test). Suppose that we need the value of $q_\ell$ (a univariate polynomial of degree $d = h^{d'} - 1$ over $F$) at $t \in F$, and that $q_\ell$ is encoded by the inner-code. Recall that $q_\ell$ is represented as a sequence of coefficients $(q_0,\ldots,q_d)$. For the sake of the inner-code, this sequence may be viewed as indexed by $d'$-tuples over $[h]$ such that the index $(i_1,\ldots,i_{d'}) \in [h]^{d'}$ corresponds to $\sum_{j=1}^{d'} (i_j - 1) \cdot h^{j-1} \in \{0, 1,\ldots,d\}$; that is, $q_{i_1,\ldots,i_{d'}}$ is the coefficient of the $\left(\sum_{j=1}^{d'} (i_j - 1) \cdot h^{j-1}\right)$-th power. Thus (under this convention), $q_\ell(z) = \sum_{i_1,\ldots,i_{d'} \in [h]} q_{i_1,\ldots,i_{d'}} \cdot z^{\sum_{j=1}^{d'} (i_j-1) \cdot h^{j-1}}$, which in turn (using Eq. (15)) yields the following key observation:
$$q_\ell(t) \;=\; C'\!\left(\left(q_{i_1,\ldots,i_{d'}} : i_1,\ldots,i_{d'} \in [h]\right),\, \left(t^0,\ldots,t^{h-1},\, t^0,\ldots,t^{(h-1)h},\, \ldots,\, t^0,\ldots,t^{(h-1)h^{d'-1}}\right)\right). \qquad (17)$$

That is, $q_\ell(t)$ resides in the entry of $C'(q_{i_1,\ldots,i_{d'}} : i_1,\ldots,i_{d'} \in [h])$ that is indexed by $\bar{t} \in F^{d'h}$, where the $i$th entry in $\bar{t}$ is $t^{((i-1) \bmod h) \cdot h^{\lfloor (i-1)/h \rfloor}}$. But since this specific entry of the inner-code may be corrupted (in a noisy codeword), we recover it by self-correction based on a few random positions in the codeword. Specifically, self-correction of the desired entry is performed via polynomial interpolation, and requires only $d' + 1$ queries (where each query is uniformly distributed). This discussion leads to the following test, where we assume (for simplicity) that the lines in $R$ cover all points of $F^m$.
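Self-correction of a single entry, as described above, amounts to Lagrange interpolation through $d'+1$ probes along a random direction. The sketch below (our own illustration, with arbitrary toy parameters) recovers $p(0)$ of a hidden degree-$d'$ univariate polynomial from its values at $d'+1$ distinct nonzero points; the surrounding analysis in the text accounts for the probability that some probed position is corrupted.

```python
P = 13          # a prime field large enough for the toy degree below
dp = 3          # d': degree of the univariate polynomial along the direction

def interp_at_zero(pts):
    # Lagrange interpolation over F_P: given d'+1 pairs (eta, value) with
    # distinct nonzero eta's, return p(0) for the unique degree-<=d' polynomial p
    total = 0
    for i, (xi, yi) in enumerate(pts):
        num, den = 1, 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (-xj) % P        # numerator factor (0 - x_j)
                den = den * (xi - xj) % P    # denominator factor (x_i - x_j)
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

p = lambda e: (7*e**3 + 2*e + 5) % P              # hidden degree-d' polynomial
probes = [(eta, p(eta)) for eta in (1, 2, 3, 4)]  # d'+1 distinct nonzero points
print(interp_at_zero(probes))   # recovers p(0) = 5
```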
Construction 3.8. Given oracle access to $w : R \times F^{d'h} \to F$, which is supposedly a codeword of the concatenated-code (of $C_R$ and $C'$), the test proceeds as follows:

(1) As in Construction 3.2, we pick $x \in F^m$ uniformly at random, and let $\ell_x \in R$ be an arbitrary line that passes through $x$. Pick $\ell \in R$ uniformly among the lines that pass through $x$.

(2) Testing the Inner-Code. Apply Construction 3.6 to the residual oracles $w(\ell_x, \cdot)$ and $w(\ell, \cdot)$.

(3) Emulating the Point-vs-Line Test (i.e., the Remaining Steps of Construction 3.2). Let $\alpha, \beta \in F$ be such that $\ell_x(\alpha) = \ell(\beta) = x$. Let $q_x$ and $q$ be the polynomials encoded (possibly with noise) in $w(\ell_x, \cdot)$ and $w(\ell, \cdot)$, respectively. We obtain the values of $q_x(\alpha)$ and $q(\beta)$, via self-correction, and check whether these two values are equal.
Self-correction of $q(\beta)$ is performed as follows. Setting

$$a = \left(1,\ldots,\beta^{h-1},\ 1,\ldots,\beta^{(h-1)h},\ 1,\ldots,\beta^{(h-1)h^2},\ \ldots,\ 1,\ldots,\beta^{(h-1)h^{d'-1}}\right),$$

we select uniformly $b = (b_1^{(1)},\ldots,b_h^{(1)},\ldots,b_1^{(d')},\ldots,b_h^{(d')}) \in F^{d'h}$, obtain the values $w(\ell, a + \eta \cdot b)$ for $d' + 1$ distinct nonzero values $\eta \in F$, and compute the desired value by polynomial interpolation. That is, for fixed and distinct $\eta_1,\ldots,\eta_{d'+1} \in F \setminus \{0\}$, we determine the univariate polynomial $p$ of degree $d'$ satisfying $p(\eta_i) = w(\ell, a + \eta_i \cdot b)$ for $i = 1,\ldots,d'+1$, and take $p(0)$ as the value of $q(\beta)$. Note that, by Eq. (17), $q(\beta)$ equals $C'((q_{i_1,\ldots,i_{d'}} : i_1,\ldots,i_{d'} \in [h]), a)$, and that we have accessed a slightly noisy version of $C'(q_{i_1,\ldots,i_{d'}} : i_1,\ldots,i_{d'} \in [h])$ at $d' + 1$ uniformly distributed positions. Self-correction of $q_x(\alpha)$ is performed analogously.

We accept if and only if both invocations of Construction 3.6 as well as the emulated Point-vs-Line Test accept.
Note that the tester performs $2 \cdot (4d' + 2) + 2 \cdot (d' + 1) = O(d')$ queries. Furthermore, its checks amount to checking several linear conditions regarding the retrieved values. (The latter fact follows from the linearity of the checks performed by Construction 3.6 and the linearity of the interpolation performed in Step (3).) Clearly, Construction 3.8 accepts each codeword with probability 1, but lower-bounding the rejection probability of noncodewords does require a detailed analysis (which is provided next). The point is to prove that the “composition of tests” (for the concatenated-code) does work as one would expect.
LEMMA 3.9. Let $F$ and $C = C_R$ be as in Construction 3.1, and let $C' : F^{k'} \to F^{n'}$ be as in Eq. (15), where $k' = h^{d'}$ and $n' = |F|^{d'h}$. Then, for all but an $o(1)$ fraction of the possible choices of $R$, every $w \in F^{n n'}$ is rejected by the concatenated-code tester (of Construction 3.8) with probability that is linearly related to the distance of $w$ from the concatenated-code (of Eq. (16)).
PROOF. We fix any set $R$ satisfying the claims of Lemma 3.3 and Claim 3.3.1. That is, the code $C_R$ is locally testable (via Construction 3.2), and $R$ covers all points almost-uniformly. (Recall that indeed all but an $o(1)$ fraction of the possible choices of $R$ can be used here.) For this fixed $R$, we analyze the performance of Construction 3.8 with respect to the corresponding concatenated-code (of Eq. (16)). Fixing any $w = (w_1,\ldots,w_n) \in (F^{n'})^n$, let us denote by $\delta$ the relative distance of $w$ from the concatenated-code, and let $\delta_i \stackrel{\rm def}{=} \Delta_{C'}(w_i)/n'$ denote the relative distance of $w_i$ from the inner-code $C'$. Throughout this proof (unless stated differently), distances refer to sequences over $F$.
Recall that each of the two lines selected by the outer-code tester (i.e., the tester of Construction 3.2) is not uniformly distributed in $[n] \equiv R$. Rather, the first line is the canonical line associated with a uniformly selected point, whereas the second line is selected uniformly among the lines (in $R$) that pass through this point. Let us denote by $p_i$ and $q_i$ the corresponding distributions on lines; that is, $p_i$ (respectively, $q_i$) denotes the probability that the $i$th line in $R$ is selected as a canonical line (respectively, a random line) for a uniformly selected point. Note that, by the almost-uniformity condition, it holds that $q_i = 1/(1 \pm 0.1)n$ for every $i \in [n]$. For a constant $c > 1$ (to be determined), we consider the following two cases:
Case 1. Either $\sum_{i=1}^n p_i \delta_i \geq \delta/c$ or $\sum_{i=1}^n q_i \delta_i \geq \delta/c$. In this case, at least one of the two blocks (i.e., either $w_{\ell_x}$ or $w_\ell$) probed by the outer-code tester is at expected relative distance at least $\delta/c$ from the inner-code. Thus, in this case, the inner-code tester (of Construction 3.6, as analyzed in Lemma 3.7), invoked in Step (2), rejects with probability $\Omega(\delta/c)$, which is $\Omega(\delta)$ because $c$ is a constant.
Case 2. Both Σ_{i=1}^n p_i δ_i < δ/c and Σ_{i=1}^n q_i δ_i < δ/c. In particular, using q_i = 1/(1 ± 0.1)n, it follows that (1/n)·Σ_{i=1}^n δ_i < 2δ/c. Denoting the closest corresponding C′-codewords by c_i's (i.e., Δ(w_i, c_i) = δ_i·n′), we let d_i denote the decoding of c_i (and of w_i). Then, denoting the concatenated code by CC (and viewing each d_i ∈ F^{d+1} as a single symbol, but c_i = C′(d_i) and w_i as n′-long sequences (over F)), we have

  Δ_C(d_1,…,d_n)/n ≥ Δ_{CC}(C′(d_1),…,C′(d_n))/nn′
                   ≥ Δ_{CC}(w_1,…,w_n)/nn′ − Δ((w_1,…,w_n), (c_1,…,c_n))/nn′
                   = δ − (1/n)·Σ_{i=1}^n δ_i,

which is greater than δ − (2δ/c) ≥ δ/2 (using Σ_{i=1}^n δ_i/n < 2δ/c and assuming c ≥ 4). Thus, for some constant c′ > 0 (determined in Lemma 3.3), the outer-code test rejects (d_1,…,d_n) with probability at least c′·δ/2. (We will set c = 16(d′+1)/c′ > 4.) The question is what happens when the concatenated-code tester (given access to (w_1,…,w_n)) emulates the outer-code test.
Recall that, in the current case, both indices probed by the outer-code tester correspond to w_i's that are at expected relative distance at most δ/c from the inner-code, where each expectation is taken over the distribution of the corresponding index. Thus, for each of the two indices, with probability at most 2(d′+1)δ/c, the randomly selected index corresponds to a block w_i that is at relative distance greater than p := 1/2(d′+1) from the inner-code. It follows that, with probability at least 1 − 2·(2(d′+1)δ/c), both indices probed by the outer-code tester correspond to w_i's that are at relative distance at most p from the inner-code. In this case, with probability at least (1 − (d′+1)·p)² = 1/4, all actual (random) probes made in Step (3) to the inner-code are to locations in which the corresponding w_i and c_i = C′(d_i) agree, and thus both self-corrected values (computed by our test) will match the corresponding d_i's. Note that if the above two events occur, then our tester correctly emulates the outer-code tester. Thus, our tester rejects if the following three events occur:

(1) The outer-code tester would have rejected the two answers (i.e., the two d_i's).
(2) The two probed indices correspond to w_i's that are at relative distance at most 1/2(d′+1) from the inner-code (and in particular from the corresponding C′(d_i)'s, which are the C′-codewords closest to them).
(3) The self-corrected values match the corresponding d_i's.

By the above, Event (1) occurs with probability at least c′δ/2, and Event (2) fails with probability at most 4(d′+1)δ/c = c′δ/4 (by setting c = 16(d′+1)/c′). Thus, our tester rejects w with probability at least ((c′δ/2) − (c′δ/4))·(1/4) = Ω(δ), where the 1/4 is due to the probability that Event (3) occurs (conditioned on Events (1) and (2) occurring). Thus, in both cases, any word that is at relative distance δ from the concatenated-code is rejected with probability Ω(δ). The lemma follows.
Other Properties. Recall that the concatenated code, mapping F^{kk′} to F^{nn′}, is linear (over F). Furthermore, the codeword test is a conjunction of O(d′) linear tests. Alternatively, we may perform one of these linear tests, selected at random (with equal probability). The relative distance of the concatenated code is the product of the relative distances of the outer and inner codes, and thus is a constant. Regarding the parameters of the concatenated code, suppose that in the outer-code we use the setting d = m^e (for any constant e > 1), and that in the inner-code we use d′ = 2e. Then, we obtain a code that maps F^{kk′} to F^{nn′}, where n < k^{(e+o(1))/(e−1)} and k′ = d + 1 < |F| < (log k)^e (both by Eq. (4)), and n′ = exp(Õ(d′·k′^{1/d′})) (see Eq. (15)), which in turn equals exp(Õ((log k)^{e/d′})) = exp(Õ(√(log k))) = k^{o(1)}. Thus,

  nn′ = (kk′)^{(e+o(1))/(e−1)} and |F| < (log k)^e.  (18)

For usage in the next subsection, we only care that the alphabet size (i.e., |F|) is k^{o(1)}, while the rate is good (i.e., nn′ ≤ (kk′)^{(e+o(1))/(e−1)}).
Remark 3.10. The code C′: F^{k′} → F^{n′} can be constructed only for specific values of k′; that is, k′ = h^{d′} for some integers h and d′. Thus, fixing any constant integer d′, we obtain codes for every d′-th (integer) power. Recall that when using this code together with C: Σ^k → Σ^n, where Σ = F^{d+1}, to derive the concatenated code, we must set k′ = d + 1. Actually, we go the other way around: Starting with any h, we set k′ = h^{d′} and d = k′ − 1, and determine k as a function of d (and the parameter m < d) according to Eq. (2).
3.4. OBTAINING A BINARY LOCALLY TESTABLE CODE. Our last step is to derive a binary code. This is done by concatenating the code presented in Section 3.3 with the Hadamard code, while assuming that F = GF(2^{k″}). That is, the Hadamard code is used to encode elements of F by binary sequences of length n″ := 2^{k″}. (Recall that the Hadamard encoding of a string s ∈ {0,1}^{k″} is given by the sequence of all 2^{k″} partial sums (mod 2) of the bits of s.)
To test the newly concatenated code, we combine the obvious testing procedure for the Hadamard code with the fact that all that we need to check for the current outer-code are (a constant number of) linear (in F) conditions involving a constant number of F-entries. (Recall that Construction 3.8 only checks linear constraints, and that we are going to set d′ to be a constant.) Now, instead of checking such a linear condition over F, we check that the corresponding equality holds for a random sum of the bits in the representation of the elements of F (using the hypothesis that F = GF(2^{k″})). Specifically, suppose that we need to check whether Σ_{i=1}^t α_i a_i = 0 (in F), for some known α_1,…,α_t ∈ F and oracle answers denoted by a_1,…,a_t ∈ F. Then, we uniformly select r ∈ GF(2^{k″}), and check whether IP_2(r, Σ_{i=1}^t α_i a_i) ≡ 0 (mod 2) holds, where IP_2(u, v) denotes the inner-product modulo 2 of (the GF(2^{k″}) elements) u and v (viewed as k″-bit long vectors). The latter check is performed by relying on the following two facts:
Fact 1. IP_2(r, Σ_{i=1}^t α_i a_i) ≡ Σ_{i=1}^t IP_2(r, α_i a_i) (mod 2).

This fact holds because IP_2(r_1···r_{k″}, s_1···s_{k″}) = Σ_{j=1}^{k″} r_j s_j.

Fact 2. Each IP_2(r, α_i a_i) can be obtained by making a single query (which is determined by r and α_i) to the Hadamard coding of a_i, because IP_2(r, α_i a_i) is merely a linear combination of the bits of a_i with coefficients depending on α_i and r (i.e., IP_2(r, α_i a_i) = IP_2(f(r, α_i), a_i), where f is determined by the irreducible polynomial representing the field GF(2^{k″})).

This fact holds because each bit of α_i a_i ∈ GF(2^{k″}) is a linear combination of the bits of a_i with coefficients depending on α_i, and IP_2(r, v) is a linear combination of the bits of v with coefficients depending on r.
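Both facts are mechanical, and can be checked concretely. The sketch below (illustrative Python; the field size k″ = 4 and the irreducible polynomial x⁴ + x + 1 are arbitrary choices of ours) multiplies in GF(2^k) and recovers the map f of Fact 2 by evaluating on the unit vectors:

```python
# GF(2^K) arithmetic for K = 4 with the irreducible polynomial
# x^4 + x + 1 (0b10011); both choices are illustrative assumptions.
K, IRRED = 4, 0b10011

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiply, then reduce modulo the irreducible polynomial."""
    p = 0
    for i in range(K):
        if (b >> i) & 1:
            p ^= a << i
    for i in range(2 * K - 2, K - 1, -1):  # clear bits K..2K-2
        if (p >> i) & 1:
            p ^= IRRED << (i - K)
    return p

def ip2(u: int, v: int) -> int:
    """Inner product mod 2 of two K-bit vectors packed into ints."""
    return bin(u & v).count("1") % 2

def f(r: int, alpha: int) -> int:
    """Fact 2: the map a -> IP2(r, alpha*a) is linear in the bits of a,
    hence equals IP2(r', a); recover r' bit-by-bit from the unit vectors."""
    return sum(ip2(r, gf_mul(alpha, 1 << j)) << j for j in range(K))
```

Fact 1 follows because addition in GF(2^{k″}) is bitwise XOR, and IP_2(r, ·) is linear over GF(2); Fact 2 is the statement that f above is well defined.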
We now turn to the actual construction of the final (binary) code. Recall that we wish to apply the code-concatenation paradigm to the code presented in Section 3.3 and the suitable Hadamard code. Specifically, let C_out: F^{kk′} → F^{nn′} denote the former code, and let C″: {0,1}^{k″} → {0,1}^{n″} denote the suitable Hadamard code, where F = GF(2^{k″}) ≡ {0,1}^{k″} and [n″] ≡ {0,1}^{k″}. (The parameter d′ that determines the rate of C_out as well as the query complexity of its codeword tester will be set to a constant.) Then, concatenating these two codes, we obtain a code that maps (x_1,…,x_{kk′}) ∈ ({0,1}^{k″})^{kk′} to (C″(y_1),…,C″(y_{nn′})), where (y_1,…,y_{nn′}) = C_out(x_1,…,x_{kk′}). In other words, for x ∈ F^{kk′} ≡ {0,1}^{kk′·k″}, we have:

  concatenated-code(x) = (C″(C_out(x, 1)),…,C″(C_out(x, nn′))),  (19)

where C_out(x, i) denotes the ith symbol of C_out(x) and C″(y) = ⟨IP_2(y, p) : p ∈ {0,1}^{|y|}⟩.
Loosely speaking, in order to test the concatenated code, we first test (random instances of) the inner-code (i.e., C″), and next use "self-correction" (cf. Blum et al. [1993]) on the latter to emulate the testing of the outer-code (i.e., C_out). Setting d′ to a constant, the query complexity of the codeword tester of C_out is a constant, denoted q (because q = O(d′)). Recall that the codeword tester of C_out: F^{kk′} → F^{nn′} (i.e., Construction 3.8) checks a constant number of linear conditions, each depending on a constant number of positions (i.e., F-symbols). By uniformly selecting one of these conditions, we obtain a tester, denoted T, that randomly selects q positions in the tested word and checks a single linear condition regarding the F-symbols in these positions. (Indeed, the noncodeword detection probability of T may be q times smaller than that of Construction 3.8.) Thus, the tester of the (new) concatenated code invokes T to determine q random locations i_1,…,i_q ∈ [nn′] and a linear condition (α_1,…,α_q) ∈ F^q to be checked (on the corresponding answers). The (new) tester next checks whether the corresponding q blocks in the tested (nn′n″-bit long) string are codewords of C″. Finally, the tester emulates the check Σ_{j=1}^q α_j d_{i_j} = 0 of T, where d_{i_j} is the C″-decoding of the i_j-th block of the tested string. This emulation is performed via self-correction, to be discussed next.
Recall that, rather than checking Σ_{j=1}^q α_j d_{i_j} = 0, we are going to check IP_2(r, Σ_{j=1}^q α_j d_{i_j}) = 0, for a uniformly selected r ∈ {0,1}^{k″}. Furthermore, by Fact 1, rather than checking IP_2(r, Σ_{j=1}^q α_j d_{i_j}) = 0, we may check Σ_{j=1}^q IP_2(r, α_j d_{i_j}) = 0. To this end, we should obtain IP_2(r, α_j d_{i_j}), for r and α_j that are known to us. As stated in Fact 2, the desired bit can be expressed as a linear combination (with coefficients depending only on r and α_j) of the bits of d_{i_j}. That is, IP_2(r, α_j d_{i_j}) = IP_2(r′_j, d_{i_j}), where r′_j is determined by r and α_j (i.e., r′_j = f(r, α_j), where f depends on the representation of GF(2^{k″})). Recall that IP_2(r′_j, d_{i_j}) = C″(d_{i_j}, r′_j). However, since we may not have a valid encoding of d_{i_j}, we obtain the corresponding entry via self-correction of the i_j-th block of the tested string. That is, we obtain a good guess for C″(d_{i_j}, r′_j), by taking the exclusive-or of positions r′_j ⊕ s_j and s_j in that block, for a uniformly selected s_j ∈ {0,1}^{k″} ≡ [n″].

This discussion leads to the following test, where we add Step (4) to ease the analysis.
Construction 3.11. The tester is given oracle access to w = (w_1,…,w_{nn′}), where each w_i = w_{i,1}···w_{i,n″} ∈ {0,1}^{n″}, and proceeds as follows:

(1) The tester selects the locations i_1,…,i_q ∈ [nn′] and the linear condition (α_1,…,α_q) ∈ F^q to be checked by the codeword tester of C_out. That is, these choices are determined by invoking T.

(2) For j = 1,…,q, the tester checks that w_{i_j} is a codeword of C″. For each j, this is done by uniformly selecting r, s ∈ {0,1}^{k″}, and checking whether w_{i_j,r} + w_{i_j,s} = w_{i_j,r⊕s}.
(3) The tester emulates the check Σ_{j=1}^q α_j d_{i_j} = 0 of the tester for C_out, where d_{i_j} is the C″-decoding of w_{i_j} and the arithmetic is over F = GF(2^{k″}). This is done by uniformly selecting r ∈ {0,1}^{k″}, and checking that IP_2(r, Σ_{j=1}^q α_j d_{i_j}) = 0. Actually, we test the equivalent condition Σ_{j=1}^q IP_2(r, α_j d_{i_j}) = 0, where the values IP_2(r, α_j d_{i_j}), for j = 1,…,q, are obtained via self-correction as follows.

For the uniformly selected r ∈ {0,1}^{k″}, we determine r′_1,…,r′_q based on r and α_1,…,α_q (where the α_j's are as determined in Step (1)). That is, for each j, we determine r′_j = f(r, α_j) such that IP_2(r′_j, x) = IP_2(r, α_j x) holds for any value of x ∈ GF(2^{k″}). Next, we select uniformly s_1,…,s_q ∈ {0,1}^{k″}, and check that Σ_{j=1}^q (w_{i_j,r′_j⊕s_j} ⊕ w_{i_j,s_j}) = 0.
(4) The tester selects uniformly i_0 ∈ [nn′], and checks that w_{i_0} is a codeword of C″ (by uniformly selecting r, s ∈ {0,1}^{k″}, and checking whether w_{i_0,r} + w_{i_0,s} = w_{i_0,r⊕s}).

We output 1 if and only if all q + 2 checks are satisfied.
Our tester makes 3(q+1) + 2q queries to the code, where q is a constant. It is clear that this tester accepts any valid codeword (because C″(y, r⊕s) = C″(y, r) + C″(y, s) for every y, r, s ∈ {0,1}^{k″}). The analysis of the rejection probability of noncodewords can be carried out analogously to Lemma 3.9.
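The two inner-code operations used by Construction 3.11 — the BLR linearity check of Steps (2) and (4), and the self-corrected probe of Step (3) — can be isolated in a short sketch (illustrative Python; `block` stands for one n″-bit block of the tested word, indexed by k″-bit strings, and the helper names are ours):

```python
import random

def blr_check(block, k: int, rng: random.Random) -> bool:
    """One BLR linearity test: accept iff block[r] + block[s] = block[r xor s] (mod 2)."""
    r, s = rng.randrange(2 ** k), rng.randrange(2 ** k)
    return (block[r] ^ block[s]) == block[r ^ s]

def self_corrected_bit(block, k: int, r_prime: int, rng: random.Random) -> int:
    """Guess bit r_prime of the Hadamard codeword closest to block by
    XORing the two positions (r_prime xor s) and s, for a random offset s."""
    s = rng.randrange(2 ** k)
    return block[r_prime ^ s] ^ block[s]
```

If `block` is δ-close to some codeword C″(y), each self-corrected guess equals IP_2(y, r′) with probability at least 1 − 2δ, which is what the probability-1/4 (respectively, greater-than-1/3) accounting in Lemmas 3.9 and 3.12 relies on.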
LEMMA 3.12. For F = GF(2^{k″}), let C_out: F^{kk′} → F^{nn′} be as in Eq. (16), and C″: {0,1}^{k″} → {0,1}^{n″} be the Hadamard code, where n″ = 2^{k″}. Then, every w ∈ {0,1}^{nn′n″} is rejected by the concatenated-code tester (of Construction 3.11) with probability that is linearly related to the distance of w from the concatenated-code (of Eq. (19)).
PROOF. Recall that C_out is locally testable (by Lemma 3.9), and furthermore that the test T, which makes a constant number of queries, rejects every noncodeword with probability that is linearly related to its distance from C_out.

Fixing any w = (w_1,…,w_{nn′}) ∈ ({0,1}^{n″})^{nn′}, let us denote by δ the relative distance of w from the concatenated-code, and let δ_i := Δ_{C″}(w_i)/n″ denote the relative distance of w_i from the inner-code C″. Throughout this proof (unless stated differently), distances refer to binary sequences.

As in the proof of Lemma 3.9, we distinguish between two cases according to the average distance of the w_i's from valid codewords of C″. However, rather than relying on the specifics of T, we use a more generic approach here, while relying on the added test performed in Step (4). Specifically, let us denote by p_i the probability that a random query of T probes the ith location, where i ∈ [nn′] (and we refer to a uniformly selected query among the q random queries made by T). For a constant c > 1 (to be determined), we consider the following two cases:
Case 1. Either Σ_{i=1}^{nn′} p_i δ_i ≥ δ/c or Σ_{i=1}^{nn′} δ_i/nn′ ≥ δ/c. In this case, at least one of the blocks (i.e., either w_{i_j} for some j ∈ [q] or w_{i_0}) probed by the outer-code tester (in either Step (2) or Step (4), respectively) is at expected relative distance at least δ/c from the inner-code. Thus, in this case, the inner-code tester (which is the extensively analyzed BLR test [Blum et al. 1993]) rejects with probability Ω(δ/c) = Ω(δ).
Case 2. Both Σ_{i=1}^{nn′} p_i δ_i < δ/c and Σ_{i=1}^{nn′} δ_i/nn′ < δ/c. Denoting the closest corresponding C″-codewords by c_i's (i.e., Δ(w_i, c_i) = δ_i·n″), we let d_i denote the decoding of c_i (and of w_i). Thus, the relative distance of (d_1,…,d_{nn′}) from the outer-code (when viewing each d_i as a single symbol) is at least δ − (δ/c) ≥ δ/2, provided that c > 4, where the δ/c term accounts for the average relative distance between the c_i's and the w_i's. That is, for every x ∈ F^{kk′}, the fraction of i's such that d_i ≠ C_out(x, i) is at least δ/2. It follows that, for some constant c′ > 0 (determined in Lemma 3.9), the outer-code test T rejects (d_1,…,d_{nn′}) with probability at least c′·δ/2. (We will set c = 12q²/c′ > 4.) The question is what happens when the concatenated-code tester (given access to w) emulates T. The answer is that our tester rejects if the following four events all occur.

(1) The outer-code tester T would have rejected the q answers (i.e., the relevant d_i's). That is, Σ_{j=1}^q α_j d_{i_j} ≠ 0 (over GF(2^{k″})), for the adequate α_j's. Recall that this event occurs with probability at least c′δ/2.
(2) The q probed indices correspond to w_i's that are at relative distance at most 1/3q from the inner-code (and in particular from the corresponding C″(d_i)'s). To see that this event occurs with probability at least 1 − 3q²δ/c, let b_j denote the probability of the bad (complementary) (sub-)event in which the jth query is made to a block w_i that is 1/3q-far from the corresponding codeword C″(d_i). Then, Σ_{i=1}^{nn′} δ_i/nn′ ≥ (1/q)·Σ_{j=1}^q b_j·(1/3q), which using the case hypothesis implies that Σ_{j=1}^q b_j ≤ 3q²·δ/c, as claimed.
(3) The self-corrected values match the corresponding d_i's. Given Event (2), the current event occurs with probability at least (1 − 2·(1/3q))^q > 1/3, because self-correction succeeds whenever all actual (random) probes to the inner-code are to locations in which the corresponding w_i and c_i = C″(d_i) agree.

(4) The string r ∈ {0,1}^{k″}, selected uniformly in Step (3), is such that Σ_{j=1}^q IP_2(r, α_j d_{i_j}) ≠ 0. Given Event (1), the current event occurs with probability 1/2.

Note that the first three events are analogous to events considered in the proof of Lemma 3.9, whereas the last event is introduced because we do not emulate the actual check of T (but rather a randomized version of it). Setting c = 12q²/c′, we infer that all four events occur with probability at least ((c′δ/2) − (3q²δ/c))·(1/3)·(1/2) = c′δ/24 = Ω(δ).

Thus, in both cases, any word that is at relative distance δ from the concatenated-code is rejected with probability Ω(δ). The lemma follows.
Corollary: Part 2 of Theorem 2.4. For any desired constant e > 1, we use the parameter setting d = m^e and d′ = 2e in the construction of the code C_out. As summarized in Eq. (18), this yields a code C_out: F^{kk′} → F^{nn′}, where nn′ < (kk′)^{(e+o(1))/(e−1)} and |F| < (log k)^e. Recall that we compose C_out with C″: {0,1}^{k″} → {0,1}^{n″}, where {0,1}^{k″} is associated with F = GF(2^{k″}). Thus, our final code maps {0,1}^{kk′k″} to {0,1}^{nn′n″}, where n″ = 2^{k″} = |F| = poly(log k) = k^{o(1)}, and so nn′n″ < (kk′k″)^{(e+o(1))/(e−1)}. Also note that the final code is linear and has linear distance. Thus, we have established Part (2) of Theorem 2.4.
Remark 3.13. The performance of the final codeword tester (of Construction 3.11) depends on the parameter e > 1, which determines the rate of the final code (i.e., the relation between nn′n″ and kk′k″). The query complexity of the tester is linear in e, and the rejection probability of noncodewords is inversely proportional to poly(e) (i.e., a string that is δ-far from the code is rejected with probability Ω(δ/e⁴)). The rejection probability of noncodewords can be improved, but we doubt that one can get below Ω(δ/e²) without introducing significantly different ideas.
Remark 3.14. In continuation to Remarks 3.5 and 3.10, we comment that the final binary code can be constructed only for specific values of k, k′ and k″. Fixing any integer e > 1, the aforementioned code can be constructed for any integer h, while setting k′ = h^e, k″ = log₂ O(k′) and k ≈ (m^{e−1})^m, where m = (h^e − 1)^{1/e} ≈ h. Thus, K := kk′k″ ≈ h^{(e−1)h}·h^e·log h^e ≈ h^{(e−1)h}. The ratio between consecutive admissible values of K is given by (h+1)^{(e−1)(h+1)}/h^{(e−1)h} = O(h)^{e−1} < (log K)^{e−1}, and so the admissible successor of K is smaller than (log K)^{e−1}·K.
4. PCPs of Nearly Linear Length

In this section, we give a probabilistic construction of nearly-linear sized PCPs for SAT. More formally, we reduce SAT in almost-linear probabilistic time to a promise problem, and show that this problem has a PCP of randomness complexity (1 + o(1))·log n (on inputs of length n) and constant query complexity. Furthermore, this PCP has perfect completeness, soundness arbitrarily close to 1/2, and its query complexity is a small explicit constant. Specifically, with 19 (bit) queries we obtain randomness complexity log_2 n + Õ(√(log n)). Recall that actually we care about the proof length (i.e., the length of the PCP oracle), which is 2^r · q, where r and q are the randomness and query complexities of the PCP. Our PCPs improve over the parameters of the PCPs constructed by Polishchuk and Spielman [1994], and are obtained by applying the "random projection" method (introduced in Section 3) to certain constant-prover one-round proof systems, which are crucial ingredients in the constructions of PCPs. Specifically, we apply this technique to (a variant of) the three-prover one-round proof system of Harsha and Sudan [2000].
Random Projection of Proof Systems. Typically, constant-prover one-round
proof systems use provers of very different sizes. Indeed, this is the case with the
proof system of Harsha and Sudan [2000]. By applying the “random projection”
method to the latter proof system, we obtain an equivalent system in which all
provers have size roughly equal to the size of the smallest prover in the original
scheme. At this point, we reduce the randomness complexity to be logarithmic
in the size of the provers (and thus logarithmic in the size of the smallest
original prover). (The latter step is rather straightforward.) Starting with the system
of Harsha and Sudan [2000], we obtain a constant-prover one-round proof system
of randomness complexity (1 + o(1)) log n (on inputs of length n). However, the
query complexity of the resulting system is not constant, although it is small, but
the standard proof composition paradigm (combined with known PCPs) comes to
our rescue.
Recall that typical PCP constructions are obtained by using the technique of
proof composition introduced by Arora and Safra [1998]. In this technique, an
“outer verifier”, typically a verifier for a constant-prover one-round proof system,
is composed with an “inner verifier” to get a new PCP verifier. The new verifier
essentially inherits the randomness complexity of the outer verifier and the query
complexity of the inner verifier. Since our goal is to reduce the randomness complex-
ity of the composed verifier, we achieve this objective by reducing the randomness
complexity of the outer verifier.
Organization. As stated above, our key step is to reduce the sizes of the provers
(in certain constant-prover one-round proof systems). As a warm-up (in Section 4.1),
we first show that the random projection method can be applied to any 2-prover one-
round proof system, resulting in an equivalent proof system in which both provers
have size roughly equal to the size of the smallest prover in the original scheme.
Next, in Section 4.2, we show how to apply the random projection to the verifier
of a specific 3-prover one-round proof system used by Harsha and Sudan [2000].
Their verifier is a variant of the one constructed by Raz and Safra [1997] (see also
Arora and Sudan [2003]), which are, in turn, variants of a verifier constructed by
Arora et al. [1998]. All these verifiers share the common property of working with
provers of vastly different sizes. We manage to reduce the sizes of all the provers
to the size of the smallest one, and consequently reduce the randomness of the
verifier to (1 +o(1)) log n (where n is the input length). We stress that the “random
size reduction” step is not generic, but rather relies on properties of the proof of
soundness in, say, Harsha and Sudan [2000], which are abstracted below. Applying
known composition lemmas (i.e., those developed in Harsha and Sudan [2000]) to
this system gives us the desired short PCP constructions.
4.1. TWO-PROVER VERIFIERS AND RANDOM SAMPLING. We start by defining a 2-prover 1-round proof system as a combinatorial game between a verifier and two provers. Below, Ω denotes the space of the verifier's coins, q_i denotes its strategy of forming queries to the ith prover, and P_i denotes a strategy for answering these queries (where we refer to the residual strategy for a fixed common input, which is omitted from the notation).
Definition 4.1. For finite sets Q_1, Q_2, Ω, and A, a (Q_1, Q_2, Ω, A)-2IP verifier V is given by functions q_1: Ω → Q_1 and q_2: Ω → Q_2 and Verdict: Ω × A × A → {0,1}. The value of V, denoted ω(V), is the maximum, over all functions P_1: Q_1 → A and P_2: Q_2 → A, of the quantity

  ω_{P_1,P_2}(V) := E_r[Verdict(r, P_1(q_1(r)), P_2(q_2(r)))],  (20)

where r is uniformly distributed in Ω. A 2IP verifier V is said to be uniform if, for each i ∈ {1,2}, the function q_i: Ω → Q_i is |Ω|/|Q_i|-to-one. The size of prover P_i is defined as |Q_i|.
Focusing on the case |Q_2| ≥ |Q_1|, we define a "sampled" 2IP verifier. In accordance with the preliminary motivational discussion, we use two-stage sampling: first, we sample the queries to the bigger prover (i.e., S ⊆ Q_2), and then we sample the set of (relevant) coin tosses (i.e., T ⊆ Ω_S). Thus, the first step corresponds to a random projection of the second prover's strategy on a subset of the possible queries. Note that the first stage results in a proof system in which the second prover has size |S| (rather than |Q_2|), and that we should restrict the space of the resulting verifier's coins such that their image under q_2 equals S.
Definition 4.2. Given a (Q_1, Q_2, Ω, A)-2IP verifier V and a set S ⊆ Q_2, let

  Ω_S = {r ∈ Ω : q_2(r) ∈ S}.  (21)

For T ⊆ Ω_S, the (S, T)-sampled 2IP verifier, denoted V|_{S,T}, is a (Q_1, S, T, A)-2IP verifier given by functions q_1: T → Q_1, q_2: T → S, and Verdict: T × A × A → {0,1}, obtained by restricting q_1, q_2 and Verdict to T.
In the following lemma, we show that a sufficiently large randomly sampled set S from Q_2 is very likely to approximately preserve the value of a verifier. Furthermore, the value continues to be preserved approximately if we pick T to be a sufficiently large random subset of Ω_S.
LEMMA 4.3. There exist absolute constants c_1, c_2 such that the following holds for every Q_1, Q_2, Ω, A, and every ε, γ > 0. Let V be a (Q_1, Q_2, Ω, A)-uniform 2IP verifier.

Completeness. For any S and T, the (S, T)-sampled verifier preserves the perfect completeness of V. That is, if ω(V) = 1, then, for every S ⊆ Q_2 and T ⊆ Ω_S, it holds that ω(V|_{S,T}) = 1.

Soundness. For sufficiently large S and T, a random (S, T)-sampled verifier preserves the soundness of V up to a constant factor. Specifically, let N_1 = (c_1/ε)·(|Q_1| log |A| + log(1/γ)) and N_2 = (c_2/ε)·(N_1 log |A| + log(1/γ)), and suppose that S is a uniformly selected multi-set of size N_1 of Q_2, and T is a uniformly selected multi-set of size N_2 of Ω_S. Then, for ω(V) ≤ ε, with probability at least 1 − γ, it holds that ω(V|_{S,T}) ≤ 2ε.
Note that the reduction in the randomness complexity (i.e., obtaining N_2 = Õ(|Q_1|)) relies on the shrinking of the second prover to size N_1 = Õ(|Q_1|). Without shrinking the second prover, we would obtain N_2 = Õ(|Q_2|), which is typically useless (because, typically, |Ω| = Õ(|Q_2|)).
PROOF. We focus on the soundness condition, and assume that ω(V) ≤ ε. The proof is partitioned into two parts. First, we show that a random choice of S is unlikely to increase the value of the game to above (3/2)·ε. Next, assuming that S satisfies the latter condition, we show that a random choice of T is unlikely to increase the value of the game above 2ε. The second part of the proof is really a standard argument, which has been observed before in the context of PCPs (e.g., in Bellare et al. [1998]). We thus focus on the first part, which abstracts the idea of the random projection from Section 3.

Our aim is to bound the value ω(V|_{S,Ω_S}), for a randomly chosen S. Fix any prover strategy P_1: Q_1 → A for the first prover. Now, note that an optimal strategy, denoted P_2, for the second prover answers each question q_2 ∈ Q_2 by an answer that maximizes the acceptance probability with respect to the fixed P_1 (i.e., an optimal answer is a string a_2 that maximizes E_{r|q_2(r)=q_2}[Verdict(r, P_1(q_1(r)), a_2)]). We stress that this assertion holds both for the original 2IP verifier V as well as for any (S, Ω_S)-sampled verifier.⁸ For every question q_2 ∈ Q_2, let ε_{q_2} denote the acceptance probability of the verifier V given that the second question is q_2 (i.e., ε_{q_2} = E_{r|q_2(r)=q_2}[Verdict(r, P_1(q_1(r)), P_2(q_2))]). By (uniformity and) the definition of ε_{q_2}, we have E_{q_2∈Q_2}[ε_{q_2}] = E_r[ε_{q_2(r)}] ≤ ε. The quantity of interest to us is E_{r∈Ω_S}[ε_{q_2(r)}], which by uniformity equals E_{q_2∈S}[ε_{q_2}]. A straightforward application of the Chernoff Bound (see Footnote 2) shows that the probability that this quantity exceeds (3/2)·ε is exponentially vanishing in εN_1. Taking the union bound over all possible P_1's, we infer that the probability that there exist P_1, P_2 such that E_{r∈Ω_S}[Verdict(r, P_1(q_1(r)), P_2(q_2(r)))] > (3/2)·ε is at most exp(−Ω(εN_1))·|A|^{|Q_1|}. Thus, using N_1 = (c_1/ε)·(|Q_1| log |A| + log(1/γ)) for some absolute constant c_1, it follows that ω(V|_{S,Ω_S}) ≤ (3/2)·ε with probability at least 1 − γ/2 (over the choices of S). The lemma follows.⁹
⁸ But, the assertion does not hold for most (S, T)-sampled verifiers.

⁹ Indeed, we have ignored the effect of sampling Ω_S; that is, the relation of ω(V|_{S,Ω_S}) and ω(V|_{S,T}), for a random T ⊆ Ω_S of size N_2. As stated above, this part is standard. Fixing any S such that ω(V|_{S,Ω_S}) ≤ (3/2)·ε, we assume without loss of generality that ω(V|_{S,Ω_S}) ≥ ε. First, we fix any choice of P_1: Q_1 → A and P_2: S → A, and applying the Chernoff Bound (again) we infer that the probability that the restriction of Ω_S to T leads to acceptance with probability greater than (4/3)·ω(V|_{S,Ω_S}) is exp(−Ω(εN_2)). Taking the union bound over all choices of P_1 and P_2, we infer that ω(V|_{S,T}) > (4/3)·ω(V|_{S,Ω_S}) with probability at most exp(−Ω(εN_2))·|A|^{|Q_1|+|S|}. Thus, using N_2 = (c_2/ε)·(|S| log |A| + log(1/γ)), we conclude that ω(V|_{S,T}) ≤ (4/3)·ω(V|_{S,Ω_S}) ≤ 2ε with probability at least 1 − γ/2 (over the choices of T).
4.2. IMPROVED 3-PROVER PROOF SYSTEM FOR NP. We now define the more
general notion of a constant-prover one-round interactive proof system (MIP).
We actually extend the standard definition from languages to promise problems
(cf. Even et al. [1984] and Bellare et al. [1998]).
Definition 4.4. For positive reals c, s, integer p and functions r, a: Z⁺ → Z⁺, we say that a promise problem Π = (Π_YES, Π_NO) is in MIP_{c,s}[p, r, a] (or, Π has a p-prover one-round proof system with randomness r and answer length a) if there exists a probabilistic polynomial-time verifier V interacting with p provers P_1,…,P_p such that:

Operation. On input x of length n, the verifier tosses r(n) coins, generates queries q_1,…,q_p to provers P_1,…,P_p, obtains the corresponding answers a_1,…,a_p ∈ {0,1}^{a(n)}, and outputs a Boolean verdict that is a function of x, its randomness and the answers a_1,…,a_p.

Completeness. If x ∈ Π_YES, then there exist strategies P_1,…,P_p such that V accepts their response with probability at least c. If c = 1, then we say that V has perfect completeness.

Soundness. If x ∈ Π_NO, then for every sequence of prover strategies P_1,…,P_p, machine V accepts their response with probability at most s, which is called the soundness error.

If for every choice of the verifier's coins, its queries to P_i reside in a set Q_i, then we say that prover P_i has size |Q_i|.
Recall that a language L is captured by the promise problem (L, {0,1}*\L). Harsha and Sudan [2000] presented a randomness-efficient 3-prover one-round proof system for SAT, with answer length poly(log n). Specifically, their proof system has randomness complexity (3 + ε)·log_2 n, where ε > 0 is an arbitrary constant and n denotes the length of the input. Here, we reduce the randomness required by their verifier to (1 + o(1))·log n. Actually, we do not reduce the randomness complexity of the proof system for SAT, but rather present a randomized reduction of SAT to a problem for which we obtain a 3-prover one-round proof system with answer length poly(log n) and randomness complexity (1 + o(1))·log n. It is, of course, crucial that our reduction does not increase the length of the instance by too much. To capture this condition, we present a quantified notion of length-preserving reductions.
Definition 4.5. For a function ℓ: Z⁺ → Z⁺, a reduction is ℓ-length preserving if it maps instances of length n to instances of length at most ℓ(n).
Our key technical result is summarized as follows.

THEOREM 4.6 (RANDOM PROJECTION OF CERTAIN MIPS). Let m, ℓ: Z⁺ → Z⁺ be functions satisfying ℓ(n) = Ω(m(n)^{m(n)}·n^{1+(1/m(n))}) and m(n) ≥ 2. Then, for any constant ε > 0, SAT reduces in probabilistic polynomial time, under ℓ-length preserving reductions, to a promise problem in MIP_{1,ε}[3, r, a], where r(n) = (1 + 1/m(n))·log n + O(m(n) log m(n)) and a(n) = m(n)^{O(1)}·n^{O(1/m(n))}.
We comment that the reduction actually runs in time almost-linear in ℓ(n). Before proving Theorem 4.6, let us see a special case of it, obtained by setting m(n) = √(log n) (which is an approximately optimal choice).
596 O. GOLDREICH AND M. SUDAN
COROLLARY 4.7. For every μ > 0, SAT reduces in probabilistic polynomial time, under ℓ-length preserving reductions, to a promise problem in MIP_{1,μ}[3, r, a], where ℓ(n) = n^{1+O((log log n)/√(log n))}, r(n) = (1 + O((log log n)/√(log n))) · log_2 n and a(n) = 2^{O(√(log n))}.
In Section 4.3, we show how to apply state-of-the-art proof composition to the aforementioned MIPs in order to derive our main result (i.e., a PCP with similar randomness complexity using a constant number of queries).
Overview and Organization of the Proof of Theorem 4.6. The rest of Section 4.2 is devoted to proving Theorem 4.6. We start with an overview of the proof, which modifies the proof of Harsha and Sudan [2000], improving the latter in two points. The proof of Harsha and Sudan [2000] first reduces SAT to a parameterized problem, called GapPCS, under ℓ'(n)-length preserving reductions for ℓ'(n) = n^{1+γ}, for any γ > 0. Then they give a 3-prover MIP proof system for the reduced instance of GapPCS, where the verifier tosses (3 + γ) · log ℓ'(n) random coins.

Our first improvement shows that the reduction of Harsha and Sudan [2000] actually yields a stronger reduction than stated there, in two ways. First, we note that their proof allows for smaller values of ℓ'(n) than stated there, allowing in particular for the parameters we need; that is, we get ℓ'(n) = ℓ(n), where ℓ is as in Theorem 4.6. Furthermore, we notice that their result gives rise to instances from a restricted class, for which slightly more efficient proof systems can be designed. In particular, we can reduce the size of the smallest prover in their MIP system to ℓ(n) (as opposed to their result, which gives a prover of size ℓ(n)^{1+γ} for arbitrarily small γ). These improvements are stated formally in Appendix A (see Lemmas A.3 and A.4, yielding Theorem A.5).

The second improvement is more critical to our purposes. Here, we improve the randomness complexity of the MIP verifier of Harsha and Sudan [2000] by applying a random projection to it. In order to allow for a clean presentation of this improvement, we first abstract the verifier of Harsha and Sudan [2000] (or rather the one obtained from Theorem A.5). This is done in Section 4.2.1. We then show how to transform such a verifier into one with (1 + o(1)) · log n randomness. This transformation comes in three stages, described in Sections 4.2.2–4.2.4. The key stage (undertaken in Section 4.2.3) is a shrinking of the sizes of all provers to roughly ℓ(n).
4.2.1. Abstracting the Verifier of Theorem A.5. The verifier (underlying the proof) of Theorem A.5 interacts with three provers, which we denote P, P_1, and P_2. We let Q, Q_1, and Q_2 denote the question spaces of the three provers, respectively. Similarly, A, A_1, and A_2 denote the corresponding spaces of (prover) answers; that is, P: Q → A (respectively, P_1: Q_1 → A_1 and P_2: Q_2 → A_2). We denote by V_x(r, a, a_1, a_2) the acceptance predicate of the verifier on input x ∈ {0,1}^n, where r denotes the verifier's coins, and a (respectively, a_1, a_2) the answer of prover P (respectively, P_1, P_2). (Note: The value of V_x is 1 if the verifier accepts.) We will usually drop the subscript x unless needed. Let us denote by q(r) (respectively, q_1(r), q_2(r)) the verifier's query to P (respectively, P_1, P_2) on random string r ∈ Ω, where Ω denotes the space of the verifier's coins. We note that the following properties hold for the 3-prover proof system given by Theorem A.5 (cf. Section A.3).
Locally Testable Codes and PCPs of Almost-Linear Length 597
(1) Sampleability. The verifier only tosses O(log n) coins (i.e., Ω = {0,1}^{O(log n)}). Thus, it is feasible to sample from various specified subsets of the space of all possible coin outcomes. For example, given S_1 ⊆ Q_1, we can uniformly select in poly(n)-time a sequence of coins r such that q_1(r) ∈ S_1.
(2) Uniformity. The verifier's queries to prover P (respectively, P_1 and P_2) are uniformly distributed over Q (respectively, over Q_1 and Q_2); that is, q is |Ω|/|Q|-to-1 (respectively, q_i is |Ω|/|Q_i|-to-1).
(3) Decomposability. The acceptance predicate V decomposes in the sense that for some predicates V_1 and V_2 it holds that V(r, a, a_1, a_2) = V_1(r, a, a_1) ∧ V_2(r, a, a_2), for all r, a, a_1, a_2. Furthermore, for any constant ε > 0 (as in Theorem A.5), if x is a NO-instance, then for every possible P strategy there exists a subset Q' = Q'_P ⊆ Q such that for every P_1 and P_2 the following two conditions hold:

    Pr_r[q(r) ∈ Q' ∧ V_1(r, P(q(r)), P_1(q_1(r)))] < ε/2    (22)

    Pr_r[q(r) ∉ Q' ∧ V_2(r, P(q(r)), P_2(q_2(r)))] < ε/2    (23)

where V_1 and V_2 are the decomposition of V = V_x. Indeed, Pr_r[V(r, P(q(r)), P_1(q_1(r)), P_2(q_2(r))) = 1] < ε follows, but the current property says something much stronger.
The Decomposition Property plays a central role in the rest of our argument. Intuitively, it allows us to reduce the three-prover case to the two-prover case (treated in Section 4.1).
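To see why the two conditions of the Decomposition Property bound the overall acceptance probability, note that V_1 ∧ V_2 implies one of the two disjoint events of Eqs. (22) and (23). The following toy experiment (a sketch with made-up predicates and sets; nothing here comes from Theorem A.5) checks this inequality numerically:

```python
import random

random.seed(0)

# Toy universe: coin space Omega, query space Q, and a query map q(r).
Omega = range(1000)
Q = range(20)
qmap = {r: random.randrange(20) for r in Omega}

# Random Boolean "sub-predicates" V1(r), V2(r) (the prover answers are
# folded in, since the provers are fixed), and an arbitrary subset Q'.
V1 = {r: random.random() < 0.3 for r in Omega}
V2 = {r: random.random() < 0.3 for r in Omega}
Qprime = set(random.sample(list(Q), 8))

def prob(event):
    return sum(1 for r in Omega if event(r)) / len(Omega)

accept = prob(lambda r: V1[r] and V2[r])
term1 = prob(lambda r: qmap[r] in Qprime and V1[r])      # cf. Eq. (22)
term2 = prob(lambda r: qmap[r] not in Qprime and V2[r])  # cf. Eq. (23)

# V1(r) and V2(r) implies one of the two disjoint events above, so the
# acceptance probability never exceeds the sum of the two terms.
assert accept <= term1 + term2
print(accept, term1 + term2)
```

The assertion holds for every choice of the predicates and of Q', which is exactly why bounding each term by ε/2 yields overall soundness ε.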
4.2.2. The 3-Prover MIP: Stage I. The current stage is merely a preparation towards the next stage, which is the crucial one in our construction. The preparation consists of modifying the verifier of Theorem A.5 such that its queries to provers P_1 and P_2 are "independent" (given the query to the prover P). That is, we define a new verifier, denoted W, that behaves as follows:
Construction 4.8 (Verifier W). On input x, let V = V_x be the (original) verifier's predicate and let V_1 and V_2 be as given in the Decomposability Property.

(1) Pick q ∈ Q uniformly, and pick coins r_1 and r_2 uniformly and independently from the set Ω_q := {r : q(r) = q}.

(2) Make queries q (which indeed equals q(r_1) = q(r_2)), q_1 = q_1(r_1) and q_2 = q_2(r_2), to P, P_1 and P_2, respectively. Let a = P(q), a_1 = P_1(q_1) and a_2 = P_2(q_2) denote the answers received.

(3) Accept if and only if V_1(r_1, a, a_1) ∧ V_2(r_2, a, a_2).
In Step (1), we use the Sampleability Property (with respect to a specific set of r's). The analysis of W relies on the Uniformity Property, and more fundamentally on the Decomposition Property. We note that Construction 4.8 merely motivates the construction in Stage II, and thus the analysis of Construction 4.8 (captured by Proposition 4.9) is not used in the rest of the article (although it provides a good warm-up).
PROPOSITION 4.9. Verifier W has perfect completeness and soundness at most ε.
PROOF. The completeness is obvious, and so we focus on the soundness. Fix a NO-instance x and any choice of provers P, P_1 and P_2. By the Decomposition Property, the probability that W accepts is given by

    Pr_{q∈Q, r_1,r_2∈Ω_q}[EV_1(r_1) ∧ EV_2(r_2)]    (24)

where EV_1(r_1) := V_1(r_1, P(q), P_1(q_1(r_1))) and EV_2(r_2) := V_2(r_2, P(q), P_2(q_2(r_2))). Note that q = q(r_1) = q(r_2), where (q and) r_1, r_2 are selected as above. Thus, EV_i depends only on r_i, and the shorthand above is legitimate. Letting Q' = Q'_P be the subset of Q given by the Decomposition Property of the MIP, we upper-bound Eq. (24) by
    Pr_{q∈Q, r_1,r_2∈Ω_q}[q ∈ Q' ∧ EV_1(r_1) ∧ EV_2(r_2)] + Pr_{q∈Q, r_1,r_2∈Ω_q}[q ∉ Q' ∧ EV_1(r_1) ∧ EV_2(r_2)]
    ≤ Pr_{q∈Q, r_1∈Ω_q}[q ∈ Q' ∧ EV_1(r_1)] + Pr_{q∈Q, r_2∈Ω_q}[q ∉ Q' ∧ EV_2(r_2)]    (25)
By the Uniformity Property, the process of selecting r_1 (respectively, r_2) in Eq. (25) is equivalent to selecting it uniformly in Ω (and setting q = q(r_i)). We thus upper-bound (25) by

    Pr_{r_1∈Ω}[q(r_1) ∈ Q' ∧ EV_1(r_1)] + Pr_{r_2∈Ω}[q(r_2) ∉ Q' ∧ EV_2(r_2)].

Using the Decomposition Property, each of these two terms is bounded by ε/2, and thus their sum is upper-bounded by ε.
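The step that replaces the two-phase sampling by uniform sampling relies on the Uniformity Property: when q(r) is exactly |Ω|/|Q|-to-1, picking q uniformly and then r uniformly in Ω_q yields a uniform r. A minimal sketch over a toy coin space (all names illustrative) verifies this equivalence exactly:

```python
from fractions import Fraction
from collections import defaultdict

# Toy setting in which q(r) is exactly |Omega|/|Q|-to-1, as in the
# Uniformity Property (the maps are illustrative, not from the paper).
Omega = range(12)
Q = range(4)
qmap = {r: r % 4 for r in Omega}   # each query has exactly 3 preimages

# Sampling as in Construction 4.8: pick q uniformly, then r uniformly
# from Omega_q = {r : q(r) = q}. Compute the exact marginal of r.
marginal = defaultdict(Fraction)
for q in Q:
    Omega_q = [r for r in Omega if qmap[r] == q]
    for r in Omega_q:
        marginal[r] += Fraction(1, len(Q)) * Fraction(1, len(Omega_q))

# A balanced q(r) makes the two-step sampling equivalent to uniform r.
assert all(marginal[r] == Fraction(1, len(Omega)) for r in Omega)
print("marginal is uniform over Omega")
```

Exact rational arithmetic (rather than floats) makes the equivalence an identity rather than an approximation.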
4.2.3. The 3-Prover MIP: Stage II. In the next stage, which is the crucial one in our construction, we reduce the size of the provers P_1 and P_2 by a random projection. Specifically, we reduce the size of P_i from |Q_i| to |S_i|.
Construction 4.10 (The Projected W). For sets S_1 ⊆ Q_1 and S_2 ⊆ Q_2, we define the (S_1, S_2)-restricted verifier, denoted W_{S_1,S_2}, as follows: Again, on input x, let V = V_x be the verifier's predicate and let V_1 and V_2 be as given in the Decomposability Property.

(1) Pick q ∈ Q uniformly, and pick coins r_1 and r_2 uniformly and independently from the sets Ω^{(1)}_{q,S_1} := {r : q(r) = q ∧ q_1(r) ∈ S_1} and Ω^{(2)}_{q,S_2} := {r : q(r) = q ∧ q_2(r) ∈ S_2}, respectively. If either of the sets is empty, then the verifier simply accepts.
(2) Make queries q = q(r_1) = q(r_2), q_1 = q_1(r_1) and q_2 = q_2(r_2), to P, P_1 and P_2, respectively. Let a = P(q), a_1 = P_1(q_1) and a_2 = P_2(q_2) denote the answers received.

(3) Accept if and only if V_1(r_1, a, a_1) ∧ V_2(r_2, a, a_2).
Again, in the construction, we use the sampleability of various subsets of the verifier's coins, whereas we rely on the Uniformity and Decomposability Properties for the analysis. As in Construction 4.8, it is clear that the verifier W_{S_1,S_2} has perfect completeness (for every S_1 and S_2). We now bound the soundness of this verifier, for most choices of sufficiently large sets S_1 and S_2:
LEMMA 4.11. For randomly chosen sets S_1 and S_2, each of size N := O(ε^{-2} · |Q| · max{log |A|, log |Q|}), with probability at least 5/6, the soundness error of the verifier W_{S_1,S_2} is at most 4ε.
PROOF. We start with some notation. Recall that Ω denotes the space of random strings of the verifier V (of Section 4.2.1). For i ∈ {1, 2} and a fixed set S_i, let W_i denote the distribution on Ω induced by picking uniformly a query q ∈ Q, then picking r_i uniformly from the set Ω^{(i)}_{q,S_i}, and outputting r_i. Note that the verifier W_{S_1,S_2} picks r_1 (respectively, r_2) according to distribution W_1 (respectively, W_2), where r_1 and r_2 depend on the same random q ∈ Q. Similarly, let U_i denote the distribution on Ω induced by picking a random string r_i uniformly from the set ∪_{q∈Q} Ω^{(i)}_{q,S_i}; that is, U_i is the uniform distribution on {r : q_i(r) ∈ S_i}. Note that both W_i and U_i depend on S_i, but to avoid cumbersome notation we did not make this dependence explicit. Still, at times, we use W_i(S_i) (respectively, U_i(S_i)) to denote the distribution W_i (respectively, U_i) that is defined as above based on the set S_i.
We use the notation r ∼ D to denote that r is picked according to distribution D. In our analysis, we will show that, for a (sufficiently large) random S_i, the distributions U_i and W_i are statistically close, where as usual the statistical difference between U_i and W_i is defined as max_T |Pr_{r_i∼U_i}[r_i ∈ T] − Pr_{r_i∼W_i}[r_i ∈ T]|. We will then show that a modified verifier W'_{S_1,S_2} that picks r_1 and r_2 independently from the distributions U_1 and U_2, respectively, has low soundness error. We stress that, in contrast to W'_{S_1,S_2}, the verifier W_{S_1,S_2} selects r_1 and r_2 (from distributions W_1 and W_2) such that r_1 and r_2 are not independent (but rather depend on the same q ∈ Q). Still, as in the proof of Proposition 4.9, the Decomposition Property (of Section 4.2.1) allows the analysis to go through.
The above informal description is made rigorous by considering the following bad events, over the probability space defined by the random choices of S_1 and S_2:
BE1. The statistical difference between U_1(S_1) and W_1(S_1) is more than ε.

BE2. The statistical difference between U_2(S_2) and W_2(S_2) is more than ε.
BE3. There exist P and P_1 such that, for Q' = Q'_P (as in the Decomposition Property), the condition of Eq. (22) is strongly violated when selecting r_1 according to U_1(S_1) (rather than uniformly in Ω); that is,

    Pr_{r_1∼U_1(S_1)}[(q(r_1) ∈ Q') ∧ V_1(r_1, P(q(r_1)), P_1(q_1(r_1)))] > ε.
BE4. There exist P and P_2 such that, for Q' = Q'_P, the condition of Eq. (23) is strongly violated when selecting r_2 according to U_2(S_2); that is,

    Pr_{r_2∼U_2(S_2)}[(q(r_2) ∉ Q') ∧ V_2(r_2, P(q(r_2)), P_2(q_2(r_2)))] > ε.
Below, we will bound the probability of these bad events, when S_1 and S_2 are chosen at random. But first, we show that if none of the bad events occurs, then the verifier W_{S_1,S_2} has small soundness error.
CLAIM 4.11.1. If, for sets S_1 and S_2, none of the four bad events occurs, then the soundness error of W_{S_1,S_2} is at most 4ε.
PROOF. Let (r_1, r_2) ∼ W_{S_1,S_2} denote a random choice of the pair (r_1, r_2) as chosen by the verifier W_{S_1,S_2}. Fix provers P, P_1, P_2, and let Q' = Q'_P (and V_1, V_2) be as in the Decomposition Property. Then,
    Pr_{(r_1,r_2)∼W_{S_1,S_2}}[V_1(r_1, P(q(r_1)), P_1(q_1(r_1))) ∧ V_2(r_2, P(q(r_2)), P_2(q_2(r_2)))]
    ≤ Pr_{r_1∼W_1(S_1)}[(q(r_1) ∈ Q') ∧ V_1(r_1, P(q(r_1)), P_1(q_1(r_1)))]
      + Pr_{r_2∼W_2(S_2)}[(q(r_2) ∉ Q') ∧ V_2(r_2, P(q(r_2)), P_2(q_2(r_2)))]
    ≤ Pr_{r_1∼U_1(S_1)}[(q(r_1) ∈ Q') ∧ V_1(r_1, P(q(r_1)), P_1(q_1(r_1)))]
      + Pr_{r_2∼U_2(S_2)}[(q(r_2) ∉ Q') ∧ V_2(r_2, P(q(r_2)), P_2(q_2(r_2)))] + 2ε    [using ¬BE1 and ¬BE2]
    ≤ 2ε + 2ε    [using ¬BE3 and ¬BE4]

where the first inequality uses manipulation as in the proof of Proposition 4.9 (cf. Eq. (25)).
We now turn to upper-bound the probability of the bad events.
CLAIM 4.11.2. The probability of event BE1 (respectively, BE2) is at most 1/24.
PROOF. To estimate the statistical difference between U_i = U_i(S_i) and W_i = W_i(S_i), we take a closer look at the distribution U_i. We note that sampling r_i according to U_i is equivalent to selecting r'_i ∼ U_i (i.e., r'_i is selected uniformly in {r : q_i(r) ∈ S_i}), setting q = q(r'_i), and picking r_i uniformly from the set {r : (q(r) = q) ∧ (q_i(r) ∈ S_i)} = Ω^{(i)}_{q,S_i}. In contrast, in the distribution W_i, the output is selected uniformly in Ω^{(i)}_{q,S_i}, where q is selected uniformly in Q. Thus, the statistical difference between U_i and W_i is due to the statistical difference in the distributions induced on q = q(r_i), which in turn equals

    (1/2) · Σ_{q∈Q} |Pr_{r_i∼U_i(S_i)}[q(r_i) = q] − Pr_{r_i∼W_i(S_i)}[q(r_i) = q]| = (1/2) · Σ_{q∈Q} |Pr_{r_i∼U_i(S_i)}[q(r_i) = q] − 1/|Q||.
To bound this sum, we bound the contribution of each of its terms (for a random S_i of size N). Fixing an arbitrary q ∈ Q, we consider the random variable

    ζ_q = ζ_q(S_i) := Pr_{r∼U_i(S_i)}[q(r) = q] = |{r : (q(r) = q) ∧ (q_i(r) ∈ S_i)}| / |{r : q_i(r) ∈ S_i}|

(as a function of the random choice of S_i of size N). Using the Uniformity Property, we infer that the denominator equals N · |Ω|/|Q_i|, and the expected value of the numerator equals |{r : q(r) = q}| · N/|Q_i| = (|Ω|/|Q|) · (N/|Q_i|). Thus, E[ζ_q] = 1/|Q|. A simple application of the Chernoff Bound (see Footnote 2) shows that, with probability at least 1 − exp(−Ω(ε² · N/|Q|)), this random variable is (1 ± ε)/|Q|. Thus, for N = c · |Q| · log |Q| (where c = O(1/ε²)), the probability that Pr_{r∼U_i}[q(r) = q] is not in [(1 ± ε)/|Q|] is at most |Q|^{−1}/24. By the union bound, the probability that such a q exists is at most 1/24, and if no such q exists, then the statistical difference is bounded by at most ε.
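The concentration argument above can be illustrated empirically. The sketch below (toy parameters; the maps q and q_1 are artificial stand-ins for those of the abstracted verifier) samples a random set S_1, restricts to the coins surviving the projection, and measures the statistical difference between the induced distribution on q and the uniform one:

```python
import random

random.seed(1)

# Toy parameters (illustrative only): balanced maps q(r) and q_1(r).
M, nQ, nQ1, N = 10_000, 10, 100, 30
qmap  = [r % nQ  for r in range(M)]
q1map = [r % nQ1 for r in range(M)]
random.shuffle(qmap)
random.shuffle(q1map)

# Random projection: keep only coins whose q_1-query lands in S_1.
S1 = set(random.sample(range(nQ1), N))
kept = [r for r in range(M) if q1map[r] in S1]

# Distribution induced on q = q(r_1) when r_1 ~ U_1(S_1) (uniform on
# `kept`), compared with the uniform distribution on Q (the W_1 marginal).
counts = [0] * nQ
for r in kept:
    counts[qmap[r]] += 1
stat_diff = 0.5 * sum(abs(c / len(kept) - 1 / nQ) for c in counts)

# For N = O(|Q| log |Q|) random choices, the difference is small w.h.p.
assert stat_diff < 0.5
print(f"statistical difference ~ {stat_diff:.3f}")
```

The observed difference shrinks as N grows, in line with the exp(−Ω(ε²·N/|Q|)) per-query tail bound used in the proof.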
CLAIM 4.11.3. The probability of event BE3 (respectively, BE4) is at most
1/24.
PROOF. We will bound the probability of the event BE3; the analysis for BE4 is identical. Both proofs are similar to the proof of Lemma 4.3 (i.e., projection in the two-prover case). Indeed, our interest in the Decomposition Property is motivated by the fact that it allows for a reduction of the three-prover case to the two-prover case. This reduction culminates in the current proof, which refers only to the communication with two provers (i.e., P and P_i).
Fix P, and let Q' = Q'_P be the set given by the Decomposition Property (of Section 4.2.1). We will show that, for a randomly selected subset S_1 ⊆ Q_1 of size N, the following holds:

    Pr_{S_1}[∃P_1 such that Pr_{r_1∼U_1(S_1)}[(q(r_1) ∈ Q') ∧ V_1(r_1, P(q(r_1)), P_1(q_1(r_1)))] > ε] ≤ (1/24) · |A|^{−|Q|}    (26)

The claim will follow by a union bound over the |A|^{|Q|} possible choices of P.
Note that, for each fixed P (and thus fixed Q' = Q'_P), there is an optimal prover P_1 = P*_1 that maximizes the quantity p_{q_1} := Pr_{r : q_1(r)=q_1}[(q(r) ∈ Q') ∧ V_1(r, P(q(r)), P_1(q_1))], for every q_1 ∈ Q_1. Furthermore, by (the Uniformity Property and) the Decomposition Property (see Eq. (22)), it holds that E_{q_1∈Q_1}[p_{q_1}] = E_{r∈Ω}[p_{q_1(r)}] < ε/2. For simplicity, assume that the expectation is at least ε/3 (by possibly augmenting the event that defines p_{q_1}). Applying the Chernoff Bound (see Footnote 2), we get that the probability that, when we pick N elements from Q_1 uniformly and independently, their average is more than ε (or even more than twice the expectation) is at most exp(−Ω(εN)). Thus, if N ≥ c · |Q| · log |A| for some large enough constant c (depending on ε), then this probability is at most (1/24) · |A|^{−|Q|}, as claimed in Eq. (26). The current claim follows.
Combining Claims 4.11.2 and 4.11.3, we conclude that, for random sets S_1 and S_2, with probability at most 4/24 some bad event occurs; that is, with probability at least 5/6, none of the four BEi's occurs. Invoking Claim 4.11.1, the lemma follows.
4.2.4. The 3-Prover MIP: Stage III. Having reduced the sizes of the three prover strategies, it is straightforward to reduce the amount of randomness used by the verifier. Below, we describe a reduced-randomness verifier W_{S_1,S_2,T}, where S_i ⊆ Q_i for i = 1, 2, and T is a set of pairs of coin sequences (see Eq. (27)).
Construction 4.12 (The Final Verifier W_{S_1,S_2,T}). For sets S_1 ⊆ Q_1 and S_2 ⊆ Q_2, and

    T ⊆ {(r_1, r_2) : (q(r_1) = q(r_2)) ∧ (q_i(r_i) ∈ S_i for i ∈ {1, 2})},    (27)

we define the (S_1, S_2, T)-restricted verifier, denoted W_{S_1,S_2,T}, as follows: Again, on input x, let V = V_x, V_1 and V_2 be as given in Construction 4.10.

(1) Pick (r_1, r_2) ∈ T uniformly at random.
(2) Make queries q = q(r_1) = q(r_2), q_1 = q_1(r_1) and q_2 = q_2(r_2), to P, P_1 and P_2, respectively. Let a = P(q), a_1 = P_1(q_1) and a_2 = P_2(q_2) denote the answers received.

(3) Accept if and only if V_1(r_1, a, a_1) ∧ V_2(r_2, a, a_2).
Here, again, we use the sampleability of subsets of the verifier's coins. It is obvious that the verifier uses log_2 |T| random bits, and that perfect completeness is preserved (for any S_1, S_2 and T). It is also easy to see that a sufficiently large random set T yields W_{S_1,S_2,T} of low soundness error; that is:
LEMMA 4.13. Let s := O(ε^{-2} · |Q| · max{log |A|, log |Q|}) and t := O(ε^{-1} · (|Q| · (log |A|) + s · ((log |A_1|) + (log |A_2|)))). Suppose that S_1 and S_2 are uniformly selected s-subsets of Q_1 and Q_2, and that T is a uniformly selected t-subset satisfying Eq. (27). Then, with probability at least 2/3, the verifier W_{S_1,S_2,T} has soundness error at most 5ε.
PROOF. By Lemma 4.11, with probability 5/6, the verifier W_{S_1,S_2} has soundness error at most 4ε. Using the Uniformity Property, W_{S_1,S_2} can be seen as selecting (r_1, r_2) uniformly in the set on the right-hand side of Eq. (27), and setting q = q(r_1) = q(r_2). It is quite straightforward^{10} to show that, for a random T, with probability at least 5/6, the resulting W_{S_1,S_2,T} has soundness error at most 5ε. The lemma follows.
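The setting of t in Lemma 4.13 (detailed in Footnote 10) can be checked by direct arithmetic: t must be large enough for exp(−Ω(εt)) to beat the union bound over all prover strategies. The sketch below uses toy sizes and a concrete constant in place of the O-notation (both are illustrative assumptions, not the paper's parameters):

```python
import math

# We need |A|^|Q| * |A1|^|S1| * |A2|^|S2| * exp(-eps*t) < 1/6 (Footnote 10).
Q_size, A_size = 1000, 2**20              # prover P: |Q| questions, |A| answers
s, A1_size, A2_size = 5000, 2**15, 2**15  # |S1| = |S2| = s
eps = 0.01

# log of the number of prover-strategy triples (P, P_1, P_2).
log_strategies = (Q_size * math.log(A_size)
                  + s * (math.log(A1_size) + math.log(A2_size)))

# t = O((1/eps) * (|Q| log|A| + s (log|A1| + log|A2|))) suffices;
# take a concrete constant for the sketch.
t = math.ceil((log_strategies + math.log(6)) / eps) + 1

failure_log = log_strategies - eps * t   # log of the union-bound expression
assert failure_log < math.log(1 / 6)
print(f"t = {t}, log failure bound = {failure_log:.2f}")
```

Since s·(log|A_1| + log|A_2|) dominates the sum, t is, up to the ε^{-1} factor, proportional to s times the answer lengths, matching the lemma's statement.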
Using Lemma 4.13, we now prove Theorem 4.6.
PROOF OF THEOREM 4.6. Fix ε' = ε/5. Let V be the 3-prover verifier for SAT as obtained from Theorem A.5. In particular, V has perfect completeness and soundness ε'. The size of the smallest prover is ℓ'(n) = m(n)^{O(m(n))} · n^{1+O(1/m(n))}, the answer length is bounded by a'(n) = m(n)^{O(1)} · n^{O(1/m(n))}, and V satisfies the properties listed in Section 4.2.1.

For sets S_1, S_2, T, let W_{S_1,S_2,T} be the verifier obtained by modifying V as described in the current section (see Constructions 4.8, 4.10 and 4.12). Consider the promise problem Π whose instances are tuples (φ, S_1, S_2, T), where an instance is a YES-instance if W_{S_1,S_2,T} accepts φ with probability one, and the instance is a NO-instance if W_{S_1,S_2,T} accepts with probability at most ε. We note that an instance of Π of size N > n has a 3-prover proof system using at most log_2 N random coins, having answer length a'(n) < a'(N), perfect completeness and soundness error 5ε' = ε (since W_{S_1,S_2,T} is such a verifier).
Now, consider the reduction that maps an instance φ (of length n) of SAT to the instance (φ, S_1, S_2, T), where S_1 ⊆ Q_1 and S_2 ⊆ Q_2 are random subsets of queries of V of size s'(n) = ℓ'(n) · a'(n), and T is a random subset of size t'(n) = s'(n) · a'(n) of the random strings used by the verifier W_{S_1,S_2} (see Constructions 4.10 and 4.12). This reduction always maps satisfiable instances of SAT to YES-instances of Π and, by Lemma 4.13, with probability at least 2/3, it maps unsatisfiable instances of SAT to NO-instances of Π. Finally, note that

    |(φ, S_1, S_2, T)| = Õ(t'(n)) = Õ(ℓ'(n) · a'(n)²) = m(n)^{O(m(n))} · n^{1+O(1/m(n))} = ℓ(n).

The theorem follows.
4.3. REDUCING THE ANSWER SIZE AND OBTAINING PCPS. Applying state-of-the-art composition lemmas to the MIP constructed in Section 4.2 gives our final results quite easily. In particular, we use the following lemmas.
10. The proof proceeds along the outline provided in Footnote 9 (to the proof of Lemma 4.3). First, fixing any choice of strategies P: Q → A, P_1: S_1 → A_1 and P_2: S_2 → A_2, we consider the event that W_{S_1,S_2,T} accepts with probability greater than 5ε (when interacting with these strategies). Using the Chernoff Bound (again), we see that for a random T this event occurs with probability at most exp(−Ω(εt)). Taking the union bound over all possible strategies (i.e., choices of P, P_1 and P_2), we infer that W_{S_1,S_2,T} fails with respect to some choice of strategies with probability at most |A|^{|Q|} · |A_1|^{|S_1|} · |A_2|^{|S_2|} · exp(−Ω(εt)) < 1/6 (by the setting of t).
LEMMA 4.14 (CF. ARORA AND SUDAN [2003] OR BELLARE ET AL. [1993] AND RAZ AND SAFRA [1997]). For every ε > 0 and integer p, there exists δ > 0 such that for every r, a: Z^+ → Z^+,

    MIP_{1,δ}[p, r, a] ⊆ MIP_{1,ε}[p + 3, r + O(log a), poly(log a)].
Starting with the MIP of Theorem 4.6, we apply Lemma 4.14 repeatedly until the answer lengths become poly(log log log n). Then, to terminate the recursion, we use the following result of Harsha and Sudan [2000].
LEMMA 4.15 (HARSHA AND SUDAN [2000, LEM. 2.6]). For every ε > 0 and integer p, there exists γ > 0 such that for every r, a: Z^+ → Z^+,

    MIP_{1,γ}[p, r, a] ⊆ PCP_{1, (1/2)+ε}[r + O(2^{pa}), p + 7],

where PCP_{1,s}[r, q] denotes the set of promise problems having a PCP verifier of perfect completeness, soundness error s, randomness complexity r and query complexity q.
Recall that in the case of PCPs the query complexity is measured in bits. Combining the above lemmas with the nearly linear 3-prover systems obtained in Section 4.2, we obtain:

THEOREM 4.16 (OUR MAIN PCP RESULT). For every ε > 0, SAT reduces probabilistically, under n^{1+O((log log n)/√(log n))}-length preserving reductions, to a promise problem in PCP_{1, (1/2)+ε}[(1 + O((log log n)/√(log n))) · log n, 19]. Furthermore, the reduction runs in time almost-linear in the length of its output.
PROOF. We start with Corollary 4.7 and apply Lemma 4.14 thrice, obtaining a 12-prover MIP system with answer lengths poly(log log log n). Specifically, we start with a 3-prover MIP of randomness complexity r_0(n) = log_2 n + O((log log n) · √(log n)) and answer length a_0(n) = exp(O(√(log n))), and after i iterations we get a (3 + 3i)-prover MIP of randomness complexity r_i(n) = r_{i−1}(n) + O(log a_{i−1}(n)) and answer length a_i(n) = poly(log a_{i−1}(n)). Thus, a_i(n) = poly(log^{(i)} n) and r_i(n) = r_0(n) + O(log a_0(n)) = log_2 n + O((log log n) · √(log n)).

Next, applying Lemma 4.15 gives the desired 19-query PCP. Specifically, this (12 + 7)-query PCP has randomness complexity r_3(n) + poly(2^{a_3(n)}) < r_3(n) + 2^{(log log n)/2}, which is upper-bounded by log_2 n + O((log log n) · √(log n)). The furthermore clause follows by recalling that Corollary 4.7 (which is merely an instantiation of Theorem 4.6) is proven using a reduction that appends ℓ(n) random bits to the original instance (see the proof of Theorem 4.6).
COROLLARY (THEOREM 2.5). Theorem 4.16 implies Theorem 2.5, because the (effective) length of the oracle used by a PCP[r, q] system is at most 2^r · q.
5. Shorter Locally Testable Codes from PCPs

In this section, we strengthen the results of Section 3 by presenting locally testable binary codes of nearly linear length (i.e., n = k^{1+o(1)} rather than n = k^{1+ε}, for any constant ε > 0, as in Section 3.4). We do so by starting with the random projection of the FS/RS-code from Section 3.2, and applying PCP techniques to reduce the alphabet size (rather than following the paradigm of concatenated codes as done in the rest of Section 3). Specifically, in addition to encoding individual alphabet symbols via codewords of a binary code, we also augment the new codewords with small PCPs that allow us to emulate the local tests of the original codeword tester. Using an off-the-shelf PCP (e.g., the one of Arora et al. [1998]), this yields a weak locally testable code (i.e., one satisfying Definition 2.1); for details, see Section 5.1. As we explain in Section 5.2, using an off-the-shelf PCP fails to provide a locally testable code (i.e., one satisfying Definition 2.2), and some modifications are required in order to redeem this state of affairs (as well as in order to obtain a linear code). Most of the current section is devoted to implementing these modifications. Still, the easy derivation of the weak testability result (in Section 5.1) serves as a good warm-up.
Organization. After presenting (in Section 5.1) the weak codeword testing result, and discussing (in Section 5.2) the difficulties encountered when trying to obtain a strong codeword testing result, we turn to establishing the latter (i.e., proving Theorem 2.3). We start by developing (in Section 5.3) a framework for PCPs with extra properties that are useful to our goal of using these PCPs in the construction of locally testable codes. We call the reader's attention to Section 5.3.1, which provides a wider perspective that may be of independent interest. We then construct such PCPs (by modifying known constructions, in Section 5.4), and combine all the ingredients to establish Theorem 2.3 (in Section 5.5). Finally, in Section 5.6, we consider the actual randomness and query complexities of codeword testers, and show that logarithmic randomness and three queries suffice (for establishing our main results).
5.1. EASY DERIVATION OF A WEAK TESTABILITY RESULT. We start with the locally testable code C_R: Σ^k → Σ^n, where n = |R|, presented in Section 3.2. Recall that a codeword of C_R assigns to each line ℓ ∈ R a univariate polynomial of low degree (represented as a Σ-symbol, where Σ = F^{d+1}). We refer to the codeword test of Construction 3.2, which works by selecting a pair of intersecting lines and checking that the two polynomials assigned to these lines agree on the value of their point of intersection. We wish to convert C_R, which is a code over a large alphabet, into a binary code that is locally testable and preserves the distance and rate of C_R.
The basic idea is to augment C_R with small PCPs, each corresponding to a pair of intersecting lines that can be selected by the C_R-tester, such that each PCP asserts that the corresponding two polynomials (i.e., the two polynomials residing at the locations associated with these two lines) agree on the value of the point of intersection. Each such PCP has length polynomial in the length of its assertion, which in turn has length 2 · log_2 |Σ|, and can be verified using a constant number of queries (see, e.g., Arora et al. [1998]). Assuming that R covers all points almost uniformly (see Claim 3.3.1), we note that the number of pairs of intersecting lines that can be selected by the C_R-tester (of Construction 3.2) is approximately |R| · |F|, where |F| = Õ(log |Σ|). Thus, the total length of the proofs that we need to add to the code is at most a poly(log |Σ|) factor larger than n, which is fine under an adequate choice of parameters (discussed below). Essentially, the tester for the new code will emulate the old codeword tester by invoking the PCP verifier, which in turn accesses only a constant number of bits in the adequate proof.
The main problem with the above description is that the PCP verifier needs to be given (explicitly) the assertion it verifies, whereas we are only willing to read a constant number of bits (both of the assertion and of the corresponding proof). Still, all standard PCP constructs (e.g., Feige et al. [1996] and Arora et al. [1998]) can be extended to yield similar results in case one is charged for oracle access to both the input and the proof-oracle, provided that the input is presented in a suitable error-correcting format. Actually, this property is stated explicitly in Babai et al. [1991a],^{11} and is always referred to when using the PCP as an "inner verifier" (in the setting of PCP composition). Furthermore, these (inner) PCPs can also handle an input that is presented by a constant number of encodings of substrings that cover the entire input. Indeed, we are using the PCP here as an inner verifier (but compose it with a codeword tester rather than with an outer verifier). Lastly, we should replace each symbol in the C_R-codeword by its encoding under a suitable (for the inner verifier) code C': Σ → {0,1}^{poly(log |Σ|)} of linear distance. This allows us to verify that two substrings provide the encodings of two (low-degree) polynomials that agree on a certain point, by making a constant number of (bit) queries. (Needless to say, it is only guaranteed that the verifier rejects with high probability when the two substrings are far from having the desired property, which suffices for our purposes.)
A last issue regarding the code construction is that we should apply a suitable number of repetitions to the resulting n-sequence (of C'-codewords) such that its length dominates the length of the added PCPs (denoted L). Recall that the number of PCPs equals the size of the ("effective") probability space of the codeword tester of C_R (given in Construction 3.2),^{12} which in turn equals |R| · |F| = |F| · n. The size of each proof is polynomial in the length of the assertion, which in turn consists of two C'-codewords, each of length n' := poly(log |Σ|), where Σ = F^{d+1} and d < |F|. Thus, the total length of the added PCPs is approximately

    L := (|F| · n) · poly(2n') = poly(|F|) · n = n^{1+O(1/m)},    (28)

because n = |R| > |F|^m and log |Σ| = (d + 1) · log |F| = Õ(|F|) (using d < |F|). Since the length of each PCP is greater than n', it follows that L is bigger than n · n', and so repetitions are indeed needed to make the (concatenated) code dominate the length of the final code. On the other hand, L is not too large (i.e., L = n^{1+O(1/m)}), and so the repetition will not affect the rate of the code by too much. This yields the following construction:
following construction:
Construction 5.1. For a suitable number of repetitions t, the resulting code maps x = (x_1,...,x_k) ∈ Σ^k to ((C'(y_1),...,C'(y_n))^t, π_1,...,π_r), where (y_1,...,y_n) = C_R(x_1,...,x_k) and π_i is a PCP (to be further discussed) that refers to the ith possible choice of a pair of lines examined by the C_R-tester, and r denotes the number of such possible choices. Specifically, we set t = ω(L/(n · n')); for example, t = (L/(n · n')) · log n. As for the π_i's, they are PCPs that establish that the corresponding C'-codewords in the first block of n · n' bits in the new codeword encode
11. In fact, the presentation of Babai et al. [1991a] is in these terms, as captured by their notion of a holographic proof. We mention that the recently introduced notion of a PCP of Proximity [Ben-Sasson et al. 2004] (a.k.a. Assignment Tester [Dinur and Reingold 2004]) generalizes holographic proofs by omitting the reference to the encoding (of inputs via a good error-correcting code).
12
Recall that this tester uniformly selects a point in F
m
and a line in R going through this point. The
effective probability (relevant for the following construction) is the number of possible choices of
such (point and line) pairs, which equals |R|·|F|.
606 O. GOLDREICH AND M. SUDAN
Σ-symbols that would have been accepted by the codeword test of C_R. In particular,
these PCPs establish that the corresponding n′-bit long strings are C′-codewords.
Indeed, for i = (x, ℓ), the proof π_i refers to the lines ℓ_x and ℓ, where ℓ_x is the
canonical line of x ∈ F^m and ℓ is a random line (in R) that passes through x (i.e.,
one of approximately |R|/|F|^{m−1} possibilities). This proof (i.e., π_i) asserts that the
two n′-bit long strings in the locations corresponding to ℓ_x and ℓ are C′-codewords that
encode two polynomials, denoted h_x and h, that satisfy h_x(α) = h(β), where α
and β are determined by i = (x, ℓ) such that ℓ_x(α) = ℓ(β) = x. In the sequel,
we will identify the index of these PCPs with the corresponding pair of lines (i.e.,
i ≡ (ℓ_x, ℓ)).
By our choice of t, the distance of the new code (of Construction 5.1) is determined
by the distance of C_R (and the distance of C′). The block-length of the
new code is N def= (1 + log n) · L, where (by Eq. (28)) L = n^{1+O(1/m)}. Using
d = m^m, we have m > √(log k)/(log log k) (by Eq. (2)). Furthermore, by Eq. (3),
we have n = exp(Õ(√log k)) · k and log |Σ| = exp(Õ(√log k)). Thus, we have
N = Õ(L) = n^{1+O(1/m)}. Note that

  n^{1+O(1/m)} = (exp(Õ(√log k)) · k)^{1+O((log log k)/√log k)} = exp(Õ(√log k)) · k.

Thus, the code of Construction 5.1 maps K = k · log_2 |Σ| > k bits to N-bit long
codewords, where N = exp(Õ(√log K)) · K.
The tester for the code of Construction 5.1 emulates the testing of C_R by inspecting
the PCP that refers to the selected pair of lines. In addition, it also tests
(at random) that the first t blocks (of length nn′ each) are identical. A specific
implementation of this scheme follows.
Construction 5.2 (Weak Codeword Tester for Construction 5.1). When testing
w = (w_1,...,w_{tn}, w_{tn+1},...,w_{tn+r}), where w_i : [n′] → {0,1} for
i = 1,...,tn and w_i : [L/r] → {0,1} for i = tn+1,...,tn+r, proceed
as follows.

(1) Invoke the C_R-tester in order to select a random pair of intersecting lines (ℓ_1, ℓ_2).
That is, (ℓ_1, ℓ_2) is distributed as in Step (1) of Construction 3.2.

(2) Invoke the PCP-verifier, providing it with oracle access to the input-oracles w_{ℓ_1}
and w_{ℓ_2} and the proof-oracle w_{tn+i}, where i ≡ (ℓ_1, ℓ_2). If the verifier rejects,
then halt and reject; otherwise continue.

(3) Check that w_{jn+ℓ} = w_ℓ, for uniformly selected j ∈ [t−1] and ℓ ∈ [n], by
comparing w_{jn+ℓ}(i) = w_ℓ(i) for a randomly selected i ∈ [n′]. If equality holds,
then accept.
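The three steps above can be sketched schematically. The following is a minimal sketch, not the actual tester: the line-sampling routine `select_lines`, the predicate `pcp_verify`, and the flat data layout are all hypothetical stand-ins for the objects of Constructions 3.2 and 5.1.

```python
import random

def weak_codeword_test(w_blocks, w_proofs, t, n, select_lines, pcp_verify,
                       rng=random):
    # w_blocks: the t*n alleged C'-codewords (strings of length n');
    # w_proofs: dict mapping a pair of line indices to a proof string.
    # Step (1): select a random pair of intersecting lines, as the C_R-tester.
    l1, l2 = select_lines(rng)
    # Step (2): run the PCP verifier on the two designated C'-words and on
    # the proof indexed by the selected pair; reject if it rejects.
    if not pcp_verify(w_blocks[l1], w_blocks[l2], w_proofs[(l1, l2)]):
        return False
    # Step (3): repetition check -- a random position of a random block in a
    # random later copy must agree with the corresponding first-copy block.
    j = rng.randrange(1, t)                # copy index j in [t-1]
    pos = rng.randrange(n)                 # block index within a copy
    i = rng.randrange(len(w_blocks[pos]))  # position inside the n'-bit block
    return w_blocks[j * n + pos][i] == w_blocks[pos][i]
```

For instance, a word whose later copies disagree everywhere with the first copy is rejected by Step (3) regardless of the coin tosses.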
Clearly, Construction 5.2 makes a constant number of queries and accepts every
codeword of Construction 5.1. We thus turn to analyze this test's performance on
noncodewords. The key point in the following (relatively easy) analysis is that, if a
sequence is far from the new code, then most of the distance must be due to the tnn′-bit
long prefix of the N-bit sequence, because L = N − tnn′ < N/log n. That is, if
w = (w_1,...,w_{tn}, w_{tn+1},...,w_{tn+r}) is δ-far from the code, then (w_1,...,w_{tn})
must be δ′-far from the code (denoted E′ in Lemma 5.3) that consists of the tnn′-bit
long prefix of the code described in Construction 5.1, where δ′ ≥ δ − (1/log n).
Thus, for any constant δ > 0, or even any δ > 2/log n, we may focus on
analyzing the case that the tnn′-bit long prefix of w is δ/2-far from the residual
code (obtained by ignoring the PCP part of Construction 5.1). We undertake this
task next.
LEMMA 5.3. Let E′ be the code obtained by projecting the code described
in Construction 5.1 on the first tnn′ coordinates; that is, E′(x) is the tnn′-bit
long prefix of the encoding of x by Construction 5.1. Suppose that w′ =
(w_1,...,w_{tn}) ∈ ({0,1}^{n′})^{tn} is δ′-far from E′. Then, when given oracle access to
(w_1,...,w_{tn}, w_{tn+1},...,w_{tn+r}), the tester of Construction 5.2 rejects with probability
Ω(δ′), regardless of the values of w_{tn+1},...,w_{tn+r} ∈ {0,1}^{L/r}.

It follows that if w is δ-far from the code of Construction 5.1, then w is rejected
by Construction 5.2 with probability Ω(δ − (1/log n)).
PROOF. Let us denote by ε the average (relative) distance of (w_1,...,w_n) from
(w_{jn+1},...,w_{(j+1)n}), for a random j ∈ [t−1]. Let E be the code obtained by taking
the first nn′ bits of the code E′; that is, the bits corresponding to (w_1,...,w_n).
We observe (see Proposition 5.5 at the end of this section) that either ε ≥ δ′/2 or
(w_1,...,w_n) is (δ′/2)-far from the code E. Noting that the first case is detected
with probability ε by the (“repetition”) test of Step (3), we focus on the second case
and consider what happens when invoking the PCP verifier.

Bearing in mind that (w_1,...,w_n) is (δ′/2)-far from E, let us denote by δ_i the
(relative) distance of w_i from C′. We distinguish two cases, regarding the average
of the δ_i's:

Case 1. If Σ_{i=1}^n δ_i/n ≥ δ′/4, then the PCP verifier will reject with probability
Ω(δ′). The reason is that the second query of the C_R-tester (i.e., the random line
passing through a random point) is almost uniformly distributed, and so the PCP
verifier will be invoked on a pair of input-oracles such that, on the average, the
second input-oracle is (δ′/8)-far from the code C′, where the average is taken
over this (slightly skewed) choice of the second line (which is the choice used
in Construction 5.2). In such a case, the PCP verifier will reject with probability
Ω(δ′/8).

Case 2. If Σ_{i=1}^n δ_i/n ≤ δ′/4, then we consider the C′-codewords, denoted
c_i's, that are closest to these w_i's. In the current case, (c_1,...,c_n) is (δ′/4)-far
from E, because (w_1,...,w_n) is (δ′/2)-far from E. Let d_i be the C′-decoding of
c_i (i.e., c_i = C′(d_i)). Then, (d_1,...,d_n) is (δ′/4)-far from the code C_R, and would
have been rejected by the C_R-tester with probability p def= Ω(δ′/4).
Let us call a pair of lines (ℓ_1, ℓ_2) good if the C_R-tester would have rejected the
values d_{ℓ_1} and d_{ℓ_2}. By the above, with probability p, the C_R-tester selects a good
pair of lines. On the other hand, for a good pair of lines (ℓ_1, ℓ_2), when given access
to the input-oracles c_{ℓ_1} and c_{ℓ_2} (and any proof-oracle), the PCP verifier rejects
with constant probability. We need, however, to consider what happens when the
PCP verifier is given access to the input-oracles w_{ℓ_1} and w_{ℓ_2} (and the proof-oracle
w_{tn+(ℓ_1,ℓ_2)}), when (ℓ_1, ℓ_2) is a good pair. In the rest of this proof, we show that,
for a good (ℓ_1, ℓ_2), the PCP verifier rejects the input-oracles w_{ℓ_1} and w_{ℓ_2} with
constant probability. This happens regardless of whether or not (w_{ℓ_1}, w_{ℓ_2}) is close
to (c_{ℓ_1}, c_{ℓ_2}).
Letting δ_{C′} denote the constant relative distance of C′, we consider two sub-cases:

(1) If both w_{ℓ_i}'s are (δ_{C′}/4)-close to the corresponding c_{ℓ_i}'s, then (w_{ℓ_1}, w_{ℓ_2}) is
(δ_{C′}/4)-far from any pair of acceptable strings, and the PCP verifier rejects
the input-oracles w_{ℓ_1} and w_{ℓ_2} with constant probability (i.e., Ω(δ_{C′}/4) = Ω(1)).
The reason that (w_{ℓ_1}, w_{ℓ_2}) is (δ_{C′}/4)-far from any acceptable pair of strings
is due to the fact that the latter are pairs of codewords and the code has relative
distance δ_{C′}. Specifically, if (c̄_{ℓ_1}, c̄_{ℓ_2}) is a pair of acceptable codewords, then
(c̄_{ℓ_1}, c̄_{ℓ_2}) ≠ (c_{ℓ_1}, c_{ℓ_2}) and

  Δ((w_{ℓ_1}, w_{ℓ_2}), (c̄_{ℓ_1}, c̄_{ℓ_2})) ≥ Δ((c_{ℓ_1}, c_{ℓ_2}), (c̄_{ℓ_1}, c̄_{ℓ_2})) − Δ((w_{ℓ_1}, w_{ℓ_2}), (c_{ℓ_1}, c_{ℓ_2}))
                                ≥ δ_{C′} · n′ − 2 · (δ_{C′}/4) · n′,

which equals (δ_{C′}/4) · 2n′.
(2) Otherwise (i.e., some w_{ℓ_i} is (δ_{C′}/4)-far from the corresponding c_{ℓ_i}, which by
definition is the codeword closest to w_{ℓ_i}), one of the input-oracles is (δ_{C′}/4)-far
from being a codeword, and again the PCP verifier rejects with constant
probability.

We conclude that, for a good pair (ℓ_1, ℓ_2), when given access to the input-oracles
w_{ℓ_1} and w_{ℓ_2}, the PCP verifier rejects with constant probability (regardless of the
contents of the proof-oracle). Recalling that a good pair is selected with probability
p = Ω(δ′), it follows that in this case (i.e., Case 2) the PCP verifier rejects with
probability Ω(δ′).

The lemma follows.
Combining all the above, we obtain:
THEOREM 5.4 (WEAK VERSION OF THEOREM 2.3). For infinitely many K's,
there exist weak locally testable binary codes of length N = exp(Õ(√log K))·K =
K^{1+o(1)} and constant relative distance.
In contrast to Theorem 2.3, the codes asserted in Theorem 5.4 only have weak
codeword tests (i.e., tests satisfying Definition 2.1). Furthermore, these codes are
not necessarily linear.
Digression on Distances in Repetition Codes. In the proof of Lemma 5.3, we
noted that the distance of a string from a code obtained by repeating some
basic code can be attributed (in half) either to the distance of the first block from
the basic code or to the distance of the other blocks from the first block. Here
we state a more general result, which suggests that, for any probability distribution
(p_1,...,p_t), we may test a “repetition of some basic code” by selecting the ith
block with probability p_i and checking whether this block is in the basic code and
whether this block equals a uniformly selected block.
PROPOSITION 5.5. Let Ω be a finite set and δ : Ω × Ω → R be any nonnegative
function that satisfies the triangle inequality (i.e., δ(x,z) ≤ δ(x,y) + δ(y,z) for all
x, y, z ∈ Ω). For any S ⊆ Ω, define δ_S(x) def= min_{y∈S}{δ(x,y)}. Fixing any t, define
δ̄((x_1,...,x_t), (y_1,...,y_t)) = Σ_{i=1}^t δ(x_i, y_i)/t and R(S) def= {x^t : x ∈ S}. Also, for
any T ⊆ Ω^t and x̄ ∈ Ω^t, let δ̄_T(x̄) def= min_{ȳ∈T}{δ̄(x̄, ȳ)}. Then, for any probability
distribution (p_1,...,p_t) on [t], and for every x̄ = (x_1,...,x_t) ∈ Ω^t, it holds that

  Σ_{i∈[t]} p_i · δ_S(x_i) + Σ_{i∈[t]} p_i · (Σ_{j∈[t]} δ(x_j, x_i)/t) ≥ δ̄_{R(S)}(x̄).
PROOF. We first establish the claim for the special case in which p_1 = 1 and
p_2 = ··· = p_t = 0. We do so by using the triangle inequality (and the definitions
of δ̄ and R(S)), and observing that

  δ̄_{R(S)}(x̄) ≤ δ̄(x̄, x_1^t) + δ̄_{R(S)}(x_1^t) = Σ_{j∈[t]} δ(x_j, x_1)/t + δ_S(x_1).

Clearly, this generalizes to any i (i.e., using x_i instead of x_1), and taking the weighted
average (weighted by the general p_i's), the proposition follows.
5.2. PROBLEMS WITH AN EASY DERIVATION OF THE STRONG TESTABILITY
RESULT. Before turning to the actual constructions, we explain why merely
plugging-in a standard (inner-verifier) PCP will not work (for strong codeword
testability). We start with the most severe problem, and then turn to additional
ones.
Noncanonical Encoding. As discussed in Section 1.1, the soundness property
of standard PCPs does not guarantee that only the “canonical” proof (obtained
by the designated construction) is accepted with high probability. The standard
soundness property only guarantees that false assertions are rejected with high
probability (no matter which proof-oracle is used). Furthermore, typical PCPs tend
to accept also noncanonical proofs. This is due to a gap between the canonical
oracles (used in the completeness condition), which encode information as polynomials
of specific individual degree, and the verification procedure, which only refers
to the total degree of the polynomial.^13 This problem was avoided in Section 5.1 by
discarding non-codewords that are close to the code and making the PCPs them-
selves a small part of the codeword. Thus, the noncanonical PCPs by themselves
could not make the sequence too far from the code, and so nothing is required
when we use the weak definition of codeword testing. However, when we seek
to achieve the stronger definition, this problem becomes relevant (and cannot be
avoided).
An additional potential problem is that, per definition, PCPs do not necessarily
provide “strong soundness” (i.e., reject a proof that is ε-far from being correct with
probability Ω(ε)). Although some known PCPs (e.g., Arora et al. [1998]) have this
added property, others (e.g., Hastad [1999]) do not.
Linearity. We wish the resulting code to be linear, and it is not clear whether
this property holds when composing a linear code with a standard inner-verifier.
Since we start with an F-linear code (and an F -linear codeword test), there is hope
that the proof-oracle added to the concatenated code will also be linear (over GF(2),
provided that F is an extension field of GF(2)). Indeed, with small modifications
of standard constructions, this is the case.
13
In basic constructions of codes, this is not a real problem because we can define the code to be the
collection of all polynomials of some total degree as opposed to containing only polynomials satisfying
some individual degree bound. However, when using such a code as the inner code in composition,
we cannot adopt the latter solution because we only know how to construct adequate inner-verifiers
for inputs encoded as polynomials of individually-bounded degree (rather than bounded total degree).
Other Technical Problems. Other problems arise in translating some of the
standard “complexity-theoretic tricks” that are used in all PCP constructions. For
example, PCP constructions are typically described in terms of a dense collection
of input lengths (e.g., the input length must fit |H|^m for suitable sizes of
|H| and m (i.e., m = Θ(|H|/log |H|))), and are extended to arbitrary lengths by
padding (of the input). In our context, such padding, depending on how it is done,
either permits multiple encodings (of the same information), or forces us to check
for additional conditions on the input (e.g., that certain bits of the input are zeroes).
Other complications arise when one attempts to deal with “auxiliary variables” that
are introduced in a process analogous to the standard reduction of verification of
an arbitrary computation to the satisfiability of a 3CNF expression.
This forces us to re-work the entire PCP theory, while focusing on “strongly
rejecting” noncanonical proofs and on obtaining “linear PCP oracles” when asked to
verify homogeneous linear conditions on the input. By strongly rejecting noncanonical
proofs, we mean that any string should be rejected with probability proportional
to its distance from the canonical proof (which is indeed analogous to the definition
of a strong codeword tester). We comment that, for the purposes of constructing
short locally testable codes, it suffices to construct verifiers verifying systems of
homogeneous linear equations and this is all we will do (although we could verify
affine equations equally easily). In what follows, whenever we refer to a linear
system, we mean a conjunction of homogeneous linear constraints.
5.3. INNER VERIFIERS FOR LINEAR SYSTEMS: DEFINITION AND COMPOSITION.
We use PCP techniques to transform linear locally testable codes over a large al-
phabet into locally testable codes over a smaller alphabet. Specifically, we adapt the
construction of inner-verifiers such that using them to test linear conditions on the
input-oracles can be done while utilizing a proof-oracle that is obtained by a linear
transformation of the input-oracles. Furthermore, the constructions are adapted to
overcome the other difficulties mentioned in Section 5.2 (most importantly, the
issue of noncanonical proofs).
The basic ingredient of our transformations is the notion of an inner verifier for
linear codes. Since the definition is quite technical, we consider it useful to start
with a wider perspective on the various ingredients of this definition. We consider
this perspective, provided in Section 5.3.1, to be of independent interest. The actual
definition of an inner verifier for linear codes and its various composition properties
are presented in Sections 5.3.2–5.3.4.
5.3.1. A Wider Perspective. Two basic extensions of the standard definition of
soundness (for PCP systems) were mentioned in Section 5.2: The first is a require-
ment to reject “noncanonical” proofs, where a canonical proof is one specified
in the completeness condition. The second extension is a requirement for strong
soundness, which means the rejection of nonvalid proofs with probability that is
proportional to their distance from a valid proof. In the following definition, we
incorporate both requirements, while considering strings over arbitrary alphabets
(rather than binary strings).
Definition 5.6 (Strong PCP). A standard verifier, denoted V, is a probabilistic
polynomial-time oracle machine. On input x ∈ Σ*, we only consider oracles of
length ℓ(|x|), where ℓ : N → N satisfies ℓ(n) ≤ exp(poly(n)). A prover strategy,
denoted P, is a function that maps YES-instances to adequate proof-oracles. In
particular, |P(x)| = ℓ(|x|). We say that V is a strong PCP for the promise problem Π
if it satisfies the following two conditions:
Completeness (with respect to P). For every YES-instance x ∈ Σ* (of Π),
on input x and access to oracle P(x), the verifier always accepts x. That is,
Pr[V^{P(x)}(x) = 1] = 1. The string P(x) is called the canonical proof for x.

Strong soundness (with respect to canonical proofs). For every x ∈ Σ* and
π ∈ Σ^{ℓ(|x|)}, the following holds:

(1) If x is a NO-instance (of Π), then P(x) = λ, and every π is said to be 1-far
from λ.

(2) If π is δ-far from P(x), then, on input x and access to oracle π, the verifier
rejects with probability Ω(δ). That is, Pr[V^π(x) ≠ 1] = Ω(Δ(π, P(x))/|π|),
for every x and π.
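A toy illustration of the strong soundness condition, with all names hypothetical: a verifier whose equally likely coin outcomes each compare one position of π against the canonical proof rejects with probability exactly Δ(π, P(x))/|π|, so the proportionality constant hidden in the Ω-notation is 1.

```python
from fractions import Fraction

def make_spot_checker(canonical):
    # One coin outcome per proof position; outcome i accepts iff pi agrees
    # with the canonical proof at position i.
    return [lambda pi, i=i, c=c: pi[i] == c for i, c in enumerate(canonical)]

def rejection_probability(checks, pi):
    # Exact Pr[reject] over a uniformly chosen coin outcome.
    return Fraction(sum(1 for chk in checks if not chk(pi)), len(checks))
```

For canonical proof 0000 and π = 0011, the rejection probability is exactly 1/2, matching the relative distance of π from P(x).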
Standard soundness follows by combining the two parts of the strong soundness
condition. We comment that strong soundness per se (i.e., with respect to any valid
proof) can be defined by letting P(x) be the set of all (“absolutely”) valid proofs
(i.e., P(x) = {π ∈ Σ^{ℓ(|x|)} : Pr[V^π(x) = 1] = 1}). That is, strong soundness (with
respect to any valid proof) says that, for any YES-instance x and every π ∈ Σ^{ℓ(|x|)},
the rejection probability of V^π(x) is proportional to the distance of π from
the set of all proofs that are accepted with probability 1 (i.e., Pr[V^π(x) ≠ 1] =
Ω(δ_x(π)), where δ_x(π) is the minimum of Δ(π, π′)/|π| taken over all π′ satisfying
Pr[V^{π′}(x) = 1] = 1, and δ_x(π) = 1 if no such proof exists (i.e., x is a NO-instance)).
It seems that, in the context of PCP, strong soundness with respect
to any valid proof is a more natural notion than strong soundness with respect to
canonical proofs.^14 Things change when one wishes to use PCPs in the construction
of locally testable codes. Strong soundness (with respect to canonical or arbitrary
valid proofs) extends naturally to PCPs of Proximity (PCPP, as defined recently
in Ben-Sasson et al. [2004] and Dinur and Reingold [2004]):
Definition 5.7 (Strong PCPP). A proximity verifier, denoted V, is a probabilistic
polynomial-time oracle machine that is given access to two oracles, an
input-oracle x : [n] → Σ and a proof-oracle π : [ℓ(n)] → Σ, where n is V's only
explicitly given input, and ℓ is as in Definition 5.6. A prover strategy, denoted P,
is defined as in Definition 5.6. We say that V is a strong PCPP for the promise
problem Π if it satisfies the following two conditions:

Completeness (with respect to P). For every YES-instance x ∈ Σ*, on input
1^{|x|} and access to the oracles x and P(x), the verifier always accepts x. That is,
Pr[V^{x,P(x)}(1^{|x|}) = 1] = 1. Again, P(x) is called the canonical proof for x.

Strong soundness (with respect to canonical proofs). For every x ∈ Σ* and
π ∈ Σ^{ℓ(|x|)}, on input 1^{|x|} and access to the oracles x and π, the verifier rejects
with probability Ω(δ(x)), where

  δ(x) def= min_{x′} max{ Δ(x, x′)/|x| ; Δ(π, P(x′))/ℓ(|x|) }   (29)

and, as in Definition 5.6, for any NO-instance x′ we define P(x′) = λ, and say
that any π is 1-far from λ. Alternatively, δ(x) can be defined as the minimum of
max{Δ(x, x′)/|x| ; Δ(π, P(x′))/ℓ(|x|)} taken over all YES-instances x′.

^14 Indeed, standard PCP constructions tend to satisfy strong soundness with respect to any valid
proof. Furthermore, some of the valid proofs correspond to the “encoding” of different NP-witnesses,
whereas others arise from the gap between individual degree bound and total degree bound (discussed
in Sections 1.1 and 5.2).
We mention that the above formulation benefits from Ben-Sasson et al. [2004] and
Dinur and Reingold [2004], which have appeared after the preliminary publication of
the current work. In the current work, we follow the older tradition (rooted in Babai
et al. [1991a]) of considering only the special case in which the YES-instances of Π
are encodings, under some good error correcting code E, of YES-instances in some
other set S. That is, the YES-instances of Π are {E(x) : x ∈ S}, and the NO-instances
of Π are all strings that are far from the YES-instances of Π. (We stress that the
notion of a PCPP (let alone Definition 5.7) is not used in the rest of this work,
except for a few clarifying comments.)
The definition presented in Section 5.3.2 incorporates all the above themes, while
adding two additional themes. First, we refer to a situation (which arises naturally
in proof composition, as in Arora and Safra [1998] and Arora et al. [1998]) in
which the verifier is given access to q > 1 input-oracles rather than to one. (These
oracles are supposed to contain the encoding of strings whose concatenation yields
a YES-instance of another language.) Second, we refer to PCPPs that check linear
relations, while utilizing verifiers that only conduct linear tests (on the retrieved
oracle answers) and having canonical proofs that are linear transformations of the
(actual) input.
5.3.2. The Actual Definition. One basic ingredient of our constructions is the
notion of an inner-verifier for linear codes. These inner-verifiers are actually strong
PCPPs (as in Definition 5.7) for assertions regarding linear conditions on the input-
oracles. This means that their definition is quite complex: it refers to strong sound-
ness with respect to canonical proofs as well as to a formalism regarding encoding
of inputs. In addition, the following definition refers to a formalism for expressing
(conjunctions of) linear conditions.
The “linear inner PCP systems” defined below have quite a few parameters, where
the main ones specify the field F, the number of input-oracles q and the set F^b of
possible symbols that they encode, and the number of queries p made by the inner
verifier and the set F^a of possible answers to these queries. That is, each of the q
input-oracles is supposed to encode an element of F^b as a sequence over F^a, where
typically a ≪ b. Thus, an (F, (q,b) → (p,a))-linear inner-PCP system is the main
ingredient in a transformation of an F-linear code over an alphabet Σ = F^b that
is testable by q queries, into an F-linear code (of a typically longer length) over an
alphabet Γ = F^a that is testable by p queries, where typically a ≪ b but p > q.
Informally, the inner-verifier allows to emulate a local test in the given code over Σ,
by providing an encoding (over Γ) of each symbol in the original codeword as well
as auxiliary proofs (regarding the satisfiability of homogeneous linear conditions)
that can be verified based on a constant number of queries. That is, given a locally
testable code C_0 : Σ^{K_0} × [N_0] → Σ, we consider the mapping of x ∈ Σ^{K_0} to
(E(C_0(x,1)),...,E(C_0(x,N_0))), where E : Σ → Γ^n is the aforementioned encoding.
Then, an (F, (q,b) → (p,a))-inner-PCP system should allow to transform a
q-query codeword tester of C_0 (which makes F-linear checks) into a p-query codeword
tester of the code resulting from appending adequate (inner-verifier) proofs to
the aforementioned mapping (i.e., the concatenated code of C_0 and E). In addition,
we wish these auxiliary proofs to be obtained by F-linear transformations of x.
We start by presenting the basic syntax of such linear inner PCP systems, which
depends on a formalism for expressing (conjunctions of) linear conditions. We
observe that verifying that a vector satisfies a conjunction of (homogeneous) linear
conditions is equivalent to verifying that it lies in some linear subspace (i.e., the
space of vectors that satisfy these conditions). For an integer d and a field F, we let
L_{F,d} denote the set of all linear subspaces of F^d. We will represent such a subspace
L ∈ L_{F,d} by a matrix M ∈ F^{d×d} such that L = {x ∈ F^d : Mx = 0}. When
convenient, we will sometimes say that a vector lies in L and sometimes say that
it satisfies the conditions L.
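As a concrete instance of this representation, over F = GF(2) the membership test Mx = 0 is a few lines; the specific subspace below is just an illustrative example.

```python
def in_subspace(M, x):
    # L = {x in F^d : Mx = 0} over F = GF(2): check every row of M
    # against x modulo 2.
    return all(sum(m * v for m, v in zip(row, x)) % 2 == 0 for row in M)

# The subspace {x in GF(2)^3 : x1 + x2 = 0 and x3 = 0}, padded to a 3x3 matrix:
M = [[1, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
```

Here the two nontrivial rows express the two homogeneous linear conditions, and the all-zero row is padding up to the d × d shape.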
Definition 5.8 (The Mechanics of Linear Inner Verification). For a field F,
positive integers q, b, p, a and δ ∈ (0,1), an (F, (q,b) → (p,a))-linear inner
system consists of a triple (E, P, V) such that

(1) E : F^b → (F^a)^n is an F-linear code of minimum distance at least δn over the
alphabet F^a. We call E the encoding function.

(2) P : L_{F,qb} × (F^b)^q → (F^a)^ℓ is called the proving function. For every L ∈ L_{F,qb},
the mapping x̄ ↦ P(L, x̄) is required to be F-linear.

(3) V is an oracle machine, called the verifier, that gets as input L ∈ L_{F,qb} and
(coins) ω ∈ {0,1}^r and has oracle access to q + 1 vectors over F^a, denoted
X_1,...,X_q : [n] → F^a and X_{q+1} : [ℓ] → F^a. That is, a query j ∈ [n] to oracle
i ∈ [q] is answered by X_i(j), and a query j ∈ [ℓ] to oracle q + 1 is answered
by X_{q+1}(j). It is required that V satisfies the following two conditions:

Query Complexity. For every L ∈ L_{F,qb} and ω ∈ {0,1}^r, machine V makes
a total of exactly p oracle calls to the oracles X_1,...,X_{q+1}.

Linearity of Verdict. For every L and ω, the acceptance condition of V is
a conjunction of F-linear constraints on the responses to the queries. That is,
based on L and ω, machine V determines some L′ ∈ L_{F,pa} and accepts if and
only if (α_1,...,α_p) ∈ L′, where α_j is the answer obtained for the jth query.

The vectors X_1,...,X_q are called the input-oracles and the vector X_{q+1} is
called the proof-oracle.

Such a system is said to use r coins, encodings of length n and proofs of length ℓ.
Indeed, the requirement that V makes exactly p queries (rather than at most
p queries) is made for technical convenience (and can be easily met by making
dummy queries if necessary). Definition 5.8 makes no reference to the quality of
the decisions made by the verifier. This is the subject of the next definition.
Definition 5.9 (Linear Inner Verification: Perfect Completeness and Strong
Soundness). A system (E, P, V) as in Definition 5.8 is called γ-good if it satisfies
the following two conditions:

Completeness. If the first q oracles encode a q-tuple of vectors over F^b that
satisfies L, and if X_{q+1} = P(L, x_1,...,x_q), then V always accepts.

That is, for every x_1,...,x_q ∈ F^b and L ∈ L_{F,qb} such that (x_1,...,x_q) ∈ L,
and for every ω ∈ {0,1}^r, it holds that V^{E(x_1),...,E(x_q),P(L,x_1,...,x_q)}(L, ω) = 1.
Strong Soundness. If the first q oracles are far from encoding any q-tuple of
vectors over F^b that satisfies L, then V rejects with significant probability, no matter
which X_{q+1} is used. Furthermore, if the first q oracles are close to encoding some q-tuple
that satisfies L but X_{q+1} is far from the corresponding unique proof determined
by P, then V rejects with significant probability. Actually, in both cases, we require
that the rejection probability be proportional to the relevant relative distance, where
γ is the constant of proportionality.

That is, for every L ∈ L_{F,qb}, every X_1,...,X_q : [n] → F^a and X_{q+1} : [ℓ] → F^a,
it holds that

  Pr_ω[V^{X_1,...,X_q,X_{q+1}}(L, ω) ≠ 1] ≥ γ · δ_L(X_1,...,X_q, X_{q+1}),

where, for X̄ = (X_1,...,X_q, X_{q+1}),

  δ_L(X̄) = min_{(x_1,...,x_q)∈L} max{ max_{i∈[q]}{Δ(X_i, E(x_i))}/n ; Δ(X_{q+1}, P(L, x_1,...,x_q))/ℓ }.   (30)

The quantity δ_L(X_1,...,X_q, X_{q+1}) will be called the deviation of
(X_1,...,X_q, X_{q+1}).

In such a case, we say that (E, P, V) is an (F, (q,b) → (p,a))-linear inner
proof system, abbreviated as (F, (q,b) → (p,a))-LIPS.
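For very small parameters, the deviation of Eq. (30) can be computed by brute force. The sketch below fixes q = 1 and uses toy repetition maps for E and P; all of these choices are hypothetical stand-ins, not an actual LIPS.

```python
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def deviation(L, E, P, X_in, X_proof):
    # Eq. (30) for q = 1: minimize, over messages x in L, the larger of the
    # relative distance of the input-oracle from E(x) and of the proof-oracle
    # from P(x).
    n, l = len(X_in), len(X_proof)
    return min(max(hamming(X_in, E(x)) / n, hamming(X_proof, P(x)) / l)
               for x in L)
```

For instance, with E repeating a one-symbol message four times and P repeating it twice, the oracle pair ((0,0,0,1), (0,0)) has deviation 1/4, attained at the message 0.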
We comment that there is a redundancy in the linearity requirements made in
Definition 5.8. Specifically, we have required E, P and V (or rather its acceptance
condition) to be F-linear. However, under the completeness and soundness conditions
of Definition 5.9, the linearity of E and V implies the linearity of P, and the
linearity of E and P implies, without loss of generality, the linearity of V.^15

Typically, we aim at having n, ℓ and 2^r be small functions of b (i.e., polynomial
or even almost-linear in b), whereas p may grow as a function of q (which
is typically a constant). Note that Definition 5.9 is designed to suit our applications.
Firstly, the strong notion of soundness, which refers also to “noncanonical”
proofs of valid statements, fits our aim of obtaining a code that is locally testable
(because it guarantees rejection of sequences that are not obtained by the transformation
induced by (the encoding function and) the proving function). Indeed,
this augmentation of soundness is nonstandard (and arguably even unnatural) in
the context of PCP. Secondly, the strong notion of soundness allows also to reject
with adequate probability inputs that are close to the code (or alleged proofs that
are close to the canonical ones), and thus supports the strong definition of codeword
^15 To see that the linearity of E and V implies the linearity of P, note that the combination of perfect
completeness and strong soundness means that the set S def= {(E(x_1),...,E(x_q), P(L,x_1,...,x_q)) :
(x_1,...,x_q) ∈ L} equals the set of (q+1)-tuples (X_1,...,X_q, X_{q+1}) that pass all possible checks of V.
Since all the latter checks are F-linear, it follows that the set S is an F-linear subspace. Using the
fact that E is F-linear, it follows that {(x_1,...,x_q, P(L,x_1,...,x_q)) : (x_1,...,x_q) ∈ L} is an F-linear
subspace, and hence (x_1,...,x_q) ↦ P(L,x_1,...,x_q) is F-linear. To see that the linearity of E and
P implies the linearity of V, we refer to Ben-Sasson et al. [2003a, Prop. A.1], which implies that,
when testing membership in a linear subspace by a one-sided error tester (i.e., perfect completeness),
without loss of generality, the tester may make only linear checks.
testing (i.e., Definition 2.2). Finally, Definition 5.9 only handles the verification
of linear conditions, and does so while using proofs that are linear transformations
of the input. Indeed, this fits our aim of transforming F-linear codes over a large
alphabet (i.e., the alphabet F^b) to F-linear codes over a smaller alphabet (i.e., F^a).

5.3.3. Obtaining Locally Testable Codes. The utility of linear inner proof systems
(LIPSes) in constructing locally testable codes is demonstrated by two of the
following results (i.e., Proposition 5.10 and Theorem 5.13). In Proposition 5.10, we
show that any LIPS yields a locally testable code, where the distance is provided
by the encoding function of the LIPS. In Theorem 5.13, we compose a locally
testable code over a large alphabet with a LIPS to obtain a locally testable code
over a smaller alphabet. Proposition 5.10 merely serves as a warm-up towards
Theorem 5.13, which is the result actually used in the rest of our work.

To simplify the exposition, we are going to confine ourselves to LIPSes
with γ ≤ 1. Indeed, for any γ′ ≤ γ, any γ-good LIPS constitutes a γ′-good LIPS.
Furthermore, this is typically the case anyhow (because the deviation may
equal 1, or at least be very close to 1).
PROPOSITION 5.10. Suppose that a < b, that a divides b, and that γ, δ ∈ (0, 1]. Then an
(F, (1, b) → (p, a), γ, δ)-LIPS implies the existence of an F-linear locally testable
code of relative distance at least δ/2, over the alphabet Σ = F^a, mapping F^b ≡ Σ^{b/a}
to Σ^M for M < 2(n + ℓ), where n and ℓ are the corresponding lengths of the
encoding and the proof used by the LIPS. Specifically, the code is testable with p
queries, and the tester rejects a word that is ε-far from the code with probability at
least (γ/4)·ε. Furthermore, the tester tosses 1 + max(r_V, log M) coins, where r_V
is the number of coins tossed by the inner-verifier of the above LIPS.
The point is that Proposition 5.10 establishes a locally testable code while only
relying on a standard error-correcting code (i.e., the encoding E : Σ^{b/a} → Σ^n that
is part of the LIPS).
PROOF. Let (E, P, V) be the (F, (1, b) → (p, a), γ, δ)-LIPS, where E : F^b → (F^a)^n
and P : L_{F,b} × F^b → (F^a)^ℓ. We let t = ⌈ℓ/n⌉, where ℓ ≥ n is
typically the case. Under this setting, tn ≥ ℓ and tn + ℓ < 2ℓ + n. We construct
a locally testable code C : Σ^{b/a} → Σ^{tn+ℓ}, where Σ^{b/a} ≡ F^b, such that the
encoding of x equals the sequence C(x) = (E(x)^t, P(L, x)), where L = F^b (i.e.,
L is satisfied by every vector) and E(x) is replicated t times. Thus, at least half
of the length of C(x) is taken by replications of E(x), and so the relative distance
of C is at least δ/2, because E has relative distance δ. Indeed, C has block-length
M = tn + ℓ < 2(ℓ + n).
To test a potential codeword (X_1, ..., X_t, Y), where X_i : [n] → Σ and Y :
[ℓ] → Σ, we perform at random one out of two kinds of tests: With probability
1/2 we test that the t strings X_i's are replications of X_1. We do so by picking at random
i ∈ [t] and j ∈ [n], and testing that X_1(j) = X_i(j). With the remaining probability
we pick a random test as per the verifier V(F^b, ·), and emulate V's execution. In
particular, we answer V's queries to its (single) input-oracle by querying our oracle
X_1, and answer V's queries to its proof-oracle by querying our oracle Y. Note that,
although we set no condition on the vector encoded by the input-oracle (i.e., every
b-ary vector over F satisfies the condition L = F^b), the verifier needs to verify
that the input-oracle is a codeword of E, which is what we need in order to provide
a codeword test for C.
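To make the two-part construction and test concrete, here is a small Python sketch with toy stand-ins for E and P (our own illustrative choices of n, ℓ, and the functions, not components of an actual LIPS); it builds C(x) = (E(x)^t, P(L, x)) and enumerates the rejection probability of the replication test:

```python
from fractions import Fraction

# Toy stand-ins (not an actual LIPS from this work): E is a 3-fold repetition
# encoding of one symbol, and P is a dummy "proof" repeating the symbol 5 times.
n, ell = 3, 5
E = lambda x: [x] * n            # encoding of length n
P = lambda x: [x] * ell          # proving function of length ell (stand-in)

t = -(-ell // n)                 # t = ceil(ell / n), so t*n >= ell
M = t * n + ell                  # block length M = tn + ell < 2(n + ell)

def C(x):
    """Codeword: t replicas of E(x) followed by the proof P(x)."""
    return E(x) * t + P(x)

assert M < 2 * (n + ell) and len(C(7)) == M

def replication_rejects(word):
    """Probability (over i in [t], j in [n]) that the replication test,
    which checks X_1(j) == X_i(j), rejects the given word."""
    blocks = [word[k * n:(k + 1) * n] for k in range(t)]
    bad = sum(blocks[0][j] != blocks[i][j] for i in range(t) for j in range(n))
    return Fraction(bad, t * n)

w = C(7)
w[n] = 9                         # corrupt one symbol of the second replica
print(replication_rejects(w))    # -> 1/6 (one bad (i, j) pair out of t*n = 6)
```

The inner-verifier's part of the test is not modeled here; the sketch only illustrates the length accounting and the replication check.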
The above tester has randomness complexity 1 + max(r_V, log M), and always
accepts any codeword of C. We need to show that words at distance ε from the code
C are rejected with probability Ω(γ·ε). Analogously to the proof of Proposition 5.5,
we have

Δ_C((X_1, ..., X_t, Y)) ≤ Δ((X_1, ..., X_t, Y), (X_1^t, Y)) + Δ_C((X_1^t, Y)).

Thus, if (X_1, ..., X_t, Y) is ε-far from C, then either (X_1, ..., X_t, Y) is (ε/2)-far
from (X_1^t, Y) or (X_1^t, Y) is (ε/2)-far from C. In the first case, we have
Δ((X_1, ..., X_t, Y), (X_1^t, Y)) ≥ (ε/2)·(tn + ℓ), and so Σ_{i=1}^t Δ(X_1, X_i)/t ≥
(ε/2)·(n + (ℓ/t)) > (ε/2)·n. Thus, the new tester rejects with probability
at least (1/2)·(ε/2) ≥ γ·ε/4, by virtue of the replication test (and
γ ≤ 1). In the second case, we have Δ_C((X_1^t, Y)) ≥ (ε/2)·(tn + ℓ), and so
Δ((X_1^t, Y), (E(x)^t, P(F^b, x))) ≥ (ε/2)·(tn + ℓ) for every x ∈ F^b. Thus, for
every x, either X_1 is (ε/2)-far from E(x) or Y is (ε/2)-far from P(F^b, x), which
means that the deviation of (X_1, Y) (as defined in Eq. (30)) is at least ε/2, because
here the deviation is the minimum, taken over all x ∈ F^b, of the maximum
between Δ(X_1, E(x))/n and Δ(Y, P(F^b, x))/ℓ. It follows that (in this case) the
inner-verifier V rejects with probability at least p ≝ γ·ε/2, and thus our codeword
test rejects with probability at least p/2 = γ·ε/4.
Remark 5.11. We wish to highlight an interesting fact regarding the code constructed
in the proof of Proposition 5.10. Unlike in Section 5.1, the replication of
the basic codeword (conducted in the construction) does not help the analysis of
the new codeword test (but rather complicates it by the need to analyze the replication
test). That is, the test presented in the proof of Proposition 5.10 is a strong
codeword test (for C), regardless of the choice of the parameter t (which governs
the number of replications). The sole role of replication is to guarantee that the
resulting code has constant relative distance. This requires setting t = Ω(ℓ/n) (or,
alternatively, relying on distance properties of the proving function). On the other
hand, the bigger t is, the worse the rate of the resulting code, and thus we pick
t = O(ℓ/n). This remark applies also to Theorem 5.13.
Composing Locally Testable Codes and LIPSes. The following theorem (i.e.,
Theorem 5.13) will be used to compose locally testable codes over large alphabets
with suitable linear inner proof systems, obtaining locally testable codes over
smaller alphabets. Specifically, given a q-query testable F-linear code over the
alphabet Σ = F^b, we wish to construct an (F-linear) locally testable code over a
smaller alphabet Σ' = F^a, by using a suitable LIPS. The latter includes an adequate
encoding of F^b by (F^a)^n, and its verifier will be used to emulate the local
conditions checked by the codeword test of the original code. (Recall that, using the
F-linearity of C, we may assume without loss of generality (cf. Ben-Sasson et al.
[2003a, Prop. A.1]) that the codeword tester makes only F-linear checks.) These
conditions are subspaces of F^{q·b}, and so we need an (F, (q, b) → (·, a), ·, ·)-LIPS in
order to verify them. Regarding the unspecified parameters of the above-mentioned
(F, (q, b) → (p, a), γ, δ)-LIPS, we wish p to be as small as possible and γ, δ to be
as large as possible. The construction will be similar to the one used for deriving
the weak testability result in Section 5.1. Thus, in addition to the above, we wish
the randomness complexity of the codeword tester and the (encoding and) proof
length of the LIPS to be as small as possible.
Although the aforementioned composition (captured by Theorem 5.13) is very
natural, we were only able to establish its validity in case the locally testable code
is testable by a procedure that makes (almost) uniformly distributed queries. We
note that the tester presented in Section 3.2 has a version that satisfies this property
(see Remark 3.4). This motivates the following definition.
Definition 5.12 (Codeword Testers with Almost Uniform Queries). For α ∈
(0, 1], a probabilistic oracle machine is said to make α-uniform queries if, when
given access to an oracle of length N, a random query in a random execution equals
any fixed i ∈ [N] with probability at least α/N and at most α^{-1}/N. That is, for
every i ∈ [N], we denote by p_i^{(j)} the probability that, in a random invocation, the
jth query of the q-query tester is to location i, and require that

α/N ≤ (1/q)·Σ_{j=1}^q p_i^{(j)} ≤ α^{-1}/N.   (31)
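Condition (31) is easy to check mechanically. The sketch below tabulates a hypothetical 2-query tester's query distribution (the numbers are ours, purely illustrative) and tests whether the averaged distribution is α-uniform:

```python
from fractions import Fraction

# Hypothetical 2-query tester over an oracle of length N = 4: we tabulate
# p_i^{(j)}, the probability that the j-th query lands on location i, and check
# the alpha-uniformity condition of Eq. (31):
#     alpha/N <= (1/q) * sum_j p_i^{(j)} <= (1/alpha)/N   for every i.
N, q = 4, 2
# Rows are queries j = 1..q; columns are locations i = 1..N.
p = [[Fraction(1, 4), Fraction(1, 4), Fraction(1, 4), Fraction(1, 4)],  # query 1: uniform
     [Fraction(1, 2), Fraction(1, 6), Fraction(1, 6), Fraction(1, 6)]]  # query 2: skewed

def is_alpha_uniform(p, alpha):
    avg = [sum(p[j][i] for j in range(q)) / q for i in range(N)]
    lo, hi = Fraction(alpha) / N, 1 / (Fraction(alpha) * N)
    return all(lo <= a <= hi for a in avg)

# The averaged query distribution is (3/8, 5/24, 5/24, 5/24); it is
# alpha-uniform for alpha = 2/3 but not for alpha = 9/10.
print(is_alpha_uniform(p, Fraction(2, 3)), is_alpha_uniform(p, Fraction(9, 10)))
```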
THEOREM 5.13 (COMPOSING AN OUTER CODE WITH AN INNER-VERIFIER).
Consider integers a < b such that a divides b, a finite field F, Σ = F^b, and
α, β, γ, δ ∈ (0, 1]. Suppose that the following two constructs exist:

(1) A locally testable F-linear code C : Σ^K → Σ^N of relative distance at least
δ_C, having a codeword test that makes q queries that are α-uniform, and uses
r coins. Furthermore, suppose that this tester rejects ε-far sequences with
probability at least β·ε.
(2) An (F, (q, b) → (p, a), γ, δ)-linear inner proof system, (E, P, V), where E :
F^b → (F^a)^n, P : L_{F,q·b} × (F^b)^q → (F^a)^ℓ, and V tosses r_V coins.

Then, there exists an F-linear locally testable code of relative distance at least
δ·δ_C/2, over the alphabet Σ' = F^a, mapping Σ^K ≡ Σ'^{b·K/a} to Σ'^M, for M <
2·(Nn + 2^r·ℓ). Furthermore, this code can be tested by making p queries and
tossing 1 + max(r + r_V, log M) coins such that ε-far sequences are rejected with
probability at least (αβγδ^2/16q)·ε.

Typically, r_V > 2 + log ℓ and 2^r·ℓ > Nn, which implies that log M < r + 2 +
log ℓ < r + r_V. We comment that the resulting codeword test does not necessarily
make almost uniform queries; we will redeem this state of affairs at a later point
(in Theorem 5.15).
PROOF. The new code consists of two parts (which are properly balanced). The
first part is obtained by encoding each Σ-symbol of the codeword of C by the code
E, whereas the second part is obtained by providing proofs (testable by the inner-verifier)
for the validity of each of the 2^r possible checks that may be performed
by the codeword test. Specifically, let us denote by i_{ω,j} the jth query that the C-tester
makes when using coins ω, and let L_ω be the linear condition verified on
these coins. (Recall that, using the F-linearity of C, we may assume without loss of
generality (cf. Ben-Sasson et al. [2003a, Prop. A.1]) that the codeword tester makes
only F-linear checks.) Let t = ⌈2^r·ℓ/Nn⌉, and note that M ≝ t·Nn + 2^r·ℓ satisfies
M ≤ 2tNn and M < 2·(Nn + 2^r·ℓ). Then, viewing C as C : Σ^K × [N] → Σ,
and recalling that Σ = F^b ≡ Σ'^{b/a}, the string x ∈ Σ^K is encoded by the sequence
(C'(x)^t, P'(x)), where

C'(x) ≝ (E(C(x, 1)), ..., E(C(x, N)))   (32)
P'(x) ≝ ⟨P(L_ω, C(x, i_{ω,1}), ..., C(x, i_{ω,q})) : ω ∈ {0, 1}^r⟩.   (33)

Let us denote this encoding by C''; that is, C''(x) = (C'(x)^t, P'(x)). Note that
C'' : Σ'^{bK/a} → Σ'^M, and that C'' has distance at least t·δ_C N·δn, which means a
relative distance of at least δδ_C·tNn/M ≥ δδ_C/2 (because M ≤ 2tNn).
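The length accounting behind the bounds M ≤ 2tNn and M < 2·(Nn + 2^r·ℓ) can be sanity-checked numerically; the parameter values below are arbitrary placeholders rather than parameters arising in this work:

```python
from math import ceil

# Length accounting for the composed code C''(x) = (C'(x)^t, P'(x)), under
# placeholder parameters: an outer code of length N over F^b, an inner encoding
# of length n, an inner proof of length ell, and an outer tester using r coins.
N, n, ell, r = 100, 20, 50, 12

t = ceil((2 ** r) * ell / (N * n))   # t = ceil(2^r * ell / (N n))
M = t * N * n + (2 ** r) * ell       # block length of C''

# t N n >= 2^r ell, so M <= 2 t N n (the replicated part dominates),
# and also M < 2 (N n + 2^r ell).
assert (2 ** r) * ell <= t * N * n <= (2 ** r) * ell + N * n
assert M <= 2 * t * N * n and M < 2 * (N * n + 2 ** r * ell)
print(t, M)
```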
Testing the code C'' is essentially done by emulating the codeword test of C. That
is, to test a potential codeword (X_1, ..., X_{tN}; Y_{0^r}, ..., Y_{1^r}), where X_i : [n] → Σ'
and Y_ω : [ℓ] → Σ', we select uniformly ω ∈ {0, 1}^r, determine the corresponding
linear condition (i_{ω,1}, ..., i_{ω,q}, L_ω) that would have been checked by the C-tester,
and invoke the inner-verifier V on input L_ω while providing V with oracle access to
X_{i_{ω,1}}, ..., X_{i_{ω,q}} and Y_ω. Note that i_{ω,1}, ..., i_{ω,q} ∈ [N], and that V tosses additional
coins, denoted ω' ∈ {0, 1}^{r_V}. As in the proof of Proposition 5.10, this is done
with probability 1/2, and otherwise we check the correctness of the replication (by
randomly selecting i ∈ [t], j_1 ∈ [N] and j_2 ∈ [n] and comparing X_{j_1}(j_2) and
X_{(i-1)·N+j_1}(j_2)). Let us denote the resulting procedure by T''.
The above procedure T'' has randomness complexity 1 + max(r + r_V, log M),
makes max(p, 2) = p queries (because q ≥ 1 implies p ≥ 2), and always accepts
any codeword of C''. Although T'' looks very appealing, it may not satisfy
the requirements (of a codeword tester) in case the C-tester does not make almost
uniform queries. Nevertheless, we will show that T'' is indeed a C''-tester,
provided that the C-tester makes almost uniform queries (as guaranteed by the
theorem's hypothesis). Before doing so, we discuss the reason for this technical
condition.
On the Necessity of Almost Uniform Queries. Consider, for example, the case
in which C(x) = (0, C_0(x)), where C_0 is a locally testable code with tester T_0,
and suppose that the first query of the C-tester is always to the first position in
the sequence (i.e., the position that is supposed to be identically 0) but the C-tester
usually ignores the answer (and with probability 1/N checks that the answer
equals 0). (The other queries of the C-tester emulate T_0.) Further suppose that the
proving function of the LIPS sets the first ℓ/2 symbols of the (ℓ-symbol long) proof
to 0, and that the inner-verifier always compares a random symbol in its first (n-symbol
long) input-oracle (which is supposed to encode the answer to the C-tester's
first query, and hence is supposed to be E(0) = 0^n) to a random symbol in the first
half of its proof-oracle. (In addition, the inner-verifier emulates some "normal"
inner-verifier using the same input-oracles and the second half of its proof-oracle.)
Then, the corresponding code C'' has non-codewords that are very far from the code,
where the difference is concentrated almost only in the "proof part", but these non-codewords
are rejected by T'' with negligible probability. For example, consider
the non-codeword ((1, C_0(x))^t, ⟨1^{ℓ/2} π_ω : ω ∈ {0, 1}^r⟩), where 0^{ℓ/2} π_ω is the canonical
proof associated with coins ω and input x (i.e., P'(x) = ⟨0^{ℓ/2} π_ω : ω ∈ {0, 1}^r⟩).
This sequence is 1/4-far from C'' but is rejected by T'' only if it emulates the
checking of the first bit of C, which happens with probability 1/N = o(1). The above
discussion establishes the necessity of the upper-bound on Σ_{j=1}^q p_i^{(j)} provided in
Eq. (31). The lower-bound provided by Eq. (31) is inessential, because an alternative
one follows by the fact that the tester must query each location with sufficiently
high probability (in order to reject non-codewords that are corrupted only at that
location).
Overview of the Rest of the Proof. To evaluate the rejection probability of T'',
we consider any (X; Y) = (X_1, ..., X_{tN}; Y_{0^r}, ..., Y_{1^r}) that is ε-far from C'', where
throughout the proof (unless said differently), all distances refer to sequences over
Σ'. Our aim is to prove that T'' rejects this sequence with probability proportional
to ε, thus establishing that T'' is a C''-tester. The proof combines elements
from the proofs of Lemma 5.3 and Proposition 5.10. As in the proof of Proposition
5.10, we focus our attention on the case that ((X_1, ..., X_N)^t; Y_{0^r}, ..., Y_{1^r})
is ε/2-far from C'', because the other case is handled by the replication test. We
consider three (remaining) cases:

(1) The sequence (X_1, ..., X_N) is relatively far from a sequence of E-codewords.
(2) The sequence of E-codewords closest to (X_1, ..., X_N) is relatively far from
the code C'.
(3) The sequence (X_1, ..., X_N) is relatively close to a sequence of E-codewords,
which in turn is relatively close to the code C'. (In this case, the distance of
((X_1, ..., X_N)^t; Y) from C'' is due to Y.)

Each of these cases will be handled by a corresponding claim. The first two cases
correspond to the two cases considered in the proof of Lemma 5.3, whereas the
third case was not relevant there.

Analogously to the proof of Lemma 5.3, we denote by Δ_i the (relative) distance
of X_i from E; that is, Δ_i = Δ_E(X_i)/n.
CLAIM 5.13.1 (CASE 1: USING ε' = ε/4). If Σ_{i=1}^N Δ_i/N > ε', then the inner-verifier
rejects with probability at least αγ·ε'.

PROOF. Things would have been very easy if at least one of the queries made
by the C-tester were uniformly distributed. In such a case, one of the input-oracles
accessed by the inner-verifier would be at expected distance ε' from the code, and
the inner-verifier would reject with probability at least γ·ε'. Unfortunately, the
aforementioned condition does not necessarily hold. Surely, we could modify the
C-tester to satisfy this condition, by adding a uniformly distributed query, but here
we take an alternative route by recalling that (by the hypothesis that the tester makes
"almost uniform" queries) the queries cover each possible location with sufficiently
high probability.^16 Specifically, recall that Σ_{j=1}^q p_i^{(j)} ≥ αq/N, where p_i^{(j)} denotes
the probability that the jth query of the C-tester is to location i. Now, consider
the oracles X_{i_{ω,1}}, ..., X_{i_{ω,q}} accessed by the inner-verifier, where ω ∈ {0, 1}^r is
^16 Here we use the lower-bound on Σ_{j=1}^q p_i^{(j)} provided by Eq. (31). Alternatively, we could prove that
a different lower-bound follows by the fact that the tester must reject non-codewords with adequate
probability. Specifically, let us denote by p_i the probability that, on a random invocation, at least
one of the C-tester's queries is to location i. Clearly, p_i ≤ Σ_{j=1}^q p_i^{(j)}. On the other hand, we claim
that p_i ≥ β/N. The reason is that the C-tester must accept 0^N with probability 1, and reject 0^{i-1}10^{N-i}
with probability at least β/N, but it cannot possibly distinguish the two cases unless it probes the ith
location. Thus, Σ_{j=1}^q p_i^{(j)} ≥ β/N, for every i ∈ [N].
uniformly selected by our tester. By the above,

E_ω[Σ_{j=1}^q Δ_E(X_{i_{ω,j}})/n] = Σ_{j=1}^q Σ_{i=1}^N p_i^{(j)}·Δ_i ≥ (αq/N)·Σ_{i=1}^N Δ_i ≥ q·αε'

which implies that, for uniformly distributed ω ∈ {0, 1}^r and j ∈ [q], the oracle
X_{i_{ω,j}} is at expected relative distance at least αε' from E. Thus, the expected deviation
of (X_{i_{ω,1}}, ..., X_{i_{ω,q}}; Y_ω) is at least αε', and the inner-verifier rejects with
probability at least γ·αε'.
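The averaging step in the proof above can be verified on concrete (hypothetical) data; the distribution p and the distances Δ_i below are our own illustrative numbers:

```python
from fractions import Fraction

# Numeric check of the averaging step in Claim 5.13.1, on hypothetical data:
# E_w[ sum_j Delta_E(X_{i_{w,j}}) / n ] = sum_j sum_i p_i^{(j)} * Delta_i
#                                       >= (alpha q / N) * sum_i Delta_i.
N, q, alpha = 4, 2, Fraction(2, 3)
Delta = [Fraction(1, 2), 0, Fraction(1, 4), Fraction(1, 4)]  # Delta_i = dist(X_i, E)/n
p = [[Fraction(1, 4)] * 4,                                   # query 1: uniform
     [Fraction(1, 2), Fraction(1, 6), Fraction(1, 6), Fraction(1, 6)]]

# Check the alpha-uniformity lower bound sum_j p_i^{(j)} >= alpha*q/N ...
assert all(sum(p[j][i] for j in range(q)) >= alpha * q / N for i in range(N))

# ... and conclude the claimed lower bound on the expected total distance.
expected = sum(p[j][i] * Delta[i] for j in range(q) for i in range(N))
bound = (alpha * q / N) * sum(Delta)
assert expected >= bound
print(expected, bound)   # -> 7/12 1/3
```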
We are going to consider the E-codewords, denoted c_i's, that are closest to
these X_i's (i.e., Δ(X_i, c_i) = Δ_i·n). Let c = (c_1, ..., c_N). Let d_i be the E-decoding
of c_i (i.e., c_i = E(d_i)), and d = (d_1, ..., d_N). Assuming that Case 1 does not hold,
we have that (c^t; Y) is (ε/4)-far from C'', because c^t is (ε/4)-close to (X_1, ..., X_N)^t and
((X_1, ..., X_N)^t; Y) is (ε/2)-far from C''. However, unlike in the proof of Lemma 5.3,
the sequence c is not necessarily far from C', because the distance between (c^t, Y)
and C'' may be due to Y. Thus, there are two additional cases to consider.
CLAIM 5.13.2 (CASE 2: USING ε'' = αδε/4q). If c is ε''-far from C', then the
inner-verifier rejects with probability at least (βγδ/2)·ε''.

The condition Σ_{i=1}^N Δ_i/N ≤ ε' is omitted from this claim, because the claim
holds regardless of this condition. The following proof is analogous to the treatment
of Case 2 in the proof of Lemma 5.3.
PROOF. By the claim's hypothesis, for every x ∈ Σ'^{bK/a}, it holds that the set
D_x ≝ {i ∈ [N] : c_i ≠ E(C(x, i))} has cardinality at least ε''·N, because ε''·nN ≤
Δ(c, C'(x)) ≤ |D_x|·n. Noting that D_x = {i ∈ [N] : d_i ≠ C(x, i)}, it follows that
d is ε''-far from the code C, when both are viewed as sequences over Σ. Thus, d
would have been rejected by the C-tester with probability at least p ≝ β·ε''.

Let us call a choice of coins ω for the C-tester good if the C-tester would have
rejected the values (d_{i_{ω,1}}, ..., d_{i_{ω,q}}); that is, (d_{i_{ω,1}}, ..., d_{i_{ω,q}}) ∉ L_ω. By the above,
with probability p, the C-tester selects good coins. On the other hand, for good
coins ω, when given input L_ω and access to the input-oracles c_{i_{ω,1}}, ..., c_{i_{ω,q}} (and any
proof-oracle), the inner-verifier V rejects with constant probability (i.e., probability
at least γ·δ). We need, however, to consider what happens when V is given access
to the input-oracles X_{i_{ω,1}}, ..., X_{i_{ω,q}} (and the proof-oracle Y_ω), for a good ω.

We next show that, for a good ω, the inner-verifier V rejects the input-oracles
(X_{i_{ω,1}}, ..., X_{i_{ω,q}}) with constant probability (regardless of Y_ω). This happens regardless
of whether or not X_{i_{ω,1}}, ..., X_{i_{ω,q}} is close to (c_{i_{ω,1}}, ..., c_{i_{ω,q}}). We consider two
cases:
(1) Suppose that, for every j ∈ [q], the oracle X_{i_{ω,j}} is (δ/2)-close to
c_{i_{ω,j}}. Then, for every acceptable q-tuple (a_1, ..., a_q) (i.e., (a_1, ..., a_q) ∈
{(E(z_1), ..., E(z_q)) : (z_1, ..., z_q) ∈ L_ω}), there exists a j such that the oracle
X_{i_{ω,j}} is (δ/2)-far from a_j. The reason being that for every acceptable
(a_1, ..., a_q) there exists a j such that a_j ≠ c_{i_{ω,j}}, while on the other hand a_j
must also be an E-codeword. Thus,

Δ(X_{i_{ω,j}}, a_j)/n ≥ Δ(c_{i_{ω,j}}, a_j)/n − Δ(X_{i_{ω,j}}, c_{i_{ω,j}})/n ≥ δ − δ/2.

It follows that the deviation of (X_{i_{ω,1}}, ..., X_{i_{ω,q}}; Y_ω) is at least δ/2, and V rejects
it with probability at least γ·δ/2.
(2) Otherwise (i.e., some X_{i_{ω,j}} is (δ/2)-far from the corresponding c_{i_{ω,j}}), one of the
input-oracles is (δ/2)-far from being an E-codeword (because the c_{i_{ω,j}}'s are the E-codewords
closest to the X_{i_{ω,j}}'s). Again, the deviation of (X_{i_{ω,1}}, ..., X_{i_{ω,q}}; Y_ω)
is at least δ/2, and V rejects it with probability at least γ·δ/2.
We conclude that, for a good ω, when given access to (X_{i_{ω,1}}, ..., X_{i_{ω,q}}; Y_ω), the
inner-verifier V rejects with probability at least γ·δ/2. Recalling that a good ω is
selected with probability at least p = β·ε'', it follows that V rejects with probability
at least βε''·γδ/2.
CLAIM 5.13.3 (CASE 3). If Σ_{i=1}^N Δ_i/N ≤ ε/4 and c is (αδε/4q)-close to C',
then the inner-verifier rejects with probability at least (γδ/8)·ε.
PROOF. Referring to the second hypothesis, let x ∈ Σ'^{bK/a} be such that c is
(αδε/4q)-close to C'(x), when both are viewed as nN-long sequences over Σ'.
Since both c and C'(x) are N-sequences of E-codewords, these sequences may
differ on at most an (αε/4q) fraction of these codewords; that is, |{i : c_i ≠
E(C(x, i))}| ≤ (αε/4q)·N.

Combining the two hypotheses, it follows that (X_1, ..., X_N) is ((ε/4) +
(αδε/4q))-close to C'(x), and thus Y is (ε/2)-far from P'(x). (Note that, for the last
implication, we only use the fact that (X_1, ..., X_N) is (ε/2)-close to C'(x) whereas
((X_1, ..., X_N)^t, Y) is (ε/2)-far from C''(x).) For future usage, let us restate the fact
that Y is (ε/2)-far from P'(x) as follows:

E_ω[Δ(Y_ω, P(L_ω, C(x, i_{ω,1}), ..., C(x, i_{ω,q})))/ℓ] ≥ ε/2.   (34)
We first analyze what happens when our procedure T'' is given oracle access to
(c^t, Y). Recalling that |{i : c_i ≠ E(C(x, i))}| ≤ (αε/4q)·N and using the hypothesis
that an average query of the C-tester hits each location with probability at most
α^{-1}/N, it follows that Pr_{ω∈{0,1}^r, j∈[q]}[c_{i_{ω,j}} ≠ E(C(x, i_{ω,j}))] ≤ α^{-1}·(αε/4q) =
ε/4q. Thus, for a uniformly chosen ω, with probability at least 1 − q·(ε/4q), the
sequences (c_{i_{ω,1}}, ..., c_{i_{ω,q}}) and (E(C(x, i_{ω,1})), ..., E(C(x, i_{ω,q}))) are identical. Let
us denote the set of these choices by G. Then,

G = {ω : (∀j) d_{i_{ω,j}} = C(x, i_{ω,j})}, and Pr_ω[ω ∈ G] ≥ 1 − (ε/4).   (35)
We observe that, for any ω ∈ G, the deviation of ((c_{i_{ω,1}}, ..., c_{i_{ω,q}}); Y_ω) is lower-bounded
by the minimum between Δ(Y_ω, P(L_ω, d_{i_{ω,1}}, ..., d_{i_{ω,q}}))/ℓ and δ, where
the first term is due to changing Y_ω to fit the d_{i_{ω,j}}'s and the second term is due to
changing at least one of the c_{i_{ω,j}}'s so as to obtain some other (acceptable) sequence
of codewords. The minimum of the two terms is obviously lower-bounded by their
product; that is, the deviation of ((c_{i_{ω,1}}, ..., c_{i_{ω,q}}); Y_ω) is at least

(Δ(Y_ω, P(L_ω, d_{i_{ω,1}}, ..., d_{i_{ω,q}}))/ℓ) · δ.   (36)
Note that this lower-bound is in terms of the distance of Y_ω from the proof computed
for the d_{i_{ω,j}}'s, whereas Eq. (34) refers to the distance from the proof computed for
the C(x, i_{ω,j})'s. Yet, recalling that the d_{i_{ω,j}}'s equal the C(x, i_{ω,j})'s with probability
at least 1 (/4), (and using Eq. (34)) we have
E
ω
(Y
ω
, P(L
ω
, d
i
ω,1
,...,d
i
ω,q
))
2
4
=
4
. (37)
Using Eq. (36), it follows that the expected deviation of ((c
i
ω,1
,...,c
i
ω,q
); Y
ω
), when
the expectation is taken uniformly over ω ∈{0, 1}
r
, is at least δ/4 (and so V rejects
(
c
t
, Y ) with probability at least γδ/4).
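Both this proof and that of Claim 5.13.2 lower-bound a minimum of two quantities in [0,1] by their product; this elementary fact (min(a, d) ≥ a·d for a, d ∈ [0, 1]) can be spot-checked exhaustively on a small grid:

```python
from fractions import Fraction
from itertools import product

# The proofs of Claims 5.13.2/5.13.3 lower-bound a minimum of two quantities
# in [0, 1] by their product: min(a, d) >= a*d whenever a, d in [0, 1]
# (since multiplying by a number at most 1 can only shrink the other factor).
grid = [Fraction(k, 8) for k in range(9)]        # 0, 1/8, ..., 1
assert all(min(a, d) >= a * d for a, d in product(grid, repeat=2))
print("min(a,d) >= a*d verified on a 9x9 grid over [0,1]")
```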
However, we need to estimate the deviation of ((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω), which we
do next (in a way analogous to Case 2). For ω ∈ G, we lower-bound the deviation of
((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω) by considering two cases (as in the proof of Claim 5.13.2):
(1) Suppose that, for every j ∈ [q], the oracle X_{i_{ω,j}} is (δ/2)-close to c_{i_{ω,j}}. Recall
that (c_{i_{ω,1}}, ..., c_{i_{ω,q}}) equals (E(C(x, i_{ω,1})), ..., E(C(x, i_{ω,q}))), or equivalently
(d_{i_{ω,1}}, ..., d_{i_{ω,q}}) equals (C(x, i_{ω,1}), ..., C(x, i_{ω,q})). Thus, the deviation
of ((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω) is lower-bounded by the minimum between
Δ(Y_ω, P(L_ω, C(x, i_{ω,1}), ..., C(x, i_{ω,q})))/ℓ and δ − (δ/2), where the first term
is due to changing Y_ω to fit the C(x, i_{ω,j})'s and the second term is due to changing
at least one of the X_{i_{ω,j}}'s so as to obtain some other (acceptable) sequence of
codewords. As before, we lower-bound the deviation by the product of these
terms, yielding half the value of Eq. (36).
(2) Otherwise (i.e., some X_{i_{ω,j}} is (δ/2)-far from the corresponding c_{i_{ω,j}}), one of the
input-oracles is (δ/2)-far from being an E-codeword (because the c_{i_{ω,j}}'s are the
E-codewords closest to the X_{i_{ω,j}}'s). As in the proof of Claim 5.13.2, in this
case, the deviation of ((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω) is at least δ/2.
We conclude that, for ω ∈ G, the deviation of ((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω) is at least
half the value of Eq. (36). Using Eq. (37), we lower-bound the expected deviation of
((X_{i_{ω,1}}, ..., X_{i_{ω,q}}); Y_ω), where the expectation is taken uniformly over ω ∈ {0, 1}^r,
by (δ/2)·(ε/4). It follows that the inner-verifier V rejects with probability at least
γ·δε/8.
Combining Claims 5.13.1–5.13.3, while setting ε' = ε/4 and ε'' = αδε/4q, it
follows that the inner-verifier rejects with probability at least

min{αγ·ε'; (βγδ/2)·ε''; (γδ/8)·ε} = min{αγ/4; αβγδ^2/8q; γδ/8}·ε = (αβγδ^2/8q)·ε.

Recalling that the inner-verifier is invoked with probability 1/2 (and otherwise
the repetition test is invoked, with much better corresponding performance), the
theorem follows.
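That the middle term is indeed the minimum, for any choice of α, β, γ, δ ∈ (0, 1] and q ≥ 1, can be checked numerically (the sample parameter values are arbitrary):

```python
from fractions import Fraction

# Combining Claims 5.13.1-5.13.3 with eps' = eps/4 and eps'' = alpha*delta*eps/4q:
# for alpha, beta, gamma, delta in (0,1] and q >= 1, the minimum of the three
# per-epsilon rejection bounds is the middle one, alpha*beta*gamma*delta^2/(8q).
def combined(alpha, beta, gamma, delta, q):
    c1 = alpha * gamma / 4                                        # Claim 5.13.1
    c2 = (beta * gamma * delta / 2) * (alpha * delta / (4 * q))   # Claim 5.13.2
    c3 = gamma * delta / 8                                        # Claim 5.13.3
    return min(c1, c2, c3)

for params in [(1, 1, 1, 1, 2), (Fraction(1, 2), Fraction(1, 3), 1, Fraction(1, 2), 3)]:
    alpha, beta, gamma, delta, q = params
    assert combined(*params) == alpha * beta * gamma * delta ** 2 / (8 * q)
print("min of the three bounds equals alpha*beta*gamma*delta^2/(8q)")
```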
Preserving the Almost-Uniform Queries Property. Note that the proof of
Theorem 5.13 yields a codeword tester that does not make almost uniform queries.
We wish to redeem this state of affairs, both for the sake of elegance and for future
use in Section 5.6. This requires a minor modification of the construction presented
in the proof of Theorem 5.13, as well as using a LIPS that makes almost uniform
queries (as defined next). We stress that our main results (e.g., Theorem 2.3) do not
use the following Theorem 5.15, but we will need the following definition in any
case (i.e., also in case we do not use Theorem 5.15).
Definition 5.14 (LIPS with Almost Uniform Queries). For α ∈ (0, 1], a verifier
(of an (F, (q, b) → (p, a), ·, ·)-LIPS) is said to make α-uniform queries to its
oracles if, for each of its oracles and each location in that oracle, the probability that
a random query to that oracle equals that location is inversely proportional to the oracle's
length, where the constant of proportion is in [α, α^{-1}]. That is, for every j ∈ [q + 1],
we denote by p_j(i) the probability that a random query to the jth oracle is to location
i. We require that, for every j ∈ [q] and i ∈ [n], it holds that α/n ≤ p_j(i) ≤ α^{-1}/n,
and that, for every i ∈ [ℓ], it holds that α/ℓ ≤ p_{q+1}(i) ≤ α^{-1}/ℓ, where n and ℓ are
the lengths of the encoding and the proof, respectively.
Note that the definition only requires almost uniformity of the queries made to
each individual oracle, and nothing is required regarding the proportion of queries
made to the different oracles.
THEOREM 5.15 (THEOREM 5.13, REVISITED). Suppose we are given a locally
testable code and a LIPS as in the hypothesis of Theorem 5.13. Furthermore,
suppose that the LIPS makes α_V-uniform queries to its oracles and that 2^r·ℓ >
2Nn. Then, there exists an F-linear locally testable code as in the conclusion of
Theorem 5.13. Furthermore, for t ≝ ⌈2^r·ℓ/Nn⌉, this code can be tested by making
2p queries that are ((1 − t^{-1})·αα_V)-uniform and tossing 1 + log_2 t + max(r +
r_V, log M) coins such that ε-far sequences are rejected with probability at least
(αβγδ^2/16q)·ε, where M = tNn + 2^r·ℓ < 2^{r+1}·ℓ.
Theorem 5.15 provides a codeword tester that makes almost uniform queries
(whereas the codeword tester provided by Theorem 5.13 does not have this feature).
This is done at the expense of doubling the query complexity (i.e., from p to 2p),
and adding a term of log_2 t to the randomness complexity. Typically, r_V > 1 + log ℓ,
and so log_2 t < r + r_V − log N (and log_2 M < r + 1 + log ℓ < r + r_V), where
r (respectively, r_V) is the randomness complexity of the original codeword tester
(respectively, the LIPS verifier) and N is the length of the original code. In this case,
the randomness complexity grows from r + r_V + 1 to less than 2·(r + r_V + 1) − log_2 N.
We note that in our applications r + r_V = (1 + o(1))·log_2 N, and so the increase
in the randomness complexity merely doubles the o(1) term.
PROOF. We use the same code C'' as constructed in the proof of Theorem 5.13,
and slightly modify the codeword tester presented there. Instead of emulating the
C-tester using the first N encodings (in the tested word), we use the ith block of
N such encodings, for a uniformly chosen i ∈ [t]. The replication test is modified
accordingly (i.e., we compare this block to the i'-th block, for a uniformly chosen
i' ∈ [t]). In addition, we add dummy queries such that the resulting tester makes an
equal number of queries to each of the two parts of the tested word (i.e., the tNn-long
prefix and the suffix). Each of these dummy queries is uniformly distributed
in the corresponding part (and the answer is ignored by the tester). The purpose of
these modifications is to obtain a codeword tester that makes almost uniform queries.
Analogously to Remark 3.4 (see also the proof of Proposition 5.5), the (soundness)
analysis presented in the proof of Theorem 5.13 remains valid. Thus, we focus
on the syntactic conditions.

Clearly, it suffices to use p dummy queries, and thus the query complexity of the
new tester is 2p. By choosing a careful implementation (which recycles randomness
in order to implement the dummy queries^17), the randomness complexity increases
only by a log_2 t term (for selecting the index i ∈ [t] as mentioned above).
It remains to analyze the uniformity of the queries made by the tester. The dummy
queries guarantee that each of the two parts of the tested word is probed the same
number of times, and furthermore that the distribution of the dummy queries does
not skew the distribution in each of the two parts. Since the two parts are of almost
the same length (i.e., up to a factor of about 1 − t^{-1}), it suffices to analyze the
distribution in each part.

For the first part (i.e., the tNn-long prefix), we combine the hypothesis that
the C-tester makes α-uniform queries with the fact that we use a random copy of
C'. This means that the q input-oracles that we select (for the inner-verifier) are
almost uniformly distributed among the tN encodings (of length n each). Using
the hypothesis that the LIPS makes α_V-uniform queries to each of its oracles (and
thus to each of its input-oracles), it follows that the queries made to the first part are
αα_V-uniform.

For the second part (i.e., the proof part), we combine the fact that the tester
selects a uniformly distributed proof (i.e., ω ∈ {0, 1}^r is uniformly distributed)
with the hypothesis that the LIPS makes α_V-uniform queries to each of its oracles
(and thus to its proof-oracle). It follows that the queries made to the second part are
α_V-uniform. The theorem follows.
5.3.4. Composing Inner-Verifiers. Section 5.3.3 (e.g., Theorem 5.13) refers to
the composition of an outer code with an inner-verifier, yielding a new code. In
contrast, the following theorem refers to composing two inner-verifiers, yielding
a new inner-verifier. Indeed, we could have worked only with Theorem 5.13 (or
actually with Theorem 5.15), but it seems more convenient to (have and) work with
both types of composition theorems.^18 As in the case of Theorem 5.13, we need to
assume that the outer construct (in this case, the outer LIPS) makes almost uniform
queries; the reader is thus referred to Definition 5.14. We comment that the resulting
LIPS does not necessarily make almost uniform queries; we will redeem this state
of affairs at a later point (in Theorem 5.17).
THEOREM 5.16 (COMPOSITION OF LINEAR INNER-VERIFIERS). Consider a finite
field F, real numbers α_1, γ_1, δ_1, γ_2, δ_2 ∈ (0, 1], and integers b > b' >
b'' such that b'' divides b', which divides b. Suppose that there exist an
(F, (p, b) → (p', b'), γ_1, δ_1)-LIPS that makes α_1-uniform queries to its oracles
and an (F, (p', b') → (p'', b''), γ_2, δ_2)-LIPS. Then, there exists an (F, (p, b) →
(p'', b''), γ, δ_1δ_2)-LIPS, where γ = α_1γ_1γ_2δ_2/8p'. Furthermore, if the ith original
LIPS uses r_i coins, encoding length n_i and proof length ℓ_i, then the resulting LIPS
uses r_1 + r_2 coins, encoding length n_1·n_2, and proof length ℓ_1·n_2 + 2^{r_1}·ℓ_2.
^17 Note that the number of coins exceeds log_2 M, and thus we may re-use these coins to select a random
position for the dummy queries. It does not matter that all these dummy positions will be identical,
nor that they are correlated with the other queries.
^18 An analogous comment applies to the construction of PCP systems. That is, it suffices to have a
composition theorem that refers to using a standard PCP as an outer verifier and composes it with an
inner-verifier (as done in Arora and Safra [1998] and Arora et al. [1998] and most subsequent works).
However, it is useful to consider also the composition of two inner-verifiers (i.e., the composition of
PCPPs [Ben-Sasson et al. 2004] or assignment testers [Dinur and Reingold 2004]). We note that the
following composition result predates Ben-Sasson et al. [2004] and Dinur and Reingold [2004].
Locally Testable Codes and PCPs of Almost-Linear Length 625
PROOF. We start with the construction, which is analogous to the one used in
the proof of Theorem 5.13 (except that no replication is needed here). The basic
idea is to start from the (F, (p, b) → (p′, b′), δ₁, γ₁)-LIPS, which uses p input-
oracles that are supposed to be encoded using a function E₁ : F^b → (F^{b′})^{n₁} and
a proving function with range (F^{b′})^{ℓ₁}, and encode each of the F^{b′} symbols using a
function E₂ : F^{b′} → (F^{b″})^{n₂} (i.e., the encoding function of the second LIPS). This
yields p new input-oracles and a part of the new proof-oracle. In addition, we use
the proving function of the second LIPS to produce auxiliary proofs for each of
the possible coin tosses of the first (i.e., outer) verifier. The concatenation of these
auxiliary proofs yields the second part of the new proof-oracle. The new verifier will
check the execution of the first (i.e., outer) verifier by invoking the second verifier
and giving it access to the suitable oracles, which are blocks in the oracles to which
the new verifier is given access. Specifically, given a (F, (p, b) → (p′, b′), δ₁, γ₁)-
LIPS, denoted (E₁, P₁, V₁), and a (F, (p′, b′) → (p″, b″), δ₂, γ₂)-LIPS, denoted
(E₂, P₂, V₂), we define their composition, denoted (E, P, V), as follows:

The encoding function E : F^b → (F^{b″})^{n₁·n₂} is the concatenation of the en-
coding functions E₁ : F^b → (F^{b′})^{n₁} and E₂ : F^{b′} → (F^{b″})^{n₂}. That is, for
x ∈ (F^{b′})^{b/b′} ≡ F^b, we have E(x) = (E₂(y₁), ..., E₂(y_{n₁})), where
(y₁, ..., y_{n₁}) ≝ E₁(x).
The proving function P = (P^(1), P^(2)) operates as follows: Given L ∈ L_{F,pb}
and x₁, ..., x_p ∈ F^b, the first part of the proof (i.e., P^(1)(L, x₁, ..., x_p))
is the symbol-by-symbol encoding under E₂ of P₁(L, x₁, ..., x_p) ∈ (F^{b′})^{ℓ₁}.
That is, P^(1)(L, x₁, ..., x_p) = (E₂(y₁), ..., E₂(y_{ℓ₁})), where (y₁, ..., y_{ℓ₁}) ≝
P₁(L, x₁, ..., x_p).

The second part of the proof (i.e., P^(2)(L, x₁, ..., x_p)) consists of 2^{r₁} blocks
corresponding to each of the 2^{r₁} possible checks of V₁. For each ω₁ ∈ {0, 1}^{r₁},
the block corresponding to ω₁ in P^(2)(L, x₁, ..., x_p) is the ℓ₂-long sequence
P₂(L_{ω₁}, z_{ω₁,1}, ..., z_{ω₁,p′}), where z_{ω₁,1}, ..., z_{ω₁,p′} denote the p′ symbols (i.e.,
F^{b′}-symbols) of E₁(x₁), ..., E₁(x_p) and P₁(L, x₁, ..., x_p) that are inspected by
V₁(L, ω₁), and L_{ω₁} is the conjunction of F-linear conditions checked by V₁.
That is, if the jth query of V₁(L, ω₁) is to location ℓ_j of its i_j-th input-oracle
(respectively, of its proof-oracle), then z_{ω₁,j} equals the ℓ_j-th symbol in E₁(x_{i_j})
(respectively, in P₁(L, x₁, ..., x_p)).

Note that the proof length is ℓ₁ · n₂ + 2^{r₁} · ℓ₂, where the first (respectively,
second) term corresponds to P^(1) (respectively, P^(2)).
The verifier V is given L ∈ L_{F,pb} as well as oracle access to p input-oracles,
denoted X̄₁, ..., X̄_p, and to a proof-oracle, denoted Π̄ = (Π̄^(1), Π̄^(2)). The input-
oracle X̄_i : [n₁n₂] → F^{b″} is viewed as consisting of n₁ blocks, each of length
n₂, and Π̄^(1) : [ℓ₁n₂] → F^{b″} is viewed as consisting of ℓ₁ such blocks. We also
view Π̄^(2) : [2^{r₁}ℓ₂] → F^{b″} as ⟨Π̄^(2)_{ω₁} : ω₁ ∈ {0, 1}^{r₁}⟩, where Π̄^(2)_{ω₁} : [ℓ₂] → F^{b″}.
Note that X̄₁, ..., X̄_p and Π̄^(1) are supposed to be the encodings, under E₂, of
corresponding oracles X₁, ..., X_p and Π that are of the format expected by V₁.
Intuitively, V checks the claim that V₁ would have accepted these X₁, ..., X_p and
Π. The verifier V does so by selecting a check for V₁, and using V₂ to verify
the corresponding check, while utilizing a proof that is part of Π̄^(2).
626 O. GOLDREICH AND M. SUDAN
Specifically, on input L ∈ L_{F,pb} and coins (ω₁, ω₂) ∈ {0, 1}^{r₁+r₂}, the verifier
V(L, (ω₁, ω₂)) operates as follows:

(1) It determines the queries q_{ω₁,1}, ..., q_{ω₁,p′} that V₁(L, ω₁) makes into its p + 1
oracles (i.e., to X₁, ..., X_p and Π) on randomness ω₁, and the conjunction
of linear conditions L_{ω₁} that V₁(L, ω₁) needs to verify on the p′ responses.
Note that each of these p′ queries is actually a pair indicating an oracle and
a position in it; that is, q_{ω₁,j} = (i_{ω₁,j}, q′_{ω₁,j}), where i_{ω₁,j} ∈ [p + 1] and
q′_{ω₁,j} ∈ [k] such that k = n₁ if i_{ω₁,j} ∈ [p] and k = ℓ₁ otherwise.

(2) Next, V invokes V₂(L_{ω₁}, ω₂), providing it with oracle access to the input-
oracles as determined by q_{ω₁,1}, ..., q_{ω₁,p′} and to the proof-oracle that is the
block of Π̄^(2) that corresponds to ω₁. Specifically, for every j = 1, ..., p′,
the jth input-oracle of V₂, denoted X′_j, is defined to equal the q′_{ω₁,j}-th n₂-
long block of X̄_{i_{ω₁,j}} if i_{ω₁,j} ∈ [p] and the q′_{ω₁,j}-th n₂-long block of Π̄^(1)
otherwise (i.e., X′_j(i) = X̄_{i_{ω₁,j}}((q′_{ω₁,j} − 1) · n₂ + i) if i_{ω₁,j} ∈ [p] and X′_j(i) =
Π̄^(1)((q′_{ω₁,j} − 1) · n₂ + i) otherwise). The proof-oracle of V₂, denoted Π′, is defined
to equal Π̄^(2)_{ω₁}.

(3) The verifier V accepts if and only if V₂ accepts.
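The oracle plumbing in Step (2) is mechanical but easy to get wrong. The following sketch (ours; the function names and the toy parameters are hypothetical, and Python lists stand in for the oracles) shows how V carves the n₂-long blocks that serve as V₂'s input-oracles out of its own oracles.

```python
def block(oracle, q, n2):
    """The q-th n2-long block of `oracle` (q is 1-indexed), i.e.
    positions (q - 1) * n2 .. q * n2 - 1."""
    return oracle[(q - 1) * n2 : q * n2]

def inner_input_oracles(X_bars, Pi1, queries, n2):
    """Given V's input-oracles X_bars (a list of p lists), the first proof
    part Pi1, and V1's queries [(i, q), ...] with i in {1, ..., p+1},
    return the input-oracles handed to V2, as in Step (2)."""
    p = len(X_bars)
    return [block(X_bars[i - 1] if i <= p else Pi1, q, n2)
            for (i, q) in queries]

# Toy parameters (ours): p = 2 input-oracles, n1 = 2 blocks of length n2 = 3.
n2 = 3
X1 = [0, 1, 2, 3, 4, 5]            # blocks: [0, 1, 2] and [3, 4, 5]
X2 = [6, 7, 8, 9, 10, 11]
Pi1 = [12, 13, 14, 15, 16, 17]     # the ell_1 = 2 blocks of the proof part
oracles = inner_input_oracles([X1, X2], Pi1, [(1, 2), (3, 1)], n2)
assert oracles == [[3, 4, 5], [12, 13, 14]]
```

Here the query (1, 2) addresses the second block of the first input-oracle, while (3, 1) addresses the first block of the proof part (index i = p + 1 = 3).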
Aside from the (strong) soundness requirement, it is clear that the resulting
LIPS satisfies all other requirements. To evaluate the rejection probability of the
latter, we consider any (X̄₁, ..., X̄_p, (Π̄^(1), Π̄^(2))), where X̄_i : [n₁n₂] → F^{b″},
Π̄^(1) : [ℓ₁n₂] → F^{b″} and Π̄^(2) : [2^{r₁}ℓ₂] → F^{b″}. The analysis follows the outline of
the proof of Theorem 5.13. Specifically, we consider three cases (which correspond
to the three (main) cases considered in the proof of Theorem 5.13):

Case 1. Either some X̄_i (for i ∈ [p]) or Π̄^(1) is relatively far from a sequence
of E₂-codewords. In this case, V₂ is given access to some oracle (i.e., a block of
either X̄_i or Π̄^(1)) that is far (on the average) from an E₂-codeword, and rejects with
proportional probability. For details, see Claim 5.16.1.

Otherwise, we let X₁, ..., X_p and Π denote the corresponding sequences of E₂-
decodings. That is, encoding the elements of X_i (respectively, Π) under E₂ yields
the sequence of E₂-codewords that is closest to X̄_i (respectively, Π̄^(1)).

Case 2. The deviation of (X₁, ..., X_p, Π) with respect to (E₁, P₁, V₁) is rel-
atively big. In this case, with proportional probability, V₁ determines p′ positions
in (X₁, ..., X_p, Π) such that their contents violate the linear condition (checked
by V₁). In the latter case, V₂ is given access to a corresponding sequence of p′ ora-
cles that has a big deviation, and rejects with constant probability. Thus, V rejects
with probability proportional to the deviation of (X₁, ..., X_p, Π). For details, see
Claim 5.16.2.

Case 3. Otherwise, (X̄₁, ..., X̄_p, Π̄^(1)) is close to the sequence of E₂-encodings
of (X₁, ..., X_p, Π), which in turn has a relatively small deviation with respect to
(E₁, P₁, V₁). It follows that Π̄^(2) is far from the corresponding sequence of canonical
proofs, and V₂ rejects with probability proportional to the latter distance. For details,
see Claim 5.16.3.

The proofs of the aforementioned claims are very similar to the proofs of the
corresponding claims in the proof of Theorem 5.13. The key difference is that we
refer to the deviation of sequences of oracles (as defined in Definition 5.9) rather
than to distances from codewords.
Suppose that (X̄₁, ..., X̄_p, (Π̄^(1), Π̄^(2))) has deviation ε with respect to the linear
inner proof system (E, P, V). Our aim is to show that these oracles will be rejected
by V with probability proportional to ε. We will use the following notations:

To simplify notations, let us denote X̄_{p+1} ≝ Π̄^(1). Let k_i = n₁ if i ∈ [p] and
k_{p+1} = ℓ₁.

For every i ∈ [p + 1], let X̄_i = (w_{i,1}, ..., w_{i,k_i}), where w_{i,j} ∈ (F^{b″})^{n₂}, and
δ_{i,j} ≝ Δ_{E₂}(w_{i,j})/n₂ for j ∈ [k_i]. That is, δ_{i,j} is the relative distance of the
jth block (of length n₂) in X̄_i from an E₂-codeword.
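The quantities δ_{i,j} can be phrased operationally: split the oracle into n₂-long blocks and take each block's relative Hamming distance to the nearest codeword. A minimal sketch (ours, using a toy repetition code in place of E₂; all identifiers are hypothetical):

```python
def rel_dist(u, v):
    """Relative Hamming distance between two equal-length sequences."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

def block_deltas(X_bar, codewords, n2):
    """delta_{i,j}: the relative distance of each n2-long block of X_bar
    from the nearest codeword in `codewords` (standing in for E2's range)."""
    blocks = [X_bar[t:t + n2] for t in range(0, len(X_bar), n2)]
    return [min(rel_dist(b, c) for c in codewords) for b in blocks]

# Toy inner code (ours): the length-3 binary repetition code.
code = [[0, 0, 0], [1, 1, 1]]
deltas = block_deltas([0, 0, 0, 1, 0, 1], code, 3)  # blocks [0,0,0], [1,0,1]
assert deltas[0] == 0.0 and abs(deltas[1] - 1 / 3) < 1e-9
```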
CLAIM 5.16.1 (CASE 1—USING ε′ = ε/4). If, for some i ∈ [p + 1], it holds
that Σ_{j=1}^{k_i} δ_{i,j}/k_i > ε′, then V rejects with probability at least α₁γ₂ · ε′.

PROOF. Recalling that V₁ makes α₁-uniform queries to each of its oracles, we
note that the queries of V₁ to its ith oracle correspond to n₂-long blocks in X̄_i that
are at expected (relative) distance at least α₁ · ε′ from E₂. Thus, V₂ is given access
to p′ oracles that have an expected deviation of at least α₁ε′, and so V₂ rejects with
probability at least γ₂ · α₁ε′, where the probability is taken over the random choices
of ω₁ and ω₂. The claim follows.
By Claim 5.16.1, we may focus on the case that Σ_{j=1}^{k_i} δ_{i,j}/k_i ≤ ε′ (for every
i ∈ [p + 1]). We are going to consider the E₂-codewords, denoted by the c_{i,j}'s, that
are closest to the w_{i,j}'s (i.e., Δ(w_{i,j}, c_{i,j}) = δ_{i,j} · n₂). Let d_{i,j} be the E₂-decoding of
c_{i,j} (i.e., c_{i,j} = E₂(d_{i,j})), and X_i = (d_{i,1}, ..., d_{i,k_i}).
CLAIM 5.16.2 (CASE 2—USING ε″ = α₁ε/4p′). If the deviation of (X₁,
..., X_{p+1}) with respect to (E₁, P₁, V₁) is at least ε″, then V rejects with proba-
bility at least (γ₁γ₂δ₂/2) · ε″.

The condition Σ_{j=1}^{k_i} δ_{i,j}/k_i ≤ ε′ (for every i) is omitted from Claim 5.16.2,
because this claim holds regardless of this condition.
PROOF. By the claim's hypothesis (and the strong soundness of V₁), the verifier
V₁ would have rejected (X₁, ..., X_{p+1}) with probability at least ρ ≝ γ₁ · ε″.
Let us call a choice of coins ω₁ for V₁ good if V₁ would have rejected the
values (d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}); that is, (d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}) ∉ L_{ω₁}. (Recall that q_{ω₁,j} =
(i_{ω₁,j}, q′_{ω₁,j}), where i_{ω₁,j} ∈ [p + 1] and q′_{ω₁,j} ∈ [k_{i_{ω₁,j}}], and that d_{q_{ω₁,j}} is the q′_{ω₁,j}-th
symbol in X_{i_{ω₁,j}}.) By the above, with probability ρ, a uniformly selected choice of
ω₁ ∈ {0, 1}^{r₁} is good. On the other hand, for good ω₁, when given input L_{ω₁} and
access to the input-oracles (c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}) = (E₂(d_{q_{ω₁,1}}), ..., E₂(d_{q_{ω₁,p′}})) (and
any proof-oracle), the verifier V₂ rejects with constant probability (i.e., probability
at least γ₂ · δ₂). We need, however, to consider what happens when V₂ is given
access to the input-oracles X̄_{q_{ω₁,1}}, ..., X̄_{q_{ω₁,p′}} (and the proof-oracle Π̄^(2)_{ω₁}), for a
good ω₁. We next show that, for a good ω₁, the verifier V₂ rejects the input-oracles
X̄_{ω₁} ≝ (X̄_{q_{ω₁,1}}, ..., X̄_{q_{ω₁,p′}}) with constant probability (regardless of Π̄^(2)_{ω₁}). This
happens regardless of whether or not X̄_{ω₁} is close to (c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}). We consider
two cases:

(1) Suppose that, for every j ∈ [p′], the oracle X̄_{q_{ω₁,j}} is (δ₂/2)-close to c_{q_{ω₁,j}}. Then,
for every acceptable p′-tuple (i.e., (a₁, ..., a_{p′}) ∈ {(E₂(z₁), ..., E₂(z_{p′})) :
(z₁, ..., z_{p′}) ∈ L_{ω₁}}), there exists a j such that the oracle X̄_{q_{ω₁,j}} is (δ₂/2)-
far from the jth element in the acceptable sequence (i.e., from a_j). The reason
being that for every acceptable (a₁, ..., a_{p′}) there exists a j such that a_j ≠ c_{q_{ω₁,j}},
while on the other hand a_j must also be an E₂-codeword. It follows that the
deviation of (X̄_{ω₁}; Π̄^(2)_{ω₁}) with respect to (E₂, P₂, V₂) is at least δ₂/2, and V₂
rejects it with probability at least γ₂ · δ₂/2.

(2) Otherwise (i.e., some X̄_{q_{ω₁,j}} is (δ₂/2)-far from the corresponding c_{q_{ω₁,j}}), one of
the input-oracles is (δ₂/2)-far from being an E₂-codeword (because the c_{q_{ω₁,j}}'s are
the E₂-codewords closest to the X̄_{q_{ω₁,j}}'s). Again, the deviation of (X̄_{ω₁}; Π̄^(2)_{ω₁}) is
at least δ₂/2, and V₂ rejects it with probability at least γ₂ · δ₂/2.

We conclude that, for a good ω₁, when given access to (X̄_{ω₁}; Π̄^(2)_{ω₁}), the verifier V₂
rejects with probability at least γ₂ · δ₂/2. Recalling that a good ω₁ is selected with
probability at least ρ = γ₁ε″, it follows that V rejects with probability at least
γ₁ε″ · γ₂δ₂/2.
CLAIM 5.16.3 (CASE 3). If for every i ∈ [p + 1] it holds that Σ_{j=1}^{k_i} δ_{i,j}/k_i ≤
ε/4 and the deviation of (X₁, ..., X_{p+1}) with respect to (E₁, P₁, V₁) is at most
α₁ε/4p′, then V rejects with probability at least (γ₂δ₂/8) · ε.
PROOF. Referring to the second hypothesis, let (x₁, ..., x_p) ∈ L (where
L ∈ L_{F,pb}) be such that for every i ∈ [p] the input-oracle X_i is (α₁ε/4p′)-close to
(y_{i,1}, ..., y_{i,n₁}) ≝ E₁(x_i) and X_{p+1} is (α₁ε/4p′)-close to (y_{p+1,1}, ..., y_{p+1,ℓ₁}) ≝
P₁(L, x₁, ..., x_p), when all these objects are viewed as sequences over F^{b′}.
It follows that the E₂-encodings of these objects (i.e., (c_{i,1}, ..., c_{i,k_i}) and
(E₂(y_{i,1}), ..., E₂(y_{i,k_i}))) differ on at most an (α₁ε/4p′) fraction of these E₂-
codewords; that is, for every i, we have |{j : c_{i,j} ≠ E₂(y_{i,j})}| ≤ (α₁ε/4p′) · k_i.

Combining the two hypotheses, it follows that each X̄_i is ((ε/4) +
(α₁ε/4p′))-close to (E₂(y_{i,1}), ..., E₂(y_{i,k_i})), and thus Ȳ ≝ Π̄^(2) is (ε/2)-far
from P^(2)(L, x₁, ..., x_p). (Note that for the last implication we only use the
fact that each X̄_i is (ε/2)-close to (E₂(y_{i,1}), ..., E₂(y_{i,k_i})), whereas the deviation of
(X̄₁, ..., X̄_p; (X̄_{p+1}, Ȳ)) with respect to (E, P, V) is ε.) For future usage, let us re-
state the fact that Ȳ = ⟨Ȳ_{ω₁} : ω₁ ∈ {0, 1}^{r₁}⟩ is (ε/2)-far from P^(2)(L, x₁, ..., x_p) =
⟨P₂(L_{ω₁}, y_{q_{ω₁,1}}, ..., y_{q_{ω₁,p′}}) : ω₁ ∈ {0, 1}^{r₁}⟩ as follows:

E_{ω₁}[Δ(Ȳ_{ω₁}, P₂(L_{ω₁}, y_{q_{ω₁,1}}, ..., y_{q_{ω₁,p′}}))/ℓ₂] ≥ ε/2.   (38)
We first analyze what happens when V is given oracle access to
(c̄₁, ..., c̄_p; (c̄_{p+1}, Ȳ)), where c̄_i ≝ (c_{i,1}, ..., c_{i,k_i}). Recalling that |{j : c_{i,j} ≠
E₂(y_{i,j})}| ≤ (α₁ε/4p′) · k_i and using the hypothesis that V₁ makes α₁-uniform
queries to each of its oracles, it follows that Pr_{ω₁∈{0,1}^{r₁}, j∈[p′]}[c_{q_{ω₁,j}} ≠ E₂(y_{q_{ω₁,j}})] ≤
α₁^{−1} · (α₁ε/4p′) = ε/4p′. Thus, for a uniformly chosen ω₁, with probability at least
1 − p′ · ε/4p′, the sequences (c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}) and (E₂(y_{q_{ω₁,1}}), ..., E₂(y_{q_{ω₁,p′}})) are
identical. Let us denote the set of these choices by G. Then,

G = {ω₁ : (∀j) d_{q_{ω₁,j}} = y_{q_{ω₁,j}}}, and Pr_{ω₁}[ω₁ ∈ G] ≥ 1 − (ε/4).   (39)
We observe that, for any ω₁ ∈ G, the deviation of ((c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}); Ȳ_{ω₁})
with respect to (E₂, P₂, V₂) is lower-bounded by the minimum between
Δ(Ȳ_{ω₁}, P₂(L_{ω₁}, d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}))/ℓ₂ and δ₂, where the first term is due to chang-
ing Ȳ_{ω₁} to fit the d_{q_{ω₁,j}}'s and the second term is due to changing at least one of the
c_{q_{ω₁,j}}'s so as to obtain some other (acceptable) sequence of codewords. The minimum
of the two terms is obviously lower-bounded by their product; that is, the deviation
of ((c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}); Ȳ_{ω₁}) is at least

(Δ(Ȳ_{ω₁}, P₂(L_{ω₁}, d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}))/ℓ₂) · δ₂.   (40)

Note that this lower-bound is in terms of the distance of Ȳ_{ω₁} from the proof computed
for the d_{q_{ω₁,j}}'s, whereas Eq. (38) refers to the distance from the proof computed for
the y_{q_{ω₁,j}}'s. Yet, recalling that the d_{q_{ω₁,j}}'s equal the y_{q_{ω₁,j}}'s with probability at least
1 − (ε/4), and using Eq. (38), we have

E_{ω₁}[Δ(Ȳ_{ω₁}, P₂(L_{ω₁}, d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}))/ℓ₂] ≥ ε/4.   (41)

Using Eq. (40), it follows that the expected deviation of ((c_{q_{ω₁,1}}, ..., c_{q_{ω₁,p′}}); Ȳ_{ω₁}),
when the expectation is taken uniformly over ω₁ ∈ {0, 1}^{r₁}, is at least δ₂ε/4 (and
so V rejects (c̄₁, ..., c̄_p, (c̄_{p+1}, Ȳ)) with probability at least γ₂δ₂ε/4).
However, we need to estimate the deviation of (X̄_{ω₁}; Ȳ_{ω₁}), where X̄_{ω₁} =
(X̄_{q_{ω₁,1}}, ..., X̄_{q_{ω₁,p′}}), which we do next. For ω₁ ∈ G, we lower-bound the devi-
ation of (X̄_{ω₁}; Ȳ_{ω₁}) by considering two cases (as in the proof of Claim 5.16.2):

(1) Suppose that, for every j ∈ [p′], the oracle X̄_{q_{ω₁,j}} is (δ₂/2)-close to c_{q_{ω₁,j}}. Then,
the deviation of ((X̄_{q_{ω₁,1}}, ..., X̄_{q_{ω₁,p′}}); Ȳ_{ω₁}) is lower-bounded by the minimum
between Δ(Ȳ_{ω₁}, P₂(L_{ω₁}, d_{q_{ω₁,1}}, ..., d_{q_{ω₁,p′}}))/ℓ₂ and δ₂ − (δ₂/2), where the first
term is due to changing Ȳ_{ω₁} to fit the d_{q_{ω₁,j}}'s and the second term is due to
changing at least one of the X̄_{q_{ω₁,j}}'s so as to obtain some other (acceptable) se-
quence of codewords. As before, we lower-bound the deviation by the product
of these terms, yielding half the value of Eq. (40).

(2) Otherwise (i.e., some X̄_{q_{ω₁,j}} is (δ₂/2)-far from the corresponding c_{q_{ω₁,j}}), one
of the input-oracles is (δ₂/2)-far from being an E₂-codeword. As in the proof of
Claim 5.16.2, in this case, the deviation of (X̄_{ω₁}, Ȳ_{ω₁}) is at least δ₂/2.

We conclude that, for ω₁ ∈ G, the deviation of (X̄_{ω₁}, Ȳ_{ω₁}) is at least half the value
of Eq. (40). Using Eq. (41), we lower-bound the expected deviation of (X̄_{ω₁}, Ȳ_{ω₁})
by (δ₂/2) · (ε/4). It follows that the inner-verifier V rejects with probability at least
γ₂ · δ₂ε/8.
Combining Claims 5.16.1–5.16.3, while setting ε′ = ε/4 and ε″ = α₁ε/4p′, it
follows that the inner-verifier rejects with probability at least

min{ α₁γ₂ε′ ; (γ₁γ₂δ₂/2) · ε″ ; (γ₂δ₂/8) · ε } = min{ α₁γ₂/4 ; α₁γ₁γ₂δ₂/8p′ ; γ₂δ₂/8 } · ε = (α₁γ₁γ₂δ₂/8p′) · ε,

and the theorem follows.
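The final minimum can be sanity-checked numerically: for any α₁, γ₁, γ₂, δ₂ ∈ (0, 1] and p′ ≥ 1, the middle term α₁γ₁γ₂δ₂/8p′ is dominated by the other two (since γ₁δ₂/2p′ ≤ 1 and α₁γ₁/p′ ≤ 1). A small check (ours, with arbitrary sample parameters; `pp` stands for p′):

```python
def rejection_terms(a1, g1, g2, d2, pp):
    """The three per-case factors (coefficients of epsilon) obtained from
    Claims 5.16.1-5.16.3, with pp standing for p'."""
    return (a1 * g2 / 4, a1 * g1 * g2 * d2 / (8 * pp), g2 * d2 / 8)

# For parameters in (0, 1] and pp >= 1, the middle term attains the minimum.
for params in [(1, 1, 1, 1, 1), (0.5, 0.3, 0.9, 0.2, 7), (0.1, 1.0, 0.5, 1.0, 3)]:
    terms = rejection_terms(*params)
    assert min(terms) == terms[1]
```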
Preserving the Almost-Uniform Queries Property. Note that the proof of
Theorem 5.16 yields an inner-verifier that does not necessarily make almost uniform
queries to its oracles. We wish to redeem this state of affairs, both for the sake of
elegance and for future use in Section 5.6. This requires a minor modification of
the construction presented in the proof of Theorem 5.16, as well as using an "inner"
system (i.e., (E₂, P₂, V₂)) that makes almost uniform queries to its oracles. We also
assume that each of the composed verifiers makes the same number of queries to
each of its input-oracles, which is hereafter referred to as regularity. We stress that
our main results (e.g., Theorem 2.3) do not use the following Theorem 5.17.
THEOREM 5.17 (THEOREM 5.16, REVISITED). Let (E₁, P₁, V₁) and (E₂, P₂,
V₂) be two LIPSes as in the hypothesis of Theorem 5.16, and γ = α₁γ₁γ₂δ₂/8p′ as
there. Furthermore, suppose that V₂ makes α₂-uniform queries to its oracles and
that each of the two verifiers makes the same number of queries to each of its input-
oracles. Then, for any ε ∈ (0, 1), there exists a (F, (p, b) → (p″, b″), δ₁δ₂, γ/2)-
LIPS that makes (1 − ε)α₁α₂-uniform queries to its oracles and makes the same num-
ber of queries to each of its input-oracles. Furthermore, if the ith given LIPS uses
rᵢ coins, encoding length nᵢ and proof length ℓᵢ then, assuming that nᵢ < ℓᵢ < 2^{rᵢ},
the resulting inner verifier V uses 1 + r₁ + r₂ + log(2^{r₁}ℓ₂/ℓ₁n₂) + log(pp″/ε)
coins, encoding length n₁ · n₂ and proof length (pp″/ε) · (ℓ₁ · n₂ + 2^{r₁} · ℓ₂).
We comment that the hypothesis that a verifier makes the same number of
queries to each of its input-oracles is quite natural. We note that in comparison
to Theorem 5.16, the proof-oracle of V is only pp″/ε times longer. In our ap-
plications, we don't care about constant factors in the randomness complexity of
the inner-verifier, and thus it is worthwhile to note that the randomness complex-
ity of V is at most 2 · (r₁ + r₂) + log(pp″/ε), whereas in Theorem 5.16 it was
r₁ + r₂.
PROOF. We use almost the same construction as in the proof of Theorem 5.16.
The only modification is that we replicate the two different parts of the resulting
proof-oracle an adequate number of times. The purpose of this replication is to
guarantee that uniform queries to each of the two parts yield uniform queries to
the resulting proof-oracle. Needless to say, we need to check the validity of the
replication by an adequate replication test.

For parameters t₁ and t₂ to be determined later, we let the proof (constructed by
the proving function P) consist of t₁ copies of Π^(1) = P^(1)(L, x₁, ..., x_p) followed
by t₂ copies of Π^(2) = P^(2)(L, x₁, ..., x_p). The corresponding verifier V tests
these replications (by comparing two random locations) with probability 1/2, and
otherwise acts as before, using randomly selected copies of Π^(1) and Π^(2). As
in the proof of Theorem 5.15, the rejection probability of the resulting verifier is
maintained (up to a factor of 1/2). Thus, the (completeness and) strong soundness
holds for any choice of the parameters t₁ and t₂. Turning to analyze the distribution
of queries made by V, we consider three types of queries:
(1) Queries made by V to one of its p input-oracles. Recall that these queries
are determined by the queries that V₁ makes to its own p input-oracles,
and the queries made by V₂ to the input-oracles determined by the former
queries. Using the uniformity conditions of these two verifiers (and the regu-
larity of V₂'s queries), we conclude that V makes α₁α₂-uniform queries to its
input-oracles. Furthermore, if both V₁ and V₂ are regular (i.e., make the same
number of queries to each of the input-oracles), then so is V.

(2) Queries made by V to the first part of its proof-oracle. These queries are de-
termined by the queries that V₁ makes to its proof-oracle and the queries made
by V₂ to the input-oracles determined by the former queries. Similarly to the
previous item, we conclude that V makes α₁α₂-uniform queries to the first part
of its proof-oracle.

(3) Queries made by V to the second part of its proof-oracle. These queries are
determined by the uniformly chosen coins ω₁ ∈ {0, 1}^{r₁} and the queries made
by V₂ to its proof-oracle. Thus, V makes α₂-uniform queries to the second part
of its proof-oracle.
The issue at hand is the proportion between the number of queries that V makes
to each of the two parts of its proof-oracle. Suppose that, on the average, V₁ (re-
spectively, V₂) makes p₁ ≤ p′ (respectively, p₂ ≤ p″) queries to its proof-oracle.
Then, on the average, V makes p₁ · (p″ − p₂)/p′ queries to the first part of its
proof-oracle, and p₂ queries to the second part. Thus, we should replicate the two
parts of the proof-oracle so as to fit these proportions; that is, we should have

(t₁ · ℓ₁n₂)/(t₂ · 2^{r₁}ℓ₂) ≈ (p₁ · (p″ − p₂))/(p′ · p₂).   (42)

Recalling that ℓ₁n₂ < 2^{r₁}ℓ₂, it suffices to have t₂ ∈ [pp″/ε] in order to obtain in
Eq. (42) an approximation up to a factor of (1 ± ε). Furthermore, t₁ · ℓ₁n₂ + t₂ · 2^{r₁}ℓ₂
need not be greater than (pp″/ε) · (ℓ₁n₂ + 2^{r₁}ℓ₂). The added randomness (re-
quired for selecting random copies) is thus bounded by log₂ t₁ + log₂ t₂ =
log₂(pp″2^{r₁}ℓ₂/εℓ₁n₂). (Note that the replication test itself can be implemented
using log₂((pp″/ε) · (ℓ₁n₂ + 2^{r₁}ℓ₂)) coins, which in turn is upper-bounded by
r₁ + log₂ ℓ₂ + log₂(pp″/ε) < r₁ + r₂ + log₂(pp″/ε).) The theorem follows.
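To make the choice of t₁ and t₂ concrete, a small sketch (ours; the function name and the sample values are hypothetical) picks t₂ on the order of pp″/ε and sets t₁ to match the target ratio of Eq. (42):

```python
from math import ceil

def replication_counts(len1, len2, ratio, p, ppp, eps):
    """Pick t1, t2 so that (t1 * len1) / (t2 * len2) approximates `ratio`,
    as in Eq. (42); here len1 = ell_1 * n_2, len2 = 2^{r_1} * ell_2, and
    ratio = p_1 * (p'' - p_2) / (p' * p_2).  Taking t2 on the order of
    p * p'' / eps leaves enough granularity to land within (1 +/- eps)."""
    t2 = ceil(p * ppp / eps)
    t1 = max(1, round(t2 * len2 * ratio / len1))
    return t1, t2

# Hypothetical sample values (ours): len1 < len2 and a target ratio of 0.6.
t1, t2 = replication_counts(len1=40, len2=1000, ratio=0.6, p=3, ppp=8, eps=0.1)
approx = (t1 * 40) / (t2 * 1000)
assert abs(approx / 0.6 - 1) <= 0.1
```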
5.4. LINEAR INNER VERIFIERS: TWO CONSTRUCTIONS. Throughout the rest of
this section, F₂ ≝ GF(2). We present two LIPSes, one based on the Hadamard
encoding function, and the other based on the Reed–Muller encoding function. The
first LIPS is a straightforward adaptation of the "inner-most" verifier of Arora et al.
[1998], whereas the second LIPS is obtained by a careful adaptation of the "outer"
verifier of Arora et al. [1998].

5.4.1. LIPS Based on the Hadamard-Encoding Function. We start by present-
ing a linear inner verifier that corresponds to the inner-most verifier of Arora et al.
[1998]. Things are only simpler in our context, since we only need to prove (and
verify) linear conditions (and so we do not need the table of quadratic forms used
in the original work). Thus, these F₂-linear conditions, which refer to p elements
of F₂^k, may be easily verified by accessing the Hadamard encoding of these p
elements.

We comment that one possible implementation of the aforementioned idea
amounts to testing each of these p encodings (via a 3-query codeword test), and
checking a random F₂-linear condition by self-correction (requiring 2 queries to
each input-oracle). Indeed, this implementation requires no proof-oracle, but seems
to require at least 2p queries (whereas a straightforward implementation uses 5p
queries). The alternative implementation presented below makes only p + O(1)
queries and is closer in spirit to the inner-most verifier of Arora et al. [1998] (espe-
cially, as interpreted in Harsha and Sudan [2000, Lem. 2.6]).
PROPOSITION 5.18. For every pair of integers p and k, there exists a
(F₂, (p, k) → (p + 5, 1), 1/2, 1/8)-LIPS. Furthermore, the length of the encoding
is 2^k, the length of the proof is 2^{pk}, and the randomness in use equals 3pk + p.
Moreover, the verifier makes uniformly distributed queries to each of its oracles,
and makes exactly one query to each of the p input-oracles.
PROOF. The encoding function E : F₂^k → F₂^{2^k} is just the Hadamard encod-
ing (having relative distance 1/2). The proving function P(L, x₁, ..., x_p) ∈ F₂^{2^{pk}}
is also the Hadamard encoding, this time of the vector (x₁, ..., x_p). (Indeed,
P(L, x₁, ..., x_p) = E(x₁ ··· x_p) is oblivious of L.) The verifier V is given a
linear subspace L, in the form of a matrix M ∈ F₂^{pk×pk}, and access to input-oracles
X₁, ..., X_p : F₂^k → F₂ and a proof-oracle Π : F₂^{pk} → F₂. It operates as follows:

(1) Selects uniformly r = (r₁, ..., r_p) ∈ F₂^{pk} and s = (s₁, ..., s_p) ∈ F₂^{pk}, and
checks that Σ_{i=1}^p Xᵢ(rᵢ) = Π(r) and Π(r) + Π(s) = Π(r ⊕ s).

(2) Selects a random linear combination v of the constraints of L (i.e., picks a
random vector w ∈ F₂^{pk} and sets v = wM), and verifies that Π(r) = Π(r + v).

(3) Selects uniformly σ₁, ..., σ_p ∈ F₂ and checks that Σ_{i=1}^p σᵢ · Xᵢ(rᵢ) = Π(s ⊕
r′) − Π(s), where r′ = (r′₁, ..., r′_p) such that r′ᵢ = rᵢ if σᵢ = 1 and r′ᵢ = 0^k
otherwise.
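Steps (1)–(3) can be exercised directly. The sketch below (ours; it is only a completeness check on a toy instance, not the soundness analysis, and all identifiers are hypothetical) implements the Hadamard encoding E(x, r) = ⟨x, r⟩ over F₂, honest oracles, and the three tests; the constraint matrix M is assumed to be such that x ∈ L iff every row m of M has ⟨m, x⟩ = 0.

```python
import random

def ip(u, v):
    """Inner product over F_2."""
    return sum(a & b for a, b in zip(u, v)) & 1

def xor(u, v):
    return [a ^ b for a, b in zip(u, v)]

def hadamard_lips_verifier(M, X, Pi, p, k, rng):
    """One run of the verifier of Proposition 5.18 (our sketch).
    X[i] and Pi are oracle callables; M lists the F_2-linear constraints."""
    r = [rng.randrange(2) for _ in range(p * k)]
    s = [rng.randrange(2) for _ in range(p * k)]
    ri = [r[i * k:(i + 1) * k] for i in range(p)]
    # Step (1): consistency with the input-oracles, and linearity of Pi.
    if (sum(X[i](ri[i]) for i in range(p)) & 1) != Pi(r):
        return False
    if (Pi(r) ^ Pi(s)) != Pi(xor(r, s)):
        return False
    # Step (2): a random combination v of the constraints; Pi must not change.
    v = [0] * (p * k)
    for row in M:
        if rng.randrange(2):
            v = xor(v, row)
    if Pi(r) != Pi(xor(r, v)):
        return False
    # Step (3): random subset of the X_i's, read off by self-correcting Pi.
    sigma = [rng.randrange(2) for _ in range(p)]
    rp = sum(([b if sigma[i] else 0 for b in ri[i]] for i in range(p)), [])
    lhs = sum(sigma[i] & X[i](ri[i]) for i in range(p)) & 1
    return lhs == (Pi(xor(s, rp)) ^ Pi(s))

# Completeness on a toy instance: p = 2, k = 3, single constraint x_1 = x_4.
p, k = 2, 3
x = [1, 0, 1, 1, 1, 0]                       # satisfies <m, x> = x_1 + x_4 = 0
M = [[1, 0, 0, 1, 0, 0]]
X = [lambda q, i=i: ip(x[i * k:(i + 1) * k], q) for i in range(p)]
Pi = lambda q: ip(x, q)
rng = random.Random(0)
assert all(hadamard_lips_verifier(M, X, Pi, p, k, rng) for _ in range(200))
```

With honest oracles for an x satisfying the constraints, all three tests pass on every choice of coins, matching perfect completeness.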
We note that self-correction (cf. Blum et al. [1993]) is performed only on the
proof-oracle, whereas each input-oracle is queried at a single (random) point. Fur-
thermore, a couple of queries to the proof-oracle are being re-used (yielding a saving
of two queries). We observe that all complexities are as stated in the proposition,
and claim that (strong) soundness follows by the standard analysis. Still, since
strong soundness was not analyzed explicitly before, we provide a detailed analy-
sis next. Suppose that (X₁, ..., X_p; Π) has deviation ε with respect to (E, P, V).
For ε′ = min(ε/2, 1/8), we consider the following possible sources of the value of
the deviation.
Case 1. The proof-oracle Π is ε′-far from the code E. In this case, the linearity
test applied to Π in Step (1) guarantees that V rejects with probability at least ε′
(cf. Bellare et al. [1996]).

Thus, we may assume for the rest of the analysis that Π is ε′-close to E(x₁ ··· x_p),
for some x₁, ..., x_p ∈ F₂^k. We fix these xᵢ's for the rest of the analysis. Viewing
E : F₂^{pk} → F₂^{2^{pk}} as E : F₂^{pk} × F₂^{pk} → F₂, we note that E(x₁ ··· x_p, r₁ ··· r_p) =
Σ_{i=1}^p E(xᵢ, rᵢ), and so

Pr_r[Π(r) = Σ_{i=1}^p E(xᵢ, rᵢ)] ≥ 1 − ε′,   (43)

where r = (r₁, ..., r_p) is uniformly distributed in F₂^{pk}.
Case 2. Pr_r[Π(r) ≠ Σ_{i=1}^p Xᵢ(rᵢ)] ≥ ε′. In this case, the other test in Step (1)
rejects with probability at least ε′.

Thus, we may assume that Pr_r[Π(r) = Σ_{i=1}^p Xᵢ(rᵢ)] ≥ 1 − ε′. Combining this
with Eq. (43), we have

Pr_r[Σ_{i=1}^p Xᵢ(rᵢ) = Σ_{i=1}^p E(xᵢ, rᵢ)] ≥ 1 − 2ε′.   (44)

It follows that for every j ∈ [p] there exists c = (c₁, ..., c_{j−1}, c_{j+1}, ..., c_p) ∈
F₂^{(p−1)k} such that Pr_{r_j}[X_j(r_j) = E(x_j, r_j) + b_{j,c}] ≥ 1 − 2ε′, where b_{j,c} ≝
Σ_{i≠j}(E(xᵢ, cᵢ) − Xᵢ(cᵢ)).
Case 3. For some j, the input-oracle X_j is not 2ε′-close to E(x_j). Recalling
that X_j is 2ε′-close to E(x_j) + b · 1^{2^k}, where b ≝ b_{j,c} ∈ F₂ is as defined above, we
show that in this case (i.e., when b = 1) the test in Step (3) rejects with constant
probability.

We first note that, for any rᵢ, sᵢ ∈ F₂^k and σᵢ ∈ F₂, it holds that σᵢ · E(xᵢ, rᵢ) =
E(xᵢ, r′ᵢ) = E(xᵢ, sᵢ ⊕ r′ᵢ) − E(xᵢ, sᵢ), where r′ᵢ is as defined in Step (3). Thus, for
random r, s ∈ F₂^{pk} and σ = (σ₁, ..., σ_p) ∈ F₂^p, and for r′ as defined in Step (3),

Pr_{r,s}[Π(s ⊕ r′) − Π(s) = Σ_{i=1}^p σᵢ · E(xᵢ, rᵢ)]
  = Pr_{r,s}[Π(s ⊕ r′) − Π(s) = Σ_{i=1}^p E(xᵢ, sᵢ ⊕ r′ᵢ) − Σ_{i=1}^p E(xᵢ, sᵢ)] ≥ 1 − 2ε′,

where the inequality is due to Eq. (43). This means that Step (3) essentially checks
whether Σ_{i=1}^p σᵢ · Xᵢ(rᵢ) equals Σ_{i=1}^p σᵢ · E(xᵢ, rᵢ), or equivalently whether σ_j ·
(X_j(r_j) − E(x_j, r_j)) equals Σ_{i≠j} σᵢ · (E(xᵢ, rᵢ) − Xᵢ(rᵢ)). On the other hand, for
each possible choice of b′ ∈ F₂, it holds that

Pr_{r_j,σ_j}[σ_j · (X_j(r_j) − E(x_j, r_j)) ≠ b′] ≥ Pr_{r_j}[X_j(r_j) − E(x_j, r_j) = 1] · Pr_{σ_j}[σ_j ≠ b′]
  ≥ (1 − 2ε′) · (1/2),

where the second inequality is due to the case's hypothesis. Using a random choice
of r₁, ..., r_{j−1}, r_{j+1}, ..., r_p ∈ F₂^k and σ₁, ..., σ_{j−1}, σ_{j+1}, ..., σ_p ∈ F₂, and set-
ting b′ = Σ_{i≠j} σᵢ · (E(xᵢ, rᵢ) − Xᵢ(rᵢ)), it follows that, in the current case, Step (3)
rejects with probability at least ((1 − 2ε′)/2) − 2ε′ = (1/2) − 3ε′ ≥ 1/8. Specifically:

Pr_{r,s,σ₁,...,σ_p}[Σ_{i=1}^p σᵢ · Xᵢ(rᵢ) ≠ Π(s ⊕ r′) − Π(s)]
  ≥ Pr_{r₁,...,r_p,σ₁,...,σ_p}[σ_j · (X_j(r_j) − E(x_j, r_j)) ≠ Σ_{i≠j} σᵢ · (E(xᵢ, rᵢ) − Xᵢ(rᵢ))]
    − Pr_{r,s,σ₁,...,σ_p}[Π(s ⊕ r′) − Π(s) ≠ Σ_{i=1}^p σᵢ · E(xᵢ, rᵢ)]
  ≥ (1 − 2ε′)/2 − 2ε′ = (1/2) − 3ε′.
Case 4. x ≝ (x₁, ..., x_p) ∉ L. Recall that Step (2) ensures that encodings of
vectors not in L are rejected with probability 1/2. It follows that, in this case,

Pr_{r,w}[Π(r ⊕ wM) − Π(r) = E(x, wM) ≠ 0] ≥ (1 − 2ε′)/2,

because Pr_r[Π(r ⊕ v) − Π(r) = E(x, r ⊕ v) − E(x, r)] ≥ 1 − 2ε′ for every v, and
Pr_w[E(x, wM) ≠ 0] = 1/2 for x ∉ L.

Thus, in each case, V rejects with probability at least min(ε′, 1/8) =
min(ε/2, 1/8) ≥ ε/8. On the other hand, one of these cases must occur, be-
cause otherwise (X₁, ..., X_p; Π) has deviation less than ε (in contradiction to the
hypothesis).
5.4.2. LIPS Based on the Reed–Muller Encoding Function. The main result in
this subsection is an adaptation of the intermediate inner-verifier of Arora et al.
[1998, Sect. 7]. Recall that the latter uses significantly shorter encodings and proofs
(and less randomness) than the simpler Hadamard-based verifier, but verification
is based on (a constant number of) non-Boolean answers.

THEOREM 5.19. There exists a γ > 0 such that for every pair of integers p and
k > 2^p, there exists a (F₂, (p, k) → (p + 4, poly(log pk)), 1/2, γ)-LIPS. Further-
more, the lengths of the encoding and the proof are poly(pk), and the randomness
in use equals O(log pk). Moreover, the verifier makes (1 − k^{−1})-uniformly dis-
tributed queries to each of its oracles, and makes exactly one query to each of the
p input-oracles.
Our construction is a modification of the inner-verifier presented by Arora et al.
[1998]; we refer specifically to the proof of Theorem 2.1.9 presented in Arora
et al. [1998, Sect. 7.5], as interpreted in Harsha and Sudan [2000]. We thus start
by providing an overview of this proof and discuss the main issues that need to be
addressed in adapting it to a proof of Theorem 5.19.
Overview of the Proof of Arora et al. [1998, Thm. 2.1.9]. We use the formalism
of Harsha and Sudan [2000] to interpret the main steps in the proof of Arora et al.
[1998]. As a first step in their proof, Arora et al. [1998] reduce SAT to a GapPCS
problem (see Definition A.1 in Appendix A). Then, using a low-total-degree test,
they give a 3-prover 1-round proof system for the latter problem. Finally, they
observe that the proof system with slight modifications also works for proving
properties of inputs presented as oracles that encode strings that when concatenated
yield the input. Let us review the completeness and soundness condition of the
reduction (used in the first step). Recall that an instance of GapPCS consists of
a sequence of algebraic constraints on the values of a function g : F^m → F.
Each constraint is dependent on the value of g at only poly-logarithmically many
inputs. The goal is to find a low-degree polynomial g that satisfies all (or many)
constraints. Actually, the reduction consists of a pair of algorithms A and B, where
A reduces instances of SAT to instances of GapPCS, and B transforms pairs (φ,τ)
to polynomials g such that if τ satisfies the formula φ then g satisfies all constraints
of A(φ). The properties of the reduction are as follows:
Locally Testable Codes and PCPs of Almost-Linear Length 635
Completeness. If τ is an assignment satisfying φ, then g = B(φ,τ) is a polynomial of total degree d that satisfies all constraints of A(φ).
Soundness. If φ is not satisfiable, then no polynomial of total degree d satisfies more than an ε fraction of the constraints of A(φ).
Since the soundness condition only focuses on degree d polynomials (and does
not refer to arbitrary functions), constructing such a reduction turns out to be easier
than constructing a full-fledged PCP. On the other hand, by combining this reduction
with a low-degree test it is easy to extend the soundness to all functions.
One would hope to use the above reduction directly to get a LIPS by setting φ
to be some formula enforcing the linear conditions L. But as noted earlier, several
problems come up: First, B is not a linear map, but this is fixed easily. The more
serious issue is that the soundness condition permits the existence of low-degree
functions that satisfy all constraints but are not even close to B(φ,τ) for any τ .
Indeed, in standard reductions the only functions in the range of B are polynomi-
als of individual degree d/m in each variable, but this is not something that the
low-degree test checks (nor can this be checked directly by a constant number of
queries). Thus, to apply the low-degree test and protocol of Arora et al. [1998], we
augment the reduction (from SAT to GapPCS) itself such that it satisfies the follow-
ing stronger soundness condition (which corresponds to rejection of noncanonical
proofs (cf. Section 5.3.1)).
Modified Soundness. If g is a polynomial of total degree d that is not in the range of B(φ,·), then g does not satisfy more than an ε fraction of the constraints of A(φ).
We note that in our setting τ is provided by the input-oracles^19 (whereas the linear constraints are given as an explicit input), and so the modified soundness refers to this τ (i.e., we require that if the degree d polynomial g differs from B(φ,τ) then g does not satisfy more than an ε fraction of the constraints of A(φ)). We comment that strong soundness (as defined in Section 5.3.1) will follow by combining this modified soundness with a low-degree test.
To obtain the modified soundness condition, we need to delve further into the
reduction of Arora et al. [1998] (including the corresponding transformation B).
Suppose that their reduction produces a GapPCS instance on m variate polynomials.
Then, the corresponding solution g = B(φ,τ) satisfies the following additional
conditions:
(1) The m-variate polynomial g = B(φ,τ) has the form g(i, x) = g_i(x), for i ∈ [m′], where the g_i's are polynomials (of varying degrees) in m − 1 variables. Furthermore, g is a polynomial of degree m′ − 1 < d in the first variable.
(2) There exists a sequence of integers (m_i)_{i∈[m′]} such that the polynomial g_i only depends on the first m_i ≤ m − 1 variables.
(3) For every i ∈ [m′] there exists a sequence of integers (d_{i,j})_{j∈[m−1]} such that g_i has a degree bound of d_{i,j} ≤ (d − m′ + 1)/(m − 1) in its jth variable.
^19 Indeed, as hinted in previous subsections, the terminology of assignment testers [Dinur and Reingold 2004] (or PCPPs [Ben-Sasson et al. 2004]) is perfectly tailored to express what is going on.
(4) The polynomial g must evaluate to zero on some subset of the points (due to padding of the actual input to adequate length).
(5) Finally, over some subset of the points, g evaluates to either 0 or 1. (Note that this condition is not trivial because we will not be working with F_2 but some extension field K of F_2. In fact, over the extension field, these constraints are not even linear. However, these conditions turn out to be F_2-linear.)
In what follows we will, in effect, be augmenting the reduction from SAT to GapPCS
so as to include all constraints of the above form. This will force the GapPCS
problem to only have satisfying assignments of the form g = B(φ,τ) and thus
salvage the reduction.
Actually, we will be considering satisfying assignments that are presented as
a concatenation of several pieces that are individually encoded (in corresponding
input-oracles), and the constraints of the system we build will be verifying that the
“concatenation” of the various pieces is a satisfying assignment. Furthermore, we
will only be looking at systems of linear equations and not at general satisfiability.
The Actual Construction (i.e., Proof of Theorem 5.19). Recall that we need to describe the three ingredients in the LIPS: the encoding function E : F_2^k → (F_2^{k′})^n, the proving function P : F_2^{pk} → (F_2^{k′})^N, and the verifier (oracle machine) V. As stated above, we do so by adapting known constructions. (In particular, whenever we refer to a step as being "standard", such a step is performed explicitly in Harsha and Sudan [2000].) We start by developing the machinery for the encoding function and the proving function. We do so by transforming the question of satisfaction of a system of linear equations into a sequence of consistency relationships among polynomials, and using this sequence to describe the encoding and proving functions. For the rest of the discussion, we fix a linear space L ∈ L_{F_2,pk} and vectors x_1,...,x_p such that (x_1,...,x_p) ∈ L.
Obtaining a Width-3 Linear System. Our first step corresponds to the reduction of SAT (or NP) to 3SAT, which is taken for granted in the standard setting. Here, we reduce the linear conditions to ones that refer to three variables each (i.e., width-3 linear constraints). As in the standard case, this is done by introducing auxiliary variables.
To convert L into a conjunction of width-3 linear constraints, we introduce a vector, denoted x_{p+1}, of at most n = (pk)^2 auxiliary variables, and transform L into a linear space L′ of width-3 constraints such that (x_1,...,x_p) ∈ L if and only if there exists x_{p+1} such that (x_1,...,x_{p+1}) ∈ L′. (Indeed, each linear condition in t ≤ pk variables is replaced by t − 2 width-3 constraints using t − 3 new auxiliary variables.) Furthermore, for each (x_1,...,x_p) ∈ L there exists a unique x_{p+1} such that (x_1,...,x_{p+1}) ∈ L′.
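The standard width-3 reduction can be sketched concretely. The following is a minimal Python illustration (not the paper's exact construction; all names are ours) of replacing one t-variable GF(2) equation by t − 2 width-3 constraints with t − 3 auxiliary variables whose values are forced.

```python
# A minimal sketch (hypothetical helper names): replace one GF(2) equation
# x_1 + ... + x_t = b by t-2 width-3 constraints using t-3 forced auxiliaries.

def to_width3(var_ids, b):
    """Return (constraints, aux_ids); each constraint (ids, rhs) means sum(ids) = rhs (mod 2)."""
    t = len(var_ids)
    if t <= 3:
        return [(list(var_ids), b)], []
    aux = ["y%d" % j for j in range(t - 3)]
    cons = [([var_ids[0], var_ids[1], aux[0]], 0)]
    for j in range(t - 4):
        cons.append(([aux[j], var_ids[j + 2], aux[j + 1]], 0))
    cons.append(([aux[-1], var_ids[-2], var_ids[-1]], b))
    return cons, aux

def satisfies(cons, asg):
    return all(sum(asg[v] for v in ids) % 2 == rhs for ids, rhs in cons)

cons, aux = to_width3(["x1", "x2", "x3", "x4", "x5"], 1)
asg = {"x1": 1, "x2": 0, "x3": 1, "x4": 1, "x5": 0}
# The auxiliaries are forced to be the prefix sums x_1 + ... + x_{j+2}:
asg["y0"] = (asg["x1"] + asg["x2"]) % 2
asg["y1"] = (asg["y0"] + asg["x3"]) % 2
assert len(cons) == 3 and len(aux) == 2
assert satisfies(cons, asg)  # since x1+x2+x3+x4+x5 = 1 over GF(2)
```

Applying this to every equation of L yields L′; since each auxiliary is a GF(2)-sum of the preceding variables, the map determining x_{p+1} from (x_1,...,x_p) is linear, as claimed above.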
For sake of simplicity, we will assume in the sequel that x_1,...,x_{p+1} are all inputs, although x_{p+1} is actually not an input but rather (only) part of the proof. Thus, it is important to note here that the bits of x_{p+1} are (uniquely determined as) linear combinations of the bits of x_1,...,x_p. Indeed, one may think of the current step as a reduction (while noting that this reduction is a linear transformation). Note that L′ ∈ L_{F_2, pk+n}, because |x_i| = k if i ≤ p whereas |x_{p+1}| = n ≥ k. We will take care of the latter discrepancy in the next step.
Input Representation: Low-Degree Extensions and Dealing with Padding. The
step of taking a low-degree extension is standard, but we need to deal with
the padding (of inputs) that it creates (as well as with the padding required to
eliminate the discrepancy in the input lengths, created in the previous step). That
is, we have to augment the linear system to verify that the padded parts of the input
are indeed all-zero.
For h =log n and m =log n/ log log n(so that h
m
n), we pick a field
K ={η
0
= 0
1
= 1,...,η
|K|−1
} of size poly(h) that extends F
2
(i.e., K =
GF(2
O(log h)
)), and a subset H ={η
0
,...,η
h1
} of K. Next, we let x
i
= x
i
0
h
m
−|x
i
|
(i.e., we pad x
i
with enough zeroes so that its length is exactly h
m
). Now, we let
L

be the F
2
-linear constraints indicating that the padded parts of x
i
are zero, and
(x
1
,...,x
p+1
) correspond to the padding of (x
1
,...,x
p+1
) L
.
Finally, as usual, we view x
i
as a function from H
m
→{0, 1} and let
f
1
,..., f
p+1
:K
m
Kbem-variate polynomials of degree h 1 in each of
the m variables that extend the functions described by x
1
,...,x
p+1
.
(We mention that the encoding function E will essentially map x
i
to the table of
all values of the function f
i
.)
Concatenating the p Pieces (Standard). We let f : K^{m+1} → K be the function given by f(η_i, ·) = f_i(·) for i ∈ {1,..., p + 1} such that f is a polynomial of degree p in its first variable.
Low-Degree Extension of L″ (Standard). Note that L″ imposes linear constraints on the values of f, where each constraint depends on at most three values of f. Thus, each constraint has the generic form α_1 f(z_1) + α_2 f(z_2) + α_3 f(z_3), for some α_1, α_2, α_3 ∈ {0, 1} and z_1, z_2, z_3 ∈ H_{p,m} ≝ {η_1,...,η_{p+1}} × H^m. We view L″ as a function L″ : {0, 1}^3 × H_{p,m}^3 → {0, 1} such that L″(α_1, α_2, α_3, z_1, z_2, z_3) = 1 if the constraint α_1 f(z_1) + α_2 f(z_2) + α_3 f(z_3) = 0 is imposed by L″, and extend it to L̂″ : K^{3(m+1)+3} → K that is linear in the first three variables, has degree p in the other three variables, and has degree h − 1 in all other 3m variables. Thus, using h > p, the polynomial L̂″ has individual degree h − 1.
We comment that the current step does not rely on L″ being a linear subspace (but rather on it being a system of width-3 equations). The linearity of L″ (or rather of the generic conditions α_1 f(z_1) + α_2 f(z_2) + α_3 f(z_3)) will be used in the next step (i.e., in rule (R_0)).
Verifying Satisfiability of L″ via a Sequence of Polynomials. This step corresponds to the "sum check" in Arora et al. [1998] (which is one of the two procedures in the original inner-verifier, the other being a low-degree test). The current presentation follows Harsha and Sudan [2000].
The current step is standard except for rule (R_0) below, which capitalizes on the linearity of the condition being checked. That is, in the standard presentation g_1 is the product of three values of g_0 (corresponding to an OR of three Boolean values), whereas here it is their sum (corresponding to a width-3 linear constraint). In addition, rule (R_0) includes an extra check that some elements being considered are in {0, 1}.
Let m′ = 4m + 8. We define a sequence of polynomials g_0,...,g_{m′+1} : K^{m′} → K, where g_0 is essentially f, and each g_i is related to g_{i−1} (i.e., g_1 is related to g_0 by an F_2-linear relationship, and g_i is related to g_{i−1} by a K-linear relationship). The motivation behind these polynomials is the following: The function g_1 is defined such that the condition (x̄_1,...,x̄_{p+1}) ∈ L″ is equivalent to the condition g_1(u) = 0 for every u ∈ H^{m′}. The polynomials g_i gradually expand the set of points on which the function vanishes from H^{m′} to K^{m′}; specifically, g_{i+1} should vanish on K^i × H^{m′−i}. Indeed, rule (R_i) implies that g_i vanishes on K^{i−1} × H^{m′−i+1} if and only if g_{i+1} vanishes on K^i × H^{m′−i}. Thus, finally we have (x̄_1,...,x̄_{p+1}) ∈ L″ if and only if g_{m′+1} ≡ 0.
For α_i's and u_i's from K and z_i's from K^{m+1}, we require that

    g_0(z_1,...,z_4, α_1,...,α_4) = f(z_1).    (45)
Whereas Eq. (45) seems at this stage as merely a notational convention, it actually imposes a condition that will have to be checked. It is more evident that the following conditions impose relations between the various polynomials. As stated above, these relations deviate from the standard ones only in the next rule (R_0).

(R_0): g_1(z_1,...,z_4, α_1,...,α_4) = L̂″(α_1, α_2, α_3, z_1, z_2, z_3) · Σ_{i=1}^{3} α_i · g_0(z_i, 0^{m′−(m+1)}) + α_4 · (g_0(z_4, 0^{m′−(m+1)})^2 − g_0(z_4, 0^{m′−(m+1)})).
We call the reader's attention to the fact that the main term in (R_0) is linear in the three (typically different) values of g_0, whereas in the standard construction this term is the product of three such values. In contrast, the secondary term in (R_0) involves a power of a single value of g_0 (i.e., it includes g_0(z_4, 0^{m′−(m+1)})^2), which is not K-linear but is F_2-linear. The latter fact is based on the fact that the map β ↦ β^2 is an F_2-linear map over fields of characteristic two. We mention that the secondary term in (R_0) is meant to verify that for every z_4 ∈ H^{m+1} the value of g_0(z_4, 0^{m′−(m+1)}) is in {0, 1} (bearing in mind that we will require g_1 to vanish on H^{m′}). This verification is "optional" in standard PCPs, in the sense that it is not needed for soundness, but is occasionally thrown in because it serves the intuition (and does not involve much extra work). In contrast, in our case this verification is necessary to enforce the strong soundness condition (i.e., to rule out the possibility that the input-oracles are not valid encodings).
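The F_2-linearity of squaring in characteristic two can be checked exhaustively in a tiny field. The sketch below uses GF(8) as an assumed stand-in for K; the irreducible polynomial and function names are ours.

```python
# Hedged sketch: the map b -> b^2 is F_2-linear in characteristic two, checked
# exhaustively in GF(8) = F_2[x]/(x^3 + x + 1), a small stand-in for K.

MOD = 0b1011  # x^3 + x + 1, irreducible over F_2

def gf8_mul(a, b):
    """Carry-less multiplication with reduction modulo x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b1000:
            a ^= MOD
        b >>= 1
    return r

def sq(a):
    return gf8_mul(a, a)

# (a + b)^2 = a^2 + b^2 for all field elements (addition is XOR),
# so squaring commutes with every F_2-linear combination ...
assert all(sq(a ^ b) == sq(a) ^ sq(b) for a in range(8) for b in range(8))
# ... while squaring is NOT K-linear: sq(c*a) = c^2 * sq(a), not c * sq(a).
assert any(sq(gf8_mul(c, a)) != gf8_mul(c, sq(a)) for c in range(8) for a in range(8))
```

This is exactly why the secondary term of (R_0) is F_2-linear even though it is not K-linear.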
The standard relations are, for i = 1,...,m′ (and u_j's in K):

(R_i): g_{i+1}(u_1,...,u_{i−1}, u_i, u_{i+1},...,u_{4m+8}) = Σ_{j=0}^{h−1} u_i^j · g_i(u_1,...,u_{i−1}, η_j, u_{i+1},...,u_{4m+8}).
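A one-variable analogue of rule (R_i) may clarify how vanishing on H propagates to vanishing on K. The sketch below again uses GF(13) as a stand-in field; the example polynomials are ours.

```python
# Hedged sketch of rule (R_i) in one variable over GF(13) (a stand-in field):
# define g_next(u) = sum_{j < h} u^j * g(eta_j). The h values g(eta_j) are the
# coefficients of the degree-(h-1) polynomial g_next, so g_next vanishes on all
# of K iff g vanishes on H = {eta_0,...,eta_{h-1}}.

P = 13
H = [0, 1, 2, 3]  # eta_0,...,eta_{h-1}, with h = 4

def g_next(g, u):
    return sum(pow(u, j, P) * g(eta) for j, eta in enumerate(H)) % P

g_vanishing = lambda v: v * (v - 1) * (v - 2) * (v - 3) % P  # zero on all of H
g_not       = lambda v: v * (v - 1) % P                       # zero only on {0,1}

assert all(g_next(g_vanishing, u) == 0 for u in range(P))
assert any(g_next(g_not, u) != 0 for u in range(P))
```

In the actual rule, u_i plays the role of u and the remaining variables are carried along; applying (R_1),...,(R_{m′}) in order replaces H by K one coordinate at a time.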
Merging the Different Polynomials into a Single Polynomial g (Standard). Let g : K^{m′+1} → K be the function given by g(η_i, z) = g_i(z) for i ∈ {0,...,m′ + 1} such that g is a polynomial of degree m′ + 1 in the first variable (i.e., in i). Using h > m′ > p, we have that g is a polynomial of individual degree at most 2h, because g_0 has individual degree h, the polynomial g_1 has individual degree 2h, and each g_{i+1} has individual degree h in its first i variables and individual degree 2h in the other variables. Thus, g has total degree at most d = 2m′h.
Lines and Curves over g (Standard). Let g|_lines : K^{2(m′+1)} → K^{d+1} be the function describing the total degree d polynomial g : K^{m′+1} → K restricted to lines; that is, for a line ℓ ∈ K^{2(m′+1)}, the value of g|_lines(ℓ) is a univariate degree d polynomial representing the values of g on ℓ. Let w = 2(m′+1)h and k″ = wd + 1, and let g|_curves : 𝒞 → K^{k″} be the restriction of g to some subset 𝒞 of degree w curves, where 𝒞 is the set of all the curves that arise in the computation of the verifier described below. That is, a curve C ∈ 𝒞 is a function C = (C_1,...,C_{m′+1}) : K → K^{m′+1}, where each C_i is a univariate polynomial of degree w, and g|_curves(C) is the univariate degree wd polynomial that represents the value of g on the curve C (i.e., on the set of points {C(e) : e ∈ K}). (Indeed, a line is a curve of degree 1.)
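The fact underlying the g|_lines table, namely that a total-degree-d polynomial restricted to a line is a univariate polynomial of degree at most d, can be illustrated as follows; GF(13) and the specific polynomial are stand-ins of our choosing.

```python
# Hedged sketch: restricting a low-degree polynomial to a line over GF(13).
# If g has total degree d, then g(x + e*y) has degree <= d as a polynomial in e,
# so d+1 samples on the line determine g everywhere on it.

P = 13

def g(z):                       # a bivariate polynomial of total degree d = 3
    z1, z2 = z
    return (z1 ** 2 * z2 + 5 * z1 + 7) % P

def interpolate_on_line(x, y, d):
    """Recover the degree-<=d polynomial e -> g(x + e*y) from d+1 samples."""
    pts = [(e, g(((x[0] + e * y[0]) % P, (x[1] + e * y[1]) % P))) for e in range(d + 1)]
    def value(e):
        total = 0
        for j, (ej, vj) in enumerate(pts):
            c = vj
            for t, (et, _) in enumerate(pts):
                if t != j:
                    c = c * (e - et) * pow(ej - et, P - 2, P) % P
            total = (total + c) % P
        return total
    return value

x, y = (3, 7), (2, 11)          # a line l(e) = x + e*y
restr = interpolate_on_line(x, y, 3)
# The degree-3 interpolant agrees with g on the WHOLE line, not just 4 points:
assert all(restr(e) == g(((x[0] + e * y[0]) % P, (x[1] + e * y[1]) % P)) for e in range(P))
```

The table g|_curves rests on the same idea with ℓ replaced by a degree-w curve, giving univariate degree at most wd.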
The Encoding and Proving Functions. Finally, we get to define the encoding and proving functions. This step is standard, but we highlight a few nonstandard aspects of it.
The encoding function E(x_i) is the table of values of the function f̄_i : K^m → K^{k″}, where f̄_i(x) = (f_i(x), 0^{k″−1}); that is, elements of K are being written as vectors from K^{k″}. Recall that f_i is a low-degree extension of (the padded version of) x_i. Thus, each of the values of f̄_i (i.e., the value of f_i at each point) is an F_2-linear combination of the values of (the bits in) x_i. This is due to the fact that polynomial extrapolation is a linear operation (on the function's values).
The proving function P(L″, x̄_1,...,x̄_{p+1}) = P_0(L, x_1,...,x_p) consists of the triple of functions (ḡ, ḡ|_lines, g|_curves), where ḡ : K^{m′+1} → K^{k″} and ḡ|_lines : K^{2(m′+1)} → K^{k″} are the functions g and g|_lines with their range being mapped, by padding, into K^{k″}; that is, ḡ(x) = (g(x), 0^{k″−1}) and ḡ|_lines(ℓ) = (g|_lines(ℓ), 0^{k″−(d+1)}). Note that P(L″, x̄_1,...,x̄_{p+1}) refers to the proving function of the reduced instance (obtained in the reduction to width-3 constraints), whereas P_0(L, x_1,...,x_p) refers to the proving function of the original instance. Recall that x̄_{p+1} (as well as the other x̄_i's) are F_2-linear combinations of the original x_i's. We highlight the fact that the values of g are linear in the values of the g_i's, which in turn are linear in g_1, which in turn is F_2-linear in g_0 (and hence in the f_i's). Also, the values of g|_lines and g|_curves are linear in the values of g.
We note that the encoding is a sequence of length |K|^m = poly(h)^m = poly(n) = poly(pk) over the alphabet K^{k″}, and its relative distance is at least 1 − (h^2/|K|) > 1/2. The proof length (i.e., |K|^{m′+1}) is polynomial in the encoding length (because m′ = O(m)).
To motivate the description of the verifier V, we note that the verifier, which essentially has access to the input-oracles f̄_1,..., f̄_{p+1} and to the proof-oracle (ḡ, ḡ|_lines, g|_curves), needs to verify the following conditions:
(1) The function g is a polynomial of degree at most d, the function g|_lines is the restriction of g to lines, and g|_curves is the restriction of g to curves.
(2) The degree of g in its first variable is at most m′ + 1.
(3) For i ∈ {1,...,m′ + 1}, the function g_i : K^{m′} → K given by g_i(z) = g(η_i, z) is computed correctly from g_{i−1} by an application of the rule (R_{i−1}).
(4) The function g_{m′+1} is identically zero.
(5) The function g_0 is a polynomial of degree 0 in all but its first m + 1 variables.
(6) The function f : K^{m+1} → K given by f(x) = g_0(x, 0^{m′−(m+1)}) is a polynomial of degree at most p in its first variable and degree at most h − 1 in each of the remaining m variables.
(7) The function f satisfies f(η_i, x) = f_i(x) for every i ∈ {1,..., p + 1} and x ∈ K^m.
.
Working one's way upwards, one can see that P_0(L, x_1,...,x_p) is the only function that satisfies all the above conditions. In particular, Conditions (5)–(7) force g_0 to uniquely represent the f_i's, Conditions (1)–(4) guarantee that the f_i's are the encoding of inputs that satisfy L, and Conditions (1)–(2) also force the uniqueness of the three parts of the proof-oracle. (We comment that g|_lines and g|_curves are included in the proof-oracle (merely) in order to allow the verification of the aforementioned conditions using very few queries.)
Indeed, it is time to describe the verifier's actions. The aim is to emulate a large number of checks (i.e., random verification of all the above conditions) by using only p + 4 oracle calls, and still incur only a constant error probability. Specifically, ignoring Condition (1) for a moment, a random test of Condition (2) requires m′ + 2 points in the domain of g, Condition (3) involves m′ + 1 equalities (which refer to m′ + 1 different parts of g, and each (but one) of these equalities refers to h values), Condition (5) involves m′ − m equalities (one per each suitable variable in g_0), and Condition (7) involves p equalities, each referring to a different function f_i. Following Arora et al. [1998], all these different conditions will be checked by retrieving the corresponding (random) g-values from a suitable curve in g|_curves, and obtaining the f_i-values from the corresponding oracles. Finally, Condition (1) will be tested by comparing the value of g at a random point to the values of g|_lines and g|_curves on random lines and curves that pass through this point. The comparison to g and g|_lines (which is the well-known low-degree test) will also establish the claim that g has low degree. Details follow.
The verifier first picks one random test (to be emulated) per each of the equalities corresponding to Conditions (2)–(7) above. Specifically, in order to emulate the testing of Conditions (2), (5) and (6), it picks random axis-parallel lines (one per each of the relevant variables) and picks O(h) arbitrary points on these K^{m′+1}-lines, with the intention of inspecting the value of ḡ at these points. (We stress that the verifier does not query ḡ at these points, but rather only determines these points at this stage.) Similarly, in order to emulate the testing of Conditions (3), (4) and (7), it picks random points from the domain of the corresponding g_i's and f. Having chosen these points, it picks one totally random point in K^{m′+1}. All in all, this amounts to determining w = O(mh) points in the domain of ḡ. The verifier then determines a degree w curve, denoted C : K → K^{m′+1}, that passes through these w points. Finally (in order to check Condition (1)), it picks a random point α on this curve and a random line ℓ through the point α.
Overall, the above random choices can be implemented by picking a constant number of random points in K^{m′+1} and recycling randomness among the various tests (see details in Arora et al. [1998] and Harsha and Sudan [2000]). Thus, the randomness complexity of the verifier is O(m′ log |K|) = O(m log h) = O(log n) = O(log pk). At this point, we may also bound the size of the set of curves used by the verifier (i.e., 𝒞) by poly(pk). This bounds the size of g|_curves and thus the length of the entire proof (by poly(pk)).
We finally get to the actual queries of the verifier. It queries the proof-oracle for the values of ḡ(α), ḡ|_lines(ℓ) and g|_curves(C). It verifies that ḡ(α) is actually in K and that ḡ|_lines(ℓ) is in K^{d+1} (as opposed to K^{k″}). It then verifies that the three responses agree at α, thus checking Condition (1). Finally, it verifies that the values of ḡ on the test points for tests (2)–(7), as provided (or "claimed") by g|_curves(C), are consistent with Conditions (2)–(7). In particular, verifying Condition (7) requires a single probe into each of the input-oracles. (Once again, the responses to these probes are elements of K^{k″}, and the verifier checks that the responses are in K padded with 0's.)
This concludes the description of the verifier. We stress that this description is identical to the one in Arora et al. [1998] (as interpreted in Harsha and Sudan [2000]), except for two aspects. First, the curve suboracle provides the value of g on some additional points in order to support the additional checks in Conditions (2), (5) and (6). Indeed, these conditions were added here in order to enforce the modified soundness condition (which implies strong soundness). Secondly, Conditions (1)–(7) refer to the functions f_1,..., f_{p+1}, g and g|_lines, whereas the verifier actually has access to padded versions of these functions (i.e., f̄_1,..., f̄_{p+1}, ḡ and ḡ|_lines) and verifies the correctness of the padding. Indeed, the "0-padding verifications" are only intended to guarantee the modified notion of soundness (and are not needed for the standard notion of soundness). Omitting all these extra tests would get us back to the interpretation of Arora et al. [1998] as provided in Harsha and Sudan [2000].
In total, the verifier makes only (p + 1) + 3 queries. Furthermore, the single query made to each of the p + 1 input-oracles is uniformly distributed, and the three queries made to the proof-oracle are each uniformly distributed in the corresponding part of the proof-oracle. (We will address the issue of making almost-uniform queries to the proof-oracle, as a single entity, at the end of the proof.) The answers received by V are from K^{k″}, and thus the answer length equals k″ log_2 |K|, which is poly(log(pk)) as required (using k″ log_2 |K| = O(wd · log h) and d < w = O(mh) < (log n)^2 = O((log pk)^2)). Finally, note that all checks by the verifier are actually K-linear, except for the satisfaction of rule (R_0), which is only F_2-linear.
The (strong) soundness of the above verifier is established, as usual, assuming |K| ≥ poly(h). In particular, if the function g : K^{m′+1} → K (obtained by ignoring the last k″ − 1 coordinates of the function ḡ) is not 0.01-close to some polynomial ĝ of total degree d, then the (point-versus-line) low-degree test will reject with constant probability. Thus, we may assume that ḡ is 0.01-close to such a ĝ. Standard soundness follows by the standard argument, but actually the same argument also establishes strong soundness. Intuitively, the low-degree test also guarantees that g is rejected with probability proportional to its distance from ĝ. Furthermore, a disagreement of either g|_lines or g|_curves with ĝ is detected with proportional probability by the test that checks Condition (1). Similarly, disagreement between f_i and ĝ is detected with proportional probability by the test that checks Condition (7). Finally, if any of the Conditions (2)–(6) is violated (when applied to ĝ), then the verifier rejects with constant probability (also when accessing g rather than ĝ). Following is a more detailed analysis.
We consider an arbitrary (X_1,...,X_p, X_{p+1}; Π), where X_i : K^m → K^{k″} and Π = (ḡ, ḡ|_lines, g|_curves) such that ḡ : K^{m′+1} → K^{k″}, ḡ|_lines : K^{2(m′+1)} → K^{k″} and g|_curves : 𝒞 → K^{k″}. We denote by δ the deviation of (X_1,...,X_p, X_{p+1}; Π) with respect to (E, P, V). Our aim is to show that V rejects (X_1,...,X_p, X_{p+1}; Π) with probability Ω(δ). For δ′ = δ/5 ≤ 1/5, we consider the following possible sources of the value of the deviation.

Case 1. Either Pr_z[ḡ(z) ∉ K × 0^{k″−1}] ≥ δ′ or Pr_ℓ[ḡ|_lines(ℓ) ∉ K^{d+1} × 0^{k″−(d+1)}] ≥ δ′. In this case, by virtue of the 0-padding verification, V rejects with probability at least δ′.
Thus, we assume in the rest of the analysis that, for some functions g : K^{m′+1} → K and g|_lines : K^{2(m′+1)} → K^{d+1}, it holds that Pr_z[ḡ(z) = (g(z), 0^{k″−1})] > 1 − δ′ and Pr_ℓ[ḡ|_lines(ℓ) = (g|_lines(ℓ), 0^{k″−(d+1)})] > 1 − δ′.

Case 2. The function g defined above is δ′-far from being a degree d polynomial. In this case, by virtue of the point-versus-line test included in Condition (1), the verifier rejects with probability Ω(δ′) [Arora et al. 1998, Lem. 7.2.1.4]. (Here we use |K| = poly(d). The constant in the Ω is unspecified in Arora et al. [1998], but explicit bounds are known now. For example, Arora and Sudan [2003, Thm. 16] lower-bound the rejection probability by 2δ′/3.)

Thus, we assume in the rest of the analysis that the function g is δ′-close to a degree d polynomial, denoted ĝ.
Case 3. Pr_ℓ[∃e ∈ K s.t. g|_lines(ℓ)(e) ≠ ĝ(ℓ(e))] ≥ 4δ′, where g|_lines(ℓ)(e) denotes the value of the univariate polynomial g|_lines(ℓ) at e. Note that if g|_lines(ℓ)(e) ≠ ĝ(ℓ(e)) for some e ∈ K, then the two different (degree d) univariate polynomials g|_lines(ℓ) and ĝ(ℓ(·)) must disagree on at least |K| − d > |K|/2 of the points on the line ℓ. Thus, in this case, Pr_{ℓ,e}[g|_lines(ℓ)(e) ≠ ĝ(ℓ(e))] ≥ 4δ′/2. Noting that ℓ(e) is uniformly distributed in K^{m′+1}, it follows that Pr_{ℓ,e}[g|_lines(ℓ)(e) ≠ g(ℓ(e))] ≥ 2δ′ − δ′ = δ′, which means that (again by virtue of the point-versus-line test) V will reject with probability at least δ′.
Case 4. Pr_{C∈𝒞}[∃e ∈ K s.t. g|_curves(C)(e) ≠ ĝ(C(e))] ≥ 4δ′, where g|_curves(C)(e) denotes the value of the univariate polynomial g|_curves(C) at e. Again, using the degree bound (i.e., wd = O(d^2)) of these two univariate polynomials (and |K| > 2wd), it follows that Pr_{C∈𝒞,e}[g|_curves(C)(e) ≠ ĝ(C(e))] ≥ 4δ′/2, and Pr_{C∈𝒞,e}[g|_curves(C)(e) ≠ g(C(e))] ≥ 2δ′ − δ′ = δ′, because C(e) is uniformly distributed in K^{m′+1}. Thus, in this case (by virtue of the point-versus-curve test), V will reject with probability at least δ′.

Thus, in the rest of the analysis, we assume that

    Pr_{C∈𝒞}[∀e ∈ K : g|_curves(C)(e) = ĝ(C(e))] ≥ 1 − 4δ′.    (46)
In the rest of the analysis, we will heavily rely on the fact that when the verifier needs the values of ĝ at certain locations (for a random test of some of Conditions (2)–(7)), it obtains these values by a single random query to g|_curves. Furthermore, Eq. (46) guarantees that the answers obtained from g|_curves typically match all relevant values of ĝ.
Case 5. For some i it holds that Pr_{z′∈K^m}[ĝ(η_0, η_i, z′, 0^{m′−(m+1)}) ≠ X_i(z′)] ≥ 5δ′. In this case, the testing of Conditions (6)–(7) will cause rejection with probability at least 5δ′ − 4δ′, where the latter term is due to Eq. (46) and the fact that in testing these conditions we obtain the values of ĝ by a single (random) probe to g|_curves.

Thus, in the rest of the analysis, we assume that each X_i is 5δ′-close to ĝ(η_0, η_i, ·, 0^{m′−(m+1)}). In particular, it follows that X_i is 5δ′-close to some m-variate polynomial of total degree d, denoted f_i.
Case 6. Some f_i has individual degree greater than h − 1 in one of its variables. In this case, the verifier rejects with constant probability by virtue of checking Condition (6). (Indeed, here we rely on the negation of Cases 4 and 5.)

Thus, in the rest of the analysis, we assume that each f_i is an m-variate polynomial of individual degree h − 1, which encodes an h^m-long input, denoted x_i.
Case 7. Either (x_1,...,x_{p+1}) ∉ L″ or some x_i is not in {0, 1}^{h^m}. In this case, the verifier rejects with constant probability by virtue of checking Conditions (3)–(4).
Case 8. The polynomial ĝ does not equal P(L″, x_1,...,x_{p+1}). Since both polynomials satisfy the same relations, this case may be due only to the individual degrees of ĝ, which are checked in Conditions (2), (5) and (6). Thus, in this case, the verifier rejects with constant probability.

Thus, in each case, the verifier rejects with probability at least min(Ω(δ′), Ω(1)) = Ω(δ′) = Ω(δ). On the other hand, one of these cases must occur, because otherwise (X_1,...,X_{p+1}; Π) has deviation less than 5δ′ = δ (in contradiction to the hypothesis).
This establishes the theorem, except for the extra condition that requires that the verifier makes almost-uniform queries to each of its oracles. Recall that the single query made to each of the input-oracles is uniformly distributed, and that each of the three queries made to the proof-oracle is uniformly distributed in the corresponding part of the proof-oracle. The problem is that these three parts do not have the same length. The solution is to modify the construction such that each part of the proof-oracle has approximately the same size. This is done by replication, and as usual a replication test will be used (i.e., with probability 1/2 we test the consistency of random copies, and otherwise we invoke the verifier V described above while providing it with access to random copies of the corresponding parts). Since we may afford a factor k blow-up in the proof length (and randomness complexity that is logarithmic in the proof length), we can easily make the lengths equal up to a 1 ± k^{−1} factor. Thus, the modified verifier makes (1 − k^{−1})-uniform queries to each of its oracles, and the theorem follows.
5.5. COMBINING ALL THE CONSTRUCTIONS. We are now ready to prove the
main theorem of this section.
THEOREM 5.20 (THEOREM 2.3, RESTATED). For infinitely many k, there exists a locally testable binary code of constant relative distance mapping k bits to n ≝ exp(Õ(√(log k))) · k bits. Furthermore, the code is linear.
PROOF. The theorem is proved by composing the locally testable code of (Part 1 of) Theorem 2.4 with the two LIPSes constructed in Section 5.4 (i.e., in Proposition 5.18 and Theorem 5.19). Actually, we apply three composition operations, using the LIPS of Theorem 5.19 twice. The sequence of compositions can be ordered arbitrarily. For example, we may first compose the locally testable code (LTC) with the LIPS of Theorem 5.19, obtaining a new LTC, which is composed again with the latter LIPS, and finally compose the resulting LTC with the LIPS of Proposition 5.18. This "top-down" order requires using the augmented composition theorems (which guarantee preservation of the almost-uniformity of the tester's queries). Wishing to use only the "vanilla" composition theorems (which do not preserve the said feature), we use instead a "bottom-up" order of compositions. This will only require that, in each of the compositions, the outer construct (which is one of the abovementioned basic constructs) makes almost-uniform queries. We start by recalling the constructs being used (going from the bottom upwards):
(1) The (F_2, (p_H, k_H) → (p_H + 5, 1), 1/2, 1/8)-LIPS of Proposition 5.18, for any choice of p_H and k_H. This (Hadamard-based) LIPS uses encoding length 2^{k_H}, proof length 2^{p_H k_H}, and randomness 3p_H k_H + p_H < 4p_H k_H.
(2) The (F_2, (p_RM, k_RM) → (p_RM + 4, poly(log p_RM k_RM)), 1/2, Ω(1))-LIPS of Theorem 5.19,
for any choice of p_RM and k_RM. This (Reed–Muller-based) LIPS
uses encoding and proof length poly(p_RM k_RM), and randomness O(log p_RM k_RM).
Moreover, the verifier makes (1 − k_RM^{−1})-uniformly distributed queries to each
of its oracles.
(3) The locally testable code C : Σ^k → Σ^n used in Section 3.2 to establish
Part (1) of Theorem 2.4, where n = exp(Õ(√log k)) · k and Σ = F_2^b for
b = exp(Õ(√log k)). Recall that this locally testable code (LTC) is F_2-linear
and has constant relative distance, and that the underlying parameters in its
construction are d = m^m such that n = m^{m^2 + o(m)} and k = m^{m^2 − 2m − o(m)}
(see Eq. (3) and the parameter setting before it). Furthermore, referring to
Remark 3.4, the tester makes two 0.8-uniform queries, and uses randomness
complexity r such that 2^r ≈ |F|^m · (|F| · |R| / |F|^m)^2, where |F| = O(d)
and |R| = n. Thus, 2^r < (n/d^{m−2}) · n < m^{3m} · n, which in turn equals
exp(Õ(√log k)) · k, since n < m^{3m} · k and m < √log k.
We start by using Theorem 5.16 to compose the LIPS of Item (2) (as the outer
LIPS) with the LIPS of Item (1) (as the inner LIPS), which means setting k_H =
poly(log p_RM k_RM) and p_H = p_RM + 4. Setting p_RM = p′ and k_RM = k′, the result is a
(F_2, (p′, k′) → (p′ + 9, 1), Ω(1), Ω(1/p′))-LIPS, denoted S′, that uses O(log p′k′) +
O(p′ · poly(log p′k′)) = poly(p′ · log k′) random coins, and encoding (and proof)
length poly(p′k′) · exp(poly(log p′k′)) = exp(poly(log p′k′)).
Next, we compose the LIPS of Item (2) (as the outer LIPS) with the
LIPS S′ (as the inner LIPS), which means setting k′ = poly(log p_RM k_RM)
and p′ = p_RM + 4. Setting p_RM = p″ and k_RM = k″, the result is a
(F_2, (p″, k″) → (p″ + 13, 1), Ω(1), Ω((1/p″)^2))-LIPS, denoted S″, that uses
O(log p″k″) + poly(p″ · log log k″) = O(p″ · log k″) random coins, and encoding
(and proof) length poly(p″k″).
Finally, using Theorem 5.13, we compose the LTC of Item (3) with the LIPS
S″ (as the inner LIPS), which means setting k″ = b = exp(Õ(√log k)) and
p″ = 2 + 13. The result is a binary linear LTC of constant relative distance
having length (exp(Õ(√log k)) · k) · poly(b), which equals exp(Õ(√log k)) · k.
The theorem follows.
5.6. ADDITIONAL REMARKS. In this section, we show that certain locally
testable linear codes over small alphabets can be modified such that the code-
word tester makes only three queries, while essentially preserving the distance and
rate of the code. Specifically, we refer to testers that make almost-uniform queries,
and start by providing a version of Theorem 5.20 that satisfies this condition.
PROPOSITION 5.21 (THEOREM 5.20, REVISITED). For infinitely many k, there
exists a linear locally testable binary code of constant relative distance that maps k
bits to n := exp(Õ(√log k)) · k bits. Furthermore, for any α ∈ (0, 1), the codeword
tester makes α-uniform queries, and uses log_2 k + Õ(√log k) random coins.
PROOF. The proposition is proved by following the proof of Theorem 5.20,
while using composition theorems (i.e., Theorems 5.15 and 5.17) that preserve
the almost-uniformity of the queries made by the verifier (or tester). We note that
Theorem 5.17 requires that the inner LIPS make the same number of queries to each
of its input-oracles, and we observe that this property holds for each of the two basic
LIPSes used in the proof of Theorem 5.20. Furthermore, the Hadamard-based LIPS
makes uniformly distributed queries to each of its oracles (cf. Proposition 5.18). We
also note that the extra overhead created by Theorems 5.15 and 5.17 (as compared
to Theorems 5.13 and 5.16) is insignificant in our case. Details follow.
We first note that the almost-uniformity of the resulting LTC is essentially the
product of the almost-uniformity parameters of the basic constructs, which are
dominated by the 0.8-uniformity of the LTC of Remark 3.4. However, as stated in
Remark 3.4, this bound is arbitrary and we may obtain (1 − ε)-uniformity for any
constant ε > 0.
We could have proved the current proposition using any order of composition,
but it seems best to verify it using the same order used in the proof of Theorem 5.20.
We merely verify that the extra overhead of the composition theorems used here is
indeed insignificant. This is obvious for the randomness complexity of the LIPSes
obtained by the first two compositions, in which Theorem 5.17 is to be used (instead
of Theorem 5.16). The reason is that the randomness in Theorem 5.17 is at most
twice that in Theorem 5.16, whereas in the proof of Theorem 5.20 we anyhow stated
the randomness complexity of the resulting LIPSes up to a multiplicative constant.
Recalling that we compose constructs that make a constant number of queries,
this suffices for establishing the current proposition, except for the randomness
complexity of the resulting tester.
To analyze the randomness complexity of the resulting tester, we take a closer
look at the third composition (i.e., the composition of the LTC with the resulting
LIPS, which uses Theorem 5.15). Note that the LTC being composed has randomness
complexity r = log_2 k + O(log(n/k)), and so the composition may incur
an extra term of at most r − log_2 k. Furthermore, the randomness complexity of
the LIPS verifier is O(log(n/k)), and so the resulting tester also has randomness
complexity log_2 k + O(log(n/k)) = log_2 k + Õ(√log k).
Reducing the Randomness Complexity of Testers. As in the case of PCP (cf. Bellare
et al. [1998, Prop. 11.2]), the randomness complexity of codeword testers can
be reduced to be logarithmic in the length of the codeword. This complexity reduction
is not important in case we start with Proposition 5.21, but we state it for the sake
of generality.
PROPOSITION 5.22 (REDUCING THE RANDOMNESS COMPLEXITY OF CODEWORD
TESTERS). Let C : Σ^k → Σ^n be a code.
(1) Every (weak) codeword tester for C can be modified into one that has randomness
complexity log_2 n + O(log(1/ε)) + log log |Σ|, and maintains the same
rejection probabilities up to an additive term of ε, while preserving the number
of queries.
(2) If Σ = F and C is F-linear, then every (strong) codeword tester for C can
be modified into one that has randomness complexity log_2 n + log log n +
log log |Σ| + O(1), while preserving the number of queries.
The rejection probability may decrease by a constant factor. Furthermore,
if the original tester made α-uniform queries, then the resulting one makes
(α − o(1))-uniform queries.
Note that Part (1) may be used to obtain weak codeword testers of essentially
optimal randomness complexity, whereas Part (2) is used to obtain strong codeword
testers (but requires C to be linear).
PROOF. The proof of Part (1) is straightforward (and is analogous to the easy
case in Step (2) of the proof of Claim 3.3.2). Specifically, using the probabilistic
method, there exists a set of O(ε^{−2} log_2 |Σ^n|) possible random-tapes for the original
tester such that if the tester restricts its choices to this set then its rejection probability
on every potential sequence is preserved up to an additive term of ε. The reason is
that, with probability 1 − exp(−Ω(ε^2 · t)), a random set of t random-tapes approximates
the rejection probability for any fixed sequence up to ε, while the number of possible
sequences is |Σ^n|.
The proof of Part (2) is analogous to the general case in Step (2) of the proof of
Claim 3.3.2. As in the proof of Claim 3.3.2, it suffices to consider the noncodewords
that have C(0^k) as the codeword closest to them. We first observe that, for every
fixed w ∈ Σ^n that is δ-far from C(0^k), with probability 1 − exp(−Ω(δ · t)), a random
set of t random-tapes approximates the rejection probability of w up to a constant
factor. Next, we upper-bound the number of noncodewords that are at distance δn
from C(0^k) by (|Σ| − 1)^{δn} · (n choose δn) < (|Σ|n)^{δn}. Thus, the probability that a random
set of t random-tapes approximates the rejection probability of all noncodewords
(up to a constant factor) is at least 1 − exp(−Ω(δ · t) + δn log(|Σ|n)). Thus, setting
t = O(n log(|Σ|n)) and using δ ≥ 1/n, the main claim of Part (2) follows.
Regarding the almost-uniformity of queries, note that with probability at least
1 − n · exp(−Ω(ε^2 · t/n)) (over the choices of the set of t random-tapes) the resulting
tester makes ((1 − ε) · α)-uniform queries. The proposition follows.
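The sampling argument in Part (1) can be simulated concretely: draw t random tapes and compare the restricted tester's rejection probabilities against those of the full tape set. The sketch below uses a toy binary repetition code and a two-query equality tester (both purely illustrative choices, not the constructs of the proposition):

```python
import random

def rejection_prob(reject, word, tapes):
    """Fraction of the given random-tapes on which the tester rejects."""
    return sum(reject(word, tape) for tape in tapes) / len(tapes)

# Toy setting: the binary repetition code {0^8, 1^8}; a "random tape"
# is an ordered pair of distinct positions, and the tester rejects
# iff the two queried positions disagree.
N = 8
ALL_TAPES = [(i, j) for i in range(N) for j in range(N) if i != j]

def reject(word, tape):
    i, j = tape
    return word[i] != word[j]

def sample_tapes(t, seed=0):
    """The randomness reduction: restrict the tester to t sampled tapes,
    so it needs only log2(t) coins instead of log2(len(ALL_TAPES))."""
    rng = random.Random(seed)
    return [rng.choice(ALL_TAPES) for _ in range(t)]
```

By a Chernoff bound plus a union bound over all |Σ|^n words (here 2^8), a random sample of size t = O(ε^{−2} log |Σ^n|) preserves every rejection probability up to ±ε. For instance, a codeword is rejected on no tape at all, so its rejection probability is exactly 0 under any sampled tape set.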
Reducing the Query Complexity of Testers. The relevance of low randomness
complexity to the project of reducing the query complexity becomes clear in the
next proposition. (Note that low randomness complexity of the tester was also used
in establishing Theorem 5.20.)
PROPOSITION 5.23. Let Σ = F and suppose that C : Σ^k → Σ^n is a locally
testable F-linear code of constant relative distance. Furthermore, suppose that, for
some α ∈ (0, 1), the codeword tester makes α-uniform queries and has randomness
complexity r = r(k, n). Then, for n′ = n + O(2^r), there exists an F-linear code
C′ : Σ^k → Σ^{n′} of constant relative distance that is testable with three queries.
Proposition 5.23 can be extended to the case Σ = F^ℓ, for any constant ℓ,
obtaining n′ = n + O(q^2) · 2^r, where q is the query complexity of the original tester.
An analogous result can be stated for nonlinear codes (and proven by using the
Long Code of Bellare et al. [1998]), but in this case the length blows up double-exponentially
with q log |Σ|.
PROOF. The current proposition follows by composing the C-tester, which
makes q = O(1) queries, with the (F, (q, 1) → (3, 1), 1, q^{−2})-LIPS presented
next, where the composition uses Theorem 5.15.
We note that the LIPS that we are going to construct is fundamentally different
from the ones considered so far. It does not reduce the alphabet (but rather keeps
it invariant), and it reduces the number of queries (from any q to 3) rather than
increasing it. We pay, however, in the parameter representing the soundness feature
(i.e., the proportion between the deviation and the rejection probability). Following
is a description of this LIPS:
The encoding function E : F → F is the identity function.
Letting L ∈ L_{F,q} be represented by a q-by-q matrix over F, the proving function
P : L_{F,q} × F^q → F^{q(q−1)} is as follows: For every i_1 ∈ [q] and i_2 ∈ [q−1], the
((i_1 − 1)(q − 1) + i_2)th element of P(L, x_1, ..., x_q) equals Σ_{j=1}^{i_2} c_{i_1,j} x_j, where
c_{i,j} is the (i, j)th entry in the matrix representing L.
On input L ∈ L_{F,q} and access to input-oracles X_1, ..., X_q ∈ F (each containing
a single symbol) and proof-oracle Y : [q] × [q−1] → F, the verifier V selects
uniformly i_1, i_2 ∈ [q] and proceeds according to the value of i_2.
(1) For i_2 = 1, the verifier checks whether c_{i_1,1} · X_1 equals Y(i_1, 1).
(2) For i_2 ∈ {2, ..., q−1}, the verifier checks whether Y(i_1, i_2 − 1) + c_{i_1,i_2} · X_{i_2}
equals Y(i_1, i_2).
(3) For i_2 = q, the verifier checks whether Y(i_1, q−1) + c_{i_1,q} · X_q = 0.
The verifier accepts if and only if the relevant check passes.
Note that if X := (X_1, ..., X_q) ∉ L then (X, Y) has deviation 1, for every Y. On
the other hand, in such a case, there exists an i_1 ∈ [q] such that Σ_{j=1}^{q} c_{i_1,j} X_j ≠ 0.
For this i_1, there exists an i_2 ∈ [q] such that the above V rejects (because
otherwise 0 = Y(i_1, q−1) + c_{i_1,q} · X_q = ··· = Σ_{j=1}^{q} c_{i_1,j} X_j). Similarly, if
(X_1, ..., X_q) ∈ L and Y ≠ P(L, X_1, ..., X_q), then for some i_1, i_2 ∈ [q] the
verifier rejects. The proposition follows.
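As a sanity check, the prefix-sum construction above can be run directly. The sketch below models F as the integers modulo a small prime (an illustrative simplification) and exposes the verifier's random choices (i_1, i_2) as optional arguments so that all q^2 checks can be enumerated:

```python
import random

P = 7  # a toy prime field GF(7); the construction works over any field

def prove(L, x):
    """Honest proof oracle: Y[(i1, i2)] holds the prefix sum of row i1
    of L against x, i.e. sum of L[i1][j]*x[j] for j <= i2 (0-indexed,
    with i2 ranging over the first q-1 positions)."""
    q = len(L)
    Y = {}
    for i1 in range(q):
        s = 0
        for i2 in range(q - 1):
            s = (s + L[i1][i2] * x[i2]) % P
            Y[(i1, i2)] = s
    return Y

def verify(L, X, Y, i1=None, i2=None):
    """One run of the 3-query verifier: check one random link of one
    random prefix-sum chain (two queries to Y, one query to X)."""
    q = len(L)
    if i1 is None:
        i1 = random.randrange(q)
    if i2 is None:
        i2 = random.randrange(q)
    if i2 == 0:                      # first link of the chain
        return L[i1][0] * X[0] % P == Y[(i1, 0)]
    if i2 < q - 1:                   # middle link
        return (Y[(i1, i2 - 1)] + L[i1][i2] * X[i2]) % P == Y[(i1, i2)]
    # last link: the full row-sum must vanish, i.e. X satisfies row i1
    return (Y[(i1, q - 2)] + L[i1][q - 1] * X[q - 1]) % P == 0
```

If X violates some row of L, then for the corresponding i_1 at least one of the q links of the chain must fail, which is exactly the argument in the soundness analysis above; hence a uniformly chosen link catches a violation with probability at least 1/q^2.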
Short 3-Query Testable Binary Codes. Using Propositions 5.21 and 5.23, we
show that our main result regarding locally testable codes (i.e., Theorem 2.3) holds
also with a tester that makes only three queries.
COROLLARY 5.24. For infinitely many k, there exists a linear binary code of
constant relative distance that maps k bits to n := exp(Õ(√log k)) · k bits and has
a three-query codeword test.
Perspective. Corollary 5.24 asserts that three queries suffice for a meaningful
definition of locally testable linear codes. This result is analogous to the three-query
PCPs available for NP-sets.^20 In both cases, the constant error probability remains
unspecified, and a second-level project aimed at minimizing the error of a three-query
test arises. Another worthy project refers to the trade-off between the number of
queries and the error probability, which in the context of PCP is captured by the
notion of amortized query complexity. The definition of an analogous notion for
locally testable codes is less straightforward, because one needs to specify which
strings (i.e., at what distance from the code) should be rejected with the stated error
probability. One natural choice is to consider the rejection probability of strings
that are at distance d/2 from the code, where d is the distance of the code itself.
Alternatively, one may consider the proportion between the relative distance to the
code and the rejection probability.
6. Subsequent Work and Open Problems
We have presented locally testable codes and PCP schemes of almost-linear length,
where ℓ : N → N is called almost-linear if ℓ(n) = n^{1+o(1)}. For PCP, this improved
over a previous result in which, for each ε > 0, a scheme of length n^{1+ε} was presented
(with query complexity O(1/ε)). Recall that our schemes have length ℓ(n) =
exp(Õ(√log n)) · n. In earlier versions of this work (e.g., Goldreich and Sudan
[2002]), we wondered whether length ℓ(n) = poly(log n) · n (or even linear length)
can be achieved. Similarly, the number of queries in our proof system is really
small (say, 19), while simultaneously achieving nearly linear-sized proofs. Further
reduction of this query complexity is very much feasible, and it is unclear what the
final limit may be. Is it possible to achieve nearly-linear (or even linear) proofs
with 3 query bits and soundness nearly 1/2?

^20 In both cases, testability by two queries is weak: see Bellare et al. [1998, Prop. 10.3] for PCPs
and Ben-Sasson et al. [2003a] for locally testable codes.
Turning to more technical issues, we note that our constructions of codes and
PCPs are actually randomized. In the case of codes, this means that we prove the
existence of certain codes (by using the probabilistic method), but we do not provide
fully-explicit codes. In the case of PCPs, we obtained PCPs for a problem to which
SAT can be randomly reduced (rather than for SAT itself). In both cases, the probabilistic
method is used to determine a sample of random-tapes for a relevant test, and the
probabilistic analysis shows that almost all choices of the subspace will do. A
natural (de-randomization) goal, stated in our preliminary report [Goldreich and
Sudan 2002], has been to provide an explicit construction of a good subspace. For
example, in the case of the low-degree test (which underlies our codeword tester), the
goal was to provide an explicit set of Õ(|F|^m) lines that can be used for this test
(as the set R in the construction of Section 3.2).
In our preliminary report [Goldreich and Sudan 2002], we also suggested the
following seemingly easier goal of de-randomizing the linearity test of Blum et al.
[1993]. Recall that, in order to test whether f : G → H is linear, one uniformly
selects (x, y) ∈ G × G and accepts if and only if f(x) + f(y) = f(x + y). Now,
by the probabilistic method, there exists a set R ⊆ G × G of size O(|G| log |H|)
such that the test works well when (x, y) is uniformly selected in R (rather than in
G × G).^21 The challenge suggested in Goldreich and Sudan [2002] was to present
an explicit construction of such a set R.
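The test just recalled is easy to state in code. The sketch below instantiates G = H = Z_n (a toy choice); the optional `pairs` argument plays the role of the derandomized query set R:

```python
import random

def blr_test(f, n, pairs=None, trials=200):
    """Blum-Luby-Rubinfeld linearity test for f: Z_n -> Z_n.
    Accepts iff f(x) + f(y) == f(x + y) (mod n) on every tested pair;
    by default the pairs are uniform in G x G, but an explicit set
    `pairs` (modelling the set R from the text) may be supplied."""
    if pairs is None:
        pairs = [(random.randrange(n), random.randrange(n))
                 for _ in range(trials)]
    return all((f(x) + f(y)) % n == f((x + y) % n) for x, y in pairs)
```

A linear f passes on every pair, while the BLR analysis shows that a function far from linear is rejected on a constant fraction of G × G; the derandomization question is whether an explicit set R of size O(|G| log |H|) achieves the same guarantee.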
The latter challenge as well as the more general goal of de-randomizing all
our results were recently resolved by Ben-Sasson et al. [2003b]. Specifically, they
showed that for low-degree testing one may use a small set of lines that consists of
all lines going in a small set of directions. They also showed that this result suffices
for the derandomization of our PCP result.
Another natural question that arises in this work refers to obtaining locally
testable codes for coding k′ < k information symbols out of codes that apply
to k information symbols. The straightforward idea of converting k′-symbol
messages into k-symbol messages (via padding) and encoding the latter by the
original code preserves many properties of the code, but does not necessarily
preserve local-testability.^22
Finally, we mention a few recent works that address the main question raised by
our work and mentioned above (i.e., whether PCPs and codes of length poly(log n)·n
are achievable, where n is the length of the relevant input). The first quantitative
^21 For every f : G → H, with probability 1 − exp(−Ω(|R|)), a random set R will be good for testing
whether f is linear, and the claim follows using the union bound over all |H|^{|G|} possible functions
f : G → H.
^22 Indeed, this difficulty (as well as other difficulties regarding the gap between PCPs and codes)
disappears if one allows probabilistic coding. That is, define a code C : Σ^k → Σ^n as a randomized
algorithm (rather than a mapping), and state all code properties with respect to randomized codewords
C(a)'s.
improvement over our work was obtained by Ben-Sasson et al. [2004], who, for
every constant ε > 0, presented PCPs and (weak) locally testable codes of length
exp(log^ε n) · n. Building on the work of Ben-Sasson and Sudan [2005], Dinur [2006]
has resolved the aforementioned problem by presenting PCPs and (weak) locally
testable codes of length poly(log n) · n. Specifically, Dinur applied her “PCP amplification”
technique, which is the main contribution of her work, to the PCP presented
by Ben-Sasson and Sudan [2005]. We note, however, that these improved codes
(respectively, PCP constructions) do not achieve strong codeword testability (respectively,
strong soundness). Indeed, obtaining such strong constructs of length
that improves on exp(Õ(√log n)) · n is an open problem.
Appendix A. The 3-Prover System of Harsha and Sudan [2001], Revisited
The 3-prover system of Harsha and Sudan [2000] handles an NP-complete (promise)
problem called GapPCS. This promise problem is revisited in Section A.1, where
we also present a restricted version of it called rGapPCS. In Section A.2, we adapt
the results of Harsha and Sudan [2000] to the variant introduced in Section A.1,
while in Section A.3 we describe the high level operation of the 3-prover system
of Harsha and Sudan [2000]. The latter section is aimed to support the claims made
when abstracting this proof system in Section 4.2.2.
A.1. The Gap Polynomial-Constraint-Satisfaction Problem
We start by recalling the “Gap Polynomial Constraint Satisfaction Problem” and
introducing a restricted version of this problem.
Standard CSPs. Constraint satisfaction problems (CSPs) are a natural class
of optimization problems where an instance consists of t Boolean constraints
C
1
,...,C
t
placed on n variables, each taking on values from some finite domain,
say {0,...,D 1}. Each constraint is restricted in that it may only depend on
a small number, w, of variables. The goal of the optimization problem is to find
an assignment to the n variables that maximizes the number of constraints that
are satisfied. The complexity of the optimization task depends on the nature of
constraints that may be applied, and thus each class of constraints gives rise to a
different optimization problem (cf. Creignou et al. [2001]). CSPs form a rich sub-
domain of optimization problems that include Max-3SAT, Max-2SAT, Max-Cut,
Max-3-Colorability etc., and lend themselves as targets for reductions from PCPs
(i.e., PCPs with certain parameters were often reduced to CSP problems of certain
types and parameters).
Algebraic CSPs. Following Harsha and Sudan [2000], we consider algebraic
variants of CSPs. These problems differ from the standard CSPs in certain syntactic
ways. The domain of the values that a variable can assume is associated with a finite
field F; the index set of the variables is associated with F^m for some integer m,
rather than being the set [n]; and thus an assignment to the variables may be viewed
naturally as a function f : F^m → F. Thus, the optimization problem(s) ask for
functions that satisfy as many constraints as possible. In this setting, constraints
are also naturally interpreted as algebraic functions, say given by an algebraic
circuit.
The interesting (nonsyntactic) aspect of these problems is when we optimize over
a restricted class of functions, rather than over the space of all functions. Specifically,
for a given degree bound d, we consider the maximum number of constraints
satisfied by a degree-d polynomial f : F^m → F. Under this restriction on the space of
solutions, it is easier to establish NP-hardness of the task of distinguishing instances
where all constraints are satisfiable from instances where only a tiny fraction of the
constraints are satisfiable. This motivates the “Gap Polynomial CSP”, first defined
by Harsha and Sudan [2000].
Definition A.1 (Gap Polynomial Constraint Satisfaction (GapPCS)). For integers
k, m, s and a finite field F, an (m, k)-ary algebraic constraint of complexity s
over F is a (k+1)-tuple C = (A; v_1, ..., v_k), where A : (F^m)^k → F is an algebraic
circuit of size s, and v_1, ..., v_k ∈ F^m are variable names. For ε : Z^+ → R^+
and m, b, q : Z^+ → Z^+, the promise problem GapPCS_{ε,m,b,q} has as instances tuples
(1^n, d; C_1, ..., C_t), where d, k ≤ b(n) are integers and C_j = (A_j; v_{j,1}, ..., v_{j,k}) is
an (m(n), k)-ary algebraic constraint of complexity b(n) over F = GF(q(n)). The
promise problem consists of the following sets of YES and NO instances.
YES-instances. The instance (1^n, d; C_1, ..., C_t) is a YES-instance if there exists
a polynomial p : F^m → F of total degree at most d such that, for every j, the
constraint C_j is satisfied by p; that is, A_j(p(v_{j,1}), ..., p(v_{j,k})) = 0, for every
j ∈ [t].
NO-instances. The instance (1^n, d; C_1, ..., C_t) is a NO-instance if, for every
polynomial p of total degree at most d, at most ε(n) · t constraints are satisfied
(i.e., evaluate to 0).
Note that all the varying parameters are expressed in terms of (the explicitly
given) parameter n, whereas the instance length is essentially n + log b(n) + t ·
b(n) · (m(n) + 1) · log q(n).
We stress that these gap problems are shown to be NP-hard (in Harsha and Sudan
[2000]) via a reduction that does not start from a PCP; instead, the ideas underlying
the PCP constructions of Babai et al. [1991a] and Feige et al. [1996] are (directly)
used in the reduction. Furthermore, these (algebraic) CSPs are used as the problem
for which PCPs are designed (rather than as the target of reduction from certain
PCPs). We comment that, so far (including our work), this approach was used
to design PCPs with certain parameters per se (and not to establish “hardness of
approximation” results).
Restricting the Algebraic CSPs. In order to facilitate the design of PCPs, we
consider a restricted version of the algebraic CSPs considered in Harsha and Sudan
[2000]. Specifically, we consider a restriction on the class of instances, where
each constraint, in addition to being restricted to apply only to k variables, is
restricted to apply only to variables that lie on some “2-dimensional variety” (i.e.,
the names/indices of the variables that appear in a constraint must lie on such a
variety). We define this notion first.
A d-dimensional variety of degree r is represented by a function Q =
(Q_1, ..., Q_m) : F^d → F^m, where each Q_i is a d-variate polynomial of degree
r, and consists of the set of points V_Q := {Q(x) : x ∈ F^d}. (Note that this formulation
is more restrictive than the standard definitions of varieties.) A set of points
is said to lie on the variety V_Q if this set is contained in V_Q.
In the following definition, in addition to requiring that the variables of each
constraint lie on a 2-dimensional variety (of degree r ), we include this variety in
the description of the constraint. (This was not required in Harsha and Sudan [2000],
because they used a canonical higher-dimensional variety, which was constructed
generically from the aforementioned points and did not rely on the special structure
of these points.)
Definition A.2 (Restricted Gap Polynomial Constraint Satisfaction (rGapPCS)).
For integers k, m, s, r and a finite field F, a (2, r)-restricted (m, k)-ary algebraic
constraint of complexity s over F is a (k+2)-tuple C = (A; v_1, ..., v_k; Q), where
A : (F^m)^k → F is an algebraic circuit of size s, and v_1, ..., v_k ∈ F^m are variable
names that lie on the 2-dimensional variety of degree r represented by Q. For
ε : Z^+ → R^+ and r, m, b, q : Z^+ → Z^+, the promise problem rGapPCS_{ε,r,m,b,q}
has as instances tuples (1^n, d; C_1, ..., C_t), where d, k ≤ b(n) are integers and
C_j = (A_j; v_{j,1}, ..., v_{j,k}; Q_j) is a (2, r(n))-restricted (m(n), k)-ary algebraic
constraint of complexity b(n) over F = GF(q(n)). The partition of these instances
to YES and NO instances is as in Definition A.1.
Again, all the varying parameters are expressed in terms of (the explicitly given)
parameter n, whereas the instance length is

N := |(1^n, d; C_1, ..., C_t)| ≤ n + log b(n) + t · (b(n) + r(n)^2) · (m(n) + 1) · log q(n)   (47)

(when ignoring the effect on length involved in encoding sequences as a single
string).
A.2. The Complexity of rGapPCS
The following lemma is a slight variant of Lemma 3.16 in Harsha and Sudan [2000].
Specifically, while Harsha and Sudan [2000] use the generic fact that any k points lie
on a d-dimensional variety of degree d · k^{1/d}, we note that the specific O(m(n)b(n))
points chosen for each constraint (in the reduction) happen to lie on a 2-dimensional
variety of degree O(m(n)). This is because each constraint refers to O(m(n)b(n))
points such that each point lies on one out of O(m(n)) lines. Furthermore, we can
construct a representation of this variety, given that we have both the points and
the lines on which they lie. The following lemma simply lists conditions on the
parameters that allow restricted GapPCS to be NP-hard.
LEMMA A.3 (SLIGHT VARIANT OF HARSHA AND SUDAN [2000, LEM. 3.16]).
There exist constants c_1, c_2 and a polynomial p_1 such that for any collection of
functions ε : Z^+ → R^+ and m, r, b, q, ℓ : Z^+ → Z^+ that satisfy b(n) ≥ log n,
(b(n)/m(n))^{m(n)} ≥ n, r(n) ≥ c_1 m(n), q(n) ≥ (b(n)/ε(n)) · p_1(m(n)), and
ℓ(n) ≥ q(n)^{m(n)+c_2}, it holds that SAT reduces to rGapPCS_{ε,r,m,b,q} under an ℓ-length
preserving reduction.
The proof of Lemma A.3 is immediate from the description in Harsha and Sudan
[2000] and the aforementioned observation about the existence and constructibility
of an adequate (2-dimensional) variety (of degree r(n)). On the other hand,
when applying the MIP system of Harsha and Sudan [2000, Sect. 3.6] to restricted
GapPCS instances, we get:
LEMMA A.4 (IMPLICIT IN HARSHA AND SUDAN [2000, SECT. 3.6]). There exists
a polynomial p_2 such that if ε : Z^+ → R^+ and r, m, b, q : Z^+ → Z^+ satisfy
q(n) ≥ p_2(r(n)) · (b(n)/ε(n)), then the promise problem rGapPCS_{ε,r,m,b,q} has a
3-prover MIP proof with perfect completeness, soundness O(ε(n)), answer length
poly(b(n) + r(n)) · log q(n), and randomness O(log N) + O(m(n) log q(n)), where
N denotes the size of the GapPCS instance and n denotes the first parameter in the
instance. Furthermore, the size of the first prover is q(n)^{m(n)}, and its answer length
is log q(n).
When wishing to derive 3-prover MIPs for SAT by using Lemma A.4, we may use
the reduction provided by Lemma A.3 for an appropriate choice of the parameters
ε, m, b, q,. Indeed, combining Lemmas A.3 and A.4, we state the following result
regarding 3-prover MIPs for SAT, where we restrict attention to the case of constant
>0 (and set most of the free parameters appearing in the two lemmas).
THEOREM A.5. For every constant γ > 0 and m : Z^+ → Z^+, let ℓ(n) =
m(n)^{O(m(n))} · n^{1+O(1/m(n))}. Then SAT has a 3-prover proof system with perfect
completeness, soundness γ, randomness O(log n), and answer length m(n)^{O(1)} ·
n^{O(1/m(n))}, in which the first prover has size O(ℓ(n)), where n denotes the length of
the input.
PROOF. Assume, without loss of generality, that m(n) ≤ (log n)/(log log n).
(For larger m(·), the requirements on both the function ℓ(n) and the answer length
become weaker.) Let p be a polynomial such that p(t) ≥ max(p_1(t), p_2(c_1 t))
for every t ≥ 1, where c_1 is the constant in Lemma A.3 and p_1 and p_2 are the
polynomials in Lemmas A.3 and A.4, respectively. We use the following setting of
the functions b, r, ε, and q:

b(n) = m(n) · n^{1/m(n)}
r(n) = c_1 · m(n)
ε(n) = γ/O(1)
q(n) = (b(n)/ε(n)) · p(m(n)).
The reader can easily verify that this setting satisfies all relevant conditions in
Lemmas A.3 and A.4. To verify the remaining condition, which refers to ℓ, note that

q(n)^{m(n)+c_2} = ((m(n) · n^{1/m(n)}/ε(n)) · p(m(n)))^{m(n)+c_2} ≤ (m(n)/ε(n))^{O(m(n))} · n^{1+(c_2/m(n))}.

Using ε(n)^{−1} ≤ m(n)^{O(1)}, we have q(n)^{m(n)+c_2} < m(n)^{O(m(n))} · n^{1+O(1/m(n))},
and ℓ(n) ≥ q(n)^{m(n)+c_2} follows for a suitable constant c in the setting ℓ(n) =
m(n)^{c·m(n)} · n^{1+(c/m(n))}. We note that log q(n) = O(log m(n)) + (1/m(n)) · log n, and
recall that m(n) log m(n) < log n. Now, invoking Lemmas A.3 and A.4 (with the
setting of parameters as above), we obtain a 3-prover proof system for SAT with
perfect completeness, soundness γ, and the following parameters:

Answer length poly(b(n) + r(n)) · log q(n) = m(n)^{O(1)} · n^{O(1/m(n))}.
Randomness O(log ℓ(n)) + O(m(n) log q(n)) = O(log n).
The size of the first prover oracle is q(n)^{m(n)} < ℓ(n).

The theorem follows.
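The parameter setting in the proof can be sanity-checked numerically. In the sketch below, c_1 = 2 and p(m) = m^3 are placeholder choices (the actual constant and polynomial come from Lemmas A.3 and A.4), and ε is fixed to a small constant:

```python
import math

def mip_params(n, m, c1=2, eps=0.01):
    """Numeric check of the setting b(n) = m * n^(1/m), r(n) = c1 * m,
    q(n) = (b(n)/eps) * p(m), with placeholder p(m) = m**3."""
    b = m * n ** (1 / m)
    r = c1 * m
    q = (b / eps) * m ** 3
    # conditions of Lemma A.3: b(n) >= log n and (b(n)/m(n))^m(n) >= n
    assert b >= math.log(n)
    assert math.isclose((b / m) ** m, n, rel_tol=1e-9)
    # answer-length proxy: poly(b + r) * log q = m^{O(1)} * n^{O(1/m)}
    answer = (b + r) ** 2 * math.log(q)
    return b, r, q, answer
```

For a fixed input length (say n = 10^6), increasing m shrinks the n^{1/m} factor in b(n), and hence the answer length, at the cost of the m^{O(m)} factor in ℓ(n); this is the trade-off expressed in the theorem statement.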
A.3. The Proof System of Theorem A.5
In this section, we provide a high-level description of the operation of the 3-prover
system that underlies the proof of Theorem A.5, which in fact is the system under-
lying the proof of Lemma A.4. (Needless to say, a full description of this system is
given in the original work of Harsha and Sudan [2000].)
Recall that the problem (instance) consists of parameters n, d and a sequence of
constraints C_1, ..., C_t. (See Definition A.2.) The field F = GF(q(n)) is determined
by n (and so are the values m = m(n) and r = r(n)). In the 3-prover one-round
system underlying the proof of Lemma A.4, the verifier expects the three provers
P, P_1, P_2 to answer its queries as follows:
P should answer according to an assignment function f that satisfies the conditions
of Definition A.2. In particular, f is supposed to be a degree d polynomial
in m variables over F.
P_1 should provide the value of f when restricted to any plane in F^m, where a plane
is defined by three points in F^m (i.e., π = π_{a,b,c} = {i·a + j·b + c : i, j ∈ F},
for a, b, c ∈ F^m). That is, P_1 should answer the query π = π_{a,b,c} with the
bivariate polynomial f_π = f(π) over F, where f_π(x, y) = f(x·a + y·b + c).
P_2 should provide the value of f when restricted to any curve (of appropriate
flexibility) in F^m. Specifically, the curves are 3-dimensional varieties of degree
r, given by m trivariate polynomials of degree r (over F).
The verifier operates as follows. It picks a random constraint C_j =
(A_j; v_{j,1}, ..., v_{j,k}; Q_j) and a random point v_0, picks a random plane π that passes
through v_0, and a random curve C (i.e., a 3-dimensional variety of degree r) that
extends the variety represented by Q_j and passes through the point v_0. (Specifically,
this curve may be the one given by C(s, t_1, t_2) = s·v_0 + (1−s)·Q_j(t_1, t_2).) It sends
v_0 to P, π to P_1, and C to P_2, receiving the answers a := P(v_0), g = P_1(π), and
h = P_2(C). The verifier accepts if and only if the following two conditions hold:
(1) The function g is consistent with Ps answer at v
0
; that is, g(t
, t

) = a, where
(t
, t

) = v
0
.
(2) The function h is consistent with P's answer at v_0 and the values of f (as provided by h) on v_{j,1}, ..., v_{j,k} satisfy A_j. That is:
(a) h(α_0) = a, where C(α_0) = v_0.
(b) A_j(h(α_1), ..., h(α_k)) = 0, where C(α_i) = v_{j,i} for i = 1, ..., k.
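The curve used by the verifier and consistency condition (2a) can be sketched concretely. The field size q, the polynomial f, the point v0, and the stand-in for Q_j below are all made up for illustration; the sketch shows that the curve C(s, t_1, t_2) = s·v_0 + (1 − s)·Q_j(t_1, t_2) passes through v_0 at its s = 1 slice, so an honest P_2 automatically satisfies condition (2a).

```python
# Toy illustration of the verifier's curve and of consistency condition (2a),
# over GF(q) with q = 13 and m = 3.

q = 13

def f(pt):
    # a toy low-degree polynomial in m = 3 variables over GF(q)
    x1, x2, x3 = pt
    return (x1 * x2 * x2 + 5 * x3 + 2) % q

v0 = (3, 1, 4)

def Q_j(t1, t2):
    # an illustrative bivariate variety standing in for the constraint's Q_j
    return tuple((t1 * t1 + u * t2) % q for u in (1, 2, 3))

def C(s, t1, t2):
    # the curve C(s, t1, t2) = s*v0 + (1 - s)*Q_j(t1, t2); it passes through
    # v0 (the s = 1 slice) and extends Q_j (the s = 0 slice)
    return tuple((s * v + (1 - s) * w) % q for v, w in zip(v0, Q_j(t1, t2)))

def h(s, t1, t2):
    # the honest P_2 answers with f restricted to the curve
    return f(C(s, t1, t2))

a = f(v0)            # P's answer at v0
alpha0 = (1, 0, 0)   # any point with s = 1 maps to v0
assert C(*alpha0) == v0
assert h(*alpha0) == a   # condition (2a): h(alpha_0) = a
```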
Note that this verifier has logarithmic randomness complexity (i.e., it tosses log t + O(m log q(n)) coins, whereas its input length exceeds t · m log q(n)), and that each of its queries is uniformly distributed in the corresponding domain. Thus, this verifier satisfies the Sampleability and Uniformity Properties defined in Section 4.2.2. Before turning to the Decomposition Property, we note that the verifier has perfect completeness (i.e., if a good solution f exists, then setting the prover strategies as suggested above makes the verifier accept with probability 1).
Soundness and Decomposition Property. Suppose that f = P does not satisfy the rGapPCS instance. Consider the set of all m-variate polynomials of degree d that agree with f on at least ε/2 of the domain. Denoting these polynomials by p_1, ..., p_L, we denote by S_i the set of points where f agrees with p_i (i.e., S_i = {x ∈ F^m : f(x) = p_i(x)}). Let Q' = Q'_P = F^m \ ∪_i S_i. We consider the following two cases (concerning whether or not the random point v_0 is in Q'):
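The agreement sets S_i and the residual set Q' can be illustrated with a toy example, taking m = 1 and GF(5) for readability; the functions f and p_1 below are made up.

```python
# Toy illustration of an agreement set S_1 and the residual set Q' = F^m \ U_i S_i,
# with m = 1 and GF(5); here f is close to a single degree-1 polynomial p_1.

q = 5
domain = range(q)

def p1(x):
    # a degree-1 polynomial that f is close to
    return (2 * x + 1) % q

def f(x):
    # f agrees with p1 everywhere except at x = 3
    return p1(x) if x != 3 else 0

S1 = {x for x in domain if f(x) == p1(x)}   # S_1 = {x : f(x) = p_1(x)}
Qprime = set(domain) - S1                   # Q' = F^m \ (union of the S_i)

assert S1 == {0, 1, 2, 4}
assert Qprime == {3}
```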
v_0 ∈ Q'. This case is analyzed as Event 1 in the proof of Harsha and Sudan [2000, Claim 3.30], where it is shown that for every P_1,

    Pr_{v_0, ℘} [v_0 ∈ Q' and (P_1(℘))(t', t'') = f(v_0)] < ε/2,

where ℘(t', t'') = v_0.
v_0 ∈ ∪_i S_i. This case is analyzed as Events 2 and 3 in the proof of Harsha and Sudan [2000, Claim 3.30], where it is shown that for every P_2,

    Pr_{v_0, j, C} [v_0 ∉ Q', (P_2(C))(α_0) = f(v_0) and A_j((P_2(C))(α_1), ..., (P_2(C))(α_k)) = 0] < ε/2,

where C_j = (A_j; v_{j,1}, ..., v_{j,k}), C(α_0) = v_0, and C(α_ℓ) = v_{j,ℓ} for ℓ = 1, ..., k.
Combining the two cases, soundness is established. Furthermore, the above analysis satisfies the Decomposition Property.
ACKNOWLEDGMENTS. We are grateful to Salil Vadhan for suggesting some
modifications to the construction and analysis in Section 3.2, yielding stronger
results with simpler proofs. We also wish to thank the anonymous referees for
their helpful comments.
REFERENCES
ALON, N., KAUFMAN, T., KRIVELEVICH, M., LITSYN, S., AND RON, D. 2003. Testing low-degree polynomials over GF(2). In Proceedings of the 7th International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM 2003). Lecture Notes in Computer Science, Vol. 2754. Springer, New York, 188–199.
ARORA, S., LUND, C., MOTWANI, R., SUDAN, M., AND SZEGEDY, M. 1998. Proof verification and the hardness of approximation problems. J. ACM 45, 3 (May), 501–555.
ARORA, S., AND SAFRA, S. 1998. Probabilistic checking of proofs: A new characterization of NP. J. ACM 45, 1 (Jan.), 70–122.
ARORA, S., AND SUDAN, M. 2003. Improved low degree testing and its applications. Combinatorica 23, 3, 365–426.
BABAI, L., FORTNOW, L., LEVIN, L., AND SZEGEDY, M. 1991a. Checking computations in polylogarithmic time. In Proceedings of the 23rd ACM Symposium on the Theory of Computing. ACM, New York, 21–31.
BABAI, L., FORTNOW, L., AND LUND, C. 1991b. Non-deterministic exponential time has two-prover interactive protocols. Comput. Complex. 1, 13–40.
BELLARE, M., COPPERSMITH, D., HÅSTAD, J., KIWI, M., AND SUDAN, M. 1996. Linearity testing in characteristic two. IEEE Trans. Info. Theory 42, 6 (Nov.), 1781–1795.
BELLARE, M., GOLDREICH, O., AND SUDAN, M. 1998. Free bits, PCPs, and nonapproximability—towards tight results. SIAM J. Comput. 27, 3, 804–915.
BELLARE, M., GOLDWASSER, S., LUND, C., AND RUSSELL, A. 1993. Efficient probabilistically checkable proofs and applications to approximation. In Proceedings of the 25th ACM Symposium on the Theory of Computing. ACM, New York, 294–304.
BEN-SASSON, E., GOLDREICH, O., HARSHA, P., SUDAN, M., AND VADHAN, S. 2004. Robust PCPs of proximity, shorter PCPs and applications to coding. In Proceedings of the 36th ACM Symposium on the Theory of Computing. ACM, New York, 1–10.
BEN-SASSON, E., GOLDREICH, O., AND SUDAN, M. 2003a. Bounds on 2-query codeword testing. In Proceedings of the 7th International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM 2003). Lecture Notes in Computer Science, Vol. 2764. Springer, New York, 216–227.
BEN-SASSON, E., AND SUDAN, M. 2005. Short PCPs with poly-log rate and query complexity. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing. ACM, New York, 266–275.
BEN-SASSON, E., SUDAN, M., VADHAN, S., AND WIGDERSON, A. 2003b. Randomness-efficient low degree tests and short PCPs via ε-biased sets. In Proceedings of the 35th ACM Symposium on the Theory of Computing. ACM, New York, 612–621.
BLUM, M., LUBY, M., AND RUBINFELD, R. 1993. Self-testing/correcting with applications to numerical problems. J. Comput. Syst. Sci. 47, 3, 549–595.
CREIGNOU, N., KHANNA, S., AND SUDAN, M. 2001. Complexity Classifications of Boolean Constraint Satisfaction Problems. SIAM Press, Philadelphia, PA.
DINUR, I. 2006. The PCP theorem by gap amplification. In Proceedings of the 38th ACM Symposium on the Theory of Computing. ACM, New York, 241–250.
DINUR, I., AND REINGOLD, O. 2004. Assignment-testers: Towards a combinatorial proof of the PCP-theorem. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, CA, 155–164.
EVEN, S., SELMAN, A. L., AND YACOBI, Y. 1984. The complexity of promise problems with applications to public-key cryptography. Info. Control 61, 2 (May), 159–173.
FEIGE, U., GOLDWASSER, S., LOVÁSZ, L., SAFRA, S., AND SZEGEDY, M. 1996. Interactive proofs and the hardness of approximating cliques. J. ACM 43, 268–292.
FORNEY, JR., G. D. 1966. Concatenated Codes. MIT Press, Cambridge, MA.
FRIEDL, K., AND SUDAN, M. 1995. Some improvements to low-degree tests. In Proceedings of the 3rd Annual Israel Symposium on Theory and Computing Systems (Washington, DC).
GOLDREICH, O., GOLDWASSER, S., AND RON, D. 1998. Property testing and its connection to learning and approximation. J. ACM 45, 4 (July), 653–750.
GOLDREICH, O., KARLOFF, H., SCHULMAN, L. J., AND TREVISAN, L. 2002. Lower bounds for linear locally decodable codes and private information retrieval. In Proceedings of the 17th IEEE Conference on Computational Complexity. IEEE Computer Society Press, Los Alamitos, CA, 175–183.
GOLDREICH, O., AND SUDAN, M. 2002. Locally testable codes and PCPs of almost-linear length. Tech. Rep. TR02-050, ECCC.
HARSHA, P., AND SUDAN, M. 2000. Small PCPs with low query complexity. Comput. Complex. 9, 3–4, 157–201.
HÅSTAD, J. 1999. Clique is hard to approximate within n^(1−ε). Acta Math. 182, 105–142.
KATZ, J., AND TREVISAN, L. 2000. On the efficiency of local decoding procedures for error-correcting codes. In STOC'00: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing. ACM, New York, 80–86.
KIWI, M. 2003. Algebraic testing and weight distribution of dual codes. Tech. Rep. TR97-010, ECCC.
MOTWANI, R., AND RAGHAVAN, P. 1995. Randomized Algorithms. Cambridge University Press.
POLISHCHUK, A., AND SPIELMAN, D. A. 1994. Nearly-linear size holographic proofs. In Proceedings of the 26th Annual ACM Symposium on the Theory of Computing. ACM, New York, 194–203.
RAZ, R., AND SAFRA, S. 1997. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proceedings of the 29th Annual ACM Symposium on the Theory of Computing. ACM, New York, 475–484.
RUBINFELD, R., AND SUDAN, M. 1996. Robust characterization of polynomials with applications to program testing. SIAM J. Comput. 25, 2 (Apr.), 252–271.
RECEIVED FEBRUARY 2004; REVISED FEBRUARY 2005 AND MARCH 2006; ACCEPTED MARCH 2006
Journal of the ACM, Vol. 53, No. 4, July 2006.