<?xml version="1.0" encoding="utf-8"?>
<!-- name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.miek.nl" -->
<rfc version="3" ipr="trust200902" docName="draft-google-cfrg-libzk-01" submissionType="IETF" category="info" xml:lang="en" xmlns:xi="http://www.w3.org/2001/XInclude" indexInclude="true">

<front>
<title>The Longfellow Zero-knowledge Scheme</title><seriesInfo value="draft-google-cfrg-libzk-01" stream="IETF" status="informational" name="Internet-Draft"></seriesInfo>
<author initials="M." surname="Frigo" fullname="Matteo Frigo"><organization>Google</organization><address><postal><street></street>
</postal><email>matteof@google.com</email>
</address></author><author initials="a." surname="shelat" fullname="abhi shelat"><organization>Google</organization><address><postal><street></street>
</postal><email>shelat@google.com</email>
</address></author><date/>
<area>Internet</area>
<workgroup>Network Working Group</workgroup>

<abstract>
<t>This document defines an algorithm for generating and verifying a succinct non-interactive zero-knowledge argument that for a given input <tt>x</tt> and a circuit <tt>C</tt>, there exists a witness <tt>w</tt>, such that <tt>C(x,w)</tt> evaluates to 0. The technique here combines the MPC-in-the-head approach for constructing ZK arguments described in Ligero <xref target="ligero"></xref> with a verifiable computation protocol based on sumcheck for proving that <tt>C(x,w)=0</tt>.</t>
</abstract>

</front>

<middle>

<section anchor="introduction"><name>Introduction</name>
<t>A zero-knowledge (ZK) scheme allows a Prover who holds an arithmetic circuit <tt>C</tt> defined over a finite field <tt>F</tt> and two inputs <tt>(x,w)</tt> to convince a Verifier who holds only <tt>(C,x)</tt> that the Prover knows <tt>w</tt> such that <tt>C(x,w) = 0</tt> without revealing any extra information to the Verifier.</t>
<t>The concept of a zero-knowledge scheme was introduced by Goldwasser, Micali, and Rackoff <xref target="GMR"></xref>, and has since been rigorously explored and optimized in the academic literature.</t>
<t>There are several models and efficiency goals that different ZK schemes aim to achieve, such as reducing prover time, reducing verifier time, or reducing proof size. Some ZK schemes also impose other requirements to achieve their efficiency goals. This document considers the scenario in which no common reference strings or trusted parameter setups are available to the parties. This immediately rules out several succinct ZK schemes from the literature. In addition, this document focuses on schemes that can be instantiated from a collision-resistant hash function and require no other complexity-theoretic assumption. Again, this rules out several schemes in the literature. All of the ZK schemes from the literature that remain can be defined in the Interactive Oracle Proof (IOP) model, and this document specifies a family of them that enjoys both efficiency and simplicity.</t>

<section anchor="the-longfellow-system"><name>The Longfellow system</name>
<t>This document specifies the Longfellow ZK scheme described in the paper <xref target="longfellow"></xref>. The scheme is constructed from two components. The first is the Ligero scheme, which provides a cryptographic commitment scheme that supports an efficient ZK argument system for proving linear and quadratic constraints on the committed witness. The second is a public-coin interactive protocol (IP) for producing an argument that <tt>C(x,w)=0</tt>, where <tt>C</tt> is a circuit, <tt>x</tt> is a public input, and <tt>w</tt> is a private witness. The overall scheme works in three steps: the Prover commits to the witness <tt>w</tt> as well as a <tt>pad</tt> used to commit the transcript of the IP; the Prover then runs the IP with the verifier in a way that produces a commitment to the transcript of the IP; and finally the Prover runs the Ligero proof system to prove that the transcript in the commitment induces the IP verifier to accept.</t>
</section>
</section>

<section anchor="basic-operations-and-notation"><name>Basic Operations and Notation</name>
<t>The key words &quot;<bcp14>MUST</bcp14>&quot;, &quot;<bcp14>MUST NOT</bcp14>&quot;, &quot;<bcp14>REQUIRED</bcp14>&quot;, &quot;<bcp14>SHALL</bcp14>&quot;, &quot;<bcp14>SHALL NOT</bcp14>&quot;, &quot;<bcp14>SHOULD</bcp14>&quot;, &quot;<bcp14>SHOULD NOT</bcp14>&quot;, &quot;<bcp14>RECOMMENDED</bcp14>&quot;, &quot;<bcp14>MAY</bcp14>&quot;, and &quot;<bcp14>OPTIONAL</bcp14>&quot; in this document are to be interpreted as described in RFC 2119 <xref target="RFC2119"></xref>.</t>
<t>Additionally, the key words &quot;<strong>MIGHT</strong>&quot;, &quot;<strong>COULD</strong>&quot;, &quot;<strong>MAY WISH TO</strong>&quot;, &quot;<strong>WOULD
PROBABLY</strong>&quot;, &quot;<strong>SHOULD CONSIDER</strong>&quot;, and &quot;<strong>MUST (BUT WE KNOW YOU WON'T)</strong>&quot; in
this document are to be interpreted as described in RFC 6919 <xref target="RFC6919"></xref>.</t>
<t>Unless stated otherwise, random choices in this specification refer to drawing with uniform distribution from a given set (i.e., &quot;random&quot; is short for &quot;uniformly random&quot;). Random choices can be replaced with fresh outputs from a cryptographically strong pseudorandom generator, according to the requirements in <xref target="RFC4086"></xref>, or from a pseudorandom function.</t>

<section anchor="array-primitives"><name>Array primitives</name>
<t>The notation <tt>A[0..N]</tt> refers to the array of size <tt>N</tt> that contains <tt>A[0],A[1],...,A[N-1]</tt>, i.e., the right-boundary in the notation <tt>X..Y</tt> is an exclusive index bound.
The following functions are used throughout the document:</t>

<ul spacing="compact">
<li>copy(n, Dst, Src): copies n elements from Src to Dst with different strides</li>
<li>axpy(n, Y, A, X): sets Y[i] += A*X[i] for 0 &lt;= i &lt; n.</li>
<li>sum(n, A): computes the sum of the first n elements in array A</li>
<li>dot(n, A, Y): computes the dot product of length n between arrays A and Y.</li>
<li>add(n, A, Y): returns the array <tt>[A[0]+Y[0], A[1]+Y[1], ..., A[n-1]+Y[n-1]]</tt>.</li>
<li>prod(n, A, Y): returns the array <tt>[A[0]*Y[0], A[1]*Y[1], ..., A[n-1]*Y[n-1]]</tt>.</li>
<li>equal(n, A, Y): true if <tt>A[i]==Y[i]</tt> for 0 &lt;= i &lt; n and false otherwise.</li>
<li>gather(n, A, I): returns the array <tt>[A[I[0]], A[I[1]], ..., A[I[n-1]]]</tt>.</li>
<li><tt>A[n][m] = [0]</tt>: initializes the 2-dimensional n x m array A to all zeroes.</li>
<li><tt>A[0..NREQ] = X</tt> : array assignment; this operation copies the first NREQ elements of X into the corresponding indices of the array A.</li>
</ul>
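<t>As an informal illustration, several of the primitives above can be sketched in Python. This is a minimal sketch, not part of the specification: plain Python integers stand in for field elements (a real implementation would perform all arithmetic in the field <tt>F</tt>), and strides are ignored.</t>

<sourcecode type="python"><![CDATA[
# Illustrative stand-ins for the array primitives.
# Assumption: plain ints replace field elements; no strides.

def axpy(n, Y, A, X):
    """In place: Y[i] += A * X[i] for 0 <= i < n."""
    for i in range(n):
        Y[i] += A * X[i]

def dot(n, A, Y):
    """Dot product of the first n elements of A and Y."""
    return sum(A[i] * Y[i] for i in range(n))

def add(n, A, Y):
    """Element-wise sum of the first n elements."""
    return [A[i] + Y[i] for i in range(n)]

def prod(n, A, Y):
    """Element-wise product of the first n elements."""
    return [A[i] * Y[i] for i in range(n)]

def equal(n, A, Y):
    """True if A[i] == Y[i] for 0 <= i < n."""
    return all(A[i] == Y[i] for i in range(n))

def gather(n, A, I):
    """Returns the n elements of A selected by the index array I."""
    return [A[I[i]] for i in range(n)]
]]></sourcecode>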
</section>

<section anchor="polynomial-operations"><name>Polynomial operations</name>
<t>This section describes operations on and associated with polynomials
that are used in the main protocol.</t>

<section anchor="extend-method-in-field-f-p"><name>Extend method in Field F_p</name>
<t>The <tt>extend(f, n, m)</tt> method interprets the array <tt>f[0..n]</tt> as the evaluations of a polynomial <tt>P</tt> of degree less than <tt>n</tt> at the points <tt>0,...,n-1</tt>, and returns the evaluations of the same <tt>P</tt> at the points <tt>0,...,m-1</tt>.  For sufficiently large fields <tt>|F_p| = p &gt;= m</tt>, polynomial <tt>P</tt> is uniquely determined by the input, and thus <tt>extend</tt> is well defined.</t>
<t>As there are several algorithms for efficiently performing the extend operation, the implementor can choose a suitable one.  In some cases, the brute force method of using Lagrange interpolation formulas to compute each output point independently may suffice.  One can employ a convolution to implement the <tt>extend</tt> operation, and in some cases, either the Number Theoretic Transform or Nussbaumer's algorithm can be used to efficiently compute a convolution.</t>
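<t>The brute-force option mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the modulus <tt>P</tt> is a small prime picked for the example, not a parameter of this specification, and the quadratic-time Lagrange evaluation is far slower than the convolution-based methods.</t>

<sourcecode type="python"><![CDATA[
P = 2**31 - 1  # hypothetical prime modulus, for illustration only

def extend(f, n, m, p=P):
    """Brute-force extend over Z/(p).

    f[0..n] holds the evaluations of a polynomial of degree < n at
    the points 0..n-1; the result holds its evaluations at 0..m-1.
    """
    out = []
    for x in range(m):
        total = 0
        for i in range(n):
            # Lagrange basis polynomial L_i evaluated at x.
            num, den = 1, 1
            for j in range(n):
                if j != i:
                    num = num * (x - j) % p
                    den = den * (i - j) % p
            # pow(den, -1, p) is the modular inverse (Python 3.8+).
            total = (total + f[i] * num * pow(den, -1, p)) % p
        out.append(total)
    return out
]]></sourcecode>
<t>For example, the evaluations <tt>[1, 2, 5]</tt> of <tt>x^2 + 1</tt> at the points 0, 1, 2 extend to the evaluations of the same polynomial at 0 through 5.</t>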
</section>

<section anchor="extend-method-in-field-gf-2-k"><name>Extend method in Field GF 2<sup>k</sup></name>
<t>The previous section described an extend method that applies to odd prime-order finite fields which contain the elements 0,1,2...,m.  In the special case of GF(2^k), the extend operator is defined in an opinionated way inspired by the Additive FFT algorithm by Lin et al <xref target="additivefft"></xref>.
Lin et al. define a novel polynomial basis for polynomials as an alternative to the usual monomial
basis x<sup>i</sup>, and give an algorithm for evaluating a degree-(d-1) polynomial at all d points in a subspace, for d=2<sup>ell</sup>, and for polynomials expressed in the novel basis.</t>
<t>Specifically, this document implements GF(2<sup>128</sup>) as GF(2)[x] / (Q(x)) where</t>

<artwork><![CDATA[    Q(x) = x^{128} + x^{7} + x^{2} + x + 1
]]></artwork>
<t>With this choice of Q(x), <tt>x</tt> is a generator of the multiplicative group of the field.
Next, choose GF(2<sup>16</sup>) as the subfield of GF(2<sup>128</sup>) with <tt>g=x^{(2^{128}-1) / (2^{16}-1)}</tt> as its generator, and <tt>beta_i=g^i</tt> for 0 &lt;= i &lt; 16 as the basis of the subfield.  For relevant problem sizes, this allows encoding elements in a commitment scheme with 16 bits instead of 128.</t>
<t>Writing <tt>j_i</tt> for the <tt>i</tt>-th bit of the binary representation of <tt>j</tt>, that is,</t>

<artwork><![CDATA[    j = sum_{0 <= i < k} j_i 2^i     j_i \in {0,1}
]]></artwork>
<t>inject integer <tt>j</tt> into a field element <tt>inj(j)</tt> by interpreting the bits of <tt>j</tt>  as coordinates in terms of the basis:</t>

<artwork><![CDATA[    inj(j) = sum_{0 <= i < k} j_i beta_i
]]></artwork>
<t>In this setting, define the extend operator to interpret the array <tt>f[0..n]</tt> as the evaluations of a polynomial <tt>p(x)</tt> of degree at most <tt>n-1</tt> at the <tt>n</tt> points <tt>x \in { inj(i) : 0 &lt;= i &lt; n }</tt> and to return the set <tt>{ p(inj(i)) : 0 &lt;= i &lt; m }</tt>, which consists of the evaluations of the same polynomial <tt>p(x)</tt> at the injected points <tt>0,...,m-1</tt>.</t>
<t>This convention allows this operation to be completed efficiently using various forms of the additive FFT as described in <xref target="longfellow"></xref> <xref target="additivefft"></xref>.</t>
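<t>The field arithmetic and the injection <tt>inj</tt> described above can be illustrated with a short sketch. It assumes, purely for the example, that a field element is encoded as a Python integer whose bit <tt>i</tt> is the coefficient of <tt>x^i</tt>; the helper names <tt>gf_mul</tt> and <tt>gf_pow</tt> are hypothetical, and the bit-by-bit multiplication is chosen for clarity rather than speed.</t>

<sourcecode type="python"><![CDATA[
# Q(x) = x^128 + x^7 + x^2 + x + 1.
# Assumed encoding: bit i of an int is the coefficient of x^i.
Q = (1 << 128) | 0x87

def gf_mul(a, b):
    """Multiply two elements of GF(2)[x] / (Q(x))."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a >> 128:        # degree reached 128: reduce modulo Q
            a ^= Q
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

X = 2                                        # the element x
G = gf_pow(X, (2**128 - 1) // (2**16 - 1))   # subfield generator g
BETA = [gf_pow(G, i) for i in range(16)]     # beta_i = g^i

def inj(j):
    """Map a 16-bit integer j to sum_i j_i * beta_i."""
    r = 0
    for i in range(16):
        if (j >> i) & 1:
            r ^= BETA[i]
    return r
]]></sourcecode>
<t>In this sketch, <tt>inj(1)</tt> is the field element 1 and <tt>inj(2)</tt> is <tt>g</tt>, since <tt>beta_0 = 1</tt> and <tt>beta_1 = g</tt>.</t>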
</section>
</section>
</section>

<section anchor="fiat-shamir-primitives"><name>Fiat-Shamir primitives</name>
<t>A ZK protocol must in general be interactive whereby the Prover and Verifier engage in multiple rounds of communication.  However, in practice, it is often more convenient to deploy so-called &quot;non-interactive&quot; protocols that only require a single message from Prover to Verifier.  It is possible to apply the Fiat-Shamir heuristic to transform a special class of interactive protocols into single-message protocols from Prover to Verifier.</t>
<t>The Fiat-Shamir transform is a method for generating a verifier's public coin challenges by processing the concatenation of all of the Prover's messages.   The transform can be proven to be sound when applied to an interactive protocol that is round-by-round sound and when the oracle is implemented with a hash function that satisfies a correlation-intractability property with respect to the state function implied by the round-by-round soundness.  See Theorem 5.8 of <xref target="rbr"></xref> for details.</t>
<t>In practice, whether an implementation of the random oracle satisfies this correlation-intractability property becomes an implicit assumption.  Towards that, this document adopts best practices in selecting the oracle implementation. First, the random oracle should have higher circuit depth and require more gates to compute than the circuit C that the protocol is applied to.  Furthermore, the size of the messages which are used as input to the oracle to generate the Verifier's challenges should be larger than C.  These choices are easy to implement and add very little processing time to the protocol. On the other hand, they seemingly avoid attacks against correlation-intractability in which the random oracle is computed within the ZK protocol, thereby allowing the output of the circuit to be related to the verifier's challenge.</t>
<t>As an additional property, each query to the random oracle should be able to be uniquely mapped into a protocol transcript. To facilitate this property, the type and length of each message is incorporated into the query string.</t>

<section anchor="implementation"><name>Implementation</name>
<t>Let <tt>H</tt> be a collision-resistant hash function.
A protocol consists of multiple rounds in which a Prover sends a message, and a verifier responds with a public-coin or random challenge. The Fiat-Shamir transform for such a protocol is implemented by maintaining a <tt>transcript</tt> object.</t>

<section anchor="initialization"><name>Initialization</name>
<t>At the beginning of the protocol, the transcript object must be initialized.</t>

<ul spacing="compact">
<li><tt>transcript.init(session_id)</tt>: The initialization begins by
selecting an oracle, which concretely consists of selecting a fresh
session identifier. This process is handled by the encapsulating
protocol. For example, the transcript that is used for key
exchange for a session can be used as the session identifier, as it
is guaranteed to be unique.</li>
</ul>
</section>

<section anchor="writing-to-the-transcript"><name>Writing to the transcript</name>
<t>The transcript object supports a <tt>write</tt> method that is used to record
the Prover's messages.  To produce the verifier's challenge message, the transcript object internally maintains a Fiat-Shamir Pseudo-random Function (FSPRF) object that
generates a stream of pseudo-random bytes.  Each invocation of
<tt>write</tt> creates a new FSPRF object, which we denote by <tt>fs</tt>.</t>

<ul spacing="compact">
<li><tt>transcript.write(msg)</tt>: appends the Prover's next message to
the transcript.</li>
</ul>
<t>There are three types of messages that can be appended to the transcript: a field element, an array of bytes, or an array of field elements.</t>

<ul>
<li><t>To append a field element, first the byte designator <tt>0x1</tt> is appended, and then the canonical byte serialization of the field element is appended.</t>
</li>
<li><t>To append an array of bytes, first the byte designator <tt>0x2</tt> is
appended, an 8-byte little-endian encoding of the number of bytes in
the array is appended, and then the bytes of the array are appended.</t>
</li>
<li><t>To append an array of field elements, the byte designator <tt>0x3</tt> is
added, an 8-byte little-endian encoding of the number of field
elements is appended, and finally, all of the field elements in array
order are serialized and appended.</t>
</li>
</ul>
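<t>The three encodings above can be sketched as follows. The canonical serialization of a field element is not defined in this section, so the sketch assumes a hypothetical fixed-length little-endian encoding (<tt>ELT_BYTES</tt> and <tt>ser_elt</tt> are illustrative names, not part of this specification).</t>

<sourcecode type="python"><![CDATA[
ELT_BYTES = 16  # hypothetical serialized length of a field element

def ser_elt(e):
    # Assumption: canonical form is a fixed-length little-endian int.
    return e.to_bytes(ELT_BYTES, "little")

def enc_field_element(e):
    """Designator 0x1, then the serialized element."""
    return b"\x01" + ser_elt(e)

def enc_bytes(b):
    """Designator 0x2, 8-byte little-endian length, then the bytes."""
    return b"\x02" + len(b).to_bytes(8, "little") + b

def enc_field_elements(es):
    """Designator 0x3, 8-byte little-endian count, then the elements."""
    body = b"".join(ser_elt(e) for e in es)
    return b"\x03" + len(es).to_bytes(8, "little") + body
]]></sourcecode>
<t>Because each message starts with its type and, for arrays, its length, the concatenation of encoded messages can be parsed back into a unique sequence of messages.</t>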
</section>

<section anchor="special-rules-for-the-first-message"><name>Special rules for the first message</name>
<t>The <tt>write</tt> method for the first prover message incorporates
additional steps that enhance the correlation-intractability property
of the oracle.  To process the Prover's first message (which is usually a
commitment):</t>

<ol spacing="compact">
<li>The Prover message is appended to the transcript. Specifically, the length of the message, as per the above convention, is appended, and then the bytes of the message are appended.</li>
<li>Next, an encoding of the statement to be proven, which consists of
the circuit identifier, and a serialization of the input and
output of the statement is appended. Each of these three messages is added as
a byte sequence, with its length appended as per the above convention.</li>
<li>Finally, the transcript is augmented by the byte-array 0<sup>|C|</sup>,
which consists of |C| bytes of zeroes.</li>
</ol>
<t>One might at first think of performing steps 2 and 3 first so as to
simplify the description of the protocol, and moreover step 3 may
appear to be unnecessary.  Performing the steps in the indicated order
protects against the attack described in <xref target="krs"></xref>, under the assumption
that it is infeasible for a circuit C that contains |C| arithmetic
gates to compute the hash of a string of length |C|.</t>
<t>Subsequent calls to the <tt>write</tt> method are used to record the Prover's
response messages <tt>msg</tt>. In this case, the message is appended
following the conventions described above.</t>
</section>
</section>

<section anchor="the-fsprf-object"><name>The FSPRF object</name>
<t>Each <tt>write</tt> internally creates an FSPRF object <tt>fs</tt> that is seeded
with the hash digest of the transcript at the end of the write
operation.</t>
<t>The FSPRF object is defined to produce an infinite stream of bytes that can be used to sample all of the verifier's challenges in this round. The stream is organized in blocks of 16 bytes each,
numbered consecutively starting at 0.  Block <tt>i</tt> contains</t>

<artwork><![CDATA[    AES256(KEY, ID(i))
]]></artwork>
<t>where <tt>KEY</tt> is the seed of the FSPRF object, and <tt>ID(i)</tt> is the
16-byte little-endian representation of integer <tt>i</tt>.</t>
<t>The FSPRF object supports a <tt>bytes</tt> method:</t>

<ul spacing="compact">
<li><tt>b = fs.bytes(n)</tt> returns the next <tt>n</tt> bytes in the stream.</li>
</ul>
<t>Thus, <tt>fs</tt> implicitly maintains an index into the next position in
the stream.  Calls to <tt>bytes</tt> without an intervening <tt>write</tt> read
pseudo-random bytes from the same stream.</t>
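<t>The block structure and the implicit stream position can be illustrated with the following sketch. To remain runnable with only the Python standard library, it does NOT use AES-256: the function <tt>standin_block</tt> is a SHA-256-based placeholder, and a conforming implementation would compute <tt>AES256(KEY, ID(i))</tt> instead. All names in the sketch are hypothetical.</t>

<sourcecode type="python"><![CDATA[
import hashlib

def standin_block(key, block_id):
    # NOT AES-256: placeholder so the sketch runs without extra
    # libraries. A real FSPRF computes AES256(key, block_id).
    return hashlib.sha256(key + block_id).digest()[:16]

class FSPRF:
    def __init__(self, seed, block_fn=standin_block):
        self.key = seed           # hash digest of the transcript
        self.block_fn = block_fn
        self.pos = 0              # offset of the next unread byte

    def bytes(self, n):
        """Return the next n bytes of the stream."""
        out = bytearray()
        while len(out) < n:
            index, offset = divmod(self.pos, 16)
            # ID(i): 16-byte little-endian encoding of the block index
            block_id = index.to_bytes(16, "little")
            block = self.block_fn(self.key, block_id)
            take = min(16 - offset, n - len(out))
            out += block[offset:offset + take]
            self.pos += take
        return bytes(out)
]]></sourcecode>
<t>With this structure, reading 5 bytes and then 20 bytes yields the same data as a single read of 25 bytes from a fresh object with the same seed.</t>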
</section>

<section anchor="generating-challenges"><name>Generating challenges</name>
<t>When the prover has finished sending messages for a round in the interactive
protocol, it can make a sequence of calls to <tt>transcript.generate_{nat,field_element,challenge}</tt> to obtain the Verifier's random challenges.</t>
<t>The <tt>bytes</tt> method of the FSPRF is used by the transcript object to sample pseudo-random field elements and
pseudo-random integers via rejection sampling as follows:</t>

<ul spacing="compact">
<li><tt>transcript.generate_nat(m)</tt> generates a random natural between <tt>0</tt> and
<tt>m-1</tt> inclusive, as follows.</li>
</ul>
<t>Let <tt>l</tt> be minimal such that <tt>2^l &gt;= m</tt>.  Let <tt>nbytes = ceil(l / 8)</tt>.
  Let <tt>b = fs.bytes(nbytes)</tt>.  Interpret bytes <tt>b</tt> as a little-endian
  integer <tt>k</tt>.  Let <tt>r = k mod 2^l</tt>, i.e., mask off the high <tt>8 * nbytes - l</tt>
  bits of <tt>k</tt>.  If <tt>r &lt; m</tt> return <tt>r</tt>, otherwise start over.</t>

<ul spacing="compact">
<li><tt>transcript.generate_field_element(F)</tt> generates a field element.</li>
</ul>
<t>If the field <tt>F</tt> is <tt>Z / (p)</tt>, return <tt>transcript.generate_nat(p)</tt> interpreted
  as a field element.</t>
<t>If the field is <tt>GF(2)[X] / (X^128 + X^7 + X^2 + X + 1)</tt> obtain
  <tt>b = fs.bytes(16)</tt> and interpret the 128 bits of <tt>b</tt> as a little-endian
  polynomial.  This document does not specify the generation of
  a field element for other binary fields, but extensions SHOULD follow
  a similar pattern.</t>

<ul spacing="compact">
<li><tt>a = transcript.generate_challenge(F, n)</tt> generates an array of <tt>n</tt>
field elements in the straightforward way: for <tt>0 &lt;= i &lt; n</tt>
in ascending order, set <tt>a[i] = transcript.generate_field_element(F)</tt>.</li>
</ul>
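<t>The rejection sampling procedure for prime fields can be sketched as follows. In this illustration, <tt>read_bytes</tt> is a hypothetical callable standing in for the <tt>bytes</tt> method of the FSPRF; the usage shown after the code substitutes <tt>os.urandom</tt> only to make the sketch self-contained.</t>

<sourcecode type="python"><![CDATA[
import os

def generate_nat(m, read_bytes):
    """Sample uniformly from 0..m-1 by rejection sampling.

    read_bytes(n) is assumed to return the next n stream bytes.
    """
    l = (m - 1).bit_length()      # minimal l such that 2^l >= m
    nbytes = (l + 7) // 8         # ceil(l / 8)
    while True:
        k = int.from_bytes(read_bytes(nbytes), "little")
        r = k & ((1 << l) - 1)    # k mod 2^l: mask off the high bits
        if r < m:
            return r              # otherwise start over

def generate_field_element(p, read_bytes):
    """Field element of Z/(p), represented as an int in 0..p-1."""
    return generate_nat(p, read_bytes)

def generate_challenge(p, n, read_bytes):
    """Array of n field elements, sampled in ascending index order."""
    return [generate_field_element(p, read_bytes) for _ in range(n)]

# Usage with a stand-in byte source (not the FSPRF):
challenge = generate_challenge(2**31 - 1, 4, os.urandom)
]]></sourcecode>
<t>Masking to <tt>l</tt> bits before the comparison keeps the rejection probability below one half for every <tt>m</tt>.</t>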
</section>
</section>

<section anchor="ligero-zk-proof"><name>Ligero ZK Proof</name>
<t>This section specifies the construction and verification method for a Ligero commitment and zero-knowledge argument. The Ligero system, as described by Ames, Hazay, Ishai, and Venkitasubramaniam <xref target="ligero"></xref>, consists of a commitment scheme and a method for proving linear and quadratic constraints on the committed values in zero-knowledge. The latter interface is sufficient to prove arbitrary circuits, but in the Longfellow scheme, it suffices to describe how to use such constraints to directly verify an IP transcript.</t>

<section anchor="merkle-trees"><name>Merkle trees</name>
<t>This section describes how to construct a Merkle tree from a sequence of <tt>n</tt> strings, and how to verify that a given string <tt>x</tt> was placed at leaf <tt>i</tt> in a Merkle tree. These methods do not assume that <tt>n</tt> is a power of two. This construction is parameterized by a cryptographic hash function such as SHA-256 <xref target="RFC6234"></xref>.  In this application, a leaf in a tree is a message digest instead of an arbitrary string; for example, if the hash function is SHA-256, then the leaf is a 32-byte string.</t>
<t>A tree that contains <tt>n</tt> leaves is represented by an array of <tt>2 * n</tt> message digests in which the input digests are written at indices <tt>n..(2*n - 1)</tt>.  The tree is constructed by iteratively hashing the concatenation of the values at indices <tt>2*j</tt> and <tt>2*j+1</tt>, starting at <tt>j=n-1</tt>, and continuing until <tt>j=1</tt>. The root is at index 1. In this specification, the prover and verifier will already know the value of <tt>n</tt> when they produce or verify a Merkle tree.</t>

<section anchor="constructing-a-merkle-tree-from-n-digests"><name>Constructing a Merkle tree from <tt>n</tt> digests</name>

<artwork><![CDATA[struct {
   int n             // number of leaves
   Digest a[2 * n]   // a[1] is the root, a[n..2n-1] are the leaves
} MerkleTree

def set_leaf(M, pos, leaf) {
  assert(pos < M.n)
  M.a[pos + M.n] = leaf
}

def build_tree(M) {
  FOR M.n > i >= 1 DO      // i descends from n-1 to 1
    M.a[i] = hash(M.a[2 * i] || M.a[2 * i + 1])
  return M.a[1]
}
]]></artwork>
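<t>For illustration, the same construction can be written as a short runnable sketch. It assumes SHA-256 as the hash function and represents the tree directly as a Python list; the function names are illustrative only.</t>

<sourcecode type="python"><![CDATA[
import hashlib

def H(data):
    """SHA-256, used here as the tree's hash function."""
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """leaves: list of n digests.

    Returns the array a of 2n entries: a[1] is the root and
    a[n..2n-1] are the leaves; a[0] is unused.
    """
    n = len(leaves)
    a = [None] * n + list(leaves)
    for i in range(n - 1, 0, -1):      # i = n-1 down to 1
        a[i] = H(a[2 * i] + a[2 * i + 1])
    return a

# Example with n = 3, which is not a power of two.
leaves = [H(b"a"), H(b"b"), H(b"c")]
tree = build_tree(leaves)
root = tree[1]
]]></sourcecode>
<t>With three leaves, index 2 holds the hash of the leaves at indices 4 and 5, and the root at index 1 is the hash of index 2 followed by the leaf at index 3.</t>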
</section>

<section anchor="constructing-a-proof-of-inclusion"><name>Constructing a proof of inclusion</name>
<t>This section describes how to construct a Merkle proof that <tt>k</tt> input digests at indices <tt>i[0],...,i[k-1]</tt> belong to the tree.  The simplest way to generate such a proof is to produce independent proofs for each of the <tt>k</tt> leaves. However, this turns out to be wasteful in that internal nodes may be included multiple times along different paths, and some nodes may not need to be included at all because they are implied by nodes that have already been included.</t>
<t>To address these inefficiencies, this section explains how to produce a batch proof of inclusion for <tt>k</tt> leaves. The main idea is to start from the requested set of leaves and build all of the implied internal nodes given the leaves. For example, if sibling leaves are included, then their parent is implied, and the parent need not be included in the compressed proof.  Then it suffices to revisit the same tree and include the necessary siblings along all of the Merkle paths.  It is assumed that the verifier already has the leaf digests that are at the indices, and thus the proof only contains the necessary internal nodes of the Merkle tree that are used to verify the claim.</t>
<t>It is important in this formulation to treat the input digests as a sequence, i.e. with a given order. Both the prover and verifier of this batch proof must use the same order of the <tt>requested_leaves</tt> array.</t>

<artwork><![CDATA[def compressed_proof(M, requested_leaves[], n) {
  proof = []
  marked = mark_tree(requested_leaves, n)
  FOR n > i >= 1 DO      // i descends from n-1 to 1
    IF (marked[i]) {
      // select the child of i that is not implied by the request
      child = 2 * i
      IF (marked[child]) {
        child += 1
      }
      IF (!marked[child]) {
        proof.append(M.a[child])
      }
    }
  return proof
}

def mark_tree(requested_leaves[], n) {
  bool marked[2 * n]   // initialized to false

  for(index i : requested_leaves)
    marked[i + n] = true

  FOR n > i >= 1 DO      // i descends from n-1 to 1
    // mark parent if child is marked
    marked[i] = marked[2 * i] || marked[2 * i + 1];

  return marked
}
]]></artwork>
</section>

<section anchor="verifying-a-proof-of-inclusion"><name>Verifying a proof of inclusion</name>
<t>This section describes how to verify a compressed Merkle proof. The claim to verify is that &quot;the commitment <tt>root</tt> defines an <tt>n</tt>-leaf Merkle tree that contains <tt>k</tt> digests s[0],...,s[k-1] at corresponding indices i[0],...,i[k-1].&quot;  The strategy of this verification procedure is to deduce which nodes are needed along the <tt>k</tt> verification paths from index to root, then read these values from the purported proof, and then recompute the Merkle tree and check the consistency of the <tt>root</tt> digest. As an optimization, the <tt>defined[]</tt> array avoids recomputing internal portions of the Merkle tree that are not relevant to the verification. By convention, a proof for the degenerate case of <tt>k=0</tt> digests is defined to fail. It is assumed that the <tt>indicies[]</tt> array does not contain duplicates.</t>

<artwork><![CDATA[def verify_merkle(root, n, k, s[], indicies[], proof[]) {
  // by convention, a proof for k = 0 digests fails
  if k == 0 {
    return false
  }

  tmp = []
  defined = []

  proof_index = 0
  marked = mark_tree(indicies, n)
  FOR n > i >= 1 DO      // i descends from n-1 to 1
    if (marked[i]) {
      child = 2 * i
      if (marked[child]) {
        child += 1
      }
      if (!marked[child]) {
        if proof_index >= |proof| {
          return false
        }
        tmp[child] = proof[proof_index++]
        defined[child] = true
      }
    }

  FOR 0 <= i < k DO
    tmp[indicies[i] + n] = s[i]
    defined[indicies[i] + n] = true

  FOR n > i >= 1 DO      // i descends from n-1 to 1
    if defined[2 * i] && defined[2 * i + 1] {
      tmp[i] = hash(tmp[2 * i] || tmp[2 * i + 1])
      defined[i] = true
    }

  return defined[1] && tmp[1] == root
}
]]></artwork>
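<t>The batch proof and its verification can be exercised end to end with the following sketch. It assumes SHA-256 as the hash function, represents the tree as a Python list, and uses <tt>None</tt> in place of the <tt>defined[]</tt> array; these are choices made for the illustration, not requirements of this specification.</t>

<sourcecode type="python"><![CDATA[
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    n = len(leaves)
    a = [None] * n + list(leaves)
    for i in range(n - 1, 0, -1):
        a[i] = H(a[2 * i] + a[2 * i + 1])
    return a

def mark_tree(requested, n):
    marked = [False] * (2 * n)
    for i in requested:
        marked[i + n] = True
    for i in range(n - 1, 0, -1):
        marked[i] = marked[2 * i] or marked[2 * i + 1]
    return marked

def needed_siblings(requested, n):
    """Indices of the nodes a proof must supply, in proof order."""
    marked = mark_tree(requested, n)
    out = []
    for i in range(n - 1, 0, -1):
        if marked[i]:
            child = 2 * i
            if marked[child]:
                child += 1
            if not marked[child]:
                out.append(child)
    return out

def compressed_proof(a, requested, n):
    return [a[c] for c in needed_siblings(requested, n)]

def verify_merkle(root, n, s, requested, proof):
    if not requested:                 # k = 0 fails by convention
        return False
    needed = needed_siblings(requested, n)
    if len(proof) < len(needed):      # proof is too short
        return False
    tmp = [None] * (2 * n)            # None plays the role of !defined
    for c, digest in zip(needed, proof):
        tmp[c] = digest
    for idx, digest in zip(requested, s):
        tmp[idx + n] = digest
    for i in range(n - 1, 0, -1):
        if tmp[2 * i] is not None and tmp[2 * i + 1] is not None:
            tmp[i] = H(tmp[2 * i] + tmp[2 * i + 1])
    return tmp[1] is not None and tmp[1] == root

# Round trip: 5 leaves, open leaves 0 and 3.
leaves = [H(bytes([i])) for i in range(5)]
tree = build_tree(leaves)
req = [0, 3]
proof = compressed_proof(tree, req, 5)
ok = verify_merkle(tree[1], 5, [leaves[0], leaves[3]], req, proof)
]]></sourcecode>
<t>In this example the proof contains two digests, the sibling leaf at array index 9 and the internal node at index 3; every other node on the two paths is recomputed by the verifier.</t>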
</section>
</section>

<section anchor="common-parameters"><name>Common parameters</name>
<t>The Prover and Verifier in Ligero must agree on the following parameters. These parameters can be agreed upon out of band.</t>

<ul spacing="compact">
<li><tt>F</tt>: The finite field over which the commit is produced.</li>
<li><tt>NREQ</tt>: The number of columns of the commitment matrix that the Verifier requests to be revealed by the Prover.</li>
<li><tt>rate</tt>: The inverse rate of the error correcting code. This parameter, along with <tt>NREQ</tt> and Field size, determines the soundness of the scheme.</li>
<li><tt>BLOCK</tt>: the size of each row, in terms of number of field elements</li>
<li><tt>DBLOCK</tt>: 2 * <tt>BLOCK</tt> - 1</li>
<li><tt>WR</tt>: the number of witness values included in each row.</li>
<li><tt>QR</tt>: the number of quadratic constraints written in each row</li>
<li><tt>IW</tt>: Row index at which the witness values start, usually IW = 3.</li>
<li><tt>IQ</tt>: Row index at which the quadratic constraints begin, it is the first row after all of the witnesses have been encoded.</li>
<li><tt>NL</tt>: Number of linear constraints.</li>
<li><tt>NQ</tt>: Number of quadratic constraints.</li>
<li><tt>NWROW</tt>: Number of rows used to encode witnesses.</li>
<li><tt>NQT</tt>: Number of row triples needed to encode the quadratic constraints.</li>
<li><tt>NQW</tt>: <tt>NWROW + NQT</tt>, rows needed to encode witnesses and quadratic constraints.</li>
<li><tt>NROW</tt>: Total number of rows in the witness matrix, <tt>NQW + 2</tt></li>
<li><tt>NCOL</tt>: Total number of columns in the tableau matrix.</li>
</ul>
<t>A row of the tableau consists of</t>
<artwork><![CDATA[|     NREQ     |        WR          |  ... DBLOCK ... NCOL   |
|  random pad  |   witness values   | polynomial evaluations |
]]></artwork>

<section anchor="constraints-on-parameters"><name>Constraints on parameters</name>

<ul>
<li><tt>BLOCK &lt; |F|</tt> The block size must be smaller than the field size.</li>
<li><tt>BLOCK &gt; NREQ</tt> The block size must be larger than the number of columns requested.</li>
<li><t><tt>BLOCK = NREQ + WR</tt></t>
</li>
<li><t><tt>BLOCK &gt;= 2 * (NREQ + QR) + (NREQ + WR) - 2</tt></t>
</li>
<li><t><tt>WR &gt;= QR</tt>.</t>
</li>
<li><t><tt>BLOCK &gt;= 2 * (NREQ + WR) - 1</tt>.</t>
</li>
<li><t><tt>QR &gt;= NREQ</tt> (and thus <tt>WR &gt;= NREQ</tt>) to avoid wasting too much space.</t>
</li>
</ul>
</section>
</section>

<section anchor="ligero-commitment"><name>Ligero commitment</name>
<t>The first step of the proof procedure requires the Prover to commit to a witness vector <tt>W</tt>.  The witness vector is assumed to be padded with zeros at the end so that its length is an even multiple of <tt>WR</tt>. The commitment is the root of a Merkle tree. The leaves of the Merkle tree are a sequence of columns of the tableau matrix <tt>T[][]</tt>.</t>
<t>This tableau matrix is constructed row-by-row by applying the extend procedure to arrays that are formed from random field elements and elements copied from the witness vector. Matrix T[][] has size NROW x NCOL and has the following structure:</t>

<artwork><![CDATA[row ILDT = 0                         : RANDOM row for low-degree test
row IDOT = 1                         : RANDOM row for linear test
row IQD  = 2                         : RANDOM row for quadratic test
row i for IW = IQD + 1 <= i < IQ     : witness rows
row i for IQ <= i < NROW             : quadratic rows
]]></artwork>

<ol type="%d)">
<li><t>The first ILDT row is defined as</t>

<artwork><![CDATA[extend(RANDOM[BLOCK], BLOCK, NCOL)
]]></artwork>
<t>by selecting BLOCK random field elements and applying extend.</t>
</li>
<li><t>The second IDOT row is defined as</t>

<artwork><![CDATA[Z = RANDOM[DBLOCK] such that sum_{i = NREQ ... NREQ + WR - 1} Z_i = 0
extend(Z, DBLOCK, NCOL)
]]></artwork>
<t>by first selecting DBLOCK random field elements such that the subarray
from index NREQ to NREQ + WR sums to 0 and then applying extend.
The first step can be performed by selecting DBLOCK-1 random
field elements, and then setting the last element of that range, at index NREQ + WR - 1, to the additive inverse of the sum of the elements at indices NREQ...NREQ + WR - 2.</t>
</li>
<li><t>The third IQD row is defined as</t>

<artwork><![CDATA[ZQ = RANDOM[DBLOCK]
ZQ[NREQ ... NREQ + WR - 1] = 0
extend(ZQ, DBLOCK, NCOL)
]]></artwork>
<t>by first selecting DBLOCK random field elements, then setting the
portion corresponding to the witness values to 0, and then applying extend.</t>
</li>
<li><t>The next rows from IW=3,...,IQ are <em>padded witness</em> rows that contain
random elements and portions of the witness vector.
Specifically, row i is formed by applying <tt>extend</tt> to an array that
consists of <tt>NREQ</tt> random elements and then <tt>WR</tt> elements from the vector <tt>W</tt>:</t>

<artwork><![CDATA[extend([RANDOM[NREQ], W[(i-IW) * WR .. (i-IW+1) * WR]], BLOCK, NCOL)
]]></artwork>
<t>When the finite field contains a subfield, and if all of the witness elements in a given row are elements from this subfield, then the randomness for that row can also be chosen from the subfield.
Consequently, the <tt>extend</tt> method for that row produces polynomial evaluations that are elements of the subfield. When these elements are serialized, they will require less space.
The simplest way to apply this optimization is for the committing process to maintain an index <tt>SF</tt> such that witnesses at indices <tt>0..SF</tt> belong to the subfield, and the rest do not. This value <tt>SF</tt> can be conveyed to the verifier as part of the proof, or part of the circuit.</t>
</li>
<li><t>The final portion of the witness matrix consists of <em>padded quadratic</em> rows
that consist of NREQ random elements and WR quadratic constraint elements:</t>

<artwork><![CDATA[extend([RANDOM[NREQ], QX[WR]], BLOCK, NCOL)
extend([RANDOM[NREQ], QY[WR]], BLOCK, NCOL)
extend([RANDOM[NREQ], QZ[WR]], BLOCK, NCOL)
]]></artwork>
<t>The specific elements in the QX, QY, QZ array are determined by the quadratic
constraints on the witness values that are verified by the proof.</t>
</li>
</ol>
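<t>The construction of the two constrained random rows, before <tt>extend</tt> is applied, can be sketched as follows. The prime modulus <tt>P</tt> and the function names are assumptions made for the illustration, and Python's <tt>secrets</tt> module stands in for the source of random field elements.</t>

<sourcecode type="python"><![CDATA[
import secrets

P = 2**31 - 1  # hypothetical prime field modulus, for illustration

def random_row(n, p=P):
    """n uniformly random elements of Z/(p)."""
    return [secrets.randbelow(p) for _ in range(n)]

def idot_row(NREQ, WR, DBLOCK, p=P):
    """Random Z whose entries at NREQ..NREQ+WR-1 sum to 0 mod p."""
    Z = random_row(DBLOCK, p)
    s = sum(Z[NREQ:NREQ + WR - 1]) % p   # all but the last position
    Z[NREQ + WR - 1] = (-s) % p          # additive inverse of the sum
    return Z

def iqd_row(NREQ, WR, DBLOCK, p=P):
    """Random ZQ with the witness positions set to 0."""
    ZQ = random_row(DBLOCK, p)
    ZQ[NREQ:NREQ + WR] = [0] * WR
    return ZQ

# Example sizes: BLOCK = NREQ + WR = 5, DBLOCK = 2 * BLOCK - 1 = 9.
Z = idot_row(2, 3, 9)
ZQ = iqd_row(2, 3, 9)
]]></sourcecode>
<t>Each resulting array of DBLOCK elements would then be passed to <tt>extend</tt> to obtain the corresponding tableau row of NCOL elements.</t>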
<t>The second step of the procedure is to compute a Merkle tree on columns
of the tableau matrix. Specifically, for 0 &lt;= i &lt; NCOL - DBLOCK, the i-th leaf of the tree is defined
to be the hash of the concatenation of the entries of column DBLOCK + i of the tableau T, taken in row order.</t>
<t>Input:</t>

<ul spacing="compact">
<li>The witness vector <tt>W</tt>.</li>
<li>Array of quadratic constraints <tt>lqc[]</tt>, which consists of triples <tt>(x,y,z)</tt> that represent the constraint that <tt>W[x] * W[y] = W[z]</tt>.</li>
</ul>
<t>Output:</t>

<ul spacing="compact">
<li>A digest; root of a Merkle tree formed from columns of the tableau.</li>
</ul>

<artwork><![CDATA[def commit(W[], lqc[]) {
    T[NROW][NCOL] = [0];   // 2d array initialized with 0

    layout_zk_rows(T);
    layout_witness_rows(T, W);
    layout_quadratic_rows(T, W, lqc);

    // one leaf per column j in DBLOCK..NCOL; leaf indices start at 0
    MerkleTree M;
    FOR DBLOCK <= j < NCOL DO
      M.set_leaf(j - DBLOCK,
          hash( T[0][j] || T[1][j] || .. || T[NROW - 1][j]) );

    return M.build_tree();
}

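def layout_zk_rows(T) {
    // row 0: random pad used by the low-degree test
    T[0][0..NCOL] = extend(random_row(BLOCK), BLOCK, NCOL);

    // row 1: random pad whose witness positions sum to zero
    Z = random_row(DBLOCK)
    s = SUM_{i = NREQ ... NREQ + WR - 2} Z_i
    Z[NREQ + WR - 1] = -s
    T[1][0..NCOL] = extend(Z, DBLOCK, NCOL)

    // row 2: random pad whose witness positions are zero
    ZQ = random_row(DBLOCK)
    ZQ[NREQ ... NREQ + WR - 1] = 0
    T[2][0..NCOL] = extend(ZQ, DBLOCK, NCOL)
}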
]]></artwork>

<artwork><![CDATA[def layout_witness_rows(T, W) {

  // witness rows occupy tableau rows IW..IQ
  FOR IW <= i < IQ DO
    k = i - IW    // index of this block of WR witnesses
    bool subfield = false;

    IF W[k * WR .. (k+1) * WR] are all in the subfield {
      subfield = true;
    }

    row[0..NREQ] = random_row(NREQ, subfield)
    row[NREQ..BLOCK] = W[k * WR .. (k+1) * WR]

    T[i][0..NCOL] = extend(row, BLOCK, NCOL)
}
]]></artwork>

<artwork><![CDATA[def layout_quadratic_rows(T, W, lqc[]) {
    FOR 0 <= i < NQT DO
      qx[0..NREQ] = random_row(NREQ)
      qy[0..NREQ] = random_row(NREQ)
      qz[0..NREQ] = random_row(NREQ)

      FOR 0 <= j < QR DO
        IF (j + i * QR < NQ)
          (x, y, z) = lqc[j + i * QR]
          assert( W[x] * W[y] == W[z] )
          qx[NREQ + j] = W[x]
          qy[NREQ + j] = W[y]
          qz[NREQ + j] = W[z]

      // x rows start at IQ, y rows at IQ + NQT, z rows at
      // IQ + 2 * NQT, matching the indexing in quadratic_proof
      T[IQ + i          ][0..NCOL] = extend(qx, BLOCK, NCOL)
      T[IQ + NQT + i    ][0..NCOL] = extend(qy, BLOCK, NCOL)
      T[IQ + 2 * NQT + i][0..NCOL] = extend(qz, BLOCK, NCOL)
}
]]></artwork>
</section>

<section anchor="ligero-prove"><name>Ligero Prove</name>
<t>This section specifies how a Ligero proof for a given sequence of linear constraints and quadratic constraints on the committed witness vector <tt>W</tt> is constructed. The proof consists of a low-degree test on the tableau, a linearity test, and a quadratic constraint test.</t>

<section anchor="low-degree-test"><name>Low-degree test</name>
<t>In the low-degree test, the verifier sends a challenge vector consisting of <tt>NROW</tt> field elements, <tt>u[0..NROW]</tt>.  This challenge is generated via the Fiat-Shamir transform. The prover computes the sum of <tt>u[i]*T[i]</tt> where <tt>T[i]</tt> is the i-th row of the tableau, and returns the first BLOCK elements of the result. The verifier applies the <tt>extend</tt> method to this response, and then verifies that the extended row is consistent with the positions of the Merkle tree that the verifier will later request from the Prover.</t>
<t>The Prover's task is therefore to compute a summation. For efficiency, set <tt>u[0]=1</tt> because this first row corresponds to a random row meant to &quot;pad&quot; the witnesses for zero-knowledge.</t>
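<t>As an illustration only, the summation described above can be sketched in Python over a toy prime field. The modulus <tt>P</tt>, the function name <tt>ldt_response</tt>, and the small tableau are assumptions made for this example; they are not part of this specification, and the sketch follows the prose description rather than the exact row indexing of the prover pseudocode.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def ldt_response(T, u, block):
    """Return the first `block` entries of SUM_i u[i] * T[i] mod P."""
    out = [0] * block
    for u_i, row in zip(u, T):
        for k in range(block):
            out[k] = (out[k] + u_i * row[k]) % P
    return out

# Three rows; u[0] = 1 so the padding row is added unscaled.
T = [[5, 6, 7, 8], [1, 2, 3, 4], [10, 20, 30, 40]]
u = [1, 2, 3]
resp = ldt_response(T, u, block=2)
]]></sourcecode>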
</section>

<section anchor="linear-and-quadratic-constraints"><name>Linear and Quadratic constraints</name>
<t>The linear test is represented by a matrix <tt>A</tt>, and a vector <tt>b</tt>, and aims to verify that <tt>A*W = b</tt>.  The constraint matrix <tt>A</tt> is given as input in a sparse form: it is an array of triples <tt>(c,j,k)</tt> in which <tt>c</tt> indicates the constraint number or row of A, <tt>j</tt> represents the index of the witness or column of A, and <tt>k</tt> represents the constant factor.  For example, if the first constraint (at index 0) is <tt>W[2] + 2W[3] = 3</tt>, then the linear constraints array contains the triples <tt>(0,2,1), (0,3,2)</tt> and the <tt>b</tt> vector has <tt>b[0]=3</tt>.</t>
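<t>The sparse representation can be made concrete with a short Python sketch that checks <tt>A*W = b</tt> directly from the triples. The modulus <tt>P</tt> and the helper name <tt>check_linear</tt> are hypothetical and chosen only for this example; the actual protocol verifies a random linear combination of the constraints through the commitment rather than evaluating them on a known witness.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def check_linear(triples, b, W):
    """Check A*W = b (mod P), with A given as sparse (c, j, k)
    triples: constraint c, witness index j, constant factor k."""
    acc = [0] * len(b)
    for c, j, k in triples:
        acc[c] = (acc[c] + k * W[j]) % P
    return acc == [x % P for x in b]

# The constraint from the text: W[2] + 2*W[3] = 3.
triples = [(0, 2, 1), (0, 3, 2)]
b = [3]
ok = check_linear(triples, b, W=[0, 0, 1, 1])    # 1 + 2*1 = 3
bad = check_linear(triples, b, W=[0, 0, 1, 2])   # 1 + 2*2 = 5
]]></sourcecode>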
<t>The quadratic constraints are given as input in an array <tt>lqc[]</tt> that contains triples <tt>(x,y,z)</tt>; one such triple represents the constraint that <tt>W[x] * W[y] = W[z]</tt>. To process quadratic constraints, tableau <tt>T</tt> is augmented with 3 extra rows, called <tt>Qx</tt>, <tt>Qy</tt>, and <tt>Qz</tt> which hold <em>copied</em> witnesses and their products. If the <tt>i</tt>-th quadratic constraint is <tt>(x,y,z)</tt>, then the prover sets <tt>Qx[i] = W[x]</tt>, <tt>Qy[i] = W[y]</tt> and <tt>Qz[i] = W[x] * W[y]</tt>. Next, the prover adds a linear constraint that <tt>Qx[i] - W[x] = 0</tt>, <tt>Qy[i] - W[y] = 0</tt> and <tt>Qz[i] - W[z] = 0</tt> to ensure that the copied witness is consistent.</t>
<t>In this sense, the quadratic constraints are reduced to linear constraints plus one additional requirement: the verifier must check that each entry of the <tt>Qz</tt> row is the product of its counterparts in the <tt>Qx</tt> and <tt>Qy</tt> rows.</t>
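<t>A minimal Python sketch of this reduction follows. It builds the copied rows and the copy constraints for a list of <tt>(x,y,z)</tt> triples over a toy prime field. The names <tt>reduce_quadratic</tt> and <tt>copies_hold</tt>, the modulus <tt>P</tt>, and the tuple encoding of a copy constraint are assumptions of the example, not part of this specification.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def reduce_quadratic(W, lqc):
    """Build Qx, Qy, Qz and the copy constraints for each (x, y, z).

    A copy constraint (row, i, w) stands for row[i] - W[w] = 0.
    """
    Qx = [W[x] % P for x, _, _ in lqc]
    Qy = [W[y] % P for _, y, _ in lqc]
    Qz = [(W[x] * W[y]) % P for x, y, _ in lqc]
    copies = []
    for i, (x, y, z) in enumerate(lqc):
        copies += [("Qx", i, x), ("Qy", i, y), ("Qz", i, z)]
    return Qx, Qy, Qz, copies

def copies_hold(W, rows, copies):
    """Check every linear copy constraint against the witness."""
    return all(rows[name][i] == W[w] % P for name, i, w in copies)

W = [3, 4, 12, 5]
lqc = [(0, 1, 2)]          # W[0] * W[1] = W[2]
Qx, Qy, Qz, copies = reduce_quadratic(W, lqc)
rows = {"Qx": Qx, "Qy": Qy, "Qz": Qz}
]]></sourcecode>
<t>In this sketch a witness that violates <tt>W[x] * W[y] = W[z]</tt> is caught by the copy constraint on <tt>Qz</tt>, because <tt>Qz[i]</tt> is set to the product while the constraint compares it with <tt>W[z]</tt>.</t>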
</section>

<section anchor="selection-of-challenge-indicies"><name>Selection of challenge indices</name>
<t>The last step of the prove method is for the verifier to select a subset of unique indices (i.e., they are sampled without replacement) from the range <tt>DBLOCK...NCOL</tt> and request that the prover open these columns of tableau <tt>T</tt>. These opened columns are then used to verify consistency with the previous messages sent by the prover.</t>
</section>

<section anchor="ligero-prover-procedure"><name>Ligero Prover procedure</name>

<artwork><![CDATA[def prove(transcript, digest, linear[], lqc[])  {

    u = transcript.generate_challenge([BLOCK]);
    transcript.write(digest)

    ldt[0..BLOCK] = T[ILDT][0..BLOCK]

    for(i=3; i < NROW; ++i) {
      ldt[0..BLOCK] += u[i] * T[i][0..BLOCK]
    }

    alpha_l = transcript.generate_challenge([NL]);
    alpha_q = transcript.generate_challenge([NQ,3]);

    A = inner_product_vector(linear, alpha_l, lqc, alpha_q);

    dot = dot_proof(A);
    u_quad = transcript.generate_quad()

    qpr = quadratic_proof(lqc, u_quad)

    transcript.write(ldt);
    transcript.write(dot);
    transcript.write(qpr);

    challenge_indicies = transcript.generate_challenge([NREQ]);

    columns = requested_columns(challenge_indicies);

    mt_proof = M.compressed_proof(challenge_indicies);

    return (ldt, dot, qpr, columns, mt_proof)
  }
]]></artwork>

<artwork><![CDATA[Input:
- linear: array of (c,w,k) triples (constraint index, witness index,
          constant factor) specifying the linear constraints
- alpha_l: array of challenges for the linear constraints
- lqc: array of (x,y,z) triples specifying the quadratic constraints
- alpha_q: array of challenges for the quadratic constraints

Output:
- A: a vector of size WR x NROW that contains the combined 
     witness constraints.
     The first NW * W positions correspond to coefficients
     for the linear constraints on witnesses.
     The next 3*NQ positions correspond to coefficients
     for the quadratic constraints.

def inner_product_vector(linear, alpha_l, lqc, alpha_q) {
  A = [0]

  // random linear combinations of the linear constraints
  FOR 0 <= i < NL DO
    assert(linear[i].w < NW)
    assert(linear[i].c < NL)
    A[ linear[i].w ] += alpha_l[ linear[i].c ] * linear[i].k

  // pointers to terms for quadratic constraints
  a_x = NW * W
  a_y = NW * W + (NQ * W)
  a_z = NW * W + 2 * (NQ * W)

  FOR 0 <= i < NQT DO
    FOR 0 <= j < QR DO
      IF (j + i * QR < NQ)
        ilqc = j + i * QR  // index into lqc
        ia   = j + i * WR  // index into Ax,Ay,Az sub-arrays
        (x,y,z) = lqc[ilqc]

        // add constraints that the copies are correct
        A[a_x + ia] += alpha_q[ilqc][0]
        A[x]        -= alpha_q[ilqc][0]

        A[a_y + ia] += alpha_q[ilqc][1]
        A[y]        -= alpha_q[ilqc][1]

        A[a_z + ia] += alpha_q[ilqc][2]
        A[z]        -= alpha_q[ilqc][2]

  return A
}

def dot_proof(A) {
  y = T[IDOT][0..DBLOCK]

  Aext[0..BLOCK] = [0]
  FOR 0 <= i < NQW DO
    Aext[0..NREQ]  = [0]
    Aext[NREQ..NREQ + WR] = A[i * WR..(i+1) * WR]
    Af = extend(Aext, BLOCK, DBLOCK)

    axpy(DBLOCK, y[0..DBLOCK], Af[0..DBLOCK], T[i + IW][0...DBLOCK])

  return y
}

def quadratic_proof(lqc, u_quad) {

    y[0..DBLOCK] = T[IQD][0..DBLOCK]

    iqx = IQ;
    iqy = iqx + NQT
    iqz = iqy + NQT

    FOR 0 <= i < NQT {
      // y += u_quad[i] * (z[i] - x[i] * y[i])

      tmp = T[iqz + i][0..DBLOCK]

      // tmp -= x[i] \otimes y[i]
      sub(DBLOCK, tmp[0..DBLOCK],
                  mul(DBLOCK, T[iqx + i][0..DBLOCK],
                              T[iqy + i][0..DBLOCK]))

      // y += u_quad[i] * tmp
      axpy(DBLOCK, y[0..DBLOCK], u_quad[i], tmp[0..DBLOCK])
    }

    // sanity check: the witness part of y is zero
    assert(y[NREQ..BLOCK] == 0)

    // extract the non-zero parts of y
    return y[0..NREQ], y[BLOCK..DBLOCK]
}

def requested_columns(challenge_indicies) {
  cols = []   // array of columns of T
  FOR (index i : challenge_indicies) {
    cols.append( [ T[0..NROW][i] ] )
  }
  return cols
}


]]></artwork>
</section>
</section>

<section anchor="ligero-verification-procedure"><name>Ligero verification procedure</name>
<t>This section specifies how to verify a Ligero proof with respect to a common set of linear and quadratic constraints.</t>

<artwork><![CDATA[Input:
- commitment: the first Prover message that commits to the witness
- proof: Prover's proof
- transcript: Fiat-Shamir
- linear: array of (c,w,k) triples (constraint index, witness index,
          constant factor) specifying the linear constraints
- b: the vector b in the constraint equation A*w = b.
- lqc: array of (x,y,z) triples specifying the quadratic constraints

Output:
- a boolean

def verify(commitment, proof, transcript,
           linear[], digest, b[], lqc[]) {

  // replay the prover's transcript in the same order as prove()
  u = transcript.generate_challenge([BLOCK]);
  transcript.write(digest)
  alpha_l = transcript.generate_challenge([NL]);
  alpha_q = transcript.generate_challenge([NQ,3]);
  u_quad = transcript.generate_quad()
  transcript.write(proof.ldt);
  transcript.write(proof.dot);
  transcript.write(proof.qpr);
  challenge_indicies = transcript.generate_challenge([NREQ]);

  A = inner_product_vector(linear, alpha_l, lqc, alpha_q);

  // check the putative value of the inner product
  want_dot  = dot(NL, b, alpha_l);
  proof_dot = sum(proof.dot);

  return
    verify_merkle(commitment.root, BLOCK*RATE, NREQ,
          proof.columns, challenge_indicies, proof.mt_proof)
    AND quadratic_check(proof, challenge_indicies, u_quad)
    AND low_degree_check(proof, u, challenge_indicies)
    AND dot_check(proof, challenge_indicies, A)
    AND want_dot == proof_dot
}
]]></artwork>

<artwork><![CDATA[def quadratic_check(proof, challenge_indices, u_quad) {

  iqx = IQ;
  iqy = iqx + NQT
  iqz = iqy + NQT
  yc = proof.columns[IQD][0..NREQ]

  FOR 0 <= i < NQT {
    // yc += u_quad[i] * (z[i] - x[i] * y[i])
    tmp = proof.columns[iqz + i][0..NREQ]

    // tmp -= x[i] \otimes y[i]
    sub(NREQ, tmp[0..NREQ],
              mul(NREQ, proof.columns[iqx + i][0..NREQ],
                        proof.columns[iqy + i][0..NREQ]))

    // yc += u_quad[i] * tmp
    axpy(NREQ, yc[0..NREQ], u_quad[i], tmp[0..NREQ])
  }

  // the witness part of the prover's response is zero
  yquad = proof.qpr[0..NREQ] || 0 || proof.qpr[BLOCK...DBLOCK]
  yp = extend(yquad, DBLOCK, NCOL)

  // Verify that yp and yc agree at the challenge indices.
  want = gather(NREQ, yp, challenge_indices)
  return equal(NREQ, want, yc)
}

def low_degree_check(proof, u, challenge_indicies) {

  got = proof.columns[ILDT][0..NREQ]

  // rows 1 and 2 pad the other tests; the prover's sum starts at row 3
  FOR 3 <= i < NROW DO {
    axpy(NREQ, got, u[i], proof.columns[i][...])
  }

  row = extend(proof.ldt, BLOCK, NCOL)
  want = gather(NREQ, row, challenge_indicies)

  return equal(NREQ, got, want)
}

def dot_check(proof, challenge_indicies, A) {
  yc = proof.columns[IDOT][0..NREQ]

  Aext[0..BLOCK] = [0]
  FOR 0 <= i < NQW DO
    Aext[0..NREQ]  = [0]
    Aext[NREQ..NREQ + WR] = A[i * WR..(i+1) * WR]
    Af = extend(Aext, BLOCK, NCOL)

    Areq = gather(NREQ, Af, challenge_indicies);

    // Accumulate yc += A[i] \otimes W[i] at the opened columns.
    sum( yc, prod(NREQ, Areq[0..NREQ],
                        proof.columns[i + IW][0..NREQ]))

  row = extend(proof.dot, DBLOCK, NCOL)
  yp  = gather(NREQ, row, challenge_indicies)

  return equal(NREQ, yp, yc)
}
]]></artwork>
</section>
</section>

<section anchor="overview-of-the-longfellow-protocol"><name>Overview of the Longfellow protocol</name>
<t>The Longfellow ZK protocol utilizes two primitive operations. The first is a variant of the sumcheck protocol, modified to support zero knowledge. Informally, the non-padded sumcheck prover takes the description of a circuit and the concrete values of all the wires in the circuit, and produces a proof that all wires have been computed correctly.  The proof itself is a sequence of field elements.  The padded-variant of the sumcheck prover used in this document also takes as input a random and secret one-time pad and it outputs a &quot;padded&quot; proof such that each element in the padded proof is the difference of the element in the non-padded proof and of the element in the pad.  (The choice of &quot;difference&quot; instead of &quot;sum&quot; is a matter of convention.)</t>
<t>In this padded sumcheck variant, the verifier cannot check the proof
directly, because it cannot access the pad.  Instead of running the
sumcheck verifier directly, a commitment scheme is used to hide the
pad, and the sumcheck verifier is translated into a sequence of linear and
quadratic constraints on the inputs and the pad.  The commitment
scheme then produces a proof that the constraints are satisfied.</t>
<t>Some of the wires of the circuit are <em>inputs</em>, i.e., set outside the
circuit and not computed by the circuit itself.  Some of the inputs
are <em>public</em>, i.e., known to both parties, and some are <em>private</em>,
i.e., known only to the prover.  Sumcheck does not use the distinction
between public and private inputs, but this document distinguishes inputs
from the pad.  Conversely, the commitment scheme does not use
public inputs at all, but it does treat private inputs and the pad
equally.  These constraints motivate the following terminology.</t>

<ul spacing="compact">
<li><em>public inputs</em>: inputs to the circuit known to both
parties.</li>
<li><em>private inputs</em>: inputs to the circuit known to the
prover but not to the verifier.</li>
<li><em>inputs</em>: both public and private inputs.  When forming
an array of all inputs, the public inputs come first, followed
by the private inputs.</li>
<li><em>witnesses</em>: the private inputs and the pad.  When forming
an array of all witnesses, the private inputs come first, followed
by the pad.</li>
</ul>
<t>Thus, at a high level, the sequence of operations in the ZK
protocol is the following:</t>

<ol>
<li><t>The prover commits to all witness values.</t>
</li>
<li><t>The prover runs the padded sumcheck prover on the witness values to produce a padded proof, and sends the padded proof to the verifier.</t>
</li>
<li><t>Both the prover and the verifier take the public inputs and the
padded proof and produce a sequence of constraints.</t>
</li>
<li><t>Using the commitment scheme and the witnesses, the prover generates
a proof that the constraints from step 3 are satisfied.</t>
</li>
<li><t>The verifier uses the proof from step 4 and the constraints from
step 3 to check the constraints.</t>
</li>
</ol>
<t>Steps 2 and 3 are referred to as &quot;sumcheck&quot;, and the rest as &quot;commitment scheme&quot;.  While the classification of step 3 as &quot;sumcheck&quot; is  arbitrary, there are situations where one might want to use a commitment scheme other than the Ligero protocol specified in this document.  In this case, the &quot;commitment scheme&quot; can change while the &quot;sumcheck&quot; remains unaffected.</t>
</section>

<section anchor="sumcheck"><name>Sumcheck</name>

<section anchor="special-conventions-for-sumcheck-arrays"><name>Special conventions for sumcheck arrays</name>
<t>The square brackets <tt>A[j]</tt> denote generic array indexing.</t>
<t>For the arrays of field elements used in the sumcheck protocol,
however, it is convenient to use the conventions that follow.</t>
<t>The sumcheck array <tt>A[i]</tt> is implicitly assumed to be defined for all
nonnegative integers <tt>i</tt>, padding with zeroes as necessary.  Here,
&quot;zero&quot; is well defined because <tt>A[]</tt> is an array of field elements.</t>
<t>Arrays can be multi-dimensional, as in the three-dimensional array
<tt>Q[g, l, r]</tt>.  It is understood that the array is padded with
infinitely many zeroes in each dimension.</t>
<t>Given array <tt>A[]</tt> and field element <tt>x</tt>, the function
<tt>bind(A, x)</tt> returns the array <tt>B</tt> such that</t>

<artwork><![CDATA[  B[i] = (1 - x) * A[2 * i] + x * A[2 * i + 1]
]]></artwork>
<t>In case of multiple dimensions such as <tt>Q[g, l, r]</tt>,
always bind across the first dimension.  For example,</t>

<artwork><![CDATA[  bind(Q, x)[g, l, r] =
     (1 - x) * Q[2 * g, l, r] + x * Q[2 * g + 1, l, r]
]]></artwork>
<t>This <tt>bind</tt> can be generalized to an array of field elements as follows:</t>

<artwork><![CDATA[  bindv(A, X) =
       A                                  if X is empty
       bindv(bind(A, X[0]), X[1..])       otherwise
]]></artwork>
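<t>The following Python sketch shows one way to realize <tt>bind</tt> and <tt>bindv</tt> for one-dimensional arrays over a toy prime field. The modulus <tt>P</tt> is an assumption of the example, and finite Python lists stand in for the implicitly zero-padded arrays of this section.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def bind(A, x):
    """B[i] = (1 - x) * A[2i] + x * A[2i + 1], with A zero-padded."""
    A = list(A) + [0] * (len(A) % 2)   # pad to even length
    return [((1 - x) * A[2 * i] + x * A[2 * i + 1]) % P
            for i in range(len(A) // 2)]

def bindv(A, X):
    """Apply bind once per element of X, in order."""
    for x in X:
        A = bind(A, x)
    return A

A = [1, 2, 3, 4]
once = bind(A, 5)         # binds the least significant index bit
twice = bindv(A, [5, 7])  # a single field element remains
]]></sourcecode>
<t>Binding at <tt>x = 0</tt> or <tt>x = 1</tt> selects the even or odd entries respectively, which is a convenient sanity check for an implementation.</t>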
<t>Two-dimensional arrays can be transposed in the usual way:</t>

<artwork><![CDATA[  transpose(Q)[l, r] = Q[r, l] .
]]></artwork>
</section>

<section anchor="the-eq-array"><name>The <tt>EQ[]</tt> array</name>
<t><tt>EQ_{n}[i, j]</tt> is a special 2D array defined as</t>

<artwork><![CDATA[   EQ_{n}[i, j] = 1   if i = j and i < n
                  0   otherwise
]]></artwork>
<t>The sumcheck literature usually assumes that <tt>n</tt> is a power of 2,
but this document allows <tt>n</tt> to be an arbitrary integer.  When <tt>n</tt> is clear from
context or unimportant, the subscript is omitted, as in
<tt>EQ[i, j]</tt>.</t>
<t><tt>EQ[]</tt> is important because the general expansion</t>

<artwork><![CDATA[   V[i] = SUM_{j} EQ[i, j] V[j]
]]></artwork>
<t>commutes with binding, yielding</t>

<artwork><![CDATA[   bindv(V, X) = SUM_{j} bindv(EQ, X)[j] V[j] .
]]></artwork>
<t>That is, one way to compute <tt>bindv(V, X)</tt> is via
the dot product of <tt>V</tt> with <tt>bindv(EQ, X)</tt>.  This strategy
may or may not be advantageous in practice, but it
becomes mandatory when <tt>bindv(V, X)</tt> must be computed
via a commitment scheme that supports linear
constraints but not binding.</t>
<t>This document only uses bindings of <tt>EQ</tt> and never <tt>EQ</tt> itself,
and therefore the whole array never needs to be stored explicitly.
For <tt>n = 2^l</tt> and <tt>X</tt> of size <tt>l</tt>, <tt>bindv(EQ_{n}, X)</tt> can be computed
recursively in linear time as <tt>bindv(EQ_{n}, X) = bindeq(l, X)</tt> where</t>

<artwork><![CDATA[   bindeq(l, X) =
      LET n = 2^l
      allocate B[n]
      IF l = 0 THEN
         B[0] = 1
      ELSE
         LET A = bindeq(l - 1, X[1..])
         FOR 0 <= 2 * i < n DO
            B[2 * i]     = (1 - X[0]) * A[i]
            B[2 * i + 1] = X[0] * A[i]
         ENDFOR
      ENDIF
      return B
]]></artwork>
<t>For <tt>m &lt;= n</tt>, <tt>bindv(EQ_{n}, X)[i]</tt> and <tt>bindv(EQ_{m}, X)[i]</tt>
agree for <tt>0 &lt;= i &lt; m</tt>, and thus
<tt>bindv(EQ_{m}, X)[i]</tt> can be computed by padding <tt>m</tt> to the next power of 2
and ignoring the extra elements.
With some care, it is possible to compute <tt>bindeq()</tt>
in-place on a single array of arbitrary size <tt>m</tt> and eliminate
the recursion completely.</t>
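<t>As a sketch under stated assumptions, the recursion can be transcribed into Python and compared against a direct binding of the explicit <tt>EQ</tt> array. The modulus <tt>P</tt> and the reference helper <tt>bindv_eq_direct</tt> are hypothetical and exist only for this comparison; a real implementation would not materialize <tt>EQ</tt>.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def bindeq(l, X):
    """Transcription of the recursion: bindv(EQ_{2^l}, X)."""
    if l == 0:
        return [1]
    A = bindeq(l - 1, X[1:])
    B = [0] * (2 * len(A))
    for i, a in enumerate(A):
        B[2 * i] = ((1 - X[0]) * a) % P
        B[2 * i + 1] = (X[0] * a) % P
    return B

def bindv_eq_direct(n, X):
    """Reference: bind the first index of the explicit n-by-n EQ."""
    Q = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for x in X:
        if len(Q) % 2:
            Q.append([0] * n)          # zero padding
        Q = [[((1 - x) * a + x * b) % P
              for a, b in zip(Q[2 * i], Q[2 * i + 1])]
             for i in range(len(Q) // 2)]
    return Q[0]

X = [5, 7]
fast = bindeq(2, X)
slow = bindv_eq_direct(4, X)
]]></sourcecode>
<t>The same helper with <tt>n = 3</tt> reproduces the first three entries of <tt>bindeq(2, X)</tt>, which illustrates the padding rule stated above.</t>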

<section anchor="remark"><name>Remark</name>
<t>Let <tt>m &lt;= n</tt>, <tt>A = bindv(EQ_{m}, X)</tt> and <tt>B = bindv(EQ_{n}, X)</tt>.  It
is true that <tt>A[i] = B[i]</tt> for <tt>i &lt; m</tt>.  However, it is also true that <tt>A[i] =
0</tt> for <tt>i &gt;= m</tt>, whereas <tt>B[i]</tt> is in general nonzero.  Thus, care
must be taken when computing a further binding <tt>bindv(A, Y)</tt>,
which is in general not the same as <tt>bindv(B, Y)</tt>.  A second binding is
not needed in this document,  but certain closed-form expressions for
the binding found in the literature agree with these definitions only
when <tt>m</tt> is a power of 2.</t>
</section>
</section>

<section anchor="circuits"><name>Circuits</name>

<section anchor="layered-circuits"><name>Layered circuits</name>
<t>A circuit consists of <tt>NL</tt> <em>layers</em>.  By convention, layer <tt>j</tt>
computes wires <tt>V[j]</tt> given wires <tt>V[j + 1]</tt>, where each <tt>V[j]</tt> is an
array of field elements.  A <em>wire</em> is an element <tt>V[j][w]</tt> for some <tt>j</tt>
and <tt>w</tt>.  Thus, <tt>V[0]</tt> denotes the output wires of the entire circuit,
and <tt>V[NL]</tt> denotes the input wires.</t>
<t>A circuit is intended to check that some property of the input holds,
and by convention, the check is considered successful if all output
wires are 0, that is, if <tt>V[0][w] = 0</tt> for all <tt>w</tt>.</t>
</section>

<section anchor="quad-representation"><name>Quad representation</name>
<t>The computation of a circuit is defined by a set of <em>quads</em> <tt>Q[j]</tt>, one
per layer.  Given the output of layer <tt>j + 1</tt>, the output of layer
<tt>j</tt> is given by the following equation:</t>

<artwork><![CDATA[  V[j][g] = SUM_{l, r} Q[j][g, l, r] V[j + 1][l] V[j + 1][r] .
]]></artwork>
<t>The quad <tt>Q[j][]</tt> is thus a three-dimensional array in the indices <tt>g</tt>,
<tt>l</tt>, and <tt>r</tt> where <tt>0 &lt;= g &lt; NW[j]</tt> and <tt>0 &lt;= l, r &lt; NW[j + 1]</tt>.  In
practice, <tt>Q[j][]</tt> is sparse.</t>
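<t>To illustrate the layer equation, the sketch below evaluates one layer from a sparse list of quad entries. The modulus <tt>P</tt>, the function name <tt>eval_layer</tt>, and the convention that wire 0 carries the constant 1 are assumptions of this example and are not taken from this specification.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def eval_layer(quad, v_next, nw):
    """V[j][g] = SUM_{l,r} Q[j][g,l,r] * V[j+1][l] * V[j+1][r].

    `quad` is a sparse list of (g, l, r, value) entries.
    """
    out = [0] * nw
    for g, l, r, val in quad:
        out[g] = (out[g] + val * v_next[l] * v_next[r]) % P
    return out

# Wire 0 holds the constant 1 (an assumption of this example), so
# the term -c can be written as (-1) * V[0] * V[3].
v_next = [1, 3, 4, 12]
quad = [
    (0, 1, 2, 1),       # + V[1] * V[2]
    (0, 0, 3, P - 1),   # - V[0] * V[3]
]
out = eval_layer(quad, v_next, nw=1)   # 3 * 4 - 12
]]></sourcecode>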
<t>The specification of the circuit contains an auxiliary
vector of quantities <tt>LV[j]</tt> with the property that <tt>V[j][w] = 0</tt>
for all <tt>w &gt;= 2^{LV[j]}</tt>.  Informally, <tt>LV[j]</tt> is the number
of bits needed to name a wire at layer <tt>j</tt>, but <tt>LV[j]</tt> may
be larger than the minimum required value.</t>
</section>

<section anchor="in-circuit-assertions"><name>In-circuit assertions</name>
<t>In the libzk system, a theorem is represented by a circuit such that
the theorem is true if and only if all outputs of the circuit are
zero.  It happens in practice that many output wires are computed early
in the circuit (i.e., in a layer closer to the input), but because of
layering, they need to be copied all the way to the output layer in order
to be compared against zero.  This copy seems to introduce large
overheads in practice.</t>
<t>A special convention can mitigate this problem.  Abstractly,
a layer is represented by <em>two</em> quads <tt>Q</tt> and <tt>Z</tt>, and the
operation of the layer is described by the two equations</t>

<artwork><![CDATA[  V[j][g] = SUM_{l, r} Q[j][g, l, r] V[j + 1][l] V[j + 1][r]
       0  = SUM_{l, r} Z[j][g, l, r] V[j + 1][l] V[j + 1][r]
]]></artwork>
<t>Thus, the <tt>Z</tt> quad asserts that, for given layer <tt>j</tt>
and output wire <tt>g</tt>, a certain quadratic combination of
the input wires is zero.</t>
<t>The actual protocol verifies a random linear combination
of those two equations, effectively operating on a combined
quad <tt>QZ = Q + beta * Z</tt> for some random <tt>beta</tt>.</t>
<t>To allow for a compact representation of the two quads without
losing any real generality, the following conditions are imposed:</t>

<ul spacing="compact">
<li>The two quads <tt>Q</tt> and <tt>Z</tt> are disjoint: for all layers <tt>j</tt> and output
wire <tt>g</tt>, if any <tt>Q[j][g, ., .]</tt> are nonzero, then all <tt>Z[j][g, ., .]</tt>
are zero, and vice versa.</li>
<li><tt>Z</tt> is binary: <tt>Z[j][g, l, r] \in {0, 1}</tt></li>
</ul>
<t>With these choices, the two quads allow a compact sparse
representation as a single list of 4-tuples <tt>(g, l, r, v)</tt>
with the following conventions:</t>

<ul spacing="compact">
<li>If <tt>v = 0</tt>, the 4-tuple represents an element of <tt>Z</tt>,
and <tt>Z[j][g, l, r] = 1</tt>.</li>
<li>If <tt>v != 0</tt>, the 4-tuple represents an element of <tt>Q</tt>,
and <tt>Q[j][g, l, r] = v</tt>.</li>
<li>All other elements of <tt>Q</tt> and <tt>Z</tt> not specified by the list are
zero.</li>
</ul>
<t>Moreover, this compact representation can be transformed into
a representation of <tt>QZ = Q + beta * Z</tt> by replacing all <tt>v = 0</tt>
with <tt>v = beta</tt>.</t>
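<t>The substitution can be sketched in a few lines of Python. The dictionary representation, the function name <tt>combined_quad</tt>, and the modulus <tt>P</tt> are assumptions made for illustration.</t>

<sourcecode type="python"><![CDATA[
P = 97  # toy prime modulus, assumed for illustration only

def combined_quad(tuples, beta):
    """Return QZ = Q + beta * Z as a dict {(g, l, r): value}.

    Entries with v == 0 belong to Z and are replaced by beta.
    """
    qz = {}
    for g, l, r, v in tuples:
        qz[(g, l, r)] = (beta if v == 0 else v) % P
    return qz

tuples = [(0, 1, 2, 5),   # Q[0,1,2] = 5
          (1, 0, 3, 0)]   # Z[1,0,3] = 1 (an assertion entry)
qz = combined_quad(tuples, beta=11)
]]></sourcecode>
<t>Because the two quads are disjoint, no key of the dictionary ever receives contributions from both <tt>Q</tt> and <tt>Z</tt>.</t>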
</section>
</section>

<section anchor="representation-of-polynomials"><name>Representation of polynomials</name>
<t>In a generic sumcheck protocol, the prover sends to the verifier
polynomials of a degree specified in advance.  In the present document,
the polynomials are always of degree 2, and are represented by their
evaluations at three points <tt>P0 = 0</tt>, <tt>P1 = 1</tt>, and <tt>P2</tt>, where <tt>0</tt>
and <tt>1</tt> are the additive and multiplicative identities in the field.
The choice of <tt>P2</tt> depends upon the field.  For fields of characteristic
greater than 2, set <tt>P2 = 2</tt> (= <tt>1 + 1</tt> in the field).  For <tt>GF(2^128)</tt>
expressed as <tt>GF(2)[X] / (X^128 + X^7 + X^2 + X + 1)</tt>, set <tt>P2
= X</tt>.  This document does not prescribe a choice of P2 for binary
fields other than <tt>GF(2^128)</tt>, but other binary fields
represented as <tt>GF(2)[X] / (Q(X))</tt> SHOULD choose <tt>P2 = X</tt> for
consistency.</t>
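<t>For a field of odd characteristic, recovering a value of the polynomial from its three evaluations is ordinary Lagrange interpolation. The following sketch uses a toy prime modulus <tt>P</tt> and a hypothetical helper <tt>lagrange_eval</tt>; both are assumptions of the example.</t>

<sourcecode type="python"><![CDATA[
P = 97                 # toy prime modulus, assumed for illustration
P0, P1, P2 = 0, 1, 2   # evaluation points for odd characteristic

def lagrange_eval(p0, p1, p2, x):
    """Evaluate at x the degree-2 polynomial with p(P0) = p0,
    p(P1) = p1 and p(P2) = p2."""
    inv2 = pow(2, -1, P)                  # requires Python 3.8+
    lag0 = (x - P1) * (x - P2) * inv2     # denominator (0-1)(0-2)
    lag1 = -(x - P0) * (x - P2)           # denominator (1-0)(1-2)
    lag2 = (x - P0) * (x - P1) * inv2     # denominator (2-0)(2-1)
    return (lag0 * p0 + lag1 * p1 + lag2 * p2) % P

def poly(x):
    """Example polynomial p(x) = 3x^2 + 5x + 7."""
    return (3 * x * x + 5 * x + 7) % P

value = lagrange_eval(poly(P0), poly(P1), poly(P2), 10)
]]></sourcecode>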
</section>

<section anchor="transform-circuit-and-wires-into-a-padded-proof"><name>Transform circuit and wires into a padded proof</name>

<artwork><![CDATA[sumcheck_circuit(circuit, wires, pad, transcript) {
  G[0] = G[1] = transcript.gen_challenge(circuit.lv)
  FOR 0 <= j < circuit.nl DO
     // Let V[j] be the output wires of layer j.
     // The body of the loop reduces the verification of the
     // two "claims" bind(V[j], G[0]) and bind(V[j], G[1])
     // to the verification of the two claims
     // bind(V[j + 1], G'[0]) and bind(V[j + 1], G'[1]),
     // where the new bindings G' are chosen in sumcheck_layer()

     alpha = transcript.gen_challenge(1)

     // Form the combined quad QZ = Q + beta Z
     // to handle in-circuit assertions
     beta = transcript.gen_challenge(1)
     QZ = circuit.layer[j].quad + beta * circuit.layer[j].Z;

     // QZ is three-dimensional QZ[g, l, r]
     QUAD = bindv(QZ, G[0]) + alpha * bindv(QZ, G[1])
     // having bound g, QUAD is two-dimensional QUAD[l, r]
     
     (proof[j], G) =
         sumcheck_layer(QUAD, wires[j], circuit.layer[j].lv,
                        pad[j], transcript)
  ENDFOR
  return proof
}
]]></artwork>

<artwork><![CDATA[sumcheck_layer(QUAD, wires, lv, layer_pad, transcript) {
   (VL, VR) = wires
   FOR 0 <= round < lv DO
      FOR 0 <= hand < 2 DO
        Let p(x) =
           SUM_{l, r} bind(QUAD, x)[l, r] * bind(VL, x)[l] * VR[r]
        evals.p0 = p(P0) - layer_pad.evals[round][hand].p0
        // p(P1) is implied and not needed
        evals.p2 = p(P2) - layer_pad.evals[round][hand].p2
        layer_proof.evals[round][hand] = evals
        transcript.write(evals);
        challenge = transcript.gen_challenge(1)
        G[round][hand] = challenge

        // bind the L variable to CHALLENGE
        VL = bind(VL, challenge)
        QUAD = bind(QUAD, challenge)

        // swap L and R
        (VL, VR) = (VR, VL)
        QUAD = transpose(QUAD)
      ENDFOR
   ENDFOR
   layer_proof.vl = VL[0] - layer_pad.vl
   layer_proof.vr = VR[0] - layer_pad.vr
   transcript.write(layer_proof.vl)
   transcript.write(layer_proof.vr)
   return (layer_proof, G)
}
]]></artwork>
</section>

<section anchor="generate-constraints-from-the-public-inputs-and-the-padded-proof"><name>Generate constraints from the public inputs and the padded proof</name>
<t>This section defines a procedure <tt>constraints_circuit</tt> for transforming the proof
returned by <tt>sumcheck_circuit</tt> into constraints for the commitment
scheme.  Specifically, each layer produces one linear constraint and one quadratic constraint.</t>
<t>The main difficulty in describing the algorithm is that it operates
not on concrete witnesses, but on expressions in which the witnesses
are symbolic quantities.  Symbolic manipulation is necessary because
the verifier does not have access to the witnesses.  To avoid
overspecifying the exact representation of such symbolic expressions,
the convention is that the prefix <tt>sym_</tt> indicates not a concrete
value, but a symbolic representation of the value.  Thus, <tt>w[3]</tt> is
the fourth concrete witness in the <tt>w</tt> array, and <tt>sym_w[3]</tt> is a
symbolic representation of the fourth element in the <tt>w</tt> array.  The
algorithm does not need arbitrarily complex symbolic expressions.  It
suffices to keep track of affine symbolic expressions of the form
<tt>k + SUM_{i} a[i] sym_w[i]</tt> for some (concrete, nonsymbolic) field elements
<tt>k</tt> and <tt>a[]</tt>.</t>

<artwork><![CDATA[constraints_circuit(circuit, public_inputs, sym_private_inputs, 
                    sym_pad, transcript, proof) {
  G[0] = G[1] = transcript.gen_challenge(circuit.lv)
  claims = [0, 0]
  FOR 0 <= j < circuit.nl DO
     alpha = transcript.gen_challenge(1)
     beta = transcript.gen_challenge(1)
     QZ = circuit.layer[j].quad + beta * circuit.layer[j].Z;
     QUAD = bindv(QZ, G[0]) + alpha * bindv(QZ, G[1])
     (claims, G) = constraints_layer(
               QUAD, circuit.layer[j].lv, sym_pad[j], transcript,
               proof[j], claims, alpha)
  ENDFOR

  // now add constraints that the two final claims
  // equal the binding of sym_inputs at G[0], G[1]

  gamma = transcript.gen_challenge(1)
  LET eq2 = bindv(EQ, G[0]) + gamma * bindv(EQ, G[1])
  LET sym_layer_pad = sym_pad[circuit.nl - 1]
  LET npub = number of elements in public_inputs

  Output the linear constraint
      SUM_{i} (eq2[i + npub] * sym_private_inputs[i])
      - sym_layer_pad.vl 
      - gamma * sym_layer_pad.vr
    = 
      - SUM_{i} (eq2[i] * public_inputs[i])
      + claims[0]
      + gamma * claims[1]
}
]]></artwork>

<artwork><![CDATA[constraints_layer(QUAD, lv, sym_layer_pad, transcript,
                  layer_proof, claims, alpha) {
   // Initial symbolic claim, which happens to be
   // a known constant but which will be updated to contain
   // symbolic linear terms later.
   LET sym_claim = claims[0] + alpha * claims[1]

   FOR 0 <= round < lv DO
      FOR 0 <= hand < 2 DO
        LET hp = layer_proof.evals[round][hand]
        LET sym_hpad = sym_layer_pad.evals[round][hand]

        transcript.write(hp);
        challenge = transcript.gen_challenge(1)
        G[round][hand] = challenge

        // Now the unpadded polynomial evaluations are expected
        // to be
        //   p(P0) = hp.p0 + sym_hpad.p0
        //   p(P2) = hp.p2 + sym_hpad.p2
        LET sym_p0 = hp.p0 + sym_hpad.p0
        LET sym_p2 = hp.p2 + sym_hpad.p2

        // Compute the implied p(P1) = claim - p(P0) in symbolic form
        LET sym_p1 = sym_claim - sym_p0

        LET lag_i(x) =
               the quadratic polynomial such that
                      lag_i(P_k) = 1  if i = k
                                   0  otherwise
               for 0 <= k < 3

        // given p(P0), p(P1), and p(P2), interpolate the
        // new claim symbolically
        sym_claim =   lag_0(challenge) * sym_p0
                    + lag_1(challenge) * sym_p1
                    + lag_2(challenge) * sym_p2

        // bind L
        QUAD = bind(QUAD, challenge);

        // swap left and right
        QUAD = transpose(QUAD)
      ENDFOR
   ENDFOR

   // now the bound QUAD is a scalar (a 1x1 array)
   LET Q = QUAD[0,0]

   // now verify that
   //
   //   SYM_CLAIM = Q * VL * VR
   //
   // where VL = layer_proof.vl + layer_pad.vl
   //       VR = layer_proof.vr + layer_pad.vr

   // decompose SYM_CLAIM into the known constant
   // and the symbolic part
   LET known + symbolic = sym_claim

   Output the linear constraint
      symbolic
      - (Q * layer_proof.vr) * sym_layer_pad.vl
      - (Q * layer_proof.vl) * sym_layer_pad.vr
      - Q * sym_layer_pad.vl_vr
     =
      Q * layer_proof.vl * layer_proof.vr - known

   Output the quadratic constraint

      sym_layer_pad.vl * sym_layer_pad.vr = sym_layer_pad.vl_vr

   transcript.write(layer_proof.vl)
   transcript.write(layer_proof.vr)

   return (G, [layer_proof.vl, layer_proof.vr])
}
]]></artwork>
</section>
</section>

<section anchor="serializing-objects"><name>Serializing objects</name>
<t>This section explains how a proof consists of smaller, related objects, and how to serialize each such component.  First, the standard methods for serializing integers and arrays are used:</t>

<ul spacing="compact">
<li><tt>write_size(n)</tt>: serializes an integer in the range [0, 2<sup>24</sup> - 1] that represents the size of an array or an index into an array. The integer is serialized in little-endian order.</li>
<li><tt>write_array(arr)</tt>: A variable-sized array is represented as <tt>type array[]</tt> and serialized by first writing its length as a size element, and then serializing each element of the array in order.</li>
<li><tt>write_fixed_array(arr)</tt>: When the length of the array is explicitly known to be <tt>n</tt>, it is specified as <tt>type array[n]</tt> and in this case, the array length is not written first.</li>
</ul>
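<t>As a non-normative illustration, the three methods above might be sketched in Python as follows. The 3-byte width of a size element is an assumption inferred from the stated range [0, 2<sup>24</sup> - 1]; the <tt>write_item</tt> callback and the <tt>SIZE_BYTES</tt> name are hypothetical and appear only in this sketch.</t>

<sourcecode type="python"><![CDATA[
SIZE_BYTES = 3  # assumption: 2^24 - 1 is the largest 3-byte value


def write_size(n: int) -> bytes:
    """Encode an array size or index in little-endian order."""
    if not 0 <= n < 1 << (8 * SIZE_BYTES):
        raise ValueError("size out of range")
    return n.to_bytes(SIZE_BYTES, "little")


def write_fixed_array(arr, write_item) -> bytes:
    """Encode each element in order, with no length prefix."""
    return b"".join(write_item(x) for x in arr)


def write_array(arr, write_item) -> bytes:
    """Encode the length as a size element, then the elements."""
    return write_size(len(arr)) + write_fixed_array(arr, write_item)
]]></sourcecode>
<t>Under these assumptions, <tt>write_array([1, 2], write_size)</tt> yields the bytes <tt>020000 010000 020000</tt>: the length first, then each element in order.</t>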

<section anchor="serializing-structs"><name>Serializing structs</name>
<t>When a section includes just a struct definition, the struct is serialized in the natural way: each component is serialized in order, starting from the top-most component and proceeding to the last one.</t>
</section>

<section anchor="serializing-field-elements"><name>Serializing Field elements</name>
<t>This section describes a method to serialize field elements, particularly when the field structure allows efficient encoding for elements of subfields.</t>
<t>Before a field element can be serialized, the context must specify the finite field. In most cases, the Circuit structure will specify the finite field, and all other aspects of the protocol will be defined by this field.</t>
<t>A finite field, or <tt>FieldID</tt>, is specified using a variable-length encoding. Common finite fields have been assigned special 1-byte codes. An arbitrary prime-order finite field can be specified using the special <tt>0xF_</tt> byte followed by a variable number of bytes that specify the prime in little-endian order. For example, the 3-byte sequence <tt>f10101</tt> specifies F<sub>257</sub>: the low nibble 1 selects a 2-byte prime, and <tt>0101</tt> is 257 = 0x0101 in little-endian order. Similarly, a quadratic extension using the polynomial x<sup>2</sup> + 1 can be specified using the <tt>0xE_</tt> designators.</t>
<table><name>Finite field identifiers.
</name>
<thead>
<tr>
<th>Finite field</th>
<th align="right">FieldID</th>
</tr>
</thead>

<tbody>
<tr>
<td>p256</td>
<td align="right">0x01</td>
</tr>

<tr>
<td>p384</td>
<td align="right">0x02</td>
</tr>

<tr>
<td>p521</td>
<td align="right">0x03</td>
</tr>

<tr>
<td>GF(2<sup>128</sup>)</td>
<td align="right">0x04</td>
</tr>

<tr>
<td>GF(2<sup>16</sup>)</td>
<td align="right">0x05</td>
</tr>

<tr>
<td>2<sup>128</sup> - 2<sup>108</sup> + 1</td>
<td align="right">0x06</td>
</tr>

<tr>
<td>2<sup>64</sup> - 59</td>
<td align="right">0x07</td>
</tr>

<tr>
<td>2<sup>64</sup> - 2<sup>32</sup> + 1</td>
<td align="right">0x08</td>
</tr>

<tr>
<td>F_{2<sup>64</sup> - 59}<sup>2</sup></td>
<td align="right">0x09</td>
</tr>

<tr>
<td>secp256</td>
<td align="right">0x0a</td>
</tr>

<tr>
<td>F_{2<sup>{0--15}</sup>-byte prime}<sup>2</sup></td>
<td align="right">0xe{0--f}</td>
</tr>

<tr>
<td>F_{2<sup>{0--15}</sup>-byte prime}</td>
<td align="right">0xf{0--f}</td>
</tr>
</tbody>
</table><t>The GF(2<sup>128</sup>) field uses the irreducible polynomial x<sup>128</sup> + x<sup>7</sup> + x<sup>2</sup> + x + 1.
The p256 prime is equal to 115792089210356248762697446949407573530086143415290314195533631308867097853951, which defines the base field used by the NIST P256 elliptic curve.
The p384 prime is equal to 39402006196394479212279040100143613805079739270465446667948293404245721771496870329047266088258938001861606973112319, which defines the base field used by the NIST P384 curve.  The p521 prime is equal to 2<sup>521</sup> - 1.  The field with FieldID 0x09 is the quadratic extension of the base field defined by the prime 18446744073709551557 = 2<sup>64</sup> - 59, using the polynomial x<sup>2</sup> + 1, i.e., by adjoining a square root of -1 to the base field.</t>
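<t>The <tt>0xF_</tt> designator for an arbitrary prime could be produced along the following lines. This Python sketch is illustrative only. It assumes, following the last row of the table, that the low nibble k of the designator means that the prime occupies 2<sup>k</sup> bytes, and that a shorter prime is zero-padded to that width; the function name is hypothetical.</t>

<sourcecode type="python"><![CDATA[
def encode_prime_field_id(p: int) -> bytes:
    """Encode F_p as a 0xF_ designator followed by the prime."""
    nbytes = max(1, (p.bit_length() + 7) // 8)
    k = 0
    while (1 << k) < nbytes:  # smallest k with 2^k >= nbytes
        k += 1
    if k > 15:
        raise ValueError("prime too large for a 0xF_ designator")
    # assumption: the prime is zero-padded to exactly 2^k bytes
    return bytes([0xF0 | k]) + p.to_bytes(1 << k, "little")
]]></sourcecode>
<t>Under these assumptions, the prime 2<sup>64</sup> - 59 would be written as <tt>f3c5ffffffffffffff</tt>, although in practice that field already has the 1-byte code 0x07.</t>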

<section anchor="serializing-a-single-field-element"><name>Serializing a single field element</name>
<t>Unless specified otherwise, a field element, referred to as an <tt>Elt</tt>, is serialized to bytes in little-endian order. For example, a 256-bit element of the finite field F<sub>p256</sub> is serialized into 32 bytes, starting with the least-significant byte.</t>

<ul spacing="compact">
<li><tt>write_elt(e, F)</tt>: produces a byte encoding of a field element e in field F.</li>
</ul>
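<t>For a prime field, <tt>write_elt</tt> might look like the following Python sketch. It assumes that an element is held as an integer in the range [0, p - 1] and that the encoded width is the byte length of the prime; both are assumptions of this sketch, as is passing the prime <tt>p</tt> in place of the field <tt>F</tt>.</t>

<sourcecode type="python"><![CDATA[
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1  # the p256 prime
P128 = 2**128 - 2**108 + 1                   # FieldID 0x06


def write_elt(e: int, p: int) -> bytes:
    """Serialize e in F_p as fixed-width little-endian bytes."""
    if not 0 <= e < p:
        raise ValueError("element is not reduced modulo p")
    width = (p.bit_length() + 7) // 8  # assumption: width of p
    return e.to_bytes(width, "little")
]]></sourcecode>
<t>With these definitions, <tt>write_elt(1, P256)</tt> is 32 bytes long and begins with the byte 0x01, and <tt>write_elt(P128 - 1, P128)</tt> is <tt>00000000000000000000000000f0ffff</tt>.</t>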
</section>

<section anchor="serializing-an-element-of-a-subfield"><name>Serializing an element of a subfield</name>
<t>In some cases, both Prover and Verifier can explicitly conclude that a field element belongs to a smaller subfield, and both parties can then use a more efficient subfield serialization method.  This optimization applies when the larger field <tt>F</tt> is a field extension of a smaller field and both parties can conclude that the serialized element belongs to that smaller field.</t>

<ul spacing="compact">
<li><tt>write_subfield(Elt e, F2, F1)</tt>: produce a byte encoding of a field element e that belongs to a subfield F2 of field F1.</li>
</ul>
</section>
</section>

<section anchor="serializing-a-sumcheck-transcript"><name>Serializing a Sumcheck Transcript</name>

<artwork><![CDATA[struct {
  PaddedTranscriptLayer layers[];  // NL layers
} PaddedTranscript;

struct {
  Elt wires[];  // array of 2 * log_w Elts that store the
                // evaluations of the deg-2 polynomial at 0, 2
  Elt wc0;
  Elt wc1;
} PaddedTranscriptLayer;
]]></artwork>
<t>The padded transcript incorporates the optimization in which the evaluation at 1 is omitted. The verifier reconstructs it as p(P1) = claim - p(P0), where the claim is the expected value determined by the previous challenge.</t>
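<t>The reconstruction can be illustrated with a short Python sketch for one sumcheck round. It assumes a prime field in which the evaluation points P0, P1, and P2 are 0, 1, and 2, and it works on plain values rather than the padded, symbolic ones; the function name and argument order are hypothetical.</t>

<sourcecode type="python"><![CDATA[
def next_claim(claim: int, p0: int, p2: int, r: int, p: int) -> int:
    """Rebuild p(1) from the claim, then evaluate the round
    polynomial at the challenge r, all modulo the prime p."""
    p1 = (claim - p0) % p            # omitted evaluation at 1
    inv2 = pow(2, -1, p)             # 1/2 in F_p (Python 3.8+)
    lag0 = (r - 1) * (r - 2) * inv2  # 1 at 0, 0 at 1 and 2
    lag1 = -r * (r - 2)              # 1 at 1, 0 at 0 and 2
    lag2 = r * (r - 1) * inv2        # 1 at 2, 0 at 0 and 1
    return (lag0 * p0 + lag1 * p1 + lag2 * p2) % p
]]></sourcecode>
<t>For instance, with p = 97 and the round polynomial 3x<sup>2</sup> + 2x + 5, the transcript carries p(0) = 5 and p(2) = 21, the claim is p(0) + p(1) = 15, and <tt>next_claim(15, 5, 21, 7, 97)</tt> returns p(7) mod 97 = 69.</t>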
</section>

<section anchor="serializing-a-ligero-proof"><name>Serializing a Ligero Proof</name>

<artwork><![CDATA[def serialize_ligero_proof(C, ldt, dot, columns, mt_proof) {
  write_array(ldt, C.BLOCK)
  write_array(dot, C.BLOCK)
  write_runs(columns, C.NREQ * C.NROW, C.subFieldID, C.FieldID)
  write_merkle(mt_proof)
}
]]></artwork>
<t>The concept of a <tt>run</tt> saves space when a long sequence of field elements belongs to a subfield of the finite field.  A run consists of a 4-byte size element followed by that many <tt>Elt</tt> elements, all of which are serialized either as full field elements or as subfield elements. Runs alternate, beginning with a run of full field elements. In this way, rows that consist of subfield elements take less space.  The maximum run length is set to 2<sup>25</sup>.</t>

<artwork><![CDATA[def write_runs(columns, N, F2, F) {
    bool subfield_run = false
    FOR 0 <= ci < N DO
      size_t runlen = 0
      while (ci + runlen < N &&
             runlen < kMaxRunLen &&
             columns[ci + runlen].is_in_subfield(F2) == subfield_run) {
        ++runlen;
      }
      write_size(runlen, buf);
      for (size_t i = ci; i < ci + runlen; ++i) {
        if (subfield_run) {
          write_subfield(columns[i], F2, F);
        } else {
          write_elt(columns[i], F);
        }
      }
      ci += runlen;
      subfield_run = !subfield_run;
}

def write_merkle(mt_proof) {
  FOR (digest in mt_proof) DO
     write_fixed_array(digest, HASH_LEN)
}
]]></artwork>
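<t>The way <tt>write_runs</tt> partitions its input can be seen in the following Python sketch, which only splits the elements into runs and leaves the byte encoding aside. The <tt>in_subfield</tt> predicate and the function name are hypothetical stand-ins introduced for this illustration.</t>

<sourcecode type="python"><![CDATA[
MAX_RUN_LEN = 1 << 25  # maximum run length stated in the text


def split_runs(cols, in_subfield, max_run=MAX_RUN_LEN):
    """Split cols into alternating runs, starting with a
    (possibly empty) run of full-field elements.

    Returns a list of (is_subfield_run, elements) pairs."""
    runs = []
    subfield_run = False
    n = len(cols)
    ci = 0
    while ci < n:
        runlen = 0
        while (ci + runlen < n and runlen < max_run
               and in_subfield(cols[ci + runlen]) == subfield_run):
            runlen += 1
        runs.append((subfield_run, cols[ci:ci + runlen]))
        ci += runlen
        subfield_run = not subfield_run
    return runs
]]></sourcecode>
<t>If the input begins with subfield elements, the first run is empty. Likewise, a run that reaches the maximum length is followed by an empty run of the other kind, so that the alternation is preserved.</t>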
</section>

<section anchor="serializing-a-sequence-of-proofs"><name>Serializing a Sequence of proofs</name>
<t>For the multi-field optimization, the proof string consists of a sequence of two proofs. This is handled by using the circuit identifier to specify the sequence of proofs to parse.</t>

<artwork><![CDATA[struct {
   Public pub;  // Public arguments to all circuits
   Proof proofs[]; // array of Proof
} Proofs;
]]></artwork>

<artwork><![CDATA[struct {
  uint8 oracle[32]; // nonce used to define the random oracle,
  Digest com;       // commitment to the witness
  PaddedTranscript sumcheck_transcript;
  LigeroProof lp;
} Proof;

struct {
  char* arguments[];   // array of strings representing
                       // public arguments to the circuit
} Public;
]]></artwork>
</section>

<section anchor="serializing-a-circuit"><name>Serializing a Circuit</name>
<t>A circuit structure consists of size metadata, a table of constants, and an array of structures that represent the layers of the circuit as follows.</t>

<artwork><![CDATA[struct {
  Version version;     // 1-byte identifier, 0x1.
  FieldID field;       // identifies the field
  FieldID subfield;    // identifies the subfield
  size nv;             // number of outputs
  size pub_in;         // number of public inputs
  size ninputs;        // number of inputs, including witnesses
  size nl;             // number of layers
  Elt const_table[];   // array of constants used by the quads
  CircuitLayer layers[];  // array of layers of size nl
} Circuit;
]]></artwork>
<t>The <tt>const_table</tt> structure contains an array of <tt>Elt</tt> constants that can be referred to by any of the CircuitLayer structures. This feature saves space because a typical circuit uses only a handful of constants, each of which can be referred to by a small index into this table.</t>

<artwork><![CDATA[struct {
  size logw;     // log of number of wires
  size nw;       // number of wires
  Quad quads[];  // array of nw Quads
} CircuitLayer;
]]></artwork>
<t>The <tt>quads</tt> array stores the main portion of the circuit. Each <tt>Quad</tt> structure contains <tt>g</tt>, <tt>h0</tt>, <tt>h1</tt>, and a constant <tt>v</tt>, which is represented as an index into the <tt>const_table</tt> array in the <tt>Circuit</tt>.  Each of <tt>g</tt>, <tt>h0</tt>, and <tt>h1</tt> is stored as a difference from the corresponding item in the <em>previous</em> quad. In other words, these three values are delta-encoded in order to improve the compressibility of the circuit representation. The <tt>Delta</tt> encoding uses the least-significant bit as a sign bit to indicate negative numbers.</t>

<artwork><![CDATA[struct {
  Delta g;     // delta-encoded gate number
  Delta h0;    // delta-encoded left wire index
  Delta h1;    // delta-encoded right wire index
  size v;      // index into the const_table to specify const v
} Quad;

typedef uint Delta;
]]></artwork>
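<t>One possible reading of the <tt>Delta</tt> encoding is sketched below in Python. The text above fixes only that the least-significant bit carries the sign. Placing the magnitude of the difference in the remaining bits is an assumption of this sketch, and both function names are hypothetical.</t>

<sourcecode type="python"><![CDATA[
def encode_delta(cur: int, prev: int) -> int:
    """Encode cur - prev; the sign goes in the low bit."""
    d = cur - prev
    # assumption: magnitude in the upper bits, sign in bit 0
    return (abs(d) << 1) | (1 if d < 0 else 0)


def decode_delta(code: int, prev: int) -> int:
    """Invert encode_delta given the previous value."""
    mag = code >> 1
    return prev - mag if code & 1 else prev + mag
]]></sourcecode>
<t>Under this reading, a step from 3 to 5 is encoded as 4 and a step from 5 to 3 as 5, so small differences in either direction map to small unsigned integers.</t>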
</section>
</section>

<section anchor="security-considerations"><name>Security Considerations</name>
<t>Both the Ligero and Longfellow systems satisfy the standard properties of a zero-knowledge argument system: completeness, soundness, and zero-knowledge.</t>
<t>Frigo and shelat <xref target="longfellow"></xref> provide an analysis of the soundness of the system, which derives from the soundness of the Ligero proof system and of the sumcheck protocol.  Similarly, the zero-knowledge property derives almost entirely from the analysis of Ligero <xref target="ligero"></xref>. It is a goal to provide a mechanically verifiable proof for a high-level statement of the soundness.</t>
</section>

<section anchor="iana-considerations"><name>IANA Considerations</name>
<t>This document does not make any requests of IANA.</t>
</section>

</middle>

<back>
<references><name>References</name>
<references><name>Normative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4086.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6919.xml"/>
</references>
<references><name>Informative References</name>
<reference anchor="GMR" target="">
  <front>
    <title>The Knowledge Complexity of Interactive Proof Systems</title>
    <author fullname="Shafi Goldwasser" initials="S." surname="Goldwasser"></author>
    <author fullname="Silvio Micali" initials="S." surname="Micali"></author>
    <author fullname="Charles Rackoff" initials="C." surname="Rackoff"></author>
    <date year="1989"></date>
  </front>
</reference>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6234.xml"/>
<reference anchor="additivefft" target="https://arxiv.org/abs/1404.3458">
  <front>
    <title>Novel polynomial basis and its application to Reed-Solomon erasure codes</title>
    <author fullname="Sian-Jheng Lin" initials="S." surname="Lin"></author>
    <author fullname="Wei-Ho Chung" initials="W." surname="Chung"></author>
    <author fullname="Yunghsiang S. Han" initials="Y." surname="Han"></author>
    <date year="2014"></date>
  </front>
</reference>
<reference anchor="krs" target="https://eprint.iacr.org/2025/118">
  <front>
    <title>How to Prove False Statements: Practical Attacks on Fiat-Shamir</title>
    <author fullname="Dmitry Khovratovich" initials="D." surname="Khovratovich"></author>
    <author fullname="Ron D. Rothblum" initials="R. D." surname="Rothblum"></author>
    <author fullname="Lev Soukhanov" initials="L." surname="Soukhanov"></author>
    <date year="2025"></date>
  </front>
</reference>
<reference anchor="ligero" target="https://eprint.iacr.org/2022/1608">
  <front>
    <title>Ligero: Lightweight Sublinear Arguments Without a Trusted Setup</title>
    <author fullname="Scott Ames" initials="S." surname="Ames"></author>
    <author fullname="Carmit Hazay" initials="C." surname="Hazay"></author>
    <author fullname="Yuval Ishai" initials="Y." surname="Ishai"></author>
    <author fullname="Muthuramakrishnan Venkitasubramaniam" initials="M." surname="Venkitasubramaniam"></author>
    <date year="2022"></date>
  </front>
</reference>
<reference anchor="longfellow" target="https://eprint.iacr.org/2024/2010">
  <front>
    <title>Anonymous credentials from ECDSA</title>
    <author fullname="Matteo Frigo" initials="M." surname="Frigo"></author>
    <author fullname="abhi shelat" initials="a." surname="shelat"></author>
    <date year="2024"></date>
  </front>
</reference>
<reference anchor="rbr" target="https://eprint.iacr.org/2018/1004">
  <front>
    <title>Fiat-Shamir From Simpler Assumptions</title>
    <author fullname="Ran Canetti" initials="R." surname="Canetti"></author>
    <author fullname="Yilei Chen" initials="Y." surname="Chen"></author>
    <author fullname="Justin Holmgren" initials="J." surname="Holmgren"></author>
    <author fullname="Alex Lombardi" initials="A." surname="Lombardi"></author>
    <author fullname="Guy N. Rothblum" initials="G." surname="Rothblum"></author>
    <author fullname="Ron D. Rothblum" initials="R." surname="Rothblum"></author>
    <date year="2018"></date>
  </front>
</reference>
</references>
</references>

<section anchor="acknowledgements"><name>Acknowledgements</name>
</section>

<section anchor="test-vectors"><name>Test Vectors</name>
<t>This section contains test vectors. Each test vector specifies the configuration information and inputs. All values are encoded as hexadecimal strings.</t>

<section anchor="test-vectors-for-merkle-tree"><name>Test Vectors for Merkle Tree</name>

<section anchor="vector-1"><name>Vector 1</name>

<ul spacing="compact">
<li>Leaves:
4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a
dbc1b4c900ffe48d575b5da5c638040125f65db0fe3e24494b76ea986457d986
084fed08b978af4d7d196a7446a86b58009e636b611db16211b65a9aadff29c5
e52d9c508c502347344d8c07ad91cbd6068afc75ff6292f062a09ca381c89e71
e77b9a9ae9e30b0dbdb6f510a264ef9de781501d7b6b92ae89eb059c5ab743db</li>
<li>Root: f22f4501ffd3bdffcecc9e4cd6828a4479aeedd6aa484eb7c1f808ccf71c6e76</li>
<li>Proof for leaves (0,1):
084fed08b978af4d7d196a7446a86b58009e636b611db16211b65a9aadff29c5
f03808f5b8088c61286d505e8e93aa378991d9889ae2d874433ca06acabcd493</li>
<li>Proof for leaves (1,3):
e77b9a9ae9e30b0dbdb6f510a264ef9de781501d7b6b92ae89eb059c5ab743db
084fed08b978af4d7d196a7446a86b58009e636b611db16211b65a9aadff29c5
4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a</li>
</ul>
</section>
</section>

<section anchor="test-vectors-for-circuit"><name>Test Vectors for Circuit</name>

<section anchor="vector-1-1"><name>Vector 1</name>

<ul spacing="compact">
<li>Description: Circuit C(n, m, s) = 0 if and only if n is the m-th s-gonal number in F_p128.  This circuit verifies that 2n = (s-2)m^2 - (s - 4)*m.</li>
<li>Field: 2<sup>128</sup> - 2<sup>108</sup> + 1 (Field ID 6)</li>
<li>Depth: 3 Quads: 11 Terms: 11</li>
<li>Serialization: 01060000010000010000020000040000020000040000ffffffffffffffffffffffffffefffff00000000000000000000000000f0ffff01000000000000000000000000000000fdffffffffffffffffffffffffefffff030000060000030000000000020000000000000000000000080000040000010000000000030000020000020000020000040000080000000000000000000000020000060000000000000000000000040000000000000000030000090000020000000000020000020000020000000000020000020000020000000000020000040000000000000000020000030000030000040000020000</li>
</ul>
</section>
</section>

<section anchor="test-vectors-for-sumcheck"><name>Test Vectors for Sumcheck</name>

<section anchor="vector-1-2"><name>Vector 1</name>

<ul spacing="compact">
<li>Description: Circuit C(n, m, s) = 0 if and only if n is the m-th s-gonal number in F_p128.  This circuit verifies that 2n = (s-2)m^2 - (s - 4)*m.</li>
<li>Field: 2<sup>128</sup> - 2<sup>108</sup> + 1 (Field ID 6)</li>
<li>Fiat-Shamir initialized with</li>
<li>Serialization: 90e734c42b5f14ee432a0ed95ba2ada05c3f9ecc9b026ded61f00bf57434f93c6f70e9c8b6e3de005ba8b4da93b5fa35fc3efae1e6068399c7f7d009ab5a2711084c97cd5a6e28dd30c598907b328d81915e487c34dbf80aa5da14f0621011a33d838a7b0d9a03533c63c6606f5360f88cf97c728630afdcb9755894a6f5c9068e1fc29f97efc125ba580de64089c6e72433de2a3267b90daeaf418ac8a3df3bbddc6cb141c764c8262346baac2e28033778b1a71f153ba571e80ab29951f9440ba93fede225a35accf6e0114d5240ae92df02d2870e5258ebba416f3d815e1554b05627998fc9d3bf354b89394b27b39f69c6538dbc968a779369e47f214252e0955624e9f4d6dc2a95cf41c57703b8749b959315458d4076f0daf5fdbde23e16c10394ac884ab9cad0782e8f472cb4edb69682d17465363691aafc31b83cd764fb909b50e2fe907fd2137566ddb8c47cc13974957e7f76180860571035f7a4d2658a82e1be8fe155353bc10feae9541365926f0646b4a5351907cbd5d9dbb4</li>
</ul>
</section>
</section>

<section anchor="test-vectors-for-ligero"><name>Test Vectors for Ligero</name>

<section anchor="vector-1-3"><name>Vector 1</name>

<ul spacing="compact">
<li>Description: Circuit C(n, m, s) = 0 if and only if n is the m-th s-gonal number in F_p128.  This circuit verifies that 2n = (s-2)m^2 - (s - 4)*m.</li>
<li>Field: 2<sup>128</sup> - 2<sup>108</sup> + 1 (Field ID 6)</li>
<li>Witness vector: [1, 45, 5, 6]</li>
<li>Pad elements: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4]</li>
<li><t>Parameters:</t>

<ul spacing="compact">
<li>NREQ: 6</li>
<li>RATE: 4</li>
<li>WR: 20</li>
<li>QR: 2</li>
<li>NROW: 7</li>
<li>NQ: 1</li>
<li>BLOCK: 51</li>
</ul></li>
<li>Commitment: 738d2ffb3a8bf24e7aedb94be59041fb2dc13da30fe6b05ebe5126ef8fc36ec2</li>
<li>Proof size: 3180 bytes</li>
<li>Proof: fa8d88a73b3a0f9c067658c45bb394a602000000000000000000000000000000fa8d8...2cd5f61cd2b2eb84c79e1707cbad0048fcd820c716584f31991cf1628fb041</li>
</ul>
</section>
</section>

<section anchor="test-vectors-for-libzk"><name>Test Vectors for libzk</name>
</section>
</section>

</back>

</rfc>
