HyperPlonk

This article aims to intuitively explain the goals and processes of the HyperPlonk protocol.

Introduction

Plonk is a powerful tool that can be tweaked to support custom gates (TurboPlonk) and lookup tables (Plonkup). However, as circuits grow larger, FFT becomes a bottleneck. HyperPlonk modifies Plonk by using multivariate polynomials over a boolean hypercube. This achieves two main advantages:

  1. Eliminating the need for FFT.

  2. Enabling efficient computation of high-degree custom gates without significant overhead, resulting in faster proving times.

Background

Why is FFT a bottleneck?

The diagram above illustrates the FFT process for a polynomial of size 8. First, as shown on the left side of the diagram, a bit reversal operation is performed. Then, butterfly operations are executed at each stage. The bit reversal takes O(N), and the butterflies in each stage also require O(N). Since there are log N stages, the overall complexity is O(N · log N). Furthermore, FFT is notoriously difficult to parallelize: as the stages progress, the memory locations required to compute the outputs of the next stage become increasingly scattered. When the computation must run on hardware with limited memory, such as GPUs, these constraints can be critical.
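To make the two phases concrete, below is a minimal sketch of an in-place radix-2 NTT (an FFT over a prime field), written for this article rather than taken from any library. It assumes a toy prime p = 257 and N = 8 so that a primitive 8th root of unity exists; the function names (`bit_reverse`, `ntt`) are ours.

```python
# Minimal in-place radix-2 NTT (FFT over a prime field): bit-reversal first,
# then log N butterfly stages, mirroring the description above.
# Toy parameters (assumption): p = 257, N = 8.
P = 257                              # 257 = 2^8 + 1, so 2-power roots of unity exist
N = 8
OMEGA = pow(3, (P - 1) // N, P)      # 3 generates the multiplicative group mod 257

def bit_reverse(values):
    """Permute `values` so that index i moves to the bit-reversed index (O(N))."""
    bits = len(values).bit_length() - 1
    out = list(values)
    for i in range(len(values)):
        j = int(format(i, f"0{bits}b")[::-1], 2)
        if i < j:
            out[i], out[j] = out[j], out[i]
    return out

def ntt(coeffs):
    """Evaluate the polynomial `coeffs` on the domain {OMEGA^0, ..., OMEGA^{N-1}}."""
    a = bit_reverse(coeffs)
    length = 2
    while length <= N:                        # log N stages in total
        w_len = pow(OMEGA, N // length, P)    # root of unity for this stage
        for start in range(0, N, length):
            w = 1
            for k in range(length // 2):      # O(N) butterflies per stage
                lo = a[start + k]
                hi = a[start + k + length // 2] * w % P
                a[start + k] = (lo + hi) % P
                a[start + k + length // 2] = (lo - hi) % P
                w = w * w_len % P
        length *= 2
    return a

coeffs = [1, 2, 3, 4, 0, 0, 0, 0]
evals = ntt(coeffs)
naive = [sum(c * pow(OMEGA, i * j, P) for j, c in enumerate(coeffs)) % P
         for i in range(N)]
assert evals == naive                         # cross-check against naive evaluation
```

Each stage touches all N entries once, which is where the O(N · log N) cost comes from, and the butterfly stride grows with the stage index, which is exactly the scattered memory access pattern mentioned above.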

Modern SNARK

Polynomial IOP (Interactive Oracle Proof) refers to a proof model in which the prover provides oracles to the verifier. The verifier interacts by sampling random points and, in the final round, queries the oracles to verify the computation at those points.

Here, the oracle can be thought of as a polynomial. However, since polynomials often have high degrees, they are not succinct. To address this, a PCS (Polynomial Commitment Scheme) is used. Instead of sending the full polynomial, the prover sends a commitment that represents the polynomial. During the final round of verification, the prover can reveal (or "open") the evaluations at the random points along with a proof, enabling succinct verification. This combination allows for succinct SNARKs.

Modern SNARKs are built by combining Polynomial IOP with PCS, structured as follows:

When is FFT used in Plonk?

| X   | A | B | C |
| --- | - | - | - |
| ω^0 | 1 | 1 | 2 |
| ω^1 | 1 | 2 | 3 |
| ω^2 | 2 | 3 | 5 |
| ω^3 | 3 | 5 | 8 |

Let's walk through a simple circuit as an example. The circuit above represents a Fibonacci sequence starting with (1, 1). Each column in the circuit is interpreted as a univariate polynomial. Since Plonk implements Univariate IOP + Univariate PCS, the protocol proceeds as follows:

For simplicity, the Wiring Check (Permutation Argument) is omitted in this example, as the focus is on why FFT is used.

  1. The prover commits to the polynomials A, B, and C so that it can later be held to the claimed relationship between them. This step uses PCS.BatchCommit and corresponds to the Oracle:

\mathsf{PCS.BatchCommit}(\{A, B, C\}) \rightarrow \{C_A, C_B, C_C\}
  2. After the Poly IOP for the Wiring Identity, the prover commits to Q to assert that the gate identity is satisfied. Since t is a known polynomial (pre-shared between the prover and the verifier), it does not require a separate commitment. This step corresponds to Poly IOP for Gate Identity → Poly IOP for Quotient Check:

\mathsf{PCS.Commit}(Q) \rightarrow C_Q \\ \text{where } A(X) + B(X) - C(X) = t(X) \cdot Q(X) \text{ and } t(X) = X^4 - 1
  3. The verifier samples a random x from the entire field.

  4. Since x is sampled randomly from the field, the prover must interpolate the polynomials A, B, and C, which are given in the Lagrange basis below, into coefficient form using an IFFT:

A(X) = L_0(X)\cdot 1 + L_1(X)\cdot 1 + L_2(X)\cdot 2 + L_3(X)\cdot 3 \\ B(X) = L_0(X)\cdot 1 + L_1(X)\cdot 2 + L_2(X)\cdot 3 + L_3(X)\cdot 5 \\ C(X) = L_0(X)\cdot 2 + L_1(X)\cdot 3 + L_2(X)\cdot 5 + L_3(X)\cdot 8
  5. The prover sends the evaluations A(x), B(x), C(x), and Q(x) along with the opening proof:

\mathsf{PCS.BatchOpen}(\{A, B, C, Q\}, x) \rightarrow (\{A(x), B(x), C(x), Q(x)\}, \pi)
  6. The verifier checks that the gate identity holds at the random point x; a small numeric sketch of this check follows the list:

\frac{A(x) + B(x) - C(x)}{t(x)} \stackrel{?}{=} Q(x)
\mathsf{PCS.BatchVerify}(S_{\mathsf{commit}}, S_{\mathsf{open}}, x, \pi) \stackrel{?}{=} 1 \\ S_{\mathsf{commit}} = \{C_A, C_B, C_C, C_Q\}, \\ S_{\mathsf{open}} = \{A(x), B(x), C(x), \frac{A(x) + B(x) - C(x)}{t(x)}\}
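Below is a small numeric sketch of steps 3-6, assuming a toy prime field p = 97 and ω = 22 (a 4th root of unity mod 97); the PCS commitments and opening proofs are elided, and only the gate-identity check at a random x is reproduced. Note that for this circuit A + B − C is identically zero, so Q is the zero polynomial.

```python
# Numeric sketch of steps 3-6 (assumed toy parameters: p = 97, omega = 22).
# Commitments and opening proofs are elided; only the gate check is shown.
import random

P_MOD = 97
OMEGA = 22                                   # 22^2 = -1 and 22^4 = 1 mod 97
H = [pow(OMEGA, i, P_MOD) for i in range(4)]
A_vals, B_vals, C_vals = [1, 1, 2, 3], [1, 2, 3, 5], [2, 3, 5, 8]

def lagrange_eval(domain, values, x):
    """Evaluate the unique degree-3 interpolant of (domain[i], values[i]) at x."""
    total = 0
    for i, (xi, yi) in enumerate(zip(domain, values)):
        num, den = 1, 1
        for j, xj in enumerate(domain):
            if i != j:
                num = num * (x - xj) % P_MOD
                den = den * (xi - xj) % P_MOD
        total = (total + yi * num * pow(den, -1, P_MOD)) % P_MOD
    return total

x = random.randrange(P_MOD)
while x in H:                                # ensure t(x) != 0
    x = random.randrange(P_MOD)

A_x, B_x, C_x = (lagrange_eval(H, v, x) for v in (A_vals, B_vals, C_vals))
t_x = (pow(x, 4, P_MOD) - 1) % P_MOD
Q_x = 0                                      # A + B - C is the zero polynomial here

# Verifier's gate-identity check at the random point x.
assert (A_x + B_x - C_x) % P_MOD * pow(t_x, -1, P_MOD) % P_MOD == Q_x
```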

High-degree custom gates in Plonk

This section is based on the Halo2 book. Custom gates enable representation of complex operations that cannot easily be expressed with simple addition and multiplication. Let’s explore how they are implemented in Plonk.

Consider a polynomial of degree d (or a table with d + 1 rows). In the earlier example, we dealt with a simple addition operation, where the degree of A + B − C was d. This simplicity made computation straightforward.

When dealing with more complex circuits, custom gates can be represented as f_i, with selectors s_i. For a given domain H, each custom gate f_i must satisfy:

s_i(X) \cdot f_i(X) = t(X) \cdot Q_i(X), \ \forall X \in H, \\ \text{where } s_i(X) = \begin{cases} 1 & \text{if } f_i(X) \text{ is enabled} \\ 0 & \text{otherwise} \end{cases}

Checking each gate f_i individually is inefficient for the verifier. Instead, we use a common ZK technique: a random linear combination. Starting from step 2 of the protocol:

  1. The verifier samples a random y.

  2. Using y, the prover combines all custom gates into a single equation:

\sum_{i = 0}^{m - 1} y^i \cdot s_i(X) \cdot f_i(X) = t(X) \cdot Q(X), \ \forall X \in H

Here, m is the number of gates. The degree of Q is calculated as:

\deg(Q(X)) = \max_{0 \le i < m} \deg(s_i(X) \cdot f_i(X)) - \deg(t(X)) = \max_{0 \le i < m} \deg(f_i(X)) - 1 \\ \because \deg(s_i(X)) = d \ \land \ \deg(t(X)) = d + 1

For example, if m = 3, the custom gates f_i are the following:

f_0(X) = A(X) \cdot B(X) \cdot C(X) - 1 \\ f_1(X) = A(X) \cdot B(X) - C(X) \\ f_2(X) = A(X) + B(X) - C(X)

Then the degree of Q is n · d − 1, where n is the maximum degree of the custom gates (i.e., the maximum number of witness polynomials multiplied together). Here, n is 3. Since the degree of Q can be much higher than d, it must be decomposed into n smaller polynomials q_i to fit within the commitment scheme:

\begin{align*} Q(X) &= \frac{\sum_{i = 0}^{m - 1} y^i \cdot s_i(X) \cdot f_i(X)}{t(X)} = \sum_{i=0}^{n-1} X^{i(d+1)} \cdot q_i(X) \\ &= q_0(X) + X^{d + 1}\cdot q_1(X) + \dots + X^{(n - 1)\cdot(d + 1)}\cdot q_{n-1}(X) \end{align*}
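The split itself is just a re-grouping of coefficients. The sketch below checks the identity Q(x) = Σ_i x^{i(d+1)} · q_i(x), assuming toy parameters d + 1 = 4, n = 3, and a toy prime field p = 97; the coefficients of Q are arbitrary placeholders.

```python
# Sketch of the quotient split: Q(x) == sum_i x^(i*(d+1)) * q_i(x).
# Assumed toy parameters: p = 97, d + 1 = 4, n = 3; Q's coefficients are placeholders.
P_MOD = 97
D_PLUS_1, N_CHUNKS = 4, 3

Q_coeffs = [5, 1, 0, 7, 2, 9, 3, 8, 4]        # degree n*d - 1 = 8, lowest degree first
padded = Q_coeffs + [0] * (N_CHUNKS * D_PLUS_1 - len(Q_coeffs))
q_chunks = [padded[i * D_PLUS_1:(i + 1) * D_PLUS_1] for i in range(N_CHUNKS)]

def poly_eval(coeffs, x):
    return sum(c * pow(x, k, P_MOD) for k, c in enumerate(coeffs)) % P_MOD

x = 13                                         # any field element works here
lhs = poly_eval(Q_coeffs, x)
rhs = sum(pow(x, i * D_PLUS_1, P_MOD) * poly_eval(q, x)
          for i, q in enumerate(q_chunks)) % P_MOD
assert lhs == rhs
```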

Before committing to q_i, the prover must compute f_i. For high-degree gates, this involves the following:

The degree becomes n · d for the sections highlighted in green, which significantly slows down computation. If the n of a custom gate is set too small, the flexibility to express a diverse range of computations decreases. On the other hand, if n is set too large, it introduces significant overhead in the High-Degree FFT, Gate Evaluation, and High-Degree Inverse FFT steps. Therefore, n must be set to an appropriate value. For example, n is 9 in Polygon Miden, whereas Halo2 imposes no such limitation.

In summary, in the simple protocol discussed earlier, Q could be committed before conducting the Inverse FFT. However, when custom gates are introduced, committing Q requires going through the following steps:

Inverse FFT → High Degree FFT → Gate Evaluation → High Degree Inverse FFT.

  1. The prover commits to q_0, q_1, and q_2:

\mathsf{PCS.BatchCommit}(\{q_0, q_1, q_2\}) \rightarrow \{C_{q_0}, C_{q_1}, C_{q_2}\}
  2. The verifier samples a random point x.

  3. The prover sends A(x), B(x), C(x), and q(x) along with the opening proof π to the verifier:

\mathsf{PCS.BatchOpen}(\{A, B, C, Q\}, x) \rightarrow (\{A(x), B(x), C(x), q_0(x) + x^{d+1} \cdot q_1(x) + x^{2(d+1)} \cdot q_2(x)\}, \pi)
  4. The verifier checks whether the following equation holds at the random point x. Here, each s_i is assumed to be a polynomial already known to both the prover and the verifier:

\frac{s_0(x)\cdot f_0(x) + y\cdot s_1(x)\cdot f_1(x) + y^2\cdot s_2(x)\cdot f_2(x)}{t(x)} \stackrel{?}{=} q_0(x) + x^{d + 1}\cdot q_1(x) + x^{2(d + 1)}\cdot q_2(x) = q(x)
  5. To validate the proof, the verifier performs the following test. If the test passes, the verifier accepts the proof:

\mathsf{PCS.BatchVerify}(S_{\mathsf{commit}}, S_{\mathsf{open}}, x, \pi) \stackrel{?}{=} 1 \\ S_{\mathsf{commit}} = \{C_A, C_B, C_C, C_Q\}, \\ S_{\mathsf{open}} = \big\{A(x), B(x), C(x), \frac{s_0(x)\cdot f_0(x) + y\cdot s_1(x) \cdot f_1(x) + y^2 \cdot s_2(x) \cdot f_2(x)}{t(x)}\big\}

From the above, the approximate total execution time for the Plonk prover can be calculated as follows:

\begin{align*} T_{\mathsf{total}} = \ &N_{\mathsf{poly}} \cdot T_{\mathsf{commit}} + N_{\mathsf{poly}} \cdot (T_{\mathsf{IFFT}} + T_{\mathsf{HighDegreeFFT}}) \ + \\ &N_{\mathsf{gate}} \cdot (T_{\mathsf{gateeval}} + T_{\mathsf{HighDegreeIFFT}}) + T_{\mathsf{batchopen}} \end{align*}

Furthermore, according to the Halo2 benchmarks from the Tachyon repository, it can be observed that the majority of the time is spent on the build h poly step, which is directly related to Q.

Protocol Explanation

This image is taken from the paper, and describes HyperPlonk and HyperPlonk+.

ZeroCheck → SumCheck

In HyperPlonk, the way constraints are imposed is not fundamentally different from Plonk. For example, to represent an addition gate, the following expression can be used:

A(\bm{X}) + B(\bm{X}) - C(\bm{X})

Plonk uses a ZeroCheck followed by a QuotientCheck, as shown below:

P(X) = 0 \ \forall X \in H \rightarrow P(X) = Q(X)\cdot T(X)

where H := {ω^0, ω^1, …, ω^{d−1}} and ω is a d-th root of unity.
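As a minimal sketch of this QuotientCheck, the snippet below works over a toy prime field p = 97 with a multiplicative subgroup H of size 4: a polynomial P vanishes on H exactly when the vanishing polynomial t(X) = X^4 − 1 divides it. The example P is constructed to vanish on H, and the helper names are ours.

```python
# QuotientCheck sketch over a toy field (assumption: p = 97, |H| = 4):
# P vanishes on H  <=>  the vanishing polynomial t(X) = X^4 - 1 divides P.
P_MOD = 97
OMEGA = pow(5, (P_MOD - 1) // 4, P_MOD)        # 5 generates the group mod 97
H = [pow(OMEGA, i, P_MOD) for i in range(4)]

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P_MOD
    return out

def poly_divmod(num, den):
    """Schoolbook long division on coefficient lists (lowest degree first)."""
    num = list(num)
    q = [0] * (len(num) - len(den) + 1)
    inv_lead = pow(den[-1], -1, P_MOD)
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1] * inv_lead % P_MOD
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q[i] * d) % P_MOD
    return q, num                               # quotient, remainder

def poly_eval(coeffs, x):
    return sum(c * pow(x, k, P_MOD) for k, c in enumerate(coeffs)) % P_MOD

t = [P_MOD - 1, 0, 0, 0, 1]                    # t(X) = X^4 - 1
factor = [3, 1]                                # an arbitrary factor X + 3
P_poly = poly_mul(t, factor)                   # vanishes on H by construction

assert all(poly_eval(P_poly, h) == 0 for h in H)
Q_poly, rem = poly_divmod(P_poly, t)
assert all(r == 0 for r in rem)                # remainder 0 <=> P is zero on all of H
assert Q_poly == factor
```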

HyperPlonk instead uses a ZeroCheck followed by a SumCheck:

P(\bm{X}) = 0 \ \forall \bm{X} \in B_\mu \rightarrow \sum_{\bm{b} \in B_\mu}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot P(\bm{b}) = 0

where B_μ := {0, 1}^μ is the boolean hypercube.
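A quick sketch of the reduction, using the Fibonacci witness from the earlier example and a toy prime field p = 97: since the addition gate A + B − C vanishes on every point of the hypercube, the eq~-weighted sum is zero regardless of which random point r the verifier picks. The helper names (`eq_tilde`, `gate`) are ours.

```python
# ZeroCheck -> SumCheck sketch with the Fibonacci witness (toy field p = 97).
import random

P_MOD = 97
A = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 3}
B = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 5}
C = {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 8}
hypercube = [(0, 0), (0, 1), (1, 0), (1, 1)]

def eq_tilde(x, b):
    """Multilinear eq polynomial: equals 1 when x == b on the cube, 0 otherwise."""
    out = 1
    for xi, bi in zip(x, b):
        out = out * (xi * bi + (1 - xi) * (1 - bi)) % P_MOD
    return out

def gate(b):
    """Addition gate A + B - C, which vanishes on the whole hypercube."""
    return (A[b] + B[b] - C[b]) % P_MOD

r = tuple(random.randrange(P_MOD) for _ in range(2))       # verifier's random point
claim = sum(eq_tilde(r, b) * gate(b) for b in hypercube) % P_MOD
assert claim == 0            # a nonzero gate value would break this w.h.p. over r
```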

High-degree custom gates in HyperPlonk

| (X_0, X_1) | A | B | C |
| ---------- | - | - | - |
| (0, 0)     | 1 | 1 | 2 |
| (0, 1)     | 1 | 2 | 3 |
| (1, 0)     | 2 | 3 | 5 |
| (1, 1)     | 3 | 5 | 8 |

Why does using SumCheck eliminate the overhead for high-degree custom gates? Using the same example as above, the prover's claim can be expressed as follows:

\sum_{i = 0}^2 y^i \cdot \hat{s}_i(\bm{X}) \cdot \hat{f}_i(\bm{X}) \stackrel{?}{=} 0, \text{ where} \\ \hat{s}_i(\bm{X}) = \sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot s_i(\bm{b}), \\ \begin{align*} \hat{f}_0(\bm{X}) &= \Big(\sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot A(\bm{b})\Big)\Big(\sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot B(\bm{b})\Big)\Big(\sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot C(\bm{b})\Big) - 1, \\ \hat{f}_1(\bm{X}) &= \Big(\sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot A(\bm{b})\Big)\Big(\sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot B(\bm{b})\Big) - \sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot C(\bm{b}), \\ \hat{f}_2(\bm{X}) &= \sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot A(\bm{b}) + \sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot B(\bm{b}) - \sum_{\bm{b} \in \{0, 1\}^{\mu}}\widetilde{\mathsf{eq}}(\bm{X}, \bm{b})\cdot C(\bm{b}) \end{align*}

In the SumCheck protocol, the univariate polynomial that the prover sends to the verifier in each round has degree at most the individual degree of the multivariate polynomial in that round's variable. Thus, the prover only needs to send univariate polynomials whose degree is at most the degree of the custom gate plus 1.

Consider how such a univariate round polynomial is constructed, starting from a multilinear polynomial (degree 1 in each variable). A multilinear polynomial in 2 variables can be expressed as:

\begin{align*} P(X_0, X_1) = \ &(1 - X_0)\cdot(1 - X_1)\cdot v_0 + (1 - X_0)\cdot X_1 \cdot v_1 \ + \\ &X_0\cdot (1-X_1) \cdot v_2 + X_0 \cdot X_1 \cdot v_3 \end{align*}

Now let's construct a degree-1 univariate polynomial P_0 from the multilinear polynomial above:

\begin{align*} P_0(X) &= \sum_{X_1 \in \{0, 1\}} P(X, X_1) \\ &= (1 - X)\cdot v_0 + (1 - X) \cdot v_1 + X \cdot v_2 + X \cdot v_3 \\ &= v_0 + v_1 + (v_2 + v_3 - v_0 - v_1) \cdot X \end{align*}

The complexity of this computation in round i (starting from 0) is O(2^{μ−i−1}).
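A tiny check of the formula above, with arbitrary values v_0, …, v_3: summing P(X, X_1) over X_1 ∈ {0, 1} gives exactly the degree-1 polynomial v_0 + v_1 + (v_2 + v_3 − v_0 − v_1)·X.

```python
# Check of the round-0 polynomial: sum over X_1 of P(X, X_1) equals
# v_0 + v_1 + (v_2 + v_3 - v_0 - v_1) * X. The values v_i are arbitrary.
v = [7, 3, 10, 4]

def P(x0, x1):
    return ((1 - x0) * (1 - x1) * v[0] + (1 - x0) * x1 * v[1]
            + x0 * (1 - x1) * v[2] + x0 * x1 * v[3])

def P0(x):                                    # prover's round-0 univariate
    return sum(P(x, x1) for x1 in (0, 1))

for x in range(5):                            # a degree-1 identity, so 2 points suffice
    assert P0(x) == v[0] + v[1] + (v[2] + v[3] - v[0] - v[1]) * x
```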

For higher-degree multivariate polynomials, the same method is applied to each component, and the results are multiplied. For example, with f_0:

f_0(X) = A_0(X)\cdot B_0(X) \cdot C_0(X) - 1, \\ \text{where } A_0(X) = 3X + 2,\ B_0(X) = 5X + 3,\ C_0(X) = 8X + 5
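The concrete component polynomials can be checked directly against the HyperPlonk table above. The sketch below recomputes A_0, B_0, and C_0 as round-0 sums of the multilinear extensions and compares them with 3X + 2, 5X + 3, and 8X + 5 (the helper names are ours).

```python
# Recompute A_0, B_0, C_0 from the HyperPlonk table and compare with the
# closed forms 3X + 2, 5X + 3, 8X + 5 given above.
tables = {
    "A": {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 3},
    "B": {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 5},
    "C": {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 8},
}

def mle(table, x0, x1):
    """Multilinear extension of a 2-variable table."""
    return sum((x0 if b0 else 1 - x0) * (x1 if b1 else 1 - x1) * val
               for (b0, b1), val in table.items())

def round0(table, x):
    """Round-0 univariate: sum the extension over the remaining variable X_1."""
    return sum(mle(table, x, x1) for x1 in (0, 1))

expected = {"A": lambda x: 3 * x + 2, "B": lambda x: 5 * x + 3, "C": lambda x: 8 * x + 5}
for name, table in tables.items():
    assert all(round0(table, x) == expected[name](x) for x in range(4))
```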

Let's analyze the complexity for each round i (starting from 0):

  • Univariate Polynomial Construction: O(n · 2^{μ−i−1}), where n is the number of components.

  • Polynomial Multiplication: O(n^2)

Since the total number of rounds is μ, the total complexity works out to O(n · 2^μ + μ · n^2), as follows:

\sum_{i = 0}^{\mu - 1}\big(O(n \cdot 2^{\mu - i - 1}) + O(n^2)\big) = O(n \cdot (2^{\mu} - 1)) + O(\mu \cdot n^2) = O(n \cdot 2^{\mu} + \mu \cdot n^2)

If n is treated as a constant, the SumCheck protocol allows the ZeroCheck to be performed in linear time. For reference, the complexity calculation here differs from the one provided in the paper. For more details, refer to Section 3.1 (Computing SumCheck for high-degree polynomials) and Section 3.8 of the original paper, which introduce further optimizations for the SumCheck protocol.

ProductCheck

ProductCheck is a method to verify whether the product of a polynomial over the boolean hypercube equals a constant c:

\prod_{\bm{X} \in \{0, 1\}^\mu}h(\bm{X}) = c, \text{ where } h(\bm{X}) = \frac{f(\bm{X})}{g(\bm{X})}

For univariate polynomials, this is typically computed using a running product column, as explained in From Halo2, LogUp to LogUp-GKR. For multivariate polynomials, the process involves the following steps:

  1. Define v to satisfy the following conditions:

v(0, \bm{X}) = h(\bm{X}), \ v(1, \bm{X}) = v(\bm{X}, 0) \cdot v(\bm{X}, 1), \text{ where } v(1, \dots, 1) = 0

This allows v to be computed as shown in the table below:

| (X_0, X_1, X_2) | f(X)    | g(X)    | v(X) |
| --------------- | ------- | ------- | ---- |
| (0, 0, 0)       | f(0, 0) | g(0, 0) | f(0, 0) / g(0, 0) |
| (0, 0, 1)       | f(0, 1) | g(0, 1) | f(0, 1) / g(0, 1) |
| (0, 1, 0)       | f(1, 0) | g(1, 0) | f(1, 0) / g(1, 0) |
| (0, 1, 1)       | f(1, 1) | g(1, 1) | f(1, 1) / g(1, 1) |
| (1, 0, 0)       |         |         | f(0, 0)·f(0, 1) / (g(0, 0)·g(0, 1)) |
| (1, 0, 1)       |         |         | f(1, 0)·f(1, 1) / (g(1, 0)·g(1, 1)) |
| (1, 1, 0)       |         |         | f(0, 0)·f(0, 1)·f(1, 0)·f(1, 1) / (g(0, 0)·g(0, 1)·g(1, 0)·g(1, 1)) |
| (1, 1, 1)       |         |         | 0 |

  2. Define a new function ĥ as follows:

\hat{h}(X_0, \bm{X}) = (1 - X_0)\cdot\big(v(1, \bm{X}) - v(\bm{X}, 0)\cdot v(\bm{X}, 1)\big) + X_0\cdot \big(g(\bm{X})\cdot v(0, \bm{X}) - f(\bm{X})\big)
  3. Using ĥ, perform ZeroCheck → SumCheck as described earlier. Since ĥ evaluates to zero at every point of {0, 1}^3, this check ensures the validity of v.

  4. Finally, the verifier queries v at (1, …, 1, 0) to confirm that the product of h equals c (see the numeric sketch after this list).
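Here is a numeric sketch of steps 1-4 for μ = 2 over a toy prime field p = 97; the values of f and g are arbitrary nonzero placeholders, and the SumCheck itself is elided. The sketch only builds v, confirms that ĥ vanishes on the whole cube, and queries v(1, …, 1, 0).

```python
# ProductCheck sketch for mu = 2 over a toy field p = 97; f and g are
# arbitrary nonzero placeholders, and the SumCheck itself is elided.
P_MOD = 97
f = {(0, 0): 3, (0, 1): 4, (1, 0): 5, (1, 1): 6}
g = {(0, 0): 2, (0, 1): 7, (1, 0): 9, (1, 1): 11}
inv = lambda a: pow(a, -1, P_MOD)
h = {b: f[b] * inv(g[b]) % P_MOD for b in f}
cube = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Step 1: build v layer by layer, exactly as in the table above.
v = {(0,) + b: h[b] for b in cube}                        # v(0, X) = h(X)
v[(1, 0, 0)] = v[(0, 0, 0)] * v[(0, 0, 1)] % P_MOD        # v(1, X) = v(X, 0) * v(X, 1)
v[(1, 0, 1)] = v[(0, 1, 0)] * v[(0, 1, 1)] % P_MOD
v[(1, 1, 0)] = v[(1, 0, 0)] * v[(1, 0, 1)] % P_MOD        # running product of all of h
v[(1, 1, 1)] = 0

# Step 2: the zero-checked polynomial h_hat, evaluated on hypercube points.
def h_hat(x0, x):
    left = (v[(1,) + x] - v[x + (0,)] * v[x + (1,)]) % P_MOD
    right = (g[x] * v[(0,) + x] - f[x]) % P_MOD
    return ((1 - x0) * left + x0 * right) % P_MOD

# Step 3: ZeroCheck holds on all of {0,1}^3.
assert all(h_hat(x0, x) == 0 for x0 in (0, 1) for x in cube)

# Step 4: the query at (1, ..., 1, 0) returns the claimed product c.
c = 1
for b in cube:
    c = c * h[b] % P_MOD
assert v[(1, 1, 0)] == c
```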

Wiring Identity → MultiSetEquality

For univariate polynomials, verifying the Wiring Identity involves constructing s_i^σ as shown in the figure below. The green arrows denote how the labels are moved.

This is then transformed into a MultiSetEquality check using a random value α:

\frac{\prod_{i = 0}^3(X - \alpha s_0(\omega^i) - A(\omega^i))(X - \alpha s_1(\omega^i) - B(\omega^i))(X - \alpha s_2(\omega^i) - C(\omega^i))}{\prod_{i = 0}^3(X - \alpha s^{\sigma}_0(\omega^i) - A(\omega^i))(X - \alpha s^{\sigma}_1(\omega^i) - B(\omega^i))(X - \alpha s^{\sigma}_2(\omega^i) - C(\omega^i))} \stackrel{?}{=} 1

For multivariate polynomials, s_i^σ is constructed similarly, as shown in the figure below.

Using the same random value α, this is transformed into a MultiSetEquality as follows:

\frac{\prod_{\bm{b} \in \{0, 1\}^2}(\bm{X} - \alpha s_0(\bm{b}) - A(\bm{b}))(\bm{X} - \alpha s_1(\bm{b}) - B(\bm{b}))(\bm{X} - \alpha s_2(\bm{b}) - C(\bm{b}))}{\prod_{\bm{b} \in \{0, 1\}^2}(\bm{X} - \alpha s^{\sigma}_0(\bm{b}) - A(\bm{b}))(\bm{X} - \alpha s^{\sigma}_1(\bm{b}) - B(\bm{b}))(\bm{X} - \alpha s^{\sigma}_2(\bm{b}) - C(\bm{b}))} \stackrel{?}{=} 1

MultiSetEquality can now be replaced by the ProductCheck protocol described earlier. This enables efficient verification using the established steps for product consistency.
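As a minimal illustration of the underlying idea (not the full wiring argument), the sketch below folds hypothetical (value, label) pairs into single field elements with a random α and checks that two multisets that are permutations of each other yield the same grand product at a random X.

```python
# MultiSetEquality via grand products (hypothetical wiring values).
import random

P_MOD = 97
left = [(3, 7), (5, 2), (3, 7), (8, 1)]       # hypothetical (value, label) pairs
right = [(8, 1), (3, 7), (5, 2), (3, 7)]      # the same multiset, reordered

alpha = random.randrange(1, P_MOD)             # folds each pair into one field element
x = random.randrange(P_MOD)                    # random evaluation point

def grand_product(pairs):
    prod = 1
    for value, label in pairs:
        prod = prod * (x - value - alpha * label) % P_MOD
    return prod

# Equal multisets give identical products; a mismatch would differ w.h.p. over x.
assert grand_product(left) == grand_product(right)
```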

For handling Permutation PIOP on small fields, Section 3.6 of the paper provides a detailed explanation. Interested readers are encouraged to explore that section for additional insights.

Shift Function for Lookup Argument

Incorporating Plookup into HyperPlonk creates HyperPlonk+.

F(\beta, \gamma) := (1 + \beta)^n \cdot \prod_{i \in [n]}(\gamma + f_i) \cdot \prod_{i \in [d-1]} (\gamma(1 + \beta) + t_i + \beta \cdot t_{i + 1}) \\ G(\beta, \gamma) := \prod_{i \in [n + d - 1]} (\gamma(1 + \beta) + s_i + \beta \cdot s_{i + 1})

where s is (f, t) sorted by t.

The equation above, taken from the Plookup paper, shows how the values t_i, s_i and their neighbors t_{i+1}, s_{i+1} are queried at both the i-th and (i+1)-th indices. For univariate polynomials in Plonk, this "shift through the domain" is easily expressed with a shift function:

\mathsf{next}(X) = \omega \cdot X

For multivariate polynomials, the shift function can be represented using a mechanism similar to the one illustrated below (also from the paper):

Consider g_4(b_1, b_2, b_3, b_4) in the context of a 4-dimensional boolean hypercube. The function g_4 operates as follows:

\begin{align*} g_4(b_1, b_2, b_3, b_4) &= (b_1 + b_2X + b_3X^2 + b_4X^3) \cdot X \\ &= b_1X + b_2X^2 + b_3X^3 + b_4X^4 \\ &= -b_4 + (b_1 - b_4)X + b_2X^2 + b_3X^3 \\ &= b_4 + (b_1 + b_4)X + b_2X^2 + b_3X^3 \\ &= b_4 + (b_1 \oplus b_4)X + b_2X^2 + b_3 X^3 \\ &= (b_4, b_1 \oplus b_4, b_2, b_3) \\ &= (b_4, b_1', b_2', b_3'), \end{align*} \\ \text{where } p_4 = X^4 + X + 1

This cyclically shifts all non-zero points in the boolean hypercube.

1: g_4(0, 0, 0, 1) = (1, 1, 0, 0) \\ 2: g_4(1, 1, 0, 0) = (0, 1, 1, 0) \\ 3: g_4(0, 1, 1, 0) = (0, 0, 1, 1) \\ 4: g_4(0, 0, 1, 1) = (1, 1, 0, 1) \\ 5: g_4(1, 1, 0, 1) = (1, 0, 1, 0) \\ 6: g_4(1, 0, 1, 0) = (0, 1, 0, 1) \\ 7: g_4(0, 1, 0, 1) = (1, 1, 1, 0) \\ 8: g_4(1, 1, 1, 0) = (0, 1, 1, 1) \\ 9: g_4(0, 1, 1, 1) = (1, 1, 1, 1) \\ 10: g_4(1, 1, 1, 1) = (1, 0, 1, 1) \\ 11: g_4(1, 0, 1, 1) = (1, 0, 0, 1) \\ 12: g_4(1, 0, 0, 1) = (1, 0, 0, 0) \\ 13: g_4(1, 0, 0, 0) = (0, 1, 0, 0) \\ 14: g_4(0, 1, 0, 0) = (0, 0, 1, 0) \\ 15: g_4(0, 0, 1, 0) = (0, 0, 0, 1)
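The cycle above can be reproduced with a few lines of code. The sketch below implements g_4 as multiplication by X modulo p_4 = X^4 + X + 1 over GF(2), which collapses to the coefficient shuffle (b_4, b_1 ⊕ b_4, b_2, b_3), and checks that iterating it from (0, 0, 0, 1) visits all 15 nonzero points before returning.

```python
# g_4 as multiplication by X modulo p_4 = X^4 + X + 1 over GF(2):
# X^4 reduces to X + 1, so the coefficients shuffle to (b_4, b_1 XOR b_4, b_2, b_3).
def g4(b):
    b1, b2, b3, b4 = b
    return (b4, (b1 + b4) % 2, b2, b3)

point = (0, 0, 0, 1)
seen = []
for _ in range(15):
    point = g4(point)
    seen.append(point)

assert len(set(seen)) == 15            # every nonzero point of {0,1}^4 is visited once
assert (0, 0, 0, 0) not in seen        # the all-zero point is never reached
assert seen[-1] == (0, 0, 0, 1)        # and the walk returns to its starting point
```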

The XOR operation can be algebraically expressed as:

X \oplus Y = X + Y - 2 \cdot X \cdot Y

However, this representation is quadratic, leading to an explosion in degree when using nested functions like f(g(X)).

Luckily, g_4 can be simplified based on specific rules:

g_4(b_1, b_2, b_3, b_4) = (b_1', b_2', b_3', b_4') = \begin{cases} (1, 1 - b_1, b_2, b_3) & \text{if } b_4 = 1 \\ (0, b_1, b_2, b_3) & \text{if } b_4 = 0 \end{cases}

This reduces the complexity, allowing g_4 to be expressed as a linear function rather than a quadratic one. This is depicted in the formula below:

Using this linearized shift function, MultiSetEquality Check can be efficiently performed. By leveraging the defined Polynomial IOP, this shift function enables the protocol to verify lookup arguments in HyperPlonk+ without significant overhead.

This also implies that the domain used in the circuit must be represented differently. Specifically, the all-zero point (0, 0) cannot be included because it is not part of the cycle. Instead, the cyclic domain should exclude (0, 0) and follow this order in the case of 4 or 8 rows:

Conclusion

We have explored how HyperPlonk operates through examples. Unlike Plonk, HyperPlonk introduces a method to leverage the boolean hypercube for proofs. This approach eliminates the computational overhead caused by FFTs as the maximum degree of custom gates increases, resulting in faster proof generation.

The figure above is a capture from the presentation by Benedikt Bünz at ZK Summit 8 (YouTube link).

Graph b above and the video screenshot both support this observation. While not covered, HyperPlonk has also contributed to the optimization of multivariate polynomial commitments, as seen in systems like Orion+. Despite these significant contributions, the Conclusions and Open Problems section of the paper highlights several areas for future work, introducing ideas for continued improvements and research.

Written by ryan Kim from A41
