GKR
Presentation: https://youtu.be/OTCxQ9qIzDY
Overview
Although interactive proofs like Sumcheck enabled the verifier to efficiently check the validity of solutions to problems in complexity classes beyond NP, real world computing entities cannot solve a large instance of problems beyond NP practically.
Still, it can be pretty useful regarding more solvable problems in the real world (even for the verifier) if the verifier's cost is much less than solving it by itself and the prover's work is not increased asymptotically compared to what it takes to merely solve it.
Let be the circuit in layered form whose number of gates (or circuit size) is and circuit depth is . The GKR protocol is important in a sense that :
-cost is almost ; it is not much more than the cost to solve the circuit.
-cost is ; it's even less than just reading through the entire circuit.
Note: This needs the assumption that the circuit is logspace uniform, which implies that it has a succinct implicit description. That's why can verify even without looking into the entire .
It can function as a general-purpose protocol for any and to run an IP on any arithmetic circuit evaluation instance.
The protocol starts from sending the claimed output value to as the first message.
In a high level view, the protocol iterates from the first layer (output layer) to the bottom (input layer), where in each iteration the goal is to reduce the claim of the output values of the gates in layer to the claim of the output values of the gates in layer .
Going through all the way down, the first claim is reduced to that of input values, which are known to and thus can be checked by itself. Reduction in each iteration is done by the sumcheck protocol.

When a circuit is in layered form, it means every leaf appears in the last layer (at depth ). For a circuit that is not layered, there is a transformation with a blowup in size of factor .
Protocol details
Let be a fan-in 2 circuit over a finite field in a layered form, with
: the size of the circuit (number of total gates)
: the number of gates in layer (hence each gates in layer is labeled by )
: the depth
: the input length (i.e. )
and each gate is either addition or multiplication.
Without loss of generality let . Let to be a mapping which takes as input the binary representation of gate labels in layer and gives the gate output.
Define "wiring predicate" , which takes as input the gate label and gives the label of one of its input gates in the lower layer.
Lastly, define "gate switch" which takes as input the in/out gate labels and gives its instruction type. For example, supposing that gate in layer is an addition gate,
Functions with a tilde on top of the function name indicates the multilinear extension of the mapping.
(TODO: Make polynomial extension page and set backlinks to it in pages Spartan, Lasso, etc.)

Lemma 1
The multilinear extension of the gate output polynomial can be represented as below.
Proof.
Obviously both hands match at . As both hands are multilinear in and a multlinear extension of such mapping is uniquely defined at domain , both hands are identical.

Protocol
sends a function claimed to equal (the function that maps output gate labels to output values).
picks a random , and lets . The remainder of the protocol is devoted to confirming that .
Now iterate for . Each iteration is devoted to reduce the claim on to the claim on .
Define the -variate polynomial:
's claim is identical to that .
So that may check this claim, and apply the sumcheck protocol to , up until 's final check in that protocol, where must evaluate at a randomly chosen point .
Note that only needs to know and and not the entire polynomial .
Now we need to reduce the two claims on and to one claim. Let be the unique line satisfying and . sends a univariate polynomial of degree at most to , claimed to equal restricted to .
now performs the final check in the sumcheck protocol, using and in place of and .
Note we assume here that for each layer , can evaluate the multilinear extensions and in polylogarithmic time. This is necessary for keeping the verifier cost sublinear to circuit size .
chooses at random and sets and .
Note that . The claim is successfully reduced to that of the next layer.
After the final round, directly checks . As knows the input vector, it can evaluate at any point with time complexity where , i.e. the length of input.

For a more detailed explanation and analysis on costs and soundness, please refer to Section 4.1 and 4.2 of the original paper.
In particular, the prover cost is reduced from of the naive method of sumcheck protocol to of the optimized method which exploits the fact that and are sparse and structured and thus can be evaluated very efficiently. That is, as most of the evaluations are guaranteed to be zero, instead of passing over the entire evaluation domain, it passes only over the values that are known to be nonzero. Furthermore, the prover cost can be reduced to when the computation can be parallelized by data, as shown in the next section.
Similarly in 's cost analysis, the cost for evaluating and is considered to be lower order and omitted. This owes to repeatedly structured wiring patterns that commonly arise in circuits of real world problems in interest. Even without structured wiring patterns, we also have an option to let the verifier itself commit to and (with preprocessing time) using polynomial commitment scheme and let the prover open it each time in need, which turns the interactive proof to an interactive argument.
Leveraging Data Parallelism
When the circuit has the data parallelism property; that is, the same sub-computation is applied independently on different pieces of data and then possibly aggregated, the performance of GKR is greatly improved. Fortunately, data parallel computation is pervasive in real-world computing, including the use cases in ZK era such as LogUp-GKR, Lasso, Libra and Ceno.

Let be a circuit of size with an arbitrary wiring pattern, and let be a "super-circuit" of size that applies independently to input data before aggregating results. Labels of gates consists of two indices, one for the location in the layer and the other for which circuit among the copies to choose. For example, .
Let denote the polynomial defined by
where
Then extends , just as in the single data case. However since is not multilinear like before, we multi-linearize , and represent the multilinear extension of as below. For any ,
where
Note that as the same circuit structure is repeated, and do not depend on , which significantly reduces the cost of evaluating them for both parties and mitigates the bottleneck.
The protocol is the same as the single data version, except for that at layer , the sumcheck protocol is applied to instead.

Check Section 5 of the paper for detailed cost analysis.
References
Last updated