Binius
This article aims to intuitively explain the objectives and process of the Binius protocol.
Introduction
In recent STARKs, there has been a trend of using smaller fields to achieve faster performance, such as Goldilocks (64-bit) and smaller 32-bit fields like Baby Bear and Mersenne 31. Naturally, this leads to the question: Can we use the smallest field, the binary field?
Given that computers are optimized for bit operations, using a binary field could promise better performance. Additionally, if we can express constraints efficiently using the binary field, this would allow us to handle binary operations in commonly used hash functions like Keccak more effectively.
While the STARK paper attempted to use binary fields, it faced challenges:
The binary field was too small to serve as an "alphabet" in Reed-Solomon (RS) codes compared to the block length, necessitating embedding.
As noted in HyperPlonk, using a shift function to traverse the entire evaluation domain introduced overhead.
This article will explore how Binius overcomes these issues.
Background
Embedding Overhead
Embedding overhead refers to the inefficiency caused when representing values smaller than a given field and filling the space difference with zero padding or similar techniques. This overhead often occurs during range checks or when binary operations are arithmetized. For example, in the Keccak-256 circuit, out of 2,531 columns, 2,328 columns each represent a single bit, but since each column can hold up to 64 bits, there is a massive embedding overhead.
Binary Field
Addition in a binary field operates as follows:
This is equivalent to the XOR operation. Since x + (-x) = 0 defines -x as the negative of x, and 0 + 0 = 0 and 1 + 1 = 0, the negative of any element in a binary field is the element itself. In other words, subtraction is equivalent to addition:
Furthermore, a doubling operation such as always equals 0, since it becomes a multiple of 2. Multiplication in a binary field is therefore defined as follows:
This is equivalent to the AND operation.
Binary Tower
A binary tower is a hierarchical structure of finite fields, where each field in the hierarchy is an extension of the previous field, and the base field is . In An Iterated Quadratic Extension of GF(2), it is stated that an extension field can be constructed as
This can be illustrated as follows:

For example, let’s represent 135 using a binary tower. Here, corresponds to , corresponds to , and corresponds to :
By continually substituting with , we can transform it into the form of a multilinear polynomial:
The equation above shows how we can transform a case of to a multilinear polynomial, while the equation below shows how we can transform a case of to a multilinear polynomial.
This allows for the canonical mapping of bitstrings. (For example, in the 3-bit space maps both 0 and 7 to 0.)
Here, the set of are coefficients of a multilinear polynomial.
In the extension field, addition can be performed using XOR as in the binary field. However, for multiplication, the following steps must be used:
Multiply both sides (for ):
Substitute with :
If , stop. Otherwise, repeat the process for , , , and .
Notably, using the Karatsuba algorithm, the number of multiplications can be reduced from 4 to 3:
According to On efficient inversion in tower fields of characteristic two, multiplication with this binary tower structure can be computed in .
Note that the classic method for constructing extension fields is as follows:
In this extension field, addition is still computed using XOR. For multiplication, polynomials of degree 127 are multiplied and then divided by .
Using Binary Towers for extension field operations provides an advantage over doing so with the classic method due to the following reasons:
In circuits, various data sizes (e.g., bits) can be represented efficiently with Binary Towers.
When expressing using 16 instances of , efficient embedding methods (packing) are available. In contrast, classic methods polynomial and are often less efficient.
Multiplying an element of with an element of is faster than multiplying two elements of . The cost is . In contrast, classic methods require embedding into , making them slower.
Protocol Explanation
Block-level Encoding-based PCS
In Brakedown, instead of Reed-Solomon (RS) codes, linear codes with linear encoding time are used. In comparison, Binius adopts the overall style of Brakedown while leveraging RS codes with the Singleton bound, which states that the maximum distance for -codes is . Furthermore, instead of using radix-2 FFT for RS encoding in a binary field, additive FFT (paper) is used; however, the size of the alphabet must be at least , but the size of the alphabet is far smaller than . To address this, we construct using, for example, 16 instances of . This method is called packing. Since the encoding is performed in blocks of , this is referred to as block-level encoding.
and evaluation point , let be an matrix consisting of the coefficients in the multilinear Lagrange basis. (For simplicity, let . As aforementioned in the Binary Towers section, . The evaluation can then be computed as follows:
Here, , and is the -th row of .
From the matrix , pack each row of length into a new matrix .
If the rate is , encode each row of the matrix into rows of length , forming an matrix . The prover sends a Merkle hash-based commitment of the matrix to the verifier. Note that while the polynomial represented by the matrix is , the polynomial represented by is a new polynomial .
The verifier samples and sends it to the prover.
The prover computes and , and sends both to the verifier. Note that is over , is over , and is over . As mentioned in Binary Towers, multiplication in this setting is performed efficiently.
The verifier samples for times, where is the repetition count, and sends to the prover (Remember ).
The verifier uses Merkle proofs to verify that is correct.
Finally, the verifier checks that packing and encoding were performed correctly: . Given that ranges between and , testing is performed at the block level in units of . This is referred to as block-level testing.
Pack Virtual Polynomial
Let us revisit the block-level encoding described earlier. In a Polynomial IOP, the polynomial that the verifier requests the prover to open is . However, the polynomial that the prover actually commits to and opens is , which is derived by packing into . The packing function is defined as follows:
Here, is the multilinear extension (MLE) of , and represents the multilinear Lagrange basis. For example, when and , the equation becomes:
Using a sumcheck protocol, this computation can be replaced with an evaluation claim such that:
Replacing with , the following equation is defined:
The following sum holds:
For example, when :
By applying a sumcheck protocol, this is replaced with an evaluation claim:
Thus, we obtain , allowing us to query the original polynomial through its virtual polynomial .
Shift Virtual Polynomial
As introduced in the previous article on HyperPlonk, the shift polynomial helps traverse the evaluation domain. For and shift offset , the shift mapping is defined such that, for , the condition holds. For example, shifting by (a single step) works as follows:
The shift operator is defined as:
The shift indicator can be expressed as follows: (Refer to the section 4.3. in the paper to see how it is algebraically expressed)
Thus, the shift polynomial is defined as:
For the earlier example:
By applying the sumcheck protocol, we get the evaluation claim:
This leads to:
The following sum holds:
This allows us to query the shift polynomial using the virtual polynomial and .
Conclusion
While this article does not cover all the details, it addresses the key aspects of utilizing the binary field. To enable RS encoding using the binary field, we applied a modified version of Brakedown with packing and used the shift operator to traverse the domain efficiently.

The image above is captured from the Binius paper. The first row, Hyrax, shows the PCS used in Lasso, while the second and third rows show FRI, the PCS used in Plonky3. In Plonky3, even when data smaller than 31 bits is used, there is no speed difference, so the data is omitted for efficiency. In the case of Binius, while it is slower than Plonky3 Keccak-256 for 32-bit data, it significantly reduces the commitment time for 1-bit data.

The image above is captured from the Binius paper. It turns out Binius is almost 2 times faster to commit, around 2~3 times faster to prove, and around 5~6 times faster to verify.
Last updated