Signed Bucket Index
Presentation: https://youtu.be/4bhCdM7B-Mk
AKA "NAF (non-adjacent form) method"
Note: this method is best used when negation for a group is cheap such as for an elliptic curve.
Motivation
When following a bucket-based method like Pippenger's algorithm, the amount of memory required for the buckets is demanding. How can we reduce the number of buckets per window from to , where is the number of bits per window, and reduce overhead?
Explanation
Naive Pippenger's
Let's take a look at an example of creating buckets with Pippenger's algorithm. Let's say the number of windows and the bit size per window . We'll look at the following example where denotes scalars and refers to points on an elliptic curve for the MSM problem.
With out buckets , we create our windows :
Introducing NAF for buckets
Note that any computation can be reduced to where is the bit width of .
This means we can define the following NAF bucket method for window as such:
If :
Accumulate into the -th bucket
If :
, then
Accumulate into the -th bucket
Add into the next window
Step 2c computes the part of the reduction and is also known as the "carry."
Let's change our naive Pippenger's to follow this method and generate our buckets and windows:


Notice that this method requires the presence of an extra carry bucket . This extra carry bucket should not exist. We'll take a look later at how to prevent this after this Naive Pippenger's to NAF method is fully shown.
Ashley Jeong: I'm unsure why they don't just add into the bucket. Perhaps this is a source of optimization when implementing MSM.
Bucket Accumulation
Bucket Reduction
Window Reduction
Preventing the formation of an extra bucket
1. Ensuring last window bit size is within limit
The formation of an extra carry bucket depends on the true bit size of the final window in relation to , the total bits of , and , the bits per windows.
Safe Case:
Example: , ⇒
Then , and
Since , carry cannot occur ⇒ It’s safe to assume the final carry is 0.
Carry Case:
Example: , ⇒
Here , and
Since it is possible that , carry can occur
Thus, if the restriction is held, we can ensure there will be no new carry bucket created.
2. Naive Fix: Use Larger Bucket Only for Final Window
if (is_last_window) {
bucket_size = size_t{1} << ctx_.window_bits;
} else {
bucket_size = size_t{1} << (ctx_.window_bits - 1);
}
Use a larger bit-width bucket for the final window to absorb potential carry.
Cons: This is more complex and can be inefficient due to extra bucket space.
3. Use a Larger Than the Actual Modulus Bit Size
Let’s say we are working with BN254, where and . The actual scalar field is 254 bits.
We express as:
Let be the modulus of BN254. Since the maximum value of is , we have:
This ensures that no overflow occurs in the final window.
M = 21888242871839275222246405745257275088548364400416034343698204186575808495617
print((M - 1) // (2**(16 * 15)) + 1 < 2 ** 14) # True
Yrrid’s Trick
Yrrid, the Zprize winner in 2022 for the category "Prize 1a: Accelerating MSM Operations on GPU" explains one more optimization they used for their solution to guarantee that no carry occurs in the final window. When the MSB of is set to 1, they apply the following transformation to reset the MSB to 0, thereby preventing any carry:
Here, is the modulus of the scalar field (i.e., curve order), and .
Let denote the -bit chunk in the final window. Yrrid's trick is said to ensure that doesn't overflow.
Though this is what they state, it appears as if overflow can still occur as shown in the following example:
For example, in with , the values where the MSB of is set to 1 are 8, 9, and 10. This means that -8, -9, and -10 will be transformed into 3, 2, and 1, respectively. With this transformation, the new maximum value of becomes . If , then:
In this case with the maximum value of as 7, will be at most 1, meaning it does not satisfy its limit constraints:
Code Example
Here's a code snippet from Tachyon implementing NAF for bucket MSM:
void FillDigits(uint32_t kLambda, uint32_t s, const ScalarField& f, std::vector<int64_t>* digits) {
// The size of `digits` must be equal to ceil(lambda / s)
// (lambda: total number of bits, s: window size in bits)
CHECK_EQ(digits->size(), static_cast<size_t>(std::ceil(kLambda / s)));
// `radix` is 2^s — the maximum range per window
uint64_t radix = uint64_t{1} << s;
// `carry` to be propagated to the next window
uint64_t carry = 0;
size_t bit_offset = 0;
for (size_t i = 0; i < digits->size(); ++i) {
// mᵢ: s-bit value extracted from the current window, including previous carry
uint64_t m_i = f.ExtractBits(bit_offset, s) + carry;
// Compute carry: if mᵢ is greater than or equal to radix / 2, carry = 1;
//. otherwise carry = 0
// This is based on the NAF rounding threshold (radix / 2)
carry = (m_i + (radix >> 1)) >> s;
// Convert mᵢ to its NAF form: mᵢ - carry * 2^s
(*digits)[i] = static_cast<int64_t>(m_i) - static_cast<int64_t>(carry << s);
bit_offset += s;
}
// Apply the final carry to the last digit
// This adjustment preserves the total scalar value
digits->back() += (carry << s);
}
void AddBasesToBuckets(uint32_t s, const std::vector<Point> bases,
const std::vector<std::vector<int64_t>>& digits,
size_t window_index, std::vector<Bucket>& buckets) {
for (size_t i = 0; i < scalar_digits.size(); ++i) {
int64_t scalar = scalar_digits[i][window_index];
if (0 < scalar) {
// If the scalar is positive:
// Place the corresponding base into the (scalar - 1)-th bucket
//
// Why `scalar - 1`?
// Because scalars range from 1 to 2^{s-1} - 1 in NAF,
// but bucket indices are 0-based (0 to 2^{s-1} - 2)
// so we subtract 1 to get the correct index.
buckets[static_cast<uint64_t>(scalar - 1)] += bases[i];
} else if (0 > scalar) {
// If the scalar is negative:
// Place the *negated* base into the (-scalar - 1)-th bucket
//
// Why `-scalar - 1`?
// Because negative scalars range from -1 to -2^{s-1},
// and we want to map -1 → index 0, -2 → index 1, ..., just like for positives.
buckets[static_cast<uint64_t>(-scalar - 1)] -= bases[i];
}
}
}
References
Written by ryan Kim & Ashley Jeong of A41
Last updated