Signed Bucket Index

Presentation: https://youtu.be/4bhCdM7B-Mk

AKA "NAF (non-adjacent form) method"

Note: this method is best used when negation for a group is cheap such as for an elliptic curve.

Motivation

When following a bucket-based method like Pippenger's algorithm, the amount of memory required for the buckets is demanding. How can we reduce the number of buckets per window from 2s12^s-1 to 2s12^{s-1}, where ss is the number of bits per window, and reduce overhead?

Explanation

Let's take a look at an example of creating buckets with Pippenger's algorithm. Let's say the number of windows λs=2\lceil\frac{\lambda}{s}\rceil=2 and the bit size per window s=3s = 3. We'll look at the following example where kk denotes scalars and PP refers to points on an elliptic curve for the MSM problem.

S=k1P1+k2P2+k3P3+k4P4+k5P5+k6P6+k7P7=57P1+50P2+43P3+36P4+29P5+22P6+15P7\begin{align*} S &= k_1P_1 + k_2P_2 + k_3P_3+ k_4P_4 + k_5P_5 + k_6P_6 + k_7P_7 \\ &= 57P_1 + 50P_2 + 43P_3 + 36P_4 + 29P_5 + 22P_6 + 15P_7 \\ \end{align*}
k1=57=120021022window 1123124125window 2=k1,1+k1,22s=1+7(23)k2=50=020121022window 1023124125window 2=k2,1+k2,12s=2+6(23)k3=43=120121022window 1123024125window 2=k3,1+k3,22s=3+5(23)k4=36=020021122window 1023024125window 2=k4,1+k4,22s=4+4(23)k5=29=120021122window 1123124025window 2=k5,1+k5,22s=5+3(23)k6=22=020121122window 1023124025window 2=k6,1+k6,22s=6+2(23)k7=15=120121122window 1123024025window 2=k7,1+k7,22s=7+1(23)\begin{align*} \begin{array}{ll} \begin{aligned} k_1 = 57 &= \overbrace{\underbrace{1}_{2^0}\underbrace{0}_{2^1}\underbrace{0}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{1}_{2^3}\underbrace{1}_{2^4}\underbrace{1}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{1,1} + k_{1,2}2^s \\ &= 1 + 7(2^3) \end{aligned} \quad\quad & \begin{aligned} k_2 = 50 &= \overbrace{\underbrace{0}_{2^0}\underbrace{1}_{2^1}\underbrace{0}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{0}_{2^3}\underbrace{1}_{2^4}\underbrace{1}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{2,1} + k_{2,1}2^s \\ &= 2 + 6(2^3) \end{aligned} \\[8ex] \begin{aligned} k_3 = 43 &= \overbrace{\underbrace{1}_{2^0}\underbrace{1}_{2^1}\underbrace{0}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{1}_{2^3}\underbrace{0}_{2^4}\underbrace{1}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{3,1} + k_{3,2}2^s \\ &= 3 + 5(2^3) \end{aligned} & \begin{aligned} k_4 = 36 &= \overbrace{\underbrace{0}_{2^0}\underbrace{0}_{2^1}\underbrace{1}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{0}_{2^3}\underbrace{0}_{2^4}\underbrace{1}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{4,1} + k_{4,2}2^s \\ &= 4 + 4(2^3) \end{aligned} \\[8ex] \begin{aligned} k_5 = 29 &= \overbrace{\underbrace{1}_{2^0}\underbrace{0}_{2^1}\underbrace{1}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{1}_{2^3}\underbrace{1}_{2^4}\underbrace{0}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{5,1} + k_{5,2}2^s \\ &= 5 + 3(2^3) \end{aligned} & \begin{aligned} k_6 = 22 &= \overbrace{\underbrace{0}_{2^0}\underbrace{1}_{2^1}\underbrace{1}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{0}_{2^3}\underbrace{1}_{2^4}\underbrace{0}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{6,1} + k_{6,2}2^s \\ &= 6 + 2(2^3) \end{aligned} \\[8ex] \begin{aligned} k_7 = 15 &= \overbrace{\underbrace{1}_{2^0}\underbrace{1}_{2^1}\underbrace{1}_{2^2}}^{\mathclap{\text{window 1}}} \overbrace{\underbrace{1}_{2^3}\underbrace{0}_{2^4}\underbrace{0}_{2^5}}^{\mathclap{\text{window 2}}} \\[1ex] &= k_{7,1} + k_{7,2}2^s \\ &= 7 + 1(2^3) \end{aligned} \end{array} \end{align*}

With out buckets , we create our windows WW:

W1=P1+2P2+3P3+4P4+5P5+6P6+7P7=B1,1+2B2,1+3B3,1+4B4,1+5B5,1+6B6,1+7B7,1W2=P7+2P6+3P5+4P4+5P3+6P2+7P1=B1,2+2B2,2+3B3,2+4B4,2+5B5,2+6B6,2+7B7,2\begin{align*} W_1 &= P_1 + 2P_2 + 3P_3 + 4P_4 + 5P_5 +6P_6 + 7P_7 = B_{1,1} + 2B_{2,1} + 3B_{3,1} + 4B_{4,1} + 5B_{5,1} + 6B_{6,1} + 7B_{7,1}\\ W_2 &= P_7 + 2P_6 + 3P_5+4P_4 + 5P_3 + 6P_2 + 7P_1 = B_{1,2} + 2B_{2,2} + 3B_{3,2} + 4B_{4,2} + 5B_{5,2} + 6B_{6,2} + 7B_{7,2}\\ \end{align*}

Introducing NAF for buckets

Note that any computation kPkP can be reduced to (k2s)P+2sP(k-2^{s})P + 2^{s}P where ss is the bit width of kk.

This means we can define the following NAF bucket method k=NAF(k,P)k' = \mathsf{NAF}(k, P) for window WjW_j as such:

  1. If 0<k<2s10 < k < 2^{s-1}:

    1. k=kk' = k

    2. Accumulate PP into the kk-th bucket Bk,jB_{k,j}

  2. If 2s1k<2s2^{s-1} \le k < 2^s:

    1. k=k2sk' = k - 2^s, then kP=k(P)+2sPkP = -k'(-P) + 2^sP

    2. Accumulate P-P into the k-k'-th bucket Bk,jB_{-k', j}

    3. Add PP into the next window

Step 2c computes the 2sP2^{s}P part of the kPkP reduction and is also known as the "carry."

Let's change our naive Pippenger's to follow this method and generate our buckets and windows:

Stage 1: Separating previous window W1W_1 into new buckets
Stage 2: Separating previous window W2W_2 into new buckets

Notice that this method requires the presence of an extra carry bucket B1,3B_{1,3}. This extra carry bucket should not exist. We'll take a look later at how to prevent this after this Naive Pippenger's to NAF method is fully shown.

Bucket Accumulation

Buckets
Computation

B1,1B_{1,1}

P1P7P_1-P_7

B2,1B_{2,1}

P2P6P_2-P_6

B3,1B_{3,1}

P3P5P_3-P_5

B4,1B_{4,1}

P4-P_4

B1,2B_{1,2}

P1-P_1

B2,2B_{2,2}

P7P2P_7-P_2

B3,2B_{3,2}

P6P4P3P_6-P_4-P_3

B4,2B_{4,2}

P5-P_5

B1,3B_{1,3}

P5+P4+P3+P2+P1P_5+P_4 + P_3+P_2+P_1

Bucket Reduction

W1=B1,1+2B2,1+3B3,1+4B4,1W2=B1,2+2B2,2+3B3,2+4B4,2B1,3\begin{align*} W_1 &= B_{1,1} + 2B_{2,1} + 3B_{3,1} + 4B_{4,1} \\ W_2 &= B_{1,2} + 2B_{2,2} + 3B_{3,2} + 4B_{4,2} \\ &\sim* B_{1,3}* \end{align*}

Window Reduction

S=W1+22W2+24(B1,3)S = W_1 + 2^2W_2+ 2^4(B_{1,3})

Preventing the formation of an extra bucket

1. Ensuring last window bit size is within limit

The formation of an extra carry bucket depends on the true bit size slasts_\mathsf{last} of the final window in relation to λ\lambda, the total bits of kk, and ss, the bits per windows.

Safe Case: slast<s1s_\mathsf{last} < s - 1

  • Example: λ=8\lambda = 8, s=5s = 5k=k1+25k2k = k_1 + 2^5 k_2

  • Then slast=3s_\mathsf{last} = 3, and 0k2<230 \le k_2 < 2^3

  • Since k2+1<24k_2 + 1 < 2^4, carry cannot occur ⇒ It’s safe to assume the final carry is 0.

Carry Case: slasts1s_\mathsf{last} \ge s - 1

  • Example: λ=8\lambda = 8, s=4s = 4k=k1+24k2k = k_1 + 2^4 k_2

  • Here slast=4s_\mathsf{last} = 4, and 0k2<240 \le k_2 < 2^4

  • Since it is possible that k2+123k_2 + 1 \ge 2^3, carry can occur

Thus, if the restriction slast<s1s_\mathsf{last} < s - 1 is held, we can ensure there will be no new carry bucket created.

2. Naive Fix: Use Larger Bucket Only for Final Window

https://github.com/kroma-network/tachyon/blob/e7b13063c6f110b9851655cca740ee62ffb41137/tachyon/math/elliptic_curves/msm/algorithms/pippenger/pippenger.h#L119-L123
    if (is_last_window) {
      bucket_size = size_t{1} << ctx_.window_bits;
    } else {
      bucket_size = size_t{1} << (ctx_.window_bits - 1);
    }
  • Use a larger bit-width bucket for the final window to absorb potential carry.

  • Cons: This is more complex and can be inefficient due to extra bucket space.

3. Use a Larger λ\lambda Than the Actual Modulus Bit Size

Let’s say we are working with BN254, where λ=256\lambda = 256 and s=16s = 16. The actual scalar field is 254 bits.

We express kk as:

k=k1+216k2++216×15k16k = k_1 + 2^{16}k_2 + \dots + 2^{16\times15}k_{16}

Let MM be the modulus of BN254. Since the maximum value of kk is M1M - 1, we have:

1k16+1=M1216×15+1<2141 \le k_{16} + 1 = \lfloor \frac{M-1}{2^{16\times15}} \rfloor + 1 <2^{14}

This ensures that no overflow occurs in the final window.

M = 21888242871839275222246405745257275088548364400416034343698204186575808495617
print((M - 1) // (2**(16 * 15)) + 1 < 2 ** 14) # True

Yrrid’s Trick

Code Example

Here's a code snippet from Tachyon implementing NAF for bucket MSM:

void FillDigits(uint32_t kLambda, uint32_t s, const ScalarField& f, std::vector<int64_t>* digits) {
  // The size of `digits` must be equal to ceil(lambda / s)
  // (lambda: total number of bits, s: window size in bits)
  CHECK_EQ(digits->size(), static_cast<size_t>(std::ceil(kLambda / s)));
  
  // `radix` is 2^s — the maximum range per window
  uint64_t radix = uint64_t{1} << s;

  // `carry` to be propagated to the next window
  uint64_t carry = 0;
  size_t bit_offset = 0;
  for (size_t i = 0; i < digits->size(); ++i) {
      // mᵢ: s-bit value extracted from the current window, including previous carry
    uint64_t m_i = f.ExtractBits(bit_offset, s) + carry;
  
    // Compute carry: if mᵢ is greater than or equal to radix / 2, carry = 1; 
    //.               otherwise carry = 0
    // This is based on the NAF rounding threshold (radix / 2)
    carry = (m_i + (radix >> 1)) >> s;
    
    // Convert mᵢ to its NAF form: mᵢ - carry * 2^s
    (*digits)[i] = static_cast<int64_t>(m_i) - static_cast<int64_t>(carry << s);
    bit_offset += s;
  }
  
  // Apply the final carry to the last digit
  // This adjustment preserves the total scalar value
  digits->back() += (carry << s);
}

void AddBasesToBuckets(uint32_t s, const std::vector<Point> bases, 
    const std::vector<std::vector<int64_t>>& digits,
    size_t window_index, std::vector<Bucket>& buckets) {
  for (size_t i = 0; i < scalar_digits.size(); ++i) {
    int64_t scalar = scalar_digits[i][window_index];
    if (0 < scalar) {
      // If the scalar is positive:
      // Place the corresponding base into the (scalar - 1)-th bucket
      //
      // Why `scalar - 1`?
      // Because scalars range from 1 to 2^{s-1} - 1 in NAF,
      // but bucket indices are 0-based (0 to 2^{s-1} - 2)
      // so we subtract 1 to get the correct index.
      buckets[static_cast<uint64_t>(scalar - 1)] += bases[i];
    } else if (0 > scalar) {
      // If the scalar is negative:
      // Place the *negated* base into the (-scalar - 1)-th bucket
      //
      // Why `-scalar - 1`?
      // Because negative scalars range from -1 to -2^{s-1},
      // and we want to map -1 → index 0, -2 → index 1, ..., just like for positives.
      buckets[static_cast<uint64_t>(-scalar - 1)] -= bases[i];
    }
  }
}

References

Written by ryan Kim & Ashley Jeong of A41

Last updated