LogoLogo
LogoLogo
  • Introduction
    • About Us
    • Notations & Definitions
      • MPC
      • ZK
    • Contribute to this Site!
  • Primitives
    • Multiplication
      • Karatsuba Multiplication
      • Toom-Cook Multiplication
    • NAF (Non-adjacent form)
    • Chinese Remainder Theorem (CRT)
    • Euclidean Algorithm
      • Extended Euclidean Algorithm
      • Binary Euclidean Algorithm
      • Extended Binary Euclidean Algorithm
    • Coding Theory
      • Linear Code
    • Number Theoretic Transform
    • Abstract Algebra
      • Group
        • -Morphisms
        • Batch Inverse
      • Elliptic Curve
        • Weierstrass Curve
          • Coordinate Forms
          • Fast Elliptic Curve Arithmetic and Improved WEIL Pairing Evaluation
        • Edwards Curve
          • Coordinate Forms
          • Twisted Edwards ↔ Short Weierstrass Transformation
        • Batch Inverse for Batch Point Additions
        • Scalar Multiplication
          • Double-and-add
          • GLV Decomposition
        • MSM
          • Pippenger's Algorithm
          • Signed Bucket Index
          • CycloneMSM
          • EdMSM
          • cuZK
        • 2-Chain and 2-Cycle of Elliptic Curves
    • Encryption Scheme
      • ElGamal Encryption
    • Modular Arithmetic
      • Modular Reduction
        • Barrett Reduction
        • Montgomery Reduction
      • Modular Inverse
        • Bernstein-Yang's Inverse
    • Multiset Check
    • Sumcheck
    • Commitment Scheme
      • Fflonk
      • SHPlonk
      • Zeromorph
  • MPC
    • Yao's Garbled Circuits
    • GMW
    • BMR
  • ZK
    • Arithmetization
      • R1CS
      • PLONK
      • AIR
    • Folding
      • LatticeFold
      • Nova
        • Nova over Cycles of Curves
    • Lookup
      • Lasso
      • LogUp-GKR
    • SNARK
      • Groth16
      • HyperPlonk
      • Spartan
        • SPARK
    • STARK
      • Additive NTT
      • Basefold
      • Binius
      • Brakedown
      • CircleSTARK
      • FRI
        • FRI Security Features and Optimizations
      • DEEP FRI
      • STIR
      • WHIR
    • Distributed ZK
      • Ryan's Trick for Distributed Groth16
  • Application
    • zkLogin
    • zkHoldem
    • zkTLS
      • DECO
      • Proxying is enough
  • zkVM
Powered by GitBook
On this page
  • Introduction
  • Background
  • AEAD (Authenticated Encryption with Associated Data)
  • AES-GCM: Galois/Counter Mode
  • ChaCha20-Poly1305
  • Context Attacks of AEAD
  • Lack of Key Commitment in AES-GCM and ChaCha20-Poly1305
  • Protocol Explanation
  • Proxy-based zkTLS
  • Case 1: If You Are Using HTTPS
  • Case 2: If HTTPS Is Not Used
  • Conclusion
  • References
Export as PDF
  1. Application
  2. zkTLS

Proxying is enough

Presentation: https://youtu.be/OL5rqvLkJfE

PreviousDECONextzkVM

Last updated 1 month ago

Introduction

achieves selective data provenance from a TLS server by enforcing a server-client-verifier communication pattern. However, DECO introduces significant performance overhead. This raises the question: is it possible to achieve the same functionality by placing the verifier as a proxy between the client and server? Naturally, the participant acting as the proxy must not be able to learn any sensitive information. While DECO's strength lies in requiring no modifications to the server side, a proxy-based approach offers the additional benefit of not requiring any modifications on the client side either.

Background

AEAD (Authenticated Encryption with Associated Data)

AEAD is a widely used encryption method in modern cryptographic systems, offering two key security guarantees simultaneously:

  1. Confidentiality (Encryption): It encrypts the plaintext so that third parties cannot read its content.

  2. Integrity and Authentication: It ensures that any tampering with the ciphertext or related information can be detected.

AEAD provides stronger security than simple encryption by also covering various types of metadata encountered in real-world scenarios. This is where the concept of Associated Data (AD) comes into play.

What is Associated Data (AD)?

Associated Data is data that is not encrypted but is still included in the authentication scope.

  • AD remains in plaintext and is visible to anyone.

  • However, during decryption, if the provided AD does not match the original AD used during encryption, the authentication fails.

✉️ Think of it like the address on an envelope — it’s visible to everyone, but it must not be altered.

Why is AD needed?

Traditional encryption schemes (e.g., AES-CBC) only encrypt the plaintext and leave other parts of the message (headers, identifiers, etc.) unprotected.

In real-world communication systems, the existence of unauthenticated metadata leads to several potential attacks:

Attack Type
Description

Reordering

Changing the order of messages to alter meaning or induce errors

Replay

Re-sending the same message repeatedly to confuse the system

Re-targeting

Forging a message to appear as if intended for a different recipient

→ These attacks arise because the context surrounding the ciphertext is not protected.

Effects of Including AD in AEAD

AEAD generates an authentication tag that includes both the ciphertext and the AD. This leads to the following outcomes:

  • The plaintext is encrypted,

  • The AD is authenticated but not encrypted,

  • If the AD does not match during decryption, the authentication fails and decryption is rejected.

Effect
Description

✅ Header forgery prevention

Protects metadata outside the encrypted portion

✅ Receiver verification

Allows messages to be accepted only by specific recipients

✅ Context binding

Cryptographically binds key, nonce, and AD into a secure “context”

AEAD Operations

An AEAD scheme defines the following interfaces:

AES-GCM: Galois/Counter Mode

AES-GCM is an AEAD (Authenticated Encryption with Associated Data) algorithm that combines the following two components:

The plaintext and associated data are divided into blocks:

  1. Encrypt plaintext using AES-CTR:

  1. Generate authentication tag using GHASH:

  1. Verify authentication tag using GHASH:

  1. Decrypt ciphertext using AES-CTR:

ChaCha20-Poly1305

  1. Encrypt the plaintext using ChaCha20:

  1. Generate the authentication tag using Poly1305:

  1. Verify the authentication tag using Poly1305:

  1. Decrypt the ciphertext using ChaCha20:

Context Attacks of AEAD

Context Discovery Attack (CDY)

  • This attack evaluates how strongly an AEAD scheme binds a ciphertext to a specific context.

Context Commitment Attack (CMT)

  • The success of this attack implies that a ciphertext is not uniquely bound to a single context.

  • It exploits the lack of Key Commitment in the AEAD scheme.

CDY vs. CMT: Relationship and Analogy

Attack Type
Hash Function Analogy

CDY Attack

Preimage Attack (ciphertext → context)

CMT Attack

Collision Attack (same ciphertext → two valid contexts)

This mirrors the security properties of cryptographic hash functions: If a hash function is collision-resistant, it is also preimage-resistant. Similarly, if an AEAD scheme is secure against CMT attacks, it is also secure against CDY attacks. (We will discuss Context Unforgeability in a later section.)

Lack of Key Commitment in AES-GCM and ChaCha20-Poly1305

Although AES-GCM and ChaCha20-Poly1305 are widely used AEAD schemes, they do not guarantee the Key Commitment property. That is, the same ciphertext may be valid under different (key, nonce) combinations, which introduces a potential vulnerability. Here's why:

  • In AES-GCM, the authentication tag is computed as:

  • In ChaCha20-Poly1305, the tag is computed as:

Protocol Explanation

Proxy-based zkTLS

  1. There is no 2PC (two-party computation) overhead, offering a performance benefit.

  2. Unlike DECO, no modifications are required on either the server or the client side.


The proof in this design can be structured as follows:


then a serious security vulnerability may arise.

This vulnerability stems from the structural properties of TLS itself and may affect systems like DECO in a similar way.

However, if the underlying AEAD scheme does satisfy Key Commitment, then secure zk proofs can still be constructed in the proxy-based model, as proven in Theorem 5.2 of the referenced paper.

Case 1: If You Are Using HTTPS

Fixed Padding

This reasoning is justified when modeling AES as an ideal cipher.


📌 Definition: Ideal Cipher


Then a natural question arises: do we have this kind of padding structure in TLS?

Variable Padding

Although these values are not fixed, their possible combinations are highly constrained, offering useful—though weaker—security compared to fixed padding. This is referred to as Variable Padding.

For example:

  • The Date field can vary per second, but within a 1-hour time window, it can take on 3600 values.

Thus, the total number of possible combinations is:

However, this security assumption relies on the presence of HTTPS. What happens if HTTPS is not available?

Case 2: If HTTPS Is Not Used

Context Forgery Attack (CFY)

  • This attack model lies between the Context Discovery (CDY) and Context Commitment (CMT) models, and it evaluates whether an AEAD scheme enforces decryption only under a unique context.

CFY ↔ Second Preimage Attack Analogy

In this analogy, the relationship is:

  • Preimage Attack ↔ CDY Attack

  • Second Preimage Attack ↔ CFY Attack

Comparison table of context attack on AEAD

The table below compares different types of context attacks on AEAD.

Attack Type
What the Attacker Knows
What the Attacker Finds
Analogous Hash Attack

CDY (Context Discovery)

Preimage attack

CFY (Context Forgery)

Second Preimage attack

CMT (Context Commitment)

Collision attack

AES-GCM CFY Attack Analysis



Thus, the upper bound on the attack success probability is:

ChaCha20-Poly1305 CFY Attack Analysis

Rearranging this equation, we obtain:


Let’s analyze the distribution of both sides.

Thus, the probability that the equation holds — i.e., the probability that the attack succeeds — is:

The Necessity of Key Commitment for Fixed Data

Whether AEAD schemes require the Key Commitment property depends on the nature of the data being protected.

  • When Key Commitment is essential: For sensitive information such as bank account balances, Key Commitment is critical. If the ciphertext is decrypted under a forged context, it may result in misinterpreted financial data, which could lead to serious financial incidents. Therefore, decryption must succeed only under a single, correct context, a property guaranteed when AEAD satisfies Key Commitment.

  • When Key Commitment is not required: Data such as age, account numbers, or national ID numbers are examples of Fixed Data, which do not change across sessions. Even if an attacker manipulates the context to alter the age from 20 to 80, for instance, proper external safeguards can detect or prevent such tampering effectively.

Even in environments where HTTPS cannot be used, and therefore Variable Padding is unavailable, there is no need for despair.

As discussed earlier, when using ChaCha20-Poly1305 as the AEAD scheme, even though it lacks Key Commitment, the probability of a successful CFY attack is extremely low, and a zkTLS protocol based on Fixed Data can still be implemented safely and practically.

Conclusion

This article examined the Key Commitment issue of AEAD schemes in the context of selective data disclosure over TLS. In particular, it highlighted the limitations of the existing DECO protocol and explored how a proxy-based approach could overcome them. The proxy-based design has the advantage of requiring no modifications to either the server or the client, while still enabling selective disclosure through zero-knowledge proofs.

However, since AES-GCM and ChaCha20-Poly1305, the AEAD schemes commonly used in TLS, do not guarantee the Key Commitment property, any such protocol must ensure that AEAD decryption is tightly bound to a unique context to be considered secure.

This article proposed two practical solutions to address the issue:

  1. In an HTTPS environment, the use of Variable Padding can enforce a predictable structure in ciphertexts, helping to mitigate the lack of key commitment.

  2. In non-HTTPS environments, the statistical security guarantees of AEAD schemes like ChaCha20-Poly1305 make it feasible to build a secure zkTLS protocol, especially for Fixed Data.

Therefore, while enforcing full Key Commitment in all scenarios may be unrealistic, by carefully analyzing the data characteristics, the structure of the AEAD scheme, and the capabilities of potential adversaries, it is entirely feasible to design a secure and practical zkTLS system.

That said, one possible drawback of the proxy-based approach is that the proxy has visibility into all communications, which could be viewed as a trade-off.

References

AEAD.Gen(τ)→(k,n)\mathsf{AEAD.Gen}(\tau) \rightarrow (k, n)AEAD.Gen(τ)→(k,n): Generates a secret key kkk and nonce nnn from transcript τ\tauτ. We refer to the pair (k,n)(k, n)(k,n) as the context.

AEAD.Enck(n,a,m)→(c,σ)\mathsf{AEAD.Enc}_k(n, a, m) \rightarrow (c, \sigma)AEAD.Enck​(n,a,m)→(c,σ): Encrypts plaintext mmm using key kkk, nonce nnn, and associated data aaa, producing ciphertext ccc and authentication tag σ\sigmaσ.

AEAD.Deck(n,a,c,σ)→m or FAILED\mathsf{AEAD.Dec}_k(n, a, c, \sigma) \rightarrow m\ \text{or}\ \mathsf{FAILED}AEAD.Deck​(n,a,c,σ)→m or FAILED: Verifies σ\sigmaσ using kkk, nnn, and aaa. If valid, decrypts ccc to recover plaintext mmm; otherwise, returns FAILED\mathsf{FAILED}FAILED.

-: A stream cipher method for encrypting plaintext.

: A function used to compute an authentication tag from the ciphertext and associated data.

m=(m1,…,mℓm)m = (m_1, \dots, m_{\ell_m})m=(m1​,…,mℓm​​)

a=(a1,…,aℓa)a = (a_1, \dots, a_{\ell_a})a=(a1​,…,aℓa​​)

where ℓm\ell_mℓm​ and ℓa\ell_aℓa​ are the number of blocks in the plaintext and AD, respectively.

AES-GCM.Enck(n,a,m)→(c,σ)\mathsf{AES\text{-}GCM.Enc}_k(n, a, m) \rightarrow (c, \sigma)AES-GCM.Enck​(n,a,m)→(c,σ)

ci=mi⊕AESk(n+i)c_i = m_i \oplus \mathsf{AES}_k(n + i)ci​=mi​⊕AESk​(n+i)
σ=AESk(n)⊕GHASH(a,c)\sigma =\mathsf{AES}_k(n) \oplus \mathsf{GHASH}(a, c)σ=AESk​(n)⊕GHASH(a,c)

AES-GCM.Deck(n,a,c,σ)→m or FAILED\mathsf{AES\text{-}GCM.Dec}_k(n, a, c, \sigma) \rightarrow m \text{ or } \mathsf{FAILED}AES-GCM.Deck​(n,a,c,σ)→m or FAILED

σ=?AESk(n)⊕GHASH(a,c)\sigma \stackrel{?}=\mathsf{AES}_k(n) \oplus \mathsf{GHASH}(a, c)σ=?AESk​(n)⊕GHASH(a,c)
mi=ci⊕AESk(n+i)m_i = c_i \oplus \mathsf{AES}_k(n + i)mi​=ci​⊕AESk​(n+i)

is an AEAD (Authenticated Encryption with Associated Data) algorithm that combines the following two components:

: A fast stream cipher used for encrypting plaintext.

: A universal hash function that generates an authentication tag from the ciphertext and associated data.

ChaCha20-Poly1305.Enck(n,a,m)→(c,σ)\mathsf{ChaCha20\text{-}Poly1305.Enc}_k(n, a, m) \rightarrow (c, \sigma)ChaCha20-Poly1305.Enck​(n,a,m)→(c,σ)

ci=mi⊕ChaCha20k(n,i)c_i = m_i \oplus \mathsf{ChaCha20}_k(n, i)ci​=mi​⊕ChaCha20k​(n,i)
σ=s+Poly1305r(a,c)\sigma = s + \mathsf{Poly1305}_{r}(a, c)σ=s+Poly1305r​(a,c)

Here, rrr and sss are derived as follows:

(r,s)=ChaCha20k(n,0)(r, s) = \mathsf{ChaCha20}_k(n, 0)(r,s)=ChaCha20k​(n,0)

ChaCha20-Poly1305.Deck(n,a,c,σ)→m or FAILED\mathsf{ChaCha20\text{-}Poly1305.Dec}_k( n, a, c, \sigma) \rightarrow m \text{ or } \mathsf{FAILED}ChaCha20-Poly1305.Deck​(n,a,c,σ)→m or FAILED

σ=?s+Poly1305r(a,c)\sigma \stackrel{?}= s + \mathsf{Poly1305}_{r}(a, c)σ=?s+Poly1305r​(a,c)
mi=ci⊕ChaCha20k(n,i)m_i = c_i \oplus \mathsf{ChaCha20}_k(n, i)mi​=ci​⊕ChaCha20k​(n,i)

A Context Discovery Attack (CDY) refers to an adversary’s ability to find a valid decryption context (k,n)(k, n)(k,n) for a given (c,a,σ)(c, a, \sigma)(c,a,σ), even if it is not the original context used during encryption.

The goal is not to recover the plaintext itself, but rather to find any context (k,n)(k, n)(k,n) for which decryption succeeds without error.

A Context Commitment Attack (CMT) occurs when an adversary can find two different contexts—say (k1,n1)(k_1, n_1)(k1​,n1​) and (k2,n2)(k_2, n_2)(k2​,n2​) with (k1,n1)≠(k2,n2)(k_1, n_1) \ne (k_2, n_2)(k1​,n1​)=(k2​,n2​)—that both successfully decrypt the same (c,a,σ)(c, a,\sigma)(c,a,σ).

σ=AESk(n)⊕GHASH(a,c)\sigma = \mathsf{AES}_k(n) \oplus \mathsf{GHASH}(a, c)σ=AESk​(n)⊕GHASH(a,c)

Even if AESk1(n1)=AESk2(n2)\mathsf{AES}_{k_1}(n_1) = \mathsf{AES}_{k_2}(n_2)AESk1​​(n1​)=AESk2​​(n2​), it’s possible that (k1,n1)≠(k2,n2)(k_1, n_1) \ne (k_2, n_2)(k1​,n1​)=(k2​,n2​).

σ=s+Poly1305r(a,c)\sigma = s + \mathsf{Poly1305}_{r}(a, c)σ=s+Poly1305r​(a,c)

It is also possible that ChaCha20k1(n1,0)=ChaCha20k2(n2,0)\mathsf{ChaCha20}_{k_1}(n_1, 0) = \mathsf{ChaCha20}_{k_2}(n_2, 0)ChaCha20k1​​(n1​,0)=ChaCha20k2​​(n2​,0), even if (k1,n1)≠(k2,n2)(k_1, n_1) \ne (k_2, n_2)(k1​,n1​)=(k2​,n2​).

According to the referenced paper, in AEAD schemes like AES-GCM and ChaCha20-Poly1305, an adversary A\mathcal{A}A making at most qqq queries to the oracle can succeed in a CMT (Context Commitment) attack with probability at least q232\frac{q}{2^{32}}232q​.

As illustrated above, when the verifier V\mathcal{V}V acts as a proxy between the prover (or TLS client) P\mathcal{P}P and the TLS server S\mathcal{S}S, the following advantages arise:

Since V\mathcal{V}V only relays messages, this design maintains compatibility with various versions of TLS.

In DECO, since P\mathcal{P}P knows kMACk^\mathsf{MAC}kMAC, it can forge a ciphertext c′c'c′ instead of relaying the actual server response ccc from S\mathcal{S}S using kMACk^\mathsf{MAC}kMAC. To prevent this, DECO splits kMACk^\mathsf{MAC}kMAC between P\mathcal{P}P and V\mathcal{V}V, and only allows V\mathcal{V}V to reveal its share after P\mathcal{P}P commits to ccc.

This issue can also be addressed in the proxy-based approach. Because V\mathcal{V}V, acting as a proxy, observes all encrypted requests and responses exchanged between P\mathcal{P}P and S\mathcal{S}S, it can detect if P\mathcal{P}P attempts to tamper with the response and use a forged c′c'c′ instead of the genuine ccc.

In DECO, the issue of forgery arises from P\mathcal{P}P’s knowledge of kMACk^\mathsf{MAC}kMAC, allowing it to manipulate the ciphertext using that key. To mitigate this, the key was split between P\mathcal{P}P and V\mathcal{V}V, and V\mathcal{V}V would only reveal its key share after receiving a commitment to ccc from P\mathcal{P}P.

C(x:{m^,a,c,σ},w:{τ,k,n,m}):AEAD.Gen(τ)=?(k,n)∧AEAD.Enck(n,a,m)=?(c,σ)∧m^∈m∧Pred(m^)C(x:\{ \hat{m}, a, c, \sigma \}, w: \{ \textcolor{red}{\tau}, k, n, m \}): \\ \textcolor{red}{\mathsf{AEAD.Gen}(\tau) \stackrel{?}= (k, n)} \land \mathsf{AEAD.Enc}_k(n, a, m) \stackrel{?}= (c, \sigma) \land \hat{m} \in m \land \mathsf{Pred}(\hat{m})C(x:{m^,a,c,σ},w:{τ,k,n,m}):AEAD.Gen(τ)=?(k,n)∧AEAD.Enck​(n,a,m)=?(c,σ)∧m^∈m∧Pred(m^)

Here, Pred\mathsf{Pred}Pred is a user-defined predicate representing the condition the user wants to prove about m^\hat{m}m^.

However, the constraint AEAD.Gen(τ)=?(k,n)\mathsf{AEAD.Gen(\tau)} \stackrel{?}= (k, n)AEAD.Gen(τ)=?(k,n) introduces significant complexity. To simplify the circuit, we may omit this constraint and instead define the circuit as:

C(x:{m^,a,c,σ},w:{k,n,m}):AEAD.Enck(n,a,m)=?(c,σ)∧m^∈m∧Pred(m^)C(x:\{ \hat{m}, a, c, \sigma \}, w: \{ k, n, m \}): \\ \mathsf{AEAD.Enc}_k(n, a, m) \stackrel{?}= (c, \sigma) \land \hat{m} \in m \land \mathsf{Pred}(\hat{m})C(x:{m^,a,c,σ},w:{k,n,m}):AEAD.Enck​(n,a,m)=?(c,σ)∧m^∈m∧Pred(m^)

This is because AEADs used in TLS 1.2/1.3 do not satisfy Key Commitment, making it possible to find two different combinations (k1,n1,m1)(k_1, n_1, m_1)(k1​,n1​,m1​) and (k2,n2,m2)(k_2, n_2, m_2)(k2​,n2​,m2​) that decrypt the same triple (a,c,σ)(a, c, \sigma)(a,c,σ). For example:

r1=(x:{m^1,a,c,σ},w:{k1,n1,m1})∈Rr_1 = (x: \{ \hat{m}_1, a, c, \sigma \}, w: \{k_1, n_1, m_1\}) \in Rr1​=(x:{m^1​,a,c,σ},w:{k1​,n1​,m1​})∈R

r2=(x:{m^2,a,c,σ},w:{k2,n2,m2})∈Rr_2 = (x: \{ \hat{m}_2, a, c, \sigma \}, w: \{k_2, n_2, m_2\}) \in Rr2​=(x:{m^2​,a,c,σ},w:{k2​,n2​,m2​})∈R

If m^1≠m^2\hat{m}_1 \ne \hat{m}_2m^1​=m^2​, and particularly if Pred(m^1)≠Pred(m^2)\mathsf{Pred}(\hat{m}_1) \ne \mathsf{Pred}(\hat{m}_2)Pred(m^1​)=Pred(m^2​), then multiple conflicting claims can be proven over the same ciphertext, leading to critical security failures.

According to the paper “,” the key commitment issue can be mitigated using fixed padding. For example, consider prepending 128 bytes of 0x00 to the plaintext as padding. Decrypting to exactly 128 bytes of zeros is extremely difficult. The decryption process of AES-GCM works as follows:

mi=ci⊕AESk(n+i)m_i = c_i \oplus \mathsf{AES}_k(n + i)mi​=ci​⊕AESk​(n+i)

Assume that bbb fixed padding blocks are inserted at the beginning of the plaintext. For the decrypted output to exactly match the padding, the following condition must hold:

∀ 1≤i≤b,AESk1(n1+i)=AESk2(n2+i)\forall\, 1 \le i \le b,\quad \mathsf{AES}_{k_1}(n_1 + i) = \mathsf{AES}_{k_2}(n_2 + i)∀1≤i≤b,AESk1​​(n1​+i)=AESk2​​(n2​+i)

In other words, different key/nonce pairs (k1,n1)≠(k2,n2)(k_1, n_1) \ne (k_2, n_2)(k1​,n1​)=(k2​,n2​) must produce identical key streams for bbb blocks—an extremely unlikely event.

An ideal cipher is defined as a collection of functions E={Ek:M→C}\mathcal{E} = \{ E_k : \mathcal{M} \to \mathcal{C} \}E={Ek​:M→C} from a message space M\mathcal{M}M to a ciphertext space C\mathcal{C}C.

Each EkE_kEk​ is a random permutation over M\mathcal{M}M.

That is, for each key kkk, the function EkE_kEk​ is assumed to be independent and completely random.

💡 In other words, under the ideal cipher model, even when the same input is used with different keys k1k_1k1​ and k2k_2k2​, the outputs are almost certainly unrelated.

As shown in the image above, many web servers—including those of Google and Twitter—prepend HTTP responses with consistently structured data such as the HTTP status code and Date fields. This formatting is considered best practice under , and widely adopted by major web servers like Apache and Nginx. According to the paper, more than 85% of webpages begin with this kind of pattern.

There are approximately 63 HTTP status codes, based on the (excluding unassigned codes). (Interestingly, when I counted them myself, there were 64 HTTP status codes. )

63×3600=22680063 \times 3600 = 22680063×3600=226800

If the HTTP status code and Date header together form a 54-byte string, this effectively corresponds to having 254×82^{54 \times 8}254×8 possible padding values. It means that forcing a decryption result to match one of these valid patterns is statistically infeasible for an attacker. In this sense, such structural padding acts as a practical substitute for full key commitment and can offer meaningful protection. (For a detailed analysis, refer to of the paper.)

A CFY attack aims to find a different context (k2,n2)(k_2, n_2)(k2​,n2​) such that a given (c,a,σ)(c, a, \sigma)(c,a,σ), originally generated under a key and nonce (k1,n1)(k_1, n_1)(k1​,n1​), can still be successfully decrypted.

The constraint is that (k1,n1)≠(k2,n2)(k_1, n_1) \ne (k_2, n_2)(k1​,n1​)=(k2​,n2​).

In other words, the attacker tries to determine whether the ciphertext can be decrypted under a different (k,n)(k, n)(k,n) pair than the one originally used.

The structure of a CFY attack closely resembles that of a Second Preimage Attack in cryptographic hash functions. In a Second Preimage Attack, the attacker is given an input xxx and seeks a different input x′≠xx' \ne xx′=x that hashes to the same value:

x≠x′∧H(x)=H(x′)x \ne x' \quad \land \quad H(x) = H(x')x=x′∧H(x)=H(x′)

This is different from a Preimage Attack, where one is given only a hash output hhh and must find any corresponding input xxx:

h=H(x)h = H(x)h=H(x)

One valid that produces

A second valid that produces

Two valid and that produce a same

The zkTLS attacker must perform a CFY attack, not a CDY attack. This is because (k1,n1)(k_1, n_1)(k1​,n1​) is already a valid decryption context, and the attacker’s goal is to find a different context (k2,n2)(k_2, n_2)(k2​,n2​) that also successfully decrypts the ciphertext.

For simplicity, let’s assume the ciphertext ccc consists of a single block and there is no associated data aaa. Under these conditions, the AES-GCM tag σ\sigmaσ is computed as follows:

σ=AESk(n)⊕GHASH(c)=AESk(n)⊕AESk(0)2c⊕AESk(0)\sigma = \mathsf{AES}_k(n) \oplus \mathsf{GHASH}(c) = \mathsf{AES}_k(n) \oplus \mathsf{AES}_k(0)^2c \oplus \mathsf{AES}_k(0)σ=AESk​(n)⊕GHASH(c)=AESk​(n)⊕AESk​(0)2c⊕AESk​(0)

Now, suppose an attacker randomly samples a different key k′≠kk' \ne kk′=k, and attempts to generate the same tag σ\sigmaσ using a new context (k′,n′)(k', n')(k′,n′):

σ=AESk′(n′)⊕GHASH(c)=AESk′(n′)⊕AESk′(0)2c⊕AESk′(0)\sigma = \mathsf{AES}_{k'}(n') \oplus \mathsf{GHASH}(c) = \mathsf{AES}_{k'}(n') \oplus \mathsf{AES}_{k'}(0)^2c \oplus \mathsf{AES}_{k'}(0)σ=AESk′​(n′)⊕GHASH(c)=AESk′​(n′)⊕AESk′​(0)2c⊕AESk′​(0)

Assuming AES behaves as an ideal cipher, each key defines a reversible permutation. Thus, the attacker can rearrange the above equation to solve for n′n'n′:

n′=AESk′−1(σ⊕AESk′(0)2c⊕AESk′(0))n' = \mathsf{AES}_{k'}^{-1}(\sigma \oplus\mathsf{AES}_{k'}(0)^2c \oplus \mathsf{AES}_{k'}(0) )n′=AESk′−1​(σ⊕AESk′​(0)2c⊕AESk′​(0))

In other words, given σ\sigmaσ, ccc, and AESk′(0)\mathsf{AES}_{k'}(0)AESk′​(0), the attacker can compute a candidate nonce n′n'n′ that would yield the same tag.

⚠️ However, in TLS, nonces must follow a specific format: among the 64-bit nonce, the last 32 bits must equal (031‖1)(0^{31}‖1)(031‖1). Therefore, the probability that a randomly generated n′n'n′ satisfies this constraint by chance is:

1232\frac{1}{2^{32}}2321​

Each attempt with a new k′k'k′ requires two AES invocations: one for AESk′(0)\mathsf{AES}_{k'}(0)AESk′​(0) and one for AESk′(n′)\mathsf{AES}_{k'}(n')AESk′​(n′). If the attacker is limited to a total of qqq AES queries, they can make approximately q2\frac{q}{2}2q​ attempts.

q2⋅1232=q233\frac{q}{2} \cdot \frac{1}{2^{32}} = \frac{q}{2^{33}}2q​⋅2321​=233q​

To simplify the discussion, assume the ciphertext ccc consists of a single block and there is no associated data aaa. Under this assumption, the ChaCha20-Poly1305 tag σ\sigmaσ is computed as:

σ=s+Poly1305r(a)\sigma = s + \mathsf{Poly1305}_{r}(a)σ=s+Poly1305r​(a)

Here, rrr and sss are computed as follows:

(r,s)=ChaCha20k(n,0)(r, s) = \mathsf{ChaCha20}_k(n, 0)(r,s)=ChaCha20k​(n,0)

The attacker randomly samples a different key and nonce pair (k′,n′)≠(k,n)(k', n') \ne (k, n)(k′,n′)=(k,n) and attempts to generate the same tag σ\sigmaσ using this new context. For the attack to succeed, the following condition must hold:

σ=s+Poly1305r(a)=s′+Poly1305r′(a)\sigma = s + \mathsf{Poly1305}_{r}(a) = s' + \mathsf{Poly1305}_{r'}(a) \\σ=s+Poly1305r​(a)=s′+Poly1305r′​(a)
s−s′=Poly1305r′(a)−Poly1305r(a)s -s' =\mathsf{Poly1305}_{r'}(a) - \mathsf{Poly1305}_{r}(a)s−s′=Poly1305r′​(a)−Poly1305r​(a)

Since ChaCha20\mathsf{ChaCha20}ChaCha20 behaves as a , different (k,n)(k, n)(k,n) pairs will produce independent and uniformly random (r,s)(r, s)(r,s) values. Therefore, s−s′s - s's−s′ is uniformly distributed over Z/2128Z\mathbb{Z}/2^{128}\mathbb{Z}Z/2128Z.

Poly1305\mathsf{Poly1305}Poly1305 is a , meaning that its output is also uniformly distributed. Hence, if we define:

Δp(r′):=Poly1305r′(a)−Poly1305r(a)\Delta p(r') := \mathsf{Poly1305}_{r'}(a) - \mathsf{Poly1305}_r(a)Δp(r′):=Poly1305r′​(a)−Poly1305r​(a)

then Δp(r′)\Delta p(r')Δp(r′) is uniformly distributed over Z/2128Z\mathbb{Z}/2^{128}\mathbb{Z}Z/2128Z as well.

Pr⁡(BAD)=∑i∈Z/2128ZPr⁡(s′−s=i)Pr⁡(Δp(r′)=i)=12128∑i∈Z/2128Pr⁡(Δp(r′)=i)=12128\begin{align*} \Pr(\mathsf{BAD}) &= \sum_{i \in \mathbb{Z}/2^{128}\mathbb{Z}} \Pr(s' - s = i) \Pr(\Delta p(r') = i) \\ &= \frac{1}{2^{128}}\sum_{i \in \mathbb{Z}/2^{128}} \Pr(\Delta p(r') = i) = \frac{1}{2^{128}} \end{align*}Pr(BAD)​=i∈Z/2128Z∑​Pr(s′−s=i)Pr(Δp(r′)=i)=21281​i∈Z/2128∑​Pr(Δp(r′)=i)=21281​​

Therefore, if the attacker is allowed to make qqq queries, the upper bound on the total success probability is:

q2128\frac{q}{2^{128}}2128q​

(c,a,σ)(c, a, \sigma)(c,a,σ)
(k,n)(k, n)(k,n)
(c,a,σ)(c, a, \sigma)(c,a,σ)
(c,a,σ,k1,n1)(c, a, \sigma, k_1, n_1)(c,a,σ,k1​,n1​)
(k2,n2)(k_2, n_2)(k2​,n2​)
(c,a,σ)(c, a, \sigma)(c,a,σ)
(k1,n1)(k_1, n_1)(k1​,n1​)
(k2,n2)(k_2, n_2)(k2​,n2​)
(c,a,σ)(c, a, \sigma)(c,a,σ)
DECO
AES
CTR (Counter Mode)
GHASH (Galois Hash)
universal hash
ChaCha20-Poly1305
ChaCha20
Poly1305
How to Abuse and Fix Authenticated Encryption Without Key Commitment
RFC 7231
🤔
official IANA registry
https://www.youtube.com/watch?v=xcA3o2NGlfw
https://eprint.iacr.org/2024/733.pdf
Section 6.2
Pseudorandom Function (PRF)
Universal Hash
Source:
https://youtu.be/xcA3o2NGlfw?t=845