X-Wing: How it works, and why it is secure

·7 min read

Post-quantum cryptography is coming whether you trust it fully or not. Harvest-now-decrypt-later attacks mean we can't wait for another 5 years of cryptanalysis before deploying ML-KEM. But we also can't abandon X25519 while the new schemes are still being stress-tested. Hybrid KEMs let us have both, and X-Wing is one such hybrid KEM with some really nice security and performance properties.

The problem is pretty simple. We need to protect against quantum computers that might break our current key exchange (X25519), but we also might not fully trust the new post-quantum schemes yet since they're only a few years old. The obvious answer is to use both. If quantum computers never materialize or take longer than expected, X25519 keeps you safe. If the post-quantum scheme (ML-KEM-768) turns out to have a flaw, X25519 keeps you safe. You only lose if both fail simultaneously (which is pretty unlikely).

How KEM combination works

A key encapsulation mechanism (KEM) is the modern way to do key exchange. Instead of both parties contributing to a shared secret (such as in Diffie-Hellman), one side generates a random key and encrypts it to the other side's public key.

It can really be explained in three steps:

  • Alice runs Enc using Bob’s public key and gets a ciphertext plus a shared key.
  • Alice sends the ciphertext to Bob and keeps her copy of the shared key.
  • Bob runs Dec on that ciphertext using his private key and recovers the same shared key

It's cleaner than Diffie-Hellman for many protocols because the key is explicitly generated by one party rather than derived from both.

When you want to combine two KEMs for hybrid security, the natural approach is to run both in parallel. Generate a key with the first KEM (k1, c1), generate a key with the second KEM (k2, c2), then derive your final shared secret as KDF(k1 || k2). You concatenate both keys and hash them. This feels right because you're mixing entropy from both sources.

The problem, proven by Giacon, Heuer, and Poettering in 2018, is that this breaks if either KEM is pathologically broken in a specific way. Imagine ML-KEM has a flaw where given a ciphertext c1 that decrypts to key k1, you can easily find a different ciphertext c1' that also decrypts to k1. An attacker intercepts your hybrid ciphertext (c1, c2), can't break c2 directly, but generates c1' and sends (c1', c2) to a decryption oracle. The oracle returns KDF(k1 || k2), which is exactly your secret key. The attacker wins without breaking both KEMs.

The fix is to include both ciphertexts in the key derivation: KDF(k1 || k2 || c1 || c2). Now even if the attacker finds c1', the KDF input is different because c1' appears in the hash instead of c1, so they don't get your key. This is the GHP combiner, and it's provably secure even if one KEM is arbitrarily broken. The problem is performance. ML-KEM-768 has a 1088-byte ciphertext. You're hashing over a kilobyte of ciphertext data on every connection, which means multiple SHA3-256 blocks and real overhead.

What X-Wing proves you can skip

X-Wing proves you can skip hashing the ML-KEM ciphertext without breaking security. This works because ML-KEM has a specific property the authors call "ciphertext second preimage resistance". Even if ML-KEM is completely broken as a KEM, its internal structure makes it cryptographically hard to find two different ciphertexts that decrypt to the same key.

The reason comes down to how ML-KEM's decapsulation works, which is based on the Fujisaki-Okamoto transform, which adds a verification step. When you decrypt a ciphertext, ML-KEM first decrypts it to recover a message m'. It then computes both a shared key K and fresh randomness r' by hashing m' together with a hash of the public key. ML-KEM re-encrypts m' using this r' to get a new ciphertext c'. If c' matches the original ciphertext c, you get the shared key K. If they don't match, you get a different key computed by hashing a secret random value z together with the ciphertext.

The re-encryption will only match if the ciphertext was honestly generated, because an honest sender would have used the same deterministic process to derive their randomness from the message. A tampered or maliciously crafted ciphertext will decrypt to some message, but when you re-derive the randomness from that message and re-encrypt, you get a different ciphertext back because it wasn't created using the proper derivation. This means if you try to find two different ciphertexts that decrypt to the same key, at least one will fail the re-encryption check and return the rejection key instead. The structure makes the second-preimage style attack cryptographically hard, even if ML-KEM is broken in other ways.

With this property proven, X-Wing's final KDF only needs: KDF(label || k1 || k2 || c2 || pk2). The ML-KEM shared key k1 is there (32 bytes), but the ML-KEM ciphertext c1 is no longer needed for this hash because of the ciphertext second preimage resistance property. The X25519 ciphertext c2 (32 bytes) is included because X25519 doesn't have the same ciphertext collision resistance. The public key pk2 specifically helps protect against multi-target attacks, similar to ML-KEM's design. Total input to SHA3-256: 134 bytes (6-byte label + 32-byte k1 + 32-byte k2 + 32-byte c2 + 32-byte pk2), which fits in a single block. Compare that to 1222 (134 + 1088) if you include the ML-KEM ciphertext.

X-Wing also uses X25519 directly instead of wrapping it in a full KEM construction like DHKEM from HPKE. That saves another key derivation step. The tradeoff is specificity. The GHP combiner works with any two KEMs. X-Wing's proof only applies to this exact combination, but that's fine because these are the primitives everyone is deploying anyway.

The security proof splits into two cases. In the classical case, they assume X25519 is secure and ML-KEM might be completely broken (as long as it keeps ciphertext second preimage resistance). They model SHA3-256 as a random oracle and reduce security to the Strong Diffie-Hellman problem in X25519's curve. In the post-quantum case, they assume ML-KEM is secure and X25519 offers nothing. Here they work in the standard model, treating SHA3-256 as a pseudorandom function, and reduce to ML-KEM's IND-CCA security (indistinguishability under adaptive chosen-ciphertext attack, the standard security definition for KEMs that says an attacker can't distinguish a real key from random even with access to a decryption oracle). The proof is split into two cases, and each case is explicit about its assumptions and what security reduces to.

The performance gain is measurable. Their benchmarks show 8% faster encapsulation and 9% faster decapsulation compared to including the ML-KEM ciphertext in the hash. That difference comes almost entirely from SHA3-256 processing 134 bytes instead of 1222 bytes.

X-Wing makes concrete choices instead of being a generic combiner framework. It uses X25519, ML-KEM-768, and SHA3-256. These match what Google and Cloudflare already deployed in their early post-quantum rollout (X25519Kyber768Draft00), which means there's real-world deployment experience. ML-KEM-768 was chosen over ML-KEM-512 to target a comfortable margin for 128-bits of security and to match existing deployments like X25519Kyber768Draft00.

The implementation is straightforward because you can treat X25519 and ML-KEM-768 as black boxes. You don't need to open up ML-KEM and merge its key derivation with X-Wing's, even though it might save another hash operation. The authors explicitly chose simplicity. If you have working X25519 and ML-KEM implementations, you can build X-Wing in a few hours (try it, it's fun).

There's an IETF draft to standardize X-Wing. The final version might differ slightly, but the core idea should be the same. If you're implementing post-quantum key exchange today, X-Wing gives you a well-analyzed hybrid that doesn't force you to choose between security and performance.