Add experimental hybrid post-quantum handshake using Kyber-1024 and Dilithium #133
Open
Mateusvff wants to merge 35 commits into WireGuard:master from Mateusvff:master
Use parallel summation with native byte order per RFC 1071. add-with-carry operation is used to add 4 words per operation. Byteswap is performed before and after checksumming for compatibility with old `checksumNoFold()`. With this we get a 30-80% speedup in `checksum()` depending on packet sizes.

Add unit tests with comparison to a per-word implementation.

**Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz**

| Size | OldTime | NewTime | Speedup  |
|------|---------|---------|----------|
| 64   | 12.64   | 9.183   | 1.376456 |
| 128  | 18.52   | 12.72   | 1.455975 |
| 256  | 31.01   | 18.13   | 1.710425 |
| 512  | 54.46   | 29.03   | 1.87599  |
| 1024 | 102     | 52.2    | 1.954023 |
| 1500 | 146.8   | 81.36   | 1.804326 |
| 2048 | 196.9   | 102.5   | 1.920976 |
| 4096 | 389.8   | 200.8   | 1.941235 |
| 8192 | 767.3   | 413.3   | 1.856521 |
| 9000 | 851.7   | 448.8   | 1.897727 |
| 9001 | 854.8   | 451.9   | 1.891569 |

**AMD EPYC 7352 24-Core Processor**

| Size | OldTime | NewTime | Speedup  |
|------|---------|---------|----------|
| 64   | 9.159   | 6.949   | 1.318031 |
| 128  | 13.59   | 10.59   | 1.283286 |
| 256  | 22.37   | 14.91   | 1.500335 |
| 512  | 41.42   | 24.22   | 1.710157 |
| 1024 | 81.59   | 45.05   | 1.811099 |
| 1500 | 120.4   | 68.35   | 1.761522 |
| 2048 | 162.8   | 90.14   | 1.806079 |
| 4096 | 321.4   | 180.3   | 1.782585 |
| 8192 | 650.4   | 360.8   | 1.802661 |
| 9000 | 706.3   | 398.1   | 1.774177 |
| 9001 | 712.4   | 398.2   | 1.789051 |

Signed-off-by: Tu Dinh Ngoc <[email protected]>
[Jason: simplified and cleaned up unit tests]
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
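The summation idea in the checksum commit above can be illustrated with a small stand-alone sketch. This is not the code from the commit: `sumPerWord`, `sumWide`, and `fold` are hypothetical names, the loads are fixed to little-endian so the result does not depend on the host, and the returned value is the folded ones' complement sum (not its complement). It relies on the RFC 1071 property that summing byte-swapped words yields the byte-swapped sum.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// fold reduces a wide accumulator to 16 bits with end-around carry.
func fold(sum uint64) uint16 {
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

// sumPerWord is the textbook reference: add big-endian 16-bit words
// one at a time, then fold the carries back in.
func sumPerWord(b []byte) uint16 {
	var sum uint64
	for len(b) >= 2 {
		sum += uint64(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		sum += uint64(b[0]) << 8 // odd trailing byte is padded with a zero byte
	}
	return fold(sum)
}

// sumWide adds four 16-bit words per step as one 64-bit little-endian
// load, chaining the carry through bits.Add64, and byte-swaps the
// folded result at the end to get back to network byte order.
func sumWide(b []byte) uint16 {
	var acc, carry uint64
	for len(b) >= 8 {
		acc, carry = bits.Add64(acc, binary.LittleEndian.Uint64(b), carry)
		b = b[8:]
	}
	// Fewer than 8 bytes remain; collect them as little-endian words.
	var tail uint64
	for len(b) >= 2 {
		tail += uint64(binary.LittleEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		tail += uint64(b[0]) // low byte here becomes the high byte after the swap
	}
	acc, carry = bits.Add64(acc, tail, carry)
	// 2^64 is congruent to 1 modulo 0xffff, so the final carry is simply added back.
	le := fold(uint64(fold(acc)) + carry)
	return bits.ReverseBytes16(le)
}

func main() {
	packet := make([]byte, 1500)
	for i := range packet {
		packet[i] = byte(i*31 + 7)
	}
	fmt.Printf("per-word: %#04x\n", sumPerWord(packet))
	fmt.Printf("wide:     %#04x\n", sumWide(packet))
}
```

Both functions should agree for every input length, which is the same kind of comparison the commit's unit tests make against a per-word implementation.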
Signed-off-by: ruokeqx <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
It should be POLLIN because closeFd is a read-only file. Signed-off-by: Kurnia D Win <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Reduce allocations by eliminating byte reader, hand-rolled decoding and
reusing message structs.
Synthetic benchmark:
    var msgSink MessageInitiation
    func BenchmarkMessageInitiationUnmarshal(b *testing.B) {
        packet := make([]byte, MessageInitiationSize)
        reader := bytes.NewReader(packet)
        err := binary.Read(reader, binary.LittleEndian, &msgSink)
        if err != nil {
            b.Fatal(err)
        }
        b.Run("binary.Read", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                reader := bytes.NewReader(packet)
                _ = binary.Read(reader, binary.LittleEndian, &msgSink)
            }
        })
        b.Run("unmarshal", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                _ = msgSink.unmarshal(packet)
            }
        })
    }
Results:
                                             │      -      │
                                             │   sec/op    │
    MessageInitiationUnmarshal/binary.Read-8   1.508µ ± 2%
    MessageInitiationUnmarshal/unmarshal-8     12.66n ± 2%
                                             │      -      │
                                             │    B/op     │
    MessageInitiationUnmarshal/binary.Read-8   208.0 ± 0%
    MessageInitiationUnmarshal/unmarshal-8     0.000 ± 0%
                                             │      -      │
                                             │  allocs/op  │
    MessageInitiationUnmarshal/binary.Read-8   2.000 ± 0%
    MessageInitiationUnmarshal/unmarshal-8     0.000 ± 0%
Signed-off-by: Alexander Yastrebov <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
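As a rough illustration of what "hand-rolled decoding" means in the unmarshal commit above, the sketch below decodes a fixed-layout message directly from a byte slice and checks that it matches `binary.Read`. The `demoMessage` type, its fields, `errMessageLength`, and `decodeWithBinaryRead` are all hypothetical and far smaller than the real `MessageInitiation`; the explicit length check is an assumption about how such an unmarshaller might report errors.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// demoMessage is a hypothetical fixed-size wire message.
type demoMessage struct {
	Type      uint32
	Sender    uint32
	Ephemeral [32]byte
}

const demoMessageSize = 4 + 4 + 32

var errMessageLength = errors.New("incorrect message length")

// unmarshal decodes b into m field by field, without reflection or an
// intermediate reader, so it does not need to allocate.
func (m *demoMessage) unmarshal(b []byte) error {
	if len(b) != demoMessageSize {
		return errMessageLength
	}
	m.Type = binary.LittleEndian.Uint32(b[0:4])
	m.Sender = binary.LittleEndian.Uint32(b[4:8])
	copy(m.Ephemeral[:], b[8:])
	return nil
}

// decodeWithBinaryRead is the reflection-based baseline.
func decodeWithBinaryRead(b []byte) (demoMessage, error) {
	var m demoMessage
	err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &m)
	return m, err
}

func main() {
	packet := make([]byte, demoMessageSize)
	for i := range packet {
		packet[i] = byte(i)
	}
	var viaHand demoMessage
	if err := viaHand.unmarshal(packet); err != nil {
		panic(err)
	}
	viaRead, err := decodeWithBinaryRead(packet)
	if err != nil {
		panic(err)
	}
	fmt.Println("decoders agree:", viaHand == viaRead)
	fmt.Println("short packet error:", viaHand.unmarshal(packet[:10]))
}
```

Because the method writes into an existing struct, a caller can reuse one message value across packets, which is the struct reuse the commit message mentions.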
This is already enforced in receive.go, but if these unmarshallers are to have error return values anyway, make them as explicit as possible. Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
This pairs with the recent change in wireguard-tools. Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Optimize message encoding by eliminating binary.Write (which internally uses reflection) in favour of hand-rolled encoding.
This is companion to 9e7529c.
Synthetic benchmark:
    var packetSink []byte
    func BenchmarkMessageInitiationMarshal(b *testing.B) {
        var msg MessageInitiation
        b.Run("binary.Write", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                var buf [MessageInitiationSize]byte
                writer := bytes.NewBuffer(buf[:0])
                _ = binary.Write(writer, binary.LittleEndian, msg)
                packetSink = writer.Bytes()
            }
        })
        b.Run("binary.Encode", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                packet := make([]byte, MessageInitiationSize)
                _, _ = binary.Encode(packet, binary.LittleEndian, msg)
                packetSink = packet
            }
        })
        b.Run("marshal", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                packet := make([]byte, MessageInitiationSize)
                _ = msg.marshal(packet)
                packetSink = packet
            }
        })
    }
Results:
                                             │      -      │
                                             │   sec/op    │
    MessageInitiationMarshal/binary.Write-8    1.337µ ± 0%
    MessageInitiationMarshal/binary.Encode-8   1.242µ ± 0%
    MessageInitiationMarshal/marshal-8         53.05n ± 1%
                                             │      -      │
                                             │    B/op     │
    MessageInitiationMarshal/binary.Write-8    368.0 ± 0%
    MessageInitiationMarshal/binary.Encode-8   160.0 ± 0%
    MessageInitiationMarshal/marshal-8         160.0 ± 0%
                                             │      -      │
                                             │  allocs/op  │
    MessageInitiationMarshal/binary.Write-8    3.000 ± 0%
    MessageInitiationMarshal/binary.Encode-8   1.000 ± 0%
    MessageInitiationMarshal/marshal-8         1.000 ± 0%
Signed-off-by: Alexander Yastrebov <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
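The encoding side of the marshal commit above can be sketched the same way. The example writes each field into a caller-supplied buffer with `binary.LittleEndian.PutUint32` and `copy`, then confirms the bytes match what `binary.Write` produces. `demoResponse`, its layout, `errBufferLength`, and `encodeWithBinaryWrite` are hypothetical stand-ins, not the types touched by the commit.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// demoResponse is a hypothetical fixed-size wire message.
type demoResponse struct {
	Type      uint32
	Sender    uint32
	Receiver  uint32
	Ephemeral [32]byte
}

const demoResponseSize = 4 + 4 + 4 + 32

var errBufferLength = errors.New("incorrect buffer length")

// marshal writes m into b field by field. The caller owns the buffer,
// so the only allocation is whatever the caller chose to make.
func (m *demoResponse) marshal(b []byte) error {
	if len(b) != demoResponseSize {
		return errBufferLength
	}
	binary.LittleEndian.PutUint32(b[0:4], m.Type)
	binary.LittleEndian.PutUint32(b[4:8], m.Sender)
	binary.LittleEndian.PutUint32(b[8:12], m.Receiver)
	copy(b[12:], m.Ephemeral[:])
	return nil
}

// encodeWithBinaryWrite is the reflection-based baseline.
func encodeWithBinaryWrite(m demoResponse) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, m); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	msg := demoResponse{Type: 2, Sender: 0xdeadbeef, Receiver: 42}
	for i := range msg.Ephemeral {
		msg.Ephemeral[i] = byte(255 - i)
	}
	packet := make([]byte, demoResponseSize)
	if err := msg.marshal(packet); err != nil {
		panic(err)
	}
	baseline, err := encodeWithBinaryWrite(msg)
	if err != nil {
		panic(err)
	}
	fmt.Println("encoders agree:", bytes.Equal(packet, baseline))
}
```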
Kernels below 5.12 are missing this:
    commit 98184612aca0a9ee42b8eb0262a49900ee9eef0d
    Author: Norman Maurer <[email protected]>
    Date: Thu Apr 1 08:59:17 2021
        net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...);
        Support for UDP_GRO was added in the past but the implementation for
        getsockopt was missed which did lead to an error when we tried to
        retrieve the setting for UDP_GRO. This patch adds the missing switch
        case for UDP_GRO
        Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
        Signed-off-by: Norman Maurer <[email protected]>
        Reviewed-by: David Ahern <[email protected]>
        Signed-off-by: David S. Miller <[email protected]>
That means we can't set the option and then read it back later. Given
how buggy UDP_GRO is in general on odd kernels, just disable it on older
kernels altogether.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
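The version gate described in the UDP_GRO commit above could look roughly like the following. This is only a sketch under assumptions: `parseKernelRelease` and `udpGROUsable` are hypothetical names, the release string is passed in directly (obtaining it from the running system is left out), and the 5.12 threshold is taken from the commit message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKernelRelease extracts the major and minor numbers from a
// release string such as "5.11.0-49-generic".
func parseKernelRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unrecognized kernel release %q", release)
	}
	major, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %w", release, err)
	}
	// Keep only the leading digits of the minor component, which may
	// carry a suffix when the release has just two components.
	digits := 0
	for digits < len(parts[1]) && parts[1][digits] >= '0' && parts[1][digits] <= '9' {
		digits++
	}
	minor, err = strconv.Atoi(parts[1][:digits])
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %w", release, err)
	}
	return major, minor, nil
}

// udpGROUsable reports whether the release is at least 5.12. Releases
// that cannot be parsed are treated as too old, which is the
// conservative choice.
func udpGROUsable(release string) bool {
	major, minor, err := parseKernelRelease(release)
	if err != nil {
		return false
	}
	return major > 5 || (major == 5 && minor >= 12)
}

func main() {
	for _, release := range []string{"4.19.0", "5.11.0-49-generic", "5.12.0", "6.1.55"} {
		fmt.Printf("%-20s UDP_GRO usable: %v\n", release, udpGROUsable(release))
	}
}
```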
Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
…nitiation Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
…ed secret derivation Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
…ivate keys in Device struct Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
… marshaling and unmarshaling Signed-off-by: Mateus Franco <[email protected]>
…itiation Signed-off-by: Mateus Franco <[email protected]>
…ation Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Signed-off-by: Mateus Franco <[email protected]>
Summary
This pull request adds experimental support for a hybrid post-quantum handshake to wireguard-go, combining the existing X25519-based key agreement with ML-KEM (Kyber-1024) for key encapsulation and ML-DSA (Dilithium) for authentication. The goal is to provide a prototype implementation of a post-quantum-ready WireGuard handshake while preserving the original Noise_IK pattern and existing behaviour for non-PQ peers.
The implementation is based on the `circl` cryptographic library from Cloudflare, which provides Go implementations of Kyber and Dilithium.
Design overview
The implementation is split into two main parts:
- Hybrid key agreement (ML-KEM / Kyber-1024)
  - … `MessageInitiation` and is encrypted within the Noise handshake.
  - … `combinedSecret`, which feeds into the existing Noise key schedule.
- Hybrid authentication (ML-DSA / Dilithium)
  - `MessageInitiation` is extended with a `Signature` field that carries a Dilithium signature over the serialized message fields (excluding MAC and signature fields).
Implementation details
New sizes and types
- `MLKEMPublicKeySize`, `MLKEMPrivateKeySize`, `MLKEMCiphertextSize`.
- `MLDSAPublicKeySize`, `MLDSAPrivateKeySize`, `MLDSASignatureSize`.
- … `device/noise-types.go`.
Extended identity and handshake structures
- `staticIdentity` now includes:
  - `mlkemPrivateKey`, `mlkemPublicKey`
  - `mldsaPrivateKey`, `mldsaPublicKey`
- `Handshake` now includes:
  - `remoteMLKEMStatic` (peer’s ML-KEM public key)
  - `remoteMLDSAStatic` (peer’s ML-DSA public key)
Handshake protocol changes
- `device/noise-protocol.go`:
  - `MessageInitiation` was extended with:
    - `MLKEM` field to carry the Kyber ciphertext.
    - `Signature` field to carry the Dilithium signature.
  - `CreateMessageInitiation`:
    - … `kyber1024.Scheme().Encapsulate(remoteMLKEMStatic)` to obtain `mlkemSecret` and `ciphertext`.
    - … `ciphertext` into `msg.MLKEM`.
    - … `mlkemSecret` via `KDF2` into `combinedSecret`.
    - … `msg.Signature`.
  - `ConsumeMessageInitiation`:
    - … `remoteMLDSAStatic` and verifies `msg.Signature` with the Dilithium verification routine.
    - … `msg.MLKEM` and calls `Decapsulate` with the local ML-KEM private key to recover `mlkemSecret`.
    - … `combinedSecret` using the same KDF and continues with the existing Noise key schedule.
UAPI and key management
- `device/uapi.go` was updated to accept:
  - `mlkem_private_key`, `mlkem_public_key`
  - `mldsa_private_key`, `mldsa_public_key`
- `device/quantum-keys.go` was added providing:
  - `GenerateQuantumKeyPair` for Kyber-1024 key pairs.
  - `GenerateMLDSAKeyPair` for Dilithium5 key pairs.
  - … `circl`’s `kem.Scheme` API for Kyber (via `GenerateKeyPair`) and `sign.Scheme` for Dilithium (via `GenerateKey`).
Compatibility
Testing
Unit tests
Integration tests
Manual validation
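For reference, the secret-mixing step listed under "Handshake protocol changes" (folding `mlkemSecret` into `combinedSecret` through a two-output KDF) can be sketched as below. This is an illustration only, not the code in this pull request. wireguard-go's KDF is built on BLAKE2s; SHA-256 from the standard library is swapped in here so the example has no external dependencies. Which value acts as the key and which as the input, and the names `kdf2`, `hmacSHA256`, and `combineSecrets`, are assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// hmacSHA256 computes HMAC-SHA256 over the concatenation of parts.
func hmacSHA256(key []byte, parts ...[]byte) [32]byte {
	mac := hmac.New(sha256.New, key)
	for _, p := range parts {
		mac.Write(p)
	}
	var out [32]byte
	copy(out[:], mac.Sum(nil))
	return out
}

// kdf2 is an HKDF-style extract-then-expand producing two outputs.
// SHA-256 stands in for BLAKE2s in this sketch.
func kdf2(key, input []byte) (t0, t1 [32]byte) {
	prk := hmacSHA256(key, input)
	t0 = hmacSHA256(prk[:], []byte{0x1})
	t1 = hmacSHA256(prk[:], t0[:], []byte{0x2})
	return t0, t1
}

// combineSecrets mixes the KEM shared secret into the running chain
// key. Assumption: the chain key is the KDF key and the KEM secret is
// the KDF input; the first output replaces the chain key and the
// second plays the role of combinedSecret.
func combineSecrets(chainKey, mlkemSecret []byte) (newChainKey, combinedSecret [32]byte) {
	return kdf2(chainKey, mlkemSecret)
}

func main() {
	chainKey := make([]byte, 32)    // placeholder for the chain key after the X25519 steps
	mlkemSecret := make([]byte, 32) // placeholder for the encapsulated shared secret
	for i := range mlkemSecret {
		mlkemSecret[i] = byte(i + 1)
	}
	// Initiator and responder run the same derivation on their own copies.
	_, initiator := combineSecrets(chainKey, mlkemSecret)
	_, responder := combineSecrets(chainKey, mlkemSecret)
	fmt.Println("both sides derive the same combined secret:", initiator == responder)
}
```

The point of the hybrid construction is that the derived secret depends on both inputs, so changing either the classical chain key or the KEM secret yields a different result.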