proposal: encoding/{base32,base64}: constant-time implementations of RFC 4648 codecs for use with cryptography keys #73901
Comments
CC @golang/security
Just in case there's any confusion: I do not think this qualifies as a security vulnerability (or I would have reported it as one, rather than filing an issue with the intent to follow up with a patch).
👍 However, do we want new functions? It would be simpler for everyone if we made the current functions constant-time.
A constant-time alternative will certainly be slower than the general-purpose functions: It replaces a map look-up with many operations. In the context of handling cryptographic secrets, this trade-off is worth it. However, making all use-cases for a base64 codec constant-time would slow most Go programs down unnecessarily. It is for this reason that my opening offer was to implement new functions rather than change the current behavior.
A single map lookup is far from cheap, but I don't want to risk guessing the performance impact; a benchmark would really help 🙂.
We can always do this in reverse, make the current API constant-time, but add unsafe variants (if needed) for speed. Then folks would actually use the constant-time APIs, otherwise we might have low usage of such (new) APIs.
Doesn't the fact that the stdlib has configurable alphabets for base64/32 rule out replacing them?
@renthraysk It's not actually a total show-stopper (although it does make the patch a bit trickier to write). See #73909 for a first pass at a patch that adheres to the suggestions provided in this discussion so far.
Change https://go.dev/cl/676917 mentions this issue: |
I don't think we'd want to turn on constant-time behavior by default for all uses of encoding/{base32,base64}. I suspect the vast majority of their usage is for entirely benign inputs. That said, we probably would want to use it by default for encoding/pem, where the contents are much more likely to be sensitive (of course we should probably still add some way to disable this, for people who e.g. need to parse large numbers of certificates).

I haven't thought through this too deeply, but based on my understanding of how people typically do constant-time base64, could we add a generic decoder/encoder that generates the required bit shift offsets based on the alphabet, rather than needing to implement per-alphabet encoders and decoders? We'd then just need to add a
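A sketch of that generic idea, under stated assumptions (the names `run`, `runsOf`, and `decodeChar` are hypothetical, not from any patch in this thread): precompute the contiguous runs of the alphabet once at encoding-construction time, then decode each byte by touching every run with arithmetic only, so no branch or memory load depends on the input byte:

```go
package main

import "fmt"

// run describes one maximal contiguous run of the (public) alphabet:
// bytes lo..hi decode to values val..val+(hi-lo).
type run struct{ lo, hi, val int }

// runsOf splits an alphabet into its contiguous runs. The alphabet is
// public, so nothing in this precomputation depends on secret data.
func runsOf(alphabet string) []run {
	var rs []run
	for i := 0; i < len(alphabet); {
		j := i
		for j+1 < len(alphabet) && alphabet[j+1] == alphabet[j]+1 {
			j++
		}
		rs = append(rs, run{int(alphabet[i]), int(alphabet[j]), i})
		i = j + 1
	}
	return rs
}

// decodeChar maps ch to its value in the alphabet, or -1 if absent.
// Every run is visited with arithmetic only: the mask is -1 (all bits
// set) exactly when r.lo <= ch <= r.hi. This relies on ch being an
// 8-bit character value, so the differences fit below bit 8.
func decodeChar(rs []run, ch int) int {
	v := -1
	for _, r := range rs {
		mask := ((r.lo - 1 - ch) & (ch - r.hi - 1)) >> 8
		v += mask & (r.val + ch - r.lo + 1)
	}
	return v
}

func main() {
	rs := runsOf("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")
	fmt.Println(decodeChar(rs, 'A'), decodeChar(rs, 'z'), decodeChar(rs, '/'), decodeChar(rs, '!'))
	// prints: 0 51 63 -1
}
```

Note that the loop count depends only on the alphabet, which is public, so iterating over the runs leaks nothing about the input being decoded.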
I'm not sure this is actually correct (threw it together very quickly), but something along the lines of this: https://go.dev/play/p/4toQNnegoOh?
The problem with that sort of check is that the RFC 4648 alphabets for base64 are not contiguous, which requires arithmetic like https://go-review.googlesource.com/c/go/+/676917/1/src/encoding/base64/mapping.go to implement without branches.
I'm of the same opinion here. In this case, these lines can be omitted and the corresponding logic moved into other places (e.g., encoding/pem) as needed.
Can this arithmetic not be precomputed for the discontiguous ranges in the alphabet such that there don't need to be branches? Or is there something obvious I'm missing.
Sorry, I misread the code. It looks like your approach will work: compare https://go.dev/play/p/EEz9V8fgsCC with https://go.dev/play/p/VHwr1LZpst9

EDIT: Looks like there's a bit of a bug with the logic for handling
I wrote this a while ago. I think it's easier to read, and you can edit the constants for different character sets. These decode functions do have an optimization for single-character ranges (which is obvious to change). If you want Crockford's base32 decode, or case-insensitive hex or base32 decode, there are a few modifications, like adding

Note: for custom character sets there will be a point where scanning the character set becomes faster, but I don't know where that is, and I assume it's higher than for common character sets (in ASCII). Basically, unless the character set is randomised "for security", using ranges will leak info about the character set; but cases like that don't really matter, because such schemes will always find a way to break themselves.

func base64StandardDecodeChar(ch int) int {
	const LO1 = int('A') - 1
	const HI1 = int('Z') + 1
	const LO2 = int('a') - 1
	const HI2 = int('z') + 1
	const LO3 = int('0') - 1
	const HI3 = int('9') + 1
	const LO4 = int('+') - 1
	const HI4 = int('+') + 1
	const LO5 = int('/') - 1
	const HI5 = int('/') + 1
	const COUNT1 = 0
	const COUNT2 = COUNT1 + HI1 - LO1 - 1
	const COUNT3 = COUNT2 + HI2 - LO2 - 1
	const COUNT4 = COUNT3 + HI3 - LO3 - 1
	const COUNT5 = COUNT4 + HI4 - LO4 - 1
	value := -1
	// if ( LO < ch && ch < HI ) { value += COUNT + ch - LO }
	value += (((LO1 - ch) & (ch - HI1)) >> 8) & (COUNT1 + ch - LO1)
	value += (((LO2 - ch) & (ch - HI2)) >> 8) & (COUNT2 + ch - LO2)
	value += (((LO3 - ch) & (ch - HI3)) >> 8) & (COUNT3 + ch - LO3)
	value += (((LO4 - ch) & (ch - HI4)) >> 8) & (COUNT4 + 1) // ch - LO4 == 1
	value += (((LO5 - ch) & (ch - HI5)) >> 8) & (COUNT5 + 1) // ch - LO5 == 1
	return value
}
// Note: this variant requires import "math/bits".
func base64StandardDecodeCharFaster(ch int) int {
	const LO1 = int('A') - 1
	const HI1 = int('Z') + 1
	const LO2 = int('a') - 1
	const HI2 = int('z') + 1
	const LO3 = int('0') - 1
	const HI3 = int('9') + 1
	const CHAR4 = int('+')
	const CHAR5 = int('/')
	const COUNT1 = 0
	const COUNT2 = COUNT1 + HI1 - LO1 - 1
	const COUNT3 = COUNT2 + HI2 - LO2 - 1
	const COUNT4 = COUNT3 + HI3 - LO3 - 1
	const COUNT5 = COUNT4 + 1
	var borrow uint
	value := -1
	// if ( LO < ch && ch < HI ) { value += ch - LO + COUNT }
	value += (((LO1 - ch) & (ch - HI1)) >> 8) & (ch - LO1 + COUNT1)
	value += (((LO2 - ch) & (ch - HI2)) >> 8) & (ch - LO2 + COUNT2)
	value += (((LO3 - ch) & (ch - HI3)) >> 8) & (ch - LO3 + COUNT3)
	// if ( ch == CHAR ) { value += COUNT + 1 }
	_, borrow = bits.Sub(uint(ch^CHAR4), 1, 0)
	value += (COUNT4 + 1) & -int(borrow)
	_, borrow = bits.Sub(uint(ch^CHAR5), 1, 0)
	value += (COUNT5 + 1) & -int(borrow)
	return value
}
func base64StandardEncodeChar(src int) byte {
	const LO1 = int('A')
	const HI1 = int('Z')
	const LO2 = int('a')
	const HI2 = int('z')
	const LO3 = int('0')
	const HI3 = int('9')
	const LO4 = int('+')
	const HI4 = int('+')
	const LO5 = int('/')
	src += LO1
	// if ( HI < src ) { src += LO_NEXT - HI - 1 }
	src += ((HI1 - src) >> 8) & (LO2 - HI1 - 1)
	src += ((HI2 - src) >> 8) & (LO3 - HI2 - 1)
	src += ((HI3 - src) >> 8) & (LO4 - HI3 - 1)
	src += ((HI4 - src) >> 8) & (LO5 - HI4 - 1)
	return byte(src)
}

This all assumes that it's in ASCII. If Go can compile to IBM mainframes, then strings are likely in EBCDIC, which will break things.
No, actually it would not be simpler for everyone. Base64 encoding is used extensively in various formats (JSON, XML, etc.) without any need to be cryptographically secure. It would be a big regression if all those cases got slowed down for no good reason.
Ignoring the actual implementation details for a moment, I think a reasonable proposal here is to make the following change to encoding/base{32,64}:
and the following changes to the existing encoding/pem functions:
and add these new functions to encoding/pem:
Another option would be to add a new field to Block that allows setting variable-time behavior for encoding, but we'd still need a VariableTimeDecode function. I think I like adding the new methods more, since it provides better symmetry in the API.
@rolandshoemaker That does seem like a better user experience than the API changes in my hasty patch.
What's the use case for variable-time encoding in encoding/pem? encoding/base{32,64} can't be constant-time by default (performance regression, most users don't need constant-time encoding). If encoding/pem.{Encode,Decode} become constant-time by default, then we're saying that most programs either need constant-time encoding or don't care about the performance penalty. That seems plausible to me. Are there programs that use PEM, don't need constant-time encoding, and care about the (hopefully small) performance cost of constant-time base64 encoding?
Consider a use-case of a Key and/or Certificate Transparency system that accepts PEM-encoded public keys (which are not secret) or certificates (which in turn only contain public information) in order to validate them before committing them to a Merkle tree. In such a hypothetical system, slowing down the base64 decoding by default would add a performance bottleneck with no real upside for the system in question. This is the sort of consideration I had in mind, and why I opened with not enabling it by default.
That's an example of a system that doesn't care about constant-time encoding. Is PEM encoding/decoding going to be in the critical performance path in a system like that, though? How much of a cost are we talking about here? If we're going to double the API surface of the encoding/pem package and require all future users to spend time trying to decide whether they need constant-time encoding or not, then I think we should quantify what the actual performance difference is.
I think we need benchmarks to make reasonable decisions here about the trade-offs, but I will note that encoding/pem is currently only three functions. Doubling the API surface is perhaps not ideal, but concretely that comes down to adding three new functions, which isn't particularly horrific. The point about requiring users to decide is valid, but I think the combination of naming and reasonable documentation makes it less scary. Additionally, the fact that the existing Encode/Decode will become "safe" means it's unlikely people will unintentionally begin using the "unsafe" permutations unless they are explicitly looking for them for performance reasons. Of course, all of this is moot if the benchmarks show that the performance degradation is relatively low, hence why we need numbers before we can make a decision.
Proposal Details
Background
The current implementations of the base32 and base64 codecs are not ideal for cryptographic use cases. The linked document describes practical timing side-channels against general purpose RFC 4648 codecs when used to encode/decode cryptographic secrets. You can verify the current implementation uses a table look-up in both directions, which introduces the risk of cache-timing attacks.
Specialized implementations of these codecs exist. @Sc00bz wrote a constant-time implementation of base64 in C, which I used as a basis for both base32 and base64 in PHP.
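The core trick those implementations rely on is replacing the table lookup with a branchless range check: a mask that is all-ones when a byte falls inside an alphabet range, computed with arithmetic alone. A minimal sketch of that primitive (the name `inRangeMask` is illustrative, not from any of the linked code):

```go
package main

import "fmt"

// inRangeMask returns -1 (all bits set) when lo < ch < hi and 0
// otherwise, using only arithmetic: both differences are negative
// exactly when ch is strictly inside the range, so the AND of the two
// is negative, and an arithmetic shift smears the sign bit. This works
// for 8-bit character values, whose differences fit below bit 8.
func inRangeMask(lo, hi, ch int) int {
	return ((lo - ch) & (ch - hi)) >> 8
}

func main() {
	lo, hi := int('A')-1, int('Z')+1
	fmt.Println(inRangeMask(lo, hi, int('M'))) // prints -1: 'M' is in A..Z
	fmt.Println(inRangeMask(lo, hi, int('!'))) // prints 0: '!' is not
}
```

The mask can then select a candidate decoded value without branching, which is exactly what the per-range lines in the snippets earlier in this thread do.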
Proposed Change
New functions in encoding/base32 and encoding/base64:

- EncodeToStringConstTime as an alternative to EncodeToString
- DecodeStringConstTime as an alternative to DecodeString
These functions could then be used in other packages that handle cryptographic secrets, and auditors could be absolutely sure that timing leaks are not present.
(I'm happy to provide a patch for this, if accepted.)