title: Cryptography is not Magic
description: Crypto is often treated like a black art only anointed
experts can hope to wield safely. It's not. There are
laws, and we can learn them.
Don't roll your own crypto.
Broken crypto often looks like it works.
The best way to build crypto is to learn how to break it.
Use existing, vetted, well reviewed libraries.
Leave it to the experts.
Infosec people and their followers.
The running theme seems to be that cryptography is a kind of Dark Magic,
best left to anointed High Priests. Us mere Mortals cannot hope to
wield it safely without first becoming one of those vaunted Experts — a
futile endeavour for those of us who know their place.
While being a good first order approximation,
that kind of gate keeping is problematic on an number of levels. First,
some people back in the real world have needs that current crypto
systems don't always fill. Insulting them with "you don't know what you
are doing" does not help. Second, it has a couple perverse effects:
I would like to advocate a different approach. Cryptography has rules,
which are both simpler and more fundamental than we might think. We
could tell people what they don't know, and give them some pointers:
introductory books or courses, or specific words and
concepts. Here, I will outline what you need to look for before you
roll your own crypto system. I've identified 3 broad categories:
implementing crypto, designing protocols, and designing primitives.
Note: this is an opinion piece, so let my clarify where I came from. I
program professionally since 2007, and started to teach myself
cryptography 4 years ago. I wrote a crypto library, whose
recent audit uncovered no major flaw.
Perhaps surprisingly, implementing cryptographic primitives & protocols
requires little cryptographic knowledge. Choosing what to implement
is more delicate (you need to know the state of the art), but once
you've made your choice, your only worries are side channels and
Side channels have one rule: don't let any data flow from secrets to
that channel. Interesting side channels include timings, energy
consumption, electromagnetic emissions… Most can be ignored most of the
time (energy consumption for instance requires physical access), except
timings. Never ignore timings, they're part of most threat models.
On most CPUs, the timing side channel can be eliminated by removing all
secret dependent branches and all secret dependent indices. Some CPUs
also have variable time arithmetic operations. Watch out for
multiplications and shifts by variable amounts in particular.
Also be careful around compilers and interpreters. No mainstream
language specifies how to make constant time code, so your tools could
insert secret dependent branches where the source code had none. High
level languages in particular tend to have variable time arithmetic. In
practice, low level languages like C are fairly reasonable, but I've
Note that some primitives are more amenable to constant time software
implementations than others: Chacha20 tends to be naturally immune to
timing attacks on most platforms, while AES requires special care if you
don't have hardware support.
Then we have correctness. The slightest error may throw cryptographic
guarantees out the window, so we cannot tolerate errors. Your code must
be bug free, period. It's not easy, but it is simple: it's all about
tests and proofs.
Ciphers and hashes are fairly easy to test. At their core, they are
about taking an input, and mangle it so thoroughly that the slightest
change in the input will completely garble the output. In practice, the
slightest programming error tend to change the end result completely.
All you have to do is compare the output of your primitive with a
reference implementation or test vectors. Ideally, do that for all
possible input & output lengths. Including zero (empty inputs).
Practically, you can stop at a couple iterations of your biggest
internal loop, to be sure you hit all data paths.
If you have an init-update-final interface, try all possible cut-off
points in the updates to make sure you get the same results. I've
caught several bugs that way.
Modular arithmetic (polynomial hashes, elliptic curves…) is more
delicate. Now we're adding or multiplying big numbers (130 bits, 255
bits…) that do not fit in a single machine register. Many libraries
don't run in constant time, and custom code is often faster. So you
end up dividing those big numbers in several limbs, which can fit in a
single machine register.
(You can think of a limb as a huge digit: instead of working in base
10, we can work in base 2³², and store the corresponding "digits" in
Once you've applied the precautions used for ciphers and hashes, the
biggest trap is limb overflow. Not a problem with naive
implementations, but there is one dangerous trick that often drastically
improves performance: delayed carry propagation. Make your limbs
smaller than the range of the registers they are stored into (26 bits
limbs with 32 bits registers for instance), so that they can temporarily
exceed their normal range.
With this trick, some overflow bugs happen rarely enough that random
tests will not catch them. There are two ways to address the problem:
either don't delay carry propagation to begin with (it's safest), or
write a mathematical proof that shows your algorithm never triggers an
overflow. Ideally you'd have a machine verify that proof.
Elliptic curves have their own traps. If you have explicit formulas
telling you what arithmetic operation to perform in what order, no
problem (beyond comparing to a reference implementation and limb
overflow). If you want to be clever, make sure you know exactly what
you are doing. If you see some mathematical term you haven't seen
before (like "birational equivalence"), don't guess, look it up.
I broke that rule once, and it wasn't pretty.
Also note that Elliptic curves often require conditional selection,
swapping, or even lookups. There are techniques to do that in constant
time. Use them. Failing to do so has enabled actual exploits.
Constructions and protocols are generally much simpler to implement than
their underlying primitives. The same precautions applies, with a
That's where things become interesting. Where implementing crypto was
mostly tedious, designing protocols is delicate. One is not
necessarily harder than the other, but they're very different.
This is no longer about faithfully implementing an unambiguous
specification. This is about addressing a threat model. This means
being aware of what you're up against, addressing that threat, and
proving that you have done so. On top of that, the protocol should
allow, even encourage, good APIs that are hard to misuse.
We need to realistically and precisely define the capabilities of the
adversary. A typical threat model is the untrusted network, where the
adversary can basically do anything: watch messages, intercept
messages, forge messages, replay messages…
Some adversaries can also steal your keys. Maybe the police comes with
a warrant. Maybe your computer is being hacked. You can limit the
damage with forward secrecy (messages sent before the key was stolen
can't be decrypted), and key compromise impersonation resistance
(where stealing your keys don't allow the attacker to impersonate
others when they are talking to you).
Yet another threat is meta data analysis. Maybe you want to hide your
identity to outsiders. Some protocols achieve some measure of
anonymity, where it is hard for the attacker to determine who is
talking to whom.
Or maybe you fear traffic analysis, where the size and timings of the
messages reveal too much about the content. You could mitigate that by
padding your messages to some standard size, or even send a constant
stream of data, and fill the bandwidth you don't use with noise.
Once you know your threat model, you need to demonstrate that your
protocol addresses it. For instance, when the adversary is an untrusted
network, you need IND-CCA2 (indistinguishability under adaptive
chosen ciphertext attack). It's a formalisation of what you need to
combat active adversaries (man in the middle). It's defined thus:
Let's try an example with passive adversaries, for which we only need
IND-CPA: independence under chosen plaintext attack. It's the same as
IND-CCA2, except we don't have the decryption oracle.
Alice needs to send messages to Bob, using a shared secret key
a stream cipher
Stream that takes a key as input, and outputs an
arbitrarily long key stream. (Real stream ciphers also have a
nonce, but we'll make do without one for the sake of the exercise.)
To send her messages, Alice will use key erasure, or ratchet:
K1, S1 = Stream(K0)
msg1 = plaintext XOR S1
K2, S2 = Stream(K1)
msg2 = plaintext XOR S2
K3, S3 = Stream(K2)
msg3 = plaintext XOR S3
To send a message, we first split the key stream in two parts: a new
encryption key, and a data stream. Note that
K1 is the first bytes of
S1 is the rest, so they're independent from each
other. Same for
S2, etc. We can't break that one with the
encryption oracle alone (IND-CPA). Here's why:
K(n)is an independent random string, then so are
S(n+1), which come from
K0is an independent random string.
S3… are all independent random strings. As a whole, they act like a one time pad.
Not the most rigorous proof. That will do for this simple construction
under this simple threat model, but a more complex protocol may not
tolerate as much hand waving. And if you want a machine to check your
proof (a good thing to strive for), it'll have to be perfect.
Another limitation is my choice of the symbolic model: I am assuming
the stream cipher is unbreakable, and will always be indistinguishable
from a true random number generator to anyone who doesn't know the key.
Works well enough for most purposes, but keep in mind that real
cryptographic primitives aren't perfect: they can't generate a genuinely
infinite stream without repeating themselves, and a finite amount of
computing work is enough to break them. Those limits are high enough to
be ignored in many cases, but depending on the primitive you use, and
what you use them for, they may not be.
The biggest limitation of all of course is that we only proved IND-CPA
here. This construction is only immune to passive attackers, that
only listen and never talk. What we really want is to protect against
active attackers, that could perform a full Man in the Middle attack.
A chosen ciphertext attack easily destroys our construction:
p2. Get back the ciphertext
c XOR r, get the plaintext
p XOR r. It will be equal to
p2, depending on which it came from.
We basically tricked the oracle into decrypting the challenge
ciphertext, without breaking the rules: we didn't directly asked to
c, we only took advantage of ciphertext malleability to get
around the restriction. To avoid this, we need authentication,
which I'll leave as an exercise to the reader for brevity's sake.
The proper way to achieve IND-CCA2 is to first encrypt the plaintext,
then compute an authentication tag over the ciphertext.
Encrypt-then-mac. We could instead authenticate the plaintext, and
only then encrypt the whole thing (including the authentication
tag). Mac-then-encrypt. It can be done correctly, and we can even prove
IND-CCA2. It is also a bad idea.
Ideally, we want the recipient to verify first. This has some
What your users need is authenticated encryption, where decryption
either successfully decrypts the message, or does nothing at all. That's
the only way you can be sure no information will leak because of
In fact, "success or nothing" is a rule I try generalise to all my APIs:
check the inputs, then act on them. If the inputs are wrong, don't do
anything, just return an error to the caller.
That's the tough one. There's generally no way to prove that a given
primitive is indeed as secure as it claims to be. There's an obvious
tension between performance and security, and designing something as
secure and as fast as the state of the art is not trivial to say the
I have also no experience in the domain. I can only give a few
pointers, that will hopefully make you realise what it means to stare
into the abyss.
All ciphers and hashes I know are designed around a core permutation:
mangling the hell out of a block of data to produce a pseudo-random
Some permutations (Keccak, Xoodoo, Gimli…) are designed to rule them
all, but they don't have to. The most popular ciphers and hashes have
their own permutation, tailored just for them. Chacha20's permutation
for instance produces zero when the input is all zero. Not very random,
but we don't care, because the input is never zero: part of it is a
constant, that aims to "break symmetry".
Speaking of which: before you make a serious attempt at designing your
hash, cipher, or permutation, there are some terms and concepts you
probably should be deeply familiar with:
Those are an incomplete list of terms I stumbled upon. My hope is by
the time you get familiar with those, you'll have a better idea of what
else you should study.
The primary use case of of polynomial hashes is message authentication
codes. They're not as easy to use as regular hashes, but they're very
fast, and their security properties are provable. If you're going to
invent one, you need to produce a proper security reduction (a
mathematical proof of how secure it is, given some assumptions).
Also note that Poly1305 and GHASH already cover a lot of ground.
Poly1305 is designed to allow simple and efficient software
implementations, while GHASH is very hardware friendly. One of them may
already offer what you need.
The modern way to use the difficulty of discrete logarithm. Unlike
other primitives, I feel like elliptic curves are found more than they
are invented. Granted, they're not trivial. The maths involved are
definitely intimidating, and a single mistake could destroy security.
One way to find a curve is to target a security level, choose a prime
that matches that security level (generally twice as big), and follow
the SafeCurves recommendations. The first curve you find is almost
certainly the one you want.
If you want to play with extension fields instead of prime fields, I can
only point out that some extension fields are more efficient on tiny
embedded targets, and the security literature around binary extension
fields is less stable than it is for prime fields.
I also strongly advise sticking to Montgomery or (twisted) Edwards
curves. They are relatively simple, and they're fast. They don't
have a prime order, but that can be worked around, or even solved
entirely. Avoid short Weierstraß curves. They're not
broken, but they are slower and harder to implement securely.
I know even less than Jon Snow. Can't help you, sorry.
I won't sugar coat it, rolling your own crypto is not easy. Mistakes
are easy to make, and the stakes are often high — getting it wrong can
even get people killed. Don't rush it, and if you can, seek guidance
and feedback. Some communities are very welcoming.
Still, we can teach ourselves. The rules may not be easy to follow,
but they are simple: