Commits · 5e51224d2afbda57f33f47485871ee5532145e18 · BeagleBoard.org / Linux

Nov 01, 2023

crypto: adiantum - flush destination page before unmapping · a312e07a

Eric Biggers authored 1 year ago

Upon additional review, the new fast path in adiantum_finish() is
missing the call to flush_dcache_page() that scatterwalk_map_and_copy()
was doing.  It's apparently debatable whether flush_dcache_page() is
actually needed, as per the discussion at
https://lore.kernel.org/lkml/YYP1lAq46NWzhOf0@casper.infradead.org/T/#u.
However, it appears that currently all the helper functions that write
to a page, such as scatterwalk_map_and_copy(), memcpy_to_page(), and
memzero_page(), do the dcache flush.  So do it to be consistent.

Fixes: dadf5e56

 ("crypto: adiantum - add fast path for single-page messages")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a312e07a

Oct 27, 2023

crypto: adiantum - stop using alignmask of shash_alg · 321dfe97

Eric Biggers authored 1 year ago


Now that the shash algorithm type does not support nonzero alignmasks,
shash_alg::base.cra_alignmask is always 0, so OR-ing it into another
value is a no-op.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

321dfe97

Oct 20, 2023

crypto: adiantum - add fast path for single-page messages · dadf5e56

Eric Biggers authored 1 year ago

When the source scatterlist is a single page, optimize the first hash
step of adiantum to use crypto_shash_digest() instead of
init/update/final, and use the same local kmap for both hashing the bulk
part and loading the narrow part of the source data.

Likewise, when the destination scatterlist is a single page, optimize
the second hash step of adiantum to use crypto_shash_digest() instead of
init/update/final, and use the same local kmap for both hashing the bulk
part and storing the narrow part of the destination data.

In some cases these optimizations improve performance significantly.

Note: ideally, for optimal performance each architecture should
implement the full "adiantum(xchacha12,aes)" algorithm and fully
optimize the contiguous buffer case to use no indirect calls. That's
not something I've gotten around to doing, though. This commit just
makes a relatively small change that provides some benefit with the
existing template-based approach.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dadf5e56

Oct 13, 2023

crypto: adiantum - Only access common skcipher fields on spawn · 3c45b578

Herbert Xu authored 1 year ago


As skcipher spawns may be of the type lskcipher, only the common
fields may be accessed.  This was already the case but use the
correct helpers to make this more obvious.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

3c45b578

Feb 13, 2023

crypto: api - Use data directly in completion function · 255e48eb

Herbert Xu authored 2 years ago


This patch does the final flag day conversion of all completion
functions which are now all contained in the Crypto API.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

255e48eb

Jan 02, 2021

crypto: remove cipher routines from public crypto API · 0eb76ba2

Ard Biesheuvel authored 4 years ago


The cipher routines in the crypto API are mostly intended for templates
implementing skcipher modes generically in software, and shouldn't be
used outside of the crypto subsystem. So move the prototypes and all
related definitions to a new header file under include/crypto/internal.
Also, let's use the new module namespace feature to move the symbol
exports into a new namespace CRYPTO_INTERNAL.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0eb76ba2

Aug 07, 2020

mm, treewide: rename kzfree() to kfree_sensitive() · 453431a5

Waiman Long authored 4 years ago

As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sen...

453431a5

Jul 16, 2020

crypto: algapi - use common mechanism for inheriting flags · 7bcb2c99

Eric Biggers authored 4 years ago

The flag CRYPTO_ALG_ASYNC is "inherited" in the sense that when a
template is instantiated, the template will have CRYPTO_ALG_ASYNC set if
any of the algorithms it uses has CRYPTO_ALG_ASYNC set.

We'd like to add a second flag (CRYPTO_ALG_ALLOCATES_MEMORY) that gets
"inherited" in the same way.  This is difficult because the handling of
CRYPTO_ALG_ASYNC is hardcoded everywhere.  Address this by:

  - Add CRYPTO_ALG_INHERITED_FLAGS, which contains the set of flags that
    have these inheritance semantics.

  - Add crypto_algt_inherited_mask(), for use by template ->create()
    methods.  It returns any of these flags that the user asked to be
    unset and thus must be passed in the 'mask' to crypto_grab_*().

  - Also modify crypto_check_attr_type() to handle computing the 'mask'
    so that most templates can just use this.

  - Make crypto_grab_*() propagate these flags to the template instance
    being created so that templates don't have to do this themselves.

Make crypto/simd.c propagate these flags too, since it "wraps" another
algorithm, similar to a template.

Based on a patch by Mikulas Patocka <mpatocka@redhat.com>
(https://lore.kernel.org/r/alpine.LRH.2.02.2006301414580.30526@file01.intranet.prod.int.rdu2.redhat.com

).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7bcb2c99

Jan 16, 2020

crypto: poly1305 - add new 32 and 64-bit generic versions · 1c08a104

Jason A. Donenfeld authored 5 years ago

These two C implementations from Zinc -- a 32x32 one and a 64x64 one,
depending on the platform -- come from Andrew Moon's public domain
poly1305-donna portable code, modified for usage in the kernel. The
precomputation in the 32-bit version and the use of 64x64 multiplies in
the 64-bit version make these perform better than the code it replaces.
Moon's code is also very widespread and has received many eyeballs of
scrutiny.

There's a bit of interference between the x86 implementation, which
relies on internal details of the old scalar implementation. In the next
commit, the x86 implementation will be replaced with a faster one that
doesn't rely on this, so none of this matters much. But for now, to keep
this passing the tests, we inline the bits of the old implementation
that the x86 implementation relied on. Also, since we now support a
slightly larger key space, via the union, some offsets had to be fixed
up.

Nonce calculation was folded in with the emit function, to take
advantage of 64x64 arithmetic. However, Adiantum appeared to rely on no
nonce handling in emit, so this path was conditionalized. We also
introduced a new struct, poly1305_core_key, to represent the precise
amount of space that particular implementation uses.

Testing with kbench9000, depending on the CPU, the update function for
the 32x32 version has been improved by 4%-7%, and for the 64x64 by
19%-30%. The 32x32 gains are small, but I think there's great value in
having a parallel implementation to the 64x64 one so that the two can be
compared side-by-side as nice stand-alone units.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1c08a104

Jan 08, 2020

crypto: cipher - make crypto_spawn_cipher() take a crypto_cipher_spawn · d5ed3b65

Eric Biggers authored 5 years ago


Now that all users of single-block cipher spawns have been converted to
use 'struct crypto_cipher_spawn' rather than the less specifically typed
'struct crypto_spawn', make crypto_spawn_cipher() take a pointer to a
'struct crypto_cipher_spawn' rather than a 'struct crypto_spawn'.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d5ed3b65

crypto: adiantum - use crypto_grab_{cipher,shash} and simplify error paths · ba448407

Eric Biggers authored 5 years ago

Make the adiantum template use the new functions crypto_grab_cipher()
and crypto_grab_shash() to initialize its cipher and shash spawns.

This is needed to make all spawns be initialized in a consistent way.

Also simplify the error handling by taking advantage of crypto_drop_*()
now accepting (as a no-op) spawns that haven't been initialized yet, and
by taking advantage of crypto_grab_*() now handling ERR_PTR() names.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ba448407

crypto: algapi - pass instance to crypto_grab_spawn() · de95c957

Eric Biggers authored 5 years ago


Currently, crypto_spawn::inst is first used temporarily to pass the
instance to crypto_grab_spawn().  Then crypto_init_spawn() overwrites it
with crypto_spawn::next, which shares the same union.  Finally,
crypto_spawn::inst is set again when the instance is registered.

Make this less convoluted by just passing the instance as an argument to
crypto_grab_spawn() instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

de95c957

crypto: skcipher - pass instance to crypto_grab_skcipher() · b9f76ddd

Eric Biggers authored 5 years ago

Initializing a crypto_skcipher_spawn currently requires:

1. Set spawn->base.inst to point to the instance.
2. Call crypto_grab_skcipher().

But there's no reason for these steps to be separate, and in fact this
unneeded complication has caused at least one bug, the one fixed by
commit 6db43410

 ("crypto: adiantum - initialize crypto_spawn::inst")

So just make crypto_grab_skcipher() take the instance as an argument.

To keep the function calls from getting too unwieldy due to this extra
argument, also introduce a 'mask' variable into the affected places
which weren't already using one.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b9f76ddd

crypto: remove propagation of CRYPTO_TFM_RES_* flags · af5034e8

Eric Biggers authored 5 years ago


The CRYPTO_TFM_RES_* flags were apparently meant as a way to make the
->setkey() functions provide more information about errors.  But these
flags weren't actually being used or tested, and in many cases they
weren't being set correctly anyway.  So they've now been removed.

Also, if someone ever actually needs to start better distinguishing
->setkey() errors (which is somewhat unlikely, as this has been unneeded
for a long time), we'd be much better off just defining different return
values, like -EINVAL if the key is invalid for the algorithm vs.
-EKEYREJECTED if the key was rejected by a policy like "no weak keys".
That would be much simpler, less error-prone, and easier to test.

So just remove CRYPTO_TFM_RES_MASK and all the unneeded logic that
propagates these flags around.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

af5034e8

Dec 09, 2019

treewide: Use sizeof_field() macro · c593642c

Pankaj Bharadiya authored 5 years ago


Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

	if [[ "$file" =~ $EXCLUDE_FILES ]]; then
		continue
	fi
	sed -i  -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com


Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net

c593642c

Nov 16, 2019

crypto: poly1305 - move core routines into a separate library · 48ea8c6e

Ard Biesheuvel authored 5 years ago


Move the core Poly1305 routines shared between the generic Poly1305
shash driver and the Adiantum and NHPoly1305 drivers into a separate
library so that using just this pieces does not pull in the crypto
API pieces of the generic Poly1305 routine.

In a subsequent patch, we will augment this generic library with
init/update/final routines so that Poyl1305 algorithm can be used
directly without the need for using the crypto API's shash abstraction.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

48ea8c6e

Apr 25, 2019

crypto: shash - remove shash_desc::flags · 877b5691

Eric Biggers authored 5 years ago

The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
actually started sleeping. For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API. However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk. It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

877b5691

Apr 18, 2019

crypto: run initcalls for generic implementations earlier · c4741b23

Eric Biggers authored 5 years ago

Use subsys_initcall for registration of all templates and generic
algorithm implementations, rather than module_init. Then change
cryptomgr to use arch_initcall, to place it before the subsys_initcalls.

This is needed so that when both a generic and optimized implementation
of an algorithm are built into the kernel (not loadable modules), the
generic implementation is registered before the optimized one.
Otherwise, the self-tests for the optimized implementation are unable to
allocate the generic implementation for the new comparison fuzz tests.

Note that on arm, a side effect of this change is that self-tests for
generic implementations may run before the unaligned access handler has
been installed. So, unaligned accesses will crash the kernel. This is
arguably a good thing as it makes it easier to detect that type of bug.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c4741b23

Jan 10, 2019

crypto: adiantum - initialize crypto_spawn::inst · 6db43410

Eric Biggers authored 6 years ago

crypto_grab_*() doesn't set crypto_spawn::inst, so templates must set it
beforehand.  Otherwise it will be left NULL, which causes a crash in
certain cases where algorithms are dynamically loaded/unloaded.  E.g.
with CONFIG_CRYPTO_CHACHA20_X86_64=m, the following caused a crash:

    insmod chacha-x86_64.ko
    python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'
    rmmod chacha-x86_64.ko
    python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'

Fixes: 059c2a4d

 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

6db43410

Dec 13, 2018

crypto: adiantum - fix leaking reference to hash algorithm · 00c9fe37

Eric Biggers authored 6 years ago

crypto_alg_mod_lookup() takes a reference to the hash algorithm but
crypto_init_shash_spawn() doesn't take ownership of it, hence the
reference needs to be dropped in adiantum_create().

Fixes: 059c2a4d

 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

00c9fe37

crypto: adiantum - adjust some comments to match latest paper · c6018e1a

Eric Biggers authored 6 years ago


The 2018-11-28 revision of the Adiantum paper has revised some notation:

- 'M' was replaced with 'L' (meaning "Left", for the left-hand part of
  the message) in the definition of Adiantum hashing, to avoid confusion
  with the full message
- ε-almost-∆-universal is now abbreviated as ε-∆U instead of εA∆U
- "block" is now used only to mean block cipher and Poly1305 blocks

Also, Adiantum hashing was moved from the appendix to the main paper.

To avoid confusion, update relevant comments in the code to match.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c6018e1a

crypto: adiantum - propagate CRYPTO_ALG_ASYNC flag to instance · b299362e

Eric Biggers authored 6 years ago

If the stream cipher implementation is asynchronous, then the Adiantum
instance must be flagged as asynchronous as well.  Otherwise someone
asking for a synchronous algorithm can get an asynchronous algorithm.

There are no asynchronous xchacha12 or xchacha20 implementations yet
which makes this largely a theoretical issue, but it should be fixed.

Fixes: 059c2a4d

 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b299362e

Nov 20, 2018

crypto: adiantum - add Adiantum support · 059c2a4d

Eric Biggers authored 6 years ago

Add support for the Adiantum encryption mode. Adiantum was designed by
Paul Crowley and is specified by our paper:

Adiantum: length-preserving encryption for entry-level processors
(https://eprint.iacr.org/2018/720.pdf

)

See our paper for full details; this patch only provides an overview.

Adiantum is a tweakable, length-preserving encryption mode designed for
fast and secure disk encryption, especially on CPUs without dedicated
crypto instructions. Adiantum encrypts each sector using the XChaCha12
stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
function, and an invocation of the AES-256 block cipher on a single
16-byte block. On CPUs without AES instructions, Adiantum is much
faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
and decryption about 5 times faster.

Adiantum is a specialization of the more general HBSH construction. Our
earlier proposal, HPolyC, was also a HBSH specialization, but it used a
different εA∆U hash function, one based on Poly1305 only. Adiantum's
εA∆U hash function, which is based primarily on the "NH" hash function
like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
consequently, Adiantum is about 20% faster than HPolyC.

This speed comes with no loss of security: Adiantum is provably just as
secure as HPolyC, in fact slightly *more* secure. Like HPolyC,
Adiantum's security is reducible to that of XChaCha12 and AES-256,
subject to a security bound. XChaCha12 itself has a security reduction
to ChaCha12. Therefore, one need not "trust" Adiantum; one need only
trust ChaCha12 and AES-256. Note that the εA∆U hash function is only
used for its proven combinatorical properties so cannot be "broken".

Adiantum is also a true wide-block encryption mode, so flipping any
plaintext bit in the sector scrambles the entire ciphertext, and vice
versa. No other such mode is available in the kernel currently; doing
the same with XTS scrambles only 16 bytes. Adiantum also supports
arbitrary-length tweaks and naturally supports any length input >= 16
bytes without needing "ciphertext stealing".

For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
order to make encryption feasible on the widest range of devices.
Although the 20-round variant is quite popular, the best known attacks
on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
security margin; in fact, larger than AES-256's. 12-round Salsa20 is
also the eSTREAM recommendation. For the block cipher, Adiantum uses
AES-256, despite it having a lower security margin than XChaCha12 and
needing table lookups, due to AES's extensive adoption and analysis
making it the obvious first choice. Nevertheless, for flexibility this
patch also permits the "adiantum" template to be instantiated with
XChaCha20 and/or with an alternate block cipher.

We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
where currently the only other suitable options are block cipher modes
such as AES-XTS. A big problem with this is that many low-end mobile
devices (e.g. Android Go phones sold primarily in developing countries,
as well as some smartwatches) still have CPUs that lack AES
instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too
slow to be viable on these devices. We did find that some "lightweight"
block ciphers are fast enough, but these suffer from problems such as
not having much cryptanalysis or being too controversial.

The ChaCha stream cipher has excellent performance but is insecure to
use directly for disk encryption, since each sector's IV is reused each
time it is overwritten. Even restricting the threat model to offline
attacks only isn't enough, since modern flash storage devices don't
guarantee that "overwrites" are really overwrites, due to wear-leveling.
Adiantum avoids this problem by constructing a
"tweakable super-pseudorandom permutation"; this is the strongest
possible security model for length-preserving encryption.

Of course, storing random nonces along with the ciphertext would be the
ideal solution. But doing that with existing hardware and filesystems
runs into major practical problems; in most cases it would require data
journaling (like dm-integrity) which severely degrades performance.
Thus, for now length-preserving encryption is still needed.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

059c2a4d

Admin message