crypto: powerpc - Add POWER8 optimised crc32c
Use the vector polynomial multiply-sum instructions in POWER8 to
speed up crc32c.

This is just over 41x faster than the slice-by-8 method that it
replaces. Measurements on a 4.1 GHz POWER8 show it sustaining
52 GiB/sec.

A simple btrfs write performance test:

    dd if=/dev/zero of=/mnt/tmpfile bs=1M count=4096
    sync

is over 3.7x faster.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
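
For readers who want a baseline to check an accelerated crc32c against, here is a minimal bit-at-a-time sketch. It is not taken from this patch: the helper name crc32c_bitwise and its calling convention (seed inverted on entry, result inverted on exit) are assumptions made for illustration. It is far slower than both slice-by-8 and the vpmsum code and is only meant as a correctness reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC32C (Castagnoli) polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference helper (not part of this patch): processes the
 * buffer one bit at a time. Assumes the common convention of inverting
 * the running CRC on entry and on exit, so a fresh checksum starts from
 * crc = 0 and partial results can be chained across calls.
 */
static uint32_t crc32c_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* Shift right; fold in the polynomial if the low bit was set. */
			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & -(crc & 1u));
		}
	}
	return ~crc;
}
```

With this convention, crc32c_bitwise(0, "123456789", 9) yields the standard CRC32C check value 0xE3069283, which any faster implementation using the same convention should reproduce.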
Showing 5 changed files:
- arch/powerpc/crypto/Makefile: 2 additions, 0 deletions
- arch/powerpc/crypto/crc32c-vpmsum_asm.S: 1553 additions, 0 deletions
- arch/powerpc/crypto/crc32c-vpmsum_glue.c: 167 additions, 0 deletions
- arch/powerpc/include/asm/ppc-opcode.h: 12 additions, 0 deletions
- crypto/Kconfig: 11 additions, 0 deletions