linux/arch
Ard Biesheuvel 0eee0fbd41 arm64: crypto: increase AES interleave to 4x
This patch increases the interleave factor for parallel AES modes
to 4x. This improves performance on Cortex-A57 by ~35%. This is
due to the 3-cycle latency of AES instructions on the A57's
relatively deep pipeline (compared to Cortex-A53 where the AES
instruction latency is only 2 cycles).

At the same time, disable inline expansion of the core AES functions,
as the performance benefit of this feature is negligible.

  Measured on AMD Seattle (using tcrypt.ko mode=500 sec=1):

  Baseline (2x interleave, inline expansion)
  ------------------------------------------
  testing speed of async cbc(aes) (cbc-aes-ce) decryption
  test 4 (128 bit key, 8192 byte blocks): 95545 operations in 1 seconds
  test 14 (256 bit key, 8192 byte blocks): 68496 operations in 1 seconds

  This patch (4x interleave, no inline expansion)
  -----------------------------------------------
  testing speed of async cbc(aes) (cbc-aes-ce) decryption
  test 4 (128 bit key, 8192 byte blocks): 124735 operations in 1 seconds
  test 14 (256 bit key, 8192 byte blocks): 92328 operations in 1 seconds

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-02-26 18:31:46 +00:00
..
alpha asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
arc Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma 2015-02-18 08:49:20 -08:00
arm Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2015-02-22 09:57:16 -08:00
arm64 arm64: crypto: increase AES interleave to 4x 2015-02-26 18:31:46 +00:00
avr32 asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
blackfin Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-02-21 12:59:04 -08:00
c6x all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
cris CRIS changes for 3.20 2015-02-15 18:02:02 -08:00
frv asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
hexagon all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
ia64 asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
m32r asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
m68k asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
metag asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
microblaze all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
mips Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2015-02-21 19:41:38 -08:00
mn10300 OK, this has the big virtio 1.0 implementation, as specified by OASIS. 2015-02-18 09:24:01 -08:00
nios2 nios2 for v3.20 2015-02-17 14:23:42 -08:00
openrisc asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
parisc Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2015-02-19 10:07:08 -08:00
powerpc The clock framework changes for 3.20 contain the usual driver additions, 2015-02-21 12:30:30 -08:00
s390 Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-02-22 17:42:14 -08:00
score all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
sh asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
sparc asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
tile tile: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
um all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
unicore32 mm: vmalloc: pass additional vm_flags to __vmalloc_node_range() 2015-02-13 21:21:42 -08:00
x86 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-21 11:12:07 -08:00
xtensa asm-generic: uaccess.h cleanup 2015-02-18 10:02:24 -08:00
.gitignore
Kconfig