linux/arch/mips
Chen Jie 615eb603f4 MIPS: csum_partial: Improve instruction parallelism.
Computing the sum introduces true data dependencies. This patch removes
some of them, which increases instruction-level parallelism.

This patch brings up to 50% csum performance gain on Loongson 3a.

One example of how this patch works is in CSUM_BIGCHUNK1:
// ** original **    vs    ** patch applied **
    ADDC(sum, t0)           ADDC(t0, t1)
    ADDC(sum, t1)           ADDC(t2, t3)
    ADDC(sum, t2)           ADDC(sum, t0)
    ADDC(sum, t3)           ADDC(sum, t2)

In the original implementation, each ADDC(sum, ...) depends on the sum
value updated by the previous ADDC (sum is a source operand).

With this patch applied, the first two ADDC operations are independent
and can therefore execute simultaneously where the hardware allows.
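
The regrouping can be modelled in C. This is only a sketch: addc() is a
hypothetical helper that assumes ADDC behaves as a 32-bit add with end-around
carry, and sum_serial()/sum_paired() are illustrative names, not kernel code.
Both orderings return the same value; the paired version just shortens the
chain of operations that must wait on sum.

```c
#include <stdint.h>

/* Hypothetical C model of ADDC: add, then fold the carry back in. */
static inline uint32_t addc(uint32_t a, uint32_t b)
{
    a += b;
    return a + (a < b);   /* (a < b) is 1 exactly when the add wrapped */
}

/* Original ordering: every step reads the sum written by the previous one. */
uint32_t sum_serial(uint32_t sum, const uint32_t t[4])
{
    sum = addc(sum, t[0]);
    sum = addc(sum, t[1]);
    sum = addc(sum, t[2]);
    sum = addc(sum, t[3]);
    return sum;
}

/* Patched ordering: the first two adds do not touch sum or each other. */
uint32_t sum_paired(uint32_t sum, const uint32_t t[4])
{
    uint32_t t0 = addc(t[0], t[1]);   /* independent of the next line */
    uint32_t t2 = addc(t[2], t[3]);
    sum = addc(sum, t0);
    sum = addc(sum, t2);
    return sum;
}
```

The serial version has four dependent adds on sum; the paired version has
only two, at the cost of overwriting two of the temporaries.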

Another example is in the "copy and sum calculating chunk":
// ** original **    vs    ** patch applied **
    STORE(t0, UNIT(0) ...   STORE(t0, UNIT(0) ...
    ADDC(sum, t0)           ADDC(t0, t1)
    STORE(t1, UNIT(1) ...   STORE(t1, UNIT(1) ...
    ADDC(sum, t1)           ADDC(sum, t0)
    STORE(t2, UNIT(2) ...   STORE(t2, UNIT(2) ...
    ADDC(sum, t2)           ADDC(t2, t3)
    STORE(t3, UNIT(3) ...   STORE(t3, UNIT(3) ...
    ADDC(sum, t3)           ADDC(sum, t2)

With this patch applied, the first ADDC and the ADDC two positions after it
(ADDC(t0, t1) and ADDC(t2, t3)) are independent.
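
The interleaved store-and-sum ordering can be sketched in C in the same way.
As before, addc() is a hypothetical model of ADDC (assumed to be an add with
end-around carry) and copy_and_sum4() is an illustrative name. Each word is
stored before the temporary holding it is overwritten by a pairwise add.

```c
#include <stdint.h>

/* Hypothetical C model of ADDC: add, then fold the carry back in. */
static inline uint32_t addc(uint32_t a, uint32_t b)
{
    a += b;
    return a + (a < b);
}

/* Copy four words from src to dst and fold them into sum. */
uint32_t copy_and_sum4(uint32_t *dst, const uint32_t *src, uint32_t sum)
{
    uint32_t t0 = src[0], t1 = src[1], t2 = src[2], t3 = src[3];

    dst[0] = t0;            /* store t0 before it is overwritten */
    t0 = addc(t0, t1);      /* does not read or write sum */
    dst[1] = t1;            /* t1 is still the original word */
    sum = addc(sum, t0);
    dst[2] = t2;            /* store t2 before it is overwritten */
    t2 = addc(t2, t3);      /* independent of the add into sum above */
    dst[3] = t3;
    sum = addc(sum, t2);
    return sum;
}
```

Only the two adds into sum form a dependency chain; the two pairwise adds
can be issued without waiting for it.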

Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9608/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-01 17:22:11 +02:00
..
alchemy
ar7
ath25
ath79
bcm47xx MIPS: BCM47XX: Fix coding style to match kernel standards 2015-04-01 17:22:10 +02:00
bcm63xx
bmips
boot MIPS: OCTEON: add GPIO LED support for DSR-1000N 2015-04-01 17:22:10 +02:00
cavium-octeon MIPS: OCTEON: add GPIO LED support for DSR-1000N 2015-04-01 17:22:10 +02:00
cobalt
configs
dec
emma
fw
include
jazz
jz4740
kernel
kvm
lantiq
lasat
lib MIPS: csum_partial: Improve instruction parallelism. 2015-04-01 17:22:11 +02:00
loongson
loongson1
math-emu
mm
mti-malta
mti-sead3 MIPS: SEAD3: Nuke remaining I2C bits. 2015-04-01 17:22:08 +02:00
net
netlogic
oprofile
paravirt
pci
pistachio
pmcs-msp71xx
pnx833x
power MIPS: Hibernate: Restructure files and functions 2015-04-01 17:22:09 +02:00
ralink
rb532
sgi-ip22
sgi-ip27
sgi-ip32
sibyte
sni
txx9
vr41xx
Kbuild
Kbuild.platforms
Kconfig
Kconfig.debug
Makefile