* bs_transpose_dst only
** original
==> test.ccomp.host.out <==
cycles: 3080223
==> test.ccomp.kvx.out <==
cycles: 10145951
==> test.gcc.host.out <==
cycles: 1485887
==> test.gcc.kvx.out <==
cycles: 4078535
** neg and
==> test.ccomp.host.out <==
cycles: 2905049
==> test.ccomp.kvx.out <==
cycles: 7995063
==> test.gcc.host.out <==
cycles: 1858263
==> test.gcc.kvx.out <==
cycles: 5255763
** cmove but bad register scheduling
==> test.ccomp.host.out <==
cycles: 4363682
==> test.ccomp.kvx.out <==
cycles: 7208629
==> test.gcc.host.out <==
cycles: 2916854
==> test.gcc.kvx.out <==
cycles: 5646730
** cmove via matching the and
==> test.ccomp.host.out <==
cycles: 2553732
==> test.ccomp.kvx.out <==
cycles: 7208629
==> test.gcc.host.out <==
cycles: 1849125
==> test.gcc.kvx.out <==
cycles: 5255763
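
The measured code is not reproduced in these notes. As a rough sketch only, assuming "neg and" means masking a value with the negation of a 0/1 bit and "cmove" means the equivalent conditional select, the two forms could look like this. All function names here are hypothetical and not taken from the benchmarked source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the benchmarked code. */

/* "neg and" form: 0 - bit is all zeros (bit == 0) or all ones (bit == 1),
   so the AND yields either 0 or x. The caller must pass bit as 0 or 1. */
uint64_t select_neg_and(uint64_t bit, uint64_t x) {
    return (0 - bit) & x;
}

/* Select form: a compiler may (or may not) lower this to a conditional move. */
uint64_t select_cmove(uint64_t bit, uint64_t x) {
    return bit ? x : 0;
}

/* Returns 1 if both forms agree for bit = 0 and bit = 1 on the given x. */
int selects_agree(uint64_t x) {
    for (uint64_t bit = 0; bit <= 1; bit++) {
        if (select_neg_and(bit, x) != select_cmove(bit, x)) {
            return 0;
        }
    }
    return 1;
}
```

Because the two forms compute the same value for bit in {0, 1}, a compiler can in principle recognize the first and emit the second. Whether that helps depends on the target, as the cycle counts above suggest.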
** hand-optimized loads
cycles: 6027072
* both bs_transpose_dst and bs_transpose_rev
** with both cmove
cycles: 6890902