I ran all tests under Linux using mprime. I found strange timings, but only at 640K with 8 workers, one CPU core per worker (SMT is disabled), and I have no explanation for them. Any ideas?
Prime95 64-bit version 30.8, RdtscTiming=1
Timings for 640K all-complex FFT length (8 cores, 1 worker): 0.43 ms. Throughput: 2317.71 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 2 workers): 0.54, 0.54 ms. Throughput: 3704.59 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 4 workers): 1.05, 1.04, 1.04, 1.04 ms. Throughput: 3831.54 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 8 workers): 4.22, 4.22, 4.22, 2.59, 4.20, 4.20, 2.38, 4.21 ms. Throughput: 2230.50 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 1 worker): 0.43 ms. Throughput: 2330.67 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 2 workers): 0.58, 0.58 ms. Throughput: 3428.54 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 4 workers): 1.13, 1.13, 1.14, 1.13 ms. Throughput: 3530.72 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 8 workers): 4.20, 4.24, 2.60, 4.08, 2.42, 4.35, 4.37, 4.24 ms. Throughput: 2210.83 iter/sec.
Prime95 64-bit version 29.8, RdtscTiming=1
Timings for 640K all-complex FFT length (8 cores, 1 worker): 0.43 ms. Throughput: 2301.15 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 2 workers): 0.55, 0.55 ms. Throughput: 3626.02 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 4 workers): 1.07, 1.07, 1.06, 1.06 ms. Throughput: 3751.43 iter/sec.
Timings for 640K all-complex FFT length (8 cores, 8 workers): 4.29, 2.77, 4.18, 4.32, 4.26, 3.45, 3.01, 4.34 ms. Throughput: 2152.02 iter/sec.
This machine is dedicated to crunching and nothing else. Also, as the values marked in red show, it is not always the same cores that are faster than the rest.
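For anyone who wants to compare these runs: the reported throughput looks like the sum of each worker's rate (1000 / ms per iteration), since that matches the numbers above to within rounding. Here is a minimal Python sketch that assumes this relationship. The regex and the function names are my own and are not part of mprime; the two sample lines are copied from the 30.8 output above.

```python
import re

# Two lines copied from the version 30.8 output in this post.
LINES = [
    "Timings for 640K all-complex FFT length (8 cores, 4 workers): "
    "1.05, 1.04, 1.04, 1.04 ms. Throughput: 3831.54 iter/sec.",
    "Timings for 640K all-complex FFT length (8 cores, 8 workers): "
    "4.22, 4.22, 4.22, 2.59, 4.20, 4.20, 2.38, 4.21 ms. Throughput: 2230.50 iter/sec.",
]

# Hypothetical pattern written for the line format shown above.
PATTERN = re.compile(
    r"\((\d+) cores?, (\d+) workers?\): ([\d.,\s]+) ms\. "
    r"Throughput: ([\d.]+) iter/sec"
)


def parse_line(line):
    """Return (workers, per-worker times in ms, reported throughput)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("line does not look like a timing line: " + line)
    workers = int(m.group(2))
    times_ms = [float(t) for t in m.group(3).split(",")]
    reported = float(m.group(4))
    return workers, times_ms, reported


def estimated_throughput(times_ms):
    """Assumption: total throughput is the sum of 1000/ms over all workers."""
    return sum(1000.0 / t for t in times_ms)


def spread(times_ms):
    """Ratio of slowest to fastest worker; 1.0 means perfectly even."""
    return max(times_ms) / min(times_ms)


if __name__ == "__main__":
    for line in LINES:
        workers, times_ms, reported = parse_line(line)
        est = estimated_throughput(times_ms)
        print(f"{workers} workers: reported {reported:.2f}, "
              f"estimated {est:.2f}, slow/fast ratio {spread(times_ms):.2f}")
```

The slow/fast ratio makes the oddity easy to spot: it stays near 1 for 4 workers, while in the 8-worker runs two workers are clearly faster than the other six.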
____________
92*10^1439761-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
314187728^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie!