Quite important: did you try assigning cores to the processes (i.e. setting CPU affinity)? That decreases run times by about 10-15% for single-threading.
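In case it helps, here is a minimal sketch of what assigning a core to a process can look like from Python. It assumes Linux, since `os.sched_setaffinity` is not available on Windows or macOS; the helper name `pin_to_core` is made up for illustration, and this is not how BOINC or PrimeGrid handle it themselves.

```python
import os

def pin_to_core(pid, core):
    """Restrict process `pid` to a single logical core (pid 0 = this process).

    Returns the resulting affinity set, or None where the call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: non-Linux platform, nothing is changed
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)
```

For example, `pin_to_core(0, min(os.sched_getaffinity(0)))` pins the calling process to the lowest-numbered core it is currently allowed to use. On other systems a tool such as Task Manager's "Set affinity" option does the same job by hand.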
Also, it takes approximately half the time when running with 2 threads? That's a close call then. On the other hand, a slightly lower throughput might be worth it for the increased rate of 1sts. Still, I would really test this for a few days and compare throughputs. I ran SGS for a long time on an i3-2120 (2 cores) and throughput was definitely higher single-threaded.
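To make the "close call" concrete, here is a rough sketch of the throughput arithmetic on a 2-core machine. The task times (1000 s, 500 s, 520 s) are invented numbers purely for illustration, not measurements from any real host.

```python
SECONDS_PER_DAY = 86_400

def tasks_per_day(seconds_per_task, concurrent_tasks):
    """Tasks finished per day when `concurrent_tasks` run side by side."""
    return concurrent_tasks * SECONDS_PER_DAY / seconds_per_task

# Hypothetical timings: two single-threaded tasks at once vs. one 2-thread task.
single_threaded = tasks_per_day(1000, 2)  # 2 tasks in parallel, 1000 s each
exactly_half = tasks_per_day(500, 1)      # 1 task, 2 threads, exactly half the time
a_bit_over_half = tasks_per_day(520, 1)   # 1 task, 2 threads, slightly more than half

print(single_threaded, exactly_half, a_bit_over_half)
```

If the 2-thread time is exactly half, throughput is identical; anything above half means single-threading delivers more tasks per day, while each individual task still comes back sooner with 2 threads.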
Just out of curiosity, what does the Prime95 benchmark say about it? The FFT size is 128k, which really sounds like single-threading should be the fastest option. But of course, in the end what matters is what PG says your computer delivered in the last 24 hours.
Primes: 1281979 & 12+8+1979 & 1+2+8+1+9+7+9 & 1^2+2^2+8^2+1^2+9^2+7^2+9^2 & 12*8+19*79 & 12^8-1979 & 1281979 + 4 (cousin prime)