## Other


Message boards : Project Staging Area : Which CPU program to use for large GFNs?

Michael Goetz
Volunteer moderator

Joined: 21 Jan 10
Posts: 13804
ID: 53948
Credit: 345,369,032
RAC: 3,564

Message 80711 - Posted: 7 Nov 2014 | 17:06:24 UTC

As long as 'b' isn't too large, Genefer is usually* the program of choice, especially on a modern CPU with AVX or FMA3. But when b becomes too large for the Genefer algorithm to work with 64-bit floating point math, the only alternative is to switch to 80-bit floating point ("x87"), which is much slower. When that happens, LLR (which can use all instructions up to FMA3) and PFGW (which can use instructions up to AVX) become faster than Genefer.

On older CPUs that lack advanced SIMD instructions (SSE2, SSE4, AVX, FMA3) Genefer may be faster than LLR and PFGW, but on newer computers the ability to use those instruction sets outweighs the speed of the algorithm.

Here are some examples from my Haswell (FMA3) Core i5:

Genefer:

    C:\Temp\GFN\3.2.5>genefer_windows64.exe -q "7392970^32768+1"
    genefer 3.2.5-dev (Windows/CPU/64-bit)
    Supported transform implementations: fma3 avx-intel sse4 sse2 default x87
    Copyright 2001-2014, Yves Gallot
    Copyright 2009, Mark Rodenkirch, David Underbakke
    Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
    Copyright 2011-2014, Iain Bethune, Michael Goetz, Ronald Schneider
    Command line: genefer_windows64.exe -q 7392970^32768+1
    Priority change succeeded.
    Testing 7392970^32768+1...
    Using x87 (80-bit) transform
    Resuming 7392970^32768+1 from a checkpoint (745471 iterations left)
    Estimated time remaining for 7392970^32768+1 is 0:17:12
    7392970^32768+1 is a probable composite. (RES=b6a7b30a1c669739) (225078 digits) (err = 0.0083) (time = 0:18:16) 11:43:29

LLR:

    C:\PRPNet\prpclient-5.3.1-windows\prpclient-2>llr64 -d -q"7392970^32768+1"
    Base factorized as : 2*5*13*29*37*53
    Base prime factor(s) taken : 29, 37, 53
    Starting N-1 prime test of 7392970^32768+1
    Using generic reduction FMA3 FFT length 72K, Pass1=384, Pass2=192, a = 3
    7392970^32768+1 is not prime. RES64: C5BC4713CD65C714. OLD64: 5134D53B68315538 Time : 749.329 sec.

PFGW:

    C:\PRPNet\prpclient-5.3.1-windows\prpclient-1>pfgw64 -q"7392970^32768+1"
    PFGW Version 3.7.7.64BIT.20130722.Win_Dev [GWNUM 27.11]
    7392970^32768+1 is composite: RES64: [C5BC4713CD65C714] (920.8597s+0.0062s)

As you can see, LLR is fastest, at least on this computer, probably because it uses FMA3 while PFGW "only" uses AVX. Genefer is slower than either because it can only use x87 on this number, and the x87 transform is about 10 times slower than the FMA3 transform.
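Two figures in the logs can be cross-checked independently: the factorization of the base that LLR prints, and the digit count that Genefer reports. Here is a minimal Python sketch; the variable names are my own and do not come from any of the three programs.

```python
import math

# The candidate is b^n + 1, with b and n taken from the command lines above.
b, n = 7392970, 32768

# LLR printed "Base factorized as : 2*5*13*29*37*53".
factors = [2, 5, 13, 29, 37, 53]
base_from_factors = math.prod(factors)
print(base_from_factors == b)

# Number of decimal digits of b^n + 1, computed with logarithms so the
# huge integer itself never has to be built or converted to a string.
digits = math.floor(n * math.log10(b)) + 1
print(digits)
```

Both checks agree with the logs: the factors multiply back to 7392970, and the digit count comes out to the 225078 that Genefer printed.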

I suspect that on non-AVX computers Genefer may be faster than LLR or PFGW, and that would include all AMD CPUs. So the choice of program probably depends on what CPU you have. And that leads to an interesting complication...

There's always some talk about either moving all the GFN ports to BOINC, or turning on double checking on the GFN ports. If we turn on double checking or move to BOINC (which WILL be set up for doublechecking), to match with a wingman, the residues have to be the same. Therein lies the problem. On some computers Genefer will be faster (older computers and AMD computers) while LLR will be faster on newer Intel computers. If you're paying attention you may have noticed that the residue from Genefer does not match the residues from LLR and PFGW. Since you can configure PRPNet to use either Genefer, PFGW, or both, some results will come back with a residue from Genefer and some from PFGW, and they won't match. If we turn on double checking on the PRPNet GFN ports we'll have verification errors even when both computers completed the calculation correctly.

If we want to enable double checking, we'll need to force everyone to use one or the other.
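To make the mismatch concrete, here is a small Python sketch that compares the residues from the three logs the way a double-check comparison might. The helper `residues_match` is hypothetical and is not how PRPNet or BOINC actually validate results; it only illustrates that two correct runs can still disagree when different programs are used.

```python
# Residues copied from the Genefer, LLR and PFGW logs earlier in this post.
residues = {
    "genefer": "b6a7b30a1c669739",
    "llr": "C5BC4713CD65C714",
    "pfgw": "C5BC4713CD65C714",
}


def residues_match(a: str, b: str) -> bool:
    """Compare two 64-bit residues as numbers, ignoring hex letter case."""
    return int(a, 16) == int(b, 16)


# LLR and PFGW agree with each other...
print(residues_match(residues["llr"], residues["pfgw"]))
# ...but Genefer's residue differs, even though all three runs were correct.
print(residues_match(residues["genefer"], residues["llr"]))
```

The first comparison is true and the second is false, which is exactly the verification error a Genefer user paired with an LLR or PFGW wingman would produce.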

By the way, looking at the database, the vast majority of people are using Genefer, even though on modern Intel CPUs PFGW is probably faster.

(*) There are some circumstances where LLR is faster.
____________
My lucky number is 75898^524288+1

JeppeSN

Joined: 5 Apr 14
Posts: 1727
ID: 306875
Credit: 41,412,637
RAC: 13,689

Message 80714 - Posted: 7 Nov 2014 | 18:14:58 UTC

Thanks for the post. I suspect many people have one Genefer, one LLR, and one PFGW exe (uncommented) in their .ini files. In that case it is not clear that the client software chooses the best executable for the task. Maybe it chooses Genefer every time in that situation?

/JeppeSN

Michael Goetz
Volunteer moderator

Joined: 21 Jan 10
Posts: 13804
ID: 53948
Credit: 345,369,032
RAC: 3,564

Message 80715 - Posted: 7 Nov 2014 | 18:26:42 UTC - in response to Message 80714.

> Thanks for the post. I suspect many people have one Genefer, one LLR, and one PFGW exe (uncommented) in their .ini files. In that case it is not clear that the client software chooses the best executable for the task. Maybe it chooses Genefer every time in that situation?
>
> /JeppeSN

It chooses them in a set sequence. I *think* it chooses Genefer first and PFGW second. PRPClient has no way to know which is better. It will only switch from Genefer to PFGW if the precision limit ("MaxErr exceeded") is exceeded in Genefer, which isn't likely to happen since Genefer now internally switches to x87 when needed. x87 has limits too, but they're high enough so we won't be hitting them anytime in the near future.
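As a rough illustration of that behaviour, here is a Python sketch of a fixed-order fallback. It is not PRPClient's actual code: `MaxErrExceeded`, `run_in_sequence` and the tester callables are all hypothetical names, and the Genefer-then-PFGW order is only my recollection as stated above.

```python
from typing import Callable, Sequence, Tuple


class MaxErrExceeded(Exception):
    """Hypothetical signal that a program hit its precision limit."""


# A tester takes a candidate such as "7392970^32768+1" and returns a residue.
Tester = Callable[[str], str]


def run_in_sequence(
    candidate: str, programs: Sequence[Tuple[str, Tester]]
) -> Tuple[str, str]:
    """Try each program in its configured order.

    The next program is tried only when the current one reports that its
    precision limit was exceeded; speed is never taken into account.
    """
    for name, test in programs:
        try:
            return name, test(candidate)
        except MaxErrExceeded:
            continue
    raise RuntimeError("no configured program could test " + candidate)
```

With this logic the first program in the list always wins unless it fails outright, which is why a client with all three executables enabled would keep using Genefer even where another program is faster.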
____________
My lucky number is 75898^524288+1

JeppeSN

Joined: 5 Apr 14
Posts: 1727
ID: 306875
Credit: 41,412,637
RAC: 13,689

Message 80716 - Posted: 7 Nov 2014 | 18:43:27 UTC

Maybe we could have an overview table showing all possible combinations.

For example each row in the table could represent one PRPNet project, or "port", and each column could represent one type of executable.

Then cells could be red if that project cannot use that exe, and white if it works (or may work). Also the white cells could provide further information, for example if it is known to not work for "large" numbers of the type in question, or if that exe is known to be "slower" for that subproject.

/JeppeSN

Volunteer moderator
Volunteer tester
Project scientist

Joined: 5 Feb 08
Posts: 1212
ID: 18646
Credit: 815,698,687
RAC: 158,193

Message 80717 - Posted: 7 Nov 2014 | 18:51:28 UTC - in response to Message 80716.

> Maybe we could have an overview table showing all possible combinations.

Does "all possible combinations" mean depending on OS and CPU, too?
That will be an epic table ...

EDIT: times for an i7 2600K CPU @ 3.40GHz with hyperthreading on (Ubuntu-Linux 64bit)
genefer: 0:26:59
llr: 0:31:50
pfgw: 0:32:47
____________
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/

Michael Goetz
Volunteer moderator

Joined: 21 Jan 10
Posts: 13804
ID: 53948
Credit: 345,369,032
RAC: 3,564

Message 80718 - Posted: 7 Nov 2014 | 19:34:57 UTC - in response to Message 80717.

> > Maybe we could have an overview table showing all possible combinations.
>
> Does "all possible combinations" mean depending on OS and CPU, too?
> That will be an epic table ...
>
> EDIT: times for an i7 2600K CPU @ 3.40GHz with hyperthreading on (Ubuntu-Linux 64bit)
> genefer: 0:26:59
> llr: 0:31:50
> pfgw: 0:32:47

Interesting. So it looks like only with Haswell does Genefer fall behind.
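For a side-by-side view, here is a short Python sketch that ranks the three programs on each CPU using the times posted in this thread. It assumes the i7 2600K runs tested the same candidate as my Haswell runs, which the post does not actually say; the helper names are made up for the example.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' string into seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# Times as reported in this thread, in seconds.
timings = {
    "Core i5 (Haswell)": {
        "genefer": to_seconds("0:18:16"),
        "llr": 749.329,
        "pfgw": 920.8597,
    },
    "i7 2600K": {
        "genefer": to_seconds("0:26:59"),
        "llr": to_seconds("0:31:50"),
        "pfgw": to_seconds("0:32:47"),
    },
}


def ranking(times: dict) -> list:
    """Return program names ordered from fastest to slowest."""
    return sorted(times, key=times.get)


for cpu, times in timings.items():
    print(cpu, ranking(times))
```

On the Haswell the order is LLR, PFGW, Genefer; on the i7 2600K it is Genefer, LLR, PFGW.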

____________
My lucky number is 75898^524288+1

Werinbert

Joined: 9 Jun 13
Posts: 171
ID: 233452
Credit: 382,715,533
RAC: 452,203

Message 80719 - Posted: 7 Nov 2014 | 19:58:33 UTC

Epic table or not, there is a lot of confusion as to which exe file is needed, b values, OS, GPU, argh!

So far I have yet to run any genefer WUs with PRPnet (but not for lack of trying). I am still clueless whether or not x87 is a separate app (exe); I do not have that app in any of my folders, and I am unsure where to download it if it is a separate app.

Just waiting for the current challenge to end so I can re-evaluate the genefer subprojects.

--Confused in Primeland

Michael Goetz
Volunteer moderator

Joined: 21 Jan 10
Posts: 13804
ID: 53948
Credit: 345,369,032
RAC: 3,564

Message 80720 - Posted: 7 Nov 2014 | 20:14:39 UTC - in response to Message 80719.

> Epic table or not, there is a lot of confusion as to which exe file is needed, b values, OS, GPU, argh!
>
> So far I have yet to run any genefer WUs with PRPnet (but not for lack of trying). I am still clueless whether or not x87 is a separate app (exe); I do not have that app in any of my folders, and I am unsure where to download it if it is a separate app.
>
> Just waiting for the current challenge to end so I can re-evaluate the genefer subprojects.
>
> --Confused in Primeland

tl;dr: Geneferx87 no longer exists as a separate program; it is now incorporated into the same executable as the other transforms. On a 64-bit Windows computer, you should enable genefer64.exe.

If you're using the most recent version of PRPNet, there's no longer a separate x87 app. There used to be 3 apps: genef64, genefer, and geneferx87 (I may have the name slightly mangled). Now, they've been combined and there's just one app which comes in both 32 and 64 bit versions. There's no reason to run the 32 bit version if you can run 64 bit programs. Effectively, therefore, there's just one CPU app now, either 32 bits or 64 bits depending on your CPU and OS.

The PRPNet package for Windows comes with 4 genefer executables: genefer32.exe, genefer64.exe, genefercuda.exe, and geneferocl.exe. For CPU, you should enable genefer64.exe unless you cannot run 64-bit programs (because you have a very old CPU or a 32-bit version of Windows), in which case you should run genefer32.exe.
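The CPU rule boils down to a single decision. Here is a Python sketch of it; the function `pick_genefer_cpu_exe` is purely illustrative and not part of PRPNet, and the `platform.machine()` check in the usage lines is only a rough heuristic for "can run 64-bit programs".

```python
import platform


def pick_genefer_cpu_exe(can_run_64bit: bool) -> str:
    """Return the Genefer CPU executable to enable in the Windows package.

    genefer64.exe is preferred whenever 64-bit programs can run;
    genefer32.exe is the fallback for old CPUs or 32-bit Windows.
    """
    return "genefer64.exe" if can_run_64bit else "genefer32.exe"


# Rough heuristic (an assumption, not an official check): common 64-bit
# machine names such as "AMD64" and "x86_64" end in "64".
looks_64bit = platform.machine().endswith("64")
print(pick_genefer_cpu_exe(looks_64bit))
```

The GPU executables (genefercuda.exe and geneferocl.exe) are outside this choice, since it only covers the CPU app.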

____________
My lucky number is 75898^524288+1

Scott Brown
Volunteer moderator
Volunteer tester
Project scientist

Joined: 17 Oct 05
Posts: 2329
ID: 1178
Credit: 15,614,402,663
RAC: 11,545,211

Message 80722 - Posted: 7 Nov 2014 | 23:13:45 UTC - in response to Message 80720.

I believe that the old (no longer separate) x87 Genefer application on PRPnet was called genefer80 since it is 80-bit. (cannot verify this since I am out of town in Washington DC and on my android tablet).

p.s. - isn't it time we found another reportable prime since I am out of town? :)

Michael Goetz
Volunteer moderator

Joined: 21 Jan 10
Posts: 13804
ID: 53948
Credit: 345,369,032
RAC: 3,564

Message 80723 - Posted: 7 Nov 2014 | 23:36:31 UTC - in response to Message 80722.

> p.s. - isn't it time we found another reportable prime since I am out of town? :)

I sure hope so. We've found at least one mega prime in every month in 2014 so far, and it would be seriously impressive if we can also find one in November and December.
____________
My lucky number is 75898^524288+1

Werinbert

Joined: 9 Jun 13
Posts: 171
ID: 233452
Credit: 382,715,533
RAC: 452,203

Message 80724 - Posted: 7 Nov 2014 | 23:39:41 UTC - in response to Message 80720.

Thanks Michael, that cleared up a lot of my confusion.

I still think there is a missing dll for genefercuda. I have dlls for _32_ and _40_ but I think the missing one is for _50_ which is not part of the downloadable package.
