Message boards :
Project Staging Area :
Which CPU program to use for large GFNs?
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13804 ID: 53948 Credit: 345,369,032 RAC: 3,564
As long as 'b' isn't too large, Genefer is usually* the program of choice, especially on a modern CPU with AVX or FMA3. But when b becomes too large for the Genefer algorithm to work with 64-bit floating point math, the only alternative is to switch to 80-bit floating point ("x87"), which is much slower. When that happens, LLR (which can use all instructions up to FMA3) and PFGW (which can use instructions up to AVX) become faster than Genefer.
On older CPUs that lack advanced SIMD instructions (SSE2, SSE4, AVX, FMA3), Genefer may be faster than LLR and PFGW, but on newer computers the ability to use those instruction sets outweighs the speed of the algorithm.
Here are some examples from my Haswell (FMA3) Core i5:
Genefer:
C:\Temp\GFN\3.2.5>genefer_windows64.exe -q "7392970^32768+1"
genefer 3.2.5-dev (Windows/CPU/64-bit)
Supported transform implementations: fma3 avx-intel sse4 sse2 default x87
Copyright 2001-2014, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefer_windows64.exe -q 7392970^32768+1
Priority change succeeded.
Testing 7392970^32768+1...
Using x87 (80-bit) transform
Resuming 7392970^32768+1 from a checkpoint (745471 iterations left)
Estimated time remaining for 7392970^32768+1 is 0:17:12
7392970^32768+1 is a probable composite. (RES=b6a7b30a1c669739) (225078 digits) (err = 0.0083) (time = 0:18:16) 11:43:29
LLR:
C:\PRPNet\prpclient-5.3.1-windows\prpclient-2>llr64 -d -q"7392970^32768+1"
Base factorized as : 2*5*13*29*37*53
Base prime factor(s) taken : 29, 37, 53
Starting N-1 prime test of 7392970^32768+1
Using generic reduction FMA3 FFT length 72K, Pass1=384, Pass2=192, a = 3
7392970^32768+1 is not prime. RES64: C5BC4713CD65C714. OLD64: 5134D53B68315538 Time : 749.329 sec.
PFGW:
C:\PRPNet\prpclient-5.3.1-windows\prpclient-1>pfgw64 -q"7392970^32768+1"
PFGW Version 3.7.7.64BIT.20130722.Win_Dev [GWNUM 27.11]
7392970^32768+1 is composite: RES64: [C5BC4713CD65C714] (920.8597s+0.0062s)
As you can see, LLR is fastest, at least on this computer, probably because it uses FMA3 while PFGW "only" uses AVX. Genefer is slower than either because it can only use x87 on this number, and the x87 transform is about 10 times slower than the FMA3 transform.
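To put the three runs on one scale, here is a minimal Python sketch that converts the reported times to seconds and prints each one relative to the fastest. The labels in the dictionary and the helper name to_seconds are made up for illustration, and it assumes the time Genefer printed covers the whole test.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' string, as printed by Genefer, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Run times quoted above for 7392970^32768+1 on the Haswell Core i5.
timings = {
    "genefer (x87)": to_seconds("0:18:16"),
    "llr (FMA3)": 749.329,
    "pfgw (AVX)": 920.8597,
}

# Pick the program with the smallest run time and rank the rest against it.
fastest = min(timings, key=timings.get)
for name, secs in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:14s} {secs:9.1f} s  ({secs / timings[fastest]:.2f}x)")
```

The same few lines can be reused with timings from any other CPU to see whether the ranking changes.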
I suspect that on non-AVX computers Genefer may be faster than LLR or PFGW, and that would include all AMD CPUs. So the choice of program probably depends on what CPU you have. And that leads to an interesting complication...
There's always some talk about either moving all the GFN ports to BOINC or turning on double checking on the GFN ports. If we turn on double checking or move to BOINC (which WILL be set up for double checking), the residues have to be the same to match with a wingman. Therein lies the problem. On some computers (older computers and AMD computers) Genefer will be faster, while LLR will be faster on newer Intel computers. If you're paying attention, you may have noticed that the residue from Genefer does not match the residues from LLR and PFGW. Since you can configure PRPNet to use Genefer, PFGW, or both, some results will come back with a residue from Genefer and some from PFGW, and they won't match. If we turn on double checking on the PRPNet GFN ports, we'll have verification errors even when both computers completed the calculation correctly.
If we want to enable double checking, we'll need to force everyone to use one or the other.
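The verification problem can be illustrated with a small Python sketch. It is not how PRPNet or BOINC validate results; the Result class, the RESIDUE_FAMILY grouping and the compare function are hypothetical names. The grouping assumes that LLR and PFGW residues are comparable, which is only based on the single example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    program: str  # "genefer", "llr" or "pfgw"
    residue: str  # hexadecimal residue as reported by the program


# Hypothetical grouping of programs whose residues are assumed comparable.
# LLR and PFGW agree in the example above; Genefer reports something else.
RESIDUE_FAMILY = {"genefer": "genefer", "llr": "gwnum", "pfgw": "gwnum"}


def compare(a: Result, b: Result) -> str:
    """Classify a pair of results returned for the same candidate."""
    if RESIDUE_FAMILY[a.program] != RESIDUE_FAMILY[b.program]:
        return "incomparable"  # different residue types, cannot validate
    return "match" if a.residue.lower() == b.residue.lower() else "mismatch"


# Residues quoted above for 7392970^32768+1.
genefer = Result("genefer", "b6a7b30a1c669739")
llr = Result("llr", "C5BC4713CD65C714")
pfgw = Result("pfgw", "C5BC4713CD65C714")

print(compare(genefer, pfgw))  # both runs were correct, yet no validation
print(compare(llr, pfgw))
```

A validator that only compares the residue strings would report the first pair as an error, which is why everyone would have to use one program or the other.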
By the way, looking at the database, the vast majority of people are using Genefer, even though on modern Intel CPUs PFGW is probably faster.
(*) There are some circumstances where LLR is faster.
____________
My lucky number is 75898^524288+1

Thanks for the post. I suspect many people have one Genefer, one LLR, and one PFGW exe (uncommented) in their .ini files. In that case it is not clear that the client software chooses the best executable for the task. Maybe it chooses Genefer every time in that situation?
/JeppeSN
Michael Goetz Volunteer moderator Project administrator
> Thanks for the post. I suspect many people have one Genefer, one LLR, and one PFGW exe (uncommented) in their .ini files. In that case it is not clear that the client software chooses the best executable for the task. Maybe it chooses Genefer every time in that situation?
> /JeppeSN
It chooses them in a set sequence. I *think* it chooses Genefer first and PFGW second. PRPClient has no way to know which is better. It will only switch from Genefer to PFGW if Genefer exceeds its precision limit ("MaxErr exceeded"), which isn't likely to happen since Genefer now internally switches to x87 when needed. x87 has limits too, but they're high enough that we won't be hitting them anytime in the near future.
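The behaviour described here, a fixed order with a fallback only on a precision failure, can be sketched in Python as follows. This is not PRPClient's code: MaxErrExceeded, run_in_sequence and the two stand-in testers are hypothetical, and the Genefer-then-PFGW order is the one guessed at above.

```python
from typing import Callable, Sequence, Tuple


class MaxErrExceeded(Exception):
    """Hypothetical signal that a program hit its precision limit."""


# A tester takes a candidate such as "7392970^32768+1" and returns a residue.
Tester = Callable[[str], str]


def run_in_sequence(candidate: str,
                    testers: Sequence[Tuple[str, Tester]]) -> Tuple[str, str]:
    """Try each program in the configured order; move on only on MaxErr."""
    for name, tester in testers:
        try:
            return name, tester(candidate)
        except MaxErrExceeded:
            continue  # precision limit hit, fall through to the next program
    raise RuntimeError(f"no configured program could test {candidate}")


def fake_genefer(candidate: str) -> str:
    # Stand-in that always fails, to exercise the fallback path.
    raise MaxErrExceeded(candidate)


def fake_pfgw(candidate: str) -> str:
    # Stand-in that returns the PFGW residue quoted earlier in the thread.
    return "C5BC4713CD65C714"


used, residue = run_in_sequence(
    "7392970^32768+1", [("genefer", fake_genefer), ("pfgw", fake_pfgw)]
)
print(used, residue)
```

Note that speed never enters the decision: the first program in the list wins unless it fails, which is consistent with most results coming back from Genefer.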
____________
My lucky number is 75898^524288+1

Maybe we could have an overview table showing all possible combinations.
For example each row in the table could represent one PRPNet project, or "port", and each column could represent one type of executable.
Then cells could be red if that project cannot use that exe, and white if it works (or may work). Also the white cells could provide further information, for example if it is known to not work for "large" numbers of the type in question, or if that exe is known to be "slower" for that subproject.
/JeppeSN
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
Joined: 5 Feb 08 Posts: 1212 ID: 18646 Credit: 815,698,687 RAC: 158,193
> Maybe we could have an overview table showing all possible combinations.
all possible combinations means depending on OS and CPU, too?
will be an epic table ...
EDIT: times for an i7 2600K CPU @ 3.40GHz with hyperthreading on (Ubuntu-Linux 64bit)
genefer: 0:26:59
llr: 0:31:50
pfgw: 0:32:47
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
Michael Goetz Volunteer moderator Project administrator
>> Maybe we could have an overview table showing all possible combinations.
> all possible combinations means depending on OS and CPU, too?
> will be an epic table ...
> EDIT: times for an i7 2600K CPU @ 3.40GHz with hyperthreading on (Ubuntu-Linux 64bit)
> genefer: 0:26:59
> llr: 0:31:50
> pfgw: 0:32:47
Interesting. So it looks like only with Haswell does Genefer fall behind.
____________
My lucky number is 75898^524288+1

Epic table or not, there is a lot of confusion as to which exe file is needed, b values, OS, GPU, argh!
So far I have yet to run any genefer WUs with PRPNet (but not for lack of trying). I am still clueless whether or not x87 is a separate app (exe); I do not have that app in any of my folders. Unsure where to download it if it is a separate app.
Just waiting for the current challenge to end so I can re-evaluate the genefer subprojects.
--Confused in Primeland
Michael Goetz Volunteer moderator Project administrator
> Epic table or not, there is a lot of confusion as to which exe file is needed, b values, OS, GPU, argh!
> So far I have yet to run any genefer WUs with PRPNet (but not for lack of trying). I am still clueless whether or not x87 is a separate app (exe); I do not have that app in any of my folders. Unsure where to download it if it is a separate app.
> Just waiting for the current challenge to end so I can re-evaluate the genefer subprojects.
> --Confused in Primeland
tl;dr: Geneferx87 no longer exists as a separate program; it is incorporated into the same executable as the other transforms. On a 64-bit Windows computer, you should enable genefer64.exe.
If you're using the most recent version of PRPNet, there's no longer a separate x87 app. There used to be 3 apps: genef64, genefer, and geneferx87 (I may have the names slightly mangled). Now they've been combined and there's just one app, which comes in both 32-bit and 64-bit versions. There's no reason to run the 32-bit version if you can run 64-bit programs. Effectively, therefore, there's just one CPU app now, either 32-bit or 64-bit depending on your CPU and OS.
The PRPNet package for Windows comes with 4 genefer executables: genefer32.exe, genefer64.exe, genefercuda.exe, and geneferocl.exe. For CPU, you should enable genefer64.exe unless you cannot run 64-bit programs (because you have a very old CPU or a 32-bit version of Windows), in which case you should run genefer32.exe.
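As a rough illustration of that rule, the following Python sketch picks between the two CPU executables named above. The function name pick_genefer_cpu_exe is made up, and the set of strings treated as 64-bit is an assumption about what platform.machine() reports; it is not part of PRPNet.

```python
import platform


def pick_genefer_cpu_exe(machine: str = "") -> str:
    """Return the CPU executable name suggested in the post above."""
    machine = (machine or platform.machine()).lower()
    # Assumption: these strings identify an x86-64 OS; other values
    # (for example "x86" or "i686") are treated as 32-bit.
    if machine in {"amd64", "x86_64"}:
        return "genefer64.exe"
    return "genefer32.exe"


print(pick_genefer_cpu_exe("AMD64"))
print(pick_genefer_cpu_exe("x86"))
```

Called without an argument, the function inspects the machine it runs on; whichever name it returns is the one to leave uncommented in the client's .ini file.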
____________
My lucky number is 75898^524288+1
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2329 ID: 1178 Credit: 15,614,402,663 RAC: 11,545,211
I believe that the old (no longer separate) x87 Genefer application on PRPnet was called genefer80 since it is 80-bit. (cannot verify this since I am out of town in Washington DC and on my android tablet).
p.s. - isn't it time we found another reportable prime since I am out of town? :)
Michael Goetz Volunteer moderator Project administrator
> p.s. - isn't it time we found another reportable prime since I am out of town? :)
I sure hope so. We've found at least one mega prime in every month in 2014 so far, and it would be seriously impressive if we can also find one in November and December.
____________
My lucky number is 75898^524288+1

Thanks Michael, that cleared up a lot of my confusion.
I still think there is a missing dll for genefercuda. I have dlls for _32_ and _40_ but I think the missing one is for _50_ which is not part of the downloadable package.