Message boards :
Generalized Fermat Prime Search :
Vastly improved: AVX CPU Genefer Time
Running two instances (1600 MHz RAM) during the Genefer challenge (app 3.1.2.0.2.04), my CPU time was 510,000 s. Now, with app 3.2.4.0.3.04, CPU time has dropped to a phenomenal 230,513 s (about 64 hours) with a WOO task (216,101 s, about 60 hours) running concurrently (1600 MHz RAM).
The current average PrimeGrid CPU Genefer time is 139 hours, while WOO tasks average 100 hours.
Would running two Genefer instances concurrently slow down CPU times? What's the difference between LLR tasks and Genefer? Are both LLR and Genefer double-precision floating-point tasks?
What lowered CPU times so significantly? Intel Ivy Bridge is a non-FMA3 CPU. What can the faster Ivy Bridge runtimes be attributed to? Are Haswells able to finish CPU Genefer in under 150,000 s? | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
Would running two Genefer instances concurrently slow down CPU times?
Yes, it will.
What's the difference between LLR tasks and Genefer? Are both LLR and Genefer double-precision floating-point tasks?
They use completely different algorithms for performing the calculations, but both are similar in that they use FFT-like operations to perform the math. Both are therefore heavily dependent on double-precision floating-point performance, and both take advantage of whatever SIMD instructions are available on the CPU, such as SSE3, AVX, and FMA.
What lowered CPU times so significantly? Intel Ivy Bridge is a non-FMA3 CPU. What can the faster Ivy Bridge runtimes be attributed to?
The latest version of Genefer not only takes advantage of AVX and FMA (if available), but also uses a faster transform algorithm.
Are Haswells able to finish CPU Genefer in under 150,000 s?
C:\Temp\GFN\3.2.5>genefer_windows64 -q "444444^1048576+1"
genefer 3.2.5-dev (Windows/CPU/64-bit)
Supported transform implementations: fma3 avx-intel sse4 sse2 default x87
Copyright 2001-2014, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefer_windows64 -q 444444^1048576+1
Priority change succeeded.
Testing 444444^1048576+1...
Using FMA3 transform
Starting initialization...
Initialization complete (2.794 seconds).
Estimated time remaining for 444444^1048576+1 is 30:33:45
Testing 444444^1048576+1... 19668992 steps to go (30:35:05 remaining)
That's about 110,000 seconds.
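For anyone who wants to redo this conversion on their own estimate, here is a minimal Python sketch. It is not part of Genefer; `hms_to_seconds` is a made-up helper name, and the only inputs are the "Estimated time remaining" value printed above and the 150,000 s threshold from the question.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Estimate printed by Genefer for 444444^1048576+1 in the run above.
estimate = hms_to_seconds("30:33:45")
print(estimate)            # 110025
print(estimate < 150_000)  # True: under the 150,000 s asked about
```

The same helper works for the much longer estimates Genefer prints for larger tasks, since the hours field is not limited to 24.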
____________
My lucky number is 75898^524288+1 | |
|
|
I will continue running large-FFT LLR tasks alongside one Genefer task for faster times. If CPU times are slower when running two Genefer tasks concurrently, throughput will be nearly the same running one Genefer with LLR. Compared to the overall average PrimeGrid Genefer times, my i5's time surprised me: I was expecting a run time of over 120 hours.
Are Haswells now fast enough for World Record tasks? | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
Are Haswells now fast enough for World Record tasks?
Kind of. The problem is that only the very fastest of computers can run them in a reasonable time. Most of the hosts can't -- and the server can't easily tell which ones can and which ones can't.
____________
My lucky number is 75898^524288+1 | |
|
nenym
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,353,639,160 RAC: 1,018,657
|
Should the deadline (DL) of GFN WR tasks be set to 27-32 days?
I've tried to crunch a single GFN WR task with the cpuGFN app (Haswell i5-4570S, 3.1 GHz, RAM DDR3 1.6 GHz). The task was credited despite taking close to 24 days, because it was the second successful result (run time 2,050,188.91 / CPU time 1,977,637.00 / credit 618,991.83).
From my point of view (and from a calculation of power consumption), it is more efficient to crunch GFN WR tasks with the cpuGFN app than with the GPU app. I don't suppose it is a good idea to open CPU crunching of GFN WR tasks to the wider BOINC world; on the other hand, using the anonymous platform could be a good idea, since creators and users of app_info.xml are familiar with the applications, the tasks, and the limits of usage. A deadline of 27-32 days could be acceptable for both GPU and anonymous-platform CPU crunchers. | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
Should the deadline (DL) of GFN WR tasks be set to 27-32 days?
I've tried to crunch a single GFN WR task with the cpuGFN app (Haswell i5-4570S, 3.1 GHz, RAM DDR3 1.6 GHz). The task was credited despite taking close to 24 days, because it was the second successful result (run time 2,050,188.91 / CPU time 1,977,637.00 / credit 618,991.83).
From my point of view (and from a calculation of power consumption), it is more efficient to crunch GFN WR tasks with the cpuGFN app than with the GPU app. I don't suppose it is a good idea to open CPU crunching of GFN WR tasks to the wider BOINC world; on the other hand, using the anonymous platform could be a good idea, since creators and users of app_info.xml are familiar with the applications, the tasks, and the limits of usage. A deadline of 27-32 days could be acceptable for both GPU and anonymous-platform CPU crunchers.
This is an interesting question.
Some facts to consider: There are faster computers than yours -- and some *can* (barely) crunch a GFN-WR within the current 21 day limit.
Most computers can't, however. And most computers are slower than yours. If we make the deadline slightly longer to accommodate your CPU, as well as most desktop Haswells, it would allow some computers to crunch this on the CPU. However, any older computer would not be able to, and if we enable CPU tasks we'll have Celerons and Phenoms and Core2s and so forth trying to run WR tasks, which will not be good. If we force people to use app_info, a lot of people who have fast computers wouldn't bother (or can't figure it out.)
Such a change would likely benefit only a small number of people like yourself. The cost would be longer turnarounds for abandoned tasks. This doesn't sound like a good trade, in my opinion.
On the other hand, if we leave things the way they are right now, you can still do exactly what you're doing now: use app_info to run the CPU app, and miss the deadline by a little bit. If you miss the deadline by a few days, you'll always get credit. You don't have to be the second task; as long as you get the result in before the workunit is purged out of the database, you get credit. The purge delay is several weeks, so you have time.
The only drawback is that if you're late, we'll send out another task -- which may be unnecessary. However, the error rate is pretty high on WR tasks, so there's a good chance that only two good tasks will be returned, including yours. So nothing is wasted.
My initial reaction is to leave things the way they are, but we may look at this again.
____________
My lucky number is 75898^524288+1 | |
|
Tyler Project administrator Volunteer tester
Joined: 4 Dec 12 Posts: 1081 ID: 183129 Credit: 1,384,625,026 RAC: 7,097
|
Mike, I have an i5-2500k @ 4.5GHz, with decent runtimes on LLR workunits. What is the runtime of a GFN-WR on a CPU compared to the runtime of a SoB? Or do I just need to download one and find out?
____________
275*2^3585539+1 is prime!!! (1079358 digits)
Proud member of Aggie the Pew
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
Mike, I have an i5-2500k @ 4.5GHz, with decent runtimes on LLR workunits. What is the runtime of a GFN-WR on a CPU compared to the runtime of a SoB? Or do I just need to download one and find out?
I think GFN-WR is about 6 times as big a task as SoB.
It's easy to test -- no need to fake BOINC into giving your CPU a WR task. Just go to the command line and start running Genefer64 on a number like 30000^4194304+1 and it will tell you how long it will take. The -b3 benchmark that's built into Genefer might also provide that information.
____________
My lucky number is 75898^524288+1 | |
|
Tyler Project administrator Volunteer tester
Joined: 4 Dec 12 Posts: 1081 ID: 183129 Credit: 1,384,625,026 RAC: 7,097
|
Mike, I have an i5-2500k @ 4.5GHz, with decent runtimes on LLR workunits. What is the runtime of a GFN-WR on a CPU compared to the runtime of a SoB? Or do I just need to download one and find out?
I think GFN-WR is about 6 times as big a task as SoB.
It's easy to test -- no need to fake BOINC into giving your CPU a WR task. Just go to the command line and start running Genefer64 on a number like 30000^4194304+1 and it will tell you how long it will take. The -b3 benchmark that's built into Genefer might also provide that information.
D:\genefer>genefer_windows64 -q "44996^4194304+1"
genefer 3.2.5 (Windows/CPU/64-bit)
Supported transform implementations: avx-intel sse4 sse2 default x87
Copyright 2001-2014, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefer_windows64 -q 44996^4194304+1
Priority change succeeded.
Testing 44996^4194304+1...
Using AVX (Intel) transform
Starting initialization...
Initialization complete (20.527 seconds).
Estimated time remaining for 44996^4194304+1 is 440:47:33
440 hours and 47 minutes is 18.366 days, so it looks like I might have a chance to complete it within the deadline. The 44996 was taken from the max b in progress on the GFN-WR subproject status page; let me know if I did that wrong.
Thanks a ton.
EDIT: I ran with -b3, and its predictions didn't go up to very large numbers.
Running benchmarks for transform implementation "AVX (Intel)"
14^32768+1 37557 digits 0 days 0.0 hours (0.11 ms/mul, 124758 iterations) 294 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.11 ms/mul, 531226 iterations) 1253 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.11 ms/mul, 636255 iterations) 1501 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.11 ms/mul, 729201 iterations) 1720 GFLOPS
14^65536+1 75113 digits 0 days 0.0 hours (0.30 ms/mul, 249517 iterations) 1243 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.30 ms/mul, 1062453 iterations) 5292 GFLOPS
710000^65536+1 383469 digits 0 days 0.1 hours (0.29 ms/mul, 1273852 iterations) 6345 GFLOPS
2500000^65536+1 419296 digits 0 days 0.1 hours (0.31 ms/mul, 1392868 iterations) 6938 GFLOPS
14^131072+1 150226 digits 0 days 0.0 hours (0.57 ms/mul, 499036 iterations) 5233 GFLOPS
75898^131072+1 639662 digits 0 days 0.3 hours (0.56 ms/mul, 2124908 iterations) 22281 GFLOPS
700000^131072+1 766129 digits 0 days 0.4 hours (0.59 ms/mul, 2545023 iterations) 26687 GFLOPS
1000000^131072+1 786432 digits 0 days 0.4 hours (0.56 ms/mul, 2612469 iterations) 27394 GFLOPS
14^262144+1 300451 digits 0 days 0.4 hours (1.64 ms/mul, 998074 iterations) 21978 GFLOPS
75898^262144+1 1279324 digits 0 days 1.8 hours (1.55 ms/mul, 4249818 iterations) 93581 GFLOPS
468750^262144+1 1486604 digits 0 days 2.1 hours (1.60 ms/mul, 4938388 iterations) 108744 GFLOPS
815000^262144+1 1549575 digits 0 days 2.2 hours (1.59 ms/mul, 5147574 iterations) 113350 GFLOPS
14^524288+1 600902 digits 0 days 1.8 hours (3.37 ms/mul, 1996149 iterations) 92097 GFLOPS
75898^524288+1 2558647 digits 0 days 7.9 hours (3.35 ms/mul, 8499637 iterations) 392151 GFLOPS
468750^524288+1 2973207 digits 0 days 9.5 hours (3.48 ms/mul, 9876777 iterations) 455688 GFLOPS
710000^524288+1 3067745 digits 0 days 9.7 hours (3.44 ms/mul, 10190825 iterations) 470178 GFLOPS
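The columns in the benchmark output above appear to be related in a simple way, which makes it possible to extrapolate to sizes the benchmark does not cover. The Python sketch below is inferred from the numbers in the table, not taken from Genefer's source: the iteration count is close to the bit length of b^N, and the hours column is ms/mul multiplied by the iteration count. The function names are made up for illustration.

```python
import math


def gfn_size(b: int, n: int) -> tuple[int, int]:
    """Return (decimal digits of b**n + 1, approximate bit length of b**n)."""
    digits = math.floor(n * math.log10(b)) + 1
    bits = math.ceil(n * math.log2(b))
    return digits, bits


def estimated_hours(ms_per_mul: float, iterations: int) -> float:
    """Total run time in hours, assuming one multiplication per iteration."""
    return ms_per_mul * iterations / 1000 / 3600


digits, bits = gfn_size(14, 32768)
print(digits)  # 37557, matching the first benchmark line
print(bits)    # close to the 124758 iterations reported there

# 75898^524288+1: 3.35 ms/mul and 8499637 iterations, taken from the table.
print(f"{estimated_hours(3.35, 8499637):.1f} hours")  # 7.9 hours
```

For a size the benchmark does not reach, the ms/mul figure still has to come from an actual run, since it depends on the transform size and on what else the machine is doing.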
____________
275*2^3585539+1 is prime!!! (1079358 digits)
Proud member of Aggie the Pew
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
440 hours and 47 minutes is 18.366 days, so it looks like I might have a chance to complete it within the deadline. The 44996 was taken from the max b in progress on the GFN-WR subproject status page; let me know if I did that wrong.
You might want to try running 4 of those at once and see what the estimates look like with all cores crunching (a more realistic scenario).
EDIT: I ran with -b3, and its predictions didn't go up to very large numbers.
On the GPU it goes higher, but at the time, n=19 numbers were the largest we were crunching on CPUs.
____________
My lucky number is 75898^524288+1 | |
|
Tyler Project administrator Volunteer tester
Joined: 4 Dec 12 Posts: 1081 ID: 183129 Credit: 1,384,625,026 RAC: 7,097
|
440 hours and 47 minutes is 18.366 days, so it looks like I might have a chance to complete it within the deadline. The 44996 was taken from the max b in progress on the GFN-WR subproject status page; let me know if I did that wrong.
You might want to try running 4 of those at once and see what the estimates look like with all cores crunching (a more realistic scenario).
EDIT: I ran with -b3, and its predictions didn't go up to very large numbers.
On the GPU it goes higher, but at the time, n=19 numbers were the largest we were crunching on CPUs.
Okay, with BOINC running 2 ESP LLR tasks (set to 50% usage), this is what I got from GFN. I only run with 3 cores, or else temps get really high (95-98°C high). Running 3 cores is fine for me... Yep, a 596-hour estimate this time. Large difference, haha. Almost 25 days for the leading-edge GFN-WR on my CPU. I'll probably just run 1 task alongside some LLR after the challenge; just 1 task would get me to the next badge quite quickly.
Thanks again for your help. How would I go about setting it to run with app_info? No rush.
Thanks.
-Golfer
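To put the two estimates in this thread side by side (440:47:33 with the CPU otherwise idle, 596 hours with LLR tasks running), here is a small Python sketch. It is only arithmetic on the figures quoted in the posts; the 21-day value is the GFN-WR deadline mentioned earlier in the thread, and the variable names are made up.

```python
DEADLINE_DAYS = 21  # current GFN-WR deadline quoted earlier in the thread


def hours_to_days(hours: float) -> float:
    """Convert a run time in hours to days."""
    return hours / 24


idle_hours = 440 + 47 / 60 + 33 / 3600  # 440:47:33, single task on an idle CPU
loaded_hours = 596                      # estimate with two LLR tasks running

for label, hours in (("idle", idle_hours), ("loaded", loaded_hours)):
    days = hours_to_days(hours)
    verdict = "within" if days <= DEADLINE_DAYS else "over"
    print(f"{label}: {days:.2f} days, {verdict} the {DEADLINE_DAYS}-day deadline")

print(f"slowdown: {loaded_hours / idle_hours:.2f}x")
# idle: 18.37 days, within the 21-day deadline
# loaded: 24.83 days, over the 21-day deadline
# slowdown: 1.35x
```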
____________
275*2^3585539+1 is prime!!! (1079358 digits)
Proud member of Aggie the Pew
| |
|
nenym
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,353,639,160 RAC: 1,018,657
|
First of all: the anonymous platform is not an easy discipline.
- it is highly recommended to finish/abort all tasks of the project before changing to the anonymous platform,
- any mistake in app_info.xml means loss of tasks,
- only the apps defined in app_info can be crunched,
- it is necessary to download new versions of apps manually and to keep the structure and content of <app_version> and <file_info> in line with the current versions of the apps,
- changing the content of app_info.xml is high risk while any task is unfinished: you "must" prepare the content of app_info in advance, taking into account the next PG challenge and the duration of a GFN WR task,
- switching back from the anonymous platform to the standard setup (deleting/renaming app_info.xml and restarting the BOINC core client) means the loss of all unreported tasks (= all tasks of the project "in BM") without the server being informed; the tasks are "abandoned".
Well, now the "GFN WR" part of app_info.xml for Windows:
<app_info>
  <app>
    <name>genefer_wr</name>
    <user_friendly_name>Genefer WR</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <file_info>
    <name>primegrid_genefer_3_2_5_0_3.05_windows_x86_64__cpuGFN.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>genefer_wr</app_name>
    <version_num>305</version_num>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <file_ref>
      <file_name>primegrid_genefer_3_2_5_0_3.05_windows_x86_64__cpuGFN.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
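Since any mistake in app_info.xml can cost tasks, it may be worth checking the file for internal consistency before restarting BOINC. The Python sketch below is not an official validator and does not reproduce what the BOINC client checks; it only tests two things that are visible in the fragment above: each <app_version> names an <app> that is defined, and each <file_ref> names a file declared in a <file_info>. The function name `check_app_info` is made up.

```python
import xml.etree.ElementTree as ET

# The fragment from the post above, used here as sample input.
SAMPLE = """
<app_info>
  <app>
    <name>genefer_wr</name>
    <user_friendly_name>Genefer WR</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <file_info>
    <name>primegrid_genefer_3_2_5_0_3.05_windows_x86_64__cpuGFN.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>genefer_wr</app_name>
    <version_num>305</version_num>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <file_ref>
      <file_name>primegrid_genefer_3_2_5_0_3.05_windows_x86_64__cpuGFN.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def check_app_info(xml_text: str) -> list[str]:
    """Return a list of consistency problems found in an app_info.xml string."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    app_names = {app.findtext("name") for app in root.findall("app")}
    file_names = {info.findtext("name") for info in root.findall("file_info")}
    problems = []
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name")
        if app_name not in app_names:
            problems.append(f"app_version refers to undefined app {app_name!r}")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name")
            if file_name not in file_names:
                problems.append(f"file_ref refers to undeclared file {file_name!r}")
    return problems


print(check_app_info(SAMPLE))  # []
```

In practice the text would be read from the app_info.xml in the project directory, and the check repeated after every manual update to a new app version, when the executable name changes in two places.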
Michael Goetz wrote: Such a change would likely benefit only a small number of people like yourself. The cost would be longer turn arounds for abandoned tasks. This doesn't sound like a good trade, in my opinion.
I agree with you, let it be.
Michael Goetz wrote: If you miss the deadline by a few days, you'll always get credit. You don't have to be the second task; as long as you get the result in before the workunit is purged out of the database, you get credit.
Nice, I had no knowledge of that difference from the standard credit system. Is it used for GFN/GFN WR tasks only or for all PG apps? | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,051,011 RAC: 285,770
|
If you miss the deadline by a few days, you'll always get credit. You don't have to be the second task; as long as you get the result in before the workunit is purged out of the database, you get credit.
Nice, I had no knowledge of that difference from the standard credit system. Is it used for GFN/GFN WR tasks only or for all PG apps?
Yes, that applies to all tasks at PrimeGrid. You did the crunching, why shouldn't you get credit? With BOINC having so much trouble estimating task deadlines, it misses deadlines a lot, even when the computer is capable of finishing in time. We don't want to penalize participants for that.
By the way, this has nothing to do with the credit system we use. It's a server setting. Project admins can decide whether or not to give credit to late tasks, as well as deciding how long to keep workunits in the database after they're completed. We mostly keep workunits in the database for about the same amount of time as the task deadline. For example, once the last task in a GFN-WR workunit finishes (or times out), the server waits an additional 30 days before purging the WU. You have until then to return any late tasks and get credit.
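The rule described here can be written down as a simple date comparison. The Python sketch below is a simplified model of the policy as stated in this post, not BOINC server code: it assumes the only condition is that a valid result arrives before the workunit is purged, and the function name, parameter names, and example dates are all made up.

```python
from datetime import datetime, timedelta

PURGE_DELAY = timedelta(days=30)  # GFN-WR figure quoted in the post above


def late_result_gets_credit(returned_at: datetime,
                            workunit_done_at: datetime,
                            purge_delay: timedelta = PURGE_DELAY) -> bool:
    """True if a valid late result arrives before the workunit is purged.

    workunit_done_at is when the last task of the workunit finished or
    timed out; the purge happens purge_delay after that moment.
    """
    return returned_at <= workunit_done_at + purge_delay


done = datetime(2014, 6, 1)  # hypothetical date, for illustration only
print(late_result_gets_credit(done + timedelta(days=3), done))   # True
print(late_result_gets_credit(done + timedelta(days=31), done))  # False
```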
____________
My lucky number is 75898^524288+1 | |
|