Message boards : Number crunching : LLR Ryzen Performance
Is there any ETA for when the LLR apps will use gwnum library version 28.14?
From MersenneForum:
http://www.mersenneforum.org/showthread.php?t=22981
"This LLR version is linked with the Version 28.14 of George Woltman's gwnum library.
The only thing new from 28.13 is better FFT selections for AMD Ryzen CPUs."
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14646 ID: 53948 Credit: 1,019,952,007 RAC: 576,882
We expect to be starting official acceptance testing of that version of LLR in the near future. In the meantime, feel free to use that version of LLR under app_info.xml. I don't know if it will help with Ryzen performance, but it definitely seems to fix the multi-threading bug.
If you run it under app_info, please let us know if there are any problems.
I've been using it for a month or two, and it seems reliable.
____________
My lucky number is 75898^524288+1
I've tested the new version and found no problems.
Performance increased by 10.5%, averaged over 11 WUs.
____________
Main system
Michael Goetz (Volunteer moderator, Project administrator)
Thanks for running those tests.
The good news is... with Ryzen and the latest software, we finally do see an improvement when using FMA3. Previously, LLR wouldn't even attempt to use AVX or FMA3 because it didn't provide a benefit. 10% is a significant improvement...
The bad news is... ...unless you compare it to Intel CPUs. At least it's movement in the right direction.
I guess if you're running a Ryzen CPU, I'd recommend running the new LLR under app_info.xml if you're up to handling the configuration hassles. I'll see what I can do to get the ball rolling on getting the new LLR into production.
____________
My lucky number is 75898^524288+1
The app_info.xml only has to be configured once, but it would be great if you could get the ball rolling.
Here are the exact results for 321 subproject WUs on a Ryzen 7 1700X @ stock, with 8 threads per WU and 16 GB of dual-rank RAM @ 1463.6 MHz / 16.0-16-16-39-55:
New version (average of 11 WUs): running time 11586.48 s, CPU time 88535.62 s, credits 4909.38 -> credits/second 0.423716239
Old version (average of 26 WUs): running time 12717.83 s, CPU time 82772.24 s, credits 4877.12 -> credits/second 0.383487231
Can someone else confirm my results?
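For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes credits per second and the relative gain from the figures posted above. The dictionary keys and the function name are just illustrative, and the inputs are the rounded averages from this post, so the last digits can differ slightly from the values I quoted.

```python
# Per-WU averages as posted above (rounded to two decimals).
new = {"run_s": 11586.48, "cpu_s": 88535.62, "credits": 4909.38}
old = {"run_s": 12717.83, "cpu_s": 82772.24, "credits": 4877.12}


def credits_per_second(result):
    """Credits earned per second of wall-clock running time."""
    return result["credits"] / result["run_s"]


new_rate = credits_per_second(new)
old_rate = credits_per_second(old)

# Relative throughput gain of the new version over the old one, in percent.
gain_pct = (new_rate / old_rate - 1) * 100

print(f"new: {new_rate:.6f} credits/s")
print(f"old: {old_rate:.6f} credits/s")
print(f"gain: {gain_pct:.1f}%")
```

Note that this compares different WUs run under each version, so it measures throughput on PrimeGrid work rather than the same candidates tested twice.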
____________
Main system
Crun-chi (Volunteer tester)
Joined: 25 Nov 09 Posts: 3358 ID: 50683 Credit: 223,005,516 RAC: 1,716,499
Using PrimeGrid as a benchmark is not a good idea. You must use the same data and run the test with both the "old" and the "new" LLR. Then, and only then, will you know whether the new LLR improves speed or not.
Make a test sieve file with srsieve, process it, and then give us the results.
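A same-input comparison like that could be timed with something along these lines. This is only a rough sketch: the function names are made up for illustration, and the two commands are placeholders that just run the Python interpreter, so you would have to substitute the real command lines for your old and new LLR binaries and your test sieve file (I am deliberately not guessing at LLR's options here). If your LLR build leaves checkpoint or result files behind, clear them between runs so each run does the full work.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times; return the wall-clock seconds of each run."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,  # fail loudly if the program exits with an error
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        times.append(time.perf_counter() - start)
    return times


def compare(old_cmd, new_cmd, repeats=3):
    """Return (old median s, new median s, speed gain of new over old in %)."""
    old_t = statistics.median(time_command(old_cmd, repeats))
    new_t = statistics.median(time_command(new_cmd, repeats))
    return old_t, new_t, (old_t / new_t - 1) * 100


if __name__ == "__main__":
    # Placeholder commands (assumption): replace both with the actual
    # old/new LLR invocations on the SAME input file.
    old_cmd = [sys.executable, "-c", "sum(range(10**6))"]
    new_cmd = [sys.executable, "-c", "sum(range(10**6))"]
    old_t, new_t, gain = compare(old_cmd, new_cmd)
    print(f"old: {old_t:.3f} s  new: {new_t:.3f} s  gain: {gain:+.1f}%")
```

The median is used instead of the mean so that a single run disturbed by background load does not skew the comparison.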
____________
2*836^798431+1 CRUS PRIME
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
2022202116^131072+1 GFN
Proud member of team Aggie The Pew. Go Aggie!
Sorry, I meant to ask whether anyone else can confirm the overall performance increase.
____________
Main system
Crun-chi (Volunteer tester)
You don't need to be sorry for anything. But I am very skeptical about a performance increase from the latest LLR.
So run the test and we will know once and for all whether it is true or not :)
____________
2*836^798431+1 CRUS PRIME
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
2022202116^131072+1 GFN
Proud member of team Aggie The Pew. Go Aggie!
mackerel (Volunteer tester)
Joined: 2 Oct 08 Posts: 2948 ID: 29980 Credit: 773,786,746 RAC: 153,298
Michael Goetz wrote: "The good news is... with Ryzen and the latest software, we finally do see an improvement when using FMA3. Previously, LLR wouldn't even attempt to use AVX or FMA3 because it didn't provide a benefit. 10% is a significant improvement..."
Before Ryzen, that was the case. LLR uses gwnum, like Prime95, and it's easier for me to go by Prime95 version numbers for now. Versions 28.x and earlier simply didn't know Ryzen existed, and picked an older transform for AMD CPUs in general. 29.x brought a new CPU detection method that did recognise Ryzen, enabling FMA3 in that case. In my limited testing, that made no difference beyond run-to-run variation. Later in 29.x came a refresh of the FFT type choices, so that the generally "best" transform is used for each CPU architecture. I'm not sure I've actually tested this.
Very roughly speaking, my early tests with 29.x suggested Ryzen had about half the peak IPC of Intel in FMA3-type operations. I will go back and see whether that has changed, but probably after the 321 challenge.