LLR Ryzen Performance



C. Ketelsen

Joined: 30 Dec 10
Posts: 3
ID: 79134
Credit: 34,066,328
RAC: 0

Is there an ETA for when the LLR apps will use gwnum library version 28.14?

From MersenneForum:

http://www.mersenneforum.org/showthread.php?t=22981

This LLR version is linked with the Version 28.14 of George Woltman's gwnum library.
The only thing new from 28.13 is better FFT selections for AMD Ryzen CPUs.

ID: 115838 |

Michael Goetz

Volunteer moderator
Project administrator

Joined: 21 Jan 10
Posts: 14646
ID: 53948
Credit: 1,019,952,007
RAC: 576,882

Is there an ETA for when the LLR apps will use gwnum library version 28.14?

From MersenneForum:

http://www.mersenneforum.org/showthread.php?t=22981

This LLR version is linked with the Version 28.14 of George Woltman's gwnum library.
The only thing new from 28.13 is better FFT selections for AMD Ryzen CPUs.

We expect to be starting official acceptance testing of that version of LLR in the near future. In the meantime, feel free to use that version of LLR under app_info.xml. I don't know if it will help with Ryzen performance, but it definitely seems to fix the multi-threading bug.

If you run it under app_info, please let us know if there are any problems.

I've been using it for a month or two, and it seems reliable.
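For anyone who has not set up app_info.xml before, here is a minimal sketch of generating a skeleton file with Python. The element names follow BOINC's anonymous-platform file format as I understand it; the app name, executable name and version number below are hypothetical placeholders, not the real PrimeGrid values, so replace them with the names your client actually uses.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, exe_name, version_num, ncpus):
    """Build a bare-bones app_info.xml document as a string.

    All argument values are supplied by the caller; nothing here is
    looked up from PrimeGrid or BOINC.
    """
    root = ET.Element("app_info")

    # Declare the application.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Declare the executable file that BOINC should treat as present locally.
    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = exe_name
    ET.SubElement(file_info, "executable")

    # Tie the application to that executable.
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "version_num").text = str(version_num)
    ET.SubElement(app_version, "avg_ncpus").text = str(ncpus)
    file_ref = ET.SubElement(app_version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = exe_name
    ET.SubElement(file_ref, "main_program")

    # Pretty-print where available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical placeholder names, for illustration only.
    print(build_app_info("example_llr_app", "llr_example.exe", 100, 8))
```

The generated file belongs in the project's directory inside the BOINC data folder, next to the executable it names. Check the result against the BOINC documentation before relying on it.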
____________
My lucky number is 75898⁵²⁴²⁸⁸+1

ID: 115839 |

No_Name

Joined: 2 Feb 17
Posts: 7
ID: 487741
Credit: 1,398,290,184
RAC: 1,442,071

I've tested the new version and found no problems.
Performance increased by 10.5%, averaged over 11 WUs.
____________
Main system

ID: 116133 |

Michael Goetz

Volunteer moderator
Project administrator

Joined: 21 Jan 10
Posts: 14646
ID: 53948
Credit: 1,019,952,007
RAC: 576,882

I've tested the new version and found no problems.
Performance increased by 10.5%, averaged over 11 WUs.

Thanks for running those tests.

The good news is... with Ryzen and the latest software, we finally do see an improvement when using FMA3. Previously, LLR wouldn't even attempt to use AVX or FMA3 because it didn't provide a benefit. 10% is a significant improvement...

The bad news is... ...unless you compare it to Intel CPUs. At least it's movement in the right direction.

I guess if you're running a Ryzen CPU, I'd recommend running the new LLR under app_info.xml if you're up to handling the configuration hassles. I'll see what I can do to get the ball rolling on getting the new LLR into production.

ID: 116135 |

No_Name

Joined: 2 Feb 17
Posts: 7
ID: 487741
Credit: 1,398,290,184
RAC: 1,442,071

The app_info.xml only has to be configured once, but it would be great if you could get the ball rolling.

Here are the exact results for 321 WUs on a Ryzen 7 1700X at stock clocks, with 8 threads per WU and 16 GB of dual-rank memory at 1463.6 MHz, timings 16.0-16-16-39-55:

New version, from 11 WUs: running time 11586.48 s, CPU time 88535.62 s, credits 4909.38 -> credits/second 0.423716239
Old version, from 26 WUs: running time 12717.83 s, CPU time 82772.24 s, credits 4877.12 -> credits/second 0.383487231

Can someone else confirm my results?
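For anyone who wants to check the arithmetic, here is a small sketch that recomputes credits per second and the relative gain from the figures above (decimal commas written as points). It assumes the posted running time, CPU time and credit are averages over each sample, which does not affect the ratio.

```python
# Figures as posted in this thread, with decimal commas converted to points.
new = {"wus": 11, "run_s": 11586.48, "cpu_s": 88535.62, "credit": 4909.38}
old = {"wus": 26, "run_s": 12717.83, "cpu_s": 82772.24, "credit": 4877.12}


def credit_rate(sample):
    """Credits earned per second of wall-clock running time."""
    return sample["credit"] / sample["run_s"]


def busy_threads(sample):
    """CPU time divided by running time: average number of busy threads."""
    return sample["cpu_s"] / sample["run_s"]


new_rate = credit_rate(new)
old_rate = credit_rate(old)
gain_pct = (new_rate / old_rate - 1.0) * 100.0

print(f"new: {new_rate:.6f} credits/s, {busy_threads(new):.2f} threads busy")
print(f"old: {old_rate:.6f} credits/s, {busy_threads(old):.2f} threads busy")
print(f"gain: {gain_pct:.1f}%")
```

The gain comes out at roughly 10.5%, matching the figure quoted earlier. The two samples are different sets of work units, so this measures throughput on PrimeGrid work rather than the same test run twice.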

ID: 116136 |

Crun-chi

Volunteer tester
Joined: 25 Nov 09
Posts: 3358
ID: 50683
Credit: 223,005,516
RAC: 1,716,499

The app_info.xml only has to be configured once, but it would be great if you could get the ball rolling.

Here are the exact results for 321 WUs on a Ryzen 7 1700X at stock clocks, with 8 threads per WU and 16 GB of dual-rank memory at 1463.6 MHz, timings 16.0-16-16-39-55:

New version, from 11 WUs: running time 11586.48 s, CPU time 88535.62 s, credits 4909.38 -> credits/second 0.423716239
Old version, from 26 WUs: running time 12717.83 s, CPU time 82772.24 s, credits 4877.12 -> credits/second 0.383487231

Can someone else confirm my results?

Using PrimeGrid work units as a benchmark is not a good idea. You must run the same data through both the "old" and the "new" LLR. Then, and only then, will you know whether the new LLR is actually faster.
Make a test sieve file with srsieve, process it with both versions, and then give us the results.
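A rough sketch of such a same-input comparison: time two command lines on identical input and report the difference. The harness is generic. The commands in the demo are placeholders that just run Python, because the exact LLR command lines and file names depend on your setup and are not given in this thread.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


def speedup_percent(old_seconds, new_seconds):
    """Throughput gain of the new run over the old run, in percent."""
    return (old_seconds / new_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder commands: substitute the old and new LLR binaries,
    # each pointed at the same test sieve file.
    old_cmd = [sys.executable, "-c", "sum(range(200000))"]
    new_cmd = [sys.executable, "-c", "sum(range(100000))"]
    old_t = time_command(old_cmd)
    new_t = time_command(new_cmd)
    print(f"old {old_t:.3f} s, new {new_t:.3f} s, "
          f"gain {speedup_percent(old_t, new_t):.1f}%")
```

Taking the best of several runs reduces noise from other load on the machine. For tests that take hours, a single run of each version on an otherwise idle system is probably the practical choice.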
____________
2*836⁷⁹⁸⁴³¹+1 CRUS PRIME
92*10¹⁵⁸⁵⁹⁹⁶-1 NEAR-REPDIGIT PRIME :) :) :)
2022202116¹³¹⁰⁷²+1 GFN
Proud member of team Aggie The Pew. Go Aggie!

ID: 116150 |

No_Name

Joined: 2 Feb 17
Posts: 7
ID: 487741
Credit: 1,398,290,184
RAC: 1,442,071

Sorry, I meant: can anyone else confirm the overall performance increase?

ID: 116152 |

Crun-chi

Volunteer tester
Joined: 25 Nov 09
Posts: 3358
ID: 50683
Credit: 223,005,516
RAC: 1,716,499

Sorry, I meant: can anyone else confirm the overall performance increase?

You don't need to be sorry for anything. But I am very skeptical that the latest LLR increases performance.
So run the test, and we will know once and for all whether it is true or not :)

ID: 116154 |

mackerel

Volunteer tester
Joined: 2 Oct 08
Posts: 2948
ID: 29980
Credit: 773,786,746
RAC: 153,298

The good news is... with Ryzen and the latest software, we finally do see an improvement when using FMA3. Previously, LLR wouldn't even attempt to use AVX or FMA3 because it didn't provide a benefit. 10% is a significant improvement...

Before Ryzen, that was the case. LLR uses gwnum, like Prime95. It's easier for me to go by Prime95 version numbers for now. 28.x and earlier implementations simply didn't know about the existence of Ryzen, and picked an older transform for AMD CPUs in general. With 29.x came a new CPU detection method that did recognise Ryzen, enabling FMA3 in that case. In my limited testing, that made no difference beyond run-to-run variation. Later in 29.x came a refresh of the FFT type choice. The generic "best" transform was used for each CPU architecture. I'm not sure I've actually tested this.

Very roughly speaking, my early tests with 29.x suggested Ryzen had about half the peak IPC of Intel in FMA3-type operations. I will go back and see if it has changed, but probably after the 321 challenge.

ID: 116170 |
