1) Message boards : Number crunching : Good Riddance 2020! Challenge (Message 147543)
Posted 1 day ago by Grebuloner (Project donor)
Well, I've learned one important lesson from this challenge:

You can't run PG on an overclocked (350W!) 10980XE, 3930k, E3v3 Xeon, their respective high powered GPUs, and a laptop, plus pfsense router, storage server, all the networking gear for the house, 4 stereos (for surround sound!), and the bedrooms' lights off of one 20A, 120V circuit. (No, they were not all in one room.)


I have a very old house where the wiring was updated to good, modern stuff in the 90s, but not, apparently, to the (then) standard of one circuit breaker per room.

The straw that broke the camel's back? I turned on the little monitor I use for direct access to the server.

There has been a relocation of hardware. >.<
2) Message boards : Number crunching : Good Riddance 2020! Challenge (Message 147508)
Posted 2 days ago by Grebuloner (Project donor)
Shiny new hardware is always nice to have :)

Current PPS-DIV FFT size is 480K

Oh, indeed it is!

Just under 4MB per task then, excellent, thank you.
3) Message boards : Number crunching : Good Riddance 2020! Challenge (Message 147500)
Posted 3 days ago by Grebuloner (Project donor)
I'm looking forward to a new challenge season! I have a new cruncher on the way this week to retire a now useless (thanks to practical CPU sieving ending) pair of X5675s and easily outpace both of them with a single "new" 5960X, and eventually a 22-core E5 2699 v4 :D. Hope it arrives in time to compete!

So I can set my threading optimally(ish), what are the current DIV FFT sizes?

Happy hunting, everybody!
4) Message boards : 321 Prime Search : Multi-threading: Max # of threads for each task Setting? (Message 147442)
Posted 6 days ago by Grebuloner (Project donor)
Does the llr make use of the AVX512?
I only see "using FMA3" or the like in the output.

It does. The performance hit on CPUs with a single AVX512 unit has been well documented, so I wonder if bypassing it and using the faster FMA3 path on affected chips (like that Xeon) was baked into the code?

(copied from an stderr output on my Cascade Lake):

LLR Program - Version 3.8.23, using Gwnum Library Version 29.8
LLR command line: primegrid_cllr.exe -d -oDiskWriteTime=1 -oThreadsPerTest=1
Using zero-padded AVX-512 FFT length 128K, Pass1=128, Pass2=1K, clm=1
5) Message boards : 321 Prime Search : Multi-threading: Max # of threads for each task Setting? (Message 147396)
Posted 9 days ago by Grebuloner (Project donor)
Yeah, the Xeon Gold 5217 (8 Cores @ 3.0 GHz Turbo; 11 MB L3) is currently working on 8 WUs in parallel with a "Time per it: 16.570 ms". That is a lot slower than the E5-2690 at 9.317 ms and a lot slower than the Xeon Gold 6254.
It is running at 3.0 GHz but only drawing 68 W as per RAPL, whereas it drew 85 W (max TDP) when doing one WU with 8 threads.
Projected finishing time is around 3 days, yielding a throughput of 2.54/day; throughput was 7.22/day with 1 WU @ 8 threads and 4.52/day with 1 WU @ 4 threads.
But Fujitsu was cheap, only fitting one 32 GB memory module instead of populating all six channels.

Will post back when finished

One memory channel? Oy Vey! Leaves quite a lot of performance potential off the table.

Also, the Gold 5000 series only has a single AVX512 unit, which when used for PG makes it slower than using the AVX2/FMA3 optimization. You might want to do an additional run and see what the difference is.
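
The throughput figures quoted above can be turned back into per-task run times with simple arithmetic. This is a minimal sketch, assuming throughput means completed tasks per 24 hours across all tasks running at once; the function names are made up for illustration and are not part of LLR or BOINC.

```python
def hours_per_task(concurrent_tasks: int, per_day: float) -> float:
    """Wall-clock hours each task takes, given how many run at once."""
    return concurrent_tasks * 24.0 / per_day


def tasks_per_day(concurrent_tasks: int, hours_each: float) -> float:
    """Throughput when concurrent_tasks tasks each take hours_each hours."""
    return concurrent_tasks * 24.0 / hours_each


# (label, concurrent tasks, reported tasks/day) taken from the quoted post
runs = [
    ("8 WUs in parallel", 8, 2.54),
    ("1 WU @ 8 threads", 1, 7.22),
    ("1 WU @ 4 threads", 1, 4.52),
]

for label, n, per_day in runs:
    print(f"{label}: {hours_per_task(n, per_day):.1f} h per task at {per_day} tasks/day")
```

By these figures, eight tasks at once take roughly 75.6 hours each (a little over 3 days), against about 3.3 hours for a single task on 8 threads.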
6) Message boards : General discussion : Advice on buying used computers (Message 147334)
Posted 11 days ago by Grebuloner (Project donor)
AVX512 is still new, and it is only worthwhile on Core-X HEDT and expensive Xeon Gold 6000+ server CPUs, so it's going to cost a lot to acquire, even used, for now.

Look for CPUs that fully support AVX2/FMA3. Since you're going used, that would be Intel Haswell (4000/Xeon E3v3) or newer, and AMD Ryzen 3000 and newer. For price/performance, I'd say Haswell is a pretty good buy, and the 5800/5900/6800/6900 HEDT are also very nice and coming down in price. Quad core i5 or better, i7 if you can swing it. Skylake(6000)/Kaby Lake (7000) are also rapidly dropping in price thanks to AMD forcing Intel to step up its core count game. I'm finding complete Optiplex i7 4770 systems around the $200 mark on ebay (no GPU of course). Not bad.

If you are adventurous, special mention goes to building your own from used parts. Haswell CPUs and motherboards are readily available. Ultimately you will spend more than with a prebuilt system, but you are getting higher quality parts and faster memory. Xeon CPUs are quite affordable, and for the parts from the massive selloff of Haswell/Broadwell-EP (Xeon 26xx v3/v4), there are inexpensive Chinese-made compatible motherboards you can put them into or, for a bit more, used single and dual CPU server motherboards from good manufacturers. You'll also need to spring for cooling, power, memory, storage, a case if you want one, etc. But, IMHO, these represent some of the best bang for buck you can get in used parts. You get many cores and high memory bandwidth (ECC is not required), but they will certainly use more power than a used consumer/office machine, in exchange for more work done.

Desktop is going to be your best bet if you want it to last; laptop cooling is rarely adequate for PG loads. Would I turn down a Haswell laptop for $50? No, and I'd run it bottoms up (I do that with one I got for free. It is very loud.). Be wary of "SFF" desktops. These are the tiny little boxes that are smaller than keyboards and feature low power CPUs and poor cooling.

I'm going to go stoke some flames and say that any GPU that comes with a used system is probably garbage, and was garbage back when it was new, too. This is the place where newer is always going to be leaps and bounds better than older. We don't see +50% gains in CPUs (AVX512 notwithstanding) every year or two, but we do with GPUs. Prebuilts will likely have big limitations in what can be installed, whether it's power usage or simply fitting inside the case (and it's a crapshoot if a prebuilt system can be reinstalled into a larger case or use an off the shelf power supply). Buying used PC parts separately (or as combos) can overcome this limitation.

Intel really does have a long list of CPUs, doesn't it? They want something in just about every $10 price bracket, and OEMs like that option, too. Searching eBay by CPU model is a great way to find hidden gems.
7) Message boards : News : Change in Prime Reporting Procedure (Message 147087)
Posted 19 days ago by Grebuloner (Project donor)
Are these new prime finding users generally running older and/or slower machines that wouldn't have otherwise been likely to be first? Asking more out of curiosity than anything.

With LLR2, who does PG consider the double checker? (or am I misreading that part and they all get reported as anonymous if there is no response?)
8) Message boards : 321 Prime Search : Multi-threading: Max # of threads for each task Setting? (Message 147016)
Posted 22 days ago by Grebuloner (Project donor)
Hmm according to task manager my system has the following available:

L1 2.2MB
L2 9MB
L3 90MB

Might you be running a two socket system? 36 cores total across 2 CPUs? (your computers are hidden)

You can fit 6 tasks in L3 with room to spare, so I'd suggest 3 threads/task.

Not fully comparable, of course, but I found much better throughput doing 321 with 6x3t instead of 3x6t on my 18c 10980XE.
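
The "room to spare" estimate can be checked with a little arithmetic. This is only a sketch under stated assumptions: the 90MB of L3 is split evenly across 2 sockets with 18 cores each (the guess made in this post, not a confirmed spec), and each task uses about 6.75MB, based on the 864K FFT size mentioned elsewhere in this thread at 8 bytes per element. The function name is hypothetical.

```python
def l3_headroom_mb(total_l3_mb: float, sockets: int,
                   tasks_per_socket: int, task_mb: float) -> float:
    """L3 left over on one socket after placing tasks_per_socket tasks."""
    per_socket_l3 = total_l3_mb / sockets
    return per_socket_l3 - tasks_per_socket * task_mb


TASK_MB = 864 * 8 / 1024    # assumed: 864K FFT at 8 bytes per element = 6.75 MB
CORES_PER_SOCKET = 36 // 2  # assumed: 36 cores split across 2 sockets

for tasks in (6, 7):
    spare = l3_headroom_mb(90, 2, tasks, TASK_MB)
    verdict = "fits" if spare >= 0 else "spills out of L3"
    print(f"{tasks} tasks per socket: {spare:+.2f} MB spare ({verdict})")

print("threads per task at 6 tasks per socket:", CORES_PER_SOCKET // 6)
```

Under those assumptions, six tasks per socket leave about 4.5MB free while a seventh would not fit, which lines up with the 3 threads/task suggestion.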
9) Message boards : 321 Prime Search : Multi-threading: Max # of threads for each task Setting? (Message 146971)
Posted 23 days ago by Grebuloner (Project donor)
For the i5-3470 I would run 1 task on 4 cores and for the 3900XT I would run 4 tasks on 3 cores each. These tasks are presently using about 7MB of CPU cache each, so the 3470 will be a bit slow, but the 3900XT should run them pretty quick with that setting.

For the Ryzen CPU, though, you will want to make sure that each task runs on a single CCX (a CCX is a set of 3 adjacent cores on the 3900XT) to maximize efficiency. You can use Process Lasso (free) or, if you hop on the Discord server, someone may have a link to another program written by the same person who made the LLR2 software. Ryzen 3rd gen CPUs are much more productive and efficient if you can keep a task on the same CCX.

As far as optimal configurations go, that is a function of FFT length and cache size. If you look at the output of a task after you run it (by clicking on the task on your tasks summary page), you can see the FFT size. Multiply that by 8 bytes and you get the approximate amount of CPU cache used during the test. For 321 right now the FFT size is 864K, so each task uses about 7MB of cache (864K × 8 bytes ≈ 6.75MB). The i5-3470 has 6MB and each CCX of the 3900XT has 16MB.
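
The rule of thumb above can be written out as a quick calculation. This is a minimal sketch, assuming 8 bytes per FFT element (one double-precision value) and binary units (1K = 1024 elements); the function name `fft_cache_mb` is made up for illustration and is not part of LLR or BOINC.

```python
def fft_cache_mb(fft_size_k: float, bytes_per_element: int = 8) -> float:
    """Approximate cache footprint in MB for an FFT of fft_size_k K elements.

    Assumes each element is an 8-byte double; real usage will differ somewhat.
    """
    kilobytes = fft_size_k * bytes_per_element  # K elements * bytes = KB
    return kilobytes / 1024                     # KB -> MB


for name, size_k in [("321", 864), ("PPSE", 120), ("PPS-Mega", 256)]:
    print(f"{name}: {size_k}K FFT -> about {fft_cache_mb(size_k):.2f} MB")
```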

Some examples: Looking at PPSE tasks, with an FFT length of 120K (~1MB cache usage), the most efficient configuration would be to run 1 task on each core (4 tasks total for the 3470 and 12 tasks total for the 3900XT). For PPS-Mega the FFT size right now is 256K, or about 2MB per task. You would probably want to run 2 tasks with 2 threads each on the 3470 and 12 tasks with 1 thread each for the 3900XT for optimal throughput, to stay within the cache limits for each CPU.

Hope this helps, and good luck!
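
The configurations suggested above follow a simple pattern: within one cache domain (the whole CPU for the i5-3470, one CCX for the 3900XT), run as many tasks as fit in cache while still dividing the cores evenly. The sketch below is one way to express that heuristic, not an official PrimeGrid tool; `pick_config` and its parameters are hypothetical names, and the cache and core figures are the ones quoted in the post.

```python
from typing import Tuple


def pick_config(cores: int, cache_mb: float, task_mb: float) -> Tuple[int, int]:
    """Return (tasks, threads_per_task) for one cache domain.

    Picks the largest task count that divides the cores evenly and whose
    combined footprint fits in cache; falls back to one task on all cores.
    """
    fit = int(cache_mb // task_mb)  # how many tasks the cache can hold
    tasks = 1
    for n in range(1, cores + 1):
        if cores % n == 0 and n <= fit:
            tasks = n
    return tasks, cores // tasks


# Approximate per-task footprints in MB (FFT size in K * 8 bytes / 1024)
footprints = {"321": 6.75, "PPSE": 0.9375, "PPS-Mega": 2.0}

# (label, cores per cache domain, cache per domain in MB, number of domains)
cpus = [("i5-3470", 4, 6.0, 1), ("3900XT (per CCX)", 3, 16.0, 4)]

for label, cores, cache, domains in cpus:
    for project, mb in footprints.items():
        tasks, threads = pick_config(cores, cache, mb)
        print(f"{label} {project}: {tasks * domains} tasks x {threads} threads")
```

With these inputs the heuristic reproduces the advice in the post, for example 4 tasks on 3 threads each for 321 on the 3900XT. Actual best throughput still depends on memory bandwidth and should be confirmed by timing real tasks.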

Hi, are you talking L2 or L3 CPU cache? I ask because my CPU has only 2MB of L2 but 45MB L3. Should I be calculating for L2 or L3?

Generally speaking, it's L3, but there is an "it depends" based on cache inclusiveness. AMD and consumer Intel chips store a copy of L2 in L3, so only L3 matters (and in Ryzen 3k/5k, it's L3 per CCX). Skylake-X/SP and up (HEDT/server) have a mostly non-inclusive hierarchy, so it's a little less than L2+L3.

What CPU are you using that mixes itsy bitsy L2 with massive L3?
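
The inclusive versus non-inclusive distinction above can be sketched as a small helper for estimating how much cache to budget for. This is a rough illustration only: the function name is hypothetical, the non-inclusive case returns an upper bound (the post says the real figure is a little less than L2+L3), and which CPUs fall into which category is taken from the post rather than verified here.

```python
def usable_cache_mb(l2_total_mb: float, l3_mb: float, l3_inclusive: bool) -> float:
    """Rough upper bound on cache available to tasks, in MB."""
    if l3_inclusive:
        # L2 contents are duplicated in L3, so only L3 capacity counts.
        return l3_mb
    # Non-inclusive: capacities roughly add (in practice a little less).
    return l2_total_mb + l3_mb


# Figures from the question above: 2MB of L2 and 45MB of L3
print("inclusive L3:    ", usable_cache_mb(2, 45, l3_inclusive=True), "MB")
print("non-inclusive L3:", usable_cache_mb(2, 45, l3_inclusive=False), "MB (upper bound)")
```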
10) Message boards : Number crunching : Have I found a prime or not? I'm confused... (Message 146825)
Posted 27 days ago by Grebuloner (Project donor)
The "firsts" are counts of tasks that you returned before the double checking user in the last 24 hours. They aren't a count of primes.

Primes you find will show up on a line called "discoveries" and if you are the initial finder, you'll get a notification by a message in your account and email (if you set it up).

