PrimeGrid

Message boards : Number crunching : 50xx performance tests

Author Message
tng Project donor
Joined: 29 Aug 10
Posts: 601
ID: 66603
Credit: 63,876,606,271
RAC: 15,486,133
Message 179842 - Posted: 13 Mar 2025 | 12:09:58 UTC

Tested a 5080 and a 5070 Ti against my 40xx cards. Results:

+-------------+--------------+------------+--------------+------------+--------------+------------+-----------------+---------------+--------------+------------+
| Subproject  | 4070 elapsed | 4070 power | 4080 elapsed | 4080 power | 4090 elapsed | 4090 power | 5070 Ti elapsed | 5070 Ti power | 5080 elapsed | 5080 power |
+-------------+--------------+------------+--------------+------------+--------------+------------+-----------------+---------------+--------------+------------+
| GFN-16      | 59           | 114w       | 49           | 152w       | 41           | 187w       | 68              | 101w          | 59           | 133w       |
| GFN-16 (x2) | 103          | 122w       | 85           | 154w       | 73           | 196w       | 113             | 100w          | 73           | 134w       |
| GFN-17      | 169          | 160w       | 122          | 198w       | 104          | 236w       | 150             | 138w          | 147          | 182w       |
| GFN-17 (2x) | 301          | 160w       | 223          | 200w       | 192          | 242w       | 270             | 136w          | 267          | 186w       |
| GFN-17 (3x) | 445          | 160w       | 329          | 201w       | 289          | 250w       | 400             | 139w          | 400          | 194w       |
| GFN-18      | 460          | 180w       | 316          | 251w       | 260          | 316w       | 377             | 177w          | 357          | 241w       |
| GFN-18 (2x) | 922          | 183w       | 613          | 257w       | 512          | 319w       | 740             | 180w          | 716          | 261w       |
| GFN-19      | 1516         | 176w       | 901          | 334w       | 801          | 361w       | 1176            | 226w          | 957          | 319w       |
| GFN-19 (2x) | 3224         | 179w       | 1962         | 330w       | 1795         | 367w       | 2532            | 221w          | 1911         | 327w       |
| GFN-20      | 5279         | 179w       | 3012         | 339w       | 2548         | 443w       | 3640            | 253w          | 2907         | 347w       |
| GFN-21      | 24247        | 196w       | 11362        | 340w       | 9636         | 425w       | 14370           | 306w          | 10765        | 360w       |
| DYFL        | 158032       | 200w       | 149645       | 339w       | 108024       | 443w       | 157833          | 293w          | 111764       | 360w       |
| AP27        | 232          | 184w       | 133          | 338w       | 87           | 450w       | 146             | 270w          | 121          | 347w       |
| AP27 (2x)   | 456          | 184w       | 265          | 336w       | 172          | 450w       | 289             | 270w          | 237          | 345w       |
+-------------+--------------+------------+--------------+------------+--------------+------------+-----------------+---------------+--------------+------------+

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
Message 179844 - Posted: 13 Mar 2025 | 17:38:18 UTC

If we compare RTX 4080 and 5080:

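RTX 4080   48.74 TFLOPS           Bandwidth 716.8 GB/s           L2 Cache 64 MB
RTX 5080   56.28 TFLOPS (1.15x)   Bandwidth 960.0 GB/s (1.34x)   L2 Cache 64 MB

+-------------+------------------+------------------+-----------------+------------+
| Subproject  | RTX 4080 elapsed | RTX 5080 elapsed | Ratio 4080/5080 | Data size  |
+-------------+------------------+------------------+-----------------+------------+
| GFN-16      | 49               | 59               | 0.830           |            |
| GFN-16 (x2) | 85               | 73               | 1.164           |            |
| GFN-17      | 122              | 147              | 0.829           |            |
| GFN-17 (2x) | 223              | 267              | 0.835           |            |
| GFN-17 (3x) | 329              | 400              | 0.822           |            |
| GFN-18      | 316              | 357              | 0.885           |            |
| GFN-18 (2x) | 613              | 716              | 0.856           |            |
| GFN-19      | 901              | 957              | 0.941           | 18 MB      |
| GFN-19 (2x) | 1962             | 1911             | 1.026           | 18 MB (2x) |
| GFN-20      | 3012             | 2907             | 1.036           | 36 MB      |
| GFN-21      | 11362            | 10765            | 1.055           | 72 MB      |
| DYFL        | 149645           | 111764           | 1.338           | 192 MB     |
+-------------+------------------+------------------+-----------------+------------+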

The large GFN tests are faster on the 5080 because of its memory bandwidth.
A priori, the new integer unit of Blackwell is slower than the combined integer and FP (multiply-add instruction) units of Ada Lovelace.
It is possible that the MAD operation no longer exists and that only the mul operation is available.
Two mul units instead of one MAD could be slower because of the specific code needed for modular arithmetic.
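
As a rough check of the comparison above, the ratios can be recomputed from the elapsed times. This is only a minimal Python sketch: the names `ELAPSED`, `speedup` and `RATIOS` are made up for illustration, and the numbers are copied from the two posts (single-task rows only).

```python
# Elapsed times copied from the tables above: (RTX 4080, RTX 5080).
ELAPSED = {
    "GFN-16": (49, 59),
    "GFN-17": (122, 147),
    "GFN-18": (316, 357),
    "GFN-19": (901, 957),
    "GFN-20": (3012, 2907),
    "GFN-21": (11362, 10765),
    "DYFL": (149645, 111764),
}


def speedup(t_4080, t_5080):
    """Ratio > 1 means the 5080 finished faster than the 4080."""
    return t_4080 / t_5080


RATIOS = {name: speedup(*times) for name, times in ELAPSED.items()}

# Spec ratios quoted in the post.
COMPUTE_RATIO = 56.28 / 48.74    # FP32 TFLOPS, 5080 vs 4080
BANDWIDTH_RATIO = 960.0 / 716.8  # GB/s, 5080 vs 4080

for name, ratio in RATIOS.items():
    print(f"{name:8s} {ratio:.3f}")
print(f"compute {COMPUTE_RATIO:.2f}x, bandwidth {BANDWIDTH_RATIO:.2f}x")
```

With these numbers the DYFL ratio lands very close to the bandwidth ratio, while GFN-16 to GFN-19 stay below 1, which is consistent with the explanation given in the post.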

Yves Gallot Project donor
Message 180130 - Posted: 24 Mar 2025 | 14:45:51 UTC

I generated Nvidia assembly code, trying to understand the new Blackwell architecture.

The code is a Radix-4 NTT butterfly using three prime moduli, the most intensively used function of genefer.
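
For readers unfamiliar with the term, a radix-4 NTT butterfly can be sketched in a few lines of Python. This is not genefer's code: the prime `P`, the root `G` and the function names are illustrative assumptions, only one modulus is used instead of three, and the twiddle-factor multiplications of a full transform are left out.

```python
P = 998244353                  # illustrative NTT-friendly prime (119 * 2**23 + 1), not genefer's
G = 3                          # a primitive root modulo P
I4 = pow(G, (P - 1) // 4, P)   # primitive 4th root of unity: I4 * I4 == -1 (mod P)


def radix4_butterfly(a0, a1, a2, a3, p=P, i4=I4):
    """4-point NTT of (a0, a1, a2, a3) modulo p with a single multiplication by i4."""
    s02 = (a0 + a2) % p
    d02 = (a0 - a2) % p
    s13 = (a1 + a3) % p
    d13 = (a1 - a3) * i4 % p
    return ((s02 + s13) % p, (d02 + d13) % p, (s02 - s13) % p, (d02 - d13) % p)


def naive_ntt4(a, p=P, w=I4):
    """Reference 4-point transform straight from the definition, for checking."""
    return tuple(sum(a[j] * pow(w, j * k, p) for j in range(4)) % p for k in range(4))


if __name__ == "__main__":
    sample = (1, 2, 3, 4)
    print(radix4_butterfly(*sample) == naive_ntt4(sample))
```

On a GPU every addition, subtraction and multiplication here becomes a short sequence of 32-bit integer instructions, which is why the instruction mix discussed in this post matters.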

GeForce 10 series (Pascal): the multiplier is a 16-bit unit, so four 16-bit multiply-and-add operations are needed to emulate a 32-bit MAD. 394 instructions are generated and the number of cycles is 394.

GeForce 20 series (Turing): each core contains a 32-bit MAD unit. The number of instructions and cycles is 270.

GeForce 30 and 40 series (Ampere & Ada Lovelace): double FP32 performance per SM. Half of the FP32 units are paired with integer units (without the MAD instruction). The other half of the FP32 units are paired with MAD units. If the number of MAD instructions equals the number of other integer instructions, UINT32 performance is doubled (the compiler transforms some add, left shift and move instructions into MAD for load balancing). The number of instructions is 273 (including 131 MAD): counting pairs of cores, the number of cycles is 142.
Ada Lovelace is faster, not because of the core architecture but because of the fabrication process (TSMC 4N vs Samsung 8N) and the L2 cache size (16x).

GeForce 50 series (Blackwell): the cores are all identical, with one FP32 unit and one UINT32 unit. It is not faster, but load balancing between MAD and other instructions is no longer needed. 277 instructions are generated (the number of MAD is now 78). The count increased a bit because the addressing mode was narrowed. With Ada Lovelace, you can write IADD3 R1, R2, c[0x0][0x180], RZ; where c[0x0][0x180] is a parameter on the stack. Now, two instructions are needed: LDC.64 R3, c[0x0][0x180] and IADD3 R1, PT, PT, R2, R3, RZ. This difference is not sufficient to explain why the 5080 is not 15% faster than the 4080.
The 32-bit addressing mode was removed, but genefer is a 64-bit application. Maybe 32-bit memory accesses are now slower because of a different cache organization...?
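
The cycle counts quoted in this post can be reproduced with a very small model. The sketch below is an assumption-laden simplification (one instruction per unit per cycle, no memory stalls, perfect scheduling); the function names are hypothetical and only the instruction counts come from the post.

```python
def cycles_single_issue(total):
    """One integer instruction per cycle per core (how the post counts Pascal, Turing, Blackwell)."""
    return total


def cycles_paired(total, mad):
    """Ampere/Ada model: in a pair of cores, one runs the MAD stream and the other
    runs everything else, so the longer stream sets the cycle count."""
    return max(mad, total - mad)


TURING = cycles_single_issue(270)
ADA_PAIR = cycles_paired(273, 131)          # one butterfly on a pair of cores
BLACKWELL_CORE = cycles_single_issue(277)   # one butterfly on a single core

# Normalise both to "cycles per butterfly per pair of cores":
# two Blackwell cores work on two butterflies at once.
ADA_PER_PAIR = ADA_PAIR
BLACKWELL_PER_PAIR = BLACKWELL_CORE / 2

print(TURING, ADA_PER_PAIR, BLACKWELL_PER_PAIR)
```

Under these assumptions a pair of Ada cores and a pair of Blackwell cores need almost the same number of cycles per butterfly, which fits the remark that Blackwell "is not faster" per clock for this code.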

csbyseti
Joined: 5 Sep 17
Posts: 2
ID: 921721
Credit: 611,758,248
RAC: 1,095
Message 180185 - Posted: 26 Mar 2025 | 15:28:22 UTC - in response to Message 180130.
Last modified: 26 Mar 2025 | 15:28:45 UTC

Thanks for the explanation.

GPU-Z shows an extremely high Bus Interface Load with the RTX 5080. Has anyone else seen the same?

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
Message 180272 - Posted: 29 Mar 2025 | 4:21:48 UTC - in response to Message 180130.
Last modified: 29 Mar 2025 | 5:16:45 UTC

You all know way more than I do but did you download Aida64 Extreme for the free trial to test on the 50xx graphics cards? What does it show for 24-bit/32-bit IOPS for 50xx series graphics cards? I know the software could possibly be inaccurate.

https://www.aida64.com

I ran the AIDA64 Extreme trial GPGPU benchmark on my GeForce RTX 4060 (the benchmark also runs on my i5-13400 for comparison, and the trial hides some values).
Here is my screenshot; I am curious what the 50xx series shows.

Boost is rated at 2460 MHz for the GPU, but I have seen 2715/2730/2760 MHz in 3DMark benchmarks.
(It's an HP desktop with, I presume, HP's own version of the GeForce RTX 4060.)

I am totally guessing here based on things I have read, but if I understand correctly:

a) AIDA64 reports 15264 GFLOPS for single-precision FLOPS. The rated figure is 2460 * 1000 * 1000 cycles per second * 3072 shaders * 2 (because an FMA, fused multiply-add, counts as two operations), which is approximately 15114 GFLOPS and matches techpowerup.com.

b) AIDA64 reports 261.5 GFLOPS for double-precision FLOPS. Instead of an FP64:FP32 ratio of 1:2, the consumer GeForce RTX 4060 has 1:64, since each streaming multiprocessor has only 2 FP64 units for its 128 FP32 shaders (datacenter GPUs would have 64 FP64 units per streaming multiprocessor). 15114/64 is approximately 236.16 GFLOPS, which matches techpowerup.com.

c) The trial version of AIDA64 hides the 32-bit integer IOPS, but in images on the internet it is the same as the 24-bit value. My 24-bit result is 8414 GIOPS. The rated figure is 2460 * 1000 * 1000 cycles per second * (3072 shaders / 2, since in the Ada Lovelace architecture half of the shaders are INT32 or FP32 while the other half are FP32 only) * 2 (integer multiply-add), which is approximately 7557 GIOPS. That is half the single-precision FLOPS and seems to be correct, since for the RTX 4090 the Nvidia Ada Lovelace document linked below shows "Peak INT32 TOPS (non-Tensor)" as half the "Peak FP32 TFLOPS (non-Tensor)" value.

https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf

Note Yves's important comments regarding this Compute Capability 8.9 graphics card:
Compute Capability 8.6 (GeForce 30): SM = 64 MAD32_64/FP32 + 64 INT32/FP32 + 2 FP64. Half of the cores are able to execute a MAD instruction z += x * y, where x, y are 32-bit integers, and the result of the multiplication and z are 64-bit integers. The other half of the cores execute other instructions (add, shift, logical operations, etc).
Compute Capability 8.9 (GeForce 40): the SMs are identical to 8.6, but the process size is 5 nm (Ampere was 8 nm), so the GPU operates at a higher frequency. More importantly, the L2 cache size is 10x: 40x0 cards are at least 50% faster than 30x0.

d) 64-bit integer IOPS of 2069 GIOPS - I can guess how that value is determined, but I leave that to you, Yves.
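
The arithmetic in a) to c) can be written out as a short Python sketch. The helper name `peak_gops` and the variable names are assumptions for illustration; the clock, shader count and ratios are the ones quoted in this post, and the "implied clock" lines simply invert the same formula for the measured AIDA64 values.

```python
BOOST_MHZ = 2460   # rated boost clock of the RTX 4060
SHADERS = 3072     # FP32 shader count


def peak_gops(clock_mhz, units, ops_per_unit_per_cycle=2):
    """Peak throughput in giga-operations per second (FMA or IMAD counted as 2 ops)."""
    return clock_mhz * 1e6 * units * ops_per_unit_per_cycle / 1e9


fp32_gflops = peak_gops(BOOST_MHZ, SHADERS)        # a) all shaders, FMA
fp64_gflops = fp32_gflops / 64                     # b) 1:64 FP64:FP32 ratio
int32_giops = peak_gops(BOOST_MHZ, SHADERS // 2)   # c) half of the shaders, IMAD

# Clock that the same formula would need in order to give the measured values.
implied_clock_fp32_mhz = 15264e9 / (SHADERS * 2) / 1e6
implied_clock_int_mhz = 8414e9 / (SHADERS // 2 * 2) / 1e6

print(round(fp32_gflops), round(fp64_gflops, 2), round(int32_giops))
print(round(implied_clock_fp32_mhz), round(implied_clock_int_mhz))
```

The measured values sit above the rated-clock estimates, so the formula needs a clock higher than 2460 MHz to match them; whether that is actually the reason for the gap is not something this sketch can show.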

Techpowerup page on 4060:
https://www.techpowerup.com/gpu-specs/geforce-rtx-4060.c4107

My other post, where I note that Nvidia fixed an error on its website so that the 5080 and 5090 are listed as Compute Capability 12.0, not 10.0:

https://www.primegrid.com/forum_thread.php?id=10836

Yves Gallot Project donor
Message 180274 - Posted: 29 Mar 2025 | 7:28:42 UTC - in response to Message 180272.

For Nvidia GPUs, there is no 24-bit integer unit, so 24-bit IOPS = 32-bit IOPS. For AMD GPUs, a 24-bit multiplication is three times as fast as a 32-bit multiplication, and there is no MAD instruction, only a MUL operation.

FLOPS is twice the number of FP instructions per second because an FMA counts as two operations.
How is IOPS defined? A priori x2 for IMAD, but there is also a 3-input integer addition (IADD3), which is two additions (genefer code is a sequence of IMAD and IADD3 instructions plus some tests and conditional moves for modular addition/subtraction).
We should have for
- 40 series: 2 cores can execute 2 FMA or 1 IADD3 and 1 IMAD => 4 FLOP or 4 IOP. So FLOPS = IOPS.
- 50 series: each core can execute 1 FMA or 1 IADD3 or 1 IMAD => 2 FLOP or 2 IOP. So FLOPS = IOPS.

There is no 64-bit integer unit, so a 64-bit addition is two 32-bit additions and a 64-bit MAD is four 32-bit MADs. Note that every instruction supports carry-in and carry-out.
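
To illustrate the last point, here is a minimal Python sketch of 64-bit arithmetic built from 32-bit pieces. The function names are made up for this example, and Python's unbounded integers stand in for the carry chain that the hardware instructions provide; it shows the counting (two additions, four multiplications), not actual GPU code.

```python
MASK32 = 0xFFFFFFFF


def add32(x, y, carry_in=0):
    """32-bit addition with carry-in; returns (sum mod 2**32, carry_out)."""
    s = x + y + carry_in
    return s & MASK32, s >> 32


def add64(a, b):
    """64-bit addition (mod 2**64) from two 32-bit additions chained by the carry."""
    lo, carry = add32(a & MASK32, b & MASK32)
    hi, _ = add32(a >> 32, b >> 32, carry)
    return (hi << 32) | lo


def mul64_full(a, b):
    """Full 128-bit product of two 64-bit integers from four 32x32 -> 64-bit products."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    return (
        a_lo * b_lo
        + ((a_lo * b_hi + a_hi * b_lo) << 32)
        + ((a_hi * b_hi) << 64)
    )


if __name__ == "__main__":
    x, y = 0xFFFFFFFF00000001, 0x123456789ABCDEF0
    print(add64(x, y) == (x + y) % 2**64, mul64_full(x, y) == x * y)
```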

WezH
Joined: 9 Jun 11
Posts: 184
ID: 101605
Credit: 1,690,023,334
RAC: 833,945
Message 180279 - Posted: 29 Mar 2025 | 13:12:04 UTC

Here is my Asus PRIME GeForce RTX™ 5070 12GB GDDR7 OC Edition (PRIME-RTX5070-O12G)

+-------------+--------------+------------+
| Subproject  | 5070 elapsed | 5070 power |
+-------------+--------------+------------+
| GFN-16      | 70           | 90w        |
| GFN-16 (x2) | 121          | 105w       |
| GFN-16 (x3) | 192          | 111w       |
| GFN-17      | 163          | 124w       |
| GFN-17 (2x) | 347          | 151w       |
| GFN-17 (3x) | 472          | 158w       |
| GFN-18      | 459          | 202w       |
| GFN-18 (2x) | 896          | 196w       |
| GFN-19      | 1364         | 238w       |
| GFN-19 (2x) | 2743         | 239w       |
| GFN-20      | 4463         | 246w       |
| GFN-21      | 18131        | 247w       |
| DYFL        | 205857       | 246w       |
| AP27        | 219          | 201w       |
| AP27 (2x)   | 421          | 205w       |
+-------------+--------------+------------+

kuroganet
Joined: 13 Nov 09
Posts: 7
ID: 50048
Credit: 1,837,298,018
RAC: 83,617
Message 180309 - Posted: 30 Mar 2025 | 21:29:42 UTC

Tested ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY 32GB GDDR7 (ZT-B50900B-10P)

+-------------+--------------+------------+
| Subproject  | 5090 elapsed | 5090 power |
+-------------+--------------+------------+
| GFN-16      | 39           | 210w       |
| GFN-16 (x2) | 60           | 207w       |
| GFN-16 (x3) | 124          | 204w       |
| GFN-17      | 98           | 279w       |
| GFN-17 (2x) | 128          | 263w       |
| GFN-17 (3x) | 256          | 278w       |
| GFN-18      | 218          | 395w       |
| GFN-18 (2x) | 426          | 380w       |
| GFN-19      | 653          | 590w       |
| GFN-19 (2x) | 1323         | 540w       |
| GFN-20      | 2005         | 600w       |
| GFN-21      | 6969         | 600w       |
| DYFL        | 60592        | 600w       |
| AP27        | 64           | 600w       |
| AP27 (2x)   | 122          | 600w       |
+-------------+--------------+------------+

mikey
Joined: 17 Mar 09
Posts: 2339
ID: 37043
Credit: 1,055,100,737
RAC: 156,395
Message 180313 - Posted: 31 Mar 2025 | 12:44:13 UTC - in response to Message 180309.
Last modified: 31 Mar 2025 | 12:45:40 UTC

Based on the last two charts, prime numbers are going to be found, and found even faster, if people can upgrade to these GPUs!! I hope the admins have new prime number ranges for us that they can bring online pretty quickly as well.

William F. Garnett III
Message 180321 - Posted: 31 Mar 2025 | 17:27:56 UTC

Looks like the Aida64 GPGPU benchmark is currently not supported since Nvidia removed 32-bit OpenCL support for the 5000 series.

https://forums.aida64.com/topic/16903-nvidia-rtx5090d-unsupported-gpgpu-benchmark/

tng Project donor
Message 180442 - Posted: 6 Apr 2025 | 1:09:26 UTC - in response to Message 180279.

Very similar to my results with an MSI Shadow. Faster than a 4070 for GFN19-21, but really runs out of steam on DYFL.
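
One way to read the multi-task rows of the 5070 table is to divide the elapsed time by the number of tasks. The sketch below assumes that "(x2)" and "(x3)" mean that many tasks running at the same time and that the elapsed column is in seconds; both are a reading of the table rather than something the post states. The function names are made up for illustration.

```python
# Elapsed times copied from the RTX 5070 table above.
# Assumption: "(x2)"/"(x3)" means that many concurrent tasks, times in seconds.
elapsed = {
    "GFN-16": {1: 70, 2: 121, 3: 192},
    "GFN-17": {1: 163, 2: 347, 3: 472},
    "GFN-18": {1: 459, 2: 896},
    "GFN-19": {1: 1364, 2: 2743},
    "AP27": {1: 219, 2: 421},
}


def per_task_seconds(subproject):
    """Effective seconds per task for each concurrency level."""
    return {n: t / n for n, t in elapsed[subproject].items()}


def best_concurrency(subproject):
    """Concurrency level with the lowest effective time per task."""
    per_task = per_task_seconds(subproject)
    return min(per_task, key=per_task.get)


for name in elapsed:
    per_task = per_task_seconds(name)
    summary = ", ".join(f"{n}x: {t:.1f}s" for n, t in per_task.items())
    print(f"{name:7s} {summary}  (best: {best_concurrency(name)}x)")
```

Under these assumptions the gain from running tasks side by side shrinks as the transform size grows, and for GFN-19 a single task is already slightly better.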
____________

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
GFN Turquoise: Earned 5,000,000 credits (6,977,085)
Message 180501 - Posted: 11 Apr 2025 | 2:01:20 UTC

Yves already described the architecture above.

Just for awareness: on March 20, I opened an Nvidia support ticket regarding pages 11-12 of the Nvidia RTX Blackwell GPU Architecture document.

I asked “Please advise for consumer RTX 5000 series graphics cards if they do or do not have double INT32 performance (i.e. can use all shaders for INT32 rather than half) for Blackwell consumer Geforce 5000 series graphics cards compared to previous Ada Lovelace consumer Geforce 4000 series graphics (which can use half the shaders for INT32)”.

On April 9 Nvidia replied “There is no error in the Blackwell architecture document. Yes, the consumer GeForce RTX 5000 series graphics cards do have double INT32 performance compared to the Ada Lovelace-based RTX 4000 series”

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
GFN Double Silver: Earned 200,000,000 credits (310,512,294)
Message 180502 - Posted: 11 Apr 2025 | 6:45:35 UTC - in response to Message 180501.

On April 9 Nvidia replied “There is no error in the Blackwell architecture document. Yes, the consumer GeForce RTX 5000 series graphics cards do have double INT32 performance compared to the Ada Lovelace-based RTX 4000 series”

It is both true and false.
If the number of cores is n, Ada Lovelace can execute n/2 IADD3 instructions per cycle and Blackwell can execute n. Likewise, Ada Lovelace can execute n/2 IMAD instructions per cycle and Blackwell n. But Ada Lovelace can execute n/2 IADD3 and n/2 IMAD in the same cycle, n integer instructions in total, the same as Blackwell.
If a benchmark evaluates instructions individually then INT32 performance is 2x. But in practice, where instructions are mixed, INT32 performance is 1x.
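
The argument above can be checked with a small throughput model. This is only an idealized sketch: it counts instruction issue slots as described in this post and ignores latency, memory access and scheduling. The function names and the value `cores=128` are illustrative assumptions, not figures from Nvidia.

```python
def cycles_ada(n_add, n_mad, cores):
    """Ada Lovelace as described above: cores/2 IADD3 and cores/2 IMAD per
    cycle, issued side by side, so the busier pipe sets the time."""
    half = cores / 2
    return max(n_add / half, n_mad / half)


def cycles_blackwell(n_add, n_mad, cores):
    """Blackwell as described above: up to `cores` integer instructions per
    cycle, of either kind."""
    return (n_add + n_mad) / cores


def speedup(n_add, n_mad, cores=128):
    """Blackwell speedup over Ada Lovelace for a given instruction mix."""
    return cycles_ada(n_add, n_mad, cores) / cycles_blackwell(n_add, n_mad, cores)


print(speedup(1000, 0))    # IADD3 only
print(speedup(0, 1000))    # IMAD only
print(speedup(500, 500))   # even mix
```

In this model a single-instruction benchmark shows a factor of 2, while an even IADD3/IMAD mix shows no gain, which is the "both true and false" point made above.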

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
GFN Turquoise: Earned 5,000,000 credits (6,977,085)
Message 180503 - Posted: 11 Apr 2025 | 14:35:20 UTC - in response to Message 180502.
Last modified: 11 Apr 2025 | 14:35:37 UTC

Thanks Yves.

FYI - I have been quoting you (giving you credit) in the Nvidia Blackwell integer thread here:

https://forums.developer.nvidia.com/t/blackwell-integer/320578

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
GFN Turquoise: Earned 5,000,000 credits (6,977,085)
Message 180505 - Posted: 11 Apr 2025 | 14:56:47 UTC

Also Yves, in that thread it sounds like for cases where the instructions are not mixed they are not seeing the 2x INT32 increase for Blackwell versus Ada Lovelace. (genefer does an IADD3/IMAD mix for Ada Lovelace so it can't be used as a comparison, since it negates any "theoretical 2x" increase.)

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
GFN Turquoise: Earned 5,000,000 credits (6,977,085)
Message 180506 - Posted: 11 Apr 2025 | 15:00:23 UTC - in response to Message 180505.

Also Yves, in that thread it sounds like for cases where the instructions are not mixed they are not seeing the 2x INT32 increase for Blackwell versus Ada Lovelace. (genefer does an IADD3/IMAD mix for Ada Lovelace so it can't be used as a comparison, since it negates any "theoretical 2x" increase.)


Yves, the user Curefab in that forum says:

As far as I understood, the performed tests only showed the same characteristics as an Ada Lovelace system, whereas the documentation and the marketing emphasized the improvement you stated.

mackerel Project donor
Volunteer tester
Joined: 2 Oct 08
Posts: 2883
ID: 29980
Credit: 716,426,128
RAC: 203,941
Discovered 6 mega primesEliminated 1 conjecture "k"Found 3 primes in the 2018 Tour de PrimesFound 1 mega prime in the 2018 Tour de PrimesFound 5 primes in the 2019 Tour de PrimesFound 6 primes in the 2020 Tour de PrimesFound 5 primes in the 2021 Tour de PrimesFound 1 prime in the 2022 Tour de PrimesFound 1 prime in the 2023 Tour de PrimesFound 2 primes in the 2024 Tour de PrimesFound 1 prime in the 2025 Tour de Primes321 LLR Jade: Earned 10,000,000 credits (10,747,880)Cullen LLR Jade: Earned 10,000,000 credits (10,179,678)ESP LLR Jade: Earned 10,000,000 credits (10,081,650)Generalized Cullen/Woodall LLR Jade: Earned 10,000,000 credits (16,188,476)Primorial Prime Search Jade: Earned 10,000,000 credits (10,092,157)PPS LLR Double Bronze: Earned 100,000,000 credits (144,065,867)PSP LLR Sapphire: Earned 20,000,000 credits (21,293,168)SoB LLR Sapphire: Earned 20,000,000 credits (20,128,807)SR5 LLR Sapphire: Earned 20,000,000 credits (31,067,789)SGS LLR (suspended) Turquoise: Earned 5,000,000 credits (7,492,571)TPS LLR (retired) Bronze: Earned 10,000 credits (34,130)TRP LLR Sapphire: Earned 20,000,000 credits (43,313,604)Woodall LLR Jade: Earned 10,000,000 credits (10,111,927)321 Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,236,219)Factorial/Compositorial Sieve Jade: Earned 10,000,000 credits (10,283,478)Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (6,607,938)Generalized Cullen/Woodall Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,626,419)PPS Sieve Emerald: Earned 50,000,000 credits (76,969,144)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Ruby: Earned 2,000,000 credits (2,293,882)Sierpinski/Riesel Base 5 Sieve Jade: Earned 10,000,000 credits (10,016,731)TRP Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,012,757)AP 26/27 Sapphire: Earned 20,000,000 credits (39,210,805)GFN Double Bronze: Earned 100,000,000 credits (144,127,355)WW (retired) Sapphire: Earned 20,000,000 credits (43,304,000)PSA Ruby: Earned 2,000,000 credits 
(2,939,755)
Message 180614 - Posted: 18 Apr 2025 | 22:06:01 UTC

The latest driver, 576.02, has been reported to give performance increases in (some?) gaming-related uses. On my own system, relative to early drivers, I saw increases of between 4% and 11% in the Steel Nomad, Monster Hunter Wilds, Black Myth Wukong, and FFXIV: Dawntrail benchmarks. For Wukong specifically, I noticed the boost clocks were going higher but don't know if that is the only thing going on. Power may be a little higher, but that fluctuated a lot so I'm less certain.

The only compute load I tried is Blender 4.3.0: 7339.73 with the latest driver versus 7121.3 with the older one, an increase of just over 3%.

I have not done any BOINC type testing on it so far.

tng Project donor
Joined: 29 Aug 10
Posts: 601
ID: 66603
Credit: 63,876,606,271
RAC: 15,486,133
Message 180617 - Posted: 19 Apr 2025 | 0:27:14 UTC - in response to Message 180614.

The latest driver, 576.02, has been reported to give performance increases in (some?) gaming-related uses. On my own system, relative to early drivers, I saw increases of between 4% and 11% in the Steel Nomad, Monster Hunter Wilds, Black Myth Wukong, and FFXIV: Dawntrail benchmarks. For Wukong specifically, I noticed the boost clocks were going higher but don't know if that is the only thing going on. Power may be a little higher, but that fluctuated a lot so I'm less certain.

The only compute load I tried is Blender 4.3.0: 7339.73 with the latest driver versus 7121.3 with the older one, an increase of just over 3%.

I have not done any BOINC type testing on it so far.


No noticeable difference with my 5070 Ti running GFN-21.
____________

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
GFN Double Silver: Earned 200,000,000 credits (310,512,294)
Message 180932 - Posted: 5 May 2025 | 12:40:00 UTC

CUDA 12.9 provides new insights into Blackwell architecture.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capability-12-x
Compute Capability 12.x (Blackwell 2.0). A Streaming Multiprocessor (SM) consists of:
- 128 FP32 cores for single-precision arithmetic operations,
- 2 FP64 cores for double-precision arithmetic operations,
- 64 INT32 cores for integer math.

More details are available at https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions
The Compute Capability of Ada Lovelace is 8.9. So what's new (or not) with the Blackwell architecture?
The throughput of 32-bit integer add, sub and logic instructions is still 64 (and not 128 as indicated in the Blackwell GPU Architecture Whitepaper).
But now the throughput of 32-bit integer multiply-add is 128. A surprising architecture: 128 additions can be computed using the MAD instruction, but only 64 using the ADD instruction. It is not a real improvement compared to Ada Lovelace because the throughput was already 64 ADD/SUB/AND/... + 64 MAD = 128 instructions.
It is interesting to note that there are now 128 64-bit integer add and multiply units. Because the 32-bit support was removed and addressing modes were reduced, 64-bit operations are needed for address calculation. These instructions are not as powerful as the 32-bit variants (carry-in, carry-out and MAD are not available) but they are sufficient for evaluating a 64-bit address. This may indicate that the next Nvidia architectures will be based on 64-bit cores. It would be a major improvement for primality tests.

64-bit operations are not sufficient to counteract the narrowed addressing mode. The size of the code increased by 5-10%.
RTX 5080 (10752 cores, 360 W) is about as fast as RTX 4080 (9728 cores, 320 W) for GFN-21: the data size is 36MB, which fits in the 64MB L2 cache. GFN-23 is faster on RTX 5080 because its data size is 96MB, which exceeds the L2 cache, and memory bandwidth is 960.0GB/s (vs 716.8GB/s).
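
The cache argument can be written down as a quick check. This is a rough sketch that uses only the figures quoted above. It assumes that the 64MB L2 figure applies to both cards and that 716.8GB/s is the RTX 4080 value, which the post implies but does not spell out. The names are illustrative.

```python
L2_CACHE_MB = 64  # L2 size quoted above (assumed to apply to both cards)

# Data sizes quoted above, in MB.
data_size_mb = {"GFN-21": 36, "GFN-23": 96}

# Memory bandwidth quoted above, in GB/s (716.8 assumed to be the RTX 4080).
bandwidth_gbs = {"RTX 5080": 960.0, "RTX 4080": 716.8}


def fits_in_l2(subproject):
    """True if the whole data set fits in the L2 cache."""
    return data_size_mb[subproject] <= L2_CACHE_MB


for name in data_size_mb:
    print(name, "fits in L2" if fits_in_l2(name) else "does not fit in L2")

ratio = bandwidth_gbs["RTX 5080"] / bandwidth_gbs["RTX 4080"]
print(f"memory bandwidth ratio 5080/4080: {ratio:.2f}")
```

If a test were limited purely by memory bandwidth, this ratio would be an upper bound on the gain; real results depend on more than this.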

kuroganet
Joined: 13 Nov 09
Posts: 7
ID: 50048
Credit: 1,837,298,018
RAC: 83,617
Discovered 9 mega primesDiscovered 1 AP26Found 6 primes in the 2021 Tour de PrimesFound 4 mega primes in the 2021 Tour de PrimesFound 6 primes in the 2022 Tour de PrimesFound 3 mega primes in the 2022 Tour de PrimesFound 1 prime in the 2022 Tour de Primes Mountain StageFound 1 mega prime  in the 2022 Tour de Primes Mountain StageFound 3 primes in the 2023 Tour de PrimesFound 7 primes in the 2024 Tour de PrimesFound 1 mega prime in the 2024 Tour de PrimesFound 1 prime in the 2025 Tour de Primes321 LLR Jade: Earned 10,000,000 credits (17,229,876)Cullen LLR Jade: Earned 10,000,000 credits (11,043,069)ESP LLR Jade: Earned 10,000,000 credits (14,350,383)Generalized Cullen/Woodall LLR Sapphire: Earned 20,000,000 credits (30,482,815)Primorial Prime Search Turquoise: Earned 5,000,000 credits (6,966,935)PPS LLR Emerald: Earned 50,000,000 credits (85,428,533)PSP LLR Sapphire: Earned 20,000,000 credits (39,628,061)SoB LLR Sapphire: Earned 20,000,000 credits (25,109,843)SR5 LLR Sapphire: Earned 20,000,000 credits (29,250,777)SGS LLR (suspended) Jade: Earned 10,000,000 credits (10,611,654)TRP LLR Sapphire: Earned 20,000,000 credits (24,385,306)Woodall LLR Sapphire: Earned 20,000,000 credits (23,658,244)321 Sieve (suspended) Ruby: Earned 2,000,000 credits (3,645,200)Cullen/Woodall Sieve Jade: Earned 10,000,000 credits (15,347,553)Generalized Cullen/Woodall Sieve (suspended) Turquoise: Earned 5,000,000 credits (8,296,793)PPS Sieve Double Silver: Earned 200,000,000 credits (385,469,724)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Ruby: Earned 2,000,000 credits (3,604,513)Sierpinski/Riesel Base 5 Sieve Silver: Earned 100,000 credits (109,308)TRP Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,369,210)AP 26/27 Double Bronze: Earned 100,000,000 credits (157,832,077)GFN Double Silver: Earned 200,000,000 credits (464,707,695)WW (retired) Double Silver: Earned 200,000,000 credits (464,144,000)PSA Jade: Earned 10,000,000 credits (10,624,198)
Message 180933 - Posted: 5 May 2025 | 13:53:12 UTC

CUDA Toolkit 12.9 (NVIDIA Driver 575.51.03)
Genefer v4.10
AP27 Search v2.11

Tested ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY 32GB GDDR7 (ZT-B50900B-10P)
Fedora 41 Workstation

+---------------+---------------+---------------+
|Subproject     |5090 elapsed   |5090 power     |
+---------------+---------------+---------------+
|GFN-16         |39             |188w           |
+---------------+---------------+---------------+
|GFN-16 (x2)    |70             |193w           |
+---------------+---------------+---------------+
|GFN-16 (x3)    |89             |193w           |
+---------------+---------------+---------------+
|GFN-17         |81             |294w           |
+---------------+---------------+---------------+
|GFN-17 (2x)    |150            |292w           |
+---------------+---------------+---------------+
|GFN-17 (3x)    |230            |295w           |
+---------------+---------------+---------------+
|GFN-18         |200            |408w           |
+---------------+---------------+---------------+
|GFN-18 (2x)    |389            |380w           |
+---------------+---------------+---------------+
|GFN-19         |498            |559w           |
+---------------+---------------+---------------+
|GFN-19 (2x)    |1009           |505w           |
+---------------+---------------+---------------+
|GFN-20         |1357           |600w           |
+---------------+---------------+---------------+
|GFN-21         |4788           |600w           |
+---------------+---------------+---------------+
|DYFL           |44806          |600w           |
+---------------+---------------+---------------+
|AP27           |66             |600w           |
+---------------+---------------+---------------+
|AP27 (2x)      |121            |600w           |
+---------------+---------------+---------------+

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
GFN Double Silver: Earned 200,000,000 credits (310,512,294)
Message 180936 - Posted: 6 May 2025 | 14:34:22 UTC - in response to Message 180933.
Last modified: 6 May 2025 | 14:36:09 UTC

Genefer v4.10 AP27 Search v2.11 Tested ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY 32GB GDDR7 (ZT-B50900B-10P)

Of course, these are new records:
4179054^{2^20} + 1: proof file is generated, time = 00:22:35.
2305080^{2^21} + 1: proof file is generated, time = 01:19:41.
100136^{2^23} + 1: proof file is generated, time = 12:26:38.

I have compared energy (elapsed * power) to tng's tests with genefer v4.10 (RTX 5090 TDP is 575 W and RTX 4090 TDP is 450 W).
In kWh:
         4090   5080   5090
GFN-20   0.19   0.21   0.23
GFN-21   0.75   0.84   0.80
DYFL     9.47   9.16   7.47

They require equivalent amounts of energy for GFN-20 and 21 but RTX 5090 is more energy efficient for DYFL... I don't know why.
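
The 5090 column of the energy comparison can be reproduced from kuroganet's 5090 table with elapsed * power. This is a minimal sketch, assuming the elapsed column is in seconds and the power column is the average draw in watts over the run. The function name is illustrative.

```python
def energy_kwh(elapsed_s, power_w):
    """Energy in kWh from elapsed seconds and average power in watts."""
    return elapsed_s * power_w / 3_600_000  # 1 kWh = 3.6e6 J


# RTX 5090 rows from the table above: (elapsed seconds, reported watts).
rtx5090 = {
    "GFN-20": (1357, 600),
    "GFN-21": (4788, 600),
    "DYFL": (44806, 600),
}

for name, (elapsed_s, power_w) in rtx5090.items():
    print(f"{name:7s} {energy_kwh(elapsed_s, power_w):.2f} kWh")
```

The 4090 and 5080 columns come from tng's tests, which are not part of this post, so they are not recomputed here.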

Honza Project donor
Volunteer moderator
Volunteer tester
Project scientist
Joined: 15 Aug 05
Posts: 2030
ID: 352
Credit: 8,498,208,341
RAC: 2,642,421
Discovered 17 mega primesEliminated 4 conjecture "k"sFound 2 primes in the 2018 Tour de PrimesFound 1 prime in the 2018 Tour de Primes Mountain Stage2019 Tour de Primes largest primeFound 4 primes in the 2019 Tour de PrimesFound 1 mega prime in the 2019 Tour de PrimesFound 1 prime in the 2019 Tour de Primes Mountain StageFound 1 prime in the 2020 Tour de PrimesFound 4 primes in the 2021 Tour de PrimesFound 1 mega prime in the 2021 Tour de PrimesFound 1 prime in the 2021 Tour de Primes Mountain StageFound 2 primes in the 2022 Tour de PrimesFound 1 mega prime in the 2022 Tour de PrimesFound 3 primes in the 2023 Tour de PrimesFound 1 mega prime in the 2023 Tour de PrimesFound 1 prime in the 2023 Tour de Primes Mountain StageFound 3 primes in the 2024 Tour de PrimesFound 1 mega prime in the 2024 Tour de PrimesFound 1 prime in the 2024 Tour de Primes Mountain StageFound 4 primes in the 2025 Tour de PrimesFound 1 mega prime in the 2025 Tour de PrimesFound 3 primes in the 2025 Tour de Primes Mountain Stage321 LLR Double Bronze: Earned 100,000,000 credits (114,828,162)Cullen LLR Double Bronze: Earned 100,000,000 credits (136,198,079)ESP LLR Double Bronze: Earned 100,000,000 credits (117,666,516)Generalized Cullen/Woodall LLR Double Bronze: Earned 100,000,000 credits (143,201,627)Primorial Prime Search Double Gold: Earned 500,000,000 credits (653,608,121)PPS LLR Double Silver: Earned 200,000,000 credits (366,038,198)PSP LLR Double Bronze: Earned 100,000,000 credits (129,598,109)SoB LLR Double Bronze: Earned 100,000,000 credits (133,324,006)SR5 LLR Double Gold: Earned 500,000,000 credits (515,262,651)SGS LLR (suspended) Double Bronze: Earned 100,000,000 credits (107,001,508)TPS LLR (retired) Bronze: Earned 10,000 credits (43,033)TRP LLR Double Bronze: Earned 100,000,000 credits (142,906,040)Woodall LLR Double Bronze: Earned 100,000,000 credits (101,172,892)321 Sieve (suspended) Double Bronze: Earned 100,000,000 credits (115,948,450)Factorial/Compositorial Sieve Sapphire: 
Earned 20,000,000 credits (20,841,722)Cullen/Woodall Sieve Double Bronze: Earned 100,000,000 credits (100,408,718)Generalized Cullen/Woodall Sieve (suspended) Emerald: Earned 50,000,000 credits (50,504,945)PPS Sieve Double Gold: Earned 500,000,000 credits (513,057,580)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,288,222)Sierpinski/Riesel Base 5 Sieve Double Bronze: Earned 100,000,000 credits (161,650,206)TRP Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,149,354)AP 26/27 Double Gold: Earned 500,000,000 credits (782,455,526)GFN Double Amethyst: Earned 1,000,000,000 credits (1,497,276,676)WW (retired) Double Ruby: Earned 2,000,000,000 credits (2,018,796,000)PSA Double Gold: Earned 500,000,000 credits (536,055,990)
Message 180939 - Posted: 6 May 2025 | 17:49:46 UTC - in response to Message 180936.

They require equivalent amounts of energy for GFN-20 and 21 but RTX 5090 is more energy efficient for DYFL... I don't know why.



Memory bus width 256 vs 512?
L1 cache 10 vs 21.25MB?


____________
My stats

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
GFN Double Silver: Earned 200,000,000 credits (310,512,294)
Message 180943 - Posted: 6 May 2025 | 21:53:22 UTC - in response to Message 180939.

Memory bus width 256 vs 512?
L1 cache 10 vs 21.25MB?

5090 is about 2x 5080 (number of cores and memory bandwidth).
5090 is twice as fast as 5080 (44806/89351 seconds) but the power consumption is 600/370 W (x1.62).
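
The ratios quoted above, and the energy ratio they imply for DYFL, can be worked out directly. This is a small sketch using only the numbers in this post, treating the quoted wattages as average draw over the whole run, which is an assumption.

```python
# DYFL figures quoted above.
t_5080, t_5090 = 89351, 44806   # elapsed seconds
p_5080, p_5090 = 370, 600       # power in watts (assumed average draw)

speedup = t_5080 / t_5090                               # how much faster the 5090 is
power_ratio = p_5090 / p_5080                           # how much more power it draws
energy_ratio = (t_5090 * p_5090) / (t_5080 * p_5080)    # energy per test, 5090 / 5080

print(f"speedup:      x{speedup:.2f}")
print(f"power ratio:  x{power_ratio:.2f}")
print(f"energy ratio: x{energy_ratio:.2f}")
```

With these inputs the 5090 finishes in about half the time for about 1.62 times the power, so it needs roughly 0.81 times the energy per DYFL test.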

mackerel Project donor
Volunteer tester
Joined: 2 Oct 08
Posts: 2883
ID: 29980
Credit: 716,426,128
RAC: 203,941
Message 180945 - Posted: 6 May 2025 | 22:40:09 UTC - in response to Message 180943.

5090 is about 2x 5080 (number of cores and memory bandwidth).
5090 is twice as fast as 5080 (44806/89351 seconds) but the power consumption is 600/370 W (x1.62).

5080 uses 30Gbps VRAM vs 28Gbps on 5090. I don't know if they use a higher voltage to get the higher speed. If so, that will affect power efficiency beyond just clock scaling.

5080 has higher default clocks, roughly 2.3 GHz base, 2.6 boost, compared to 5090 2.0 base, 2.4 boost. Power limit of 5080 is 360W vs 600W of 5090.

Basically 5080 is pushed harder and works in a less efficient area. If the 5090 didn't have that power limit it likely could go faster still.

tng Project donor
Joined: 29 Aug 10
Posts: 601
ID: 66603
Credit: 63,876,606,271
RAC: 15,486,133
Message 180948 - Posted: 7 May 2025 | 1:44:07 UTC - in response to Message 180945.

5090 is about 2x 5080 (number of cores and memory bandwidth).
5090 is twice as fast as 5080 (44806/89351 seconds) but the power consumption is 600/370 W (x1.62).

5080 uses 30Gbps VRAM vs 28Gbps on 5090. I don't know if they use a higher voltage to get the higher speed. If so, that will affect power efficiency beyond just clock scaling.

5080 has higher default clocks, roughly 2.3 GHz base, 2.6 boost, compared to 5090 2.0 base, 2.4 boost. Power limit of 5080 is 360W vs 600W of 5090.

Basically 5080 is pushed harder and works in a less efficient area. If the 5090 didn't have that power limit it likely could go faster still.


Until it melted.
____________

kuroganet
Joined: 13 Nov 09
Posts: 7
ID: 50048
Credit: 1,837,298,018
RAC: 83,617
Message 180956 - Posted: 7 May 2025 | 15:38:39 UTC - in response to Message 180948.
Last modified: 7 May 2025 | 15:59:37 UTC

I am computing with this kind of setup.

gemini8 Project donor
Joined: 2 Jan 16
Posts: 170
ID: 434794
Credit: 1,159,063,572
RAC: 490,419
Discovered 5 mega primesDiscovered 1 AP26Found 1 prime in the 2018 Tour de PrimesFound 1 prime in the 2019 Tour de PrimesFound 1 prime in the 2020 Tour de PrimesFound 1 prime in the 2023 Tour de PrimesFound 1 mega prime in the 2023 Tour de PrimesFound 3 primes in the 2024 Tour de PrimesFound 2 mega primes in the 2024 Tour de PrimesFound 1 prime in the 2024 Tour de Primes Mountain Stage321 LLR Turquoise: Earned 5,000,000 credits (5,542,543)Cullen LLR Jade: Earned 10,000,000 credits (10,021,170)ESP LLR Jade: Earned 10,000,000 credits (10,062,358)Generalized Cullen/Woodall LLR Jade: Earned 10,000,000 credits (14,680,384)Primorial Prime Search Jade: Earned 10,000,000 credits (17,364,189)PPS LLR Sapphire: Earned 20,000,000 credits (31,371,561)PSP LLR Jade: Earned 10,000,000 credits (12,397,681)SoB LLR Jade: Earned 10,000,000 credits (16,012,496)SR5 LLR Jade: Earned 10,000,000 credits (10,006,594)SGS LLR (suspended) Turquoise: Earned 5,000,000 credits (5,004,286)TRP LLR Jade: Earned 10,000,000 credits (10,024,661)Woodall LLR Turquoise: Earned 5,000,000 credits (5,002,923)321 Sieve (suspended) Ruby: Earned 2,000,000 credits (2,035,186)Factorial/Compositorial Sieve Turquoise: Earned 5,000,000 credits (5,234,078)Cullen/Woodall Sieve Double Bronze: Earned 100,000,000 credits (100,160,728)Generalized Cullen/Woodall Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,002,647)PPS Sieve Double Bronze: Earned 100,000,000 credits (100,000,715)Sierpinski/Riesel Base 5 Sieve Sapphire: Earned 20,000,000 credits (21,302,683)TRP Sieve (suspended) Silver: Earned 100,000 credits (156,305)AP 26/27 Double Bronze: Earned 100,000,000 credits (109,177,172)GFN Double Gold: Earned 500,000,000 credits (568,503,061)WW (retired) Double Bronze: Earned 100,000,000 credits (100,000,000)
Message 180968 - Posted: 8 May 2025 | 5:45:04 UTC - in response to Message 180956.

Just 91% utilization of the GPU?
What about the values running two tasks in parallel?
____________
Greetings, Jens

147433824^131072+1

Honza Project donor
Volunteer moderator
Volunteer tester
Project scientist
Joined: 15 Aug 05
Posts: 2030
ID: 352
Credit: 8,498,208,341
RAC: 2,642,421
Message 180970 - Posted: 8 May 2025 | 8:08:08 UTC - in response to Message 180968.

> Just 91% utilization of the GPU?
> What about the values running two tasks in parallel?


Power limit just 600W? :-)

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
Message 180971 - Posted: 8 May 2025 | 9:32:04 UTC - in response to Message 180968.

> Just 91% utilization of the GPU?
> What about the values running two tasks in parallel?

The data size is 96 MB and the RTX 5090 L2 cache size is 96 MB.
I think the 91% is 2820 MHz / 3090 MHz, caused by the power limit.
Running two tasks is efficient when the power draw is less than the power limit.
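
The clock-ratio reading can be checked with one line of arithmetic. A minimal sketch in Python; the variable names are illustrative, and the two clock values are simply the ones quoted in the post:

```python
# Clock values quoted in the post, in MHz. Names are illustrative only.
observed_clock_mhz = 2820  # clock reached under the power limit
max_clock_mhz = 3090       # clock taken as the unthrottled maximum in the post

ratio = observed_clock_mhz / max_clock_mhz
print(f"{ratio:.1%}")  # prints 91.3%
```

The result is close to the 90-91% utilization reported for GFN20, which is consistent with the power-limit explanation but does not prove it.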

kuroganet
Joined: 13 Nov 09
Posts: 7
ID: 50048
Credit: 1,837,298,018
RAC: 83,617
Message 180984 - Posted: 9 May 2025 | 9:17:22 UTC - in response to Message 180971.

This is what the data I have on hand looks like.

RTX5090
GPU utilization

GFN16
x1 73%
x2 74%
x3 75%

GFN17
x1 81%
x2 81%
x3 82%

GFN18
x1 83%
x2 85%

GFN19
x1 83%
x2 84%

GFN20
x1 90-91%

GFN21
x1 75-96%

DYFL
x1 98%

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
Message 180992 - Posted: 9 May 2025 | 20:37:50 UTC - in response to Message 180932.
Last modified: 9 May 2025 | 20:53:59 UTC

CUDA 12.9 provides new insights into Blackwell architecture.


But now the throughput of 32-bit integer multiply-add is 128.


It is interesting to note that there are now 128 64-bit integer add, multiply units.


The Blackwell integer thread:

https://forums.developer.nvidia.com/t/blackwell-integer/320578

just mentioned the throughput table has been updated again by Nvidia.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/#maximize-instruction-throughput

Blackwell can do 64 32-bit integer multiply or multiply-add instructions per clock cycle per multiprocessor, not 128.

It can do 64 64-bit integer adds.

It looks like everyone was right that Blackwell's INT32 instruction throughput is similar to that of the Ada Lovelace architecture.

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
Message 180998 - Posted: 10 May 2025 | 7:22:07 UTC

Nvidia documentation last Monday (see https://web.archive.org/web/20250505133403/https://docs.nvidia.com/cuda/cuda-c-programming-guide/#maximize-instruction-throughput)

32-bit integer add, extended-precision add, subtract, extended-precision subtract: 64
32-bit integer multiply, multiply-add, extended-precision multiply-add: 128
64-bit integer add, multiply: 128

And Nvidia documentation on Friday (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput)

32-bit integer add, extended-precision add, subtract, extended-precision subtract: 64
32-bit integer multiply, multiply-add, extended-precision multiply-add: 64
64-bit integer add: 64

On Ada Lovelace, the compiler generates about 50% integer multiply-add and 50% integer add, shift and compare instructions, so that the throughput is 64 + 64 = 128.
On Blackwell, multiply and the other instructions are not balanced, but the throughput of integer operations is similar to Ada Lovelace's.
This is inconsistent with the table... the real architecture remains unknown.
Are the integer add, madd, shift and compare units independent, so that their throughputs can be added?
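
The question matters because the two assumptions give different totals. A rough sketch of both models, assuming (hypothetically) that each instruction class either runs on its own unit or that all classes compete for a single issue path. The function names and the two-class mix are illustrative and do not describe the real hardware:

```python
def throughput_independent(mix, rate):
    """Instructions per clock per SM if each class runs on its own unit in parallel.

    The class that needs the most cycles for its share of the mix sets the pace.
    """
    return 1.0 / max(mix[k] / rate[k] for k in mix)


def throughput_shared(mix, rate):
    """Instructions per clock per SM if all classes compete for one issue path."""
    return 1.0 / sum(mix[k] / rate[k] for k in mix)


# Mix described in the post for Ada Lovelace:
# about half multiply-add, half add/shift/compare.
mix = {"madd": 0.5, "other": 0.5}
rate = {"madd": 64, "other": 64}  # per-class throughput taken from the table

print(throughput_independent(mix, rate))  # 128.0
print(throughput_shared(mix, rate))       # 64.0
```

With a balanced mix, the independent-unit model reproduces the 64 + 64 = 128 figure; an unbalanced mix lowers it, which is one possible reading of the Blackwell observation.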

Yves Gallot Project donor
Joined: 19 Aug 12
Posts: 969
ID: 164101
Credit: 310,512,294
RAC: 65
Message 181005 - Posted: 10 May 2025 | 13:32:49 UTC - in response to Message 180984.
Last modified: 10 May 2025 | 13:34:59 UTC

> This is what the data I have on hand looks like.
> RTX5090 GPU utilization [...]

"GPU utilization" is not very significant because it is the percentage of time that the GPU was executing at least one kernel.
For an application running on one core, reported GPU utilization is 100% and real GPU utilization is 1/21760 = 0.0046%.
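
To illustrate the difference, here is a small sketch under a simplified model: "reported" utilization is the fraction of time samples with at least one core busy, and "real" utilization is the average fraction of busy cores. The function names and the sampling model are hypothetical; only the 21760 core count comes from the post.

```python
CUDA_CORES = 21760  # RTX 5090 core count used in the post


def reported_utilization(active_cores_per_sample):
    """Fraction of samples in which at least one core was busy (time-based metric)."""
    busy = sum(1 for n in active_cores_per_sample if n > 0)
    return busy / len(active_cores_per_sample)


def real_utilization(active_cores_per_sample, total_cores=CUDA_CORES):
    """Average fraction of cores that were actually busy."""
    return sum(active_cores_per_sample) / (len(active_cores_per_sample) * total_cores)


# An application that keeps exactly one core busy all the time.
samples = [1] * 1000
print(f"reported: {reported_utilization(samples):.0%}")  # reported: 100%
print(f"real:     {real_utilization(samples):.4%}")      # real:     0.0046%
```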

I have just compared the number of integer operations per second to the peak performance of the 5090.
The number of integer operations is based on the number of machine code instructions generated with driver 576.28 and the current version of genefer.
GFN    Inst. count   elapsed   TIOPS   PEAK: 52.38 TIOPS
16     1.28E+14           39    3.28    6.26%
17     5.28E+14           81    6.51   12.43%
18     2.02E+15          200   10.12   19.33%
19     7.78E+15          498   15.63   29.83%
20     3.00E+16         1357   22.11   42.20%
21     1.22E+17         4788   25.50   48.68%
DYFL   1.11E+18        44806   24.78   47.31%

Reaching about half of the peak performance for GFN-21 and DYFL is a very good score, because memory bandwidth also has to be taken into account.
The RTX 5090 is oversized for GFN-16 and GFN-17; this was already noticeable from the power usage.
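
The TIOPS and percent-of-peak columns can be recomputed from the instruction counts and elapsed times. A minimal sketch, assuming the elapsed times are in seconds (the post does not state the unit) and using the rounded instruction counts from the table, so the results may differ from the posted values in the last digit:

```python
PEAK_TIOPS = 52.38  # peak figure given in the post

# (name, instruction count, elapsed time) copied from the table.
# The time unit is assumed to be seconds.
rows = [
    ("16", 1.28e14, 39),
    ("17", 5.28e14, 81),
    ("18", 2.02e15, 200),
    ("19", 7.78e15, 498),
    ("20", 3.00e16, 1357),
    ("21", 1.22e17, 4788),
    ("DYFL", 1.11e18, 44806),
]


def tiops(inst_count, elapsed_s):
    """Tera integer operations per second."""
    return inst_count / elapsed_s / 1e12


for name, count, elapsed in rows:
    t = tiops(count, elapsed)
    print(f"{name:>4}  {t:6.2f} TIOPS  {t / PEAK_TIOPS:7.2%} of peak")
```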

Rafael
Volunteer tester
Joined: 22 Oct 14
Posts: 988
ID: 370496
Credit: 895,812,757
RAC: 714,175
Message 181006 - Posted: 10 May 2025 | 14:21:51 UTC - in response to Message 181005.

Now we just need to optimize the sieve app too, yes? Surely™ we can do something about it to harness these new cards.

William F. Garnett III
Joined: 5 Jan 25
Posts: 11
ID: 1790638
Credit: 6,977,085
RAC: 94,881
Message 181617 - Posted: 16 Jun 2025 | 16:04:18 UTC

Nvidia has finally responded in the Blackwell Integer thread.

From employee mjoux

Hi, I’m part of the team responsible for the instruction throughput table in the CUDA programming guide, and I wanted to give you an update on this thread from my point of view:

First, thank you all for raising these issues!
With CUDA 13.0, we will make significant changes to the table: (1) it will be moved to the CUDA best practices guide as it is less relevant to the programming model and (2) it will be re-structured, and will have example PTX instructions which will hopefully provide a little more clarity, although it is obviously far from perfect.

Concerning Blackwell integer instruction throughput specifically:
As pointed out in the thread already, some instructions have not been improved/changed: this applies to IMAD, LOP3, PRMT for example, as well as IADD3.
The main improved instructions relevant to this thread are IADD, IMNMX/VIMNMX, FSETP/ISETP: addition of 2 operands and min/max/compare.
Concerning integer addition specifically, it is even more complicated: previous architectures already had the possibility to achieve 2x throughput by combining e.g. IADD3 with IMAD.IADD or VIADD (for 9.0/10.0). Blackwell 12.0 now allows achieving this 2x throughput with a single instruction: IADD. But note that it can be difficult to get a sequence of instructions achieving higher throughput: for previous architectures, this is because of constraints in the instructions as well as compiler, which we cannot disclose publicly. For Blackwell 12.0, the compiler often outputs IADD3 instead of IADD: this should be improved soon.

Because of the above, it is unlikely that current benchmarks are able to achieve this 2x throughput, unless they are hand-crafted without relying on the compiler.
Note that this is the case for some other entries in the instruction throughput table: it only lists the theoretical maximum throughput, but we cannot always disclose how to precisely achieve this in practice.

If you have further questions, I can try answering them.



https://forums.developer.nvidia.com/t/blackwell-integer/320578/136
https://docs.nvidia.com/cuda/cuda-c-programming-guide/#maximize-instruction-throughput

Copyright © 2005 - 2025 Rytis Slatkevičius and PrimeGrid community.
Generated 3 Sep 2025 | 17:34:32 UTC