PrimeGrid

Message boards : Generalized Fermat Prime Search : Genefer performance in relation to PCI Express bus bandwidth

Aionel
Joined: 31 Mar 23
Posts: 2
ID: 1573519
Credit: 68,759,909
RAC: 2,157,108
Message 161786 - Posted: 17 Apr 2023 | 16:03:32 UTC

I was wondering how much PCI Express bus bandwidth affects the performance of the Genefer GPU application.

It probably depends on how often the CPU feeds the GPU new work while collecting the results of previous work.

You see, I am thinking of reusing one of those former cryptocurrency miners as a dedicated BOINC machine. Now that mining most currencies is no longer profitable, such GPU-based miners can be bought for a small fraction of the original cost of their parts. They are usually built with 6 to 12 GPUs. The RTX 3060 Ti seems to be a popular model here, and it still has decent computing power. Yet those cards always use risers, which act both as extenders and as x16-to-x1 adapters, so they can be connected to the motherboard's x1 PCI Express slots. Otherwise they would not fit next to each other when plugged directly into the motherboard. So communication with the CPU, RAM, etc. is either 16 times slower (if the x1 slots on the motherboard still run as 3.0) or roughly 64 times slower (if the x1 slots are switched to 1.0 speed, which seems to be popular because it improves stability when running multiple GPUs) than it would be in regular x16 PCI Express 3.0 slots.
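
For anyone who wants to check the 16x and 64x figures, here is a rough sketch of the arithmetic in Python. The per-lane transfer rates and encoding overheads are the published PCIe values; the function and table names are made up for this illustration, and the results are theoretical one-direction peaks, not measured throughput.

```python
# Per-lane PCIe figures: transfer rate in GT/s and line-encoding efficiency
# (8b/10b for generations 1 and 2, 128b/130b for generations 3 and 4).
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def link_bandwidth_mb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in MB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s times efficiency gives usable Gbit/s per lane; divide by 8 for bytes.
    return gt_per_s * efficiency / 8 * 1000 * lanes


if __name__ == "__main__":
    reference = link_bandwidth_mb_s(3, 16)  # regular x16 PCIe 3.0 slot
    for gen in (3, 2, 1):
        riser = link_bandwidth_mb_s(gen, 1)  # x1 riser at this generation
        print(f"gen {gen} x1: {riser:7.1f} MB/s, "
              f"{reference / riser:5.1f}x slower than gen 3 x16")
```

By this estimate a gen 1 x1 link works out to about 250 MB/s, roughly 63 times less than a gen 3 x16 link, so the "64 times" figure is a fair approximation.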

Crun-chi
Project donor
Volunteer tester
Joined: 25 Nov 09
Posts: 3247
ID: 50683
Credit: 152,646,050
RAC: 18,212
Message 161787 - Posted: 17 Apr 2023 | 17:16:16 UTC - in response to Message 161786.

All my GFN16 and GFN17 finds (to date) were made on GPUs attached to risers, so I don't have any problem with that. The run time is the same whether the card sits in a PCIe x16 slot on the motherboard or in an x1 slot on a riser.
I also use my miner for that purpose: a fresh install of Linux, a little tuning with nvidia-smi for lower power consumption, and that is that :)
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie!

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Joined: 21 Jan 10
Posts: 14037
ID: 53948
Credit: 477,051,011
RAC: 285,770
Message 161789 - Posted: 17 Apr 2023 | 18:57:08 UTC - in response to Message 161786.

The interface bandwidth is important for gaming. Probably useful for some other things.

But like mining, it doesn't affect our apps much. The data transfers between CPU and GPU are relatively small and infrequent. You could probably replace the PCIe x16 connection with a carrier pigeon without slowing down the app. :)
____________
My lucky number is 75898^524288+1

Aionel
Joined: 31 Mar 23
Posts: 2
ID: 1573519
Credit: 68,759,909
RAC: 2,157,108
Message 162555 - Posted: 14 May 2023 | 15:19:24 UTC

I was finally able to test this, and it seems that the performance of the Genefer GPU application actually is somewhat affected by PCIe bus bandwidth, although this may only be the case for newer, faster GPUs.

I've got a machine with 4 x RTX 4070 Ti GPUs installed. It uses a Ryzen 5950X CPU on an Asus TUF Gaming B550 Plus motherboard. This motherboard has one PCIe 4.0 x16 slot, one PCIe 3.0 slot that is physically x16 but runs at x4 speed (the other lanes are not connected to anything), and three PCIe 3.0 x1 slots. The x1 slots share lanes with the second x16 slot, so if any of the x1 slots is used, the second x16 slot runs at x1 rather than x4 speed. All of the GPUs are connected via standard powered mining risers: a small PCB is inserted into each PCIe slot on the motherboard, and a USB 3.0 cable connects it to a larger PCB that has a power connector and a regular x16 PCIe slot holding the GPU.

I couldn't get this machine to boot correctly at first. As advised in many cryptocurrency miner building guides, I connected only one GPU directly to the motherboard, and it finally booted. Then, in the BIOS, I changed the mode (generation) of each PCIe slot to 1.0 and enabled 4G decoding. This allowed the machine to boot every time with all GPUs connected via risers.

I tested performance while running 4 GFN-19 tasks at the same time, with no other CPU tasks running to interfere. Each GPU ran its task at a slightly different speed and with different power consumption. GPU core usage was in the 70-75% range and power consumption in the 150-220 W range (this GPU type is rated at 285 W), which resulted in task run times of up to about 40 minutes. [Screenshot: all the test parameters at about 2/3 progress]

This was far slower than the average time of 18 minutes per task reported in this thread.

Thinking it might be related to the PCIe slots running in 1.0 mode, I changed the mode to 2.0 for all PCIe slots and, contrary to what most guides suggest, the machine still booted correctly every time. I ran the test again. This time GPU core usage was in the 75-85% range and power consumption in the 190-225 W range, which resulted in task run times of up to about 28 minutes. Better, but still slower than the 18 minutes I was expecting. [Screenshot: all the test parameters at about 2/3 progress]
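
To put the two runs side by side, the slowdown relative to the 18-minute reference can be worked out as below. This is only a sketch using the worst-case times quoted above; the variable and function names are illustrative.

```python
REFERENCE_MINUTES = 18  # average time per task quoted in the post

# Worst-case task times from the two test runs described in the post.
OBSERVED_MINUTES = {"PCIe 1.0 x1": 40, "PCIe 2.0 x1": 28}


def slowdown(observed: float, reference: float) -> float:
    """How many times longer a task took than the reference time."""
    return observed / reference


if __name__ == "__main__":
    for mode, minutes in OBSERVED_MINUTES.items():
        factor = slowdown(minutes, REFERENCE_MINUTES)
        print(f"{mode}: {minutes} min, {factor:.2f}x the reference time")
    # Gain from doubling the link speed (1.0 -> 2.0) on the same risers.
    gain = OBSERVED_MINUTES["PCIe 1.0 x1"] / OBSERVED_MINUTES["PCIe 2.0 x1"]
    print(f"1.0 -> 2.0 speedup: {gain:.2f}x")
```

Doubling the link speed from 1.0 to 2.0 cut the worst-case time by a factor of about 1.43 rather than 2, which may suggest the bus is a significant limit here but not the only one.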

In both test cases, none of the tasks ended with a computation error, and all of them were later validated correctly.

I couldn't get the machine to boot at all after trying to switch all PCIe slots to 3.0 mode. The riser cables probably cannot handle such signals.

It is interesting to note that this is not the case with every BOINC project, and it probably won't be the case with every application of each project either (I haven't tested any other PG applications on this machine yet). It probably depends on how often the CPU feeds the GPU new data to process. For example, with the Distributed.net RC5-72 client I was able to get the full rated speed that others report for this GPU, with GPU usage of 99% and power consumption of 280 W.

mackerel
Project donor
Volunteer tester
Joined: 2 Oct 08
Posts: 2652
ID: 29980
Credit: 570,442,335
RAC: 10,182
Message 162556 - Posted: 14 May 2023 | 15:49:19 UTC - in response to Message 162555.

You can try running GPU-Z, which shows "Bus Interface Load". As a quick example, I'm running a GFN-17 task and I'm seeing 1% bus load with a 4070 on a 3.0 x16 interface. If this value starts to move well away from zero, transfer latency might increase and reduce performance that way.
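
On Linux, where GPU-Z is not available, nvidia-smi can at least report the PCIe generation and lane width each card has negotiated. The sketch below assumes nvidia-smi is on the PATH and that the queried fields come back as plain numbers; the function names and the sample text in the usage note are illustrative, not output captured from a real machine.

```python
import csv
import io
import subprocess

# Query fields understood by nvidia-smi's --query-gpu option.
QUERY_FIELDS = "index,name,pcie.link.gen.current,pcie.link.width.current"


def parse_link_report(csv_text: str) -> list[dict]:
    """Turn 'index, name, gen, width' CSV lines into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != 4:
            continue  # skip blank or malformed lines
        index, name, gen, width = (field.strip() for field in record)
        rows.append({
            "index": int(index),
            "name": name,
            "gen": int(gen),      # negotiated PCIe generation
            "width": int(width),  # negotiated lane count
        })
    return rows


def query_link_report() -> list[dict]:
    """Run nvidia-smi and parse its report (needs an NVIDIA driver)."""
    output = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_link_report(output)
```

Calling `query_link_report()` on the rig would return one entry per GPU; `parse_link_report("0, NVIDIA GeForce RTX 4070 Ti, 2, 1\n")` gives a single entry with gen 2 and width 1. `nvidia-smi dmon -s t` can additionally show live PCIe receive and transmit throughput. The reported generation may drop while a GPU is idle, so it is best checked under load.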

Maybe you can get faster bus speeds if you get better risers. Many of the mining-era ones looked pretty bad, reusing things like physical USB connectors that were never intended for that use case.

Yves Gallot
Project donor
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 843
ID: 164101
Credit: 306,521,622
RAC: 5,385
Message 162560 - Posted: 14 May 2023 | 21:23:50 UTC - in response to Message 162555.

PCIe bus load depends on the GFN subproject.
If PCIe bus bandwidth is the problem, then testing GFN-20 or GFN-21 should increase the power consumption and reach 280 W.
