Message boards : Number crunching : Proth Prime Search (Sieve) - GPU performance
I would like to ask you to post WU computation times for different GPUs running the PPS CUDA sieve.
I think it would be interesting to see the results achieved by our power-hungry GPUs.
Anyway, in my case:
Microsoft Windows 7 Ultimate x64 Edition - NVIDIA GeForce GTX 260 (877MB) driver: 197.45 -> 17:52 min
I am not sure whether we also need to distinguish OS and CPU, so maybe somebody could elaborate on that...
____________
| |
|
|
Microsoft Windows 7 Ultimate x64 Edition - NVIDIA GeForce GTX 260 (877MB) driver: 197.45 -> 17:52 min
Microsoft Windows 7 Home Premium x64 Edition - NVIDIA GeForce GTX 275 (896MB) driver: 258.96 -> 14:50 min approx.
____________
35 x 2^3587843+1 is prime! | |
|
|
I would like to ask you to post WU computation times for different GPUs running the PPS CUDA sieve.
I think it would be interesting to see the results achieved by our power-hungry GPUs.
We should probably include GPU and Memory clock rates as well.
Ubuntu x64 PNY Verto GeForce GTX 275 (896MB) (GPU 713MHz Shader 1584MHz Memory 1283MHz) driver: 256.44 -> 13:30
EDIT: Added shader freq. Revised runtime - if BOINC is only running GPU tasks, runtime = 13:16; when BOINC is also running CPU tasks, runtime = 13:55.
____________
There's someone in our head but it's not us. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Windows 7 Professional x64, Intel Core2Quad Q6600 @ 2.4GHz, EVGA NVIDIA GTX 280 (621/1350/1134 core/shader/mem, 1GB RAM), driver 258.96, runtime: 15:59
EDIT: If you have a GTX 260, you should mention if you have the 192 or 216 shader version.
____________
My lucky number is 75898524288+1 | |
|
|
Microsoft Windows 7 Ultimate x64 Edition, Intel Core2Quad Q6600 @ 2.4GHz, NVIDIA GeForce GTX 260-216 (655/1404/1125 core/shader/mem, 896MB RAM) driver: 197.45 -> runtime: 17:52
____________
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
The main thing that drives performance is number of shaders (or alternatively number of multi-processors--e.g., old 9600 GSO has 12 mp's with 96 shaders total [i.e., 12 x 8 = 96]). Also important is the card generation (later = better), especially the new Fermi architecture vs. all older models. After that, the shader clock is the next most important. In general, core clock on NVidia cards has little effect on BOINC CUDA apps, and memory clock varies in importance from not at all to having a modest effect depending on the application. That said, below is a compiled table from my handful of cards as well as a rough comb through of the top thousand computers here at PG. With few exceptions, all times are avg.'s of 20 workunits. Since I do not have access to many of these cards, I cannot provide clocks.
All results are listed as follows: card type, speed (secs), speed (minutes), OS
FERMI
Nvidia GTX 460 403s 6.7m Win7 64-bit
GTX, GTS, & GT 2XX series
Nvidia GTX 285 847s 14.1m Darwin
Nvidia GTX 280 1,055s 17.6m Vista HP 64-bit
Nvidia GTX 275 869s 14.5m XP Pro 64-bit
Nvidia GTX 260 (216) 1,033s 17.2m Vista ULT 64-bit
Nvidia GTX 260 (192) 1,346s 22.4m Win7 64-bit
Nvidia GTS 250 1,282s 21.4m Win7 HP 64-bit
Nvidia GTX 260M 2,083s 34.7m Win7 HP 64-bit
Nvidia GT 240 2,150s 35.8m Win7 HP 64-bit
Nvidia GT 230 2,404s 40.1m Win7 HP 64-bit
Nvidia GT 220 4,354s 72.6m XP Pro 32-bit
GTS & GT 1XX series
Nvidia GT 120 6,472s 107.9m Darwin
9000 series
Nvidia 9800 GX2 1,547s 25.8m XP Pro 32-bit
Nvidia 9800 GTX+ 1,337s 22.3m Vista ULT 64-bit
Nvidia 9800 GT 1,554s 25.9m Win7 64-bit
Nvidia 9600 GSO 1,741s 29.0m XP Pro 32-bit
Nvidia 9600 GSO (512) 3,880s 64.7m Win7 ULT 64-bit
Nvidia 9500 GT 6,621s 110.4m Vista HP 32-bit
Nvidia 9400 GT (32 shader) 6,766s 112.8m Win7 Ent 64-bit
8000 series
Nvidia 8800 GTS (512) 1,305s 21.8m Win7 ULT 64-bit
Nvidia 8800 GT 1,544s 25.7m XP Pro 64-bit
Nvidia 8800 GTX 1,738s 29.0m Vista HP 64-bit
Nvidia 8800 GS 1,993s 33.2m XP Pro 32-bit
Nvidia 8800 GTS (340) 2,254s 37.6m Vista HP 32-bit
Nvidia 8600 GTS 6,347s 105.8m Win7 Ent 64-bit
Nvidia 8700M GT 7,136s 118.9m Vista ULT 64-bit
Nvidia 8500 GT 19,610s 326.8m XP Pro 32-bit
Nvidia 8400M GS 22,865s 381.1m Vista HP 32-bit
Quadro
Quadro FX 4800 1,311s 21.9m Darwin
Quadro FX 3800 1,334s 22.2m Server 2008 64-bit
Quadro FX 880M 5,154s 85.9m Win7 ULT 64-bit
Quadro FX 580 8,196s 136.6m XP Pro 64-bit
Quadro FX 1700 9,821s 163.7m XP Pro 64-bit
Some CPU times for reference (times per core):
Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Family 6 Model 26 Stepping 5] 8,931s 148.9m Vista HP 64-bit
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz [Family 6 Model 23 Stepping 10] 4,594s 76.6m Vista ULT 64-bit
AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ [AMD64 Family 15 Model 67 Stepping 3] 5,954s 99.2m Win7 Ent 64-bit
Intel(R) Core(TM)2 Duo CPU T8100 @ 2.10GHz [x86 Family 6 Model 23 Stepping 6] 9,880s 164.7m Vista HP 32-bit
Intel(R) Pentium(R) D CPU 3.73GHz [x86 Family 15 Model 6 Stepping 4] 26,633s 443.9m XP Pro 32-bit
Intel(R) Pentium(R) D CPU 3.00GHz [x86 Family 15 Model 4 Stepping 4] 27,168s 452.8m XP Pro 32-bit
Keep in mind that these are average times, not fastest times. Individual cards have some variability even in the same machine due to various issues such as workload, etc. Also, different manufacturers may use slight to notable variations on clock speeds for various cards (e.g., factory OC cards) and different models are more or less tolerant of user overclocks. The above table should be viewed as a general guide on speeds...some variation around the times listed is to be expected.
Also, while I have listed the OS for all table entries, note that the OS (and more specifically 64- vs. 32-bit) is only relevant for the CPU speeds. The OS should have minimal differences for GPU's, and what differences are observed will be mostly Windows vs. Mac vs. Linux (due to application version differences)...64- vs. 32-bit has no effect on GPU times.
____________
141941*2^4299438-1 is prime!
| |
|
|
Let the feeding frenzy begin or how long will PPS sieve survive with ALL the piranhas munching away. Time to get the stomachs full of WU's | |
|
|
CPU: Intel Core 2 Q9550 @ 3.6 GHz
GPU: NVIDIA GeForce GTX 260-192 (667/1438/1098 core/shader/mem, 896MB RAM) driver: 258.96
OS : Microsoft Windows Vista Home Premium x64 Edition Service Pack 2
GPU runtime: 19:20
CPU runtime: 64:00
____________
| |
|
BiBi Volunteer tester Send message
Joined: 6 Mar 10 Posts: 151 ID: 56425 Credit: 34,290,031 RAC: 0
                   
|
@Scott: Thanks for the overview; I cannot wait to put a 9500GT PCI in my 'old' system.
Does anyone know if I can use a 8400 GS PCI together with a 9500 GT PCI?
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Nvidia GTX 280 1,055s 17.6m Vista HP 64-bit
Nvidia GTX 260 (216) 1,033s 17.2m Vista ULT 64-bit
Nvidia GTX 260 (192) 1,346s 22.4m Win7 64-bit
Scott,
Something's funky with those numbers.
The 280 and both versions of the 260 are the same chip, varying only in how many shaders are enabled. (260's are essentially rebinned 280s that failed some of the validation tests.)
With the same chip and more shaders, the 280 has to be faster than either flavor of 260.
____________
My lucky number is 75898524288+1 | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Nvidia GTX 280 1,055s 17.6m Vista HP 64-bit
Nvidia GTX 260 (216) 1,033s 17.2m Vista ULT 64-bit
Nvidia GTX 260 (192) 1,346s 22.4m Win7 64-bit
Scott,
Something's funky with those numbers.
The 280 and both versions of the 260 are the same chip, varying only in how many shaders are enabled. (260's are essentially rebinned 280s that failed some of the validation tests.)
With the same chip and more shaders, the 280 has to be faster than either flavor of 260.
Yeah, I wasn't thrilled with those results, but since neither were my cards, I don't know the shader clock speeds. It may be that the 280 is slightly under-clocked and the 260 over-clocked...with 240 vs. 216 shaders being such a small difference speeds could well be fairly equal. Also, the 280 is a 65nm fab, and thus, has some heat issues that can limit stable shader clocks, whereas the 260-216 is a 55nm fab with less heat issues allowing for some OCing. I think the 260's numbers are fairly typical...anyone with a faster 280?
____________
141941*2^4299438-1 is prime!
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Yeah, I wasn't thrilled with those results, but since neither were my cards, I don't know the shader clock speeds. It may be that the 280 is slightly under-clocked and the 260 over-clocked...with 240 vs. 216 shaders being such a small difference speeds could well be fairly equal. Also, the 280 is a 65nm fab, and thus, has some heat issues that can limit stable shader clocks, whereas the 260-216 is a 55nm fab with less heat issues allowing for some OCing. I think the 260's numbers are fairly typical...anyone with a faster 280?
Possible, but I think the more likely culprit is the most obvious one -- the numbers just aren't comparing apples to apples, or the data is bad.
The 280, BTW, seems to have an enormous amount of headroom, temperature wise. They're not particularly prone to overheating, in part because they can run safely at temperatures as high as 105, according to NVIDIA. The BIOS on the card doesn't even start to ramp up the fan speed until 75 degrees, which was quite alarming to the early adopters of this card!!!
____________
My lucky number is 75898524288+1 | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Yeah, I wasn't thrilled with those results, but since neither were my cards, I don't know the shader clock speeds. It may be that the 280 is slightly under-clocked and the 260 over-clocked...with 240 vs. 216 shaders being such a small difference speeds could well be fairly equal. Also, the 280 is a 65nm fab, and thus, has some heat issues that can limit stable shader clocks, whereas the 260-216 is a 55nm fab with less heat issues allowing for some OCing. I think the 260's numbers are fairly typical...anyone with a faster 280?
Possible, but I think the more likely culprit is the most obvious one -- the numbers just aren't comparing apples to apples, or the data is bad.
The 280, BTW, seems to have an enormous amount of headroom, temperature wise. They're not particularly prone to overheating, in part because they can run safely at temperatures as high as 105, according to NVIDIA. The BIOS on the card doesn't even start to ramp up the fan speed until 75 degrees, which was quite alarming to the early adopters of this card!!!
Sorry to take a while getting back on this, but I am going to say that the data are fine. Looking at the cards in the top 1000 computers list, the GTX 280 and GTX 260 (216) times are fairly similar, with some of the latter being a bit slower. Most of both types of cards trend in the 1,050 second range, though a couple of both types have sub-1,000 second times. The big difference comes in the low end where a couple of the GTX 260 (216) cards are showing times in the 1,150 second range, whereas no 280's have times over 1,100 seconds.
The similarities in these times shouldn't come as too much of a surprise if one looks closely at these two models. There are only 24 fewer shaders in the 260, and the 55nm models probably have a modestly higher OC ceiling. The 280 is going to be slightly faster on average, but it looks like the 260-216 is quite competitive.
____________
141941*2^4299438-1 is prime!
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
The similarities in these times shouldn't come as too much of a surprise if one looks closely at these two models. There are only 24 fewer shaders in the 260, and the 55nm models probably have a modestly higher OC ceiling. The 280 is going to be slightly faster on average, but it looks like the 260-216 is quite competitive.
Interesting.
NVIDIA is notorious for branding obfuscation, and I'm wondering if there's a bit going on here.
The way I remember it, when I bought my 280 at the end of 2008, all three cards, the 280, the 260-192, and the 260-216 were all 65nm, and, in fact, all three were the exact same chip. The only difference was that the 260's were binned differently due to defects on the chip affecting some cores/shaders.
But, according to Wikipedia, there's also a 55nm, GT200b based version of the 260-216. So there's actually three versions of the GTX 260!
Either way, there are a lot more -216's out there than -192's, and if some, most, or all of them have 200b chips, then it's entirely reasonable to expect more OC headroom given the die shrink.
Yet, the data is what it is. Thanks for the research!
Mike
____________
My lucky number is 75898524288+1 | |
|
|
Hello.
This is my first post on this forum. :)
I've got 2 questions:
- How does the GTX 470 perform on PPS Sieve? It's much faster in games.
- Will there be other projects on the GPU (Sieve or LLR)? | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Hello.
This is my first post on this forum. :)
I've got 2 questions:
- How does the GTX 470 perform on PPS Sieve? It's much faster in games.
- Will there be other projects on the GPU (Sieve or LLR)?
One person with a 470 reported his sieve speed. It was (no surprise here) a lot faster than the 460. IIRC, on the "long" test, my 280 was taking about 35 seconds, the 460 was about 15 seconds, and the 470 was under 10 seconds (7 point something, I think.)
There's been some talk that some other sieves might be possible, so perhaps those will be coming along too. LLR seems less likely.
____________
My lucky number is 75898524288+1 | |
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 264,903,289 RAC: 105,711
                            
|
IIRC, on the "long" test, my 280 was taking about 35 seconds, the 460 was about 15 seconds, and the 470 was under 10 seconds (7 point something, I think.)
Note that the guy with the 470 had it overclocked. At stock speeds, the 470 appears to be about 33% faster than the 460.
There's been some talk that some other sieves might be possible, so perhaps those will be coming along too. LLR seems less likely.
I'm working towards an improved Cullen/Woodall sieve on all platforms, but kind of slowly. Here is the GPU FFT/LLR thread.
____________
| |
|
|
33% more power than the 460 is a great result.
I'm looking for a powerful card for gaming and for PG Sieve. I think the 470 will be good.
BTW, my other cruncher machine has a 5870; it would be nice to use it for PG too. ;)
____________
Polish National Team | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
BTW, my other cruncher machine has a 5870; it would be nice to use it for PG too. ;)
With an app_info.xml file, you can...see here for details of how to use the OpenCL test application.
____________
141941*2^4299438-1 is prime!
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
33% more power than the 460 is a great result.
I'm looking for a powerful card for gaming and for PG Sieve. I think the 470 will be good.
If you have an SLI capable motherboard, current thinking is that 2x GTX460 is the way to go. Two 460s have more crunching power -- and generally get higher frame rates in games -- than a single GTX480 (or anything else), and cost less as well.
One of the hardware websites recently did a multipage benchmark test comparing a pair of 460s against all the top of the line cards, and the 460s were less expensive and more powerful than any single card. Sorry, I don't remember where I saw this, but a little Googling should turn it up. I may have seen the link to it here on the PrimeGrid boards, come to think of it.
____________
My lucky number is 75898524288+1 | |
|
|
With an app_info.xml file, you can...see here for details of how to use the OpenCL test application.
I read this thread and it's a little too complicated for me. :(
If you have an SLI capable motherboard, current thinking is that 2x GTX460 is the way to go. Two 460s have more crunching power -- and generally get higher frame rates in games -- than a single GTX480 (or anything else), and cost less as well.
It's a good idea, but I am afraid of microstuttering that can occur in some games.
____________
Polish National Team | |
|
|
Mac OS X 10.6.4 (Darwin 10.4.0), NVIDIA GeForce GT 330M (forced), 256MB RAM --> 1:25:00 per task.
I have no idea why it uses "0.27 CPUs and 1.00 NVIDIA GPUs" to run a WU. Or why it won't run more than one at once. Or why it (now) won't get new ones.
Having read through this thread, it would appear this is the best I can get from my laptop card...
For reference, it's a new MacBook Pro with a 2.53GHz i5; runs 24/7; I force the NVIDIA over the GMA chip to 100% uptime. | |
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,667,132,788 RAC: 144,546
                     
|
33% more power than the 460 is a great result.
I'm looking for a powerful card for gaming and for PG Sieve. I think the 470 will be good.
If you have an SLI capable motherboard, current thinking is that 2x GTX460 is the way to go. Two 460s have more crunching power -- and generally get higher frame rates in games -- than a single GTX480 (or anything else), and cost less as well.
One of the hardware websites recently did a multipage benchmark test comparing a pair of 460s against all the top of the line cards, and the 460s were less expensive and more powerful than any single card. Sorry, I don't remember where I saw this, but a little Googling should turn it up. I may have seen the link to it here on the PrimeGrid boards, come to think of it.
I sent you a PM on this Mike ... :)
| |
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,667,132,788 RAC: 144,546
                     
|
A couple of simple questions: do you have to run an app_info file with the GTX Fermi cards, or just enable GPU work in your account? Also, does it make any difference in the speed of the WUs between, say, a 768MB or 1GB GTX 460? ... Thanks, Steve | |
|
|
A couple of simple questions: do you have to run an app_info file with the GTX Fermi cards, or just enable GPU work in your account? Also, does it make any difference in the speed of the WUs between, say, a 768MB or 1GB GTX 460? ... Thanks, Steve
There are at least 3 designs around:
The 768MB version has a narrower 192-bit memory interface.
Those with 1 or 2GB have the full 256-bit interface, which should be of no relevance for crunching - except that the 2GB version will consume more energy and run hotter.
And then, of course, there is a lot of factory overclocking around.
| |
|
|
A couple of simple questions: do you have to run an app_info file with the GTX Fermi cards, or just enable GPU work in your account? Also, does it make any difference in the speed of the WUs between, say, a 768MB or 1GB GTX 460? ... Thanks, Steve
You might need an app_info.xml to get work for Fermi-based cards even when you only want to use the stock application.
I don't know if it is already solved, but there is/was an issue with Fermi-based cards receiving no PPS sieve WUs a few days ago. Currently I'm crunching with a newer testing version (0.2.1a-cuda) which is way faster than the stock app.
GF104-based cards (GTX 460) may receive another hefty speed gain by simply compiling the 0.2.1a code with the CUDA 3.1 toolkit. At least one user over in the Mersenne forums reported a 60% speed gain by using a Linux version that he compiled with the CUDA 3.1 toolkit. It seems that the newer NVIDIA compilers can take advantage of the architectural changes that were introduced with the GF104 chips used in the GTX 460 card.
____________
| |
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,667,132,788 RAC: 144,546
                     
|
Thanks Frank & Ralf. So Ralf, are you saying you have to compile your own app version??? ... Thanks | |
|
|
Thanks Frank & Ralf. So Ralf, are you saying you have to compile your own app version??? ... Thanks
My results (a few days old):
Stock app + GTX 260 (Tesla) -> works
Stock app + GTX 460 (Fermi) -> No work available for whatever reason
Stock app + GTX 460 (Fermi) + app_info.xml -> works
Test app (0.2.0) + GTX 260 (Tesla) -> works even faster
Test app (0.2.1a) + GTX 460 (Fermi) + app_info.xml -> works even faster
You can get the app from Ken-g6 here (no need to compile anything).
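For anyone setting this up for the first time, here is a rough sketch of what an anonymous-platform app_info.xml for the CUDA sieve generally looks like. Treat it as an illustration only: the executable name, app name and version number below are placeholders filled in for the example, so substitute the actual file you downloaded and the app name/version that PrimeGrid currently reports for PPS (Sieve) (your client_state.xml shows both).

<app_info>
  <app>
    <name>pps_sr2sieve</name>                      <!-- project app name; check your client_state.xml -->
  </app>
  <file_info>
    <name>tpsieve-cuda-x86_64-windows.exe</name>   <!-- placeholder: the exact file you downloaded -->
    <executable/>
  </file_info>
  <app_version>
    <app_name>pps_sr2sieve</app_name>
    <version_num>136</version_num>                 <!-- placeholder: match the current server version -->
    <avg_ncpus>0.2</avg_ncpus>                     <!-- rough CPU reservation per GPU task -->
    <max_ncpus>1</max_ncpus>
    <plan_class>cuda</plan_class>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>tpsieve-cuda-x86_64-windows.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>

The file goes into your PrimeGrid project directory under the BOINC data directory, and BOINC needs a restart afterwards. Keep in mind that changing app_info.xml usually invalidates any WUs already in your cache, so finish or report them first.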
____________
| |
|
Menipe Volunteer tester Send message
Joined: 2 Jan 08 Posts: 235 ID: 17041 Credit: 113,010,371 RAC: 723
                       
|
Can you post the app_info.xml file you are using?
I can't seem to get my GTX 470 to get any work and I have not been successful getting it to work with an app_info.xml file.
____________
| |
|
LookAS Volunteer tester Send message
Joined: 19 Apr 08 Posts: 38 ID: 21649 Credit: 354,890,618 RAC: 0
                      
|
I am using the app_info.xml from this post with the 'latest' 0.2.1a app from here and it works fine (except that I obviously cannot compute anything else on PG).
Anyway, in my case with an OCed GTX470 (760MHz GPU), Windows 7 Ultimate 64-bit, driver 260.63 -> 01:46 min
____________
| |
|
|
(...)
(except that I obviously cannot compute anything else on PG)
(...)
You could always put the other apps from a normal client_state.xml into your app_info.xml and go on crunching other things on PrimeGrid. | |
|
LookAS Volunteer tester Send message
Joined: 19 Apr 08 Posts: 38 ID: 21649 Credit: 354,890,618 RAC: 0
                      
|
(...)
(except that I obviously cannot compute anything else on PG)
(...)
You could always put the other apps from a normal client_state.xml into your app_info.xml and go on crunching other things on PrimeGrid.
I did not have time to finish it. Thanks for pointing me in the right direction ;) | |
|
|
GTX 285 (no OC) + Q9650@3.6GHz + Win7 x64 Pro (0.66 CPU + 1.00 GPU): 1,161.20 sec (19:21)
Hm, performance looks bad compared to the others :-( | |
|
|
GTX 285 (no OC) + Q9650@3.6GHz + Win7 x64 Pro (0.66 CPU + 1.00 GPU): 1,161.20 sec (19:21)
Hm, performance looks bad compared to the others :-(
WUs (app, lengths and exponent ranges) were changed multiple times in the last weeks. Your card is certainly not underperforming ;)
____________
| |
|
|
And what about the high CPU usage? GPUGRID uses only 0.1 CPUs.
Which clock rate do I need to OC to increase the calculation speed?
Or are there any "good" settings for a GTX 285? The default is: GPU: 648MHz, Memory: 1242MHz, Shader: 1476MHz | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
And what about the high CPU usage? GPUGRID uses only 0.1 CPUs.
The "High CPU Usage" you're referring to is a reference number, not actual usage (and is completely meaningless unless you're running multiple GPUs). Your actual CPU usage by this app should be, and almost certainly is, less than 5% of ONE core.
As others have said, you can't compare the older results in this thread to what you're seeing today. The WUs and the software have changed substantially. You came in just under 20 minutes for that WU; I'm seeing between 19 and 23 minutes on my GTX 280. There's some variability in the WU run times, but your run time seems to be exactly where it should be.
____________
My lucky number is 75898524288+1 | |
|
|
Hi all, I see that there is an ATI version too, but no results for ATI... Is double precision needed (58xx cards?), or is a 5770 good enough?
JHAPA
PS: benchmark NVS 2100M - avg 16,200 s per WU
____________
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
PG needs only Single precision.
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
For information:
Windows 7 64-bit + ATI 5850 @ 825MHz GPU, RAM @ 1000MHz, drivers 10.10
1270 s, around 21 min 10 s | |
|
|
7 64 + I7 930 + GTX 480 = 400 seconds :-)
____________
Intel I7 930 - GTX 480 - Windows 7 64
Join BOINC Synergy, the best team in the galaxy! | |
|
tng Send message
Joined: 29 Aug 10 Posts: 499 ID: 66603 Credit: 50,763,972,894 RAC: 31,346,729
                                                    
|
7 64 + I7 930 + GTX 480 = 400 seconds :-)
Some more data points. Note that all of these systems are using all CPU cores for CPU crunching, in addition to the GPUs. Leaving a core free is known to increase GPU productivity in some cases.
Server 2008 64-bit (compare to Vista), dual Xeon 5520 (HT enabled), GTX260-216 (standard clocks): 1500 seconds
Windows 7 64-bit, Dual Xeon 5520 (HT enabled), GTX470 (standard clocks): 515 seconds
Windows 7 64-bit, dual Xeon 5345, GTX 480 (standard clocks): 395 seconds
Windows 7 64-bit, Core I7 930 (HT enabled), GTX580 (standard clocks): 330 seconds
Fermi cards are much faster on these tasks. The 470 is the best bang for the buck right now, but not by all that much. The 480 and 580 are pretty close in compute power/$, but the 580s seem to be limited in availability (all out of stock at Newegg). | |
|
|
Name: pps_sr2sieve_4265066_1
Workunit: 143973759
Computer ID: 171590
Run time: 603.31 s (10:03 minutes)
CPU time: 21.42
Credit: 2,314.00
Name: pps_sr2sieve_4269628_0
Workunit: 143978321
Computer ID: 171590
Run time: 1,779.91 s (29:39 minutes)
CPU time: 21.88
Credit: 2,314.00
Perhaps somebody here can explain to me the difference in run time after increasing the shader clock of my EVGA GTX 460 768MB SC.
The first WUs were ready in 10-11 minutes while the latest took 20-30 minutes, even though I had increased the shader clock from 1580 to 1820.
I don't get it.
Phenom II X4 945 / 2GB / EVGA GTX 460 768MB SC / XP x32 / 260.99 | |
|
|
Another topic: how to get PPS sieve for ATI... I tried to get it, but only CUDA WUs are coming (for my CUDA computer)... Or is the problem that I have a full GPU buffer? Or something else? Any idea?
JHAPA
____________
| |
|
|
Just got a dual GTX260 machine set up for CUDA crunching for PrimeGrid.
It gives my computer serious graphical lag (such as in the Windows 7 GUI) when GPU crunching is enabled; I've never had those issues while crunching Collatz, GPUGRID, and Folding@Home.
Can someone give me an explanation?
System Specs:
Phenom II X4 920 @ 3.5GHz
2x GTX260 @ 626/1512/1000
4GB memory
Cards runs at 65C 24/7
Also crunches WCG 24/7 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Can someone give me an explanation?
It's a feature, not a bug. Seriously.
The Holy Grail of GPU computing is having an application that's so efficient that you have the GPU running at 100% utilization. Most apps, most good apps, can only drive the GPU at around 80% efficiency, more or less. That leaves enough of the GPU unused to do its normal screen stuff for the user's GUI.
The PPS-Sieve app is insanely efficient. On most systems it's capable of driving the GPU at 98 to 99 percent utilization. The downside of that efficiency is that there's not enough left over to drive the display without the lag being noticeable.
____________
My lucky number is 75898524288+1 | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Can someone give me an explanation?
It's a feature, not a bug. Seriously.
The Holy Grail of GPU computing is having an application that's so efficient that you have the GPU running at 100% utilization. Most apps, most good apps, can only drive the GPU at around 80% efficiency, more or less. That leaves enough of the GPU unused to do its normal screen stuff for the user's GUI.
The PPS-Sieve app is insanely efficient. On most systems it's capable of driving the GPU at 98 to 99 percent utilization. The downside of that efficiency is that there's not enough left over to drive the display without the lag being noticeable.
That's part of it, but since the Collatz application often runs at that level of efficiency, then the lag must be in part due to something else.
____________
141941*2^4299438-1 is prime!
| |
|
mfbabb2 Volunteer tester
 Send message
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,975,644 RAC: 311
                     
|
Not necessarily. When I run Collatz, I see a noticeable lag (but I do not have a very high powered GPU either). Maybe what you are seeing is the difference between 90% and 98%.
____________
Murphy (AtP)
| |
|
|
Microsoft Windows 7 Pro x64
NVIDIA GeForce GTX 285 (1024MB - 720/1639/1242 MHz) driver: 260.99
=> 1030s (17:10m) for 2314 Credits (= 134 c/m)
And yes I have the GUI lag too but GPU usage is 98-99% :-)
I ordered a GTX 460, hope I will get it this weekend!
____________
Have a N.I.C.E. day!
| |
|
|
Can someone give me an explanation?
It's a feature, not a bug. Seriously.
The Holy Grail of GPU computing is having an application that's so efficient that you have the GPU running at 100% utilization. Most apps, most good apps, can only drive the GPU at around 80% efficiency, more or less. That leaves enough of the GPU unused to do its normal screen stuff for the user's GUI.
The PPS-Sieve app is insanely efficient. On most systems it's capable of driving the GPU at 98 to 99 percent utilization. The downside of that efficiency is that there's not enough left over to drive the display without the lag being noticeable.
That's part of it, but since the Collatz application often runs at that level of efficiency, then the lag must be in part due to something else.
Yeah, I am aware of its efficiency, but other GPU applications also run my GPU like there's no tomorrow. Collatz and F@H frequently run my cards at 98-99%. Oh also, the lag is not just graphical, it actually lags the whole system. I am not being critical or anything, just trying to point it out. :)
By the way, is there any setting where I can tell the CUDA application not to reserve 0.70 CPUs per work unit? I am asking because I don't notice the PrimeGrid WUs using much of the core. I would be fine with something like 0.30 CPU reserved, since I would like to free up a core to let the WCG app run. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Oh also, the lag is not graphical, it actually lags the whole system. I am not being critical or anything, just trying to point it out. :)
I'm pretty certain that the lag on my system is only related to screen manipulations. Other processing definitely isn't being affected. (Most conclusive evidence of this is that background CPU BOINC processing doesn't get slowed down.)
So maybe there's something else going on in your system. Or it's simply a case of YMMV; different combinations of CPU, GPU, drivers, and OS are likely to behave differently. This is still a learning process for everyone.
By the way, is there any setting where I can tell the CUDA application not to reserve 0.70 CPUs per work unit? I am asking because I don't notice the PrimeGrid WUs using much of the core. I would be fine with something like 0.30 CPU reserved, since I would like to free up a core to let the WCG app run.
Are you running more than one GPU in your computer? If not, this shouldn't be an issue. If you're running two GPUs, then BOINC will think you need 1.4 CPU cores just for the GPUs, and not run any CPU tasks on a core. But as long as the total that BOINC thinks it needs is less than 1.0, it doesn't have any effect on what gets run, or how much. Actual CPU usage is less than 5% of one core. The estimate really only seems to have any effect if it totals more than 1.0.
If you are running more than one GPU, and BOINC is idling a core because of that, the only way I know of correcting that is with app_info.
____________
My lucky number is 75898524288+1 | |
|
|
If somebody is interested;
Microsoft Windows 7 Home x86 Edition - HD5770 ( 900/1200 ) driver: 10-11 -> 33,15 min approx. ( powered by E5200 3,4Ghz )
http://www.primegrid.com/result.php?resultid=205281011
Darwin 10.6.4 - GTX260 ( 216 Shader ): Cuda Driver 3.2 ->
25,35 min ( powered by 2x Xeon 5520 2,33Ghz )
http://www.primegrid.com/result.php?resultid=205132162
____________
Public Energy -Crunch da Power- | |
|
|
Are you running more than one GPU in your computer? If not, this shouldn't be an issue. If you're running two GPUs, then BOINC will think you need 1.4 CPU cores just for the GPUs, and not run any CPU tasks on a core. But as long as the total that BOINC thinks it needs is less than 1.0, it doesn't have any effect on what gets run, or how much. Actual CPU usage is less than 5% of one core. The estimate really only seems to have any effect if it totals more than 1.0.
If you are running more than one GPU, and BOINC is idling a core because of that, the only way I know of correcting that is with app_info.
Yeah, you are right; I am running two GTX 260s in my system right now. Can you give me a hint on how to set the work units to free up a core for me? | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Are you running more than one GPU in your computer? If not, this shouldn't be an issue. If you're running two GPUs, then BOINC will think you need 1.4 CPU cores just for the GPUs, and not run any CPU tasks on a core. But as long as the total that BOINC thinks it needs is less than 1.0, it doesn't have any effect on what gets run, or how much. Actual CPU usage is less than 5% of one core. The estimate really only seems to have any effect if it totals more than 1.0.
If you are running more than one GPU, and BOINC is idling a core because of that, the only way I know of correcting that is with app_info.
Yeah, you are right; I am running two GTX 260s in my system right now. Can you give me a hint on how to set the work units to free up a core for me?
Ok, understand that I don't have this problem myself, so I'm going on general knowledge plus what others have said.
You have two options here, making a change to app_info.xml, or making a change to cc_config.xml. Both have drawbacks, but cc_config is IMHO the better (easier + safer) way to go. (WARNING: One of the reasons cc_config is a lot safer to use is that changing app_info usually crashes any WUs in your cache. So you should clear out your cache before making any changes to app_info. Cc_config doesn't usually cause that kind of problem.)
Cc_config.xml can be used to tell BOINC you have one more CPU core than you actually have. So if you have a 4 core system, BOINC will think you have 5 cores, and reserve one of them for the GPUs, and run CPU tasks on the "remaining" 4 cores. The only disadvantage of this approach is that if the GPU isn't running or the CPU estimate on GPU tasks drops below 0.5, BOINC is going to run 5 CPU tasks on your system. That will work, but will be a bit less efficient because the cores will be constantly task switching.
You'll want to put <ncpus>N</ncpus> (where N is one greater than the number of physical CPUs) into the <options> section in your cc_config file. You may need to create the file if it doesn't exist. Details can be found here.
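As a concrete sketch (assuming the quad-core, dual-GPU setup discussed above), the whole file would look something like this:

<cc_config>
  <options>
    <!-- report one more core than physically present, so four CPU tasks keep
         running while the fifth, fictitious core covers the two GPU tasks -->
    <ncpus>5</ncpus>
  </options>
</cc_config>

Save it as cc_config.xml in the BOINC data directory, then restart BOINC (or have it re-read its config files) for the change to take effect.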
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
There's a third way, too. You could run a VM (Virtual Machine) to run the CPU tasks. I still recommend cc_config as the best approach, however.
____________
My lucky number is 75898524288+1 | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Cc_config.xml can be used to tell BOINC you have one more CPU core than you actually have. So if you have a 4 core system, BOINC will think you have 5 cores, and reserve one of them for the GPUs, and run CPU tasks on the "remaining" 4 cores. The only disadvantage of this approach is that if the GPU isn't running or the CPU estimate on GPU tasks drops below 0.5, BOINC is going to run 5 CPU tasks on your system. That will work, but will be a bit less efficient because the cores will be constantly task switching.
You'll want to put <ncpus>N</ncpus> (where N is one greater than the number of physical CPUs) into the <options> section in your cc_config file. You may need to create the file if it doesn't exist. Details can be found here.
This does not work on all machines. For example, I have a Pent D 965 Extreme Edition (dual-core with HT for four threads) with dual 9600 GSO GPUs. Default it runs 3 CPU tasks alongside the 2 GPU tasks. Adding <ncpus>5</ncpus> to the cc_config file doesn't change this in any way. My guess is that the <ncpus>N</ncpus> option does not handle HT properly, but this is only a guess.
____________
141941*2^4299438-1 is prime!
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Maybe a bug in 6.10.18?
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Probably not...I upgraded from 6.10.18 to 6.10.58 and the issue remains...
____________
141941*2^4299438-1 is prime!
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Sample WU times
(Please post your current WU times in the Proth Prime Search (Sieve) - GPU performance thread. Yes, CPU times are welcomed as well.)
- 330s (5:30) : GTX 580 (standard clocks), Windows 7 64-bit, Core I7 930 (HT enabled)
- 395s (6:35) : GTX 480 (standard clocks), Windows 7 64-bit, dual Xeon 5345
- 400s (6:40) : GTX 480, Windows 7 64 + I7 930
- 515s (8:35) : GTX 470 (standard clocks), Windows 7 64-bit, Dual Xeon 5520 (HT enabled)
- 966s (16:06) : GTS 450 (Factory OC, 1850 shader clock), Windows 7 64-bit, Core I7 860 (HT enabled)
- 1030s (17:10) : GTX 285 (1024MB - 720/1639/1242 MHz) driver: 260.99, Windows 7 Pro x64
- 1161s (19:21) : GTX 285 (no OC), Windows 7 x64 pro + Q9650@3.6GHz (0.66 CPU + 1.00GPU)
- 1270s (21:10) : ATI 5850 @825Mhz GPU, RAM @ 1000Mhz drivers 10.10, Windows 7 64 bits
- 1500s (25:00) : GTX 260 - 216 (standard clocks), Windows Server 2008 64-bit (compare to Vista), dual Xeon 5520 (HT enabled)
- 1521s (25:21) : GTX 260 (216 Shader ): Cuda Driver 3.2, Darwin 10.6.4 -> (powered by 2x Xeon 5520 2,33Ghz)
- 1924s (32:06) : 9800 GTX+ (standard clocks), Windows Vista Ultimate 64-bit, Core2 E8400
- 1989s (33:09) : HD5770 (900/1200) driver: 10-11, Windows 7 Home x86 Edition - powered by E5200 3,4Ghz
- 2118s (35:18) : GTS 240 (standard clocks), Windows 7 64-bit, Core I7 975 (HT enabled)
- 2332 (38:54) : 9600 GSO (Factory OC, 1750 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 2664 (44:24) : 8800 GS (Manual OC, 1530 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 3105s (51:48) : GT 240 (standard clocks), Windows Vista 32-bit, Q6600
- 6897s (1:54:57) : 9600 GS (standard clocks), Windows 7 64-bit, Core I7 860 (HT enabled)
- 6962s (1:56:02) : 9500 GT (Factory OC, 1750 shader clock), Windows Vista Ultimate 64-bit, Core2 E8400
- 7552s (2:05:52) : HD 4670 (standard clocks), Windows XP 32-bit, Athlon 64 x2 4200+ (socket 939)
- 9354s (2:35:54) : 9400 GT (standard clocks, 32 shader version), Linux 64-bit, Q6700
- 9460s (2:37:40) : 8600 GT (standard clocks), Windows XP 32-bit, Pentium D 830
- 27262s (7:34:22) : 8400 GS (Manual OC, 1014 shader clock), Windows XP 32-bit, Pentium 4 3.6Ghz (HT enabled)
- 30082s (8:21:22) : 8400M GS (standard clocks), Windows Vista 32-bit, T8100
Updated with some slower/older cards.
____________
141941*2^4299438-1 is prime!
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Sample WU times
(Please post your current WU times in the Proth Prime Search (Sieve) - GPU performance thread. Yes, CPU times are welcomed as well.)
- 330s (5:30) : GTX 580 (standard clocks), Windows 7 64-bit, Core I7 930 (HT enabled)
- 395s (6:35) : GTX 480 (standard clocks), Windows 7 64-bit, dual Xeon 5345
- 400s (6:40) : GTX 480, Windows 7 64 + I7 930
- 515s (8:35) : GTX 470 (standard clocks), Windows 7 64-bit, Dual Xeon 5520 (HT enabled)
- 966s (16:06) : GTS 450 (Factory OC, 1850 shader clock), Windows 7 64-bit, Core I7 860 (HT enabled)
- 1030s (17:10) : GTX 285 (1024MB - 720/1639/1242 MHz) driver: 260.99, Windows 7 Pro x64
- 1161s (19:21) : GTX 285 (no OC), Windows 7 x64 pro + Q9650@3.6GHz (0.66 CPU + 1.00GPU)
- 1270s (21:10) : ATI 5850 @825Mhz GPU, RAM @ 1000Mhz drivers 10.10, Windows 7 64 bits
- 1360s (22:40): GTX 280 (Factory OC), Windows 7 x64 Pro, C2Q Q6600 @2.4 GHz
- 1500s (25:00) : GTX 260 - 216 (standard clocks), Windows Server 2008 64-bit (compare to Vista), dual Xeon 5520 (HT enabled)
- 1521s (25:21) : GTX 260 (216 Shader ): Cuda Driver 3.2, Darwin 10.6.4 -> (powered by 2x Xeon 5520 2,33Ghz)
- 1924s (32:06) : 9800 GTX+ (standard clocks), Windows Vista Ultimate 64-bit, Core2 E8400
- 1989s (33:09) : HD5770 (900/1200) driver: 10-11, Windows 7 Home x86 Edition - powered by E5200 3,4Ghz
- 2118s (35:18) : GTS 240 (standard clocks), Windows 7 64-bit, Core I7 975 (HT enabled)
- 2332 (38:54) : 9600 GSO (Factory OC, 1750 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 2664 (44:24) : 8800 GS (Manual OC, 1530 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 3105s (51:48) : GT 240 (standard clocks), Windows Vista 32-bit, Q6600
- 6897s (1:54:57) : 9600 GS (standard clocks), Windows 7 64-bit, Core I7 860 (HT enabled)
- 6962s (1:56:02) : 9500 GT (Factory OC, 1750 shader clock), Windows Vista Ultimate 64-bit, Core2 E8400
- 7552s (2:05:52) : HD 4670 (standard clocks), Windows XP 32-bit, Athlon 64 x2 4200+ (socket 939)
- 9354s (2:35:54) : 9400 GT (standard clocks, 32 shader version), Linux 64-bit, Q6700
- 9460s (2:37:40) : 8600 GT (standard clocks), Windows XP 32-bit, Pentium D 830
- 27262s (7:34:22) : 8400 GS (Manual OC, 1014 shader clock), Windows XP 32-bit, Pentium 4 3.6Ghz (HT enabled)
- 30082s (8:21:22) : 8400M GS (standard clocks), Windows Vista 32-bit, T8100
Updated with my recently deceased GTX280. Sorry, don't have the actual OC frequency since the card is dead, Windows is re-installed, and the box doesn't say.
____________
My lucky number is 75898524288+1 | |
|
mfbabb2 Volunteer tester
 Send message
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,975,644 RAC: 311
                     
|
CPU times:
GenuineIntel Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz [Family 6 Model 15 Stepping 10]
160653 5 Dec 2010 20:12:10 UTC 6 Dec 2010 14:35:55 UTC Completed and validated 62,864.46 61,433.03 2,314.00 Proth Prime Search (Sieve) v1.36
GenuineIntel Intel(R) Core(TM)2 CPU T5300 @ 1.73GHz [Family 6 Model 15 Stepping 2]
83159 5 Dec 2010 5:53:55 UTC 6 Dec 2010 4:45:33 UTC Completed and validated 80,626.59 68,183.12 2,314.00 Proth Prime Search (Sieve) v1.36
AuthenticAMD AMD Sempron(tm) 3000+ [Family 6 Model 10 Stepping 0]
125961 2 Dec 2010 14:30:19 UTC 5 Dec 2010 14:35:37 UTC Completed and validated 223,428.84 211,556.10 2,314.00 Proth Prime Search (Sieve) v1.36
____________
Murphy (AtP)
| |
|
|
HD 4770 : 3800-4100 s
____________
| |
|
|
CPU times:
i5 2.53GHz on Darwin 10.5.0 = 16h42m (approx)
P4 2.8GHz on XP SP3 = 56h20m (because it's 32-bit) this is what I mean
GPU time:
NViDIA GeForce GT 330M (255MB) on Darwin 10.5.0, CUDA 3.2 = 2h05m
I have a GTX 260 but no machine with a power supply good enough to run it, unfortunately.
____________
| |
|
mackerel Volunteer tester
 Send message
Joined: 2 Oct 08 Posts: 2652 ID: 29980 Credit: 570,442,335 RAC: 10,182
                              
|
Times for my two active GPUs here:
GTS 450 OC 983sec (16m23s) 888-1000-1776, Win7-64, Q6600@2.4GHz
9500GT 8600sec (2h23m20s) 550-400-1400, XP32, E6600@2.7GHz
I've no idea what stock clocks are for them, but the GTS450 is sold as an OC card. The clocks afterwards are those reported by GPU-Z. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
GTX280 1370s (Factory OC) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
GTX460 829s (Stock 675/900/1350) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
The raw performance doesn't tell the whole story. The 460 is the replacement for the 280, which died yesterday. It's almost twice as fast, half the price, draws a lot less power, runs a heck of a lot cooler, is several inches shorter, and is only running at 90% capacity vs the 280 running at 100%. I guess the CPU isn't quite keeping up in feeding it work. Or the command-line parameters need different tuning. Or it's a driver issue. I think I prefer it this way, though, since at 90%, it's not adversely affecting the GUI at all.
____________
My lucky number is 75898524288+1 | |
|
|
GTX275 ~ 1150s (std clock, 877MB), Win 7 x64, Q9400 @ 2.66GHz
GT240 ~ 3140s (std clock, 474MB), Win 7 x64, Q9550 @ 2.83GHz
HD5850 ~ 1480s (std clock, 1024MB), Win 7 x64, Q9450 @ 3.04GHz
but the GT240 doesn't work well - after downloading a bunch of units, it only finishes correctly the first one and all the rest error-out... Probably a driver issue... | |
|
LookAS Volunteer tester Send message
Joined: 19 Apr 08 Posts: 38 ID: 21649 Credit: 354,890,618 RAC: 0
                      
|
CPU
X5650 ~ 45,500s (OC 3.52GHz, HT on - 12threads simultaneously), Win 7 x64
GPU
GTX470 ~ 386s (OC 750MHz GPU), Win 7 x64, X5650 @ 3.52GHz
| |
|
|
NVIDIA ION LE, Linux: 25400s
NVIDIA GeForce 9300 / nForce 730i, XP 64bit: 22900s
NVIDIA GeForce 9400 GT, XP: 17000s
NVIDIA GeForce 9700M GTS, Linux 64bit: 6600s
NVIDIA GeForce 8800 GT, XP 64bit: 2070s
ATI HD4870, XP 64bit: would be around 1800s, but all WUs still bomb out with "Elapsed time exceeded" error
ATI HD5870, XP 64bit: 950s
____________
| |
|
|
With app version 1.36, Mac OS X 10.6.5, NVIDIA GT 120, CUDA 3.2.17, ~9600 s/WU (~2:40:00).
--gary | |
|
|
Windows Vista x64 SP 2 - NVIDIA Driver Version 258.96 WHQL
NVIDIA GTX 460 @ 900/1800/2000 MHz: 553-558s* with 4 PPS sieve instances running on the cores of the CPU (Q9550 @ 3.4 GHz). Average completion time of the CPU WUs: Slightly under 28,000s.
Without the CPU WUs the 460 might manage to crunch an extra WU or two during a 24 hour time frame but the CPU can crunch 12 WUs in the same time (3 per core in 24 hours).
Application versions:
tpsieve-cuda 0.2.3 on the GPU and app version 1.36 on the CPU.
Additional note:
It would be nice to have a multithreading enabled BOINC version of tpsieve.
*That's no typo ;)
____________
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
@John
Dividing the list in the Winter Solstice thread by NVidia/ATI/CPU and by GPU generation would make it more readable as follows for example:
NVIDIA
Fermi
- 330s (5:30) : GTX 580 (standard clocks), Windows 7 64-bit, Core I7 930 (HT enabled)
- 386s (6:26) : GTX 470 (OC 750MHz GPU), Win 7 x64, X5650 @ 3.52GHz
- 395s (6:35) : GTX 480 (standard clocks), Windows 7 64-bit, dual Xeon 5345
- 400s (6:40) : GTX 480, Windows 7 64 + I7 930
- 515s (8:35) : GTX 470 (standard clocks), Windows 7 64-bit, Dual Xeon 5520 (HT enabled)
- 829s (13:49) : GTX460 (Stock 675/900/1350) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
- 966s (16:06) : GTS 450 (Factory OC, 1850 shader clock), Windows 7 64-bit, Core I7 860 (HT enabled)
- 983s (16:23) : GTS 450 OC 888-1000-1776, Win7-64, Q6600@2.4GHz
GT 3xx
- 7500s (2:05:00) : GT 330M (255MB) on Darwin 10.5.0, CUDA 3.2
GTX 2xx, GTS 2xx, GT 2xx
- 1030s (17:10) : GTX 285 (1024MB - 720/1639/1242 MHz) driver: 260.99, Windows 7 Pro x64
- 1150s (19:10) : GTX275 (std clock, 877MB), Win 7 x64, Q9400 @ 2.66GHz
- 1161s (19:21) : GTX 285 (no OC), Windows 7 x64 pro + Q9650@3.6GHz (0.66 CPU + 1.00GPU)
- 1360s (22:40): GTX 280 (Factory OC), Windows 7 x64 Pro, C2Q Q6600 @2.4 GHz
- 1370s (22:50) : GTX280 (Factory OC) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
- 1500s (25:00) : GTX 260 - 216 (standard clocks), Windows Server 2008 64-bit (compare to Vista), dual Xeon 5520 (HT enabled)
- 1521s (25:21) : GTX 260 (216 Shader ): Cuda Driver 3.2, Darwin 10.6.4 -> (powered by 2x Xeon 5520 2,33Ghz)
- 2118s (35:18) : GTS 240 (standard clocks), Windows 7 64-bit, Core I7 975 (HT enabled)
- 3105s (51:48) : GT 240 (standard clocks), Windows Vista 32-bit, Q6600
- 3140s (52:23) : GT240 (std clock, 474MB), Win 7 x64, Q9550 @ 2.83GHz
9xxx
- 1924s (32:06) : 9800 GTX+ (standard clocks), Windows Vista Ultimate 64-bit, Core2 E8400
- 2332 (38:54) : 9600 GSO (Factory OC, 1750 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 6600s (1:50:00) : 9700M GTS, Linux 64bit
- 6897s (1:54:57) : 9600 GS (standard clocks), Windows 7 64-bit, Core I7 860 (HT enabled)
- 6962s (1:56:02) : 9500 GT (Factory OC, 1750 shader clock), Windows Vista Ultimate 64-bit, Core2 E8400
- 8600s (2:23:20) : 9500GT 8600sec (2h23m20s) 550-400-1400, XP32, E6600@2.7GHz
- 9354s (2:35:54) : 9400 GT (standard clocks, 32 shader version), Linux 64-bit, Q6700
- 17000s (4:43:20) : 9400 GT, XP
- 22900s (6:21:40) : 9300 / nForce 730i, XP 64bit
8xxx
- 2070s (34:30) : 8800 GT, XP 64bit
- 2664 (44:24) : 8800 GS (Manual OC, 1530 shader clock), Windows XP 32-bit, Pentium D 965 Extreme Edition (HT enabled)
- 9460s (2:37:40) : 8600 GT (standard clocks), Windows XP 32-bit, Pentium D 830
- 27262s (7:34:22) : 8400 GS (Manual OC, 1014 shader clock), Windows XP 32-bit, Pentium 4 3.6Ghz (HT enabled)
- 30082s (8:21:22) : 8400M GS (standard clocks), Windows Vista 32-bit, T8100
Other
- 25400s (7:03:20) : NVIDIA ION LE, Linux
ATI
HD 5xxx
- 950s (15:50) : HD5870, XP 64bit
- 1270s (21:10) : ATI 5850 @825Mhz GPU, RAM @ 1000Mhz drivers 10.10, Windows 7 64 bits
- 1480s (24:40) : HD5850 (std clock, 1024MB), Win 7 x64, Q9450 @ 3.04GHz
- 1989s (33:09) : HD5770 (900/1200) driver: 10-11, Windows 7 Home x86 Edition - powered by E5200 3,4Ghz
HD4xxx
- 1800s (30:00) : HD4870, XP 64bit
- 3800-4100s (1:03:20 - 1:08:20) : HD 4770
- 7552s (2:05:52) : HD 4670 (standard clocks), Windows XP 32-bit, Athlon 64 x2 4200+ (socket 939)
CPU
- 45500s (12:38:20) : X5650 (OC 3.52GHz, HT on - 12threads simultaneously), Win 7 x64
- 60120s (16:42:00) : i5 2.53GHz on Darwin 10.5.0
- 61433s (17:03:53) : Intel Core2 Duo T7300 @ 2.00GHz
- 68183s (18:56:23) : Intel Core2 T5300 @ 1.73GHz
- 202800s (56:20:00) : P4 2.8GHz on XP SP3 (32-bit)
- 211556s (58:45:56) : AMD Sempron 3000+
Also, to answer your question from that thread, all of the times that I posted above (in my earlier post) were using the 3G (2,314 credit) workunits.
____________
141941*2^4299438-1 is prime!
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Windows Vista x64 SP 2 - NVIDIA Driver Version 258.96 WHQL
NVIDIA GTX 460 @ 900/1800/2000 MHz: 553-558s*
*That's no typo ;)
That's some serious overclocking action you've got going on there! Stock clocks are 675/900/1350. :)
____________
My lucky number is 75898524288+1 | |
|
|
~6100s for a GT220 on a i7-920, W7-64, 8G
~2075s for a HD5770 on a i7-930, W7-64, 12G
____________
My lucky numbers are 121*2^4553899-1 and 3756801695685*2^666669±1
My movie https://vimeo.com/manage/videos/502242 | |
|
|
Any idea why my GPU WUs do not work correctly? They all come up with a calculation error after 2 or 3 seconds... with 0 seconds of calculation time
My system:
OS: Windows 7 Ultimate 64bit Version
GPU: Ati HD 5770 (Powercolor 5770 PCS+ @900/1250)
CPU: AMD Athlon II X4 635
Collatz Conjecture works properly, so I really wonder where the problem lies...
____________
| |
|
|
Any idea why my GPU WUs do not work correctly? They all come up with a calculation error after 2 or 3 seconds... with 0 seconds of calculation time
Collatz Conjecture works properly, so I really wonder where the problem lies...
PrimeGrid needs the full APP driver. Please go to ATI and download the driver that is called the APP driver for your card.
____________
My lucky numbers are 121*2^4553899-1 and 3756801695685*2^666669±1
My movie https://vimeo.com/manage/videos/502242 | |
|
|
let me add some numbers:
# 1580s : GTS 250 OC 800MHz - W7x64
# 1560s : GTX 260 - 192 OC 700 MHz Vistax64
# 37,000s XEON X3323 2.5 GHz, Windows Server 2008 64-bit
# 30,800s Core2 Duo E8400 3.00GHz W7X64
# 47,000s Xeon E5320 1.86GHz Linux-x64
| |
|
|
I haven't seen any stats on an Nvidia Quadro FX 370 and I'm trying to use one, but it appears that it's really, really slow. I was just wondering if it's that old and slow or if I have some kind of setting I can change.
____________
@AggieThePew
| |
|
|
I haven't seen any stats on an Nvidia Quadro FX 370 and I'm trying to use one, but it appears that it's really, really slow. I was just wondering if it's that old and slow or if I have some kind of setting I can change.
It's probably a G84 chip - so comparable to 8600 cards. But a lot depends on the clocks... | |
|
|
Based on the expected end time, it looks like it will take 10 hours to finish a task. I have a CPU that can run it that fast.
Not sure if there's a way to change the settings or not. GPU's are new to me so I've been taking whatever is defaulted.
____________
@AggieThePew
| |
|
|
Based on the expected end time, it looks like it will take 10 hours to finish a task. I have a CPU that can run it that fast.
Not sure if there's a way to change the settings or not. GPU's are new to me so I've been taking whatever is defaulted.
Get GPU-Z and check exactly what it is and how the clocks are set.
http://www.techpowerup.com/downloads/1907/TechPowerUp_GPU-Z_v0.4.9.html | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
I haven't seen any stats on an Nvidia Quadro FX 370 and I'm trying to use one, but it appears that it's really, really slow. I was just wondering if it's that old and slow or if I have some kind of setting I can change.
The FX 370 is basically the Quadro version of the 8400 GS card (the 16-shader variety with the G86 chip - not the G84 as mentioned above). Times will be similar, though often a bit slower, as almost all Quadro cards are clocked lower (stock) than their equivalent GeForce card.
____________
141941*2^4299438-1 is prime!
| |
|
|
I haven't seen any stats on an Nvidia Quadro FX 370 and I'm trying to use one, but it appears that it's really, really slow. I was just wondering if it's that old and slow or if I have some kind of setting I can change.
The FX 370 is basically the Quadro version of the 8400 GS card (the 16-shader variety with the G86 chip - not the G84 as mentioned above). Times will be similar, though often a bit slower, as almost all Quadro cards are clocked lower (stock) than their equivalent GeForce card.
wiki says:
FX 370 G84GL 360 400 256
FX 370 LP G98 540 500 256
but you never know what's been sold as OEM-versions... | |
|
|
It's the G84 version.. clock speeds are GPU: 360 Memory: 400 Shader: 720
So is there anything that can be changed to increase the performance or is that what I'm stuck with?
____________
@AggieThePew
| |
|
|
It's the G84 version.. clock speeds are GPU: 360 Memory: 400 Shader: 720
So is there anything that can be changed to increase the performance or is that what I'm stuck with?
Then it's the G84GL version, as I suspected. The only thing you could try is using RivaTuner to push the clocks.. ;(
| |
|
|
Well, I figure I'd better leave it be. Besides, I just noticed the date on the card.. April 2007... kinda like the ole Tandy I once had. Not even a green screen, a floppy that wasn't standard size, and a blazing 300 baud interface... or was that 150 baud?
____________
@AggieThePew
| |
|
|
Well, I figure I'd better leave it be. Besides, I just noticed the date on the card.. April 2007... kinda like the ole Tandy I once had. Not even a green screen, a floppy that wasn't standard size, and a blazing 300 baud interface... or was that 150 baud?
LOL - you mean like TRS-80 Cat. No. 264-1174?
110-300 baud..
got one on the shelves.. ;) | |
|
|
I'd forgotten the model, but yes! I think it was the TRS-80 or something like that. Thank goodness we've progressed as far as we have. I don't think we'd have ever found a prime on one of those. It'd have been way too much trouble continually changing out the floppy.
I can't believe you still have one around :)
____________
@AggieThePew
| |
|
|
I'd forgotten the model, but yes! I think it was the TRS-80 or something like that. Thank goodness we've progressed as far as we have. I don't think we'd have ever found a prime on one of those. It'd have been way too much trouble continually changing out the floppy.
I can't believe you still have one around :)
I started computing with punch cards, so this was way advanced back then.. ;)
Someone should try to code a current PPS-sieve WU for a punch card machine built in the early '80s.
Better not... ;) | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
It's the G84 version.. clock speeds are GPU: 360 Memory: 400 Shader: 720
So is there anything that can be changed to increase the performance or is that what I'm stuck with?
Frank has the right answer with RivaTuner... it works well with the G84 chips. I have an 8400 GS pushed a bit over 1000 MHz shader, which gets the times down to around 8 hours. If you do play with the clocks, do it in small incremental increases and watch how badly the heat builds up. These low-end cards come with fairly cheap coolers, so the clock increases will be only modest.
____________
141941*2^4299438-1 is prime!
| |
|
|
Well, this is just some kind of cool. I have both the monitor software and the tuner software installed and running. I've increased the clock speeds to the 3D power level, which seemed to raise them to about halfway of the total available.
Core is running at 378, mem at 399 and shader at 756, with temps of 60 C, 42 C and 40 C.
Fan is running on auto at 60% and 4372 rpm.
The PC is in a temperature-controlled room that I keep at or below 76°F.
So, how far can I push this little beast, and what would you consider to be the danger levels?
____________
@AggieThePew
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
So, how far can I push this little beast, and what would you consider to be the danger levels?
You will find lots of disagreement about this among those who overclock. In general, I like to keep any GPU below 70C on average and never let them go over 80C for any length of time. Even without the heat issue, you'll likely run into some instability point, but this varies by the individual card (e.g., I have two identical ASUS 9600 GSO cards, but one takes just a slightly higher OC than the other).
____________
141941*2^4299438-1 is prime!
| |
|
|
I think having the PC in a cold room helps a ton.
core clock 504
memory clock 513
shader 1008
gpu temp 66
gpu temp 51
pcb temp 45
gpu load 92%
fan 4372 rpm at 60%
____________
@AggieThePew
| |
|
|
OK.
After a day of successful sieving on both ATIs, I have to update their times.
The 5870 seems to be more in the 910s/WU zone, while the 4870 takes about 2430s/WU.
BR,
____________
| |
|
|
LOL - you mean like TRS-80 Cat. No. 264-1174?
110-300 baud..
got one on the shelves.. ;)
Sorry, I've got to ask: is that the one that looked vaguely like a sex aid with the rubber cups? Dear me, deary, deary me. That takes me back. | |
|
|
Hmmmmmmmm, I don't recall ANY PC I've ever used looking like any kind of aid :)
That is a funny thought however.
____________
@AggieThePew
| |
|
|
8 hours 41 minutes on a CPU - AMD Phenom II X4 955 @ 3.5GHz, Win7 x64
____________
| |
|
|
LOL - you mean like TRS-80 Cat. No. 264-1174?
110-300 baud..
got one on the shelves.. ;)
Sorry, I've got to ask: is that the one that looked vaguely like a sex aid with the rubber cups? Dear me, deary, deary me. That takes me back.
LOL!
not the "mobile" model, but that one:
http://farm1.static.flickr.com/175/408856771_3a1c6e9f86.jpg | |
|
|
LOL - you mean like TRS-80 Cat. No. 264-1174?
110-300 baud..
got one on the shelves.. ;)
Sorry, I've got to ask: is that the one that looked vaguely like a sex aid with the rubber cups? Dear me, deary, deary me. That takes me back.
LOL!
not the "mobile" model, but that one:
http://farm1.static.flickr.com/175/408856771_3a1c6e9f86.jpg
That is the one. My word. Haven't seen one of those in years. Thanks for reminding me.
| |
|
|
I've got:
Windows 7 Ultimate 64-bit
Phenom II X2 550 BE
ATI HD 4850
ATI CL SDK 2.2
CCC 10.9 (not sure)
running approx 3505.16 seconds (roughly 58.5 minutes).
Call it 58-60 minutes with some leeway. Stock clocks and volts. | |
|
|
Any idea why my GPU WUs do not work correctly? They all come up with a calculation error after 2 or 3 seconds... with 0 seconds of calculation time.
Collatz Conjecture works properly, so I really wonder where the problem lies...
PrimeGrid needs the full APP driver. Please go to the ATI site and download the driver that is called the APP driver for your card.
It works now! Thanks for your help
____________
| |
|
|
Any idea why my GPU WUs do not work correctly? They all come up with a calculation error after 2 or 3 seconds... with 0 seconds of calculation time.
Collatz Conjecture works properly, so I really wonder where the problem lies...
PrimeGrid needs the full APP driver. Please go to the ATI site and download the driver that is called the APP driver for your card.
It works now! Thanks for your help
It works for me as well; thanks to Pooh Bear 27 for the APP driver suggestion.
I went here for my drivers: http://support.amd.com/us/gpudownload/Pages/index.aspx
____________
| |
|
|
Windows Vista x64 SP 2 - NVIDIA Driver Version 258.96 WHQL
NVIDIA GTX 460 @ 900/1800/2000 MHz: 553-558s*
*That's no typo ;)
That's some serious overclocking action you've got going on there! Stock clocks are 675/900/1350. :)
If I increase the clocks even more the card throws the BUY_A_GTX_470_YOU_CHEAPSKATE exception ;)
515s (8:35) : GTX 470 (standard clocks), Windows 7 64-bit, Dual Xeon 5520 (HT enabled)
829s (13:49) : GTX460 (Stock 675/900/1350) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
I think I'll call my card a GTX 468 ;)
Completed and validated 542.66 21.15 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
Completed and validated 542.63 27.85 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
____________
| |
|
|
The slowest of the slow GPUs
ATI HD 4350 - 29,605.50s
but that is about 8000s faster than the CPU it's attached to. | |
|
|
GTX 480 Super Clock, Windows 7 64-bit on a Q9550 @ 2.83GHz, is running singles at the moment in 370s (6:10) | |
|
|
GTX 460 @ 760/1520/1800 = 650 seconds
____________
Have a N.I.C.E. day!
| |
|
|
GTX 570 @ 880/1760/1900, 0.988V, Win7 x64, driver 263.09, on an i7-920 @ 4011 MHz (191x21) @ 1.325V, 1.37 app
2 WUs simultaneously - 598 secs | |
|
|
4 x GTX 580, OC'ed to 930 MHz, on v1.37
268 sec per WU | |
|
|
2 x GTX 470, lightly OC'ed to 731/1462/1803
= 393.05 sec/WU
25.64 sec CPU
____________
| |
|
|
9400 GT @ 1400 MHz (stock clock)
Sat 18 Dec 2010 20:27:43 CET PrimeGrid Aborting task pps_sr2sieve_5588257_2: exceeded elapsed time limit 4937.806093 | |
|
|
GTX 570s at stock are knocking WUs out in 6 min flat.
| |
|
|
Hi,
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core(TM)2 CPU 6600 @ 2.40GHz
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core2 Duo CPU E8400 @ 3.00GHz
Good crunch @ all.
____________
MySpace al@ON =8?()> | |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Hi,
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core(TM)2 CPU 6600 @ 2.40GHz
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core2 Duo CPU E8400 @ 3.00GHz
Are you sure about these values?
My slow GT240(GT215) needs only ~2800sec = ~47min...
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Hi,
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core(TM)2 CPU 6600 @ 2.40GHz
10600s : NVIDIA Quadro FX 580 (511MB) (standard clocks), XP Pro 32 bits, Core2 Duo CPU E8400 @ 3.00GHz
Are you sure about these values?
My slow GT240(GT215) needs only ~2800sec = ~47min...
FX 580 has 32 shaders (similar to a 9500 GT).
GT 240 has 96 shaders.
____________
141941*2^4299438-1 is prime!
| |
|
|
Maybe he mistook the Quadro FX 580 for a GeForce GTX 580. | |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Maybe he mistook the Quadro FX 580 for a GeForce GTX 580.
Oops, I mistook it...
...but why such a big difference in the running times, from 3,612.84s to 10,580.47s, across the two hosts?
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
BFG GeForce GTX 260, 216 shader, 896MB, "factory overclock" (meaning I haven't done anything but install it...so far...)
Running on Windows XP Pro 32 bit on a Pentium 4 3.0GHz with Hyper-Threading enabled.
GPU for PPS sieve: 1,404.05 seconds
(just for laughs)
CPU for PPS LLR: 1,581.42 seconds (average)
Running the sieve on the P4 CPU takes upwards of two and a half days. Thank goodness for CUDA.
____________
| |
|
|
Maybe he mistook the Quadro FX 580 for a GeForce GTX 580.
Oops, I mistook it...
...but why such a big difference in the running times, from 3,612.84s to 10,580.47s, across the two hosts?
No idea; 10,580 seconds sounds correct to me. With my 9400 GT I was not able to complete a WU, since after 4600 seconds the WU was aborted due to runtime exceeded; it would have taken 18,000 seconds. The 9400 GT has half the shaders of the Quadro FX 580 but 25% more core frequency (1400 vs 1125 MHz). | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2416 ID: 1178 Credit: 19,983,460,334 RAC: 19,408,790
                                                
|
Maybe he mistook the Quadro FX 580 for a GeForce GTX 580.
Oops, I mistook it...
...but why such a big difference in the running times, from 3,612.84s to 10,580.47s, across the two hosts?
The 10000+ times seem right for a stock clocked FX 580. The couple of mid-6000 sec times for that machine are also reasonable with a substantial overclock (e.g., my factory OC'ed 9500 GT--1750 shader clock--has similar times). I cannot explain the mid-3000sec time...maybe a different card was installed at that time?
____________
141941*2^4299438-1 is prime!
| |
|
|
354 seconds on a Gigabyte GTX470 soc clocked @ 800/1600/1800, on a windows 7 system, i7 920 @4 GHz. | |
|
|
354 seconds on a Gigabyte GTX470 soc clocked @ 800/1600/1800, on a windows 7 system, i7 920 @4 GHz.
Water-cooled? Else it would be a very loud system ^^
____________
Have a N.I.C.E. day!
| |
|
|
PrimeGrid:
2 x GTX 470 @ 735/1470/1800
387.00 sec/WU
~~~~~~~~~~~~~~~~~~~
With Collatz:
GTX 470 @ 754/1508/1901
runs stable.
Why not with PrimeGrid?
Is it app dependent?
____________
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
PrimeGrid:
2 x GTX 470 @ 735/1470/1800
387.00 sec/WU
~~~~~~~~~~~~~~~~~~~
With Collatz:
GTX 470 @ 754/1508/1901
runs stable.
Why not with PrimeGrid?
Is it app dependent?
I think the PrimeGrid app produces a higher GPU load.
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
Water-cooled? Else it would be a very loud system ^^
On air, with a small desk fan assisting :-)
Not overly loud.
p.s. CPU is water cooled... | |
|
|
PrimeGrid:
2 x GTX 470 @ 735/1470/1800
387.00 sec/WU
~~~~~~~~~~~~~~~~~~~
With Collatz:
GTX 470 @ 754/1508/1901
runs stable.
Why not with PrimeGrid?
Is it app dependent?
_heinz
The Collatz code is not as heavily optimized.
Basically, at 754 MHz your card is not stable.
You have got away with it up to now because you haven't applied a heavy load.
If you use a higher core voltage you can get stability at 750 MHz on a 470, but the heat output goes up a lot.
The standard fan will sound like a jet engine.
When I say stability I mean long-term, 24-hours-a-day, 100% load running, not 5-minute benches.
Some programs you can use to test video card stability are OCCT, TessMark and FurMark.
In the case of FurMark I wouldn't run long tests; it puts a very high strain on components.
BTW
Don't trust Kombustor; it doesn't apply sufficient load to prove a card stable.
____________
| |
|
|
~ 3,300s (55min): NVIDIA Geforce 8800 GTS (768MB) (standard clocks), Win 7 64 Bit, AMD Phenom II X4 940
My new GTS 450 should arrive in a few days and push the crunching time down to ~15-20 mins. Yay :) | |
|
|
Just to give my own stats on this:
703 seconds (each)
Win 7 Ultimate x64 i7 980X (12GB RAM) w/ 3x GTX 460 desktop
6400 seconds (average)
Win 7 Home Premium x64 i7 720 (8GB RAM) w/ GT 330M laptop
Cheers.
____________
* (check primes)
| |
|
|
W7 Pro 64-bit / KFA GTX 570 at stock speeds (i.e. 732MHz core), driver 263.09 -> 362 secs
____________
35 x 2^3587843+1 is prime! | |
|
|
Windows 7 x64, Sparkle GTX 570 @ 800 core / 1600 shader -> 327 seconds
____________
| |
|
|
Ubuntu 10.10 64-bit with an ATI 5870 gives me 17 min per task.
CrossFire doesn't seem to work that well for now... I get a GPU usage of 40-50% on my two 5870s.
Anyone else tried? | |
|
|
2010 Mac Mini
Nvidia GeForce 320M 256 MB
OSX 10.6.5 CUDA Driver version 3.2.17 GPU Driver version 1.6.24.17 (256.00.15f04)
PrimeGrid 2h 30 minutes
Collatz 3h 23 minutes
| |
|
|
2010 Mac Mini
Nvidia GeForce 320M 256 MB
OSX 10.6.5 CUDA Driver version 3.2.17 GPU Driver version 1.6.24.17 (256.00.15f04)
PrimeGrid 2h 30 minutes
Collatz 3h 23 minutes
Hmph, I've got a 2010 MacBook Pro with a 330M that takes 2h05m for PPS sieve (very very consistent) but Collatz fails instantly. :(
I was running a GTX 260 but you might wanna delete any information entered into the master list that was gleaned from the times I posted (think it was around 23 minutes or so) as the card was defective and has since been replaced. Made by a company that went under in August (and it was sold to me in October, go figure...)
anyhow...Got another benchmark I should post:
NVIDIA GeForce GTX 460 OC (810/1000/1620) 1024MB GDDR5
Windows XP 32-bit, Pentium 4 3.0GHz (one core, two threads), 1GB system RAM (This machine)
I have done a little over 300 or so tasks since I got this running on Christmas. So far I've seen:
PPS sieve: 9m55s to 10m25s
So roughly a 3 minute improvement over stock clocks, and the GPU is sitting at 67-69 C and the fan's at 37% (thank you GPU-Z)...and I can overclock it more (probably won't though). It's very stable although still half as fast as the 580. Yikes.
(for reference)
Collatz: 15m30s to 16m23s
Milkyway (cuda_opencl): 16m20s to 16m45s
DNETC: anywhere from 15m to around 22m
Einstein: 4 hours or so (?!)
These Fermis man...they just burn through the sieve tasks. I stopped getting GPU work on my laptop altogether because the 460 has gotten more credit in 5 days than the 330M has in several months. Both have 100% uptime (or at least, I try ;) )
If you're wondering, it's the Zotac Amp! version I have. I haven't done anything but install it. Running latest drivers (forceware 260.99). I'm now averaging ~300k credits/day. Zero computation errors. You're welcome.
How long til we see Fermi cards in laptops, I wonder...
Not trying to one-up anyone, obviously. I think 2 hours or so on a mobile chip is darn good, especially considering it shows no extreme temperature increase.
Also, more germane to the topic of "GPU performance" - when running PPS sieve, neither my MBP nor the tweaked Dell with the 460 experience any kind of crippling interface lag, just the slight jerkiness in refresh rate one might expect. DNETC and Einstein, however, really slow things down...so I'd say that the PPS sieve app is about as optimized as it can be!
Now to port the other sieves, hint hint...
____________
| |
|
|
@ NullCoding: Which company was it that went under? Considering an upgrade and don't want any of their stuff. | |
|
|
BFG Tech. Shame, really; I read good reviews of their stuff, and their closing was almost not publicized at all.
So yeah...their "lifetime warranty and 24/7 tech support" kinda doesn't exist anymore.
I highly recommend Zotac, considering the OC'd GTX 460 I got for Christmas is turning out the impressively consistent results I mentioned above. Also, it makes games look great. If you're looking for something better than a 460, I can't really help you - but the 460 is a great card for the money.
____________
| |
|
|
2010 Mac Mini
Nvidia GeForce 320M 256 MB
OSX 10.6.5 CUDA Driver version 3.2.17 GPU Driver version 1.6.24.17 (256.00.15f04)
PrimeGrid 2h 30 minutes
Collatz 3h 23 minutes
Hmph, I've got a 2010 MacBook Pro with a 330M that takes 2h05m for PPS sieve (very very consistent) but Collatz fails instantly. :(
I was running a GTX 260 but you might wanna delete any information entered into the master list that was gleaned from the times I posted (think it was around 23 minutes or so) as the card was defective and has since been replaced. Made by a company that went under in August (and it was sold to me in October, go figure...)
anyhow...Got another benchmark I should post:
NVIDIA GeForce GTX 460 OC (810/1000/1620) 1024MB GDDR5
Windows XP 32-bit, Pentium 4 3.0GHz (one core, two threads), 1GB system RAM (This machine)
I have done a little over 300 or so tasks since I got this running on Christmas. So far I've seen:
PPS sieve: 9m55s to 10m25s
So roughly a 3 minute improvement over stock clocks, and the GPU is sitting at 67-69 C and the fan's at 37% (thank you GPU-Z)...and I can overclock it more (probably won't though). It's very stable although still half as fast as the 580. Yikes.
(for reference)
Collatz: 15m30s to 16m23s
Milkyway (cuda_opencl): 16m20s to 16m45s
DNETC: anywhere from 15m to around 22m
Einstein: 4 hours or so (?!)
These Fermis man...they just burn through the sieve tasks. I stopped getting GPU work on my laptop altogether because the 460 has gotten more credit in 5 days than the 330M has in several months. Both have 100% uptime (or at least, I try ;) )
If you're wondering, it's the Zotac Amp! version I have. I haven't done anything but install it. Running latest drivers (forceware 260.99). I'm now averaging ~300k credits/day. Zero computation errors. You're welcome.
How long til we see Fermi cards in laptops, I wonder...
Not trying to one-up anyone, obviously. I think 2 hours or so on a mobile chip is darn good, especially considering it shows no extreme temperature increase.
Also, more germane to the topic of "GPU performance" - when running PPS sieve, neither my MBP nor the tweaked Dell with the 460 experience any kind of crippling interface lag, just the slight jerkiness in refresh rate one might expect. DNETC and Einstein, however, really slow things down...so I'd say that the PPS sieve app is about as optimized as it can be!
Now to port the other sieves, hint hint...
Collatz is fragile on the 320M under OSX. If I play Civ and don't sleep the machine afterwards to reset the GPU, Collatz will burn WUs like crazy. I always suspend GPU WUs while Civ runs, but I need to be careful how I restart WUs after the game.
| |
|
|
Hi all, new numbers for chart...
GPU
GTX 470 SOC (Gigabyte) ~ 399s (6:39) (standard manufacturer clock 700MHz GPU), Win 7 x64, AMD Athlon II X4 640 @ 3.00GHz
This SOC edition of the card is overclocked from the factory... there is no other overclock.
JHAPA
____________
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Either my GT240 has a problem or the calculation needs more time now:
resultid=215644102 needed 2,738.82sec and resultid=215648127 needed 5,004.24sec
Does anybody have an idea???
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Problem is gone after a reboot.
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
GTS 450 (OC'ed to 950/1900/1900), Win7 64-bit - 888s
9800 GTX+ (SC), Win7 64-bit - 1590s
The 450 is really nice for $100, and it looks like I can get >200k a day. Also, the screen lag when I am using the computer is barely noticeable and even OK for watching videos. I guess Fermi really does make a big difference. | |
|
|
@rroonnaalldd: Have the same problem with my GT240 from time to time. According to GPU-Z, the card underclocks itself for some reason. I have yet to notice the temps creeping up before this happens, so I have yet to determine why. A reboot does seem to reset it however. | |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Have the same problem with my GT240 from time to time. According to GPU-Z, the card underclocks itself for some reason. I have yet to notice the temps creeping up before this happens, so I have yet to determine why. A reboot does seem to reset it however.
Seems to be a driver problem. Take a look at Speed difference for same CUDA code under Windows/Linux.
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
Currently playing around with overclocking my Zotac GeForce GTX 460 even more than it was.
Always at 67 to 69 degrees C. That's not horrid but close to the upper limit. I've got a secondary fan (modded Celeron heatsink, lol) blowing hot air out of the GPU at about 6000 RPM...
Bought it 810/1000/1620. About 10 minutes/WU.
Running it 905/1010/1810 right now. 09:19 instead of the average 10:20. Nice.
Next up I thought I'd clock the core to over 1000MHz and the shader over 2000MHz without worrying about memory clocks, because I'm barely using any (100MB of 1024MB). VDDC was sitting at 0.9370V and the max is something like 1.25V; I have a PSU that could start up a truck, so I'm not too worried, but for some reason the card downclocked itself to 405MHz when I did that (through FireStorm). Oops. It freaked...
So, back to a nice stable 900/1010/1800 :)
216482315 151688180 8 Jan 2011 17:30:07 UTC 9 Jan 2011 5:20:40 UTC Completed and validated 662.94 92.38 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
216388611 151926275 10 Jan 2011 6:31:35 UTC 10 Jan 2011 6:46:25 UTC Completed and validated 559.92 86.84 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
216272807 151809768 9 Jan 2011 6:16:11 UTC 9 Jan 2011 17:23:00 UTC Completed and validated 615.11 85.30 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
Still Pentium 4 3.0GHz (stock), Intel Alderwood mobo, WinXP 32-bit.
I can only imagine a higher-end Fermi benchmark...
____________
| |
|
|
Currently playing around with overclocking my Zotac GeForce GTX 460 even more than it was.
Always at 67 to 69 degrees C. That's not horrid but close to the upper limit. I've got a secondary fan (modded Celeron heatsink, lol) blowing hot air out of the GPU at about 6000 RPM...
Bought it 810/1000/1620. About 10 minutes/WU.
Running it 905/1010/1810 right now. 09:19 instead of the average 10:20. Nice.
Next up I thought I'd clock the core to over 1000MHz and the shader over 2000MHz without worrying about memory clocks, because I'm barely using any (100MB of 1024MB). VDDC was sitting at 0.9370V and the max is something like 1.25V; I have a PSU that could start up a truck, so I'm not too worried, but for some reason the card downclocked itself to 405MHz when I did that (through FireStorm). Oops. It freaked...
So, back to a nice stable 900/1010/1800 :)
216482315 151688180 8 Jan 2011 17:30:07 UTC 9 Jan 2011 5:20:40 UTC Completed and validated 662.94 92.38 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
216388611 151926275 10 Jan 2011 6:31:35 UTC 10 Jan 2011 6:46:25 UTC Completed and validated 559.92 86.84 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
216272807 151809768 9 Jan 2011 6:16:11 UTC 9 Jan 2011 17:23:00 UTC Completed and validated 615.11 85.30 2,311.00 Proth Prime Search (Sieve) v1.37 (cuda23)
Still Pentium 4 3.0GHz (stock), Intel Alderwood mobo, WinXP 32-bit.
I can only imagine a higher-end Fermi benchmark...
5.5 minutes for current WUs on a GTX 580 ;) Here is a small overview.
900 MHz seems to be the limit on my GTX 460 too. With the voltage at 1.087V the temps reached 64°C; with the case opened they immediately dropped to 55°C (without overvolting they would barely exceed 45°C). Since I have only one 120mm case fan running, there is clearly room for improvement. Initially my WUs were in the range of your timings, but with a little bit of fine-tuning I was able to push the runtimes below 9 minutes (app_info.xml and 2 WUs in parallel + 4 CPU WUs). I used the parameters -m 16 -B 262144 -C 8192 to reduce the memory accesses for the GPU WUs (MCU load from 2-3% down to 0%). During the challenge the card reached a peak output of 372,000 credits in 24 hours.
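For anyone who wants to try the 2-WUs-per-GPU setup: below is a rough anonymous-platform app_info.xml sketch, not the poster's actual file. The executable file name is a placeholder you would need to replace with the real file from your PrimeGrid project folder, and the app name / version number are guesses based on the task names and "v1.37" seen in this thread (check them against your client_state.xml); only the -m/-B/-C parameters are the ones quoted above.

<app_info>
  <app>
    <name>pps_sr2sieve</name>
  </app>
  <file_info>
    <name>YOUR_PPS_SIEVE_CUDA_EXE_HERE</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>pps_sr2sieve</app_name>
    <version_num>137</version_num>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>0.5</count>
    </coproc>
    <cmdline>-m 16 -B 262144 -C 8192</cmdline>
    <file_ref>
      <file_name>YOUR_PPS_SIEVE_CUDA_EXE_HERE</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>

The <count>0.5</count> under <coproc> is what tells BOINC to schedule half a GPU per task, i.e. two sieve WUs in parallel, and the <cmdline> line is how the extra sieve parameters get passed to the app.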
____________
| |
|
|
Ah see, I've no volt mods or anything. Just the stock voltages (which are factory overclocked as far as I can tell).
As it's still under warranty I'm not too keen on editing the BIOS, though I have the utility to do so. With the 580s out, I see little point in running my 460 into the ground. It's not the top model, nor will it ever be. Sure it runs like a 480 at the moment, though...
PPS sieve WUs are finishing in under 10 minutes now, which is fine by me. DNETC and MilkyWay WUs, though, are done about three minutes faster than they used to be, and I've yet to test it on Collatz.
I actually have no case fan. The system is a heavily modded Dell, and the PCI-E cooling fan slot is hard-coded to be controlled by GPU load (in this case). That means it is absurdly loud. Doesn't seem to lower the temperature, but it's very well ventilated. Open case, high airflow area, no other cards blocking it. That's right, I have no sound now. XD
Getting yours to run at 55 is nice. Once I finish my custom rig(s) I'll look into high-level benchmarking with low-level tweaks...
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14036 ID: 53948 Credit: 476,002,776 RAC: 209,336
                               
|
Posting timing results for the 2011 Winter Solstice challenge
829s (13:49) : GTX 460 (Stock 675/900/1350) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
That's from John's post; those were my numbers last year.
The WUs are bigger this year. The same computer now has a run time of 1,423 seconds, compared to last year's 829.
____________
My lucky number is 75898^524288+1 | |
|
|
Posting timing results for the 2011 Winter Solstice challenge
829s (13:49) : GTX 460 (Stock 675/900/1350) Win 7 x64 Pro C2Q Q6600 @2.4 GHz
That's from John's post; those were my numbers last year.
The WUs are bigger this year. The same computer now has a run time of 1,423 seconds, compared to last year's 829.
I saw similar timings for the 64-bit stock app on my GTX 460 under Linux a few months ago. The card is clocked higher (725 MHz factory OC), but the 64-bit Linux version is slower compared to the 32-bit Windows version. Last week's tests with my modified 32-bit Linux app on the same card with the same clocks resulted in runtimes of ca. 1150 seconds. After a few additional modifications the card reached runtimes of ca. 1115 seconds. The CC 2.1 cards require some special treatment to squeeze the last ounce of performance out of them.
____________
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
The CC 2.1 cards require some special treatment to squeeze the last ounce of performance out of them.
Are these changes ready for the community?
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
The CC 2.1 cards require some special treatment to squeeze the last ounce of performance out of them.
Are these changes ready for the community?
Not yet. Although the modifications have been thoroughly tested with at least 3 SDKs and more than half a dozen different driver versions under Linux (32- and 64-bit binaries), there is still some work to do when it comes to the Windows side of things (the CC 2.0 card performance gains under Windows, especially, were rather disappointing).
In addition to that, there is one optimization that could result in a ca. 15% speedup on all cards for certain ranges of p, k and n, but I still need some deeper understanding of the involved maths to implement it correctly, or at least to determine the boundaries where it would work without flaws.
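For readers wondering what the p, k and n refer to: the PPS sieve looks for small prime factors p of Proth candidates k*2^n+1, and an odd prime p divides k*2^n+1 exactly when k ≡ -2^(-n) (mod p). The sketch below is only an illustration of that basic test in plain Python (3.8+ for the modular inverse via pow); the function names are made up for this post, it is not the project's GPU code, and it says nothing about the 15% optimization mentioned above, whose details aren't spelled out here.

# Illustrative sketch only: which (k, n) pairs does an odd prime p remove
# from a Proth search over k*2^n + 1?
def k_hit(p, n):
    # p divides k*2^n + 1  <=>  k*2^n = -1 (mod p)  <=>  k = -2^(-n) (mod p)
    return (-pow(2, -n, p)) % p          # pow(2, -n, p) needs Python 3.8+

def eliminated(p, n_range, k_range):
    """Yield (k, n) with k in k_range, n in n_range and p dividing k*2^n + 1."""
    k_min, k_max = k_range
    for n in n_range:
        k0 = k_hit(p, n)
        # smallest k >= k_min in the residue class k0 (mod p)
        k = k0 if k0 >= k_min else k0 + ((k_min - k0 + p - 1) // p) * p
        while k <= k_max:
            if k % 2 == 1:               # Proth candidates use odd k
                yield (k, n)
            k += p

# quick sanity check: k_hit(7, 3) == 6, and indeed 6*2**3 + 1 == 49 == 7*7

A real sieve app does this for enormous ranges of p on the GPU, which is why shader count and shader clock dominate the runtimes reported in this thread.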
____________
| |
|