Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
Welcome to the testing thread for what will probably be the third CUDA BOINC project for PrimeGrid, a CUDA Cullen/Woodall sieve (for base 2 only):
CWPSieve-CUDA (Source, on the "cw" branch)
CWPSieve-CUDA is based on TPSieve-CUDA, and by partial coincidence they both get about the same speedup over their CPU counterparts.
Now that I have a compute-capable GPU, I've tested CWPSieve-CUDA on several ranges and it seems to work fine. But I have only a Fermi-based card on a 64-bit Linux box. I need testers using older cards and/or Windows. Despite the speed advantage CUDA provides, most test ranges are slow to provide factors. But here's one that finds three factors in less than a minute on my GTX 460:
cwpsieve-cuda-boinc-{platform} -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
The expected factors are:
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
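Each line above reads `p | n*2^n±1`: the prime p divides a Cullen number (+1) or a Woodall number (-1). As a minimal sketch, a reported factor could be double-checked with modular exponentiation. The helper name `divides_cw` is made up for illustration and is not part of cwpsieve:

```python
def divides_cw(p: int, n: int, sign: int) -> bool:
    """Return True if p divides n*2^n + sign (sign=+1 Cullen, sign=-1 Woodall)."""
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    # pow(2, n, p) computes 2^n mod p without ever building the huge number.
    return (n * pow(2, n, p) + sign) % p == 0


# Factors quoted in this post, as (p, n, sign).
reported = [
    (25636026136339, 24184321, -1),
    (25636029526061, 12004589, +1),
    (25636030632281, 14263341, +1),
]

for p, n, sign in reported:
    print(p, n, sign, divides_cw(p, n, sign))
```

The three-argument `pow` keeps every intermediate value below p, so the check is instant even for exponents in the tens of millions.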
If you feel like working a little harder, you can download or locate the GCW sieve file and replace the -n and -N options in that line with "-i gcwsieve_1250180098.sieveinput". You should get the same result.
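The `-p2563602e7 -P2563604e7` arguments give the sieve bounds in scientific notation. The sketch below only shows which integers that notation denotes. It is not how cwpsieve itself parses its arguments, and `expand_bound` is a hypothetical helper:

```python
from decimal import Decimal


def expand_bound(text: str) -> int:
    """Expand a bound such as '2563602e7' into an exact integer."""
    value = Decimal(text)
    if value != value.to_integral_value():
        raise ValueError(f"{text!r} is not a whole number")
    return int(value)


p_min = expand_bound("2563602e7")
p_max = expand_bound("2563604e7")
print(p_min, p_max, p_max - p_min)
```

The expanded bounds correspond to the `Sieve started: 25636020000000 <= p < 25636040000000` line that the program prints.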
Good luck!
____________
|
|
|
pschoefer Volunteer developer Volunteer tester
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 2,910,184,413 RAC: 268,519
|
Win7 x64
GTX 470 @ 750/1500/1674
Without sievefile:
19:46:56 (4812): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 470
Detected compute capability: 2.0
Detected 14 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 34.66 sec. (0.03 init + 34.63 sieve) at 582913 p/sec.
Processor time: 0.75 sec. (0.03 init + 0.72 sieve) at 28128450 p/sec.
Average processor utilization: 1.01 (init), 0.02 (sieve)
19:47:31 (4812): called boinc_finish
With sievefile:
19:48:11 (3952): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 470
Detected compute capability: 2.0
Detected 14 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 35.86 sec. (1.35 init + 34.51 sieve) at 584940 p/sec.
Processor time: 2.00 sec. (1.34 init + 0.66 sieve) at 30807333 p/sec.
Average processor utilization: 1.00 (init), 0.02 (sieve)
19:48:47 (3952): called boinc_finish
Factors as expected.
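For anyone collecting results from several cards, the `Elapsed time` summary line is regular enough to parse. A minimal sketch, assuming the line always has the shape seen in the logs above (the function name and dictionary keys are illustrative):

```python
import re

ELAPSED_RE = re.compile(
    r"Elapsed time: (?P<total>[\d.]+) sec\. "
    r"\((?P<init>[\d.]+) init \+ (?P<sieve>[\d.]+) sieve\) "
    r"at (?P<rate>\d+) p/sec\."
)


def parse_elapsed(line: str) -> dict:
    """Extract timings (seconds) and rate (p/sec) from an 'Elapsed time' line."""
    match = ELAPSED_RE.search(line)
    if match is None:
        raise ValueError("not an 'Elapsed time' line")
    return {
        "total": float(match["total"]),
        "init": float(match["init"]),
        "sieve": float(match["sieve"]),
        "rate": int(match["rate"]),
    }


line = "Elapsed time: 34.66 sec. (0.03 init + 34.63 sieve) at 582913 p/sec."
print(parse_elapsed(line))
```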
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
|
I was going for ruby badge on CPUs!
This could change sub-project "life" expectancy
Win7 x64/ GTX580 GPU version, very low CPU usage.
Stand alone
cwpsieve-cuda-x86-windows.exe -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3 (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 580
Detected compute capability: 2.0
Detected 16 multiprocessors.
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 29.52 sec. (0.03 init + 29.50 sieve) at 684340 p/sec.
Processor time: 0.88 sec. (0.03 init + 0.84 sieve) at 23923067 p/sec.
Average processor utilization: 1.14 (init), 0.03 (sieve)
Sieve file
cwpsieve-cuda-x86-windows.exe -p2563602e7 -P2563604e7 -i gcwsieve_1250180098.sieveinput
cwpsieve version cuda-0.2.3 (testing)
Scanning ABC file...
Read maximum n 24999999
Found N's from 10000008 to 24999999.
nstart=10000008, nstep=20
Reading ABC file.
Read 1108077 terms from ABC format input file `gcwsieve_1250180098.sieveinput'
Changed nstep to 19
cwpsieve initialized: 10000008 <= n <= 25000024
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 580
Detected compute capability: 2.0
Detected 16 multiprocessors.
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 32.69 sec. (3.28 init + 29.41 sieve) at 686365 p/sec.
Processor time: 3.83 sec. (3.28 init + 0.55 sieve) at 36909875 p/sec.
Average processor utilization: 1.00 (init), 0.02 (sieve)
____________
My stats |
|
|
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
Joined: 5 Feb 08 Posts: 1224 ID: 18646 Credit: 877,929,236 RAC: 321,810
|
NVIDIA GeForce 9800 GTX+ on Linux 64-bit
stdout wrote:
$ ./cwpsieve-cuda-boinc-x86_64-linux -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3 (testing)
Compiled Jan 3 2011 with GCC 4.3.3
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Found 3 factors
stderr.txt wrote:
Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 9800 GTX+
Detected compute capability: 1.1
Detected 16 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 133.70 sec. (0.01 init + 133.69 sieve) at 150984 p/sec.
Processor time: 1.06 sec. (0.02 init + 1.04 sieve) at 19408738 p/sec.
Average processor utilization: 1.42 (init), 0.01 (sieve)
called boinc_finish
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
|
|
|
|
C:\ppsieve_cuda>cwpsievecuda -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3 (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 570
Detected compute capability: 2.0
Detected 15 multiprocessors.
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 30.88 sec. (0.03 init + 30.84 sieve) at 654430 p/sec.
Processor time: 0.66 sec. (0.02 init + 0.64 sieve) at 31508430 p/sec.
Average processor utilization: 0.50 (init), 0.02 (sieve)
Done with GTX570 Windows |
|
|
pschoefer Volunteer developer Volunteer tester
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 2,910,184,413 RAC: 268,519
|
Win7 x64
GTS 250 @ 750/2000/1100
Without sievefile:
19:57:45 (3704): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTS 250
Detected compute capability: 1.1
Detected 16 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 122.12 sec. (0.03 init + 122.09 sieve) at 165323 p/sec.
Processor time: 1.03 sec. (0.03 init + 1.00 sieve) at 20217314 p/sec.
Average processor utilization: 1.20 (init), 0.01 (sieve)
19:59:47 (3704): called boinc_finish
With sievefile:
20:00:09 (3432): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTS 250
Detected compute capability: 1.1
Detected 16 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 124.89 sec. (3.05 init + 121.84 sieve) at 165673 p/sec.
Processor time: 3.99 sec. (3.01 init + 0.98 sieve) at 20538222 p/sec.
Average processor utilization: 0.99 (init), 0.01 (sieve)
20:02:13 (3432): called boinc_finish
Factors as expected.
____________
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,655,586,930 RAC: 6,970,494
|
Win XP Pro 32-bit
9600 GSO
Cannot get application to run. Copied the cudart.dll file from the BOINC directory, but I get the following error message:
The procedure entry point cudaSetDeviceFlags could not be located in the dynamic link library cudart.dll
Edit 1:
Never mind...got it to work with cudart file from another machine
cwpsieve-cuda-x86-windows.exe -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3 (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 9600 GSO
Detected compute capability: 1.1
Detected 12 multiprocessors.
25636026136339 | 24184321*2^24184321-1
p=25636026815745, 113.6K p/sec, 0.01 CPU cores, 34.1% done. ETA 04 Jan 16:04
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
p=25636033107201, 104.8K p/sec, 0.01 CPU cores, 65.5% done. ETA 04 Jan 16:04
p=25636039398657, 104.8K p/sec, 0.01 CPU cores, 97.0% done. ETA 04 Jan 16:05
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 188.06 sec. (0.05 init + 188.01 sieve) at 107359 p/sec.
Processor time: 2.11 sec. (0.31 init + 1.80 sieve) at 11233440 p/sec.
Average processor utilization: 6.67 (init), 0.01 (sieve)
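The "% done" figures in the progress lines above are consistent with a simple linear fraction of the p range. A small sketch under that assumption (the program's actual formula is not shown in this thread, and `percent_done` is a hypothetical name):

```python
def percent_done(p: int, p_min: int, p_max: int) -> float:
    """Share of the range [p_min, p_max) already sieved, as a percentage."""
    return 100.0 * (p - p_min) / (p_max - p_min)


P_MIN, P_MAX = 25636020000000, 25636040000000

# The three p values from the progress lines in the log above.
for p in (25636026815745, 25636033107201, 25636039398657):
    print(p, f"{percent_done(p, P_MIN, P_MAX):.1f}% done")
```

Rounded to one decimal place, these come out to the 34.1%, 65.5% and 97.0% shown in the log.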
Edit 2:
GTS 450 on Win7 64-bit
15:56:52 (3372): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTS 450
Detected compute capability: 2.1
Detected 4 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 74.06 sec. (0.03 init + 74.03 sieve) at 272679 p/sec.
Processor time: 0.51 sec. (0.03 init + 0.48 sieve) at 41738964 p/sec.
Average processor utilization: 1.01 (init), 0.01 (sieve)
15:58:07 (3372): called boinc_finish
15:59:28 (2136): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTS 450
Detected compute capability: 2.1
Detected 4 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 73.90 sec. (0.03 init + 73.87 sieve) at 273248 p/sec.
Processor time: 0.34 sec. (0.03 init + 0.31 sieve) at 64695380 p/sec.
Average processor utilization: 0.97 (init), 0.00 (sieve)
16:00:42 (2136): called boinc_finish
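The runs reported here all end with the same `count=647707,sum=0xe66f848aacfc21bb` line. Without assuming what the two values measure internally, the pair can be treated as an opaque fingerprint and compared between logs. A minimal sketch with illustrative function names:

```python
import re

CHECK_RE = re.compile(r"count=(\d+),sum=(0x[0-9a-fA-F]+)")


def fingerprint(log_text: str) -> tuple:
    """Return (count, sum) from a log's 'count=...,sum=0x...' line."""
    match = CHECK_RE.search(log_text)
    if match is None:
        raise ValueError("no count/sum line found")
    return int(match.group(1)), int(match.group(2), 16)


def same_result(log_a: str, log_b: str) -> bool:
    """True when two logs report identical count and sum values."""
    return fingerprint(log_a) == fingerprint(log_b)
```

Calling `same_result` on two saved stderr files is a quick way to confirm that two cards processed the range identically.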
____________
141941*2^4299438-1 is prime!
|
|
|
|
I may just be an inexperienced idiot trying to do this, but this is what I get:
mmillerick@mmillerick-laptop:~$ sh '/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux' -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: �@@�w@8: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: @: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: cannot create ��Ni���*Ci����%/�S�f�G���z�K�+tv�q�ը�e��!âA�G���Bp�
B��=
�̍���q�Lΐb��Z��6�����F�P�{�M��d: Directory nonexistent
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: ELF: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: �,��cI��9�����2: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 4: Syntax error: word unexpected (expecting ")")
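The messages above (note `ELF: not found`) look like what `sh` prints when it is handed a compiled binary and tries to read it as a script. Elsewhere in this thread the binary is started directly, as in `./cwpsieve-cuda-boinc-x86_64-linux ...`. Whether a Linux binary is 32-bit or 64-bit can be read from its ELF identification bytes. A minimal sketch, with `elf_class` as a hypothetical helper name:

```python
def elf_class(header: bytes) -> str:
    """Classify a file from its first bytes using the ELF identification header."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    # Byte 4 (EI_CLASS): 1 means 32-bit objects, 2 means 64-bit objects.
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")


def elf_class_of_file(path: str) -> str:
    """Read the first five bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return elf_class(handle.read(5))
```

For example, `elf_class_of_file("cwpsieve-cuda-boinc-x86_64-linux")` would be expected to report a 64-bit binary, going by the file name.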
____________
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
Michael, are you running 64-bit Linux or 32-bit Linux?
____________
|
|
|
mfbabb2 Volunteer tester
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
|
What compute capability and memory is needed? I have an nVidia Quadro NVS 135M (1.1 - 128MB).
____________
Murphy (AtP)
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,655,586,930 RAC: 6,970,494
|
What compute capability and memory is needed? I have an nVidia Quadro NVS 135M (1.1 - 128MB).
The compute capability is fine (same as my 9600 GSO above). I'm not sure about the memory, but the response on the 9600 GSO was noticeably more sluggish than with the current PPS sieve app, so you should be prepared for a nearly hung machine while testing on that mobile Quadro.
____________
141941*2^4299438-1 is prime!
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
What compute capability and memory is needed? I have an nVidia Quadro NVS 135M (1.1 - 128MB).
Just like TPSieve: Compute capability 1.0 and practically no memory.
I'm not sure anyone's tried any of my sieves on a 128MB card. In theory it should work.
____________
|
|
|
Vato Volunteer tester
Joined: 2 Feb 08 Posts: 851 ID: 18447 Credit: 713,903,832 RAC: 1,641,370
|
Good work again Ken!
Is there likely to be an ATI version of this, or are you too fed up of that?
BTW, I found from a colleague that my high CPU with ATI is due to a busy-wait that got introduced between the 2.1 and 2.2 SDK - very nice of them - NOT.
____________
|
|
|
|
Michael, are you running 64-bit Linux or 32-bit Linux?
64 bit
____________
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
Good work again Ken!
Is there likely to be an ATI version of this, or are you too fed up of that?
BTW, I found from a colleague that my high CPU with ATI is due to a busy-wait that got introduced between the 2.1 and 2.2 SDK - very nice of them - NOT.
Yes, there's likely to be an ATI version at some point. I may be able to optimize it more than the CUDA version, relative to TPSieve performance. But I haven't got that all figured out yet.
Edit: Michael, would you humor me and try the 32-bit version (of CWPSieve)?
____________
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,655,586,930 RAC: 6,970,494
|
8400M GS
Vista 32-bit
17:25:45 (2640): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 8400M GS
Detected compute capability: 1.1
Detected 2 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 2442.71 sec. (0.18 init + 2442.54 sieve) at 8264 p/sec.
Processor time: 15.10 sec. (0.08 init + 15.02 sieve) at 1343622 p/sec.
Average processor utilization: 0.44 (init), 0.01 (sieve)
18:06:28 (2640): called boinc_finish
That's about as slow as GPUs get. The machine was almost unusable while the sieve was running...but temps were good! :)
____________
141941*2^4299438-1 is prime!
|
|
|
mfbabb2 Volunteer tester
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
|
Win XP
Could not find "cudart.dll"...
Copied several "cudart.dll" from various other folders, but it did not like any of them. Could not find entry ...
???
____________
Murphy (AtP)
|
|
|
|
win 7 64 bit 9800 gtx+
It said cudart.dll was missing so I copied it from boinc folder (and I also tried downloading a copy online) but I keep getting the error "The procedure entry point cudaSetDeviceFlags could not be located in the dynamic link library cudart.dll"
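The "entry point cudaSetDeviceFlags could not be located" error means that the cudart library that was loaded does not export that function. One way to test a candidate library before copying it next to the executable is to load it and look the symbol up. This is a sketch using Python's standard `ctypes` module. The helper names are illustrative, and the Python interpreter must have the same bitness as the library, otherwise loading fails with `OSError`:

```python
import ctypes


def exports_symbol(library: ctypes.CDLL, name: str) -> bool:
    """Return True if the loaded shared library exports the named symbol."""
    try:
        getattr(library, name)
    except AttributeError:
        return False
    return True


def cudart_has_set_device_flags(path: str) -> bool:
    """Load a CUDA runtime library from the given path and look for cudaSetDeviceFlags."""
    return exports_symbol(ctypes.CDLL(path), "cudaSetDeviceFlags")
```

For example, `cudart_has_set_device_flags(r"C:\New\cudart.dll")` (path chosen only for illustration) returns `False` for a runtime that lacks the function.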
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
This should be the right cudart.dll. If it's not, ask Scott Brown what he did to solve it. ;)
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
@Ken
Is this the 32 or 64bit version?
For Linux, both versions are in the download folder:
http://www.primegrid.com/download/libcudart.so.2.32bit
http://www.primegrid.com/download/libcudart.so.2.64bit is equal to http://www.primegrid.com/download/libcudart.so
[edit]
link
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
For Windows there is no 64-bit version, because I can't compile a 64-bit binary with the free MS Express tools. Just use the 32-bit version; it should work as well as TPSieve does.
____________
|
|
|
|
worked
win7-64 9800gtx+
C:\Users\xkeemy>C:\New\cwpsieve-cuda-x86-windows.exe -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3 (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 9800 GTX+
Detected compute capability: 1.1
Detected 16 multiprocessors.
25636026136339 | 24184321*2^24184321-1
p=25636029437185, 157.3K p/sec, 0.01 CPU cores, 47.2% done. ETA 04 Jan 21:26
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
p=25636038612225, 152.9K p/sec, 0.01 CPU cores, 93.1% done. ETA 04 Jan 21:26
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 132.22 sec. (0.04 init + 132.18 sieve) at 152710 p/sec.
Processor time: 1.22 sec. (0.05 init + 1.17 sieve) at 17252109 p/sec.
Average processor utilization: 1.30 (init), 0.01 (sieve)
|
|
|
|
What is TPSieve-CUDA? I don't see that app listed on the apps page:
http://www.primegrid.com/apps.php
____________
Reno, NV
|
|
|
pschoefer Volunteer developer Volunteer tester
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 2,910,184,413 RAC: 268,519
|
What is TPSieve-CUDA? I don't see that app listed on the apps page:
http://www.primegrid.com/apps.php
TPSieve is the program used for Proth Prime Search (Sieve).
____________
|
|
|
|
Win 7 x64, GTX 570 stock
Without sieve file:
12:27:00 (4024): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 570
Detected compute capability: 2.0
Detected 15 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 34.20 sec. (0.05 init + 34.16 sieve) at 590899 p/sec.
Processor time: 0.86 sec. (0.03 init + 0.83 sieve) at 24413360 p/sec.
Average processor utilization: 0.69 (init), 0.02 (sieve)
12:27:34 (4024): called boinc_finish
Factors are correct:
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
____________
Polish National Team |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
Linux64, GT240(GT215)
"./cwpsieve-cuda-boinc-x86_64-linux -p2563602e7 -P2563604e7 -n 10000000 -N 25000000" wrote:
Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Resuming from checkpoint p=25636022621441 in cwpcheck2563602e7.txt
Thread 0 starting
Detected GPU 0: GeForce GT 240
Detected compute capability: 1.2
Detected 12 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 201.32 sec. (0.01 init + 201.31 sieve) at 87248 p/sec.
Processor time: 68.40 sec. (0.02 init + 68.39 sieve) at 256823 p/sec.
Average processor utilization: 1.22 (init), 0.34 (sieve)
called boinc_finish
"./cwpsieve-cuda-x86_64-linux -p2563602e7 -P2563604e7 -n 10000000 -N 25000000" wrote:
cwpsieve version cuda-0.2.3 (testing)
Compiled Jan 3 2011 with GCC 4.3.3
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GT 240
Detected compute capability: 1.2
Detected 12 multiprocessors.
p=25636025767169, 96.12K p/sec, 0.35 CPU cores, 28.8% done. ETA 05 Jan 12:56
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
p=25636031010049, 87.38K p/sec, 0.33 CPU cores, 55.1% done. ETA 05 Jan 12:57
25636030632281 | 14263341*2^14263341+1
p=25636036252929, 87.38K p/sec, 0.35 CPU cores, 81.3% done. ETA 05 Jan 12:57
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 230.58 sec. (0.01 init + 230.56 sieve) at 87547 p/sec.
Processor time: 78.34 sec. (0.02 init + 78.32 sieve) at 257710 p/sec.
Average processor utilization: 1.24 (init), 0.34 (sieve)
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Vista Home Basic / 32
GeForce 9500 GT
07:01:53 (5844): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 9500 GT
Detected compute capability: 1.1
Detected 4 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 714.50 sec. (0.03 init + 714.46 sieve) at 28252 p/sec.
Processor time: 1.48 sec. (0.03 init + 1.45 sieve) at 13912988 p/sec.
Average processor utilization: 1.00 (init), 0.00 (sieve)
07:13:48 (5844): called boinc_finish
Factors found as listed below |
|
|
|
WinXP 32bit:
13:17:03 (3428): Can't open init data file - running in standalone mode
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce 9600 GT
Detected compute capability: 1.1
Detected 8 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 314.23 sec. (0.03 init + 314.20 sieve) at 64242 p/sec.
Processor time: 1.97 sec. (0.08 init + 1.89 sieve) at 10676410 p/sec.
Average processor utilization: 2.50 (init), 0.01 (sieve)
13:22:17 (3428): called boinc_finish
Factors as expected
____________
There are only 10 kinds of people - those who understand binary and those who don't
|
|
|
|
Another one . . .
Vista Home Premium / 32
GeForce GT 240
07:28:58 (5140): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GT 240
Detected compute capability: 1.2
Detected 12 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 230.15 sec. (0.05 init + 230.10 sieve) at 87723 p/sec.
Processor time: 0.51 sec. (0.08 init + 0.44 sieve) at 46210965 p/sec.
Average processor utilization: 1.67 (init), 0.00 (sieve)
07:32:48 (5140): called boinc_finish
Factors found as listed below.
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
@Ken
boinc@vmware2k-3:~/Cuda/gcw$ time ./gcwsieve_1.12_x86_64-pc-linux-gnu -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
gcwsieve 1.3.8 -- A sieve for Generalised Cullen/Woodall numbers n*b^n+/-1.
Read 1108077 terms n*2^n+/-1 from ABC file `sieve.txt'.
gcwsieve 1.3.8 started: 10000005 <= n <= 24999999, 25636020000000 <= p <= 25636040000000
p=25636020524351, 8087 p/sec, 0 factors, 2.6% done, ETA 05 Jan 14:57
p=25636021573147, 8207 p/sec, 0 factors, 7.9% done, ETA 05 Jan 14:56
p=25636022097479, 8179 p/sec, 0 factors, 10.5% done, ETA 05 Jan 14:56
p=25636022621921, 8199 p/sec, 0 factors, 13.1% done, ETA 05 Jan 14:56
p=25636023146329, 8141 p/sec, 0 factors, 15.7% done, ETA 05 Jan 14:56
p=25636023670771, 8121 p/sec, 0 factors, 18.4% done, ETA 05 Jan 14:56
p=25636024195163, 8172 p/sec, 0 factors, 21.0% done, ETA 05 Jan 14:56
p=25636024719653, 8130 p/sec, 0 factors, 23.6% done, ETA 05 Jan 14:56
p=25636025244209, 8146 p/sec, 0 factors, 26.2% done, ETA 05 Jan 14:56
p=25636025768557, 8164 p/sec, 0 factors, 28.8% done, ETA 05 Jan 14:56
25636026136339 | 24184321*2^24184321-1
p=25636026161819, 8113 p/sec, 1 factor, 30.8% done, ETA 05 Jan 14:56
p=25636026686201, 8170 p/sec, 1 factor, 33.4% done, 818 sec/factor
p=25636027210609, 8129 p/sec, 1 factor, 36.1% done, ETA 05 Jan 14:56
p=25636027735061, 8135 p/sec, 1 factor, 38.7% done, 950 sec/factor
p=25636028259529, 7981 p/sec, 1 factor, 41.3% done, ETA 05 Jan 14:56
p=25636028783929, 8181 p/sec, 1 factor, 43.9% done, 1073 sec/factor
p=25636029308333, 8149 p/sec, 1 factor, 46.5% done, ETA 05 Jan 14:56
25636029526061 | 12004589*2^12004589+1
p=25636029570551, 8198 p/sec, 2 factors, 47.9% done, 583 sec/factor
p=25636030094933, 8159 p/sec, 2 factors, 50.5% done, ETA 05 Jan 14:56
25636030632281 | 14263341*2^14263341+1
p=25636030750457, 7987 p/sec, 3 factors, 53.8% done, ETA 05 Jan 14:56
p=25636031274839, 8179 p/sec, 3 factors, 56.4% done, 459 sec/factor
p=25636031799271, 8184 p/sec, 3 factors, 59.0% done, ETA 05 Jan 14:56
p=25636032323789, 8154 p/sec, 3 factors, 61.6% done, 503 sec/factor
p=25636032848201, 8179 p/sec, 3 factors, 64.2% done, ETA 05 Jan 14:56
p=25636033372583, 8057 p/sec, 3 factors, 66.9% done, 553 sec/factor
p=25636033896971, 8120 p/sec, 3 factors, 69.5% done, ETA 05 Jan 14:56
p=25636034421299, 8152 p/sec, 3 factors, 72.1% done, 589 sec/factor
p=25636034945639, 8152 p/sec, 3 factors, 74.7% done, ETA 05 Jan 14:56
p=25636035470027, 8164 p/sec, 3 factors, 77.4% done, 631 sec/factor
p=25636035994523, 8169 p/sec, 3 factors, 80.0% done, ETA 05 Jan 14:56
p=25636036518899, 8131 p/sec, 3 factors, 82.6% done, 677 sec/factor
p=25636037043229, 8141 p/sec, 3 factors, 85.2% done, ETA 05 Jan 14:56
p=25636037567669, 8143 p/sec, 3 factors, 87.8% done, 719 sec/factor
p=25636038092047, 8190 p/sec, 3 factors, 90.5% done, ETA 05 Jan 14:56
p=25636038616417, 8266 p/sec, 3 factors, 93.1% done, 750 sec/factor
p=25636039140739, 8196 p/sec, 3 factors, 95.7% done, ETA 05 Jan 14:56
p=25636039665157, 8119 p/sec, 3 factors, 98.3% done, 807 sec/factor
gcwsieve 1.3.8 stopped: at p=25636040000000 because range is complete.
Found factors for 3 terms in 2454.810 sec. (expected about 0.03)
real 40m56.805s
user 40m53.689s
sys 0m0.764s
I renamed "gcwsieve_1250180098.sieveinput" to "sieve.txt". I found nothing with the smaller "gcwsieve_20110101.sieveinput" file. Is this only for testing?
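To put the CPU run above next to the GPU results, the wall-clock times can simply be divided. This is only a rough sketch: the runs come from different machines, so the ratios are indicative at best, and the dictionary labels are informal names for posts earlier in the thread:

```python
def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Ratio of CPU wall time to GPU wall time for the same p range."""
    return cpu_seconds / gpu_seconds


CPU_SECONDS = 2454.810  # gcwsieve run quoted in this post

# Elapsed times (seconds) reported by testers earlier in the thread.
gpu_runs = {
    "GTX 580": 29.52,
    "GTX 470": 34.66,
    "9800 GTX+ (Linux)": 133.70,
    "GT 240 (Vista)": 230.15,
}

for card, seconds in gpu_runs.items():
    print(f"{card}: {speedup(CPU_SECONDS, seconds):.1f}x")
```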
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
I renamed "gcwsieve_1250180098.sieveinput" to "sieve.txt". I found nothing with the smaller "gcwsieve_20110101.sieveinput" file. Is this only for testing?
gcwsieve_20110101 is a new sieve file. Those factors have already been removed from it.
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
Thanks John. The new App_info file is ready for takeoff...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Ken,
I know this isn't really your responsibility, but perhaps you know what the plans are.
Once a CW sieve app on CUDA is released, is there going to be a mechanism that allows people to select one CUDA app over the other without using app_info? (While separately selecting CPU tasks, of course.)
P.S. Thanks for all the incredible work. You have essentially revolutionized sieving here at PrimeGrid.
____________
My lucky number is 75898^524288+1 |
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
Once a CW sieve app on CUDA is released, is there going to be a mechanism that allows people to select one CUDA app over the other without using app_info? (While separately selecting CPU tasks, of course.)
This issue has not been resolved...and yes, it's a pretty significant issue. Outside of the app_info approach, there does not appear to be a way to separate the GPU projects.
The "cheat" that has allowed work to flow to the GPU while still being able to select CPU projects will still work. However, there will be no way to select PPS (Sieve) or CW (Sieve) for the GPU. Work will come from both.
At this time, it seems that PG is the only BOINC project that experiences this unique issue with all the sub-projects that are offered. Therefore, there doesn't appear to be much activity at Berkeley to resolve it. :(
However, maybe a BOINC developer will see our plight and come to the rescue.
____________
|
|
|
mfbabb2 Volunteer tester
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
|
Once a CW sieve app on CUDA is released, is there going to be a mechanism that allows people to select one CUDA app over the other without using app_info? (While separately selecting CPU tasks, of course.)
This issue has not been resolved...and yes, it's a pretty significant issue. Outside of the app_info approach, there does not appear to be a way to separate the GPU projects.
The "cheat" that has allowed work to flow to the GPU while still being able to select CPU projects will still work. However, there will be no way to select PPS (Sieve) or CW (Sieve) for the GPU. Work will come from both.
At this time, it seems that PG is the only BOINC project that experiences this unique issue with all the sub-projects that are offered. Therefore, there doesn't appear to be much activity at Berkeley to resolve it. :(
However, maybe a BOINC developer will see our plight and come to the rescue.
The immediate/simplest solution is to make each GPU project its own sub-project (independent from its CPU counterpart). If a user wants both, he/she will check two boxes; otherwise, just the box of the sub-project wanted.
____________
Murphy (AtP)
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
|
I renamed "gcwsieve_1250180098.sieveinput" to "sieve.txt". I found nothing with the smaller "gcwsieve_20110101.sieveinput" file. Is this only for testing?
gcwsieve_20110101 is a new sieve file. Those factors have already been removed from it.
I didn't know this was in the wild yet. I've just released v0.2.3a, which handles the new file format.
____________
|
|
|
mfbabb2 Volunteer tester
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
|
1/5/2011 11:25:48 AM NVIDIA GPU 0: Quadro NVS 135M (driver version 26099, CUDA version 3020, compute capability 1.1, 128MB, 13 GFLOPS peak)
stdout.txt:
cwpsieve version cuda-0.2.3 (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Found 3 factors
stderr.txt:
11:30:54 (4340): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: Quadro NVS 135M
Detected compute capability: 1.1
Detected 1 multiprocessors.
Thread 0 completed
Sieve complete: 25636020000000 <= p < 25636040000000
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 4834.21 sec. (0.08 init + 4834.13 sieve) at 4176 p/sec.
Processor time: 30.42 sec. (0.19 init + 30.23 sieve) at 667620 p/sec.
Average processor utilization: 2.40 (init), 0.01 (sieve)
12:51:28 (4340): called boinc_finish
cwpfactors.txt:
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
You cannot get a much more minimal GPU than this.
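For anyone who wants to double-check the reported factors without a GPU, here is a minimal Python sketch. It is not how cwpsieve works internally; it only tests whether p divides n*2^n+1 (Cullen) or n*2^n-1 (Woodall) using modular exponentiation, and expands the shorthand used on the command line (2563602e7). The helper names expand_range and divides_cw are made up for this illustration.

```python
from decimal import Decimal


def expand_range(arg: str) -> int:
    """Expand a shorthand bound such as '2563602e7' into an exact integer."""
    return int(Decimal(arg))


def divides_cw(p: int, n: int, sign: int) -> bool:
    """Return True if p divides n*2^n + sign.

    sign=+1 tests the Cullen form, sign=-1 the Woodall form.
    pow(2, n, p) keeps every intermediate value smaller than p.
    """
    return (n * pow(2, n, p) + sign) % p == 0


if __name__ == "__main__":
    pmin = expand_range("2563602e7")
    pmax = expand_range("2563604e7")
    # (p, n, sign) triples copied from the cwpfactors.txt output above.
    reported = [
        (25636026136339, 24184321, -1),
        (25636029526061, 12004589, +1),
        (25636030632281, 14263341, +1),
    ]
    for p, n, sign in reported:
        in_range = pmin <= p < pmax
        print(p, n, sign, "in range:", in_range, "divides:", divides_cw(p, n, sign))
```

Each check is a single modular exponentiation, so this runs instantly on any CPU. The slow part of sieving is trying every prime in the range, not confirming a factor once it is known.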
____________
Murphy (AtP)
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,655,586,930 RAC: 6,970,494
                                                
|
Ken,
I know this isn't really your responsibility, but perhaps you know what the plans are.
Once a CW sieve app on CUDA is released, is there going to be a mechanism that allows people to select one CUDA app over the other without using app_info? (While separately selecting CPU tasks, of course.)
P.S. Thanks for all the incredible work. You have essentially revolutionized sieving here at PrimeGrid.
Michael,
As one of the proponents of running VM's to solve BOINC's shortcomings, couldn't a solution be done with them?
i.e., - setup a single VM to run on all but one core. In the non-VM, run the GPU by selecting either CW or PPS sieves and deselecting the run on CPU option. On the VM, deselect the run on ATI and run on NVidia options and pick the CPU projects you want. You miss one core, but you should be able to crunch only the things you want to.
____________
141941*2^4299438-1 is prime!
|
|
|
|
I may just be an inexperienced idiot trying to do this, but this is what I get:
mmillerick@mmillerick-laptop:~$ sh '/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux' -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: �@@�w@8: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: @: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: cannot create ��Ni���*Ci����%/�S�f�G���z�K�+tv�q�ը�e��!âA�G���Bp�
B��=
�̍���q�Lΐb��Z��6�����F�P�{�M��d: Directory nonexistent
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: ELF: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 3: �,��cI��9�����2: not found
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux: 4: Syntax error: word unexpected (expecting ")")
What's with the "sh"? Try
/home/mmillerick/Desktop/cwpsieve-cuda-boinc-x86_64-linux -p2563602e7 -P2563604e7 -n 10000000 -N 25000000 |
|
|
|
Great stuff Ken, let the gold-rush on the C/W Sieve project begin!
I've done Mac builds (x86/x86_64 boinc/non-boinc), get them here http://www.pyramid-productions.net/downloads/cwpsieve-cuda.tar.gz. As usual, CUDA 3.2 is required since that's my build platform.
I've tested the range you suggested and it looks fine.
Cheers
- Iain |
|
|
pschoefer Volunteer developer Volunteer tester
 Send message
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 2,910,184,413 RAC: 268,519
                              
|
After the successful standalone tests I tried to run a few Cullen/Woodall Sieve WUs via BOINC.
Good news: WUs are completed, results are uploaded, run time ~130s (on GTX460), CPU time ~4s.
Bad news: They are declared invalid.
Is it just the validator ignoring them because of the short run time (I remember this issue with the very first ppsieve-CUDA WUs) or are there some more modifications to be done?
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"Scott Brown" wrote: Michael,
As one of the proponents of running VM's to solve BOINC's shortcomings, couldn't a solution be done with them?
i.e., - setup a single VM to run on all but one core. In the non-VM, run the GPU by selecting either CW or PPS sieves and deselecting the run on CPU option. On the VM, deselect the run on ATI and run on NVidia options and pick the CPU projects you want. You miss one core, but you should be able to crunch only the things you want to.
Running GPU stuff inside a VM is not possible with all virtualization solutions. VMware vSphere (ESX/ESXi4), Citrix XenServer and KVM can do it, Parallels Workstation only with some expensive Tesla-cards on HP-servers. Oracle's VirtualBox can't do it.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"pschoefer" wrote: After the successful standalone tests I tried to run a few Cullen/Woodall Sieve WUs via BOINC.
Good news: WUs are completed, results are uploaded, run time ~130s (on GTX460), CPU time ~4s.
Bad news: They are declared invalid.
Is it just the validator ignoring them because of the short run time (I remember this issue with the very first ppsieve-CUDA WUs) or are there some more modifications to be done?
You are a lucky guy. I lost this game on linux64.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
Bad news: They are declared invalid.
Is it just the validator ignoring them because of the short run time (I remember this issue with the very first ppsieve-CUDA WUs) or are there some more modifications to be done?
I already informed John yesterday who passed it to Rytis. Probably time issue.
____________
My stats |
|
|
Lumiukko Volunteer tester Send message
Joined: 7 Jul 08 Posts: 165 ID: 25183 Credit: 875,346,885 RAC: 114,595
                           
|
WIN7 x64 GTX275:
c:\Test>cwpsieve-cuda-x86-windows.exe -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3a (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 275
Detected compute capability: 1.3
Detected 30 multiprocessors.
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
p=25636033631489, 227.2K p/sec, 0.01 CPU cores, 68.2% done. ETA 06 Jan 10:09
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 98.39 sec. (0.03 init + 98.36 sieve) at 205214 p/sec.
Processor time: 0.67 sec. (0.03 init + 0.64 sieve) at 31508430 p/sec.
Average processor utilization: 1.14 (init), 0.01 (sieve)
Linux x64 GTX480:
./cwpsieve-cuda-x86_64-linux -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3a (testing)
Compiled Jan 5 2011 with GCC 4.3.3
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GTX 480
Detected compute capability: 2.0
Detected 15 multiprocessors.
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 41.97 sec. (0.02 init + 41.95 sieve) at 481141 p/sec.
Processor time: 11.26 sec. (0.03 init + 11.23 sieve) at 1797425 p/sec.
Average processor utilization: 1.48 (init), 0.27 (sieve)
--
Lumiukko |
|
|
|
MAC OS X 10.6.5, iMac, with a GT120 cuda 3.2.17, BOINC suspended, using Iain's build posted several hours ago:
gary% ./cwpsieve-cuda-x86_64-mac -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3a (testing)
Compiled Jan 5 2011 with GCC 4.2.1 (Apple Inc. build 5664)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
Sieve started: 25636020000000 <= p < 25636040000000
Thread 0 starting
Detected GPU 0: GeForce GT 120
Detected compute capability: 1.1
Detected 4 multiprocessors.
(factors as expected)
Thread 0 completed
Waiting for threads to exit
Sieve complete: 25636020000000 <= p < 25636040000000
Found 3 factors
count=647707,sum=0xe66f848aacfc21bb
Elapsed time: 792.81 sec. (0.01 init + 792.80 sieve) at 25460 p/sec.
Processor time: 5.45 sec. (0.02 init + 5.43 sieve) at 3720320 p/sec.
Average processor utilization: 1.97 (init), 0.01 (sieve)
--Gary
p.s. Just after I saw this thread, I suspended CPU work for CW Sieve and applied it elsewhere. :-) |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Ken,
I know this isn't really your responsibility, but perhaps you know what the plans are.
Once a CW sieve app on CUDA is released, is there going to be a mechanism that allows people to select one CUDA app over the other without using app_info? (While separately selecting CPU tasks, of course.)
P.S. Thanks for all the incredible work. You have essentially revolutionized sieving here at PrimeGrid.
Michael,
As one of the proponents of running VM's to solve BOINC's shortcomings, couldn't a solution be done with them?
i.e., - setup a single VM to run on all but one core. In the non-VM, run the GPU by selecting either CW or PPS sieves and deselecting the run on CPU option. On the VM, deselect the run on ATI and run on NVidia options and pick the CPU projects you want. You miss one core, but you should be able to crunch only the things you want to.
Yes.
Setting up VMs has its own difficulties, however. For one thing, scheduling of the processes running on the cores in the VM and outside the VM is not nearly as seamless as having everything running on the host machine.
It's not a solution for the masses. For a novice computer user, first setting up the VM, then setting up a second operating system, is a daunting task. For a typical Windows user, that means either buying a second copy of Windows or learning a new operating system. It's not actually hard to do, but most won't even attempt it.
Also, not all GPU crunchers are running dual GTX 580's in a 16 gig i7 system. Since the GPU requirements are so low for this app, very old GPUs can be used -- and that often means they're running in very old computers as well. Some of those computers will have less than one gig of main memory. On such a computer, running a VM wouldn't be very practical.
As I see it, I personally have three options:
1) Use app_info. Drawbacks include complexity and potential for error, having to set up information for every subproject, having to update the file every time ANY project here changes, and frequently clobbering every WU in your cache.
2) Use a VM. See above
3) Let the GPU run whatever it wants. This is what I'll probably do, since I don't have any particular preference between the two sieves. Even randomly doing workunits, a modern GPU would blow through both ruby badges in no time, so that wouldn't really hurt people trying for a badge.
3a) This is problematic for a challenge on one of the projects, but that's easily solved by switching the CPUs and GPUs to the one project and turning off the "send anything" flag.
____________
My lucky number is 75898524288+1 |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
After the successful standalone tests I tried to run a few Cullen/Woodall Sieve WUs via BOINC.
Good news: WUs are completed, results are uploaded, run time ~130s (on GTX460), CPU time ~4s.
Bad news: They are declared invalid.
Is it just the validator ignoring them because of the short run time (I remember this issue with the very first ppsieve-CUDA WUs) or are there some more modifications to be done?
It's running.
First unit with resultid=214457695 waits for validation, and for two units run in parallel (resultid=214458893 and resultid=214458877) I got the message: Stderr output
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: hostid
Skipping: 115189
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 0.000000
Skipping: /starting_elapsed_time
Sieve started: 102554440000000 <= p < 102554500000000
Thread 0 starting
Detected GPU 0: GeForce GT 240
Detected compute capability: 1.2
Detected 12 multiprocessors.
</stderr_txt>
]]>
Either I have a wrong or a too optimistic <flops> value in my app_info file. I'll investigate this.
[edit]
link corrected
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Works just fine on 32-bit WinXP SP3 with overclocked GTX 460 (810/1000/1620) with 1024MB GDDR5 on a Pentium 4 3.0GHz. Under 50 seconds with and without sieveinput. :)
Same output as the lot above me. All factors found, no runtime problems, etc. Cool.
Wonder how we can tweak the Linux version a bit to run on Mac...or is that too far in the future?
____________
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
After the successful standalone tests I tried to run a few Cullen/Woodall Sieve WUs via BOINC.
Good news: WUs are completed, results are uploaded, run time ~130s (on GTX460), CPU time ~4s.
Bad news: They are declared invalid.
Is it just the validator ignoring them because of the short run time (I remember this issue with the very first ppsieve-CUDA WUs) or are there some more modifications to be done?
It's running.
First unit with resultid=214457695 waits for validation....
Unfortunately, that didn't work at all. Even though the Result is successful, you have to check the work unit. It sent out a second task, so your result will soon be declared invalid. :(
Wonder how we can tweak the Linux version a bit to run on Mac...or is that too far in the future?
Did you see Iain's post? :)
____________
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
I think the invalidation problem is fixed in v0.2.3b. The factors file needed to be non-empty even when no factors were found.
____________
|
|
|
mfbabb2 Volunteer tester
 Send message
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
                     
|
After the successful standalone tests I tried to run a few Cullen/Woodall Sieve WUs via BOINC.
Good news: WUs are completed, results are uploaded, run time ~130s (on GTX460), CPU time ~4s.
Bad news: They are declared invalid.
...
How do I get the C/W CUDA to run under BOINC?
____________
Murphy (AtP)
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
With an app_info.xml file. Very carefully. ;)
____________
|
|
|
|
And app_info.xml should look like... ? :)
____________
Polish National Team |
|
|
|
validator still doesn't like them.. :(
edit: scratch that!
after another reset it works now:
215838353 149311603 6 Jan 2011 21:07:35 UTC 6 Jan 2011 21:15:58 UTC Completed and validated 319.54 4.26 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
214460619 150510273 6 Jan 2011 20:59:20 UTC 6 Jan 2011 21:04:48 UTC Completed and validated 318.47 4.29 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
214460486 150510277 6 Jan 2011 21:02:22 UTC 6 Jan 2011 21:15:58 UTC Completed and validated 306.20 4.21 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU) |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
And app_info.xml should look like... ? :)
Please see thread App_info file.
The update of site http://primegrid.pytalhost.net/ comes tomorrow.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Card/OS/Drivers: GTX 460 / 64 Bit Linux 2.6.32 / Driver Version 256.53
Clocks: Factory overclocked @725 MHz
6 Jan 2011 22:36:07 UTC 6 Jan 2011 22:43:22 UTC Completed and validated 145.32 25.13 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
6 Jan 2011 22:56:51 UTC 6 Jan 2011 23:01:49 UTC Completed and validated 143.06 25.31 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
6 Jan 2011 22:50:45 UTC 6 Jan 2011 22:58:32 UTC Completed and validated 142.79 24.49 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
Thanks to rroonnaalldd for the app_info templates.
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"Ken_g6" wrote: I think the invalidation problem is fixed in v0.2.3b. The factors file needed to be non-empty even when no factors were found.
Ken is it possible to release one Cuda-app for both Sieve projects?
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Vato Volunteer tester
 Send message
Joined: 2 Feb 08 Posts: 851 ID: 18447 Credit: 713,903,832 RAC: 1,641,370
                           
|
Another question:
Will there be a CPU version of cwpsieve?
Or will it not offer any benefit over the existing gcwsieve?
____________
|
|
|
|
Did you see Iain's post? :)
No -.-
Well I've got CUDA 3.2 on my laptop, might as well give it a shot! You did want tests with lower-end GPUs after all.
Might give it a try on BOINC too...
____________
|
|
|
|
as slow as it can get?
NVS 3100M:
7 Jan 2011 17:02:50 UTC Completed and validated 3,599.50 15.94 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU) |
|
|
|
WinVista 64
CPU:yorkfield
Nvidia 9800GT, BOINC
Run time 418.49
CPU time 3.24
Validate state Valid
Credit 182.43
http://www.primegrid.com/result.php?resultid=215597695
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
"Vato" wrote: Another question:
Will there be a CPU version of cwpsieve?
Or will it not offer any benefit over the existing gcwsieve?
Actually, that was the first thing I tried. It wasn't quite as fast as gcwsieve. It may be possible to make it faster than gcwsieve, but that would take quite a bit more work.
"rroonnaalldd" wrote: Ken is it possible to release one Cuda-app for both Sieve projects?
No. Well, yes, TPSieve could find the same factors as CWPSieve. But it would either find millions more that aren't Cullen/Woodall factors or it would take more memory than your computer has, or both! Plus it would use one entire CPU core and be much slower than CWPsieve. So, that's a no. :)
____________
|
|
|
|
81 seconds on a GTX460 (i7 X980 @ 4.01GHz & 12GB RAM, Win7 64 Ultimate)
cwpsieve-cuda>cwpsieve-cuda-boinc-x86-windows -p2563602e7 -P2563604e7 -n 10000000 -N 25000000
cwpsieve version cuda-0.2.3b (testing)
nstart=10000000, nstep=20
Changed nstep to 19
cwpsieve initialized: 10000000 <= n <= 25000000
25636026136339 | 24184321*2^24184321-1
25636029526061 | 12004589*2^12004589+1
25636030632281 | 14263341*2^14263341+1
Found 3 factors
Keep up the good work, all you clever folk that build these improvements! :)
____________
* (check primes)
|
|
|
mfbabb2 Volunteer tester
 Send message
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
                     
|
as slow as it can get?
NVS 3100M:
7 Jan 2011 17:02:50 UTC Completed and validated 3,599.50 15.94 182.43 Cullen/Woodall Prime Search (Sieve)
Anonymous platform (NVIDIA GPU)
No, my NVS 135M is slower than that!
____________
Murphy (AtP)
|
|
|
Lumiukko Volunteer tester Send message
Joined: 7 Jul 08 Posts: 165 ID: 25183 Credit: 875,346,885 RAC: 114,595
                           
|
The "cheat" that has allowed work to flow to the GPU while still being able to select CPU projects will still work. However, there will be no way to select PPS (Sieve) or CW (Sieve) for the GPU. Work will come from both.
Is it possible to have third option for the subproject selection in the Primegrid preferences?
Instead of current "Yes/No" there would be "Yes/No/Never".
Meaning:
"Yes" = Send work from this subproject
"No" = Do not send work from this subproject unless there is no "Yes" work.
"Never" = Never send work from this subproject.
---
Converted from current settings:
If "Send work from any subproject if ..." is selected, then
Yes => Yes
No => No
If "Send work from any subproject if ..." is not selected, then
Yes => Yes
No => Never
The "Send work from any subproject if ..." setting would not be needed anymore.
---
That would allow the "cheat" to continue with subproject selection:
Set only PPS (Sieve) to "Never" and you would get CW (Sieve) for the GPU.
Set only CW (Sieve) to "Never" and you would get PPS (Sieve) for the GPU.
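The proposal above can be sketched in a few lines of Python. This is only an illustration of the suggested rules, not anything the PrimeGrid server actually does; the names Pref, convert and eligible are made up, and "LLR subproject (CPU only)" is a placeholder for whatever CPU project you have selected.

```python
from enum import Enum
from typing import Dict, List, Set


class Pref(Enum):
    YES = "Yes"      # send work from this subproject
    NO = "No"        # only send if there is no "Yes" work
    NEVER = "Never"  # never send work from this subproject


def convert(old_selected: bool, send_any: bool) -> Pref:
    """Map the current Yes/No checkbox plus the 'send work from any
    subproject if ...' flag onto the proposed three-way setting."""
    if old_selected:
        return Pref.YES
    return Pref.NO if send_any else Pref.NEVER


def eligible(prefs: Dict[str, Pref], available: Set[str]) -> List[str]:
    """Subprojects that may be sent, given the set that currently has
    work for the requesting device (CPU or GPU)."""
    yes = [s for s, p in prefs.items() if p is Pref.YES and s in available]
    if yes:
        return yes
    # Fall back to "No" subprojects only when no "Yes" work exists.
    return [s for s, p in prefs.items() if p is Pref.NO and s in available]


if __name__ == "__main__":
    prefs = {
        "PPS (Sieve)": Pref.NEVER,
        "CW (Sieve)": Pref.NO,
        "LLR subproject (CPU only)": Pref.YES,  # hypothetical placeholder
    }
    gpu_work = {"PPS (Sieve)", "CW (Sieve)"}
    print(eligible(prefs, gpu_work))
```

With these settings the GPU has no "Yes" work available, so it falls back to CW (Sieve) and never receives PPS (Sieve), which is the behaviour described above.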
--
Lumiukko |
|
|
mfbabb2 Volunteer tester
 Send message
Joined: 10 Oct 08 Posts: 510 ID: 30360 Credit: 20,784,268 RAC: 731
                     
|
Another way around the problem:
Create 2 new sub-projects for GPU only --
CW (Sieve) for the GPU.
PPS (Sieve) for the GPU.
And make the current sub-projects CPU only --
CW (Sieve) for the CPU.
PPS (Sieve) for the CPU.
Credit could be combined or not as Project Admins determine.
____________
Murphy (AtP)
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
A third way around the problem: These GPU-capable projects should be advancing very quickly. If they're not past the point of optimal efficiency on CPUs they should be soon. Once they are, why not disable the CPU clients and make them GPU-only projects?
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
A third way around the problem: These GPU-capable projects should be advancing very quickly. If they're not past the point of optimal efficiency on CPUs they should be soon. Once they are, why not disable the CPU clients and make them GPU-only projects?
Full ack.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
A third way around the problem: These GPU-capable projects should be advancing very quickly. If they're not past the point of optimal efficiency on CPUs they should be soon. Once they are, why not disable the CPU clients and make them GPU-only projects?
I don't agree with this because:
1. CPUs working on a project still get work done.
2. Stopping projects once they are past where they would be optimal with just CPUs working on it would make it harder for people with just CPUs to get badges.
If people want to set their CPUs on a project they should be allowed to.
If the projects are split into two separate projects then they should both count towards the same badge and not different ones.
____________
|
|
|
pschoefer Volunteer developer Volunteer tester
 Send message
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 2,910,184,413 RAC: 268,519
                              
|
Even with the latest CWPSieve CUDA some WUs are not validated: 151348618
Seems to be a validator issue, all WUs with less than 3 seconds of CPU time are marked as invalid. ;)
____________
|
|
|
|
Even with the latest CWPSieve CUDA some WUs are not validated: 151348618
Seems to be a validator issue, all WUs with less than 3 seconds of CPU time are marked as invalid. ;)
looks like a ticket for speeding.. :( |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Even with the latest CWPSieve CUDA some WUs are not validated: 151348618
Seems to be a validator issue, all WUs with less than 3 seconds of CPU time are marked as invalid. ;)
Maybe running 2 WUs at once is the solution...
If you are using an app_info:
<app>
<name>gcwsieve</name>
<user_friendly_name>Cullen/Woodall Prime Search (Sieve)</user_friendly_name>
</app>
<file_info>
<name>gcwsieve_20110101.sieveinput_in</name>
<status>1</status>
<sticky/>
</file_info>
<file_info>
<name>gcwsieve_1250180098.sieveinput_in</name>
<status>1</status>
<sticky/>
</file_info>
<file_info>
<name>cwpsieve-cuda-boinc-x86_64-linux</name>
<status>1</status>
<executable/>
</file_info>
<file_info>
<name>libcudart.so.2</name>
<status>1</status>
</file_info>
<file_info>
<name>stat_primegrid.png</name>
<status>1</status>
</file_info>
<file_info>
<name>primegrid_slideshow_00.png</name>
<status>1</status>
</file_info>
<app_version>
<app_name>gcwsieve</app_name>
<version_num>112</version_num>
<platform>x86_64-pc-linux-gnu</platform>
<avg_ncpus>0.020000</avg_ncpus>
<max_ncpus>0.020000</max_ncpus>
<flops>1000000000.000000</flops>
<plan_class>cuda23</plan_class>
<api_version>6.2.18</api_version>
<file_ref>
<file_name>cwpsieve-cuda-boinc-x86_64-linux</file_name>
<open_name>cwpsieve-cuda-boinc-x86_64-linux</open_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libcudart.so.2</file_name>
<open_name>libcudart.so.2</open_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>0.500000</count>
</coproc>
<cmdline></cmdline>
<gpu_ram>268435456.000000</gpu_ram>
</app_version>
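The part of that file that makes two WUs run at once is the <count>0.500000</count> entry inside <coproc>: each task claims half a GPU. A small Python sketch, using a trimmed copy of the app_version above and a made-up helper name tasks_per_gpu, shows the arithmetic. It assumes the client simply runs as many tasks as fit into one GPU.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <app_version> block from the post above.
APP_VERSION = """
<app_version>
  <app_name>gcwsieve</app_name>
  <avg_ncpus>0.020000</avg_ncpus>
  <coproc>
    <type>CUDA</type>
    <count>0.500000</count>
  </coproc>
</app_version>
"""


def tasks_per_gpu(app_version_xml: str) -> int:
    """Number of tasks that fit on one GPU for the given coproc count."""
    root = ET.fromstring(app_version_xml)
    count = float(root.findtext("coproc/count"))
    # Small epsilon guards against values like 0.333333 rounding down.
    return int(1.0 / count + 1e-9)


if __name__ == "__main__":
    print(tasks_per_gpu(APP_VERSION))
```

Setting the count back to 1.000000 would return to one task per GPU.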
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"Ken_g6 wrote: I think the invalidation problem is fixed in v0.2.3b. The factors file needed to be non-empty even when no factors were found.
Have you compiled a new version?
"cwpsieve-cuda.zip - on Jan 8, 2011 1:55 AM by Ken Brazier (version 4 / earlier versions)"
I see no changes:
boinc@vmware2k-3:~/Cuda/cwpsieve$ ./cwpsieve-cuda-x86_64-linux
cwpsieve version cuda-0.2.3b (testing)
Compiled Jan 6 2011 with GCC 4.3.3
pmax not specified, using default pmax = pmin + 1e9
Please specify an input file or nmax
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Oops. I meant to add cudart.dll to GeneferCUDA.zip. I added it to CWPSieve.zip by mistake. As it's the wrong version for CWPSieve, I've removed it.
Thanks for catching that!
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
Despite running apps on GPUs for quite a long time, so not being new to GPU computing, I'm pretty new to nVidia (I preferred ATI/AMD for years). And since Fermi is an interesting card, I got one.
My question would be - any difference in speed vs driver version on nVidia?
Or is it just fine once you get it running (see ATI issues with OpenCL)?
____________
My stats |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
On Linux, 256.53 is the best nVIDIA driver. But you're on Windows, and I don't know what's best there.
____________
|
|
|
|
The latest version is 260.99 and I am having no problems with it.
I've tested CWPsieve CUDA just fine with my overclocked GTX 460, but have yet to do so in BOINC because I read that using an app_info.xml resets/deletes all work currently in BOINC...so that'll have to wait til tomorrow, once my machine has run through its current stockpile.
Hopefully there are no snags with the CUDA 3.2 toolkit being installed and CWPsieve using cuda23, not cuda31 or cuda32.
Why is that, by the way?
____________
|
|
|
|
Hopefully there are no snags with the CUDA 3.2 toolkit being installed and CWPsieve using cuda23, not cuda31 or cuda32.
Why is that, by the way?
1. The initial ppsieve/tpsieve developed was done with the CUDA 2.3 SDK (without a GPU!)
2. Recompiling the app with newer SDKs brings no (noticeable) speed gains. There are some people out there who have done that ;)
3. Compiling the apps with the 3.x SDKs would force people to update their drivers without any necessity.
3a. Newer drivers are not always faster, better, more stable or more bug free. The 260 drivers are the proof for this statement...
____________
|
|
|
|
1. The initial ppsieve/tpsieve developed was done with the CUDA 2.3 SDK (without a GPU!)
Replace developed with development ;)
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
New version 0.2.3c released. Here are the changes:
Date: Tue Jan 11 14:06:35 2011 -0700
Version number change: v0.2.3c
Date: Tue Jan 11 12:47:44 2011 -0700
Fixed bmove() error return value.
Date: Tue Jan 11 12:22:10 2011 -0700
Fixed factor counts for invalid checkpoints.
Date: Tue Jan 11 11:34:50 2011 -0700
Factor file checks/improvements
- If there is a checkpoint on a range and factors are expected, but not
found in the factors file, the range is restarted.
- On BOINC, a temporary factor file is used, then rename()d to the final
name.
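The temporary-file-then-rename() idea in the last item is a common way to make sure a reader never sees a half-written file. The app itself is not written in Python and this is not its code; the sketch below only illustrates the pattern, and the function name write_factors_atomically is made up.

```python
import os
import tempfile
from typing import Iterable


def write_factors_atomically(path: str, lines: Iterable[str]) -> None:
    """Write all lines to a temporary file, then rename it over `path`.

    The rename happens only after the data is flushed to disk, so anything
    reading `path` sees either the old file or the complete new one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            for line in lines:
                tmp.write(line + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    write_factors_atomically(
        "cwpfactors.txt",
        ["25636026136339 | 24184321*2^24184321-1"],
    )
```

If the task is killed before the rename, only the temporary file is left behind and the final factors file is untouched.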
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Another way around the problem:
Create 2 new sub-projects for GPU only --
CW (Sieve) for the GPU.
PPS (Sieve) for the GPU.
And make the current sub-projects CPU only --
CW (Sieve) for the CPU.
PPS (Sieve) for the CPU.
Credit could be combined or not as Project Admins determine.
Sounds to me like Murphy has the best idea. Is it difficult to add sub projects to the preferences? As more and more GPU apps are developed this issue (if not resolved) will just get worse.
John said the boinc/pg app might be ready soon so that it would run without the app_info file. Was just wondering again how close that was to being done?
Thanks
____________
@AggieThePew
|
|
|
|
Thanks Microcruncher*. Once I get Linux running I'm sure I'll know what you mean by newer drivers being slower...
I said I'd test this with BOINC now but I can't...cruncher restarted itself because of an "update" and then booted to the wrong OS. >_>
I'll wait patiently.
Rick I'd give it a week, maybe more...but not too much more.
____________
|
|
|
|
Once again... patience is a virtue but not necessarily a blessing.
____________
@AggieThePew
|
|
|
|
Thanks Microcruncher*. Once I get Linux running I'm sure I'll know what you mean by newer drivers being slower...
The (not so) funny thing is: I built the ppsieve/tpsieve apps with the SDK 3.1/3.2 in the last few days and I ended up with the same error that initially appeared when running tpsieve with the 260 drivers. In my case the error occurred with the 256.53 drivers and the CUDA 3.1 SDK but the app worked without flaws with the 260 drivers and the CUDA 3.2 SDK. Things really interact in weird ways...
____________
|
|
|
|
I think a lot of it has to do with how Linux allocates memory and virtual resources to devices, and what instructions a program gives that may or may not contradict that.
I certainly run into my fair share of computation errors on Windows, but that could be due to my card's high OC sometimes exceeding the memory allocation of the system itself!
____________
|
|
|
|
What compute capability and memory is needed? I have an nVidia Quadro NVS 135M (1.1 - 128MB).
Just like TPSieve: Compute capability 1.0 and practically no memory.
I'm not sure anyone's tried any of my sieves on a 128MB card. In theory it should work.
You are right: my notebook's 8400M G with 128MB does execute gcwsieve-cuda WUs, but all of them failed with errors after 1100 sec with the message
Maximum elapsed time exceeded
http://www.primegrid.com/result.php?resultid=215609709
http://www.primegrid.com/results.php?userid=64131&offset=0&show_names=0&state=5&appid=
____________
|
|
|
|
The (not so) funny thing is: I built the ppsieve/tpsieve apps with the SDK 3.1/3.2 over the last few days and I ended up with the same error that initially appeared when running tpsieve with the 260 drivers. In my case the error occurred with the 256.53 drivers and the CUDA 3.1 SDK, but the app worked without flaws with the 260 drivers and the CUDA 3.2 SDK. Things really interact in weird ways...
So, is it possible that if I upgrade CUDA to the latest, I'll be able to use the 260 driver without immediate comp errors? 256 doesn't play well with xorg-server 1.9 on my computer. I'd like to use my video card for video as well as crunching. |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"x3mEn" wrote: What compute capability and memory is needed? I have an nVidia Quadro NVS 135M (1.1 - 128MB).
Just like TPSieve: Compute capability 1.0 and practically no memory.
I'm not sure anyone's tried any of my sieves on a 128MB card. In theory it should work.
You are right: my notebook's 8400M G with 128MB does execute gcwsieve-cuda WUs, but all of them failed with errors after 1100 sec with the message
Maximum elapsed time exceeded
http://www.primegrid.com/result.php?resultid=215609709
http://www.primegrid.com/results.php?userid=64131&offset=0&show_names=0&state=5&appid=
You are using an app_info.xml file. The problem is your entry for <flops>1e11</flops>.
An 8400M is a low-performance card, so you should lower this value to <flops>1e9</flops> at least. If that doesn't help, lower it to <flops>1e8</flops>.
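A rough sketch in Python of why the <flops> entry matters. As far as I understand the BOINC client, it aborts a task with "Maximum elapsed time exceeded" once the run time passes the workunit's rsc_fpops_bound divided by the app version's <flops>. The bound used below is an assumption, back-calculated from the ~1100 sec failures seen with <flops>1e11</flops>; the real workunit value may differ.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, flops: float) -> float:
    """Time limit the client is assumed to enforce: fpops bound / claimed speed."""
    return rsc_fpops_bound / flops


# Assumption: inferred from tasks being killed after ~1100 s at <flops>1e11</flops>.
ASSUMED_FPOPS_BOUND = 1100 * 1e11

for flops in (1e11, 1e10, 1e9, 1e8):
    limit = max_elapsed_seconds(ASSUMED_FPOPS_BOUND, flops)
    print(f"<flops>{flops:.0e}</flops> -> limit of about {limit:,.0f} s")
```

Under that assumption, every factor of 10 taken off <flops> gives a slow card 10 times longer before the task is aborted.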
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
rroonnaalldd,
ok, I'll try. |
|
|
|
Ok...so this works fine from cmd with both the boinc and standalone version.
But trying to use app_info.xml (provided by Siegfried Niklas) is just a headache. No matter what I do or what I change, all I get is:
Sending scheduler request: requested by user.
Not reporting or requesting tasks.
Scheduler request complete.
WHY? It's working and validating WUs for others. How come nothing I do will get it to work?
Yes it's the 0.2.3b version. WinXP 32bit + GTX 460. What do I need to do with the app_info (or some other xml), or my PrimeGrid prefs, to get the sieve to work and report/validate WUs?
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Ok...so this works fine from cmd with both the boinc and standalone version.
But trying to use app_info.xml (provided by Siegfried Niklas) is just a headache. No matter what I do or what I change, all I get is:
Sending scheduler request: requested by user.
Not reporting or requesting tasks.
Scheduler request complete.
WHY? It's working and validating WUs for others. How come nothing I do will get it to work?
Yes it's the 0.2.3b version. WinXP 32bit + GTX 460. What do I need to do with the app_info (or some other xml), or my PrimeGrid prefs, to get the sieve to work and report/validate WUs?
"Siegfried Niklas" wrote: <app_version>
<app_name>gcwsieve</app_name>
<version_num>019</version_num>
Take a look at http://www.primegrid.com/apps.php.
I think version number should be 112 for "Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU" and 101 for "Microsoft Windows (98 or later) running on an Intel x86-compatible CPU"...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Ok I'll redo that app_info when I get home then.
I don't need to do init_data or anything, correct? Also, and this really shows my ineptitude, my PrimeGrid preferences are set to "Use NViDIA GPU" and not to use the CPU, with only Cullen/Woodall Prime Search (sieve) selected. Exactly the approach I use to get only PPS sieve CUDA units on this machine. So it's the same idea...right?
Thanks rrrrronald. I'll update this in a couple hours once I've tried it out!
____________
|
|
|
|
"rroonnaalldd" wrote:
"Siegfried Niklas" wrote:
<app_version>
<app_name>gcwsieve</app_name>
<version_num>019</version_num>
Take a look at http://www.primegrid.com/apps.php.
I think version number should be 112 for "Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU" and 101 for "Microsoft Windows (98 or later) running on an Intel x86-compatible CPU"...
In my first app_info I used the version number 112, but in BM the Version Number was shown as 019 - so I changed it.
Currently I run it with version number 019, but BM shows 0.01
Doesn't matter for work-fetch (Win Vista, 9800GT, BM 6.10.58)
>>Host 94904
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
|
Hey again.
Doesn't matter what I change it to. 101, 112, 019, 023, 001 - nothing works.
I still get "Not reporting or requesting tasks." I AM requesting tasks!!
What do I need to put in the www.primegrid.com directory in my BOINC data folder, exactly? I have the two sieveinput_in files, the cuda-boinc-windows-intel86.exe, and app_info.xml.
"Found app_info - using anonymous platform" - okay, good! Now why won't it get/do work?
____________
|
|
|
|
Ah never mind, this is stupid. It's not worth all this frustration on my part. Clearly someone messed up somewhere. Whether it's me or some coder, I can't tell. All I see is tasks working on other computers and not on mine, regardless of how closely I follow instructions.
Some other project can have my GPU for now. I spent hours trying to get GeneferCUDA to work (and it does now) so might as well be that.
I'll check back when CWPsieve(CUDA) is a bit more stable :)
____________
|
|
|
|
It is CWPSieve CUDA testing
I've been a tester for years and have felt "desperate" many times.
If you are looking for "smooth crunching", a testing project is the wrong place.
This is the whole content of my www.primegrid.com folder
For crunching gcwsieve you need:
app_info.xml
cwpsieve-cuda-boinc-x86-windows.exe
gcwsieve_20110101.sieveinput_in
gcwsieve_1250180098.sieveinput_in
ppconfig.txt
cudart.dll
Downloads: http://www.primegrid.com/forum_thread.php?id=2742&nowrap=true#30571
Your BM is not requesting tasks.
1. Stop all other GPU-Projects in BM
2. Reset Primegrid
3. Stop BM
4. Copy the needed files into the "www.primegrid.com" folder
5. Start BM
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
ppconfig.txt
I found no ppconfig.txt in the PG download folder. Either this file isn't needed or comes with the WU...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Siegfried, I understand it's in the testing phase. I just don't understand why it works for you and not me, since we are using essentially the same platforms. Then again, take a look at my nickname. Code and compiling and testing has always always always gotten the better of me.
Hence I enjoy being involved in testing, even minimally. I helped test GeneferCUDA and was able to help find a lot of runtime errors (which were probably my fault in the end) so this project seemed interesting as well, given I love GPU crunching :P
That said. I have the ppconfig.txt in my BOINC directory under PrimeGrid. I have reset the project, and it downloads all the needed files again...but will not get tasks.
Given the screenshot you posted earlier, I take it I do not have to detach all other projects! So I won't go that far.
I will however see about getting a 32-bit Linux up and running, maybe in the next day (I found a spare hard drive in some odd part of my room).
I think with my case, it is the version number causing problems. However I see nowhere a stderr.txt, or something saying "hey something went wrong."
Perhaps it's just something to be worked out in testing...oh well, they said it'd be live in the near future. I just kinda hoped I could be more helpful in testing since I know my way around an overclocked GPU. ;)
____________
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
ppconfig.txt
I found no ppconfig.txt in the PG download folder. Either this file isn't needed or comes with the WU...
It's not needed. Also, I set my Version to 192, so I don't think that matters either.
____________
|
|
|
|
Bleurgh. Y'know what? I'm gonna do what made GeneferCUDA work, and simply delete everything associated with CWPsieve CUDA, reset PrimeGrid, and then rewrite the app_info.xml (based on Siegfried's)
Ken, the latest version of CWP CUDA reads on cmd as 0.2.3c (testing) - that's the latest version right? As in, if I set up everything correctly with that, I will get WUs and theoretically they will validate?
Waiting on a rather large GFN task to finish hogging my 460 and then it's off to work...
____________
|
|
|
|
Ok, I'm through with being a P.I.T.A here - rrrronald's done it, thanks for the app_info template. I had made a similar one but left out some things and had a few things, like <coproc> marked incorrectly.
Now I'm doing tasks in two minutes and they're validating.
Well I'll be going now. ;)
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
130sec for 182 credits sounds great.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Yup, and this is without my overkill-overclocking to 910/1200/1820. :D
Which I won't do, btw, my room is hot enough...
Now to chase those badges... ;)
____________
|
|
|
|
@NullCoding
A tough going - congratulations on your success :)
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
|
I installed under BOINC Manager using the app_info supplied by Siegfried Niklas (thanks!). No problems apart from downloading the sieve file. I changed my preferences to do cpu work and got the sieve file that way.
Only thing that surprised me was that 4 GPU tasks ran at the same time, each took c. 6mins to complete (W7 64bit GTX 570 stock speeds).
I've switched back to ppsieve as for 6mins of crunching there I get 2300 credits and for cwsieve I get 4x180 credits.
I'm sure that will change.
Well done all.
____________
35 x 2^3587843+1 is prime! |
|
|
|
I followed the advice from this post and GCW Sieve crunches on my NVIDIA GTS 450 without problems.
Runtimes are ~ 450s/wu with 2 workunits parallel.
Now heading towards the silver badge!
Tasks and Computer specs: click |
|
|
|
I put the files in the folder but I do not receive units to calculate .... ?
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
app_info.xml in the PG-folder?
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Yes, I put the app_info file in the PrimeGrid folder.
____________
|
|
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,659,101,651 RAC: 314,262
                     
|
I put the files in the folder but I do not receive units to calculate .... ?
Same here, app in folder, cudart in folder, exe's in folder, but BOINC Manager keeps trying to download the 2 sieveinput files even though I put them in the folder too ... ?
1/14/2011 2:17:38 PM PrimeGrid Backing off 1 min 0 sec on download of gcwsieve_1250180098.sieveinput_in
1/14/2011 2:17:38 PM PrimeGrid Backing off 1 min 0 sec on download of gcwsieve_20110101.sieveinput_in
____________
|
|
|
|
I'll check my own app_info and the information I provided on my main system tomorrow.
I set it (crunching PPS-CUDA) to no new work.
I'll follow my own descriptions to get it to run.
(I7-980X, GTX295, Win7 prof., HOST 95767)
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
I wrote another app_info for Win32 and according to "NullCoding" >>Ok, I'm through with being a P.I.T.A here - rrrronald's done it, thanks for the app_info template. I had made a similar one but left out some things and had a few things, like <coproc> marked incorrectly.<< it works:
<app_info>
<app>
<name>gcwsieve</name>
<user_friendly_name>Cullen/Woodall Prime Search (Sieve)</user_friendly_name>
</app>
<file_info>
<name>gcwsieve_20110101.sieveinput_in</name>
<status>1</status>
<sticky/>
</file_info>
<file_info>
<name>gcwsieve_1250180098.sieveinput_in</name>
<status>1</status>
<sticky/>
</file_info>
<file_info>
<name>cwpsieve-cuda-boinc-x86-windows.exe</name>
<status>1</status>
<executable/>
</file_info>
<file_info>
<name>cudart.dll</name>
<status>1</status>
</file_info>
<file_info>
<name>stat_primegrid.png</name>
<status>1</status>
</file_info>
<file_info>
<name>primegrid_slideshow_00.png</name>
<status>1</status>
</file_info>
<app_version>
<app_name>gcwsieve</app_name>
<version_num>101</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.020000</avg_ncpus>
<max_ncpus>0.020000</max_ncpus>
<flops>1000000000.000000</flops>
<plan_class>cuda23</plan_class>
<api_version>6.2.18</api_version>
<file_ref>
<file_name>cwpsieve-cuda-boinc-x86-windows.exe</file_name>
<open_name>cwpsieve-cuda-boinc-x86-windows.exe</open_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart.dll</file_name>
<open_name>cudart.dll</open_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1.000000</count>
</coproc>
<cmdline></cmdline>
<gpu_ram>268435456.000000</gpu_ram>
</app_version>
</app_info>
Needed files have to be manually downloaded into the PG-folder (as written by "Siegfried Niklas"):
http://www.primegrid.com/download/gcwsieve_1250180098.sieveinput and rename it to gcwsieve_1250180098.sieveinput_in
http://www.primegrid.com/download/gcwsieve_20110101.sieveinput and rename it to gcwsieve_20110101.sieveinput_in
http://www.primegrid.com/download/cudart.dll
https://sites.google.com/site/kenscode/prime-programs/cwpsieve-cuda.zip?attredirects=0&d=1 and unzip the file "cwpsieve-cuda-boinc-x86-windows.exe" into the PG-folder as well
[add]
renaming the sieve files
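Since most "no work" reports in this thread came down to a missing or misnamed file, here is a small Python sketch that checks an app_info.xml against the folder it sits in. The function names are made up for illustration; it only relies on the <file_info>/<name> and <file_ref>/<file_name> tags shown in the app_info above, and it is not part of BOINC.

```python
import os
import xml.etree.ElementTree as ET


def missing_files(app_info_path):
    """Names listed in <file_info> that are not present next to app_info.xml."""
    folder = os.path.dirname(os.path.abspath(app_info_path))
    root = ET.parse(app_info_path).getroot()  # raises ParseError on malformed XML
    names = [(fi.findtext("name") or "").strip() for fi in root.iter("file_info")]
    return [n for n in names if n and not os.path.isfile(os.path.join(folder, n))]


def undeclared_refs(app_info_path):
    """Names used in <file_ref> that have no matching <file_info> entry."""
    root = ET.parse(app_info_path).getroot()
    declared = {(fi.findtext("name") or "").strip() for fi in root.iter("file_info")}
    refs = [(fr.findtext("file_name") or "").strip() for fr in root.iter("file_ref")]
    return [r for r in refs if r and r not in declared]
```

Calling missing_files("app_info.xml") from inside the www.primegrid.com folder should return an empty list when everything is in place, for example after the sieve files have been renamed to end in .sieveinput_in.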
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
I wrote another app_info for Win32 and according to "NullCoding"
Thanks for that.
I checked my own app_info on another system and it works without any problem.
But I'm not able to test it on FERMI based cards - due to a lack of hardware :(
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
|
No problems with workfetch but with validation.
We may have the ongoing problem that all WU's with a CPU-time
equal or below 3.00 seconds are marked as invalid
Valid
Invalid
another system:
Valid
Invalid
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
No problems with workfetch but with validation.
We may have the ongoing problem that all WU's with a CPU-time
equal or below 3.00 seconds are marked as invalid
Valid
Invalid
another system:
Valid
Invalid
Two possible solutions.
Either you run 2 or more units in parallel:
<coproc>
<type>CUDA</type>
<count>0.5</count>
</coproc>
[add] Sometimes you must also lower the flops value (this prevents the message "Maximum elapsed time exceeded" after ~1100sec):
<flops>1e9</flops>
...or you increase the values for avg_ncpus and max_ncpus:
<avg_ncpus>0.5</avg_ncpus>
<max_ncpus>0.5</max_ncpus>
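For anyone unsure what the <count> value does: as I understand it, each task claims that fraction of one GPU, and BOINC starts tasks on a GPU as long as the claimed fractions add up to no more than 1. A minimal Python sketch of that arithmetic (the function name is hypothetical, and real scheduling also depends on GPU memory and CPU availability):

```python
from fractions import Fraction


def tasks_per_gpu(count: str) -> int:
    """How many tasks fit on one GPU if each claims `count` of it."""
    share = Fraction(count)  # exact arithmetic, so "0.25" is really 1/4
    if share <= 0:
        raise ValueError("count must be positive")
    return int(1 / share)    # whole tasks only; any remainder is dropped


for value in ("1.000000", "0.5", "0.25"):
    print(f"<count>{value}</count> -> {tasks_per_gpu(value)} task(s) per GPU")
```

So <count>0.5</count> gives two tasks per GPU, and <count>0.25</count> gives four.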
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
"rroonnaalldd" wrote:
...
Two possible solutions.
Either you run 2 or more units in parallel:
<coproc>
<type>CUDA</type>
<count>0.5</count>
</coproc>
...
I changed to running 2 units in parallel.
This works :)
Thanks
____________
Member of Crunching Family
http://crunching-family.at/ |
|
|
|
I was gonna make an app_info file for Macs after seeing if I could get it to work on mine...but where the heck did the Mac builds go? Iain posted them earlier - was there a problem, or am I not looking hard enough?
____________
|
|
|
|
lowering the flops value (prevents the message "Maximum elapsed time exceeded" after ~1100sec):
<flops>1e9</flops>
rroonnaalldd, I think it will be interesting for you, lowering flops to 1e9 really solved the problem:
http://www.primegrid.com/result.php?resultid=216670011
2:48:43 @ 8400M vs 0:01:44 @ 460GTX
not bad result as for integrated videocard, yeh? ;) |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
lowering the flops value (prevents the message "Maximum elapsed time exceeded" after ~1100sec):
<flops>1e9</flops>
rroonnaalldd, I think it will be interesting for you, lowering flops to 1e9 really solved the problem:
http://www.primegrid.com/result.php?resultid=216670011
2:48:43 @ 8400M vs 0:01:44 @ 460GTX
not bad result as for integrated videocard, yeh? ;)
Congratulations on your success.
After some trouble with "Maximum elapsed time exceeded" while running two CWPsieve tasks in parallel, I changed this flops value from 1e11 to 1e9 in all CUDA app_info files at http://primegrid.pytalhost.net/. Not every user has a Fermi card...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Please note, the Mac builds at http://www.pyramid-productions.net/downloads/cwpsieve-cuda.tar.gz are now updated to 0.2.3c.
Cheers
- Iain |
|
|
|
Excellent, thanks Iain. :)
Took a bit of tweaking to get things to run on my MBP + 330M. Now it's working just fine (albeit quite slowly and with much interface lag).
My laptop GPU has 256MB of memory, and when I originally set up the app_info, it was scaled to 256MB of memory. This sent BOINC into an infinite loop where it DID download tasks but would not start them - "Waiting for GPU memory."
A bit of math and now it's running. I figured 128MB*1024KB/MB*1024B/KB=134217728, which goes in the <gpu_ram> tag. I guess I could use more but the lag might be too much for the machine, even though the display is in sleep mode 80% of the time!
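The same conversion as a tiny Python sketch, in case anyone wants a different amount. The helper name is just for illustration; the only fact it uses is that the tag takes a value in bytes.

```python
def gpu_ram_bytes(megabytes: int) -> int:
    """Convert megabytes (1 MB = 1024 * 1024 bytes) to the byte value for <gpu_ram>."""
    return megabytes * 1024 * 1024


for mb in (128, 256):
    print(f"{mb} MB -> <gpu_ram>{gpu_ram_bytes(mb)}.000000</gpu_ram>")
```

For 256 this gives 268435456, the value used in the Win32 app_info posted earlier in the thread.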
Anyway I now have CWPsieve CUDA running on all the machines capable of it. Cool stuff. :)
Mac users, note that any time you modify the BOINC Data folder, you must right click on the www.primegrid.com folder, click Get Info, and then click the wheel at the bottom under "Sharing and Permissions" making sure that boinc_master and boinc_project have read and write, then click "Apply to all" or whatever that says. Say yes. That makes things like the app_info and the cwpsieve-cuda-boinc-x86_64-mac app owned NOT by you.
If you have ownership of anything in the BOINC folder it can't use it and won't open. That confused me at first but I found a workaround.
See I'm a Mac guy through and through. This stuff's no problem. At least, not yet.
____________
|
|
|
|
App_info file for Mac including builds by Iain Bethune and libcudart.dylib and its signature - here.
Tested and working perfectly on a Core i5 MBP, Mac OS X 10.6.6 w/GeForce GT 330M. Roughly half hour per WU.
Note a few things:
Must make sure to drop these files into www.primegrid.com folder and then get the folder info; make sure boinc_master and boinc_project have full read+write access, then click the wheel-like icon and apply that to all contents - otherwise BOINC will not open!
As soon as you modify any of these files, they become owned by you and BOINC will not use them. I'm not quite sure if there is an easier workaround but this approach certainly works fine.
<flops> are set to 1.0e10 whereas on bigger, better GPUs, it might be 1.0e11. Since most Macs with CUDA cards have either a 240M or 330M, 9400M GT or 9600M GT, this needs to be lower!
<gpu_ram> is set (in bytes) to represent 128MB. Tested with a 256MB GPU and the interface lag was kind of extreme, sorry. Change this value to whatever half of your GPU's memory is, or play around with smaller numbers if you want to reduce graphical lag.
Also I made this on a 64-bit OS, so...make the appropriate changes with what app you use.
As far as testing goes, this is a first run. I wonder if anyone can get the Mac versions to run even faster/better!
____________
|
|
|
|
If I remember from the PPSieve/TPSieve testing, the Mac version should run just as fast as the equivalent hardware on Windows/Linux. While I'm sure it would be nice to have faster code on the Mac :) I'm not aware of any specific optimisation that we could do, since all the time is spent in CUDA, which shares the same kernel code with the Windows and Linux versions. Whether the quality of drivers available for a given platform makes any difference I'm not sure, but again there is not much we can do about it! |
|
|
|
The "slower CUDA apps on Mac" is just something people are likely to notice if they're used to seeing completion times of 2 minutes (GTX 460 + WinXP 32) or less...! To the best of my knowledge, the best CUDA-capable card currently in Macs is the GeForce GT 330M, which I have; takes about 30 minutes.
In fact...if I remember correctly...only Mac laptops have NViDIA whereas the desktop models have ATi chips in.
The code IS quite optimized. Interface lag is due to memory consumption, I know that. ;) I'm not using the computer all the time (in fact, barely ever!) so it's no problem.
I've heard rumors that Apple may be putting some kind of Quadro in the Mac Pro...I wish we could also code for OpenCL, since so many people have ATi cards (like, say, most Macs!). That'll come, I suppose.
____________
|
|
|
nenym
 Send message
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,254,478,021 RAC: 748,894
                           
|
"Siegfried Niklas" wrote: No problems with workfetch but with validation.
We may have the ongoing problem that all WU's with a CPU-time
equal or below 3.00 seconds are marked as invalid
The same here.
ID 128555 9600GT + C2D 2.33 GHz @ Ubuntu 64bit CPU time > 3s for all tasks - all tasks validated
ID 131846 9600GT + C2D 3.0 GHz @ Ubuntu 64bit CPU time < 3s for most tasks - all tasks with CPU time < 3s marked as invalid. I have tried all the recommendations (<flops>, <ncpus>, <count> in the <CUDA> part - 2 tasks concurrently) but with no success. I only got CPU time > 3s when I suspended and resumed the project. Any idea? |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
The same here.
ID 128555 9600GT + C2D 2.33 GHz @ Ubuntu 64bit CPU time > 3s for all tasks - all tasks validated
ID 131846 9600GT + C2D 3.0 GHz @ Ubuntu 64bit CPU time < 3s for most tasks - all tasks with CPU time < 3s marked as invalid. I have tried all the recommendations (<flops>, <ncpus>, <count> in the <CUDA> part - 2 tasks concurrently) but with no success. I only got CPU time > 3s when I suspended and resumed the project. Any idea?
Try to run more than 2 units in parallel:
<coproc>
<type>CUDA</type>
<count>0.25</count>
</coproc>
...and increase the values for avg_ncpus and max_ncpus:
<avg_ncpus>0.2</avg_ncpus>
<max_ncpus>0.2</max_ncpus>
The value for <avg_ncpus> and <max_ncpus> should never reach one full cpu core or you will lose one core for Boincing. If this value is equal to or greater than 1, one cpu core will be reserved by the Boinc client for the GPU work (that is, for feeding the GPU with work)...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
nenym
 Send message
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,254,478,021 RAC: 748,894
                           
|
The 9600GT does not have sufficient memory for more than 2 CUDA CWPS tasks. I had to lower the <gpu_ram> to run 2 tasks... GPU RAM load was 90% when I used GPU-Z to test 2 CUDA units in parallel. |
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Okay then try higher values for <avg_ncpus> and <max_ncpus>.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
nenym
 Send message
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,254,478,021 RAC: 748,894
                           
|
I have tried:
- <avg_ncpus> to 0.3 and <max_ncpus> to 1 (two GPU tasks and one CPU task),
- to increase the nice of the CUDA processes,
- <avg_ncpus> to 0.5 and <max_ncpus> to 1 (two GPU tasks and no CPU task),
with no success. |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Here's an odd idea. Try downgrading to the oldest drivers you can possibly use on that GPU. (What is that, 180.*?) If you get old enough drivers, the backup sleep-wait system might kick in. And it uses more CPU time.
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
I use version 195.36.31 with the Cuda23 SDK and it works, but I have no CWP-WUs listed at the moment. This version also works in DnetC and Collatz.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,659,101,651 RAC: 314,262
                     
|
Not sure if this is the right place for this but I'm running 4 @ a time (2 each GPU) of the CWPSieve CUDA Wu's on a Dual GTX 570 Box in about 2:30 Min's running at 850 Core Speed, doesn't seem to matter if the Memory is set to High or Low the Wu's still seem to run in the same amount of time .... http://www.primegrid.com/results.php?hostid=44845&offset=0&show_names=0&state=2&appid=
Thanks goes out to rroonnaalldd's new app posted yesterday ... :)
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
"nenym" wrote: I have tried:
- <avg_ncpus> to 0.3 and <max_ncpus> to 1 (two GPU tasks and one CPU task),
- to increase the nice of the CUDA processes,
- <avg_ncpus> to 0.5 and <max_ncpus> to 1 (two GPU tasks and no CPU task),
with no success.
Nenym take a look at following WU from Steve*:
CPU time........2.94
Validate state..valid
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
STE\/E Volunteer tester
 Send message
Joined: 10 Aug 05 Posts: 573 ID: 103 Credit: 3,659,101,651 RAC: 314,262
                     
|
A couple from another Box, a GTX 580, almost all the Wu's finish over the 3.00 CPU Time but the few that don't seem to Validate for me so far ...
http://www.primegrid.com/result.php?resultid=219626935
http://www.primegrid.com/result.php?resultid=219627591
____________
|
|
|
nenym
 Send message
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,254,478,021 RAC: 748,894
                           
|
Interesting.
I solved the problem the beginner's way: by switching this computer to WinXP 32 bit. Same CPU and GPU (drivers 266.35), including the OC; the run time is the same, and CPU time is about 4.5 - 6s.
The linux 64bit app seems to be too perfect, it should be coded more inefficiently :-). |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
The linux 64bit app seems to be too perfect, it should be coded more inefficiently :-).
Y'know, that's funny, because I run Ubuntu 9.04 64-bit, on a C2Q9400@3GHz, (edit, with the 256.53 drivers) and I can't get my CPU WU times under about 26 seconds! And I don't know why.
____________
|
|
|
|
Call it a shot in the dark, but I know Ubuntu has some feature that throttles your processor while "idle," and is somewhat notorious for viewing BOINC as "idle time." Or maybe that's just 10.04/10.10+.
The CPU time on my 5+ year-old Pentium 4 is always around 6 seconds for a 1m44s to 2m10s GTX 460 runtime, whereas the CPU time on my i5 540M is about 127s for a half-hour task using a GT 330M.
I guess it varies by OS + GPU combination - some don't require as much time to feed the GPU with work, whereas others do. I'd imagine it could have something to do with the fact I only allocated half of my GPU memory to CWPsieve on the 330M - 128MB of 256MB. I don't know. I'm stabbing at shadows here. :P
____________
|
|
|
nenym
 Send message
Joined: 23 Apr 09 Posts: 22 ID: 39029 Credit: 2,254,478,021 RAC: 748,894
       |