Author |
Message |
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
We've had to turn off the PPS-Sieve app. It's missing factors.
Ken's written a new version, but we need help building and testing it. More details later.
____________
My lucky number is 75898^524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
We're suffering from a lack of ATI hardware, and also a lack of available time.
If anyone is able to build the Windows versions of the new ATI app, it's located here:
https://github.com/Ken-g6/PSieve-CUDA/tree/redcl
We then need one or more people to test it.
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
A few more details.
I can and have built the Linux versions of the app. No problems there.
I was able to build the Windows version without BOINC. I was also able to build the BOINC libraries. But I wasn't able to compile them together. When I asked about the error, the response wasn't helpful considering the instructions still say to use a compiler not supported on the latest BOINC libraries. I may try this again next weekend if no one else comes up with anything.
And I have no chance of building anything for Mac.
Oh, and you want this new version built. It might not fix everything, but it will fail fast if it's missing a lot of factors. It knows when it's potentially missing factors and tries to recover with the CPU. Which means no more fatal Computation Errors unless there's a systemic problem. And I finally implemented the last optimization from the CUDA version. Which didn't help older AMD cards, as expected, but nearly doubled the speed of GCN cards!
____________
|
|
|
|
Is it simple to test? (I mean, does it run on BOINC, or do I actually need to build it myself from the source?)
Thanks for your answer,
Chris |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Well, the goal here is to compile the new app, with a compiler, linked against both the BOINC libraries and the AMD OpenCL libraries. I think one of the presets in the MS Visual C++ files is set to something like TPSieve BOINC. In theory if all your paths are set up right, that should compile the app correctly.
You can test the app, once it's compiled, by running it from the command line, without running it in BOINC. A simple test is in the first post here. If it doesn't produce a stderr.txt file, you didn't compile against the BOINC libraries.
____________
|
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
Joined: 1 Mar 14 Posts: 1033 ID: 301928 Credit: 543,624,271 RAC: 6,563
                         
|
I was able to build the Windows version without BOINC. I was also able to build the BOINC libraries. But I wasn't able to compile them together. When I asked about the error, the response wasn't helpful considering the instructions still say to use a compiler not supported on the latest BOINC libraries.
Looks like all you have to do is #include <windows.h> BEFORE boinc_api.h.
Or, even better, look inside boinc_api.h and find out why it does not #include <windows.h> although it references types defined there (HANDLE etc.). Maybe they require some crazy #define to declare the platform; maybe they just don't care.
|
|
|
|
I have two R9 290X cards if they would help with testing. |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Well, I don't know what I did differently this time, but I got a Windows BOINC binary built. Maybe I'd built the BOINC libraries as debug rather than release. Maybe I had a bad download of the AMD APP SDK. In any case, I now have Linux and Windows binaries, and all that's left is Mac. And testing.
I guess we should get started with testing now. Here's a link to a zipfile with both alpha non-BOINC and new BOINC versions for Windows and Linux. The alpha non-BOINC version has been previously tested by others, so it should work to make sure your setup is correct, and should give you an idea of what to expect when you run tpstest.bat (or .sh). You should basically get one factor for each line of the batch file, and no errors.
When you're ready to test the BOINC version(s), please rename each one so that it takes the place of the 32-bit non-BOINC binary. (There are two Linux binaries, but only one Windows binary.) Then run the batch or .sh file. You should get one factor for each line of the batch file. Then look at stderr.txt which will have been generated. You should see lots of instances vaguely like this:
shmget in attach_shmem: Invalid argument
19:17:33 (22971): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 5092299000000 <= p < 5092300000000
Thread 0 starting
Detected 112 multiprocessors (560 SPUs) on device 0.
Device 0 is a NVIDIA Corporation GeForce GTX 460.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 5092299000000 <= p < 5092300000000
count=34234,sum=0x026b580320430c66
Elapsed time: 0.52 sec. (0.01 init + 0.52 sieve) at 2020954 p/sec.
Processor time: 0.51 sec. (0.00 init + 0.51 sieve) at 2064126 p/sec.
Average processor utilization: 0.72 (init), 0.98 (sieve)
19:17:33 (22971): called boinc_finish
The point being that you should see no other errors.
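For anyone who would rather not read stderr.txt by eye, here is a rough Python sketch of that check. It is not part of tpsieve: the function name `unexpected_lines` and the list of "expected" line prefixes are assumptions taken only from the sample output above, so extend the list if your setup prints other harmless lines.

```python
import re

# Optional BOINC-style prefix such as "19:17:33 (22971): " (assumed format,
# inferred from the sample log; hours may be missing in some pastes).
PREFIX = re.compile(r"^\d{1,2}:\d{2}(?::\d{2})? \(\d+\): ")

# Line beginnings that appear in the sample stderr.txt and are treated as normal.
EXPECTED_STARTS = (
    "shmget in attach_shmem",
    "Can't set up shared mem",
    "Can't open init data file",
    "Sieve started:",
    "Sieve complete:",
    "Thread ",
    "Detected ",
    "Device ",
    "GCN device detected",
    "count=",
    "Elapsed time:",
    "Processor time:",
    "Average processor utilization:",
    "called boinc_finish",
)


def unexpected_lines(text):
    """Return the stderr lines that do not look like the normal output."""
    flagged = []
    for raw in text.splitlines():
        line = PREFIX.sub("", raw.strip())
        if line and not line.startswith(EXPECTED_STARTS):
            flagged.append(raw.strip())
    return flagged


# Typical use (stderr.txt is the file the BOINC build writes):
#   with open("stderr.txt") as f:
#       for line in unexpected_lines(f.read()):
#           print("check this:", line)
```

An empty result only means nothing unfamiliar was printed; it says nothing about whether all factors were found.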
Testing on ATI and AMD GPUs and APUs is the most important. Testing on Nvidia is possible, but since the CUDA code is faster we probably won't use it. Testing on OpenCL-capable Intel graphics would be...really interesting. I don't know what to expect there, but it would be nice if it worked.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,144,525,586 RAC: 2,292,407
                                      
|
I guess we should get started with testing now. Here's a link to a zipfile with both alpha non-BOINC and new BOINC versions for Windows and Linux.
Ken, this confused me.
The link points to executables that are 3+ years old.
I guess the correct one is https://sites.google.com/site/kenscode/prime-programs/tps-cl-alpha.zip?attredirects=0&d=1
____________
My stats |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
I guess the correct one is https://sites.google.com/site/kenscode/prime-programs/tps-cl-alpha.zip?attredirects=0&d=1
Yes, that's the right one.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,144,525,586 RAC: 2,292,407
                                      
|
You should get one factor for each line of the batch file. Then look at stderr.txt which will have been generated. You should see lots of instances vaguely like this:
I'm getting factors and no errors.
But I'm getting a lot of factors on my HD 7950: one factor for some ranges, 7 or even 15 factors for others.
tpfactors.txt has 451 factors/lines.
c:\_PG\TPSieve>tpsieve-cl-x86-windows.exe -p5097429000000 -P5097430000000 -k 5 -K 9999 -n 2M -N 3M -M 2
tpsieve version cl-0.2.5-alpha2 (testing)
nstart=2000000, nstep=29
tpsieve initialized: 5 <= k <= 9999, 2000000 <= n < 3000000
Sieve started: 5097429000000 <= p < 5097430000000
Thread 0 starting
Detected 448 multiprocessors (2240 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Tahiti.
GCN device detected; use -m1 --vecsize=4 to undo effect
Changed nstep to 28
CL setup complete.
cthread_count = 114688
5097429079643 | 3919*2^2048739-1
5097429116887 | 3989*2^2176328-1
5097429169793 | 3907*2^2064512+1
5097429208097 | 5691*2^2552740+1
5097429272119 | 4429*2^2565031-1
5097429517633 | 5823*2^2035150-1
5097429704791 | 6837*2^2087551+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 5097429000000 <= p < 5097430000000
Found 7 factors
count=34009,sum=0x0267e49e7236c87d
Elapsed time: 0.72 sec. (0.01 init + 0.71 sieve) at 1478909 p/sec.
Processor time: 0.56 sec. (0.03 init + 0.53 sieve) at 1976942 p/sec.
Average processor utilization: 3.12 (init), 0.75 (sieve)
c:\_PG\TPSieve>tpsieve-cl-x86-windows.exe -p35679989642e7 -P35679989643e7 -k5 -K9999 -n3000000 -N6000000 -T -M2 -c60
tpsieve version cl-0.2.5-alpha2 (testing)
nstart=3000000, nstep=46
tpsieve initialized: 5 <= k <= 9999, 3000000 <= n < 6000000
Sieve started: 356799896420000000 <= p < 356799896430000000
Thread 0 starting
Detected 448 multiprocessors (2240 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Tahiti.
GCN device detected; use -m1 --vecsize=4 to undo effect
nstep changed to 32
CL setup complete.
cthread_count = 114688
356799896420257891 | 6125*2^4654257+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 356799896420000000 <= p < 356799896430000000
Found 1 factor
count=247653,sum=0x24ec7a976776d2ff
Elapsed time: 5.05 sec. (2.26 init + 2.78 sieve) at 3674793 p/sec.
Processor time: 4.02 sec. (2.25 init + 1.78 sieve) at 5748736 p/sec.
Average processor utilization: 0.99 (init), 0.64 (sieve)
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Nuts, you're right. It's been a long time since I set up this test. I'm running the test now with the old CPU app and I'll have a presumably-correct tpfactors.txt to compare against soon.
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
The alpha non-BOINC version has been previously tested by others, so it should work to make sure your setup is correct, and should give you an idea of what to expect when you run tpstest.bat (or .sh). You should basically get one factor for each line of the batch file, and no errors.
I'm testing on an Nvidia GTX 580. I get multiple factors for each test, not one. Only for the very last line in the .bat file do I get one factor.
Output of the boinc and non-boinc executables is identical. No unusual error messages in the stderr.txt.
I'll send you a PM with the output (so as not to clog up this thread.)
|
|
|
I'm running an HD 5670, and I have the same experience as Michael - lots of factors, then one factor, and no errors.
GPU usage fluctuated between 0% and 22% until the last task when it hit 99%.
stderr.txt available if needed. |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Yes, you all seem to be getting the correct results; I should have designed a better test. Working on that now with Mike...
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
Here's a better test. Use these command line parameters:
-p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
This is an actual PPS-Sieve task. This is the correct output. The Mac ATI task missed the factors in red:
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
If you have an ATI card, try running the above command and see if you get all the factors.
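A quick way to do the comparison without checking 53 lines by eye is a small script. The following is a sketch in Python, not something shipped with tpsieve: the function names are made up for illustration, and it assumes the reference list above and your own tpfactors.txt (or console output) are available as plain text with one `p | k*2^n+1` or `p | k*2^n-1` entry per line.

```python
import re

# Matches lines such as "13120719059517079 | 2445*2^7778847-1".
FACTOR_LINE = re.compile(r"^(\d+)\s*\|\s*(\d+)\*2\^(\d+)([+-])1$")


def parse_factors(text):
    """Return the factor lines found in text, normalised, in their original order.

    Anything that is not a factor line (banner, "Found N factors", ...) is skipped.
    """
    factors = []
    for line in text.splitlines():
        match = FACTOR_LINE.match(line.strip())
        if match:
            p, k, n, sign = match.groups()
            factors.append(f"{p} | {k}*2^{n}{sign}1")
    return factors


def missed_factors(reference_text, found_text):
    """Return the reference factors that are absent from found_text."""
    found = set(parse_factors(found_text))
    return [f for f in parse_factors(reference_text) if f not in found]


# Typical use ("reference.txt" is a hypothetical file holding the list above):
#   reference = open("reference.txt").read()
#   found = open("tpfactors.txt").read()
#   for line in missed_factors(reference, found):
#       print("missed:", line)
```

Because non-factor lines are ignored, the whole console output can be pasted in as `found_text`.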
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
More information on doing this test:
So far we only have Windows and Linux builds, so let's start with those and see if the new builds find all the factors. If they do find them all, try running the test again with the old production app (which will ONLY work on ATI and won't run on Nvidia). You can download the production executables here:
Windows: http://www.primegrid.com/download/primegrid_tpsieve_1.39_windows_intelx86__atiPPSsieve.exe
Linux 32bit: http://www.primegrid.com/download/primegrid_tpsieve_1.39_i686-pc-linux-gnu__atiPPSsieve
Linux 64bit: http://www.primegrid.com/download/primegrid_tpsieve_1.39_x86_64-pc-linux-gnu__atiPPSsieve
|
|
|
This is the output of tpfactors.txt. It looks like I missed two of them:
C:\Sieve>tpsieve-cl-boinc-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5 (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 5120
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 51 factors
Missed:
13120719059517079 | 2445*2^7778847-1
13120723071996931 | 4241*2^7534559+1
Hope this helps. |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
That's different! Any error reports in stderr.txt?
Edit: You're not on a GCN card, right? What happens if you try adding --vecsize=1?
____________
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Bug reproduced! (With --vecsize=4 on Nvidia/GCN if anyone else is interested.) I'll look into it when I have more time.
For now, no need to run more tests on the new code, but running the same range on one of the old executables Michael listed could be helpful.
____________
|
|
|
|
Results from Windows, 280X
C:\Users\*\Downloads>ppssieve.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
-----------------------------------------------------------
C:\Users\*\Downloads\tps-cl-alpha>tpsieve-cl-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5-alpha2 (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Tahiti.
GCN device detected; use -m1 --vecsize=4 to undo effect
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Thread 0 completed
Waiting for threads to exit
Sieve complete: 13120716000000000 <= p < 13120725000000000
Found 51 factors
count=242515220,sum=0x2dca915341c02724
Elapsed time: 879.75 sec. (0.41 init + 879.34 sieve) at 10235206 p/sec.
Processor time: 1013.19 sec. (0.41 init + 1012.78 sieve) at 8886608 p/sec.
Average processor utilization: 0.99 (init), 1.15 (sieve)
-------------------------------------------------------------
2 factors missed. |
|
|
|
That's different! Any error reports in stderr.txt?
Edit: You're not on a GCN card, right? What happens if you try adding --vecsize=1?
Just to close the loop...
The card is not a GCN - it's an older HD 5670.
stderr.txt is as follows:
20:12:36 (6448): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 80 multiprocessors (400 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Redwood.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 7339.64 sec. (0.51 init + 7339.13 sieve) at 1226330 p/sec.
Processor time: 2534.88 sec. (0.53 init + 2534.35 sieve) at 3551288 p/sec.
Average processor utilization: 1.03 (init), 0.35 (sieve)
22:14:55 (6448): called boinc_finish
|
|
|
|
Using the binary Michael posted:
C:\Sieve>primegrid_tpsieve_1.39_windows_intelx86__atiPPSsieve.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 5120
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt:
07:51:24 (6380): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 80 multiprocessors (400 SPUs) on device 0.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 8156.42 sec. (0.51 init + 8155.91 sieve) at 1103518 p/sec.
Processor time: 2498.93 sec. (0.55 init + 2498.39 sieve) at 3602400 p/sec.
Average processor utilization: 1.07 (init), 0.31 (sieve)
10:07:20 (6380): called boinc_finish |
|
|
|
I have an ATI Mac, but not a DP one. No point in trying to test anything with that, right? |
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
PPS Sieve doesn't use floating point at all. Go for it!
____________
|
|
|
|
Windows 7 - AMD HD5830
Missing 2 factors for the new Windows BOINC alpha app on my tests as well.
Old app
C:\Users\NeoMetal>primegrid_tpsieve_1.39_windows_intelx86__atiPPSsieve.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 14336
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr output:
19:07:37 (9628): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 224 multiprocessors (1120 SPUs) on device 0.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 2692.25 sec. (0.46 init + 2691.79 sieve) at 3343575 p/sec.
Processor time: 863.67 sec. (0.48 init + 863.18 sieve) at 10426725 p/sec.
Average processor utilization: 1.05 (init), 0.32 (sieve)
19:52:29 (9628): called boinc_finish
New alpha app
C:\Users\NeoMetal>tpsieve-cl-boinc-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5 (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 14336
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 51 factors
stderr output:
15:31 (9816): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 224 multiprocessors (1120 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Cypress.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 2545.42 sec. (0.46 init + 2544.95 sieve) at 3536484 p/sec.
Processor time: 929.80 sec. (0.47 init + 929.33 sieve) at 9684609 p/sec.
Average processor utilization: 1.02 (init), 0.37 (sieve)
20:57:56 (9816): called boinc_finish
EDIT: Added stderr output
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
Interesting. I just ran the same test three times. The computer is a Windows 7 box with both an ATI and Nvidia GPU.
I ran the production ATI and CUDA apps, and they both found all 53 factors.
The new ATI app, however, missed two factors:
13120719059517079 | 2445*2^7778847-1
13120723071996931 | 4241*2^7534559+1
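One way to confirm that the two missed entries are genuine factors, independent of any GPU app, is to check the divisibility directly with modular exponentiation. The sketch below is not part of tpsieve; the names `divides` and `check_line` are made up for illustration, and it assumes each line has the `p | k*2^n+1` or `p | k*2^n-1` form shown in the output above.

```python
import re

# Hypothetical helper, not taken from tpsieve: matches "p | k*2^n+1" or "p | k*2^n-1".
FACTOR_LINE = re.compile(r"^(\d+) \| (\d+)\*2\^(\d+)([+-])1$")


def divides(p: int, k: int, n: int, c: int) -> bool:
    """Return True if p divides k*2^n + c, where c is +1 or -1."""
    # pow(2, n, p) reduces modulo p at every step, so the
    # multi-million-bit number k*2^n + c is never built in full.
    return (k * pow(2, n, p) + c) % p == 0


def check_line(line: str) -> bool:
    """Parse one reported factor line and verify it."""
    m = FACTOR_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a factor line: {line!r}")
    p, k, n = int(m.group(1)), int(m.group(2)), int(m.group(3))
    c = 1 if m.group(4) == "+" else -1
    return divides(p, k, n, c)
```

Calling `check_line("13120719059517079 | 2445*2^7778847-1")` should return True if the production apps are right about that factor.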
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Yes, there's clearly a bug. I won't get around to fixing it until at least the weekend. |
|
|
|
I have built a Mac binary using the new code. It is uploaded (along with the latest 'production' version) to http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
My machine has an Integrated Intel Iris Pro GPU, which only manages ~600,000p/sec so will take some hours to complete the test. In the meantime, if anyone has a discrete GPU (ATI or NVIDIA) and would like to test, please download those binaries and have a go.
I will post results when I get them (hopefully tomorrow!).
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
Roger Volunteer developer Volunteer tester
 Send message
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
                    
|
Windows 7 - AMD HD7970
>primegrid_tpsieve_1.39_windows_intelx86__atiPPSsieve.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
I have built a Mac binary using the new code. It is uploaded (along with the latest 'production' version) to http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
My machine has an Integrated Intel Iris Pro GPU, which only manages ~600,000p/sec so will take some hours to complete the test. In the meantime, if anyone has a discrete GPU (ATI or NVIDIA) and would like to test, please download those binaries and have a go.
I will post results when I get them (hopefully tomorrow!).
Cheers
- Iain
It would also be interesting to see if you miss the 6th factor (or any factors) using the older production app.
____________
My lucky number is 75898524288+1 |
|
|
|
I have built a Mac binary using the new code. It is uploaded (along with the latest 'production' version) to http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
My machine has an Integrated Intel Iris Pro GPU, which only manages ~600,000p/sec so will take some hours to complete the test. In the meantime, if anyone has a discrete GPU (ATI or NVIDIA) and would like to test, please download those binaries and have a go.
I will post results when I get them (hopefully tomorrow!).
Cheers
- Iain
It would also be interesting to see if you miss the 6th factor (or any factors) using the older production app.
I'm trying the production version on a 2008 MacBook Pro with the 512Mb GeForce 8600M GT. I did find the 6th factor (5089*2^8180582+1); it will probably be Sunday before the rest finish.
|
|
|
|
OK, some results below. The old production app found all the expected factors (up to the point where I killed it halfway through, as I needed to use the machine this morning). I will re-run this on my newer Iris Pro to confirm. The new app is missing 2 factors, as per previous posts.
Cheers
- Iain
Using the production app on an NVIDIA GeForce 9600M GT:
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 4096
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
^CFound 25 factors
20:23:02 (4153): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 64 multiprocessors (320 SPUs) on device 0.
Device 0 is a NVIDIA GeForce 9600M GT.
Thread 0 interrupted
Sieve incomplete: 13120716000000000 <= p < 13120720793303041
count=129161437,sum=0x5fe9be81dd4a0e9d
Elapsed time: 45286.56 sec. (0.29 init + 45286.27 sieve) at 105845 p/sec.
Processor time: 12337.81 sec. (0.28 init + 12337.52 sieve) at 388514 p/sec.
Average processor utilization: 0.98 (init), 0.27 (sieve)
Using the new code with an Intel Iris Pro HD:
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5 (testing)
Compiled Jun 19 2014 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 286720
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 51 factors
mbpib2:testdir ibethune$ cat stderr.txt
23:04:27 (69871): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 4480 multiprocessors (22400 SPUs) on device 0.
Device 0 is a Intel Iris Pro.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 9508.32 sec. (0.17 init + 9508.15 sieve) at 946576 p/sec.
Processor time: 266.36 sec. (0.18 init + 266.18 sieve) at 33811974 p/sec.
Average processor utilization: 1.04 (init), 0.03 (sieve)
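The "Average processor utilization" figure looks consistent with processor time divided by elapsed time, which is a quick way to see how little CPU the sieve phase uses on this GPU. The sketch below is only an inference from the numbers printed above, not taken from the tpsieve source; the function name `utilization` and the regular expression are assumptions. Because the printed times are rounded, the init ratio computed this way need not match the logged value exactly.

```python
import re

# Assumed layout, copied from the stderr lines above:
# "Elapsed time: 9508.32 sec. (0.17 init + 9508.15 sieve) at 946576 p/sec."
TIME_LINE = re.compile(
    r"^(Elapsed|Processor) time: [\d.]+ sec\. "
    r"\(([\d.]+) init \+ ([\d.]+) sieve\)"
)


def utilization(stderr_text: str):
    """Return (init_ratio, sieve_ratio) = processor time / elapsed time."""
    times = {}
    for line in stderr_text.splitlines():
        m = TIME_LINE.match(line.strip())
        if m:
            # If the file holds several runs, the last one wins.
            times[m.group(1)] = (float(m.group(2)), float(m.group(3)))
    e_init, e_sieve = times["Elapsed"]
    p_init, p_sieve = times["Processor"]
    return p_init / e_init, p_sieve / e_sieve
```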
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
I'm trying the production version on a 2008 MacBook Pro with the 512Mb GeForce 8600M GT. I did find the 6th factor (5089*2^8180582+1); it will probably be Sunday before the rest finish.
Quicker than expected. Ran to completion and found all factors.
mbp15:tpsieve_cl_mac rob$ ./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 4096
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
Next up, the new version. |
|
|
|
I'm unable to run the new version - it keeps grumping about missing one of the libraries:
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
dyld: Library not loaded: @rpath/libcudart.6.0.dylib
Referenced from: /Users/rob/Downloads/tpsieve_cl_mac2/./tpsieve-cl-boinc-x86_64-mac
Reason: image not found
Trace/BPT trap: 5
I've tried copying the "missing" file into the folder, adding it to the path, installing Xcode, rebooting in between everything...
Any suggestions? |
|
|
|
I just acquired a new Mac Pro :-)
Every OpenCL PPS WU result it computed is invalid :-(
http://www.primegrid.com/show_host_detail.php?hostid=434562 |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
I know it doesn't sound like it, but this is great news! You can reproduce the problem, so you can help us fix it. Could you please run these programs on it:
http://www.primegrid.com/download/primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC
http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
with these arguments:
-p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
And let us know the results? Thanks!
____________
|
|
|
|
The old production app found all the expected factors (up to the point where I killed it halfway through as I needed to use the machine this morning). I will re-run this on my newer Iris Pro to confirm.
As expected, the old code finds all 53 factors when running on the Iris Pro:
ibethune$ ./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 286720
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
16:50:32 (1747): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 4480 multiprocessors (22400 SPUs) on device 0.
Device 0 is a Intel Iris Pro.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 10779.42 sec. (0.43 init + 10778.99 sieve) at 834975 p/sec.
Processor time: 207.08 sec. (0.46 init + 206.62 sieve) at 43558885 p/sec.
Average processor utilization: 1.06 (init), 0.02 (sieve)
19:50:12 (1747): called boinc_finish
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
I know it doesn't sound like it, but this is great news! You can reproduce the problem, so you can help us fix it. Could you please run these programs on it:
http://www.primegrid.com/download/primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC
http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
with these arguments:
-p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
And let us know the results? Thanks!
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724738634179 | 1485*2^6690363+1
Found 44 factors
|
|
|
|
So what's the latest? Any more testing needed? I have Macs with NVIDIA, AMD, and Intel GPUs.
____________
Reno, NV
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Well, Chipotle ran one app but not the other. I'd need to see both to know if there's a significant difference.
I have a newer version that fixes one missing factor problem. Whether it would fix the other I really don't know. I also don't know if it would behave the same as the newer broken app, but it would be instructive to find out in any case.
Oh, and ppsieve (the +1-only version, as opposed to tpsieve) is apparently totally broken at this point. I don't know yet if I'll fix it up or not.
So, I guess, if you've got a Mac, go ahead and try both these programs:
http://www.primegrid.com/download/primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC
http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac.zip
With these arguments:
-p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
If the first one finds anything other than 44 factors (starting with a missing 13120717075819579 | 5089*2^8180582+1), then your Mac isn't helpful to me.
If the first one finds only 44 factors, let me know what the second one does, and check its stderr.txt as well.
If the second one finds fewer than 52 factors, I may be stuck.
Thanks for testing!
____________
|
|
|
|
OK, so I tried again, this time understanding your instructions better. But why use a CUDA library for ATI (see below)?
Output (from 2 runs):
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
dyld: Library not loaded: /usr/local/cuda/lib/libcuda.dylib
Referenced from: /Users/ran/Desktop/test/./tpsieve-cl-boinc-x86_64-mac
Reason: image not found
Trace/BPT trap |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Nuts, need to get CUDA out of the OpenCL compiling. I had a heck of a time doing that for Linux, and I guess Mac isn't set right yet. I'll have to get with Iain and get back to you.
____________
|
|
|
|
Thanks for the quick reply!
BTW, I just ran the first program again, this time getting the expected 44 results. Unfortunately there may be more than one bug, since last time, after substantially fewer than 44 results, the program exited early - it didn't announce the number of factors :-( |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Was there anything in stderr.txt when it exited early?
____________
|
|
|
|
Was there anything in stderr.txt when it exited early?
Sorry - I'm a poor tester. The later run clobbered the stderr.txt of the first. I'll attempt to recreate it. In the meantime...
I am seeing inconsistent results (here are two runs, note that the lists are only mostly alike):
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723983397483 | 5737*2^6363638+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 44 factors
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 45 factors
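To see exactly where two runs like these disagree, the factor lines can be compared as sets. This is a minimal sketch, assuming each run's stdout was saved as text; `factor_lines` and `compare_runs` are hypothetical names, not part of tpsieve.

```python
import re

# Matches lines such as "13120716380747377 | 8859*2^7847166+1".
FACTOR_LINE = re.compile(r"^\d+ \| \d+\*2\^\d+[+-]1$")


def factor_lines(stdout_text: str) -> set:
    """Collect the factor lines from one captured stdout."""
    return {
        line.strip()
        for line in stdout_text.splitlines()
        if FACTOR_LINE.match(line.strip())
    }


def compare_runs(run_a: str, run_b: str):
    """Return (only_in_a, only_in_b), each sorted by the prime p."""
    a, b = factor_lines(run_a), factor_lines(run_b)

    def by_p(line: str) -> int:
        return int(line.split()[0])

    return sorted(a - b, key=by_p), sorted(b - a, key=by_p)
```

If both lists come back empty, the two runs reported the same factors; anything else pinpoints the lines that only one run found.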
|
|
|
|
OK, here's a truncated run + stderr (hope this is helpful!):
./primegrid_tpsieve_1.39_x86_64-apple-darwin__atiPPSsieveMAC -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
Compiled Sep 12 2011 with GCC 4.2.1 (Apple Inc. build 5666) (dot 3)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
cat stderr.txt
18:19:04 (35737): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 2215.71 sec. (0.30 init + 2215.40 sieve) at 4062550 p/sec.
Processor time: 216.81 sec. (0.31 init + 216.51 sieve) at 41569993 p/sec.
Average processor utilization: 1.01 (init), 0.10 (sieve)
18:56:00 (35737): called boinc_finish
shmget in attach_shmem: Invalid argument
19:03:04 (36289): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 2214.91 sec. (0.31 init + 2214.60 sieve) at 4064030 p/sec.
Processor time: 217.21 sec. (0.31 init + 216.90 sieve) at 41494909 p/sec.
Average processor utilization: 1.01 (init), 0.10 (sieve)
19:39:59 (36289): called boinc_finish
shmget in attach_shmem: Invalid argument
19:48:23 (36879): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
Computation Error: no candidates found for p=13120721136041257 between 8629631 and 9000000.
20:09:26 (36879): called boinc_finish
|
|
|
Ken_g6 Volunteer developer
|
So, that run errored out with a Computation Error? That's actually very helpful. Thank you! It means my changes are likely to fix the app, or possibly cause it to error out every time, which is almost as useful.
Now, we're all waiting to see if Iain can remove that CUDA dependency and make a new binary.
____________
|
|
|
|
Hello,
approximately when will PPS sieve tasks for ATI be available?
Thanks, Martin |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Hello,
approximately when will PPS sieve tasks for ATI be available?
Thanks, Martin
I can't give you a definitive answer. As soon as the application is corrected and tested it will be turned on.
____________
My lucky number is 75898524288+1 |
|
|
|
Now, we're all waiting to see if Iain can remove that CUDA dependency and make a new binary.
New binaries (version cl-0.2.5a) are now posted at http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac2.zip. The bogus CUDA dependency is now gone.
If you were able to recreate the missing factors, please re-run with this version and let us know!
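One way to compare a re-run against an earlier run is to diff the factor lines of the two stdout captures. The following is a minimal Python sketch, not part of tpsieve; the names `factor_set` and `compare_runs` are made up for illustration, and it assumes factor lines have the `p | k*2^n+1` or `p | k*2^n-1` form seen in the output posted in this thread.

```python
import re

# Matches lines such as "13120716380747377 | 8859*2^7847166+1".
# Anything else (banner lines, "Found 53 factors", etc.) is ignored.
FACTOR_LINE = re.compile(r"^\s*(\d+)\s*\|\s*(\d+\*2\^\d+[+-]1)\s*$")


def factor_set(text):
    """Collect (p, candidate) pairs from a stdout capture."""
    found = set()
    for line in text.splitlines():
        m = FACTOR_LINE.match(line)
        if m:
            found.add((int(m.group(1)), m.group(2)))
    return found


def compare_runs(reference_text, candidate_text):
    """Return (missing, extra): factors only in the reference run,
    and factors only in the candidate run, each sorted by p."""
    ref = factor_set(reference_text)
    cand = factor_set(candidate_text)
    return sorted(ref - cand), sorted(cand - ref)
```

Passing a known-good run as the reference and the run under test as the candidate returns the factors the candidate missed, plus any it reported in addition.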
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a beta (testing)
Compiled Jul 6 2014 with GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 16384
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt:
10:51:53 (331): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 64 multiprocessors (320 SPUs) on device 0.
Device 0 is a NVIDIA GeForce 8600M GT.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 57563.88 sec. (0.27 init + 57563.61 sieve) at 156352 p/sec.
Processor time: 5139.82 sec. (0.26 init + 5139.55 sieve) at 1751162 p/sec.
Average processor utilization: 0.97 (init), 0.09 (sieve)
02:51:17 (331): called boinc_finish |
|
|
|
./tpsieve-cl-boinc-x86-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a beta (testing)
Compiled Jul 6 2014 with GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 16384
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt
07:46:30 (1233): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 64 multiprocessors (320 SPUs) on device 0.
Device 0 is a NVIDIA GeForce 8600M GT.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 57581.96 sec. (0.34 init + 57581.62 sieve) at 156303 p/sec.
Processor time: 5335.63 sec. (0.35 init + 5335.28 sieve) at 1686920 p/sec.
Average processor utilization: 1.02 (init), 0.09 (sieve)
23:46:12 (1233): called boinc_finish
|
|
|
|
Things look good :-)
Output on mac pro:
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a beta (testing)
Compiled Jul 6 2014 with GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt
21:20:04 (86461): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 1261.14 sec. (0.29 init + 1260.84 sieve) at 7138225 p/sec.
Processor time: 271.47 sec. (0.30 init + 271.17 sieve) at 33190116 p/sec.
Average processor utilization: 1.01 (init), 0.22 (sieve)
21:41:06 (86461): called boinc_finish |
|
|
|
...and the second executable's output:
./tpsieve-cl-boinc-x86-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a beta (testing)
Compiled Jul 6 2014 with GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt
21:46:03 (89088): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 1264.51 sec. (0.36 init + 1264.14 sieve) at 7119593 p/sec.
Processor time: 414.81 sec. (0.37 init + 414.44 sieve) at 21716433 p/sec.
Average processor utilization: 1.02 (init), 0.33 (sieve)
22:07:08 (89088): called boinc_finish |
|
|
|
Okay, I know this will be seen as a petty complaint, but isn't there a better place to post factor files for verification than on the open forums? I just don't like scrolling :-) This isn't specific to this thread, but the cry "we're having problems, please post factors" is very annoying to me. And I am not ashamed. The numbers posted are of value (in a "how do I fix it" sense) to maybe 5-10 people, not the whole PG community. Keep them to an FTP upload site or something similar. Not here.
--Gary |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Okay, I know this will be seen as a petty complaint, but isn't there a better place to post factor files for verification than on the open forums? I just don't like scrolling :-) This isn't specific to this thread, but the cry "we're having problems, please post factors" is very annoying to me. And I am not ashamed. The numbers posted are of value (in a "how do I fix it" sense) to maybe 5-10 people, not the whole PG community. Keep them to an FTP upload site or something similar. Not here.
--Gary
This is not to help the individual users; it's to help the developer fix the app. I can't think of anything that's more valuable to the entire community than that. Also, while Ken (the developer) is the one fixing the app, it's other people (Iain, Jim, and myself) who need to eventually sign off on the app as working correctly and being production-ready, so we need to see this information too.
Furthermore, for the people helping with the testing, it's of great help to see the results from other people so they have a good idea of what they should be looking for.
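When reading someone else's output, note that each reported line has the form `p | k*2^n+1` or `p | k*2^n-1`, meaning p divides that number, and this can be checked without the sieve. The sketch below is an illustration only, assuming that line format; `parse_factor_line` and `divides` are hypothetical helper names, and the check relies on Python's built-in three-argument `pow` for modular exponentiation.

```python
import re

# Matches lines such as "13120716307705079 | 2959*2^8094354+1".
FACTOR_LINE = re.compile(r"^\s*(\d+)\s*\|\s*(\d+)\*2\^(\d+)([+-])1\s*$")


def parse_factor_line(line):
    """Return (p, k, n, c) with c = +1 or -1, or None for non-factor lines."""
    m = FACTOR_LINE.match(line)
    if m is None:
        return None
    p, k, n = int(m.group(1)), int(m.group(2)), int(m.group(3))
    c = 1 if m.group(4) == "+" else -1
    return p, k, n, c


def divides(p, k, n, c):
    """True if p divides k*2^n + c.

    pow(2, n, p) reduces 2^n modulo p directly, so the multi-million-bit
    number k*2^n + c is never built in full.
    """
    return (k * pow(2, n, p) + c) % p == 0


if __name__ == "__main__":
    # Small hand-checkable case: 3*2^2+1 = 13, so 13 is a factor.
    print(divides(*parse_factor_line("13 | 3*2^2+1")))
```

A line that fails this check would indicate a bogus factor; a line that is absent from one run but present in another indicates a missed factor, which this check alone cannot detect.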
If you disagree with this, that's fine, but if you want to follow up, please do it privately with me. I don't want the testing thread cluttered with a discussion about what is or is not appropriate.
Thank you.
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
Alright, V0.2.5a, a.k.a. Release Candidate 2, is ready for testing on Windows and Linux. Same URL just because I don't want to create another file. The only difference from the beta, which was on Mac, is that FirePro cards are better detected. Chipotle's card should have been detected as GCN, but it wasn't.
I should probably get Iain to compile a Mac version too so Chipotle can test it. But the beta he compiled would also work - just a little slower in a few cases.
____________
|
|
|
|
I should probably get Iain to compile a Mac version too so Chipotle can test it. But the beta he compiled would also work - just a little slower in a few cases.
Happy to test it for you :-) |
|
|
|
I have posted a build of the latest code (0.2.5a) at http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac3.zip for testing.
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
I have posted a build of the latest code (0.2.5a) at http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac3.zip for testing.
Nice, it's a bit faster :-)
./tpsieve-cl-boinc-x86_64-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a (testing)
Compiled Jul 24 2014 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt
21:09:37 (18139): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
FirePro D series detected.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 1051.88 sec. (0.28 init + 1051.59 sieve) at 8558644 p/sec.
Processor time: 269.72 sec. (0.29 init + 269.43 sieve) at 33404316 p/sec.
Average processor utilization: 1.01 (init), 0.26 (sieve)
21:27:09 (18139): called boinc_finish
...and 32-bit...
./tpsieve-cl-boinc-x86-mac -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a (testing)
Compiled Jul 24 2014 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt
21:38:23 (18456): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a AMD ATI Radeon HD - FirePro D700 Compute Engine.
FirePro D series detected.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 1052.23 sec. (0.38 init + 1051.85 sieve) at 8556556 p/sec.
Processor time: 436.82 sec. (0.39 init + 436.43 sieve) at 20622524 p/sec.
Average processor utilization: 1.02 (init), 0.41 (sieve)
21:55:55 (18456): called boinc_finish |
|
|
|
Win7-x64 box, Dual Xeon X5570, 48 GB DDR3 ECC, Dual R9-280X cards.
C:\Users\Dutch\Downloads\tps-cl-alpha>tpsieve-cl-boinc-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5a (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt :
23:46:40 (3872): Can't set up shared mem: -1. Will run in standalone mode.
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a Advanced Micro Devices, Inc. Tahiti.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 707.11 sec. (0.47 init + 706.64 sieve) at 12736609 p/sec.
Processor time: 1037.47 sec. (0.47 init + 1037.00 sieve) at 8679056 p/sec.
Average processor utilization: 1.00 (init), 1.47 (sieve)
23:58:27 (3872): called boinc_finish
|
|
|
|
And just for good measure, the same test with the current BOINC executable:
C:\Users\Dutch\Downloads\tps-cl-alpha>primegrid_tpsieve_1.39_windows_intelx86__atiPPSsieve.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.3e (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 32768
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
Stderr.txt :
04:53:01 (1992): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 1262.26 sec. (0.45 init + 1261.81 sieve) at 7132738 p/sec.
Processor time: 354.23 sec. (0.47 init + 353.76 sieve) at 25441265 p/sec.
Average processor utilization: 1.05 (init), 0.28 (sieve)
05:14:03 (1992): called boinc_finish |
|
|
|
I know it's summer, and holidays etc. have an effect, but is there any ETA or news on a possible reinstatement of the ATI CL app for PPS sieving?
|
|
Michael Goetz Volunteer moderator Project administrator
|
I know it's summer, and holidays etc. have an effect, but is there any ETA or news on a possible reinstatement of the ATI CL app for PPS sieving?
If you've read this thread (which is about finding and fixing the problem), you have as much information as I do.
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
Well, in this thread we can all see that I have a new version, and it seems to be working, particularly on Mac but also on Windows. What I think we're all wondering is when you will consider it tested enough to put in production?
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
Well, in this thread we can all see that I have a new version, and it seems to be working, particularly on Mac but also on Windows. What I think we're all wondering is when you will consider it tested enough to put in production?
Sorry, I wasn't aware you were ready! I haven't been following this thread very closely.
Where are the current executables?
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
Alright, V0.2.5a, a.k.a. Release Candidate 2, is ready for testing on Windows and Linux. Same URL just because I don't want to create another file.
I have posted a build of the latest code (0.2.5a) at http://www2.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac3.zip for testing.
Cheers
- Iain
I'm a little concerned that DutchDK may have been CPU-limited in his test. (He has lots of cores, but only one can be used for the CPU part of the sieve.) But that's something I can work on optimizing later; it shouldn't prevent deployment of the current apps.
Ken
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
Ken,
This version of the app runs on any OCL device; the old version would only run on an ATI device.
While that's great for testing, is there any reason to have the production version run on both ATI and Nvidia? My concern is that if a host computer has both an ATI and Nvidia GPU (rare, but it does happen), the app will run on the wrong GPU.
What do you think about making it select only ATI GPUs unless it gets a command line option (say, -any) which would let us select any OCL device?
____________
My lucky number is 75898524288+1 |
|
|
|
I'm a little concerned that DutchDK may have been CPU-limited in his test. (He has lots of cores, but only one can be used for the CPU part of the sieve.) But that's something I can work on optimizing later; it shouldn't prevent deployment of the current apps.
Ken
Not really a big issue, Ken. In Boinc, I set up one logical core for each GPU task. I tend to run 2 PPS sieve tasks on each R9-280X GPU, so two physical cores/4 logical cores are dedicated to the GPU tasks, which is fine.
What I am more curious about, is this in the new app :
cthread_count = 131072
compared to this in the current app :
cthread_count = 32768
Is this the cause of the speedup in the new app ?
And also the stderr time reports look weird,
new:
Elapsed time: 707.11 sec. (0.47 init + 706.64 sieve) at 12736609 p/sec.
Processor time: 1037.47 sec. (0.47 init + 1037.00 sieve) at 8679056 p/sec.
Average processor utilization: 1.00 (init), 1.47 (sieve)
versus old:
Elapsed time: 1262.26 sec. (0.45 init + 1261.81 sieve) at 7132738 p/sec.
Processor time: 354.23 sec. (0.47 init + 353.76 sieve) at 25441265 p/sec.
Average processor utilization: 1.05 (init), 0.28 (sieve)
It's almost like processor and elapsed time reports have been swapped in the new vs old app. |
|
|
Ken_g6 Volunteer developer
|
I tend to run 2 PPS sieve tasks on each R9-280X GPU, so two physical cores/4 logical cores are dedicated to the GPU tasks, which is fine.
The goal is to not require that anymore, so that people not using app_info or whatever can run as fast as you can. But if a 3GHz Nehalem can't keep up with a 280X, that goal might not be met.
What I am more curious about, is [more cthreads.] Is this the cause of the speedup in the new app ?
That's not the whole cause of the speedup, but it is part of it. The other part is that I implemented the algorithm I've been using for Nvidia, which is apparently faster for GCN, though not for older cards. In prior testing, the higher cthread count just turned out to be optimal for GCN cards.
And also the stderr time reports look weird. [snip] It's almost like processor and elapsed time reports have been swapped in the new vs old app.
No, that looks mostly right to me, though how it's exceeding one core usage I'm not sure. (I assume it's running at 100% on the main thread and the rest is driver overhead on another thread.)
The first thing you need to realize is that I don't trust AMD's OpenCL to compute the tricky initialization of each step right. Nvidia does it on the GPU, but AMD had failures early on. An alpha GCC also had issues with that, so I gather it's tricky for compilers to get it right. Anyway, this means doing a significant amount of work on the CPU.
Then, with the app making errors, I implemented an error-check for each step. But this takes about the same time as the initialization, so that nearly doubles the CPU usage per step.
Now add in that the GPU is doing steps faster, and you get huge CPU use. So now I want to make the CPU work more efficiently.
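For anyone puzzling over the same stderr lines, here is a rough sketch of how the numbers relate. It assumes the "Average processor utilization" figure is simply processor time divided by elapsed time, which the sieve figures posted above are consistent with; the function name is made up for illustration.

```python
def utilization(processor_sec, elapsed_sec):
    """CPU-seconds consumed per wall-clock second (assumed definition)."""
    return processor_sec / elapsed_sec


# Sieve-phase figures copied from the two stderr reports quoted in this thread.
new_app = {"elapsed": 706.64, "processor": 1037.00}
old_app = {"elapsed": 1261.81, "processor": 353.76}

new_util = utilization(new_app["processor"], new_app["elapsed"])
old_util = utilization(old_app["processor"], old_app["elapsed"])

# Total elapsed time of the old run divided by that of the new run.
speedup = 1262.26 / 707.11

print(f"new: {new_util:.2f}  old: {old_util:.2f}  speedup: {speedup:.2f}x")
```

Read this way, the reports are not swapped: the new app finishes sooner in wall-clock terms while using more CPU time, because the CPU now does both the initialization and the error check for every step.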
I just found an intrinsic for MSVC++ called __emulu that might help significantly. I'm planning to try it in the next version. Based on Chipotle's results from the Mac version, which I assume was compiled with GCC and thus used a similar instruction, I'm hopeful. But if it doesn't help, then running two threads at once may remain necessary. Or I might have to work harder on optimization, or risk doing tougher math on the GPU or something.
Oh, and @Mike, aborting on Nvidia looks fairly easy. I'll test it more this weekend.
____________
|
|
|
|
I would like to run PPS-Sieve and the TRP-Sieve tasks. But when these are selected the TRP-Sieve preference is somehow automatically replaced by ESP-Sieve tasks. Is this correctable?
Correction: I get PPS (Sieve) 1.39 (cpuPPSieve tasks that are not OpenCL) and TRP-Sieve tasks. Is the OpenCL just not in the task description anymore for Macs? |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Correction: I get PPS (Sieve) 1.39 (cpuPPSieve tasks that are not OpenCL) and TRP-Sieve tasks. Is the OpenCL just not in the task description anymore for Macs?
As per the title of this thread, the ATI (OpenCL) app is currently turned off, so you won't get any ATI PPS Sieve tasks.
You may have changed it, but at this moment you also have "Use ATI GPU" unchecked in your preferences, and also don't have the ATI PPS-Sieve box checked. Both of those must be checked to get ATI PPS-Sieve tasks (once they're re-enabled on the server).
____________
My lucky number is 75898524288+1 |
|
|
|
The first thing you need to realize is that I don't trust AMD's OpenCL to compute the tricky initialization of each step right. Nvidia does it on the GPU, but AMD had failures early on. An alpha GCC also had issues with that, so I gather it's tricky for compilers to get it right. Anyway, this means doing a significant amount of work on the CPU.
Out of pure interest, does the current app do the init on the GPU? I ask because the results from both the current and the new app were the same on the Windows run.
Then, with the app making errors, I implemented an error-check for each step. But this takes about the same time as the initialization, so that nearly doubles the CPU usage per step.
Is it possible to have the error check enabled only when running with a -debug command line parameter? Or is it inlined with the rest of the code for the steps it's checking?
If you can make the error check optional, that would then nearly halve the CPU usage. |
|
|
|
FYI:
Alright, V0.2.5a, a.k.a. Release Candidate 2, is ready for testing on Windows and Linux. Same URL just because I don't want to create another file.
Just downloaded this to test on a LinuxMint17 system with an AMD Radeon HD6670 card (it's a 32-bit one?).
Wed 06 Aug 2014 12:32:59 BST | | CAL: ATI GPU 0: AMD Radeon HD 6570/6670/7570/7670 series (Turks) (CAL version 1.4.1848, 1024MB, 999MB available, 1536 GFLOPS peak)
Wed 06 Aug 2014 12:32:59 BST | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6570/6670/7570/7670 series (Turks) (driver version 1214.3, device version OpenCL 1.2 AMD-APP (1214.3), 1024MB, 999MB available, 1536 GFLOPS peak)
Wed 06 Aug 2014 12:32:59 BST | | OpenCL CPU: Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1214.3 (sse2,avx), device version OpenCL 1.2 AMD-APP (1214.3))
The x86 (32-bit) executable in the zip file is statically linked against the BOINC libs.
That passes the test OK.
However, the 64-bit one isn't statically-linked, and it's expecting to find libboinc_api.so.6 and libboinc.so.6.
But the libs in the latest/current release are v7 (I'm on 7.2.42).
So I've linked libboinc_api.so.6 to libboinc_api.so.7, and created a libboinc.so.6 from my libboinc.a (as there is no dynamic version built by default) and run the 64-bit one using that set-up.
That then passes the test OK as well. |
|
|
Ken_g6 Volunteer developer
|
V0.2.5b is out. Same file. I was going to post a new version last week, but I found that while I could compile the 64-bit app on my actual machine, I couldn't compile the exact same code on my VM. It turned out the Makefile had some serious issues. "Make clean" wasn't working, so I had eventually lost the code to make one object file. It's now patched up and I think it should be working.
@Dutchdk, I'm not making the error check optional. It allows nice things like avoiding all computation errors that aren't bugs or misconfigurations. An overclocked GPU probably won't error out - its errors will just get fixed. I thought about it and did come to the conclusion that I could move the error checks to the GPU. And also that if one error check worked on the GPU I could move the initialization there as well. But that requires a level of work that would deserve another version number (0.2.6). I'm not up for that right now. Hopefully the little coding tweak I did for Windows will help. And hopefully it didn't break anything.
Edit: Oh, and I don't think it's possible to compile with static linking when doing OpenCL code. I know there's several other people around here who compile OpenCL code, so please correct me if I'm wrong. :)
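To illustrate the check-and-recover idea described above, here is a toy sketch. Everything in it is hypothetical (the function names, the stand-in checksum, the data layout); the only detail taken from the app is the 1-in-8 limit that appears in its abort message. It is not the app's actual code.

```python
def verify_batch(primes, gpu_sums, cpu_checksum, max_bad_fraction=1 / 8):
    """Compare each GPU checksum with a CPU recomputation.

    Returns the list of p values whose GPU result disagreed, so the caller
    can redo them on the CPU. Raises if too many disagree, since that
    suggests a systemic problem rather than an occasional glitch.
    """
    bad = [p for p, gpu in zip(primes, gpu_sums) if cpu_checksum(p) != gpu]
    if len(bad) > len(primes) * max_bad_fraction:
        raise RuntimeError(
            "Aborting because over 1 in 8 p's had computation errors: "
            f"{len(bad)} of {len(primes)}."
        )
    return bad


def toy_checksum(p):
    # Stand-in for the real per-p checksum, which is not shown in this thread.
    return (p * 2654435761) % (1 << 24)


primes = [101, 103, 107, 109, 113, 127, 131, 137,
          139, 149, 151, 157, 163, 167, 173, 179]
gpu_sums = [toy_checksum(p) for p in primes]
gpu_sums[3] ^= 1  # simulate one bad GPU result (1 of 16 is under the limit)

needs_cpu_redo = verify_batch(primes, gpu_sums, toy_checksum)
print(needs_cpu_redo)
```

An occasional mismatch, such as one from an overclocked card, just ends up on the redo list; only a high error rate stops the run.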
____________
|
|
|
Ken_g6 Volunteer developer
|
Hm, I bet I forgot to document the new option, either in the program or anywhere else! -a or "--anygpu" allows computing on Nvidia; otherwise it aborts with "These aren't the devices you're looking for." So that's what you need to use if you want to test on Nvidia.
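A minimal sketch of that behaviour, assuming the decision is based on the OpenCL vendor string of the selected device. The thread doesn't say how the check is really done, and `check_device` is an invented name.

```python
import sys


def check_device(vendor, argv):
    """Refuse Nvidia devices unless -a / --anygpu was given."""
    any_gpu = "-a" in argv or "--anygpu" in argv
    if "nvidia" in vendor.lower() and not any_gpu:
        sys.exit("These aren't the devices you're looking for.")
    return True


# AMD devices pass without any switch; Nvidia passes only with the switch.
check_device("Advanced Micro Devices, Inc.", [])
check_device("NVIDIA Corporation", ["--anygpu"])
```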
____________
|
|
|
|
2.5b output :
C:\Users\Dutch\Downloads\tps-cl-alpha (1)>tpsieve-cl-boinc-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5b (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt :
19:25:22 (10828): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a 'Advanced Micro Devices, Inc.' 'Tahiti'.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 672.34 sec. (0.62 init + 671.72 sieve) at 13398667 p/sec.
Processor time: 1038.12 sec. (0.62 init + 1037.50 sieve) at 8674880 p/sec.
Average processor utilization: 1.01 (init), 1.54 (sieve)
19:36:34 (10828): called boinc_finish |
|
|
Ken_g6 Volunteer developer
|
No significant improvement in Windows on the CPU. Oh, well. I might do something else later. But for now, let's go to sieve with the app we have, not the app we might want or wish to have at a later time.
____________
|
|
|
|
Edit: Oh, and I don't think it's possible to compile with static linking when doing OpenCL code.
Not the OpenCL libraries, but it wasn't those that were the problem. It was the BOINC libs. The 64-bit one is looking for libboinc_api.so.6 and libboinc.so.6, while the 32-bit one isn't:
32-bit one:[mysys]: ldd tpsieve-cl-boinc-x86-linux
linux-gate.so.1 => (0xf7712000)
libOpenCL.so => /opt/AMDAPP/lib/x86/libOpenCL.so (0xf76e1000)
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf76c5000)
libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xf75db000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf7595000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xf7578000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf73c8000)
/lib/ld-linux.so.2 (0xf7713000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xf73c3000)
64-bit one:[mysys]: ldd tpsieve-cl-boinc-x86_64-linux
linux-vdso.so.1 => (0x00007fffb5fa1000)
libOpenCL.so => /opt/AMDAPP/lib/x86_64/libOpenCL.so (0x00007fb9277a8000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb92758a000)
libboinc_api.so.6 => not found
libboinc.so.6 => not found
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb927285000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb926f7e000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb926d68000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9269a2000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb9279d8000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb92679d000)
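A quick way to spot this kind of problem before shipping a build is to scan the `ldd` output for unresolved libraries. The helper below is only a sketch (the function name is made up); it works on text in the format shown above.

```python
def missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is everything before the arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


sample = """\
libOpenCL.so => /opt/AMDAPP/lib/x86_64/libOpenCL.so (0x00007fb9277a8000)
libboinc_api.so.6 => not found
libboinc.so.6 => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9269a2000)
"""
print(missing_libs(sample))
```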
|
|
|
|
In case it's of any use, here's a run of the command in 78651 on a LinuxMint 64-bit system with an AMD Radeon HD6670 card:
Command: ./tpsieve-cl-boinc-x86_64-linux -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
Result:
tpsieve version cl-0.2.5b (testing)
Compiled Aug 6 2014 with GCC 4.4.3
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 6144
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt:
21:40:42 (3197): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Resuming from checkpoint p=13120718186280961 in tpcheck13120716e9.txt
Thread 0 starting
Detected 96 multiprocessors (480 SPUs) on device 0.
Device 0 is a 'Advanced Micro Devices, Inc.' 'Turks'.
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 4614.92 sec. (0.25 init + 4614.67 sieve) at 1476576 p/sec.
Processor time: 2793.34 sec. (0.23 init + 2793.11 sieve) at 2439541 p/sec.
Average processor utilization: 0.91 (init), 0.61 (sieve)
22:57:37 (3197): called boinc_finish
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
Ken, are we ready to put the ATI app back into production? Do we have the necessary builds (Windows 32 bit, Mac 64 bit, Linux both 32 and 64 bit) built and, just as importantly, tested?
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
OK, the current status is:
0.2.5a works, but doesn't stop on Nvidia.
0.2.5b works, and aborts on Nvidia, although it lacks documentation and the 64-bit Linux build didn't have static BOINC libraries.
0.2.5c, which I just pushed out for Linux only, documents -a and allows all Linux binaries to have static BOINC libraries.
We have binaries for:
Mac: 0.2.5a (Iain would need to build c.)
Windows: 0.2.5b (but I could compile c if you want.)
Linux: 0.2.5c
The new algorithm works on Mac, and doesn't do any worse on anything else since 0.2.5a. (The original 0.2.5 was broken.) I don't know how much testing you need.
Also, will it be possible to have BOINC run these binaries on Intel graphics? They seem to work.
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
OK, the current status is:
0.2.5a works, but doesn't stop on Nvidia.
0.2.5b works, and aborts on Nvidia, although it lacks documentation and the 64-bit Linux build didn't have static BOINC libraries.
0.2.5c, which I just pushed out for Linux only, documents -a and allows all Linux binaries to have static BOINC libraries.
We have binaries for:
Mac: 0.2.5a (Iain would need to build c.)
Windows: 0.2.5b (but I could compile c if you want.)
Linux: 0.2.5c
The new algorithm works on Mac, and doesn't do any worse on anything else since 0.2.5a. (The original 0.2.5 was broken.) I don't know how much testing you need.
Also, will it be possible to have BOINC run these binaries on Intel graphics? They seem to work.
I'd want all versions to be at the same level, so I guess that would be 0.2.5c. We'd need both 32 and 64 bit binaries for Linux, as well as 32 bit for Windows and 64 bit for Mac.
Iain should be able to coordinate building the other versions of the apps.
____________
My lucky number is 75898524288+1 |
|
|
|
Mac 0.2.5c builds are now online: http://www.epcc.ed.ac.uk/~ibethune/files/tpsieve_cl_mac0.2.5c.zip
So if the Windows builds are ready then we're good to go.
It would also be great to see these supported on Intel GPUs via BOINC. It certainly works on my Iris Pro via the command line.
Cheers
- iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
Michael Goetz Volunteer moderator Project administrator
|
I'm not sure if we can support Intel video at this time.
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
Alright, all the files in that zipfile are now 0.2.5c.
____________
|
|
|
|
OK, the current status is:
...
0.2.5c, which I just pushed out for Linux only, documents -a and allows all Linux binaries to have static BOINC libraries.
Thanks. Both the 32- and 64-bit ones now run without me having to point them towards my own, hacked BOINC libs.
|
|
|
|
2.5c run
C:\Users\Dutch\Downloads\tps-cl-alpha (2)>tpsieve-cl-boinc-x86-windows.exe -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000 -T -M2 -c 60
tpsieve version cl-0.2.5c (testing)
nstart=6000000, nstep=41
tpsieve initialized: 5 <= k <= 9999, 6000000 <= n < 9000000
nstep changed to 32
CL setup complete.
cthread_count = 131072
13120716307705079 | 2959*2^8094354+1
13120716380747377 | 8859*2^7847166+1
13120716412325987 | 7211*2^7702451+1
13120716843190109 | 2167*2^7951358+1
13120716846657643 | 7003*2^8206351-1
13120717075819579 | 5089*2^8180582+1
13120717160591431 | 7745*2^8765661+1
13120717284827623 | 8907*2^8826378+1
13120717824685699 | 4613*2^7552946-1
13120717861537051 | 5537*2^7081032-1
13120718815920467 | 2501*2^8550910-1
13120718930066591 | 3699*2^7482137-1
13120719057141307 | 6089*2^8973032-1
13120719059517079 | 2445*2^7778847-1
13120719477216161 | 6435*2^8752766+1
13120719544063997 | 2985*2^7826085+1
13120719661679807 | 8253*2^8382530-1
13120719782159167 | 1543*2^7389775-1
13120719966492419 | 3749*2^6399636-1
13120720493876977 | 6645*2^7538049+1
13120720553340431 | 2055*2^8038331+1
13120720607106341 | 8311*2^7968572+1
13120720635031169 | 5679*2^8994161+1
13120720662833737 | 5469*2^6562478+1
13120720682005019 | 1935*2^7473932-1
13120721017491301 | 3531*2^6213937-1
13120721212413691 | 8341*2^6170513-1
13120721663769431 | 3585*2^6907848+1
13120721691198937 | 6395*2^7988954-1
13120721797389533 | 8075*2^8648682-1
13120721851239341 | 8595*2^6216665-1
13120721947519033 | 3327*2^6867165-1
13120722319646957 | 5003*2^6419493+1
13120722440108203 | 5707*2^6807554+1
13120722557491823 | 1731*2^8986587+1
13120722686753867 | 8135*2^7034866-1
13120723071996931 | 4241*2^7534559+1
13120723189240451 | 1119*2^6735109-1
13120723194075703 | 2055*2^6453532+1
13120723256598527 | 8165*2^8191313+1
13120723258653649 | 4355*2^7639969+1
13120723434822539 | 8535*2^6601016+1
13120723717933309 | 9707*2^7180267+1
13120723954016033 | 4685*2^8440045+1
13120723983397483 | 5737*2^6363638+1
13120724045408101 | 2565*2^6057669+1
13120724075161099 | 6163*2^7807643-1
13120724084494907 | 3081*2^6075501+1
13120724364906871 | 4281*2^8009816+1
13120724414854831 | 7821*2^6744179+1
13120724617851359 | 9975*2^7303461-1
13120724673389471 | 2361*2^6415555+1
13120724738634179 | 1485*2^6690363+1
Found 53 factors
stderr.txt :
19:00:59 (1948): Can't open init data file - running in standalone mode
Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 512 multiprocessors (2560 SPUs) on device 0.
Device 0 is a 'Advanced Micro Devices, Inc.' 'Tahiti'.
GCN device detected; use -m1 --vecsize=4 to undo effect
Thread 0 completed
Sieve complete: 13120716000000000 <= p < 13120725000000000
count=242515220,sum=0x2dca915341c02724
Elapsed time: 672.91 sec. (0.66 init + 672.25 sieve) at 13388222 p/sec.
Processor time: 1038.53 sec. (0.67 init + 1037.86 sieve) at 8671881 p/sec.
Average processor utilization: 1.01 (init), 1.54 (sieve)
19:12:12 (1948): called boinc_finish |
|
|
Michael Goetz Volunteer moderator Project administrator
|
The new ATI apps are online! Please let me know if they're working, if they're being sent out correctly, etc.
There's 32 bit Windows, 64 bit Mac, and both 32 and 64 bit Linux versions available.
____________
My lucky number is 75898524288+1 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,144,525,586 RAC: 2,292,407
|
Win7x64, HD7950, BOINC 7.2.42 working fine - got validated.
EDIT: Using ~1/2 of a CPU core.
____________
My stats |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Also, will it be possible to have BOINC run these binaries on Intel graphics? They seem to work.
I looked at the server's code yesterday. The good news (really good news) is that the binary C++ part of the server appears to support Intel GPUs. I can't tell if it works, but the code is in there for Intel. This means we probably don't have to upgrade the server, which would involve a lot of testing and the risk of breaking things that currently work.
The web server code doesn't support it, however. There's no "Use Intel GPU" checkbox on the PrimeGrid preferences page. Fortunately, this should be a fairly easy thing to add.
So the server side looks like it's possible.
On the application side, we need a method for telling the sieve app which type of GPU it's supposed to use. We'll do it the same way we did it with the GeneferOCL app, which can run on either Nvidia or ATI. The app should read a configuration file (in the current working directory). The contents of the file indicate which type of GPU to use.
This is how we did it with GFN. You could use the same convention, or change the file name and contents if you wish.
The file name is opencl_platform.txt which contains a single line containing either NVIDIA or AMD. For PPS sieve, we could use INTEL or AMD (and probably NVIDIA just in case we ever want to use it.)
It's okay, as far as BOINC is concerned, if the filename is the same as the config file used by GFN, but in case there's ever a desire to use this method in standalone mode it might be a good idea to change the name so it doesn't conflict. I can be very flexible as far as the file name and contents are concerned, but I do need to pass the information to the app in a file. I can't use a command line parameter for this.
Finally, you should use the BOINC file open API so that the filename gets redirected correctly and makes use of the boinc wait-and-retry mechanism. If you don't, let me know because I'll need to set the flag to copy the config file into the slot directory rather than creating a link to the file.
That's the one change that would be required for running on Intel. I'd also recommend the following, although it's completely optional and you can do whatever you feel is right:
* Add something like --intel, --amd, and --nvidia command line switches.
* When running in BOINC mode, if there's no config file, abort with an appropriate error message rather than attempting to run.
* Command line switches should override whatever is in the config file.
* There may be tuning that needs to be done specific to Intel GPUs.
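The scheme above could look roughly like this. It is a sketch under assumptions: the switch names and the abort-when-missing behaviour follow the suggestions in this post, `resolve_platform` is an invented name, and plain `open()` stands in for the BOINC file-open API that the real app would need to use.

```python
import os
import tempfile

PLATFORMS = ("AMD", "NVIDIA", "INTEL")


def resolve_platform(argv, config_path="opencl_platform.txt", boinc_mode=True):
    """Pick the GPU platform: command-line switches win over the config file."""
    for name in PLATFORMS:
        if "--" + name.lower() in argv:
            return name
    try:
        with open(config_path) as f:
            name = f.readline().strip().upper()
    except FileNotFoundError:
        if boinc_mode:
            raise SystemExit(f"{config_path} not found; refusing to guess a platform")
        return None  # standalone mode: let the caller fall back to its default
    if name not in PLATFORMS:
        raise SystemExit(f"unrecognized platform {name!r} in {config_path}")
    return name


# Usage example with a throwaway config file.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "opencl_platform.txt")
    with open(path, "w") as f:
        f.write("AMD\n")
    print(resolve_platform([], path))           # config file decides
    print(resolve_platform(["--intel"], path))  # switch overrides the file
```

Keeping the switches ahead of the file means a standalone tester can force a platform without touching whatever config file happens to be in the directory.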
____________
My lucky number is 75898524288+1 |
|
|
Ken_g6 Volunteer developer
|
Telling the app to use a platform is all well and good. But how do I detect which GPU is which platform? Also, BOINC sends an index number of the GPU to use. How do I reconcile that with using only one platform?
It seems like the GPU type to use should be specified in the plan_class, and the BOINC client should only be sending my app indices of GPUs that match that plan_class.
____________
|
|
|
|
On the application side, we need a method for telling the sieve app which type of GPU it's supposed to use.
So does that mean the test app posted here should run on an Intel GPU if the -a command line option is given?
|
|
|
Ken_g6 Volunteer developer
|
The -a switch just prevents it from aborting if it's run on Nvidia. If the BOINC client tells it to run on Intel, it will run on Intel.
____________
|
|
|
|
The -a switch just prevents it from aborting if it's run on Nvidia. If the BOINC client tells it to run on Intel, it will run on Intel.
I'm actually thinking about the test program, which is run by hand from the command line.
But I've just noticed that this is exactly the same file that gets downloaded by the BOINC manager.
So I could use this as a test for getting OpenCL set up under Linux on Intel CPU and GPU? |
|
|
|
Just noticed that the deadline for PPS-Sieve is 3 days. Seems a little short.
The Riesel Problem LLR (TRP), which has a similar recent CPU usage time, is set for a deadline of 6 days.
I've just had 5 sent to my desktop. Each reckons it will take 15 mins, but in practice each will take about 2 hours elapsed running time. With only a 3 day deadline and 5 other projects sharing the GPU that's not going to work well. |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Telling the app to use a platform is all well and good. But how do I detect which GPU is which platform? Also, BOINC sends an index number of the GPU to use. How do I reconcile that with using only one platform?
It seems like the GPU type to use should be specified in the plan_class, and the BOINC client should only be sending my app indices of GPUs that match that plan_class.
Iain should be able to help you with that. You can pull the code for that from the OCL version of Genefer.
____________
My lucky number is 75898524288+1 |
|
|
Roger Volunteer developer Volunteer tester
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
|
The new ATI apps are online! Please let me know if they're working, if they're being sent out correctly, etc.
AMD HD7970 GPU.
16 PPS-Sieve WU's so far.
12 Pending, 3 Valid, 1 in Error:
http://www.primegrid.com/result.php?resultid=566182591
Computation Error: Checksum mismatch
Average time per WU: 00:17:12
I am used to averaging more like 31 mins, so this is a huge improvement. |
|
|
|
Have there been any thoughts about using HSA on AMD APUs (i.e., running as a CPU but having it take advantage of the GPU cores as well) for sieve apps? And can HSA be implemented in LLR or other apps as well?
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
|
|
|
Ken_g6 Volunteer developer
|
HSA is just a buzzword. This app uses both CPU and GPU, as all GPU apps do. This one uses the CPU a little more than I'd like, actually. Technically, HSA indicates slightly closer integration between the two, but that's not an issue here.
All HSA-enabled APUs should be able to run this OpenCL app.
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
The new ATI apps are online! Please let me know if they're working, if they're being sent out correctly, etc.
AMD HD7970 GPU.
16 PPS-Sieve WU's so far.
12 Pending, 3 Valid, 1 in Error:
http://www.primegrid.com/result.php?resultid=566182591
Computation Error: Checksum mismatch
Average time per WU: 00:17:12
I'm used to averaging more like 31 minutes, so this is a huge improvement.
I just looked through the database. There are 6 Windows hosts (including yours) that got that error, 3 of which got it on multiple tasks. There are 500+ successful results from Windows computers with the new app, including a bunch from your computer.
My guess would be a hardware problem, but most of your tasks are completing just fine, so I'm not really sure.
____________
My lucky number is 75898524288+1 |
|
|
Roger Volunteer developer Volunteer tester
 Send message
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
                    
|
My guess would be a hardware problem, but most of your tasks are completing just fine, so I'm not really sure.
My wife was using the computer at the time, uploading hundreds of photos to Facebook and who knows what else, so let's write that one error off. The main point is that the app does what it's meant to do.
Now up to 11 Valid and 34 Pending tasks. |
|
|
|
Strange looking result for me on Linux.
http://www.primegrid.com/result.php?resultid=566171718
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
ksum mismatch for p=25785097999483649 between 10364055 and 9532140 at n=9000063.
Computation Error: Checksum mismatch for p=25785097999483661 between 9337148 and 5274531 at n=9000063.
Computation Error: Checksum mismatch for p=25785097999483679 between 10651500 and 13852497 at n=9000063.
.....
[lots of similar lines removed...]
...
Computation Error: Checksum mismatch for p=25785097999507597 between 6548372 and 9241212 at n=9000063.
Computation Error: Checksum mismatch for p=25785097999507601 between 7046866 and 15601207 at n=9000063.
Computation Error: Checksum mismatch for p=25785097999507663 between 3365387 and 3562234 at n=9000063.
Computation Error: Checksum mismatch for p=25785097999507681 between 10382823 and 16758743 at n=9000063.
Aborting because over 1 in 8 p's had computation errors: 768 of 768.
11:57:32 (5225): called boinc_finish
Sieve started: 25785093000000000 <= p < 25785102000000000
Resuming from checkpoint p=25785097929093633 in tpcheck25785093e9.txt
Thread 0 starting
Detected 96 multiprocessors (480 SPUs) on device 0.
Device 0 is a 'Advanced Micro Devices, Inc.' 'Turks'.
Sieve started: 25785093000000000 <= p < 25785102000000000
Resuming from checkpoint p=25785098290065921 in tpcheck25785093e9.txt
Thread 0 starting
Detected 96 multiprocessors (480 SPUs) on device 0.
Device 0 is a 'Advanced Micro Devices, Inc.' 'Turks'.
Thread 0 completed
Sieve complete: 25785093000000000 <= p < 25785102000000000
count=238159614,sum=0x96e0cf5f38895b44
Elapsed time: 2499.13 sec. (0.32 init + 2498.81 sieve) at 1484754 p/sec.
Processor time: 1542.58 sec. (0.32 init + 1542.26 sieve) at 2405648 p/sec.
Average processor utilization: 1.00 (init), 0.62 (sieve)
13:21:31 (3290): called boinc_finish
</stderr_txt>
]]>
Despite reporting all these errors, it seemed to keep going? |
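For anyone wanting to sanity-check the "Average processor utilization" figures in a log like the one above, here is a rough Python sketch. It assumes utilization is simply processor time divided by elapsed time for each phase, which matches the numbers printed in this log; the regular expression and the function names are my own, not part of the sieve app.

```python
import re

# The two timing lines from the stderr output above, copied verbatim.
LOG = """\
Elapsed time: 2499.13 sec. (0.32 init + 2498.81 sieve) at 1484754 p/sec.
Processor time: 1542.58 sec. (0.32 init + 1542.26 sieve) at 2405648 p/sec.
"""

# Hypothetical pattern for this log format: label, total, init and sieve seconds.
LINE = re.compile(
    r"^(Elapsed|Processor) time: ([\d.]+) sec\. "
    r"\(([\d.]+) init \+ ([\d.]+) sieve\)",
    re.MULTILINE,
)


def parse_times(log):
    """Return {'Elapsed': (init, sieve), 'Processor': (init, sieve)} in seconds."""
    return {m.group(1): (float(m.group(3)), float(m.group(4)))
            for m in LINE.finditer(log)}


def utilization(log):
    """CPU seconds divided by wall-clock seconds, for the init and sieve phases."""
    times = parse_times(log)
    (e_init, e_sieve), (p_init, p_sieve) = times["Elapsed"], times["Processor"]
    return p_init / e_init, p_sieve / e_sieve


init_util, sieve_util = utilization(LOG)
print(f"{init_util:.2f} (init), {sieve_util:.2f} (sieve)")
```

A sieve-phase value well below 1.00, as here, suggests the task was not using a full CPU core for the whole run.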
|
|
|
HSA is just a buzzword. This app uses both CPU and GPU, as all GPU apps do. This one uses the CPU a little more than I'd like, actually. Technically, HSA indicates slightly closer integration between the two, but that's not an issue here.
All HSA-enabled APUs should be able to run this OpenCL app.
I know the CPU and GPU cores are both used, but what I'm talking about is the increased efficiency HSA offers through a unified memory space and through passing pointers instead of the usual copying of memory between the CPU and iGPU. I know this may not make much of a difference in the sieve app, since it's basically all integer math, but there should be at least a noticeable difference. As for LLR, since HSA offloads much of the DP FPU work to the iGPU cores, I wonder if a modified LLR app would benefit a little, a lot, or not at all on HSA-enabled AMD APUs. This would only work on the current Steamroller and upcoming Excavator series of APUs (and later AMD architectures), since they're the only fully HSA-enabled ones, but it would be interesting to know for those who have one or plan on getting one.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
@Roger, I noticed your machine also had an error on TRP Sieve. I thought it was almost impossible to error there. So I'm chalking it up to some kind of hardware failure too. Unless I've accidentally created a race condition somewhere. In either case, the results that are returned are correct.
@Gordon Lack, that's rather mysterious. A BOINC app isn't supposed to be resumed after boinc_finish() is called. But it seems to have resumed from a point before the errors, and didn't repeat the errors, so it's probably a correct result.
@NeoMetal, nope, I don't see anything HSA could do for the sieve. There's a lot of parallelism already in there, so that the CPU and GPU can both be working on stuff while other stuff is copied between main memory and the GPU. I rather doubt there's anything it could help with on LLR either. The limiting factor there is the performance of the Fast Fourier Transform: George Woltman created a great customized FFT system for Intel x86, but no one has made a similarly customized FFT for any GPU.
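To illustrate the "Aborting because over 1 in 8 p's had computation errors" behaviour seen in the log earlier in the thread, here is a minimal Python sketch of a fail-fast check. This is not the app's actual code: `check_batch`, `toy_checksum`, and the idea that the two numbers in each "Checksum mismatch" line are a GPU value and a CPU reference value are all assumptions made for illustration.

```python
def check_batch(ps, gpu_checksums, cpu_checksum, max_error_fraction=1 / 8):
    """Compare each GPU checksum against a CPU reference for the same p.

    Returns the p values that mismatched (candidates for a CPU redo).
    Raises RuntimeError if more than max_error_fraction of the batch failed.
    """
    bad = []
    for p, gpu_sum in zip(ps, gpu_checksums):
        expected = cpu_checksum(p)
        if gpu_sum != expected:
            print(f"Computation Error: Checksum mismatch for p={p} "
                  f"between {gpu_sum} and {expected}")
            bad.append(p)
    # Fail fast: a few bad p's can be recovered, a mostly bad batch cannot.
    if len(bad) > len(ps) * max_error_fraction:
        raise RuntimeError(
            f"Aborting: {len(bad)} of {len(ps)} p's had computation errors.")
    return bad


def toy_checksum(p):
    """Stand-in checksum for the demo; NOT the real sieve checksum."""
    return p % 1000003


ps = list(range(1, 17))
results = [toy_checksum(p) for p in ps]
results[3] ^= 1  # corrupt one result to simulate a single GPU glitch
print(check_batch(ps, results, toy_checksum))  # one mismatch, batch continues
```

With one bad value out of 16 the batch carries on and reports the p to redo; if every value were bad, as in the 768 of 768 case, the check would raise instead.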
____________
|
|
|
|
@Gordon Lack, that's rather mysterious. A BOINC app isn't supposed to be resumed after boinc_finish() is called. But it seems to have resumed from a point before the errors, and didn't repeat the errors, so it's probably a correct result.
Perhaps there was some stderr.txt left from a previous run? It is odd that it started in the middle of a line...
The next one has now finished with no such silliness (http://www.primegrid.com/result.php?resultid=566171389).
In between, I did try updating to the released 14.4 AMD drivers, rather than the RC2 I'm running (both actually install as 14.10), but that led to a disaster: the install failed, and trying to switch back to the previous working version by installing that also failed.
Fortunately I take backups, so I was able to revert my installed OS to an earlier, working version...
Worked OK, as the completion of that second job shows (the failed upgrade/successful restore was in the middle).
|
|
|
|
I'm running the sieve on an HD5870, but get a load lower than 80%. Is there a way to increase the GPU load? |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
You hid your computers! So I can't tell if you're CPU-limited. The new app takes twice the CPU of the old one for the same speed - and the new app is often faster. That CPU usage is something I want to address, but it may take a while. If you're running 64-bit Windows, I may be able to improve that situation a little faster, though.
____________
|
|
|
|
Please do. I actually had to set up 1.5 CPU cores to each GPU task in app_config, to max out the R9-280x's.
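For anyone wanting to try the same thing, a rough Python sketch that generates such an app_config.xml is below. The `gpu_versions`, `gpu_usage`, and `cpu_usage` elements are standard BOINC app_config.xml settings; the app name `pps_sr2sieve` is an assumption, so check the `<name>` entries in your client_state.xml for the real one, and `build_app_config` is just a helper name made up here.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, cpu_usage, gpu_usage=1.0):
    """Build app_config.xml text reserving cpu_usage cores per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "pps_sr2sieve" is an assumed app name; verify it before using the file.
print(build_app_config("pps_sr2sieve", cpu_usage=1.5))
```

The output would go in the project's directory under the BOINC data folder, followed by re-reading the config files in the BOINC manager. Note that `cpu_usage` only tells the client how much CPU to budget for each GPU task; it does not make the app itself use more or less CPU.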
|
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
I'm running the sieve on an HD5870, but get a load lower than 80%. Is there a way to increase the GPU load?
Oh! I might understand. Basically, even when vectorized, the sieve uses about four integer units at a time. Older VLIW-architecture cards like yours have five integer units available. 4/5 = 80%. :) And no, there's no good way to get to 100% if that's the issue. GCN cards really are better at this task.
If the issue is CPU limitation, which I don't think it would be on a 3770k and a 5870, I'm working on that. I finally found the intrinsic I need. Now I need to add it and see if I can compile for 64-bit with VS 2008 Express.
____________
|
|
|
|
I'm running the sieve on an HD5870, but get a load lower than 80%. Is there a way to increase the GPU load?
Oh! I might understand. Basically, even when vectorized, the sieve uses about four integer units at a time. Older VLIW-architecture cards like yours have five integer units available. 4/5 = 80%. :) And no, there's no good way to get to 100% if that's the issue. GCN cards really are better at this task.
If the issue is CPU limitation, which I don't think it would be on a 3770k and a 5870, I'm working on that. I finally found the intrinsic I need. Now I need to add it and see if I can compile for 64-bit with VS 2008 Express.
Thanks for your previous answer. The problem is actually CPU load. I assigned the highest priority to the sieve task and the load went up from 60% to 84%. If I stop all CPU tasks altogether, the load goes up to 100%.
I'm using the latest Catalyst driver and I wonder if I should try another driver version. |
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Try leaving a full core for the sieve? Lots of OpenCL apps benefit from that.
You're also on an i7. Try turning off HT and leaving a full core for the sieve?
____________
|
|
|
|
Try leaving a full core for the sieve? Lots of OpenCL apps benefit from that.
You're also on an i7. Try turning off HT and leaving a full core for the sieve?
It's definitely going to work, but it's not a good idea for me. I'm running PSP Sieve on the CPU and it's better with HT on. If I can't get the GPU PPS Sieve to 100% load with all CPU cores occupied, then I'll leave it be.
Thanks anyway |
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1784 ID: 37043 Credit: 791,819,098 RAC: 1,251,944
                     
|
I just wanted to say a BIG THANKS to all the people who helped get this back on line again!! Your work IS appreciated!!! |
|
|
|
A few tasks ended with errors.
One ended with this. Can someone tell me the cause of the error?
Stderr output
<core_client_version>7.2.42</core_client_version>
<
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,614,871 RAC: 816,519
                               
|
I looked in the database for that type of error. There's a total of 36 hosts that have at least one occurrence of that error, but only 10 hosts with more than 5 such errors (including yours).
The hosts include Mac, Windows, and Linux, the old sieve app as well as the new sieve app, and both Nvidia and ATI GPUs. There doesn't appear to be any indication that this is related to the type of hardware or software.
Because the errors seem to be independent of the type of hardware and software, the most likely cause is hardware errors on the GPU. An obvious culprit would be overclocking, so if you're overclocking, I'd suggest reducing the clocks to see if the problem goes away.
Another possible cause is temperature, and this might account for why only some of your tasks are failing.
____________
My lucky number is 75898524288+1 |
|
|
|
I have checked the ambient, GPU and core temps as culprits, but they are within limits. I went online looking for ways to underclock Macs, and all I saw were fan-speed-related solutions. When the latest task error occurred, my computer was not under any load. I'm primarily a Photoshop user and have enough RAM to cover it and then some. |
|
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,270,144 RAC: 179,433
                     
|
Every time I see a checksum mismatch I think about memory errors. |
|
|
|
I'm not a computer-literate person; can this be explained to me?
Stderr output
<core_client_version>7.2.42</core_client_version>
<
|
|
Ken_g6 Volunteer developer
 Send message
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
                            
|
Huh, for some reason your GPU returned 0x000000ff for all tests. This claims that every test found a factor and the checksum was 0 for every test set. Most likely the numbers to test weren't sent to the GPU in the first place, though I don't know why that would be. Were you doing something graphics-intensive at the time?
____________
|
|
|
|