Hi all,
Thanks to Bryan, we have the client apps for the forthcoming AP27 search ready enough for some wider testing. There are CPU (requiring SSE2 at a minimum) and OpenCL applications, and builds for Windows, Linux and Mac OS. The CPU code is 64-bit only. You can download the builds from https://github.com/ibethune/ap26/tree/master/bin
At this stage, I'd invite you to try them out so we can find out any OS version/library incompatibilities, and of course any code bugs! There are three example test ranges that you can run - see https://github.com/ibethune/ap26/tree/master/tests. All you need to do is run the app:
./ap26_sse2_macintel 366384 366384 0
or
./ap26_sse2_macintel 44121555 44121555 0
or
./ap26_sse2_macintel 47715109 47715109 0
For OpenCL, append --device N
Once the app completes, there should be a SOL-AP26.txt file, which you can compare to the one in the directory linked above.
Please post here if you run into any problems, and especially if the outputs don't match!
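As a minimal sketch of the comparison step: the helper below compares the produced SOL-AP26.txt with a saved reference copy line by line, ignoring blank lines and Windows/Unix line-ending differences. The function names and the reference path in the usage comment are assumptions for illustration and are not part of the AP26 apps.

```python
from pathlib import Path


def normalized_lines(path):
    """Return the non-empty lines of a text file with surrounding whitespace removed."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def results_match(produced, reference):
    """True when both files contain the same non-empty lines in the same order."""
    return normalized_lines(produced) == normalized_lines(reference)


# Usage (hypothetical paths):
#   results_match("SOL-AP26.txt", "reference/SOL-AP26.txt")
```

A mismatch reported by such a check is exactly the kind of result worth posting in this thread.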
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
Hold off a bit, working on a problem with the opencl build...
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
Apps are updated and should now be working. OpenCL app is reportedly causing LOTS of screen lag, but low CPU usage. Let us know how it works for you...
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
EDIT: Nvm. New apps seem to be working.
OpenCL doesn't seem to be working on my machine (or I'm doing something wrong). When I start the app, the following shows up on CMD
C:\Users\rtrig\Desktop\Arquivos\Manual Sieve\AP26>ap26_opencl_windows32.exe 366384 366384 0 -- 0
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 3 2016 with GCC 4.9.0
Search parameters are KMIN: 366384 KMAX: 366384 SHIFT: 0
Beginning a new search with parameters from the command line
Group 1 with 1 devices
Group 2 with 1 devices
Device 0
Platform name: Intel(R) OpenCL
Vendor: Intel(R) Corporation
Device name: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
Device 1
Platform name: NVIDIA CUDA
Vendor: NVIDIA Corporation
Device name: GeForce GTX 970
using device 0
But there's no CPU / GPU usage. After a little while, the app just stops. Here's the Stderr file (I've separated the 32 and 64 bit results, in order, with a line of *).
17:28:48 (480): Can't open init data file - running in standalone mode
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little
Compiled Jul 3 2016 with GCC 4.9.0
GPU global memory available: 2147352576
GPU max memory allocation: 536838144
17:28:51 (480): Can't acquire lockfile (32) - waiting 35s
17:29:19 (480): No heartbeat from client for 30 sec - exiting
17:29:19 (480): timer handler: client dead, exiting
17:30:29 (40140): Can't open init data file - running in standalone mode
*************************
17:27:16 (37752): Can't open init data file - running in standalone mode
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little
Compiled Jul 3 2016 with GCC 4.9.0
GPU global memory available: 4240371712
GPU max memory allocation: 4281318400
17:27:17 (37752): Can't acquire lockfile (32) - waiting 35s
17:27:48 (37752): No heartbeat from client for 30 sec - exiting
17:27:48 (37752): timer handler: client dead, exiting
17:27:52 (37752): Can't acquire lockfile (32) - exiting
17:27:52 (37752): Error: The process cannot access the file because it is being used by another process.
(0x20)
SSE2 version seems to be running, at the very least. Will report back / edit this post once it finishes. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Rafael,
1) Make sure you've downloaded the latest binaries. The first OpenCL apps today didn't work because they tried to open the lockfile twice, which couldn't work and would produce the error you see about not acquiring the lockfile.
2) Before each run, delete all files in the directory except the binaries. If the old lockfile is there, the app won't start. If the old result file is there, the app will think it's already done.
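The cleanup in step 2 could be scripted roughly as follows. This is a sketch under one assumption: that the binaries are the only files whose names start with `ap26_` (as with the file names in this thread), so everything else in the directory is leftover state such as the lockfile, the result file and stderr output. The function name is hypothetical, and it should only be pointed at a dedicated test directory.

```python
from pathlib import Path


def clean_run_directory(directory, keep_prefix="ap26_"):
    """Delete every regular file in `directory` whose name does not start
    with `keep_prefix`. Returns the sorted names of the deleted files."""
    removed = []
    for entry in Path(directory).iterdir():
        # Assumption: anything not named like a binary is leftover run state.
        if entry.is_file() and not entry.name.startswith(keep_prefix):
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Returning the list of deleted names makes it easy to see what was left over from the previous run.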
____________
My lucky number is 75898524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Also, you seem to be missing the --device 0 parameter. You have "-- 0" rather than "--device 0".
____________
My lucky number is 75898524288+1 |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
Okay, let's try this again. Reminder to self: using device 0 actually runs OpenCL on the CPU cores rather than the GPU. Would have been nice to know beforehand.
Anyways, I've downloaded the newest apps, but there's still lots of screen lag on the GPU, both on 32 and 64 bit transforms. On the flipside, both return the expected result in 22~23 seconds (is that a fast time?). |
|
|
|
Both the 64-bit and 32-bit OpenCL versions fail to start on my 64-bit Windows 10 machine.
Error (roughly translated ;) ): xxx.exe is not compatible with the running Windows version.
The error title bar says: Unsupported 16 Bit Application ??
Edit: here is the error as a screenshot.
The same error occurs with the SSE2 version.
http://imgur.com/jZqeof0
____________
|
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
I just had that error... turns out you can't right-click the link and choose Save As, otherwise you get an HTML page saved. You need to click it, then click View Raw, which then downloads the file. I don't get their logic in how the page is arranged. |
|
|
Artist Volunteer tester
Joined: 29 Sep 08 Posts: 88 ID: 29825 Credit: 395,055,329 RAC: 259,488
|
No problems here.
$ ./ap26_sse2_linux64 366384 366384 0
AP26 SSE2 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 3 2016 with GCC 4.6.4
AP26 SSE2 10-shift search version 1.1-dev by Bryan Little
Compiled Jul 3 2016 with GCC 4.6.4
Search parameters are KMIN: 366384 KMAX: 366384 SHIFT: 0
Beginning a new search with parameters from the command line
1 K in this range remaining to be searched (0 skipped, 0 done).
Starting search... reporting APs of size 20 and larger
Solution: 25 366384 6171054912832631
Computation of K: 366384 SHIFT: 0 complete
Computation of K: 366384 SHIFT: 64 complete
Computation of K: 366384 SHIFT: 128 complete
Computation of K: 366384 SHIFT: 192 complete
Computation of K: 366384 SHIFT: 256 complete
Computation of K: 366384 SHIFT: 320 complete
Computation of K: 366384 SHIFT: 384 complete
Computation of K: 366384 SHIFT: 448 complete
Computation of K: 366384 SHIFT: 512 complete
Computation of K: 366384 SHIFT: 576 complete
total CPU time for K was 933 seconds
Checkpoint: KMIN=366384 KMAX=366384 SHIFT=0 K=366385 (100.00%)
$ cat SOL-AP26.txt
25 366384 6171054912832631
000B2E6000015550
____________
144052 *5^2018290+1 is Prime! |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Okay, let's try this again. Reminder to self: using device 0 actually runs OpenCL on the CPU cores rather than the GPU. Would have been nice to know beforehand.
Anyways, I've downloaded the newest apps, but there's still lots of screen lag on the GPU, both on 32 and 64 bit transforms. On the flipside, both return the expected result in 22~23 seconds (is that a fast time?).
It depends on your computer. On mine, device 0 is the GPU.
This could be an issue running under BOINC. We need to make sure the correct device is selected.
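Since the device numbering differs between machines, one option when testing by hand is to read the device listing the app prints at startup before choosing a value for --device. The sketch below parses that listing in the format shown earlier in this thread. The function names are hypothetical, and treating any device whose name contains "CPU" as a CPU device is only a heuristic, not something the app itself does.

```python
import re


def parse_device_listing(output):
    """Parse 'Device N' blocks from the app's startup output into a list of dicts."""
    devices = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"Device (\d+)", line)
        if header:
            current = {"index": int(header.group(1))}
            devices.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            if key in ("Platform name", "Vendor", "Device name"):
                current[key] = value.strip()
    return devices


def first_non_cpu_device(devices):
    """Heuristic: return the index of the first device whose name lacks 'CPU', else None."""
    for device in devices:
        if "CPU" not in device.get("Device name", ""):
            return device["index"]
    return None


listing = """Device 0
Platform name: Intel(R) OpenCL
Vendor: Intel(R) Corporation
Device name: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
Device 1
Platform name: NVIDIA CUDA
Vendor: NVIDIA Corporation
Device name: GeForce GTX 970
using device 0
"""
print(first_non_cpu_device(parse_device_listing(listing)))  # 1
```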
____________
My lucky number is 75898524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
I just had that error... turns out you can't right-click the link and choose Save As, otherwise you get an HTML page saved. You need to click it, then click View Raw, which then downloads the file. I don't get their logic in how the page is arranged.
Yup. If you're having trouble running it, chances are you saved the web page rather than the binary. They don't run very well. :)
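A quick way to tell a saved web page from a real binary is to look at the first few bytes of the file. The sketch below checks for the standard PE, ELF and Mach-O magic numbers and for an HTML prologue; the function names are made up for this example.

```python
EXECUTABLE_MAGICS = {
    b"MZ": "Windows PE",
    b"\x7fELF": "Linux ELF",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit",
    b"\xca\xfe\xba\xbe": "Mach-O universal",
}


def classify_download(data):
    """Classify a file from its leading bytes."""
    for magic, kind in EXECUTABLE_MAGICS.items():
        if data.startswith(magic):
            return kind
    head = data[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "HTML page"
    return "unknown"


def classify_file(path):
    """Read the first 512 bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return classify_download(handle.read(512))
```

If `classify_file` reports "HTML page" for one of the downloaded apps, the download needs to be repeated through the raw link.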
____________
My lucky number is 75898524288+1 |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
CPU tests done on two systems:
System 1: i7-6700k fixed at 4.2 GHz. Windows 10 64-bit. All tests gave expected results. Run time took 796, 950, 800 seconds respectively.
System 2: i5-4570S variable 3.2 to 3.6 GHz due to turbo. Windows 7 64-bit. All tests gave expected results. Run time took 958, 1133, 955 seconds respectively.
I have some older CPUs I can try tomorrow... |
|
|
|
Mostly tested the first range for kicks
GTX980ti (Factory OC): 15 seconds range 1, 17 range 2
GT430 (1GB): "clWrite Error" "CL_MEM_OBJECT_ALLOCATION_FAILURE", which is both expected and a little surprising. I know it is below the 1.4 GB memory requirement, but can't it borrow from system memory to make up the difference?
HD7950: 60 seconds empty CPU, 64 with all 12 threads running ESPSieve, not a bad slowdown, considering
On the CPU side, just for range 1:
i7 3930k @3.8; 974 sec
Xeon X5675 @ 3.46; 1136 sec
The SSE2 app is great for architecture comparisons: clock for clock, Sandy-E is ~6.2% faster than Westmere-EP, and using mackerel's results, Skylake (assuming 4.2 GHz) is 10.7% faster than Sandy-E.
Despite my GPU being 64x faster than the CPU on the same range (!!!), it's nice to see a program where older hardware is still useful and potentially competitive.
I'll leave the in-depth per n cores work to mackerel ;)
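The clock-for-clock percentages in this post can be reproduced by multiplying each run time by the clock speed, which gives an approximate cycle count for range 1, and comparing those counts. This is a sketch of that arithmetic using the figures quoted in the thread; the function names are illustrative.

```python
def cycles(seconds, ghz):
    """Approximate work in billions of clock cycles: run time times clock speed."""
    return seconds * ghz


def per_clock_advantage(fast, slow):
    """Percentage by which `fast` needs fewer cycles than `slow` for the same range.
    Each argument is a (seconds, ghz) pair."""
    return (cycles(*slow) / cycles(*fast) - 1.0) * 100.0


sandy_e = (974, 3.8)      # i7 3930k, range 1
westmere = (1136, 3.46)   # Xeon X5675, range 1
skylake = (796, 4.2)      # mackerel's i7-6700k, range 1

print(round(per_clock_advantage(sandy_e, westmere), 1))  # 6.2
print(round(per_clock_advantage(skylake, sandy_e), 1))   # 10.7
```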
____________
Eating more cheese on Thursdays. |
|
|
|
Despite my GPU being 64x faster than CPU at the same range (!!!), It's nice to see a program where older hardware is still useful and potentially competitive.
I was pleased to see the reference to the SSE2 instruction set, since my two older dual-core machines (Intel P9600 at 2.67 GHz) should do very well with this app. They have helped me much more than I expected with ESP Sieve over the past few months!
I will try to steal some time tomorrow from these machines, which are currently busy with ESP Sieve, for some benchmarking with the AP app. |
|
|
Nortech Volunteer tester
Joined: 7 Jun 10 Posts: 23 ID: 61946 Credit: 256,237,454 RAC: 7,412
|
Host ID: 499554 running ap26_opencl_linux64 on a 2GB GeForce GTX 960 (driver 352.63 & no monitor connected). All results as expected in 42s, 46s and 41s respectively.
Host ID: 400358 running ap26_sse2_linux64 on an i7-3770. All results as expected in 917s, 1037s & 907s respectively.
Host ID: 498992 running ap26_sse2_linux64 on an AMD Athlon X2 4200+. All results as expected in 4364s, 4478s & 4215s respectively.
Host ID: 498992 running ap26_opencl_linux64 on a 1GB GeForce GT730 (driver 340.96 & no monitor connected).
Tests also terminate with “clWrite Error” & “CL_MEM_OBJECT_ALLOCATION_FAILURE”. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Host ID: 498992 running ap26_opencl_linux64 on a 1GB GeForce GT730 (driver 340.96 & no monitor connected).
Tests also terminate with “clWrite Error” & “CL_MEM_OBJECT_ALLOCATION_FAILURE”.
Considering the app needs about 1.4 GB to run on the GPU, I would expect it to fail like that.
In production, the plan class will be written such that tasks won't (or at least shouldn't) be sent to systems with insufficient video memory.
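The production filtering would happen server-side in the plan class. For local testing, a rough equivalent is to read the memory figure the app writes to stderr and compare it with the requirement. The sketch below assumes the stderr format shown earlier in this thread and interprets "about 1.4 GB" as 1.4 GiB; both the threshold and the function names are assumptions.

```python
import re

# Assumption: "about 1.4 GB" from the thread, interpreted as 1.4 * 1024**3 bytes.
REQUIRED_BYTES = int(1.4 * 1024**3)


def reported_gpu_memory(stderr_text):
    """Return the 'GPU global memory available' value in bytes, or None if absent."""
    match = re.search(r"GPU global memory available:\s*(\d+)", stderr_text)
    return int(match.group(1)) if match else None


def has_enough_memory(stderr_text, required=REQUIRED_BYTES):
    """True when stderr reports at least `required` bytes of global memory."""
    available = reported_gpu_memory(stderr_text)
    return available is not None and available >= required
```

A 1 GB card fails this check, which is consistent with the allocation failures reported above.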
____________
My lucky number is 75898524288+1 |
|
|
|
I just had that error... turns out you can't right-click the link and choose Save As, otherwise you get an HTML page saved. You need to click it, then click View Raw, which then downloads the file. I don't get their logic in how the page is arranged.
Thanks - now it's working as expected.
GTX 980 Ti running fine, residues match.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
|
Win 10 x64, Fury Nano, Catalyst 16.5.1 | 51, 61, 49 secs
Win 7 x64, 280X, Catalyst 16.5.2 | 84, 99, 83 secs
Results match in all cases.
____________
My stats |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
I'll leave the in-depth per n cores work to mackerel ;)
LOL I'll leave that until things are more final and representative of actual project work units.
Still, up to now there is enough data to form some picture. My Skylake does seem faster than the other Intel results so far (after clock normalisation), although there is enough variation amongst the lot I will have to drill down into the generational architecture differences later.
The Skylake CPU is nothing compared to the 980 Ti, which will do over 13x the work of the CPU per given time, assuming the CPU scales to 4 cores without performance impact. Again, that is something to look at later.
On the GPU side, it looks like nvidia has the advantage here. Even the 960 is faster than the Fury nano.
Comparing the 280X against Skylake again, the gap is much closer, with the GPU overall doing about 2.4x the work per time. Personally I think that is a poor performance-per-watt use of the GPU.
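The "over 13x" and "about 2.4x" figures are consistent with the following arithmetic. It assumes the inputs were the range 1 times reported earlier in the thread (15 s for the 980 Ti, 84 s for the 280X, 796 s for the i7-6700k) and that the CPU runs four tasks at once with no slowdown. The function name is illustrative.

```python
def throughput_ratio(gpu_seconds, cpu_seconds, cpu_cores):
    """How many times more ranges the GPU finishes per unit time than the whole CPU,
    assuming one task per core and perfect scaling across cores."""
    cpu_seconds_per_task = cpu_seconds / cpu_cores
    return cpu_seconds_per_task / gpu_seconds


print(round(throughput_ratio(15, 796, 4), 1))  # 13.3  (GTX 980 Ti vs i7-6700k)
print(round(throughput_ratio(84, 796, 4), 1))  # 2.4   (R9 280X vs i7-6700k)
```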
If anyone has a modern AMD CPU I'd be interested to see what that does. As more results come in I'll see if I can chart this up in some way. I'm currently running on some really old CPUs and it is taking forever... |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
Intel E6600, 2.4 GHz, Windows 10-64, test 1: 1954s, result ok.
AMD E2-1800, Windows 10-64, 1.7 GHz, test 1: 5112s, result ok.
AMD N54L, Server 2008 R2, 2.2 GHz, test 1: 3409s, result ok.
I don't think I'll do the other two tests on those. |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
Some GPU tests: both systems run Windows 10-64 and nvidia driver 368.39.
ocl64 app, GTX 750 Ti, 143, 160, 138s, results ok
ocl32 app, GTX 750 Ti, 131, 138, 124s, results ok
ocl64 app, GTX 970M, 42, 46, 40s, results ok
ocl32 app, GTX 970M, 37, 42, 36s, results ok
Note the 32 bit app is faster than the 64 bit one by an average of 12%. For previous tests, who used which app?
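One way to arrive at the 12% figure is to take the 64-bit/32-bit time ratio for each of the six runs and average them. This sketch uses the timings listed in this post; whether the average was originally computed exactly this way is an assumption.

```python
from statistics import mean

# Run times in seconds for test ranges 1, 2 and 3.
ocl64 = {"GTX 750 Ti": [143, 160, 138], "GTX 970M": [42, 46, 40]}
ocl32 = {"GTX 750 Ti": [131, 138, 124], "GTX 970M": [37, 42, 36]}

# Speedup of the 32-bit app over the 64-bit app for each run.
speedups = [t64 / t32 for gpu in ocl64 for t64, t32 in zip(ocl64[gpu], ocl32[gpu])]
average_gain = (mean(speedups) - 1.0) * 100.0

print(round(average_gain))  # 12
```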
The 970M worked out about the same speed as the 960 reported earlier (depending if 32 or 64 bit app used), and the 750 Ti is still faster overall than the 6700k. |
|
|
|
Mac Pro 2008, 2xE5472, 3ghz, osx 10.11.5
Nvidia web driver 346.03.10f02
cpu tasks, all results match, 1387 sec, 1538 sec , 1356 sec
GTX 680, 4GB, all results match, 84 sec, 93 sec, 84 sec
GT 640, 4GB, all results match, 399 sec, 446 sec, 404 sec
My device numbering started at 1, which threw me for a second.
|
|
|
|
Mac Pro 2009, 2xE5520, osx 10.11.5
R9 280x
Fail on all 3 (same error).
dora:1 vzimmerman$ ../ap26_opencl_macintel64 366384 366384 0 --device 1
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 3 2016 with GCC 4.2.1 Compatible Clang 3.6.2 (tags/RELEASE_362/final)
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little
Compiled Jul 3 2016 with GCC 4.2.1 Compatible Clang 3.6.2 (tags/RELEASE_362/final)
Search parameters are KMIN: 366384 KMAX: 366384 SHIFT: 0
Beginning a new search with parameters from the command line
Group 1 with 1 devices
Group 2 with 1 devices
Device 0
Platform name: Apple
Vendor: Apple
Device name: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Device 1
Platform name: Apple
Vendor: Apple
Device name: AMD Radeon HD Tahiti XT Prototype Compute Engine
using device 1
compiling clearok
building with optimizations
compiling clearsol
building with optimizations
compiling offset
building with optimizations
compiling setok
building with optimizations
compiling sieve
building with optimizations
compiling setupokok
building withOUT optimizations
compiling checkn
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for checkn_program:
Error returned by cvms_element_build_from_source
CL_BUILD_PROGRAM_FAILURE
dora:1 vzimmerman$ more stderr.txt
10:53:46 (26589): Can't open init data file - running in standalone mode
GPU global memory available: 3221225472
GPU max memory allocation: 805306368
OpenCL Error, Code: -11
|
|
|
|
Mac Pro 2009, 2xE5520, osx 10.11.5
R9 280x
Fail on all 3 (same error).
...
compiling checkn
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for checkn_program:
Error returned by cvms_element_build_from_source
CL_BUILD_PROGRAM_FAILURE
Yup, I also have this problem. Can't figure out a workaround at this point, going to report a bug to Apple.
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
Mac Pro 2009, 2xE5520, osx 10.11.5
R9 280x
Fail on all 3 (same error).
...
compiling checkn
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for checkn_program:
Error returned by cvms_element_build_from_source
CL_BUILD_PROGRAM_FAILURE
Yup, I also have this problem. Can't figure out a workaround at this point, going to report a bug to Apple.
I wonder if it is related to the opencl2/4 bug?
|
|
|
|
I intended to run the OpenCL app on the integrated graphics of my Ivy Bridge i5, but it turns out that the device is really OpenCL on the CPU. To be fair, I'm not entirely sure I have a fully functioning iGPU driver on that machine anyway; GPU-Z reports no compute attributes or memory.
Anyway, running the OCL app on the CPU results in 3 used cores, same 1.4 GB total memory usage as a GPU and a 612 sec total running time, which, though much faster than 1 core running the CPU app at 862 sec, isn't the 3x that would be required to make it worthwhile.
Back on the GPU side, a GTX580 does 45s, 53s, 47s.
Looking at Task manager, sometimes the app shows using a full core, sometimes it doesn't, and the system memory usage is running much lower than 1.1 GB (350 MB for my 580). Were things improved in these regards?
Honza, I'm curious about your AMD GPU results, did you have other stuff running in the background? Your 280x should be faster than my HD7950 (60, 66, 60) which would also mean that your Fury is a bit off as well. My driver is Omega 14.12
____________
Eating more cheese on Thursdays. |
|
|
|
Some more testing done under Windows 10 64-bit
CPU:
- i5-3570k @ 4.0 GHz: Run time took 916, 1047, 576 seconds, results ok.
GPU (64-bit app):
- R9 280X: Run time took 85, 100, 84 seconds, results ok.
- GT 730: Terminates with "clWrite Error" and "CL_MEM_OBJECT_ALLOCATION_FAILURE", because of insufficient memory I guess. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Looking at Task manager, sometimes the app shows using a full core, sometimes it doesn't, and the system memory usage is running much lower than 1.1 GB (350 MB for my 580). Were things improved in these regards?
Yes. The OpenCL app no longer uses a CPU core, and the amount of main memory dropped down to less than 400MB. GPU memory remains at about 1.4GB.
____________
My lucky number is 75898524288+1 |
|
|
|
Honza, I'm curious about your AMD GPU results, did you have other stuff running in the background? Your 280x should be faster than my HD7950 (60, 66, 60) which would also mean that your Fury is a bit off as well. My driver is Omega 14.12
My 280X results are pretty similar to Honza's; I didn't have anything running in the background. My driver is Crimson 16.6.1.
I don't know why it is so slow. |
|
|
|
Looking at Task manager, sometimes the app shows using a full core, sometimes it doesn't, and the system memory usage is running much lower than 1.1 GB (350 MB for my 580). Were things improved in these regards?
Yes. The OpenCL app no longer uses a CPU core, and the amount of main memory dropped down to less than 400MB. GPU memory remains at about 1.4GB.
Splendid!
Went to a Haswell laptop to try and see if the HD4600 would work and got this error:
compiling clearok
building with optimizations
compiling clearsol
building with optimizations
compiling offset
building with optimizations
compiling setok
building with optimizations
compiling sieve
building with optimizations
compiling setupokok
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for setupokok_program:
fcl build 1 succeeded.
fcl build 2 succeeded.
Error: out of memory.
CL_BUILD_PROGRAM_FAILURE
The output differs from when my GT430 failed: the GT430 completed the compile before it errored out on insufficient memory. Is this essentially the same error or a different one? Despite dynamic memory allocation in the driver, I don't think it would actually reserve enough to run anyway (it's a laptop that won't let me set it manually).
____________
Eating more cheese on Thursdays. |
|
|
|
Honza, I'm curious about your AMD GPU results, did you have other stuff running in the background? Your 280x should be faster than my HD7950 (60, 66, 60) which would also mean that your Fury is a bit off as well. My driver is Omega 14.12
My 280X results are pretty similar to Honza's; I didn't have anything running in the background. My driver is Crimson 16.6.1.
I don't know why it is so slow.
Interesting. I wonder if it's the drivers. If it wasn't several pains in multiple backsides to update and then revert drivers (I need 14.12 for PRPnet) I'd try the latest Crimson and see what happened.
____________
Eating more cheese on Thursdays. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
With regards to using an iGPU, is your iGPU configured to use a specific amount of memory? If it's less than 1.5GB it will certainly fail. It may need more than that if it's also running the screen.
____________
My lucky number is 75898524288+1 |
|
|
|
With regards to using an iGPU, is your iGPU configured to use a specific amount of memory? If it's less than 1.5GB it will certainly fail. It may need more than that if it's also running the screen.
Hard to say; the BIOS is so locked down that I can't even check those kinds of features, and the Intel control panel just reports the 12 GB of system memory. I'm sure it's less than 1.5 GB, but it was the difference in the out-of-memory errors between the iGPU and a dedicated card that made me wonder. I don't know whether the error came from the Intel iGPU being unable to compile rather than from a lack of memory, since the Nvidia card actually finished compiling and tried to run the task.
____________
Eating more cheese on Thursdays. |
|
|
|
My i7 mac mini shows 1536MB on its hd4000 (which should be exactly 1.5G), but it gets a slightly different error (not out of memory), but that may be a platform thing.
[snip]
building with optimizations
compiling setupokok
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for setupokok_program:
Error: internal error.
CL_BUILD_PROGRAM_FAILURE |
|
|
mfl0p Project administrator Volunteer developer
Joined: 5 Apr 09 Posts: 249 ID: 38042 Credit: 2,471,970,980 RAC: 3,397,462
|
The app will not work on Intel GPUs, only on AMD and Nvidia. |
|
|
|
Well, that clears that up then! Thanks
____________
Eating more cheese on Thursdays. |
|
|
|
Ran all three on my desktop (Windows 7, Athlon II 2.7 GHz dual core); all went well with correct results. Took 1857, 2122, and 1893 seconds. The first test ran by itself, and the last two ran at the same time. Judging by other people's times, there is very little slowdown running multiple tasks at once.
I might run a few more tests of multi-task slowdown as well as my dual-boot Windows 8.1/Linux laptop. |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
|
Honza, I'm curious about your AMD GPU results, did you have other stuff running in the background? Your 280x should be faster than my HD7950 (60, 66, 60) which would also mean that your Fury is a bit off as well. My driver is Omega 14.12
Nope, both tests were with no other CPU intensive apps in background.
I'd expected AMD to be slower, even compared to the old GTX 580.
Anybody with new GTX 1080/1070?
____________
My stats |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
Finished my round of testing. Here are the results:
1- The 32-bit OCL app doesn't work on the CPU. SSE2 is doing just fine, and OCL64 can also use 3 cores and complete with the correct result. But OCL32 on the CPU gives me the following:
Group 1 with 1 devices
Group 2 with 1 devices
Device 0
Platform name: Intel(R) OpenCL
Vendor: Intel(R) Corporation
Device name: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
Device 1
Platform name: NVIDIA CUDA
Vendor: NVIDIA Corporation
Device name: GeForce GTX 970
using device 0
compiling clearok
building with optimizations
compiling clearsol
building with optimizations
compiling offset
building with optimizations
compiling setok
building with optimizations
compiling sieve
building with optimizations
compiling setupokok
building withOUT optimizations
compiling checkn
building withOUT optimizations
compiling done
local workgroup size for sieve kernel is 64 threads
clMalloc Error
2- On my Gtx 970, the OCL32 app is actually 2 seconds faster than the 64bit one. Both return the correct result, but for some reason, 32bit always ends up being a bit faster.
3- Still horrible screen lag on GPU OCL. |
|
|
mfl0p Project administrator Volunteer developer
Joined: 5 Apr 09 Posts: 249 ID: 38042 Credit: 2,471,970,980 RAC: 3,397,462
|
The OCL app on any CPU will be slower than running an SSE2 app on each core. This is because the OpenCL interface does not expose the SSE2 instructions directly to the programmer; you have to rely on the compiler.
Also, the OCL app's sieve uses 32-bit instructions, which would be slower on a 64-bit CPU. |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
Ocl app on any CPU will be slower than running a sse2 app on each core. This is because the cl interface does not expose the sse2 instructions directly to the programmer. You have to rely on the compiler.
So it's not worth the time to fix the OCL32 bit to run on the CPU, huh...
Also, the Ocl app sieve uses 32bit instructions, which would be slower on 64bit cpu.
Then why does the OCL64 app even exist? If 32bit is faster, just use that. I can't see a reason to use 64, outside of dealing with clients that have <no_alternate_platform> set.
Not that I'd want to use it anyway. Screen lag is way too strong. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Then why does the OCL64 app even exist? If 32bit is faster, just use that. I can't see a reason to use 64, outside of dealing with clients that have <no_alternate_platform> set.
Not that I'd want to use it anyway. Screen lag is way too strong.
With regards to OpenCL, in production we'll probably only release the 32 bit Windows app, the 64 bit Mac app, and both 32 and 64 bit apps for Linux. That's the same thing we do with Genefer.
Windows: As you and others have seen, the 32 bit app is faster so it's advantageous to use it instead of the 64 bit app. I'm not surprised -- the 32 bit Genefer CUDA app is faster than the 64 bit version, but it's not as big a difference as with AP27.
Mac: No 32 bit apps, so it has to be 64 bits.
Linux: The 32 bit app may still be faster, but I'm not certain. I believe the Linux compilers use different parameter-passing conventions for 32 bits than for 64 bits, so someone will have to test. Either way, we're going to release both versions on Linux. The reason is that many, many people with 64 bit Linux distros do not have the 32 bit support libraries installed, and if we send them the 32 bit binary, it will fail. If you have a 64 bit Linux system, we're going to send a 64 bit binary. Always.
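The release plan described in this post amounts to a small decision rule. The sketch below restates it as code for illustration only; the function name and the OS labels are hypothetical, and the real selection would be made by the BOINC server, not by a script like this.

```python
def opencl_build_for(os_name, os_bits):
    """Return the bitness (32 or 64) of the OpenCL build planned for this host."""
    os_name = os_name.lower()
    if os_name == "windows":
        return 32        # the 32-bit app tested faster than the 64-bit one
    if os_name == "mac":
        return 64        # there are no 32-bit Mac apps
    if os_name == "linux":
        return os_bits   # match the host: 64-bit distros may lack 32-bit libraries
    raise ValueError(f"unsupported OS: {os_name}")


print(opencl_build_for("Windows", 64))  # 32
print(opencl_build_for("Linux", 64))    # 64
```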
____________
My lucky number is 75898524288+1 |
|
|
|
I did some testing on my laptop. OpenCL seems to work properly on the integrated HD 8600M graphics, but it is very slow: the run took 547 seconds. |
|
|
|
Yup, I also have this problem. Can't figure out a workaround at this point, going to report a bug to Apple.
I wonder if it is related to the opencl2/4 bug?
Hard to say. It's the same error message, but I think that's quite a generic "Oops something went wrong" type of message. FYI, the two relevant bug reports are 22931181 (OCL4 bug), and 27167337 (ap26 bug), although I'm not sure if they are visible to the public. Please do go and file your own reports, I understand duplicates get ranked more highly on Apple's to-fix list.
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
mfl0p Project administrator Volunteer developer
Joined: 5 Apr 09 Posts: 249 ID: 38042 Credit: 2,471,970,980 RAC: 3,397,462
|
So it's not worth the time to fix the OCL32 bit to run on the CPU, huh...
You don't WANT to run the OCL app on a CPU. You want to use the SSE2 app. The OCL app is for GPUs.
Then why does the OCL64 app even exist? If 32bit is faster, just use that. I can't see a reason to use 64, outside of dealing with clients that have <no_alternate_platform> set.
The 64 bit opencl app in theory should reduce CPU usage a bit. But it seems, at least with the compiler being used, this is not the case.
Screen lag is way too strong.
Screen refresh rate improvements are being made.
A note about GPU compatibility:
The app should work on any OpenCL 1.0 GPU with at least 1.5 GB of VRAM. There are some driver-specific compiler problems, though, for example with AMD GPUs on the Mac and with Intel GPUs. |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
You don't WANT to run the OCL app on a CPU. You want to use the SSE2 app. OCL app is for GPUS
Hey, whether it's actually desirable or not, the purpose of testing is to find potential problems with the apps, no? OCL64 works on CPU, but 32 doesn't - that's a problem. I'm totally fine with ignoring it; just keep in mind that the issue is still there, even if there's no reason to go out of the way to fix it.
The app should work on any opencl 1.0 GPU.
Wait.... wasn't it supposed to only work on GPUs with over 1.5gb of vRAM?
Anyway, question: the app requires lots of vRAM, BUT, does it also require lots of ram bandwidth? I'm kinda wondering if it would make sense to use on iGPUs, ignoring the ram size restriction. |
|
|
mfl0p Project administrator Volunteer developer Send message
Joined: 5 Apr 09 Posts: 249 ID: 38042 Credit: 2,471,970,980 RAC: 3,397,462
                              
|
Hey, whether it's actually desirable or not, the purpose of testing is to find potential problems with the apps, no? OCL64 works on CPU, but 32 doesn't - that's a problem. I'm totally fine with ignoring it; just keep in mind that the issue is still there, even if there's no reason to go out of the way to fix it.
The OCL app was never designed to run on a CPU. By design, it will be SLOW, if it works at all. That's why we have the SSE2 CPU app. I designed the opencl app around Nvidia and AMD GPU architectures.
Wait.... wasn't it supposed to only work on GPUs with over 1.5gb of vRAM?
Yes, I edited my post to add that. And I realized that 8600M was an AMD, not the old Nvidia card.... it doesn't have enough VRAM.
Anyway, question: the app requires lots of vRAM, BUT, does it also require lots of ram bandwidth? I'm kinda wondering if it would make sense to use on iGPUs, ignoring the ram size restriction.
That would be a good thing to test. I mostly have modern, fast GPUs to test on. I suspect Intel iGPUs will be slow because of raw compute power rather than RAM speed. |
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
That would be a good thing to test. I mostly have modern, fast GPUs to test on. I suspect Intel iGPUs will be slow because of raw compute power rather than RAM speed.
Agreed. But even if it's slow, it would be worth using it if it won't affect other tasks. For example, I can do sieve or WCG work + genefer n=13 on my iGPU without any performance loss on either part. Better than not even using the thing in the first place, no?
Besides, AMD has iGPUs as well. That's actually what I had in mind when I asked the question. |
|
|
|
The OpenCL app fails on Intel iGPU. My available graphics memory is 1792 MB, slightly above the 1.5 GB requirement.
This is how the app fails on Windows:
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 7 2016 with GCC 4.9.0
Search parameters are KMIN: 47715109 KMAX: 47715109 SHIFT: 0
Beginning a new search with parameters from the command line
Group 1 with 1 devices
Device 0
Platform name: Intel(R) OpenCL
Vendor: Intel(R) Corporation
Device name: Intel(R) HD Graphics 4000
using device 0
compiling clearok
building with optimizations
compiling clearsol
building with optimizations
compiling offset
building with optimizations
compiling setok
building with optimizations
compiling sieve
building with optimizations
compiling setupokok
building withOUT optimizations
Error on buildProgram
RequestingInfo
Build Log for setupokok_program:
fcl build 1 succeeded.
fcl build 2 succeeded.
Error: internal error.
CL_BUILD_PROGRAM_FAILURE
What's interesting is that other OpenCL apps seem to run fine, in particular wwwwcl for Wall-Sun-Sun and Wieferich projects on the PRPNet server.
____________
1281*2^594565+1
2393323632147*2^1290000-1 |
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
The OpenCL app fails on Intel iGPU
mfl0p said "There are some driver-specific compiler problems, though. The Mac/AMD GPU, and Intel GPUs for example."
Wonder if I should try on my HD530 as well... drivers seem to be very different than previous iterations, as seen with the OCL freeze and OCL3 bugs. |
|
|
|
Just tested the opencl64 app on my HD6950 1GB card.
All three tests completed successfully and took around 4 minutes each.
Windows 7 Running driver 15.7.1
So older VLIW4 card works even with only 1GB memory. |
|
|
kiskaVolunteer tester Send message
Joined: 13 Apr 12 Posts: 47 ID: 138397 Credit: 237,493,391 RAC: 499,901
                    
|
GT 840M times (1082 MB VRAM used, 330 MB system RAM used):
366384 126 seconds
44121555 144 seconds
47715109 128 seconds |
|
|
|
I tested the App on Ubuntu 16.04 LTS (kernel 4.4.0-28.47-generic4.4.13),
amdgpu-pro 16.30.3-306809(beta) Driver, AMD A10-7850K, 2GB VRAM.
CPU runs great:
Solution: 25 366384 6171054912832631
Computation of K: 366384 SHIFT: 0 complete in 155 seconds
Computation of K: 366384 SHIFT: 64 complete in 154 seconds
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366384 ITER:2 (20.00%)
Computation of K: 366384 SHIFT: 128 complete in 157 seconds
Computation of K: 366384 SHIFT: 192 complete in 162 seconds
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366384 ITER:4 (40.00%)
Computation of K: 366384 SHIFT: 256 complete in 153 seconds
Computation of K: 366384 SHIFT: 320 complete in 155 seconds
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366384 ITER:6 (60.00%)
Computation of K: 366384 SHIFT: 384 complete in 154 seconds
Computation of K: 366384 SHIFT: 448 complete in 155 seconds
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366384 ITER:8 (80.00%)
Computation of K: 366384 SHIFT: 512 complete in 157 seconds
Computation of K: 366384 SHIFT: 576 complete in 156 seconds
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366384 ITER:10 (100.00%)
Checkpoint: KMIN:366384 KMAX:366384 SHIFT:0 K:366385 ITER:0 (100.00%)
But the GPU only runs with the older app version:
./ap26_opencl_linux64 366384 366384 0 -- 0
Command line: ./ap26_opencl_linux64 366384 366384 0 -- 0
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 5 2016 with GCC 4.6.4
Search parameters are KMIN: 366384 KMAX: 366384 SHIFT: 0
Beginning a new search with parameters from the command line
Group 1 with 1 devices
Device 0
Platform name: AMD Accelerated Parallel Processing
Vendor: Advanced Micro Devices, Inc.
Device name: Spectre
using device 0
compiling clearok
building with optimizations
compiling clearsol
building with optimizations
compiling offset
building with optimizations
compiling setok
building with optimizations
compiling sieve
building with optimizations
compiling setupokok
building withOUT optimizations
compiling checkn
building withOUT optimizations
compiling done
local workgroup size for sieve kernel is 64 threads
1 K in this range remaining to be searched (0 skipped, 0 done).
Starting search... reporting APs of size 20 and larger
Solution: 25 366384 6171054912832631
Computation for K: 366384 SHIFT: 0 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 64 complete.
GPU time was 44 seconds
Computation for K: 366384 SHIFT: 128 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 192 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 256 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 320 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 384 complete.
GPU time was 46 seconds
Computation for K: 366384 SHIFT: 448 complete.
GPU time was 45 seconds
Computation for K: 366384 SHIFT: 512 complete.
GPU time was 44 seconds
Computation for K: 366384 SHIFT: 576 complete.
GPU time was 45 seconds
total GPU time for K was 450 seconds
Checkpoint: KMIN=366384 KMAX=366384 SHIFT=0 K=366385 (100.00%)
But with the new App I get an Error. stderr.txt:
16:38:36 (4135): Can't open init data file - running in standalone mode
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 13 2016 with GCC 4.4.7 20120313 (Red Hat 4.4.7-17)
Command line: ./ap26_opencl_linux64 366384 366384 0 -- 0
16:38:36 (4135): Can't open init data file - running in standalone mode
Error: boinc_get_opencl_ids() failed with error -108
It would be nice if the AP27 search app ran on ALL AMD OpenCL devices, unlike the current genefer/pps-sieve apps,
which need the "AMD-CAL" driver, which is really hard to get working (I have given up on it).
The AMDGPU PRO Linux driver runs very well and supports OpenCL 1.2 out of the box. |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
It would be nice if the AP27 search app ran on ALL AMD OpenCL devices, unlike the current genefer/pps-sieve apps,
which need the "AMD-CAL" driver, which is really hard to get working (I have given up on it).
The AMDGPU PRO Linux driver runs very well and supports OpenCL 1.2 out of the box.
I used the AP27 development process to do the testing necessary to fix that long standing ATI/AMD problem. AP27 should run, without app_info, on the new-ish AMD GPUs that have been problematic for so long.
Even better, I've applied the lessons learned in that process to the GFN and PPS-Sieve applications, and those are now working. Give it a try -- I think you'll find that your AMD GPUs now get tasks without any difficulties. :)
____________
My lucky number is 75898524288+1 |
|
|
|
Thanks for the Upgrade :-),
I can now run all GPU-Tasks. |
|
|
|
No problems with OpenCL version, file output matched.
Win7 Pro 64 Nvidia GTX 970 i5-3570K
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 7 2016 with GCC 4.9.0
...
total GPU time for K was 27 seconds
Checkpoint: KMIN=366384 KMAX=366384 SHIFT=0 K=366385 (100.00%)
...
total GPU time for K was 31 seconds
Checkpoint: KMIN=44121555 KMAX=44121555 SHIFT=0 K=44121556 (100.00%)
...
total GPU time for K was 27 seconds
Checkpoint: KMIN=47715109 KMAX=47715109 SHIFT=0 K=47715110 (100.00%)
|
|
|
mfl0p Project administrator Volunteer developer Send message
Joined: 5 Apr 09 Posts: 249 ID: 38042 Credit: 2,471,970,980 RAC: 3,397,462
                              
|
But with the new App I get an Error. stderr.txt:
16:38:36 (4135): Can't open init data file - running in standalone mode
AP26 OpenCL 10-shift search version 1.1-dev by Bryan Little and Iain Bethune
Compiled Jul 13 2016 with GCC 4.4.7 20120313 (Red Hat 4.4.7-17)
Command line: ./ap26_opencl_linux64 366384 366384 0 -- 0
16:38:36 (4135): Can't open init data file - running in standalone mode
Error: boinc_get_opencl_ids() failed with error -108
We are in the process of converting from the old --device # GPU selection to the proper BOINC GPU selection. The Linux app has the new code right now. To test it stand-alone, you will need an init_data.xml file in the same directory as the ap26 binary. The init_data.xml is where you put the device type and number to use. When the project is started, the app running under BOINC will do this automatically. I have posted example init_data.xml files for AMD/ATI and Nvidia for testing here:
https://github.com/ibethune/ap26/tree/master/tests/INIT_DATA%20test%20files
Remember, only AMD and Nvidia GPUs are working at this time. |
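For anyone who wants to see roughly what goes into such a file, here is a minimal Python sketch that prints a stand-alone init_data.xml. The element names (app_init_data, gpu_type, gpu_device_num, gpu_opencl_dev_index) and the type strings "ATI" and "NVIDIA" are assumptions based on BOINC's APP_INIT_DATA fields, not something confirmed in this thread, and build_init_data is a made-up helper name. Treat the example files in the repository linked above as the authoritative version.

```python
import xml.etree.ElementTree as ET


def build_init_data(gpu_type, device_num):
    """Return the text of a minimal init_data.xml (assumed layout, see note above)."""
    root = ET.Element("app_init_data")
    # Assumed field names: GPU vendor type and the device index to use.
    ET.SubElement(root, "gpu_type").text = gpu_type
    ET.SubElement(root, "gpu_device_num").text = str(device_num)
    ET.SubElement(root, "gpu_opencl_dev_index").text = str(device_num)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical choice: first AMD/ATI device. Use "NVIDIA" for an Nvidia card.
    print(build_init_data("ATI", 0))
```

Save the printed text as init_data.xml next to the ap26 binary before running the stand-alone test.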
|
|
|
Command line: ./ap26_opencl_linux64 366384 366384 0 -- 0
Oh, I used "--device 0" as the parameter, should it have been just "--0"?
Or is it different between OS versions?
Thanks. |
|
|
|
When do we start crunching?
next day?
next week?
in a month? |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
When do we start crunching?
next day?
next week?
in a month?
When the software is ready, however long that takes. It's too early right now to guess when that might be. When we know, we'll let you know too.
|
|
|
The OpenCL app fails on Intel iGPU
mfl0p said "There are some driver-specific compiler problems, though. The Mac/AMD GPU, and Intel GPUs for example."
Wonder if I should try on my HD530 as well... drivers seem to be very different than previous iterations, as seen with the OCL freeze and OCL3 bugs.
There are a good number of Intel GPUs sitting out there looking for work. The HD530 ranges from 3225 to 6500 MB of memory. Even my older HD4000 has 1624 MB.
If something can be done, it would be excellent. |
|
|
|
Windows 10 64 bit, I5-6600, GTX 960
All three GPU tests ran OK with the expected results.
The CPU tests failed
I ran:
D:\AP26\4>ap26_sse2_windows64.exe 366384 366384 0
and got a windows pop-up message:
Unsupported 16-bit application
The program or feature "\...._sse2_windows64.exe" cannot start or run due to incompatibility with 64-bit versions of Windows. Please contact the...
When I hit OK, the command terminated with this message in the CMD window:
This version of D:\AP26\4\ap26_sse2_windows64.exe is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher.
What am I doing wrong?
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Windows 10 64 bit, I5-6600, GTX 960
All three GPU tests ran OK with the expected results.
The CPU tests failed
I ran:
D:\AP26\4>ap26_sse2_windows64.exe 366384 366384 0
and got a windows pop-up message:
Unsupported 16-bit application
The program or feature "\...._sse2_windows64.exe" cannot start or run due to incompatibility with 64-bit versions of Windows. Please contact the...
When I hit OK, the command terminated with this message in the CMD window:
This version of D:\AP26\4\ap26_sse2_windows64.exe is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher.
What am I doing wrong?
That's because you are trying to run a web page instead of the AP27 app. CPUs run x86 code, not HTML.
LEFT click on the file name -- do NOT right click and then use "Save as". When you left click on it you get another webpage. Find the "Raw" link, and LEFT click on that. You should now get the "Save as" dialog box.
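A quick way to check what you actually downloaded is to look at the first few bytes of the file. The sketch below is only an illustration (classify and identify_download are hypothetical helper names, not part of the AP27 tools): Windows executables begin with "MZ", Linux ELF binaries with 0x7F "ELF", and Mach-O binaries with one of a few magic numbers, whereas a saved GitHub page starts with HTML markup.

```python
import sys


def classify(head):
    """Guess the file type from its leading bytes."""
    if head.startswith(b"MZ"):
        return "windows-executable"
    if head.startswith(b"\x7fELF"):
        return "linux-executable"
    # Mach-O 64-bit, Mach-O 32-bit (little-endian on disk) and universal binaries.
    if head[:4] in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe", b"\xca\xfe\xba\xbe"):
        return "mac-executable"
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html-page"
    return "unknown"


def identify_download(path):
    """Read the start of a file and classify it."""
    with open(path, "rb") as fh:
        return classify(fh.read(512))


if __name__ == "__main__" and len(sys.argv) == 2:
    print(identify_download(sys.argv[1]))
```

If this reports "html-page" for ap26_sse2_windows64.exe, the file is the GitHub web page rather than the binary, and it needs to be downloaded again via the "Raw" link.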
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
                                      
|
Anybody with new GTX 1080/1070?
Answering my own question using Win 7 x64, Skylake, GTX 1070 using 32-bit and 64-bit app:
14 sec (17 x64), 17 (18), 15 (16).
____________
My stats |
|
|
|
That's because you are trying to run a web page instead of the AP27 app. CPUs run x86 code, not HTML.
LEFT click on the file name -- do NOT right click and then use "Save as". When you left click on it you get another webpage. Find the "Raw" link, and LEFT click on that. You should now get the "Save as" dialog box.
Thanks, tests OK.
Test  366384  44121555  47715109
GPU   41s     46s       41s
CPU   938s    1114s     932s
|
|
|
|
GPU: AMD R9 290 Tri-X OC
366384 -> 54s
44121555 -> 67s
47715109 -> 53s
CPU: Intel Xeon E3-1230V3 at 3.3 GHz
366384 -> 1051s
Results ok |
|
|
|
Any new info? |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Any new info?
Progress is happening behind the scenes, but sometimes the process is slower than any of us would like. There's no particular problem or roadblock. Sometimes we just have to wait for people to get back from vacation, or have free time, etc. Multiple people are involved, and they all have other commitments that sometimes take precedence. At least two of the people involved are associated with education, which means summer is the time when family vacations are usually planned.
|
|
|
We have release-candidate binaries ready for (hopefully) final testing before the release. I posted a test matrix and instructions here: https://docs.google.com/spreadsheets/d/19W4tjULce7OE5iPFFK1juRTEiIOSKFC9ZvfU19sRliE/.
If you are able to run any of the tests, please follow the instructions and post here to report your results. NB. For GPU tests, an init_data.xml file is required (link in the spreadsheet).
Cheers
- Iain
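Checking that results match comes down to comparing the solution file the app writes against the reference copy. Here is a minimal sketch, assuming the file is called SOL-AP26.txt as in the first post (the release-candidate builds may name it differently) and assuming that line order and spacing do not matter. Both assumptions are mine, and solutions_match is just an illustrative helper name.

```python
import sys
from typing import List


def normalized_lines(text: str) -> List[str]:
    """Collapse whitespace, drop blank lines and sort, so formatting differences are ignored."""
    return sorted(" ".join(line.split()) for line in text.splitlines() if line.strip())


def solutions_match(mine: str, reference: str) -> bool:
    """True when both files contain the same set of solution lines."""
    return normalized_lines(mine) == normalized_lines(reference)


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
        print("match" if solutions_match(a.read(), b.read()) else "MISMATCH")
```

Run it with your own output file and the reference file as the two arguments, and post here if it reports a mismatch.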
|
|
|
Mac SSE4.1 all 3 tests complete. All results match. |
|
|
|
13:21:27 (12692): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ap26_cpu_windows64.exe 25 366384 6171054912832631
Detected SSE4.1 CPU
../common/PrimeQ_x86.h: modulus out of range.
cpu i7 860 (Lynnfield)
os: win10 64bit |
|
|
|
Windows avx2, all 3 tests completed, all results match. |
|
|
|
13:21:27 (12692): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ap26_cpu_windows64.exe 25 366384 6171054912832631
Detected SSE4.1 CPU
../common/PrimeQ_x86.h: modulus out of range.
cpu i7 860 (Lynnfield)
os: win10 64bit
Does the echoed command line match the one you executed? When I run on a q9550 I get
07:06:19 (2068): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ..\ap26_cpu_windows64.exe 366384 366384 0
Detected SSE4.1 CPU
07:24:57 (2068): called boinc_finish(0)
|
|
|
|
13:21:27 (12692): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ap26_cpu_windows64.exe 25 366384 6171054912832631
Detected SSE4.1 CPU
../common/PrimeQ_x86.h: modulus out of range.
cpu i7 860 (Lynnfield)
os: win10 64bit
Hi Steve,
Looks like you used the wrong command line arguments for the test. It should be:
ap26_cpu_windows64.exe 366384 366384 0
Cheers
- Iain
|
|
|
13:21:27 (12692): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ap26_cpu_windows64.exe 25 366384 6171054912832631
Detected SSE4.1 CPU
../common/PrimeQ_x86.h: modulus out of range.
cpu i7 860 (Lynnfield)
os: win10 64bit
Does the echoed command line match the one you executed? When I run on a q9550 I get
07:06:19 (2068): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.9.0
Command line: ..\ap26_cpu_windows64.exe 366384 366384 0
Detected SSE4.1 CPU
07:24:57 (2068): called boinc_finish(0)
ohh ... nevermind :/
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
                                      
|
Win 10, GTX 1070, GPU tests passed.
Win 10, AMD Fury Nano, GPU tests passed.
|
|
|
Now with the correct command line, all 3 tests passed and all results match.
win10, sse4.1 |
|
|
|
Please tell me/us we are very close to launch! |
|
|
|
Please tell me/us we are very close to launch!
Via the Google sheet you can already connect to the BOINC test server and crunch the AP27 project. |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
I expect Iain to add another set of tests to the test matrix for actually testing the app with BOINC, as opposed to running the tests manually.
To run the boinc tests, you'll need to connect to our test server http://dev.primegrid.com/.
A few details about using this test server:
1) NOTHING COUNTS!!! Even if you find an ap27 on the test server, nobody will notice. The database gets wiped often. No credit. Nothing.
2) It's a small server. Please don't load up on 10 days worth of tasks. If the server gets swamped, I'll probably need to shut it off.
3) We'd like you to try it on a variety of hardware to see if anything breaks.
4) AP27 is for 64 bit systems only. No 32 bit systems. No 64 bit CPUs running 32 bit OSs either.
5) Due to a driver bug, AP27 is not available to run on Mac computers with ATI/AMD GPUs.
6) Did I mention it's a small server? It's possible that you may see intermittent "no work" errors if the server momentarily runs out of work. If this happens, it should generate more work within 75 seconds or so. If not, let me know.
7) AP27 is the ONLY app that has work on this server.
8) Your existing login credentials from PrimeGrid should be valid on the test server.
9) Regarding the big red legal message at the top of every page: this post constitutes written permission to use that server for the purpose of conducting this test.
10) Most important: If you see anything unusual, let me know please!
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Please tell me/us we are very close to launch!
We are very close to launch. :)
|
|
|
Please tell me/us we are very close to launch!
Via the Google sheet you can already connect to the BOINC test server and crunch the AP27 project.
Please note, no credit is assigned for work on dev.primegrid.com - please just run a couple of units then disconnect. This is only intended for testing. These are not production WUs (they are much shorter than the eventual tasks, for one thing).
- Iain
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
The tasks on the test server are VERY short. They're all single-K tests, just like the three manual tests. In the context of AP27, "K" is the minimum test size, so a 2-K test will run twice as long as a 1-K test.
Actual production tasks will be considerably longer, probably in the range of 75 to 100 Ks.
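As a rough illustration of that scaling, the sketch below multiplies a per-K time by the 75 to 100 K range mentioned above. The per-K figures are single-K timings reported earlier in this thread (27 seconds on a GTX 970, 938 seconds for one of the CPU runs) and come from test builds, so production numbers may well differ. estimate_task_seconds and describe are just illustrative helper names.

```python
def estimate_task_seconds(seconds_per_k, k_count):
    """Linear estimate: a task covering k_count values of K takes k_count times one K."""
    return seconds_per_k * k_count


def describe(label, seconds_per_k, k_low=75, k_high=100):
    """Format the estimated task length range in hours."""
    low = estimate_task_seconds(seconds_per_k, k_low)
    high = estimate_task_seconds(seconds_per_k, k_high)
    return f"{label}: {low / 3600:.1f} to {high / 3600:.1f} hours"


if __name__ == "__main__":
    print(describe("GTX 970 (27 s per K)", 27))
    print(describe("CPU (938 s per K)", 938))
```

The estimate is only as good as the assumption that every K takes about the same time, which the three test ranges suggest but do not guarantee.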
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
To run the boinc tests, you'll need to connect to our test server http://dev.primegrid.com/.
Does the server (either dev/main) have a way to prevent tasks being sent to GPUs that can't crunch? If so, it would be nice to check that the function is actually working. |
|
|
288larsson Volunteer tester
 Send message
Joined: 17 Apr 10 Posts: 136 ID: 58815 Credit: 5,602,203,491 RAC: 3,165,820
                                   
|
Linux Nvidia GPU, GTX 980ti, GPU tests passed.
Linux ATI GPU, AMD Hawaii, GPU tests passed |
|
|
streamVolunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1033 ID: 301928 Credit: 543,624,271 RAC: 6,563
                         
|
Linux/SSE2 all tests passed. Good old and sloooow AMD... (~30 minutes per test)
16:05:51 (6403): Can't open init data file - running in standalone mode
AP26 CPU 10-shift search version 1.3 by Bryan Little and Iain Bethune
Compiled Aug 17 2016 with GCC 4.8.5 20150623 (Red Hat 4.8.5-4)
Command line: ./ap26_cpu_linux64 366384 366384 0
Assumed SSE2 CPU
16:33:42 (6403): called boinc_finish(0)
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
To run the boinc tests, you'll need to connect to our test server http://dev.primegrid.com/.
Does the server (either dev/main) have a way to prevent tasks being sent to GPUs that can't crunch? If so, it would be nice to check that the function is actually working.
Ah, that's the thing I forgot to mention.
The server is set up to only send tasks to GPUs with a minimum of 1.5GB of video memory. Except for that restriction, and the previously mentioned bug with ATI/AMD GPUs on Macs, it should work on most GPUs. The server shouldn't be sending the tasks to GPUs that can't run it.
That being said, there's a few oddities we've observed:
1) If you have multiple GPUs in one computer, and one of them has more than 1.5 GB and the other has less than 1.5 GB, the server will (correctly) send tasks to that computer. However, once the task is on the host computer, the BOINC client will ignore the memory limits and will run, or try to run, the task on the GPU with less than 1.5 GB. If you have a computer like this, you can manually force BOINC to only run AP27 on the big GPU using cc_config.xml.
2) As a result of #1 above, we discovered that sometimes the video driver will use main system memory to augment the video memory. We've seen AP27 run -- and run correctly -- on a GPU with only 1 GB of vram even though the app was using more than 1 GB of vram. We don't understand this completely, and it doesn't always work that way.
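For the cc_config.xml workaround in #1, the relevant client option is exclude_gpu. Here is a minimal Python sketch that prints such a file. The device number and GPU type below are placeholders for whichever small GPU you want to exclude, build_cc_config is a made-up helper name, and the project URL is the test server from this thread. The optional <app> element needs the application's short name, which has not been given here, so the sketch leaves it out and the exclusion then applies to every app from that project.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url, excluded_device, gpu_type=None, app_name=None):
    """Return cc_config.xml text that stops one GPU from running a project's tasks."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    rule = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(rule, "url").text = project_url
    ET.SubElement(rule, "device_num").text = str(excluded_device)
    if gpu_type:
        # One of NVIDIA, ATI or intel_gpu.
        ET.SubElement(rule, "type").text = gpu_type
    if app_name:
        # Short application name; omitted by default because it is not known here.
        ET.SubElement(rule, "app").text = app_name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values: exclude device 1, assumed to be the GPU with too little memory.
    print(build_cc_config("http://dev.primegrid.com/", 1, gpu_type="NVIDIA"))
```

If you already have a cc_config.xml, merge the exclude_gpu block into its existing <options> section rather than overwriting the file, and the client may need a restart to pick it up.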
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
The server is set up to only send tasks to GPUs with a minimum of 1.5GB of video memory. Except for that restriction, and the previously mentioned bug with ATI/AMD GPUs on Macs, it should work on most GPUs. The server shouldn't be sending the tasks to GPUs that can't run it.
Good part: it's not sending for one of my systems (APU one), as it's supposed to.
Bad part: unless I have to do something that I'm not aware of, it doesn't seem to be sending me CPU tasks either.
19/08/2016 12:32:09 | CompositeGrid | Requesting new tasks for CPU and AMD/ATI GPU
19/08/2016 12:32:10 | CompositeGrid | Scheduler request completed: got 0 new tasks
19/08/2016 12:32:10 | CompositeGrid | No tasks sent
19/08/2016 12:32:10 | CompositeGrid | No tasks are available for The Riesel Problem (Sieve)
I say that because I went into the preferences tab and didn't find an option to turn on AP27 or anything, only the regular BOINC options. |
|
|
|
Please tell me/us we are very close to launch!
We are very close to launch. :)
Awesome!!! |
|
|
|
I say that because I went into the preferences tab and didn't find an option to turn on AP27 or anything, only the regular BOINC options.
You need to select AP27 on the project preferences page (on the dev.primegrid.com website, under Your Account). The default is TRP Sieve, which there aren't any tasks for on the test server.
- Iain
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
I say that because I went into the preferences tab and didn't find an option to turn on AP27 or anything, only the regular BOINC options.
You need to select AP27 on the project preferences page (on the dev.primegrid.com website, under Your Account). The default is TRP Sieve, which there aren't any tasks for on the test server.
- Iain
As I said, the option is not there. There's no AP27 subproject, only the regular ones. |
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
                                      
|
As I said, the option is not there. There's no AP27 subproject, only the regular ones.
And "Send work from any subproject if selected projects have no work" doesn't work either.
Does AP27 need to be admin-enabled, per user?
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
As I said, the option is not there. There's no AP27 subproject, only the regular ones.
And "Send work from any subproject if selected projects have no work" doesn't work either.
Does AP27 need to be admin-enabled, per user?
I'll look into it. I know I originally had the test system open to everyone, but when it got flooded by wuprop badge hunters I needed to do something to block them. I may have taken some steps I forgot to undo. Give me a few minutes...
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Ap27 should be visible to everyone now on the test server.
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
                                      
|
Ap27 should be visible to everyone now on the test server.
It is now, thanks.
EDIT: Both AMD Fury Nano and nVidia GTX 1070 GPU tasks are working and validating.
|
|
RafaelVolunteer tester
 Send message
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
                         
|
Ap27 should be visible to everyone now on the test server.
Working now. I was able to get CPU tasks and the GPU got the min RAM message, just as expected. Will crunch them to the end and (hopefully) validate.
Only one thing kinda bothered me... in the message "A minimum of 1500 MB (preferably 1500 MB) of video RAM is needed", is there a reason to have that "preferred" part? If the preferred is the same as the min, I don't really see a point in including that, unless there's a secret reason I'm unaware of. |
|
|
288larsson Volunteer tester
 Send message
Joined: 17 Apr 10 Posts: 136 ID: 58815 Credit: 5,602,203,491 RAC: 3,165,820
                                   
|
Linux avx2 http://dev.primegrid.com/result.php?resultid=735426354
Linux ATI GPU http://dev.primegrid.com/result.php?resultid=735426352
Linux Nvidia GPU http://dev.primegrid.com/result.php?resultid=735427348 |
|
|
|
Windows 10 x64 Nvidia GPU, GTX 750ti, GPU tests passed http://dev.primegrid.com/result.php?resultid=735427383 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Ap27 should be visible to everyone now on the test server.
Working now. I was able to get CPU tasks and the GPU got the min RAM message, just as expected. Will crunch them to the end and (hopefully) validate.
Only one thing kinda bothered me... in the message "A minimum of 1500 MB (preferably 1500 MB) of video RAM is needed", is there a reason to have that "preferred" part? If the preferred is the same as the min, I don't really see a point in including that, unless there's a secret reason I'm unaware of.
Not really sure. Due to how the BOINC client has changed behavior over the years, and that the server needs to support ALL of those versions, the server's logic is rather convoluted. This definitely falls into the "If it ain't broke, don't fix it" category.
|
|
Dave  Send message
Joined: 13 Feb 12 Posts: 3207 ID: 130544 Credit: 2,285,550,731 RAC: 763,073
                           
|
BOINCing here & all good: http://dev.primegrid.com/results.php?userid=130544 |
|
|
|
BOINCing here & all good: http://dev.primegrid.com/results.php?userid=130544
Can you post links to the tasks? For some reason I can't see your user page...
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
A-OK on my 2600K @4.1GHz, 3570K @4.2GHz and AMD HD7950 @1040MHz clock, 1275MHz mem. All Win7.
http://dev.primegrid.com/results.php?userid=92179
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
|
|
|
|
All completed and validated.
http://dev.primegrid.com/results.php?userid=168418
|
|
|
|
14 WUs completed and validated
http://dev.primegrid.com/results.php?userid=93509 |
|
|
Dave
Joined: 13 Feb 12 Posts: 3207 ID: 130544 Credit: 2,285,550,731 RAC: 763,073
|
BOINCing here & all good: http://dev.primegrid.com/results.php?userid=130544
Can you post links to tasks, for some reason I can't see your user page...
- Iain
Oops sorry, here we go:
i7-2600K @4.4:
http://dev.primegrid.com/workunit.php?wuid=493760337
GTX580:
http://dev.primegrid.com/workunit.php?wuid=493760259 |
|
|
|
All completed and validated.
http://dev.primegrid.com/results.php?userid=168418
http://dev.primegrid.com/results.php?hostid=478753
|
|
|
Dave
Joined: 13 Feb 12 Posts: 3207 ID: 130544 Credit: 2,285,550,731 RAC: 763,073
|
1 comp error but I suspect it's my fault for using PC at the same time:
http://dev.primegrid.com/result.php?resultid=735427013 |
|
|
|
1 comp error but I suspect it's my fault for using PC at the same time:
http://dev.primegrid.com/result.php?resultid=735427013
That machine successfully completed plenty of other tasks, so if you think you did something that would affect the graphics driver that could probably be the cause.
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
Thanks for all the test results so far. If anyone has SSE2/SSE4.1 Linux machines or SSE2/AVX Macs, please connect and try some tests!
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
1 comp error but I suspect it's my fault for using PC at the same time:
http://dev.primegrid.com/result.php?resultid=735427013
That machine successfully completed plenty of other tasks, so if you think you did something that would affect the graphics driver that could probably be the cause.
- Iain
Actually, I think this is an instance of a rare bug in the BOINC client. Your task completed correctly and returned a result matching the other two. We saw this once before (with LLR): https://www.primegrid.com/forum_thread.php?id=6707
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
If anyone has SSE2/SSE4.1 Linux machines try some tests!
Here you be ....
"Assumed SSE2 CPU"
AMD Phenom(tm) II X4 960T Processor [Family 16 Model 10 Stepping 0]
UBUNTU 16.04
4 completed tasks @ hostid=501864
GTX 275 w/ 895 MB did not get tasks, as anticipated.
____________
There's someone in our head but it's not us. |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 911 ID: 370496 Credit: 550,428,493 RAC: 435,335
|
35 valid tasks, most from a GTX 970, but some were from a Pentium E2180 and an A6-3500. |
|
|
|
Thanks for all the test results so far. If anyone has SSE2/SSE4.1 Linux machines or SSE2/AVX Macs, please connect and try some tests!
- Iain
Iain, I ran a couple tests on my 2012 iMac. So far all workunits for CPU (i7 with AVX) and GPU (GTX 680MX) are validating just fine. Below are a few links to CPU and GPU tests:
GPU:
http://dev.primegrid.com/workunit.php?wuid=493760638
http://dev.primegrid.com/workunit.php?wuid=493760632
http://dev.primegrid.com/workunit.php?wuid=493760626
CPU:
http://dev.primegrid.com/workunit.php?wuid=493760636
http://dev.primegrid.com/workunit.php?wuid=493760635
http://dev.primegrid.com/workunit.php?wuid=493760631 |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
PLEASE READ:
I just changed the test system to send out tasks that run 2 Ks. New tasks will therefore take twice as long to run as the tasks previously run on the test system.
PLEASE NOTE that old tasks (those with 1 K) should be aborted. It's likely that none of them will validate even if completed correctly.
____________
My lucky number is 75898524288+1 |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
Late to join in, but I'm running an old Core 2 Win10 machine to be different. It did 36 units OK, and 5 came back with an error, e.g. http://dev.primegrid.com/workunit.php?wuid=493760662
The task says "workunit error". Is this a problem with the machine, the software, or the server doing something? |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Late to join in, but I'm running an old Core 2 Win10 machine to be different. It did 36 units OK, and 5 came back with an error, e.g. http://dev.primegrid.com/workunit.php?wuid=493760662
The task says "workunit error". Is this a problem with the machine, the software, or the server doing something?
tl;dr: Don't worry about those tasks. Any tasks with a workunit ID below 493761152 should be aborted. They won't validate. (ON THE TEST SYSTEM. Don't abort tasks on the live system!!!)
Old workunits have been cancelled (hence, if you follow the link you posted, it says "WU canceled"). As per my previous post, I've switched the dev system to use workunits with 2 Ks. The old workunits won't be able to validate. (It's not the switch from 1 K to 2 Ks that's causing the old tasks not to validate. It's a byproduct of the testing we're doing.)
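For anyone sorting through a long task list on the test system, the cutoff rule can be sketched as a simple comparison. This is only an illustration: the function name `should_abort` is made up for the example and is not part of BOINC or any PrimeGrid tool; the cutoff is the workunit ID quoted above, and the sample IDs are taken from links posted in this thread.

```python
# Hypothetical helper, not part of BOINC or PrimeGrid tooling.
# The cutoff is the workunit ID quoted in the post above (test system only).
CUTOFF_WUID = 493761152


def should_abort(wuid: int, cutoff: int = CUTOFF_WUID) -> bool:
    """Return True if a test-server workunit ID is below the cutoff."""
    return wuid < cutoff


if __name__ == "__main__":
    # Workunit IDs taken from links posted in this thread.
    for wuid in (493760662, 493760578, 493761180):
        print(wuid, "abort" if should_abort(wuid) else "keep")
```

This applies to the test system only; nothing on the live system should be aborted on this basis.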
____________
My lucky number is 75898524288+1 |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
I didn't think it was related to that as some affected units were much earlier, and units returned after those validated. Unless I'm looking at the wrong times. |
|
|
|
2K tasks are validating OK for me this morning: http://dev.primegrid.com/result.php?resultid=735428680.
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,475,826 RAC: 2,271,747
|
Both the Fury Nano http://dev.primegrid.com/results.php?hostid=248611
and the GTX 1070 validate http://dev.primegrid.com/results.php?hostid=376156
____________
My stats |
|
|
Dave
Joined: 13 Feb 12 Posts: 3207 ID: 130544 Credit: 2,285,550,731 RAC: 763,073
|
OK here on W7 with a GTX580:
http://dev.primegrid.com/workunit.php?wuid=493761180
Although I did have 1 CPU invalid:
http://dev.primegrid.com/workunit.php?wuid=493760578 |
|
|
|
Although I did have 1 CPU invalid:
http://dev.primegrid.com/workunit.php?wuid=493760578
That's an old WU from before Mike upped the size of the task. It won't validate (see http://www.primegrid.com/forum_thread.php?id=6891&nowrap=true#97900) - don't worry about it.
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! |
|
|
|
If anyone has SSE2/SSE4.1 Linux machines try some tests!
Here you be ....
"Assumed SSE2 CPU"
AMD Phenom(tm) II X4 960T Processor [Family 16 Model 10 Stepping 0]
UBUNTU 16.04
4 completed tasks @ hostid=501864
4 "double length" tasks for SSE2 @ hostid=501864
____________
There's someone in our head but it's not us. |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
It took overnight to run the major Windows 10 update on the old slow box, but I'll go again today and see if there are the same errors or if they are no longer present. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Anyone here have access to a computer still running the faulty Nvidia driver from earlier in the year?
____________
My lucky number is 75898524288+1 |
|
|
|
Anyone here have access to a computer still running the faulty Nvidia driver from earlier in the year?
Just the faulty driver, or the faulty driver + a Maxwell card? |
|
|
tng
Joined: 29 Aug 10 Posts: 486 ID: 66603 Credit: 47,380,766,060 RAC: 27,804,176
|