Message boards :
Generalized Fermat Prime Search :
New genefer (3.x.x) apps now available for testing
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
We have been working for several months on getting some new versions of Genefer produced, and we're getting very close to being done. We're still working on a few things, but we think what we have now is usable. Note that we have NOT tested every build, on every platform, in all configurations. I consider this to be alpha level code, meaning that not everything that needs to be tested has been tested yet. It won't surprise me if there are bugs.
That being said -- I ran a new version of geneferCUDA through a pair of PRPNet n=18 tasks yesterday, and I am currently running it, using app_info, on a BOINC n=20 task. I expect it to work.
New features of these new apps include:
- ATI/AMD GPUs: GeneferOCL will run on DP-capable ATI/AMD GPUs.
- GPU: Should fix at least some of the errors occurring on the World Record tasks.
- ALL: Should solve the problem where the app continues to run after you try to shut it down with BOINC.
- CPU: A faster transform is being used, with significant speed improvement.
- CPU: AVX version available.
- CPU: SSE3 version available.
To use these with BOINC, you'll need to use an app_info.xml file to run as anonymous platform.
The following builds are available and are listed in approximate speed order for each platform:
WINDOWS:
OpenCL (aka GeneferOCL) (3.1.2-7, in production on BOINC as 2.10) (Intended for ATI/AMD GPUs) (Please discuss OpenCL in this thread.)
CUDA (aka geneferCUDA) (3.1.2-9, in production on BOINC as 2.12)
AVX (In production on BOINC as v2.04)
SSE3 (In production on BOINC as v2.04)
SSE2 (aka genefx64) (In production on BOINC as v2.04)
32-bit (aka genefer)
8087 (aka genefer80)
LINUX:
OpenCL (aka GeneferOCL) (3.1.2-7, in production on BOINC as 2.07) (Intended for ATI/AMD GPUs) (Please discuss OpenCL in this thread.)
CUDA 3.2 (aka geneferCUDA) (3.1.2-2, in production on BOINC as 2.06)
CUDA 5.0 (may work better on newer GPUs than CUDA 3.2) (3.1.2-2)
AVX (In production on BOINC as v2.04)
SSE3 (In production on BOINC as v2.04)
SSE2 (aka genefx64) (In production on BOINC as v2.04)
32 bit (aka genefer)
128 bit
8087 (aka genefer80)
MAC:
OpenCL (aka GeneferOCL) (3.1.2-7, in production on BOINC as 2.08) (Intended for ATI/AMD GPUs) (Please discuss OpenCL in this thread.)
CUDA (aka geneferCUDA) (3.1.2-2, in production on BOINC as v2.06)
AVX (In production on BOINC as v2.04)
SSE3 (In production on BOINC as v2.04)
SSE2 (aka genefx64) (In production on BOINC as v2.04)
32 bit (aka genefer)
128 bit
8087 (aka genefer80)
Change log:
3.1.0-0: (BOINC 2.00/2.01) Initial new Genefer version.
3.1.1-0: (BOINC 2.02) Fixes stderr truncation zombie bug; adds AUTO-SHIFT for CUDA builds.
3.1.1-1: (BOINC 2.03) CUDA builds only: Slightly smarter AUTO-SHIFT uses actual b value when running benchmark.
3.1.2-0: (BOINC 2.04) Improved error handling can recover automatically from most transient errors such as CUDA driver errors. (All builds are 3.1.2-0 unless otherwise marked.)
3.1.2-1: (BOINC 2.05) CUDA builds only: More error handling improvements.
3.1.2-2: (BOINC 2.06) CUDA builds only: Fixes bug where BOINC short-task shift override would be used for WR tasks.
3.1.2-2: OpenCL (AMD/ATI) builds only: Latest beta build.
3.1.2-3: OpenCL builds only: Improved transform is faster than earlier versions. -B3 GFLOP ratings and -B "Genefer Mark" ratings now reflect actual rather than theoretical GFLOPS and shouldn't be compared with prior versions. CPU devices can no longer be selected with --device because they're too slow.
3.1.2-4: OpenCL builds only: Tuning parameters are automatically adjusted for your hardware.
3.1.2-5: OpenCL builds only: Fixes problem with 32-bit build on ATI/AMD GPUs.
3.1.2-6: OpenCL builds only: Fixes problem with 32-bit build on ATI/AMD GPUs. Really.
3.1.2-7: (BOINC 2.07/2.08/2.10) OpenCL builds only: Supports forcing specific platform so the correct GPU is used when running on systems with both Nvidia and AMD/ATI GPUs.
3.1.2-8: (BOINC 2.11) CUDA Windows build only: Improved error handling does a full restart of the application.
3.1.2-9: (BOINC 2.12) CUDA Windows build only: Even more robust than 3.1.2-8.
Genefer development plan:
So far, we've released new builds for public testing, as well as putting early versions of GeneferCUDA into production. There will be a full release once everything is done and tested.
Here's what's planned in the near term for Genefer:
- (complete - in production on BOINC) GeneferCUDA: Replace near-instant GPU initialization code with slower (but still fast) CPU initialization code. This greatly reduces video memory usage and allows GeneferCUDA to run on many GPUs that the previous version would fail on.
- (complete - available for testing, use links above) All CPU builds: Use fast CPU initialization code.
- (complete - available for testing, use links above) All CPU builds: Use new, faster transform.
- (partly complete - all but 128 bit Windows available for testing, use links above) New CPU versions for AVX, SSE3, and 128 bit double-double.
- (complete - in production on BOINC) GeneferCUDA: AUTO-SHIFT: Automatically determines fastest SHIFT value at run time.
- (complete - available for testing, use links above) Improved error handling: Attempt to recover from most CUDA and other errors. (More improvements may occur in the future.)
- Combine the many CPU builds into a single 32 bit build and a single 64 bit build.
____________
My lucky number is 75898^524288+1
| |
|
|
730182^524288+1 on i7-2600K@4500, GTX470@1512
genefercuda 3.1.0-0 (Windows 32-bit CUDA)
maxErr during b^N initialization = 0.0000 (3.930 seconds)
Estimated total run time for 730182^524288+1 is 3:18:17
geneferavx 3.1.0-0 (Windows 64-bit AVX)
Initialization complete (1.325 seconds)
Estimated total run time for 730182^524288+1 is 11:53:59
genefersse3 3.1.0-0 (Windows 64-bit SSE3)
Initialization complete (1.305 seconds)
Estimated total run time for 730182^524288+1 is 16:46:44
genefx64 3.1.0-0 (Windows 64-bit SSE2)
Initialization complete (5.520 seconds)
Estimated total run time for 730182^524288+1 is 24:26:16
genefer80 3.1.0-0 (Windows 32-bit x87-80)
Initialization complete (6.935 seconds)
Estimated total run time for 730182^524288+1 is 63:15:28 | |
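For a quick comparison of the estimates above, the run times can be converted to seconds and divided. This is only a sketch: `to_seconds` and `speedup` are hypothetical helper names, and the figures are the estimates quoted in this post, not new measurements.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' estimate, as printed by genefer, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Estimated total run times for 730182^524288+1, copied from the post above.
estimates = {
    "genefercuda": "3:18:17",
    "geneferavx": "11:53:59",
    "genefersse3": "16:46:44",
    "genefx64": "24:26:16",
    "genefer80": "63:15:28",
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is estimated to be than `baseline`."""
    return to_seconds(estimates[baseline]) / to_seconds(estimates[candidate])


for name in estimates:
    print(f"{name}: {speedup('genefx64', name):.2f}x relative to genefx64")
```

On these figures the AVX build is estimated at roughly twice the speed of the SSE2 build on the same CPU.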
|
|
32-bit (aka genefer)
If I read this right it means there is a non-CUDA version that runs on a 32 bit windows OS.
Is that the case? If so it will be very popular and very useful to my wife who wants her Genefer badges but has only 32 bit OS on her machines.
____________
Member team AUSTRALIA
My lucky number is 9291*2^1085585+1 | |
|
Michael Goetz Volunteer moderator Project administrator
|
32-bit (aka genefer)
If I read this right it means there is a non-CUDA version that runs on a 32 bit windows OS.
Is that the case? If so it will be very popular and very useful to my wife who wants her Genefer badges but has only 32 bit OS on her machines.
Not all of those are necessarily appropriate for the tasks we're sending out on BOINC.
There's always been a 32 bit version. Two of them, actually: genefer and genefer80. We use them in PRPNet. They're too slow for the large numbers we're searching on BOINC.
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
|
|
500000^1048576+1 on i7-2600K@4500 1-core, GTX470@1512
genefercuda 3.1.0-0 (Windows 32-bit CUDA)
maxErr during b^N initialization = 0.0000 (8.120 seconds)
Estimated total run time for 500000^1048576+1 is 11:41:24
geneferavx 3.1.0-0 (Windows 64-bit AVX)
Initialization complete (2.590 seconds)
Estimated total run time for 500000^1048576+1 is 49:14:31
genefersse3 3.1.0-0 (Windows 64-bit SSE3)
Initialization complete (2.590 seconds)
Estimated total run time for 500000^1048576+1 is 69:45:17
genefx64 3.1.0-0 (Windows 64-bit SSE2)
Initialization complete (10.270 seconds)
Estimated total run time for 500000^1048576+1 is 93:57:44
genefer80 3.1.0-0 (Windows 32-bit x87-80)
Initialization complete (14.730 seconds)
Estimated total run time for 500000^1048576+1 is 263:51:19 | |
|
|
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
so, what's up with the 128 bit genefer?
If testing is still needed next week (once my current SoB and WR tasks finish) I should have some time then to set it up to run some cuda and cpu tasks. For the lazy amongst us can you provide the app_info? | |
|
|
The AVX version is quite impressive:
Intel i5 2500K @4.2
Command line: Genefx64.exe -q 129280^1048576+1
Starting initialization...
Initialization complete (120.103 seconds).
Estimated total run time for 129280^1048576+1 is 96:20:40
Command line: Geneferavx_windows.exe -q 129280^1048576+1
Starting initialization...
Initialization complete (1.372 seconds).
Estimated total run time for 129280^1048576+1 is 60:15:53
____________
676754^262144+1 is prime | |
|
|
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
so, what's up with the 128 bit genefer?
It's basically a double-double precision (i.e. 128 bits per floating point number) implementation, so it can handle ridiculously high B without hitting round-off errors. However it's also very slow, so it's of little practical interest right now unless we get beyond the B limit of genefer80.
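As a rough illustration of the double-double idea (not Genefer's actual code), a value can be stored as a pair of ordinary doubles `(hi, lo)`, where `lo` carries the rounding error that `hi` cannot hold. The function names `two_sum` and `dd_add` below are hypothetical, and only addition is sketched.

```python
def two_sum(a, b):
    """Return (s, err) such that s = fl(a + b) and s + err equals a + b exactly."""
    s = a + b
    b_virtual = s - a
    err = (a - (s - b_virtual)) + (b - b_virtual)
    return s, err


def dd_add(x, y):
    """Add two double-double values, each given as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    # Renormalize so that hi holds the leading bits and lo the remainder.
    return two_sum(s, e)


# A plain double loses the small term entirely; the pair keeps it in lo.
plain = 1.0 + 1e-20
pair = dd_add((1.0, 0.0), (1e-20, 0.0))
print(plain == 1.0, pair)
```

Each double-double operation costs several ordinary floating point operations, which is consistent with the remark above that this build is slow.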
Cheers
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2417 ID: 1178 Credit: 20,106,494,416 RAC: 22,266,129
|
So what are the b limits for the new genefer apps (especially AVX and SSE3)? Are these useful at all on the small genefer work we are doing on PRPnet (e.g., GFN 65536)?
| |
|
|
AVX (windows b limits):
Generalized Fermat Number b Limits
The upper bound m = 65536, b = 1590000, Err = 0.2969
Starting b = 1790000, Err b = 1595000, Err = 0.3438, 5 Err b = 0
The upper bound m = 262144, b = 1090000, Err = 0.3047
Starting b = 1180000, Err b = 1095000, Err = 0.3125, 5 Err b = 0
The upper bound m = 524288, b = 885000, Err = 0.2813
Starting b = 960000, Err b = 890000, Err = 0.3125, 5 Err b = 0
The upper bound m = 1048576, b = 735000, Err = 0.3047
Starting b = 780000, Err b = 740000, Err = 0.3359, 5 Err b = 0
The upper bound m = 4194304, b = 500000, Err = 0.3125
Starting b = 510000, Err b = 505000, Err = 0.3203, 5 Err b = 0
SSE 3 (windows) limits:
Generalized Fermat Number b Limits
The upper bound m = 65536, b = 1590000, Err = 0.2969
Starting b = 1790000, Err b = 1595000, Err = 0.3438, 5 Err b = 0
The upper bound m = 262144, b = 1090000, Err = 0.3047
Starting b = 1180000, Err b = 1095000, Err = 0.3125, 5 Err b = 0
The upper bound m = 524288, b = 885000, Err = 0.2813
Starting b = 960000, Err b = 890000, Err = 0.3125, 5 Err b = 0
The upper bound m = 1048576, b = 735000, Err = 0.3047
Starting b = 780000, Err b = 740000, Err = 0.3359, 5 Err b = 0
The upper bound m = 4194304, b = 500000, Err = 0.3125
Starting b = 510000, Err b = 505000, Err = 0.3203, 5 Err b = 0
In both cases, limits are higher than genefx64, so new builds will be useful on PRPNet 262144 (not for long, though) and 524288.
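The upper bounds listed above can be turned into a simple lookup to see whether a given candidate falls inside the reported range. This is a sketch under assumptions: `B_LIMITS` and `within_limit` are hypothetical names, the numbers are the "upper bound" values from the output above, and the real maxErr for a specific number may still differ.

```python
# Upper bounds on b reported above for the AVX and SSE3 Windows builds,
# keyed by the exponent (shown as "m" in the program output).
B_LIMITS = {
    65536: 1590000,
    262144: 1090000,
    524288: 885000,
    1048576: 735000,
    4194304: 500000,
}


def within_limit(b, n):
    """Return True if b is at or below the reported upper bound for exponent n."""
    if n not in B_LIMITS:
        raise ValueError(f"no reported limit for exponent {n}")
    return b <= B_LIMITS[n]


# Candidates benchmarked earlier in the thread.
print(within_limit(730182, 524288))
print(within_limit(500000, 1048576))
```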
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3250 ID: 50683 Credit: 152,646,050 RAC: 13,531
|
So what are the b limits for the new genefer apps (especially AVX and SSE3)? Are these useful at all on the small genefer work we are doing on PRPnet (e.g., GFN 65536)?
But first, PRPCLIENT must support those new applications, so we need to wait until an update is online.
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! | |
|
rogue Volunteer developer
Joined: 8 Sep 07 Posts: 1259 ID: 12001 Credit: 18,565,548 RAC: 0
|
But first, PRPCLIENT must support those new applications, so we need to wait until an update is online.
That will probably not happen. The problem is that for the PRPNet client (and server) to manage all of those flavors of the same applications would be a nightmare. It's bad enough with the four that are out there.
I suggested that one version of genefer be built that has all flavors incorporated into it. That one version would detect the CPU/GPU capabilities and execute the best available FFT for that CPU/GPU. If maxerr is detected, it would automatically switch to the next best available FFT rather than requiring the PRPNet client to figure out which one to run. This would allow genefer to closely mimic the behavior of llr and pfgw and would eliminate the need for me to modify the PRPNet client (and server) every time a new FFT flavor is created.
In the end there would be a 32-bit and 64-bit build of genefer. The 32-bit build wouldn't have any of the x64 capability. Those differences could be managed by a makefile, i.e. different targets. I don't know if anyone is working on that, but for the long term it is the best solution. Too bad I didn't think about that when we had only four flavors of genefer. | |
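The proposal above could be sketched roughly as follows. Everything here is hypothetical: `MaxErrExceeded`, `SPEED_ORDER`, `run_with_fallback`, and the stand-in `fake_run_test` are illustration names, not part of genefer or PRPNet, and the ordering simply follows the approximate speed order given in the first post.

```python
class MaxErrExceeded(Exception):
    """Raised by a (hypothetical) transform when round-off error is too large."""


# Fastest first, following the approximate speed order in the first post.
SPEED_ORDER = ["avx", "sse3", "sse2", "x87"]


def run_with_fallback(supported, run_test, order=SPEED_ORDER):
    """Try each supported transform from fastest to slowest.

    `supported` is the set of transforms the CPU can run, and `run_test`
    is a callable that runs the test with the named transform and raises
    MaxErrExceeded if that transform cannot handle the number.
    """
    for name in order:
        if name not in supported:
            continue
        try:
            return name, run_test(name)
        except MaxErrExceeded:
            continue  # fall through to the next best transform
    raise RuntimeError("no supported transform completed the test")


def fake_run_test(name):
    """Stand-in for a real test: pretend only the x87 transform copes."""
    if name != "x87":
        raise MaxErrExceeded(name)
    return "probable prime"


print(run_with_fallback({"sse2", "x87"}, fake_run_test))
```

The point of the design is that the caller (here, the PRPNet client) only ever launches one program and never needs to know which transform was used.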
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
|
I suggested that one version of genefer be built that has all flavors incorporated into it. That one version would detect the CPU/GPU capabilities and execute the best available FFT for that CPU/GPU. If maxerr is detected, it would automatically switch to the next best available FFT rather than requiring the PRPNet client to figure out which one to run. This would allow genefer to closely mimic the behavior of llr and pfgw and would eliminate the need for me to modify the PRPNet client (and server) every time a new FFT flavor is created.
I understand that this is what might be preferred from a programming standpoint, but it would not be preferred in some use cases (such as mine). I run genefercuda only on some machines where I want the GPU engaged in PRPnet work, but the CPU is busy with other things (not necessarily other PRPnet or BOINC PG things, though that is sometimes the case). I would think that a compromise solution would be to have a genefercuda app and a "combined" CPU genefer app that does what you are suggesting above excluding the GPU.
| |
|
Michael Goetz Volunteer moderator Project administrator
|
So what are the b limits for the new genefer apps (especially AVX and SSE3)? Are these useful at all on the small genefer work we are doing on PRPnet (e.g., GFN 65536)?
But first, PRPCLIENT must support those new applications, so we need to wait until an update is online.
These are NOT the final versions!
There's more work that needs to occur, and one of the things planned is to combine the different programs together. Eventually, there will be a 32 bit (genefer + 80), 64 bit (128 + sse2 + sse3 + AVX), and CUDA. Or something along those lines. That just hasn't happened yet.
|
Michael Goetz Volunteer moderator Project administrator
|
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
so, what's up with the 128 bit genefer?
Glad you asked!
As Iain said, it's slow, but can handle huge B values. So...
The 128 code may be faster at computing (b^2)^(N/2)+1 than genefer80 is at computing b^N+1, so it may be a faster alternative to genefer80.
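The rewrite relies on the identity (b^2)^(N/2) = b^N for even N, so the same number can be tested with half the exponent and a squared base. A small sketch with exact integers, using the hypothetical helper name `rewrite_with_squared_base`:

```python
def rewrite_with_squared_base(b, n):
    """Return (b*b, n//2), which describes the same number b^n+1 when n is even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    return b * b, n // 2


# Check the identity on a small case with Python's exact integers.
b2, n2 = rewrite_with_squared_base(6, 8)
print(b2, n2, b2 ** n2 + 1 == 6 ** 8 + 1)
```

The squared base is far larger than the original b, which is presumably why this only becomes an option for a build that can handle very high b values.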
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
|
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
so, what's up with the 128 bit genefer?
Glad you asked!
As Iain said, it's slow, but can handle huge B values. So...
The 128 code may be faster at computing (b^2)^(N/2)+1 than genefer80 is at computing b^N+1, so it may be a faster alternative to genefer80.
Not available for windows though...? | |
|
Michael Goetz Volunteer moderator Project administrator
|
I'm surprised nobody's asked what's up with the 128 bit genefer. :)
so, what's up with the 128 bit genefer?
Glad you asked!
As Iain said, it's slow, but can handle huge B values. So...
The 128 code may be faster at computing (b^2)^(N/2)+1 than genefer80 is at computing b^N+1, so it may be a faster alternative to genefer80.
Not available for windows though...?
Not yet. Iain said there were some issues with it.
|
Dave
Joined: 13 Feb 12 Posts: 3254 ID: 130544 Credit: 2,446,790,434 RAC: 4,239,105
|
For those of us wanting to alpha-test but not familiar with an app_info for PG, please provide what info needs to go in it or a paste-able template. | |
|
Michael Goetz Volunteer moderator Project administrator
|
The 3.1.0 CUDA apps are live.
Of particular import is that this will hopefully correct at least some of the problems with the WR tasks on both Windows and Linux. Yes, this means that the WR Linux app is available once again.
|
Michael Goetz Volunteer moderator Project administrator
|
For those of us wanting to alpha-test but not familiar with an app_info for PG, please provide what info needs to go in it or a paste-able template.
RULE # 1: TURNING APP_INFO ON OR OFF, OR CHANGING IT, USUALLY ABORTS ALL TASKS THAT ARE ON YOUR COMPUTER. FINISH ALL YOUR TASKS BEFORE PLAYING WITH APP_INFO!!!!!
Rule #2: See Rule #1.
If that didn't scare you off completely, here's what you do:
You need to create a file called app_info.xml in your ...boinc/projects/www.primegrid.com directory. The file contains information about each subproject you want to run. You can NOT turn on app_info for just some sub-projects; if you use app_info, you must use it for everything you wish to run.
Here's a sample file with just two projects in it:
<app_info>
<app>
<name>genefer</name>
<user_friendly_name>Genefer</user_friendly_name>
</app>
<app>
<name>genefer_wr</name>
<user_friendly_name>Genefer (World Record)</user_friendly_name>
</app>
<file_info>
<name>geneferCUDA-windows.exe</name>
<executable/>
</file_info>
<file_info>
<name>cudart32_32_16.dll</name>
<file_signature>
6f6c6846e06f08954c5d3744c8b3e206901e7d6cada0f98bc4736d89b82ebcf3
46a34061b44aafecc7daf6b21245da1a0b9116084f154409acd86db5b746aa51
cea02569028b1e61ef85602c7ed27cb7676cdc0db7685626209e6a40e78dbccf
a8a40f78cec19d9904ffd2d2b3581ab5931f06fd2d4c734bfc5a9fa271f6149a
.
</file_signature>
<nbytes>384616.000000</nbytes>
</file_info>
<file_info>
<name>cufft32_32_16.dll</name>
<file_signature>
0f510cac435e4772ff757cc92ebc1453cf1721a933164ee79eae3428c4a736bb
30fb39da86a8973e8064619dff31e0f4ef3245501b75fab4659de8e8cef58653
8c4bad39e15399b2d6a5bd2d92495590a9836ec614828c7413b1771ee54763b7
e8a71396a813c783280b6e34872202877c788b110bcd24b19720144195c97a69
.
</file_signature>
<nbytes>28551272.000000</nbytes>
</file_info>
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferCUDA-windows.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_32_16.dll</file_name>
<open_name>cudart32_32_16.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_32_16.dll</file_name>
<open_name>cufft32_32_16.dll</open_name>
<copy_file/>
</file_ref>
<platform>windows_intelx86</platform>
<plan_class>cuda32_13</plan_class>
<coproc>
<type>CUDA</type>
<count>1.000000</count>
</coproc>
<gpu_ram>536870912.000000</gpu_ram>
</app_version>
<app_version>
<app_name>genefer_wr</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferCUDA-windows.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_32_16.dll</file_name>
<open_name>cudart32_32_16.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_32_16.dll</file_name>
<open_name>cufft32_32_16.dll</open_name>
<copy_file/>
</file_ref>
<platform>windows_intelx86</platform>
<plan_class>cuda32_13</plan_class>
<coproc>
<type>CUDA</type>
<count>1.000000</count>
</coproc>
<gpu_ram>536870912.000000</gpu_ram>
</app_version>
</app_info>
Most of this is copied from sched_reply_www.primegrid.com.xml, which is created by BOINC when you request work for one of the subprojects. You take what it gives you, and change it ever so slightly to make it use your own application instead of the server's.
App_info contains three different types of entries:
<app> defines the subproject -- in this case, genefer or genefer_wr. Copy what I have in the sample file above.
The second section, <file_info>, is the file definitions, where you tell boinc all the files you're using. Defined above is genefercuda and the two cuda libraries it requires. For a CPU app, you only have to define the genefer program as there aren't any libraries.
Finally, the <app_version> sections tie it all together: each one defines the program that will run tasks for that app. In the sample file above, I've defined cuda app_versions for both the genefer and genefer_wr apps. Each one names the executable file as well as the cuda libraries. The file signatures for the libraries aren't strictly necessary. They're just copied from the scheduler xml that boinc created, but this would work without them.
If you wanted to run a different cuda program, you would use the app_version section above, but change the executable name to the name of your program. If you need different libraries, you'd change that too.
If you wanted to run CPU programs, you'd use an app version section that looks something like this:
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferAVX.exe</file_name>
<main_program/>
</file_ref>
<platform>windows_intelx86_64</platform>
</app_version>
We got rid of the libraries, the plan class, and some GPU specific stuff, but notice that the platform is different: it's now the 64 bit windows platform.
Finally, once the file is set up, you need to restart your boinc-client for it to take effect. This is NOT the same as restarting the boinc manager window. If you're not sure how to do this, the easiest thing is to restart your computer. (The method varies depending on your O.S., how you installed BOINC, and which version of BOINC you're running.)
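Since a mistake in app_info.xml can cost you your queued tasks, it may be worth sanity-checking the file before restarting the client. The script below is a minimal sketch, not a BOINC tool: `check_app_info` is a hypothetical helper that only verifies the three-part structure described above (every app_version names a defined app, every file_ref has a matching file_info, and exactly one file_ref is the main program). `SAMPLE` is a cut-down version of the CPU example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<app_info>
<app><name>genefer</name></app>
<file_info><name>geneferAVX.exe</name><executable/></file_info>
<app_version>
<app_name>genefer</app_name>
<file_ref><file_name>geneferAVX.exe</file_name><main_program/></file_ref>
<platform>windows_intelx86_64</platform>
</app_version>
</app_info>"""


def _text(element, tag):
    """Return the stripped text of a child element, or '' if it is missing."""
    return (element.findtext(tag) or "").strip()


def check_app_info(xml_text):
    """Return a list of problems found in an app_info.xml string (empty if none)."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    problems = []
    if root.tag != "app_info":
        problems.append(f"root element is <{root.tag}>, expected <app_info>")
    apps = {_text(app, "name") for app in root.findall("app")}
    files = {_text(info, "name") for info in root.findall("file_info")}
    for version in root.findall("app_version"):
        app_name = _text(version, "app_name")
        if app_name not in apps:
            problems.append(f"app_version refers to undefined app '{app_name}'")
        refs = version.findall("file_ref")
        for ref in refs:
            file_name = _text(ref, "file_name")
            if file_name not in files:
                problems.append(f"file_ref '{file_name}' has no matching file_info")
        mains = [ref for ref in refs if ref.find("main_program") is not None]
        if len(mains) != 1:
            problems.append(f"app_version for '{app_name}' needs exactly one main_program")
    return problems


print(check_app_info(SAMPLE))
```

To check a real file, read it into a string first and pass that to `check_app_info`; an empty list means none of these particular problems were found, not that BOINC will necessarily accept the file.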
|
Michael Goetz Volunteer moderator Project administrator
|
The 3.1.0 CUDA apps are live.
Of particular import is that this will hopefully correct at least some of the problems with the WR tasks on both Windows and Linux. Yes, this means that the WR Linux app is available once again.
Unfortunately, there's a problem in the build of the linux cuda app, so that had to be taken out of production. Not so bad for the n=20 short tasks which can run 2.3.0, but we're back to having no linux WR app. Hopefully this will be short lived.
|
|
geneferavx 3.1.0-0 (Windows 64-bit AVX)
Estimated total run time for 1199996^262144+1 is 3:01:45
Testing 1199996^262144+1... 5292032 steps to go (3:01:41 remaining)
maxErr exceeded for 1199996^262144+1, 0.4688 > 0.4500
current leading edge 1006788^262144+1 is ok | |
|
Michael Goetz Volunteer moderator Project administrator
|
The 3.1.0 CUDA apps are live.
Of particular import is that this will hopefully correct at least some of the problems with the WR tasks on both Windows and Linux. Yes, this means that the WR Linux app is available once again.
Unfortunately, there's a problem in the build of the linux cuda app, so that had to be taken out of production. Not so bad for the n=20 short tasks which can run 2.3.0, but we're back to having no linux WR app. Hopefully this will be short lived.
Update: The linux CUDA build has been updated in the links in the first post. Let me know how it works for you.
|
pschoefer Volunteer developer Volunteer tester
Joined: 20 Sep 05 Posts: 686 ID: 845 Credit: 3,016,831,333 RAC: 1,036,540
|
Some remarks after running the benchmarks on some different machines:
- AVX and SSE3 are fast :)
- no speed difference between 3.1.0 and 2.3.0 for SSE2 and 8087
- 32-bit 3.1.0 seems to be a lot slower than 32-bit 2.3.0 and even slower than 8087
I can also confirm that the new CUDA version does fix a problem with WR tasks:
>genefercuda.exe -q 7380^^4194304+1
genefercuda 2.3.0-0 (Windows x86 CUDA 3.2)
Copyright 2001-2003, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2012, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefercuda.exe -q 7380^4194304+1
Using default SHIFT value=8
Starting initialization...
cuda_subs.cu(690) : cufftSafeCall() CUFFT error: 6.
>genefercuda-windows.exe -q 7380^^4194304+1
genefercuda 3.1.0-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefercuda-windows.exe -q 7380^4194304+1
Using default SHIFT value=8
Starting initialization...
maxErr during b^N initialization = 0.0000 (50.540 seconds).
Estimated total run time for 7380^4194304+1 is 175:05:47
I was not even aware that this problem did affect my GTX 660Ti (only ran n=18, 19, 20 on that GPU before). Nice to see it fixed, hopefully this will lower the error rate significantly. :)
____________
| |
|
|
Thanks Patrick,
I'm aware of the performance problem re: genefer 32 bit, and hope to resolve it soon. The other results are as expected, and good to see!
Cheers
- Iain
|
|
Update: The linux CUDA build has been updated in the links in the first post. Let me know how it works for you.
Downloaded s/w at about 3PM PST Saturday 2 March.
GPU: GTX 570, clocked at 786/1572. CPU: 2600K, HT on, moderate o/c. Nothing running other than normal o/s background stuff. OS: Ubuntu 11.10 (64 bit), nvidia driver: 302.17.
% cksum ./genefercuda_linux
892039866 108656 ./genefercuda_linux
% ./genefercuda_linux -b2 20
genefercuda 3.1.0-0 (Linux 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: ./genefercuda_linux -b2 20
Generalized Fermat Number Bench 2
SHIFT=5 468750^1048576+1 Time: 3 ms/mul. Err: 1.56e-01 5946413 digits
SHIFT=6 468750^1048576+1 Time: 2.09 ms/mul. Err: 1.56e-01 5946413 digits
SHIFT=7 468750^1048576+1 Time: 1.86 ms/mul. Err: 1.56e-01 5946413 digits
SHIFT=8 468750^1048576+1 Time: 1.84 ms/mul. Err: 1.64e-01 5946413 digits
SHIFT=9 468750^1048576+1 Time: 1.95 ms/mul. Err: 1.56e-01 5946413 digits
SHIFT=10 468750^1048576+1 Time: 2.99 ms/mul. Err: 1.64e-01 5946413 digits
SHIFT=11 468750^1048576+1 Time: 1.84 ms/mul. Err: 1.64e-01 5946413 digits
SHIFT=12 468750^1048576+1 Time: 1.84 ms/mul. Err: 1.64e-01 5946413 digits
% ^20^22
./genefercuda_linux -b2 22
genefercuda 3.1.0-0 (Linux 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: ./genefercuda_linux -b2 22
Generalized Fermat Number Bench 2
SHIFT=5 309258^4194304+1 Time: 11.8 ms/mul. Err: 1.48e-01 23028076 digits
SHIFT=6 309258^4194304+1 Time: 8.47 ms/mul. Err: 1.56e-01 23028076 digits
SHIFT=7 309258^4194304+1 Time: 7.69 ms/mul. Err: 1.56e-01 23028076 digits
SHIFT=8 309258^4194304+1 Time: 7.23 ms/mul. Err: 1.56e-01 23028076 digits
SHIFT=9 309258^4194304+1 Time: 7.21 ms/mul. Err: 1.64e-01 23028076 digits
SHIFT=10 309258^4194304+1 Time: 7.43 ms/mul. Err: 1.64e-01 23028076 digits
SHIFT=11 309258^4194304+1 Time: 7.21 ms/mul. Err: 1.64e-01 23028076 digits
SHIFT=12 309258^4194304+1 Time: 7.21 ms/mul. Err: 1.64e-01 23028076 digits
% ./genefercuda_linux -b
genefercuda 3.1.0-0 (Linux 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: ./genefercuda_linux -b
Generalized Fermat Number Bench
2199064^8192+1 Time: 89.4 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 95.5 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 148 us/mul. Err: 0.2188 202102 digits
1203210^65536+1 Time: 218 us/mul. Err: 0.2080 398482 digits
984108^131072+1 Time: 375 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 556 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.02 ms/mul. Err: 0.2031 3050541 digits
538452^1048576+1 Time: 1.84 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.54 ms/mul. Err: 0.2031 11836006 digits
360204^4194304+1 Time: 7.33 ms/mul. Err: 0.1992 23305854 digits
294612^8388608+1 Time: 15.1 ms/mul. Err: 0.1761 45879398 digits
% ^b^t
./genefercuda_linux -t
genefercuda 3.1.0-0 (Linux 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: ./genefercuda_linux -t
Testing 2030234^8192+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (0.008 seconds).
Estimated total run time for 2030234^8192+1 is 0:00:16
2030234^8192+1 is a probable prime. (51672 digits) (err = 0.2188) (time = 0:00:15) 15:23:45
Testing 1651902^16384+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (0.022 seconds).
Estimated total run time for 1651902^16384+1 is 0:00:32
1651902^16384+1 is a probable prime. (101876 digits) (err = 0.1875) (time = 0:00:33) 15:24:18
Testing 1277444^32768+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (0.057 seconds).
Estimated total run time for 1277444^32768+1 is 0:01:35
1277444^32768+1 is a probable prime. (200093 digits) (err = 0.1875) (time = 0:01:38) 15:25:56
Testing 857678^65536+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (0.178 seconds).
Estimated total run time for 857678^65536+1 is 0:04:52
857678^65536+1 is a probable prime. (388847 digits) (err = 0.1250) (time = 0:04:47) 15:30:43
Testing 572186^131072+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (0.506 seconds).
Estimated total run time for 572186^131072+1 is 0:15:35
572186^131072+1 is a probable prime. (754652 digits) (err = 0.0820) (time = 0:15:35) 15:46:18
Testing 24518^262144+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (1.060 seconds).
Estimated total run time for 24518^262144+1 is 0:35:49
24518^262144+1 is a probable prime. (1150678 digits) (err = 0.0002) (time = 0:35:51) 16:22:09
Testing 75898^524288+1...
Starting initialization...
maxErr during b^N initialization = 0.0000 (3.928 seconds).
Estimated total run time for 75898^524288+1 is 2:23:22
^Csting 75898^524288+1... 8454144 steps to go (2:22:36 remaining)
^C caught.
%
That's as far as my patience allowed :-)
--Gary | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
The 3.1.0 CUDA apps are live.
Of particular import is that this will hopefully correct at least some of the problems with the WR tasks on both Windows and Linux. Yes, this means that the WR Linux app is available once again.
Unfortunately, there's a problem in the build of the linux cuda app, so that had to be taken out of production. Not so bad for the n=20 short tasks which can run 2.3.0, but we're back to having no linux WR app. Hopefully this will be short lived.
The corrected linux app is now in production, so hopefully we'll have more success with the WR apps on Windows and Linux. (Mac is fixed, too, but there are so few Macs that can run this. Although I think Apple may be shipping 600 series Nvidia GPUs, so there may be more Macs participating in the future.)
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
I moved the development plan into the first post.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
We're building version 3.1.1 for the various platforms. 3.1.1 fixes the stderr truncation problem (I call it a zombie bug because we killed this bug last year, too), and for CUDA builds also adds the new AUTO-SHIFT feature. For some computers, this has the potential to increase performance.
You can download them from the links in the first post. Currently, all the Mac builds and the Windows CUDA build are available.
I expect to put the CUDA apps into production quickly because they do fix a small bug and will increase performance on some systems.
Update: The Windows and Mac CUDA apps are now in production.
____________
My lucky number is 75898^524288+1 | |
|
|
avx version has a problem when running 6 tasks on a 3770.
Current prod with 6 tasks:
Starting initialization...
Initialization complete (176.733 seconds).
Testing 138630^1048576+1...
Estimated total run time for 138630^1048576+1 is 136:26:20
Avx with 6 tasks:
Starting initialization...
Initialization complete (5.320 seconds).
Testing 138694^1048576+1...
Estimated total run time for 138694^1048576+1 is 216:23:36
AVX with 3 tasks shows an estimate of 98 hours. Didn't try current prod with 3 but obviously it will be less than 136 hours. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
avx version has a problem when running 6 tasks on a 3770.
Current prod with 6 tasks:
Starting initialization...
Initialization complete (176.733 seconds).
Testing 138630^1048576+1...
Estimated total run time for 138630^1048576+1 is 136:26:20
Avx with 6 tasks:
Starting initialization...
Initialization complete (5.320 seconds).
Testing 138694^1048576+1...
Estimated total run time for 138694^1048576+1 is 216:23:36
AVX with 3 tasks shows an estimate of 98 hours. Didn't try current prod with 3 but obviously it will be less than 136 hours.
Is that repeatable?
____________
My lucky number is 75898^524288+1 | |
|
|
yes, I didn't believe it the 1st time so tried it again after a reboot.
and the estimate is accurate. I let them run for 10 mins and they run as per estimate. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
yes, I didn't believe it the 1st time so tried it again after a reboot.
and the estimate is accurate. I let them run for 10 mins and they run as per estimate.
The 3770 is a quad-core, right? So with 3 tasks you're running on real cores and with 6, at least 4 of the six tasks are running on hyperthreads. So you expect the time to about double going from 3 to 6 (98 to 216 hours). What doesn't make sense is why it's a lot slower than genefX64.
____________
My lucky number is 75898^524288+1 | |
|
|
yes, I didn't believe it the 1st time so tried it again after a reboot.
and the estimate is accurate. I let them run for 10 mins and they run as per estimate.
The 3770 is a quad-core, right? So with 3 tasks you're running on real cores and with 6, at least 4 of the six tasks are running on hyperthreads. So you expect the time to about double going from 3 to 6 (98 to 216 hours). What doesn't make sense is why it's a lot slower than genefX64.
it's a challenge for you :-)
When I have some time later or tomorrow I'll do a few more tests with varying numbers of cores for both apps and report back.
| |
|
|
avx
1: Estimated total run time for 49740^1048576+1 is 40:15:33
2: Estimated total run time for 123298^1048576+1 is 67:37:22
3: Estimated total run time for 131046^1048576+1 is 97:28:50
4: Estimated total run time for 135440^1048576+1 is 133:26:58
5: Estimated total run time for 138704^1048576+1 is 173:43:32
6: Estimated total run time for 138722^1048576+1 is 214:54:51
7: Estimated total run time for 128070^1048576+1 is 255:06:15
8: Estimated total run time for 138710^1048576+1 is 306:51:47
-------------------------
non-avx
1: Estimated total run time for 138728^1048576+1 is 72:34:19
2: Estimated total run time for 123466^1048576+1 is 77:46:28
3: Estimated total run time for 138608^1048576+1 is 92:16:41
4: Estimated total run time for 123926^1048576+1 is 101:38:09
5: Estimated total run time for 138092^1048576+1 is 115:17:29
6: Estimated total run time for 123582^1048576+1 is 124:21:16
7: Estimated total run time for 138732^1048576+1 is 143:44:47
8: Estimated total run time for 138734^1048576+1 is 159:58:18
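One way to read the two lists above is as throughput rather than per-task time: with n tasks running at once, the machine finishes roughly n tasks per estimated run time. The Python sketch below applies that to the posted estimates. The helper names are only illustrative, and it ignores the small differences in b between the tasks.

```python
# Estimated run times copied from the two lists above, indexed by the
# number of simultaneous tasks (1..8).
AVX = ["40:15:33", "67:37:22", "97:28:50", "133:26:58",
       "173:43:32", "214:54:51", "255:06:15", "306:51:47"]
NON_AVX = ["72:34:19", "77:46:28", "92:16:41", "101:38:09",
           "115:17:29", "124:21:16", "143:44:47", "159:58:18"]


def hours(hms):
    """Convert an 'H:MM:SS' estimate into fractional hours."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h + m / 60 + s / 3600


def tasks_per_day(estimates):
    """Throughput per task count: n tasks complete every `est` hours."""
    return {n: 24 * n / hours(est) for n, est in enumerate(estimates, start=1)}


def best_task_count(estimates):
    """Task count with the highest estimated throughput."""
    rates = tasks_per_day(estimates)
    return max(rates, key=rates.get)


if __name__ == "__main__":
    for name, est in (("avx", AVX), ("non-avx", NON_AVX)):
        rates = tasks_per_day(est)
        print(name, {n: round(r, 2) for n, r in rates.items()},
              "best:", best_task_count(est))
```

On these figures the AVX build peaks at 3 simultaneous tasks, while the non-AVX build keeps gaining throughput all the way up to 8.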
| |
|
Yves Gallot Volunteer developer Project scientist Send message
Joined: 19 Aug 12 Posts: 843 ID: 164101 Credit: 306,539,995 RAC: 5,437

|
GeneferAVX uses one large array for the cos/sin values, whereas GenefX64 uses two small tables and a product (to check b^N+1, GeneferAVX allocates 40N bytes and GenefX64 about 9N bytes). The large array is more efficient when a single process is running but you're right, it is a major problem when 3 or 4 instances are running! I will improve it. For the moment, you can run 2 GeneferAVX and 2 GenefX64...
Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
L3 cache 10 MBytes, 20-way set associative, 64-byte line size
Memory Channels 4, 4 x 4096 MBytes DDR3 @ 667 MHz
geneferavx 3.1.2dev-0 (Windows 64-bit AVX)
1 instance 804904^262144+1 Time: 2.18 ms/mul. Err: 0.1875 1548156 digits
2 instances 804904^262144+1 Time: 2.28 ms/mul. Err: 0.1875 1548156 digits
3 instances 804904^262144+1 Time: 2.63 ms/mul. Err: 0.1875 1548156 digits
4 instances 804904^262144+1 Time: 3.52 ms/mul. Err: 0.1875 1548156 digits
1 instance 658332^524288+1 Time: 4.66 ms/mul. Err: 0.1875 3050541 digits
2 instances 658332^524288+1 Time: 5.43 ms/mul. Err: 0.1875 3050541 digits
3 instances 658332^524288+1 Time: 7.65 ms/mul. Err: 0.1875 3050541 digits
4 instances 658332^524288+1 Time: 9.91 ms/mul. Err: 0.1875 3050541 digits
1 instance 538452^1048576+1 Time: 11.2 ms/mul. Err: 0.1875 6009544 digits
2 instances 538452^1048576+1 Time: 13.7 ms/mul. Err: 0.1875 6009544 digits
3 instances 538452^1048576+1 Time: 18.3 ms/mul. Err: 0.1875 6009544 digits
4 instances 538452^1048576+1 Time: 22.7 ms/mul. Err: 0.1875 6009544 digits
1 instance 440400^2097152+1 Time: 24.8 ms/mul. Err: 0.1875 11836006 digits
2 instances 440400^2097152+1 Time: 29.1 ms/mul. Err: 0.1875 11836006 digits
3 instances 440400^2097152+1 Time: 38.1 ms/mul. Err: 0.1875 11836006 digits
4 instances 440400^2097152+1 Time: 50.2 ms/mul. Err: 0.1875 11836006 digits
1 instance 360204^4194304+1 Time: 54.8 ms/mul. Err: 0.1875 23305854 digits
2 instances 360204^4194304+1 Time: 64.6 ms/mul. Err: 0.1875 23305854 digits
3 instances 360204^4194304+1 Time: 85.1 ms/mul. Err: 0.1875 23305854 digits
4 instances 360204^4194304+1 Time: 110 ms/mul. Err: 0.1875 23305854 digits
1 instance 1471094^32768+1 Time: 206 us/mul. Err: 0.2031 202102 digits
2 instances 1471094^32768+1 Time: 206 us/mul. Err: 0.2031 202102 digits
3 instances 1471094^32768+1 Time: 212 us/mul. Err: 0.2031 202102 digits
4 instances 1471094^32768+1 Time: 252 us/mul. Err: 0.2031 202102 digits
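To put the 40N versus 9N figures in perspective, here is a rough Python sketch of the memory allocated per process and in total across several instances. The bytes-per-N values come from the post above. Comparing the total against the 10 MB L3 cache is my own assumption about why several instances slow each other down, not something the benchmark itself proves.

```python
AVX_BYTES_PER_N = 40      # GeneferAVX: "allocates 40N bytes" (from the post)
X64_BYTES_PER_N = 9       # GenefX64: "about 9N bytes" (from the post)
L3_BYTES = 10 * 1024**2   # i7-3820 L3 size quoted above; the comparison is an assumption


def footprint_bytes(bytes_per_n, n, instances=1):
    """Total bytes allocated by `instances` processes testing b^n+1."""
    return bytes_per_n * n * instances


def report(n, instances):
    """Print the combined footprint of both apps relative to the L3 size."""
    for name, per_n in (("GeneferAVX", AVX_BYTES_PER_N),
                        ("GenefX64", X64_BYTES_PER_N)):
        total = footprint_bytes(per_n, n, instances)
        print(f"{name}: N={n}, {instances} instance(s): "
              f"{total / 1024**2:.1f} MiB ({total / L3_BYTES:.1f}x L3)")


if __name__ == "__main__":
    for count in (1, 2, 3, 4):
        report(1048576, count)
```

For N=1048576 a single GeneferAVX process already allocates 40 MiB, so every additional instance adds to a working set that is well beyond the cache size.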
| |
|
|
For the moment, you can run 2 GeneferAVX and 2 GenefX64...
That may be fantastic for PRPNet, but it's not much use for BOINC. And while the speed improvement is very impressive for those 2 cores, if you can't run on more than 4 cores you might as well use a GPU.
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
Version 3.1.2-0 of GeneferCUDA for Windows is now available for testing with the app_info/anonymous platform mechanism. The first post in this thread includes a link for downloading this executable.
3.1.2-0 includes vastly improved error handling that can cope with many types of transient CUDA errors which until now would kill a BOINC task. The program will now attempt to restart following a "random" CUDA system failure.
I've been "fortunate" to have had Microsoft Update install the 311 version of the Nvidia driver, which has, on this computer, been quite unstable and prone to crashing. For a developer working on error handling code, this is heaven. :)
In real life situations, the new version of the code has been able to restart and successfully continue in all but 1 of the crashes I've experienced so far. All previous versions of geneferCUDA would have failed on all of the crashes, so this is a huge improvement.
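The recovery behaviour described above can be pictured as a retry loop around the main computation: on a transient failure the program waits, then resumes from the last checkpoint instead of aborting the task. The following is only an illustrative Python sketch. `TransientGpuError`, `run_with_recovery`, the checkpoint interval and the retry limit are hypothetical names and values, not taken from the genefer source.

```python
import time


class TransientGpuError(Exception):
    """Hypothetical stand-in for a recoverable CUDA/driver failure."""


def run_with_recovery(step, total_steps, checkpoint_every=1000,
                      max_restarts=5, wait_seconds=0.0):
    """Run step(i) for i in [0, total_steps), resuming from the last
    checkpoint after a transient failure instead of giving up.
    Returns the number of restarts that were needed."""
    checkpoint = 0          # last iteration known to be safely saved
    restarts = 0
    i = checkpoint
    while i < total_steps:
        try:
            step(i)
            i += 1
            if i % checkpoint_every == 0:
                checkpoint = i          # a real app would write state to disk here
        except TransientGpuError:
            restarts += 1
            if restarts > max_restarts:
                raise                   # give up: the failure does not look transient
            time.sleep(wait_seconds)    # let the driver/GPU settle before retrying
            i = checkpoint              # redo the work done since the last checkpoint
    return restarts
```

The cost of a recovery in this scheme is only the work done since the last checkpoint, which is why a crash no longer has to kill the whole task.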
____________
My lucky number is 75898^524288+1 | |
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,414,212,272 RAC: 2,794,970
                                      
|
Curious about latest version, I was trying to download it from Assembla.
It gave me "Operation not permitted", login/pw etc for CUDA version (aka geneferCUDA).
When trying geneferavx_windows.exe, this binary file cannot be displayed. File is too big for download.
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
Curious about latest version, I was trying to download it from Assembla.
It gave me "Operation not permitted", login/pw etc for CUDA version (aka geneferCUDA).
When trying geneferavx_windows.exe, this binary file cannot be displayed. File is too big for download.
I think Assembla may have changed the way it works and guests can no longer download from there.
Give me a few minutes and I'll move all the executables over to the primegrid server.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
It took a bit more than a few minutes, but the links on top now point to the files on the primegrid.com server rather than Assembla, so there should be no more problems.
____________
My lucky number is 75898^524288+1 | |
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,414,212,272 RAC: 2,794,970
                                      
|
Thanks for moving apps to PG server.
Is there any maintenance work going on?
I can't get any work.
How do I persuade PRPNet to use GeneferAVX?
c:\_PG\PRP.1>prpclient.exe
Genefer version of geneferavx from 'genefer.exe' is not supported
[2013-03-26 13:07:15 SE(Å”] PRPNet Client application v5.2.2 started
[2013-03-26 13:07:16 SE(Å”] GFN262144: Getting work from server prpnet.mine.nu at port 11002
[2013-03-26 13:08:11 SE(Å”] GFN262144: INFO: No available candidates are left on this server.
^C
c:\_PG\PRP.1>prpclient.exe
Genefer version of geneferavx from 'genefer.exe' is not supported
[2013-03-26 13:11:24 SE(Å”] PRPNet Client application v5.2.2 started
[2013-03-26 13:11:25 SE(Å”] GFN32768: Getting work from server prpnet.primegrid.com at port 12005
[2013-03-26 13:12:32 SE(Å”] GFN32768: No active candidates found on server
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
Thanks for moving apps to PG server.
Is there any maintenance work going on?
I can't get any work.
How do I persuade PRPNet to use GeneferAVX?
c:\_PG\PRP.1>prpclient.exe
Genefer version of geneferavx from 'genefer.exe' is not supported
[2013-03-26 13:07:15 SE(Å”] PRPNet Client application v5.2.2 started
[2013-03-26 13:07:16 SE(Å”] GFN262144: Getting work from server prpnet.mine.nu at port 11002
[2013-03-26 13:08:11 SE(Å”] GFN262144: INFO: No available candidates are left on this server.
^C
c:\_PG\PRP.1>prpclient.exe
Genefer version of geneferavx from 'genefer.exe' is not supported
[2013-03-26 13:11:24 SE(Å”] PRPNet Client application v5.2.2 started
[2013-03-26 13:11:25 SE(Å”] GFN32768: Getting work from server prpnet.primegrid.com at port 12005
[2013-03-26 13:12:32 SE(Å”] GFN32768: No active candidates found on server
I'm not sure if the new versions of genefer work with PRPNet. They will eventually, but it's possible they don't right now.
____________
My lucky number is 75898^524288+1 | |
|
|
I'm not sure if the new versions of genefer work with PRPNet. They will eventually, but it's possible they don't right now.
Right now, the new CPU codes (AVX, SSE3) won't as prpclient doesn't recognise the new transform versions. This will be fixed in a future release as all the CPU transform codes get merged into a single binary. The new versions of genefercuda work fine with PRPNet.
- Iain
____________
Twitter: IainBethune
Proud member of team "Aggie The Pew". Go Aggie!
3073428256125*2^1290000-1 is Prime! | |
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,414,212,272 RAC: 2,794,970
                                      
|
This will be fixed in a future release as all the CPU transform codes get merged into a single binary.
This is the way to go, this will reduce complexity of PRPNet.
I guess I'll need to wait a bit longer.
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
All builds of 3.1.2-0 are now available via the links in the first post. (Windows CUDA and all Linux builds went up yesterday; I just added the Mac builds and Windows CPU builds).
I expect to put the CUDA (3.2) and CPU (sse2/genefx64) apps into production fairly soon, probably later today.
On a somewhat related note, I'll probably be installing the new scheduler code on the server sometime in the next few days once I've convinced myself I've tested everything I can test. Once that's done, this in theory opens up the means to install the SSE3 and AVX versions of the CPU apps into production, as well as CUDA 5.0 apps.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
3.1.2-0 is now in production on Boinc as version 2.04 for both CUDA and CPU.
____________
My lucky number is 75898^524288+1 | |
|
Roger Volunteer developer Volunteer tester
 Send message
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
                    
|
Any idea of "B" limit on new Version 3.x.x GeneferCUDA?
N ; old GeneferCUDA
65536 ; 1,525,000
262144 ; 995,000
524288 ; 815,000
1048576 ; 695,000
4194304 ; 475,000
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
Any idea of "B" limit on new Version 3.x.x GeneferCUDA?
N ; old GeneferCUDA
65536 ; 1,525,000
262144 ; 995,000
524288 ; 815,000
1048576 ; 695,000
4194304 ; 475,000
The main math calculations in the CUDA programs are unchanged, so the b limits are identical. All the changes in geneferCUDA have been to increase reliability.
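Since the b limits are unchanged, the table quoted above can be used directly to check whether a candidate is in range for GeneferCUDA. A minimal Python sketch, assuming the listed values are approximate upper bounds on b for each N and that the bound is inclusive; the function name is hypothetical:

```python
# Approximate b limits for GeneferCUDA, copied from the table quoted above.
B_LIMITS = {
    65536: 1_525_000,
    262144: 995_000,
    524288: 815_000,
    1048576: 695_000,
    4194304: 475_000,
}


def within_b_limit(b, n):
    """True if b is at or below the listed limit for exponent n.
    Raises KeyError for exponents that are not in the table."""
    return b <= B_LIMITS[n]


if __name__ == "__main__":
    print(within_b_limit(151978, 1048576))
    print(within_b_limit(1_600_000, 65536))
```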
____________
My lucky number is 75898^524288+1 | |
|
|
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (14418095 iterations left)
Estimated total run time for 151978^1048576+1 is 16:38:44
maxErr exceeded for 151978^1048576+1, 0.5000 > 0.4500
MaxErr exceeded may be caused by overclocking, overheated GPUs and other transient errors.
Waiting 10 minutes before attempting to continue from last checkpoint...
Generalized Fermat Number Bench 2
...
SHIFT=5 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=6 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=7 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=8 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=9 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=10 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5 to genefer.cfg.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=5
Resuming 151978^1048576+1 from a checkpoint (13238271 iterations left)
Estimated total run time for 151978^1048576+1 is 22:13:52
Very interesting: after the maxErr termination, genefer decided to run the benchmark again, and the results were far from those of the first bench run:
Generalized Fermat Number Bench 2
...
SHIFT=5 151978^1048576+1 Time: 4.76 ms/mul. Err: 1.66e-002 5433491 digits
...
SHIFT=6 151978^1048576+1 Time: 3.57 ms/mul. Err: 1.86e-002 5433491 digits
...
SHIFT=7 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=8 151978^1048576+1 Time: 3.34 ms/mul. Err: 1.76e-002 5433491 digits
...
SHIFT=9 151978^1048576+1 Time: 3.61 ms/mul. Err: 1.66e-002 5433491 digits
...
SHIFT=10 151978^1048576+1 Time: 5.24 ms/mul. Err: 1.86e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
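For reference, here is a small Python sketch of how a "best SHIFT" could be picked from such a benchmark pass. It assumes the app simply keeps the fastest SHIFT and breaks ties in favour of the first one measured. That rule is a guess, not taken from the genefer source, although it does reproduce the two values saved to genefer.cfg in this log.

```python
def best_shift(bench):
    """bench: list of (shift, ms_per_mul) in the order they were measured.
    Returns the shift with the lowest time; ties go to the first one tried."""
    return min(bench, key=lambda entry: entry[1])[0]


# The two "Bench 2" passes quoted in the log above.
FLAT_PASS = [(5, 3.31), (6, 3.31), (7, 3.35), (8, 3.31), (9, 3.31), (10, 3.31)]
CURVED_PASS = [(5, 4.76), (6, 3.57), (7, 3.31), (8, 3.34), (9, 3.61), (10, 5.24)]

if __name__ == "__main__":
    print("flat pass   ->", best_shift(FLAT_PASS))
    print("curved pass ->", best_shift(CURVED_PASS))
```

With the nearly flat timings, the choice of SHIFT=5 is decided by a tie-break rather than by any real difference in speed, so it says little about which setting is actually fastest.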
____________
| |
|
|
Also there are 2 lines in genefer.cfg:
AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7
AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5
The 1st line is from the initial bench, the next one from after the maxErr termination.
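The two genefer.cfg lines above follow a simple pipe-separated layout, so they are easy to inspect with a script. The Python sketch below parses them. The field meanings (app, version, device number, GPU name, clock in MHz, b, N, then the chosen SHIFT) are inferred from the log output and are an assumption, as are the names `AutoShiftEntry`, `parse_autoshift` and `looks_valid`.

```python
from typing import NamedTuple


class AutoShiftEntry(NamedTuple):
    app: str
    version: str
    device: int
    gpu_name: str
    clock_mhz: int
    b: int
    n: int
    shift: int


def parse_autoshift(line):
    """Parse one 'AUTOSHIFT|...=S' line; field meanings are inferred from the log."""
    key, _, shift = line.strip().rpartition("=")
    tag, app, version, device, gpu_name, clock, b, n = key.split("|")
    if tag != "AUTOSHIFT":
        raise ValueError(f"not an AUTOSHIFT line: {line!r}")
    return AutoShiftEntry(app, version, int(device), gpu_name,
                          int(clock), int(b), int(n), int(shift))


def looks_valid(entry):
    """Flag entries saved without GPU details, e.g. '(null)' and a 0 MHz clock."""
    return entry.gpu_name != "(null)" and entry.clock_mhz > 0
```

Run against the two lines above, the entry with `(null)` and a clock of 0 is the one that stands out as having been written without GPU details.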
____________
| |
|
|
Also I noticed that after SHIFT shifting ))) from 7 to 5, the estimated run time changed from 16:38:44 to 22:13:52
and the Memory Controller Load decreased from 48% to 33-37% (it varies from time to time)
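The change in the estimate can be related to the per-multiplication time with a little arithmetic. The Python sketch below assumes the test needs roughly N*log2(b) multiplications. That is an approximation which fits the figures in this thread, not genefer's exact step count, and the function names are only illustrative.

```python
import math


def estimated_seconds(b, n, ms_per_mul):
    """Rough run time for testing b^n+1: about n*log2(b) multiplications
    (an approximation, not genefer's exact step count)."""
    steps = math.ceil(n * math.log2(b))
    return steps * ms_per_mul / 1000.0


def as_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    seconds = int(round(seconds))
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


if __name__ == "__main__":
    for ms in (3.31, 4.76):
        print(ms, "ms/mul ->", as_hms(estimated_seconds(151978, 1048576, ms)))
```

At 3.31 ms/mul this gives about 16.6 hours for 151978^1048576+1, close to the 16:38:44 estimate. The 22:13:52 estimate after the change to SHIFT=5 corresponds to roughly 4.4 ms/mul.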
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
There's a few things there which are unexpected (if not completely bizarre).
SHIFT=5 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=6 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=7 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=8 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=9 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=10 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
That's from the first pass through. There's just no way at all you should be seeing times that close together. Especially at SHIFT=5, the times should be much higher. For whatever reason, those results are wrong.
Then there's this from your log:
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5 to genefer.cfg.
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
Your GPU seems to have disappeared during the second run. That comes straight from the video driver.
And you said this came from your genefer.cfg file:
AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7
AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5
That's in the opposite order that the log is in, which isn't possible -- but it makes more sense, actually. I'm guessing you posted the log pieces in the reverse order, right?
If there's no GPU, the driver works in emulation mode, running on the CPU. Genefer is supposed to detect this and refuse to run, but it looks like that didn't happen for whatever reason this time. Because it's running on the CPU, SHIFT has no effect, which could explain why the times are nearly identical. However, the times are way too fast for a CPU.
Something very unusual is happening here, and it's not obvious what the cause is. This goes beyond the overclocking (which will likely prevent you from computing a correct result, even if the enhanced error recovery does let the program eventually finish. You may get a valid result, but it's not likely.)
It might be helpful if you posted the complete, unabridged log. Perhaps there's something useful in the lines that were redacted. It's a bug that it's as verbose as it is, but in this situation it might be useful.
____________
My lucky number is 75898^524288+1 | |
|
|
I'm guessing you posted the log pieces in the reverse order, right?
You are right.
If there's no GPU, the driver works in emulation mode, running on the CPU.
You are right twice. I noticed that one of the genefer tasks (I have 2 identical GPUs) uses 1 CPU core while the other task uses ~0.05, as expected.
And, what's interesting, both GPUs fully loaded at ~99%:
Here's the complete log for the task with the anomalous behavior:
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
Generalized Fermat Number Bench 2
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=5 151978^1048576+1 Time: 4.76 ms/mul. Err: 1.66e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=6 151978^1048576+1 Time: 3.57 ms/mul. Err: 1.86e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=7 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=8 151978^1048576+1 Time: 3.34 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=9 151978^1048576+1 Time: 3.61 ms/mul. Err: 1.66e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=10 151978^1048576+1 Time: 5.24 ms/mul. Err: 1.86e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Starting initialization...
maxErr during b^N initialization = 0.0000 (12.389 seconds).
Testing 151978^1048576+1...
Estimated total run time for 151978^1048576+1 is 16:36:02
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (17633424 iterations left)
Estimated total run time for 151978^1048576+1 is 16:36:20
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (17611883 iterations left)
Estimated total run time for 151978^1048576+1 is 16:36:38
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (16277046 iterations left)
Estimated total run time for 151978^1048576+1 is 16:36:20
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (16112905 iterations left)
Estimated total run time for 151978^1048576+1 is 16:35:44
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (15991492 iterations left)
Estimated total run time for 151978^1048576+1 is 16:37:32
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (15835518 iterations left)
Estimated total run time for 151978^1048576+1 is 16:36:56
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (15813613 iterations left)
Estimated total run time for 151978^1048576+1 is 16:36:38
Terminating because BOINC client requested that we should quit.
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Resuming 151978^1048576+1 from a checkpoint (14418095 iterations left)
Estimated total run time for 151978^1048576+1 is 16:38:44
maxErr exceeded for 151978^1048576+1, 0.5000 > 0.4500
MaxErr exceeded may be caused by overclocking, overheated GPUs and other transient errors.
Waiting 10 minutes before attempting to continue from last checkpoint...
Generalized Fermat Number Bench 2
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=5 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=6 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=7 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=8 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=9 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=10 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5 to genefer.cfg.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=5
Resuming 151978^1048576+1 from a checkpoint (13238271 iterations left)
Estimated total run time for 151978^1048576+1 is 22:13:52
____________
| |
|
|
Pay attention:
GeForce GTX 460(0) in GPU Monitor corresponds to --device 1
GeForce GTX 460(1) in GPU Monitor corresponds to --device 0
I'm used to this confusion.
____________
| |
|
|
I have one possibly stupid guess: that the bench doesn't use the --device parameter and always runs on --device 0.
I came to this conclusion by looking at the bench log.
The bench knows which device it is testing:
Generalized Fermat Number Bench 2
GPU=GeForce GTX 460
but it wrote data about an unknown device to genefer.cfg:
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
I have one possibly stupid guess: that the bench doesn't use the --device parameter and always runs on --device 0.
I came to this conclusion by looking at the bench log.
The bench knows which device it is testing:
Generalized Fermat Number Bench 2
GPU=GeForce GTX 460
but it wrote data about an unknown device to genefer.cfg:
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
Good guess, but probably not correct. I don't have a dual-GPU system, but you can run genefercuda from a command line and easily see on your GPU meters which GPU it's running the test on.
Here's the bizarre part:
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=10 151978^1048576+1 Time: 5.24 ms/mul. Err: 1.86e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
The code that prints the GPU information saves the GPU type and speed in order to put that information in the shift cache -- but somehow it got lost here. I need to dig into this a bit further.
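To make the shift cache entry easier to read, here is a minimal Python sketch that splits one of the AUTOSHIFT lines from the logs into fields. The field names are assumptions inferred from the log output (app name, version, device number, GPU name, clock, b, N), not taken from the Genefer source, and `parse_autoshift` and `looks_broken` are hypothetical helpers.

```python
from typing import NamedTuple


class ShiftCacheEntry(NamedTuple):
    # Field names are guesses based on the log lines, not on Genefer's code.
    app: str
    version: str
    device: int
    gpu_name: str
    clock_mhz: int
    b: int
    n: int
    shift: int


def parse_autoshift(line: str) -> ShiftCacheEntry:
    """Split 'AUTOSHIFT|app|version|device|gpu|clock|b|N=shift' into fields."""
    key, _, shift = line.strip().rpartition("=")
    tag, app, version, device, gpu_name, clock, b, n = key.split("|")
    if tag != "AUTOSHIFT":
        raise ValueError("not an AUTOSHIFT entry: " + line)
    return ShiftCacheEntry(app, version, int(device), gpu_name,
                           int(clock), int(b), int(n), int(shift))


def looks_broken(entry: ShiftCacheEntry) -> bool:
    """True when the GPU name or clock is missing, as in the buggy entry."""
    return entry.gpu_name == "(null)" or entry.clock_mhz == 0


good = parse_autoshift(
    "AUTOSHIFT|genefercuda|3.1.2-0|1|GeForce GTX 460|1620|151978|1048576=5")
bad = parse_autoshift(
    "AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7")
print(good.gpu_name, good.clock_mhz, looks_broken(good))
print(bad.gpu_name, bad.clock_mhz, looks_broken(bad))
```

Seen this way, the two entries differ only in the GPU name and clock fields (and the chosen shift), which is the part that got lost.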
____________
My lucky number is 75898^524288+1 | |
|
|
I was able to reproduce this bug from the command line:
21:33:40 (6592): Can't open init data file - running in standalone mode
genefercuda 3.1.2-0 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: primegrid_genefer_3_1_2_0_2.04_windows_intelx86__cuda32_13.exe -boinc -q 151978^1048576+1 --device 1
Priority change succeeded.
Generalized Fermat Number Bench 2
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=5 151978^1048576+1 Time: 4.46 ms/mul. Err: 1.66e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=6 151978^1048576+1 Time: 3.58 ms/mul. Err: 1.86e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=7 151978^1048576+1 Time: 3.33 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=8 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=9 151978^1048576+1 Time: 3.62 ms/mul. Err: 1.66e-002 5433491 digits
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
SHIFT=10 151978^1048576+1 Time: 5.3 ms/mul. Err: 1.86e-002 5433491 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|1|(null)|0|151978|1048576=7 to genefer.cfg.
GPU=GeForce GTX 460
Global memory=1073545216 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Starting initialization...
maxErr during b^N initialization = 0.0000 (10.643 seconds).
Testing 151978^1048576+1...
Estimated total run time for 151978^1048576+1 is 16:41:27
But you are right, the bench and the test run on the same device.
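As a rough illustration of what the bench output suggests, here is a Python sketch that reads the SHIFT timing lines and picks the fastest one. Choosing the lowest ms/mul, with ties going to the lowest SHIFT, is an assumption that happens to agree with both logs in this thread; it is not confirmed from the Genefer source, and `best_shift` is a hypothetical helper.

```python
import re

# Matches lines like:
# SHIFT=7 151978^1048576+1 Time: 3.33 ms/mul. Err: 1.76e-002 5433491 digits
BENCH_LINE = re.compile(r"SHIFT=(\d+)\s+\S+\s+Time:\s+([\d.]+)\s+ms/mul")


def best_shift(log_text: str) -> int:
    """Return the SHIFT with the lowest ms/mul; the first one wins a tie."""
    timings = [(int(m.group(1)), float(m.group(2)))
               for m in BENCH_LINE.finditer(log_text)]
    if not timings:
        raise ValueError("no SHIFT timing lines found")
    # min() keeps the first minimal element, so ties go to the lowest SHIFT
    # as long as the log lists shifts in increasing order.
    return min(timings, key=lambda pair: pair[1])[0]


command_line_log = """
SHIFT=5 151978^1048576+1 Time: 4.46 ms/mul. Err: 1.66e-002 5433491 digits
SHIFT=6 151978^1048576+1 Time: 3.58 ms/mul. Err: 1.86e-002 5433491 digits
SHIFT=7 151978^1048576+1 Time: 3.33 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=8 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=9 151978^1048576+1 Time: 3.62 ms/mul. Err: 1.66e-002 5433491 digits
SHIFT=10 151978^1048576+1 Time: 5.3 ms/mul. Err: 1.86e-002 5433491 digits
"""

boinc_log = """
SHIFT=5 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=6 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=7 151978^1048576+1 Time: 3.35 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=8 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=9 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
SHIFT=10 151978^1048576+1 Time: 3.31 ms/mul. Err: 1.76e-002 5433491 digits
"""

print(best_shift(command_line_log))
print(best_shift(boinc_log))
```

The first sample uses the timings from the command-line run, where 7 was saved; the second uses the earlier BOINC run, where the timings were nearly flat and 5 was saved.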
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
I can reproduce that error, so it will definitely get fixed.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
The links in the first post have been updated for the windows-cuda build of 3.1.2-1, which has some more improvements in the error handling. Linux and Mac builds should be updated in the near future.
This change only applies to CUDA so the CPU builds won't be changing.
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
____________
My lucky number is 75898^524288+1 | |
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,414,212,272 RAC: 2,794,970
|
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
Any ETA on PRPNet version (CPU/AVX support perhaps merged into single CPU version)?
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
Any ETA on PRPNet version (CPU/AVX support perhaps merged into single CPU version)?
There's no ETA yet, sorry.
____________
My lucky number is 75898^524288+1 | |
|
|
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
When is this planned? I am trying to decide if I can wait, or should go ahead and mess with the anonymous platform. Thanks!
____________
Reno, NV
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
When is this planned? I am trying to decide if I can wait, or should go ahead and mess with the anonymous platform. Thanks!
That's a really good question, and I wish I had a really good answer for you.
For purely technical reasons, I need to install new application versions of both the GFN and PPS-sieve applications when I install the new server code, even if the "new" version is identical to the existing version. There's also good reason to keep the number of application versions as low as possible, so I'm trying to avoid installing new application versions because of the new server software, and then a few days later installing yet another new version of Genefer. I want to install the new Genefer version at the same time I install the new server software so I'm installing just one set of new Genefer apps instead of two.
The new Genefer software is written and ready to go, at least on Windows, and the new server software is ready to go, but on my hardware I can only test Windows CUDA apps. I can't test Linux CUDA apps, and I can't build or test anything on a Mac. Other people need to build and test those versions. When I have all the builds available, I'll be installing everything.
The problem is that people have real lives and aren't always available, so I'm not sure how long it will take, although if I had to guess I'd say this will happen within the next few days.
If you're having trouble with errors and you're comfortable with app_info, I'd go ahead and use it. The newer code is more tolerant of errors, so it may save a workunit or two that would otherwise generate an error. If you're not having a problem with errors, or app_info is something you'd rather not bother with, chances are it won't be long until the new apps are in.
For what it's worth, I'm running with the new code with app_info right now -- but I'm also intentionally running with some drivers that are particularly problematic and causing a lot of errors.
____________
My lucky number is 75898^524288+1 | |
|
Roger Volunteer developer Volunteer tester
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
|
500000^1048576+1 on AMD X6 1100T @ 3493MHz
genefersse3 3.1.2-0 (Windows 64-bit SSE3)
Initialization complete (3.938 seconds)
Estimated total run time for 500000^1048576+1 is 206:31:26
genefx64 3.1.2-0 (Windows 64-bit SSE2)
Initialization complete (19.428 seconds)
Estimated total run time for 500000^1048576+1 is 168:17:58
genefer 3.1.2-0 (Windows 32-bit Default)
Initialization complete (6.657 seconds)
Estimated total run time for 500000^1048576+1 is 1176:59:16
genefer80 3.1.2-0 (Windows 32-bit x87-80)
Initialization complete (24.738 seconds)
Estimated total run time for 500000^1048576+1 is 701:21:12
It's unexpected to me that SSE3 is slower than SSE2 and that 32 is slower than 80.
A combined GFN application is going to have to be smarter than just going down the list and seeing what works first.
I am happy to run more tests if it helps.
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
Just so we're clear, what were the conditions of those tests? Was it one instance of Genefer with the other cores idle, or was it 6 instances running on all the cores?
____________
My lucky number is 75898^524288+1 | |
|
|
FWIW, I tried out the app_info.xml detailed in this post. It seems *way* more wordy than most app_info.xml files I have used. But I tried it anyway, and BOINC just kept telling me that there were no tasks available for the projects I selected. Yes, I had all projects selected, both CPU and GPU in all cases. I still could not get any genefer tasks for either regular or world record tasks.
____________
Reno, NV
| |
|
Roger Volunteer developer Volunteer tester
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
|
I did the tests on Windows 8 - 64 bit with only one instance of Genefer running at a time and no other programs.
I ran it a few times to be sure, only small variations found.
I did some more tests at currently active search levels, where within b-limits:
6462696^32768+1: Genefer80: 00:32:05
3030596^65536+1: Genefer80: 02:14:12
1015824^262144+1: GeneferSSE3: 11:07:38, Genefer32: 70:07:37, Genefer80: 40:09:15
741302^524288+1: GeneferSSE3: 45:58:37, Genefer32: 281:26:46, Genefer80: 169:57:43
142972^1048576+1: GeneferSSE3: 194:08:34, Genefx64: 155:57:31
9928^4194304+1: GeneferSSE3: 2762:18:45, Genefx64: 2782:36:29
You can see that in the last test SSE3 becomes the quickest.
The AVX version crashes because AVX is not supported by my CPU. It would be nicer to exit with an error code.
____________
| |
|
Yves Gallot Volunteer developer Project scientist
Joined: 19 Aug 12 Posts: 843 ID: 164101 Credit: 306,539,995 RAC: 5,437
|
GeneferSSE3 and GeneferAVX performance depends on L1 data cache associativity:
8-way => :-)
4-way => :-|
2-way => :-(
Intel i7/i5/i3, Core2 and Core L1 data cache: 8-way set associative.
AMD family 15 (Bulldozer & Piledriver) L1 data cache : 4-way set associative.
AMD K8 and K10 L1 data cache : 2-way set associative.
The Phenom II X6 is a K10, so Genefx64 (which was optimized for Athlon and P4 processors) is faster.
On AMD FX-Series, Athlon X2 340 or X4 740, GeneferAVX should be the fastest. | |
|
Yves Gallot Volunteer developer Project scientist
Joined: 19 Aug 12 Posts: 843 ID: 164101 Credit: 306,539,995 RAC: 5,437
|
A reply to "avx version has a problem when running 6 tasks on a 3770":
GeneferAVX 3.1.2 is based on a new transform which is faster for N >= 262144 and when multiple tasks are running.
So now 4 instances of GeneferAVX run faster than 4 instances of Genefx64 on a Core i7. | |
|
Roger Volunteer developer Volunteer tester
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
|
Thanks Yves,
K10 supports 4 SSE4 instructions and 4 AMD-only SSE4a instructions. Do you know if they would be any better for Genefer than SSE3? I suppose their performance would also depend on L1 cache associativity.
Great work!
Cheers,
Roger | |
|
|
After ~11 days of work (>90% done) all my tasks failed with the same error. I used genefersse3_linux. I never saw this error before. The deadline was set to April 13. What's wrong?
06-Apr-2013 13:04:11 [PrimeGrid] Aborting task genefer_1048576_205026_10: exceeded elapsed time limit 963810.21 (453217342.91G/470.24G)
06-Apr-2013 13:04:12 [PrimeGrid] Computation for task genefer_1048576_205026_10 finished
06-Apr-2013 13:04:12 [PrimeGrid] Output file genefer_1048576_205026_10_0 for task genefer_1048576_205026_10 absent
06-Apr-2013 13:05:33 [PrimeGrid] Sending scheduler request: To fetch work.
06-Apr-2013 13:05:33 [PrimeGrid] Reporting 1 completed tasks, requesting new tasks for CPU
06-Apr-2013 13:05:34 [PrimeGrid] Scheduler request completed: got 1 new tasks
06-Apr-2013 13:05:36 [PrimeGrid] Starting task genefer_1048576_209927_1 using genefer version 7 in slot 3
06-Apr-2013 13:19:24 [PrimeGrid] Aborting task genefer_1048576_207291_4: exceeded elapsed time limit 971507.73 (456836986.55G/470.24G)
06-Apr-2013 13:19:25 [PrimeGrid] Computation for task genefer_1048576_207291_4 finished
06-Apr-2013 13:19:25 [PrimeGrid] Output file genefer_1048576_207291_4_0 for task genefer_1048576_207291_4 absent
06-Apr-2013 13:20:45 [PrimeGrid] Sending scheduler request: To fetch work.
06-Apr-2013 13:20:45 [PrimeGrid] Reporting 1 completed tasks, requesting new tasks for CPU
06-Apr-2013 13:20:46 [PrimeGrid] Scheduler request completed: got 1 new tasks
06-Apr-2013 13:20:48 [PrimeGrid] Starting task genefer_1048576_209907_2 using genefer version 7 in slot 0
06-Apr-2013 13:58:54 [PrimeGrid] Aborting task genefer_1048576_206215_2: exceeded elapsed time limit 968030.03 (455201651.95G/470.24G)
06-Apr-2013 13:58:55 [PrimeGrid] Computation for task genefer_1048576_206215_2 finished
06-Apr-2013 13:58:55 [PrimeGrid] Output file genefer_1048576_206215_2_0 for task genefer_1048576_206215_2 absent
06-Apr-2013 14:00:09 [PrimeGrid] Sending scheduler request: To fetch work.
06-Apr-2013 14:00:09 [PrimeGrid] Reporting 1 completed tasks, requesting new tasks for CPU
06-Apr-2013 14:00:11 [PrimeGrid] Scheduler request completed: got 1 new tasks
06-Apr-2013 14:00:13 [PrimeGrid] Starting task genefer_1048576_209906_3 using genefer version 7 in slot 1
06-Apr-2013 15:39:16 [PrimeGrid] Aborting task genefer_1048576_206207_3: exceeded elapsed time limit 968001.11 (455188049.39G/470.24G)
06-Apr-2013 15:39:17 [PrimeGrid] Computation for task genefer_1048576_206207_3 finished
06-Apr-2013 15:39:17 [PrimeGrid] Output file genefer_1048576_206207_3_0 for task genefer_1048576_206207_3 absent
06-Apr-2013 15:40:46 [PrimeGrid] Sending scheduler request: To fetch work.
06-Apr-2013 15:40:46 [PrimeGrid] Reporting 1 completed tasks, requesting new tasks for CPU
06-Apr-2013 15:40:47 [PrimeGrid] Scheduler request completed: got 1 new tasks
06-Apr-2013 15:40:50 [PrimeGrid] Starting task genefer_1048576_209932_1 using genefer version 7 in slot 2
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
After ~11 days of work (>90% done) all my tasks failed with the same error. I used genefersse3_linux. I never saw this error before. The deadline was set to April 13. What's wrong?
I've never seen that either, and I'm not sure what's wrong. I'll look into what exactly is going on.
I do have a suggestion for you: You're seeing run times of about 300 hours, which is fast enough to meet the deadline, but if I were running a Core i3 (or any CPU with hyperthreading) I'd run only half as many tasks so each task is running on a real core rather than a hyperthread. I'd rather get the tasks done twice as fast than do twice as many simultaneously. My much older Core2 Q6600 runs the same WUs in 188 hours:
genefersse3 3.1.2-0 (Windows 64-bit SSE3)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: genefersse3_windows -q 129096^1048576+1
Priority change succeeded.
Testing 129096^1048576+1...
Starting initialization...
Initialization complete (5.661 seconds).
Estimated total run time for 129096^1048576+1 is 188:38:26
Running fewer, shorter, tasks is not only more reliable, but also uses less cache memory and therefore should run even faster. With both LLR and Genefer, both of which are getting more and more optimized, running faster, and crunching bigger numbers, we're seeing the effects of running too many instances slowing down the programs because of cache misses.
I'm not sure if that would have had any impact on the error you experienced, but BOINC doesn't know anything about hyperthreads, so as far as it's concerned, Genefer running on a hyperthread is consuming double the CPU time of the same program running on a real core.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
Aborting task genefer_1048576_205026_10: exceeded elapsed time limit 963810.21 (453217342.91G/470.24G)
The BOINC client on your computer seems to think your CPU is cranking out about 470 GFLOP per second, which is the speed of a decent GPU. Your benchmarks show a more realistic number in the 2 GFLOP range.
Because of this, the BOINC client thinks you exceeded the GFLOP limit for the task, which is set to 15 times the expected total GFLOPs and never is a problem, except for now.
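The arithmetic behind that abort message can be sketched in a few lines of Python. This assumes the two numbers in parentheses are the task's GFLOP bound and the speed the client believes the app runs at, and that the limit is simply one divided by the other; that reading is inferred from the log line rather than from the BOINC source, and the 2 GFLOPS figure is only the ballpark mentioned above.

```python
def elapsed_limit_seconds(bound_gflop: float, speed_gflops: float) -> float:
    """Seconds allowed before the abort, read as bound divided by speed."""
    if speed_gflops <= 0:
        raise ValueError("speed must be positive")
    return bound_gflop / speed_gflops


BOUND_GFLOP = 453217342.91   # left-hand number in the log line
CLAIMED_GFLOPS = 470.24      # right-hand number in the log line
REALISTIC_GFLOPS = 2.0       # ballpark CPU speed, an assumption

claimed_limit = elapsed_limit_seconds(BOUND_GFLOP, CLAIMED_GFLOPS)
realistic_limit = elapsed_limit_seconds(BOUND_GFLOP, REALISTIC_GFLOPS)

print(f"limit at {CLAIMED_GFLOPS} GFLOPS: {claimed_limit / 86400:.1f} days")
print(f"limit at {REALISTIC_GFLOPS} GFLOPS: "
      f"{realistic_limit / 86400 / 365:.1f} years")
```

With the 470.24G figure the limit comes out at roughly 11 days, which lines up with the tasks being aborted after about 11 days of work. With a speed near 2 GFLOPS the same bound would allow several years, so the limit would never have been reached.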
There are two possible causes I can think of. By far the most likely is that there's something in your app_info which is throwing the GFLOP counts off.
Could you post your entire app_info? (or send it to me via PM?)
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
Could you post your entire app_info? (or send it to me via PM?)
No need to send or post your app_info. I think I see what the problem is.
A few posts earlier, Zombie67 posted a link to *MY* post with a sample app_info, and the error is in there -- it's explicitly setting the run speed of the computer to 470 GFLOPS, even for the CPU.
I don't think that needs to be in there at all, so just remove all the <flops>xxxxxxx</flops> tags from your app_info.xml files and everything should work fine.
My apologies for causing this!
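For anyone who would rather not edit the file by hand, here is a minimal Python sketch that drops every <flops> element from an app_info document. `strip_flops` is a hypothetical helper, the sample XML is a shortened illustration rather than a complete app_info.xml, and it is worth keeping a backup of the real file before trying anything like this.

```python
import xml.etree.ElementTree as ET


def strip_flops(xml_text: str) -> str:
    """Return the XML with every <flops> element removed."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of all elements first so removals cannot disturb
    # the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "flops":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = """<app_info>
<app_version>
<app_name>genefer</app_name>
<file_ref>
<file_name>genefersse3_linux</file_name>
<main_program/>
</file_ref>
<flops>470235051047.497681</flops>
</app_version>
</app_info>"""

cleaned = strip_flops(sample)
print(cleaned)
```

To apply it to a real file, read the text of app_info.xml, pass it through `strip_flops`, and write the result back; the rest of the document is left as it was, apart from minor serialization differences such as how empty tags are written.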
____________
My lucky number is 75898^524288+1 | |
|
|
Michael, thanks for your help.
I've copied part of the app_info.xml file from http://www.primegrid.com/forum_thread.php?id=4889&nowrap=true#63045 and changed some params in it.
My full app_info.xml file is here:
<app_info>
<app>
<name>genefer</name>
<user_friendly_name>Genefer</user_friendly_name>
</app>
<file_info>
<name>genefersse3_linux</name>
<executable/>
</file_info>
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>genefersse3_linux</file_name>
<main_program/>
</file_ref>
<platform>linux_intelx86_64</platform>
<flops>470235051047.497681</flops>
</app_version>
</app_info>
Upd:
I don't think that needs to be in there at all, so just remove all the <flops>xxxxxxx</flops> tags from your app_info.xml files and everything should work fine.
I'll try this. | |
|
|
Alright, so I'd love to test out these apps, by which I mean the 32-bit CUDA app, because that's all I can use at the moment...
But I don't quite understand the SHIFT parameter, not having written C/CUDA ever - or indeed any GPGPU code - assuming this is part of that general syntactical family.
I recently completed 2 normal (short) Genefer tests via BOINC using the old 2.04 (cuda32_23) app. They both ran to completion and are pending validation (lots of errors for other would-be wingmen).
Weird thing is this is an overclocked GTX 460 running on a hyper-threaded single-core Pentium 4 Prescott 520 from 2004 under Windows XP. My highest-credit machine by far! In real life, it looks quite bizarre.
So I see this in stderr:
...stuff...
...stuff...
SHIFT=10 157280^1048576+1 Time: 5.34 ms/mul. Err: 1.76e-002 5449108 digits
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|0|(null)|0|157280|1048576=7 to genefer.cfg.
GPU=GeForce GTX 460
Global memory=1073283072 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.1
Clock=1620 MHz
# of MP=7
No project preference set; using AUTO-SHIFT=7
Starting initialization...
maxErr during b^N initialization = 0.0000 (21.032 seconds).
Testing 157280^1048576+1...
Estimated total run time for 157280^1048576+1 is 16:44:01
157280^1048576+1 is complete. (5449108 digits) (err = 0.0234) (time = 16:45:21) 15:56:25
15:56:25 (1508): called boinc_finish
</stderr_txt>
and I notice this specifically:
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|0|(null)|0|157280|1048576=7 to genefer.cfg.
How exactly are you accessing device parameters? In this case particularly, the canonical device name?
Granted, enumeration isn't the issue here (as it was in the previously discussed case) as my machine contains only GPU 0, i.e. a single GPU that the application and BOINC and everything else see as one GPU.
Apologies if this is way unrelated in reality, but I have had trouble with this kind of stuff before - especially on Windows - when trying to report hardware names i.e. CPU type, and at that especially when trying to print it to a Unicode string, even when the access function (written in Assembly) returns, or ought to return, a human-readable string.
Sorry if that's not at all what's going on here - I just know that 64-bit MSVC compiler has no _asm intrinsic, and this is a 32-bit application, so with my admittedly limited knowledge, I'd query device names with inline Assembly code provided it isn't too troublesome - I have been discouraged from doing so, but I was also writing C++ code, which cannot be compiled into a CUDA program (safely).
All this and I haven't even tested the 3.* apps yet! I definitely will, although I'm not expecting the 16-hour runtime to change for me anytime soon.
(somewhat unrelated: via PRPNet, if I recall correctly, the CUDA application cannot be used for m=32768 or m=65536 - is this still the case?)
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
Best SHIFT determined experimentally. Saving AUTOSHIFT|genefercuda|3.1.2-0|0|(null)|0|157280|1048576=7 to genefer.cfg.
How exactly are you accessing device parameters? In this case particularly, the canonical device name?
First of all, the part in red (the "(null)|0" fields) was a bug in that one version of the program. It's supposed to read "GTX 460|1350", i.e., the GPU model and clock speed.
The answer to your question is the Nvidia CUDA library provides API calls to get that information. No assembly language required.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
The links in the first post have been updated for the windows-cuda build of 3.1.2-1, which has some more improvements in the error handling. Linux and Mac builds should be updated in the near future.
This change only applies to CUDA so the CPU builds won't be changing.
3.1.2-1 (cuda) and 3.1.2-0 (cpu) will be the production app once the new server code is placed in production.
The Linux version of the 3.1.2-1 CUDA apps can now be downloaded from the links in the first post.
____________
My lucky number is 75898^524288+1 | |
|
|
for your attention, Mike:
http://www.primegrid.com/result.php?resultid=446656921
2 maxErr during 1 task execution and... and yet the task was validated.
I definitely like the new version more and more. )
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
for your attention, Mike:
http://www.primegrid.com/result.php?resultid=446656921
2 maxErr during 1 task execution and... and yet the task was validated.
I definitely like the new version more and more. )
Indeed. That's somewhat surprising, but that's great news.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
For Windows and Linux, 3.1.2-1 is now live in production, so if you're running that version with app_info you no longer need to use app_info. If you're running an older version with app_info, you really should either stop using app_info or run the latest version because of improved reliability due to better error handling.
A Mac version of 3.1.2-1 is planned and hopefully will be available soon.
____________
My lucky number is 75898^524288+1 | |
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,414,212,272 RAC: 2,794,970
|
Noticed new GFN app version.
BOINC went into panic mode since the run-time estimate was 450+ hours (for short GFN on a GTX 580) and past the deadline. It takes only 8.5 hours, so it will settle down with no real harm done.
btw, leading edge is ~162k.
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
|
BOINC went into panic mode since the run-time estimate was 450+ hours (for short GFN on a GTX 580) and past the deadline. It takes only 8.5 hours, so it will settle down with no real harm done.
That should be at least partially fixed now.
____________
My lucky number is 75898^524288+1 | |
|
|
avx
1: Estimated total run time for 49740^1048576+1 is 40:15:33
2: Estimated total run time for 123298^1048576+1 is 67:37:22
3: Estimated total run time for 131046^1048576+1 is 97:28:50
4: Estimated total run time for 135440^1048576+1 is 133:26:58
5: Estimated total run time for 138704^1048576+1 is 173:43:32
6: Estimated total run time for 138722^1048576+1 is 214:54:51
7: Estimated total run time for 128070^1048576+1 is 255:06:15
8: Estimated total run time for 138710^1048576+1 is 306:51:47
-------------------------
non-avx
1: Estimated total run time for 138728^1048576+1 is 72:34:19
2: Estimated total run time for 123466^1048576+1 is 77:46:28
3: Estimated total run time for 138608^1048576+1 is 92:16:41
4: Estimated total run time for 123926^1048576+1 is 101:38:09
5: Estimated total run time for 138092^1048576+1 is 115:17:29
6: Estimated total run time for 123582^1048576+1 is 124:21:16
7: Estimated total run time for 138732^1048576+1 is 143:44:47
8: Estimated total run time for 138734^1048576+1 is 159:58:18
Latest avx version:
1: Estimated total run time for 160114^1048576+1 is 40:46:08
2: Estimated total run time for 161920^1048576+1 is 46:56:29
3: Estimated total run time for 161826^1048576+1 is 64:10:35
4: Estimated total run time for 161906^1048576+1 is 82:47:18
5: Estimated total run time for 161914^1048576+1 is 105:16:44
6: Estimated total run time for 156336^1048576+1 is 129:30:41
7: Estimated total run time for 161636^1048576+1 is 155:39:38
8: Estimated total run time for 161278^1048576+1 is 186:36:18
| |
|
rroonnaalldd Volunteer developer Volunteer tester
 Send message
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Using different bases makes it hard to compare performance, because the base itself changes the running time. How should we evaluate the increase in computing time?
Could you please retest using only "138710^1048576+1" for avx and non-avx, on 1 through 8 cores, with both the old and the new app?
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
|
Could you please retest using only "138710^1048576+1" for avx and non-avx, on 1 through 8 cores, with both the old and the new app?
not easily as I'm getting the tasks using boinc manager but I'll re-run all 3 with whatever boinc throws at me on the weekend - I've just kicked off a bunch of SoB and Gen WR tasks so need to let them finish first. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
For Windows and Linux, 3.1.2-1 is now live in production, so if you're running that version with app_info you no longer need to use app_info. If you're running an older version with app_info, you really should either stop using app_info or run the latest version because of improved reliability due to better error handling.
A Mac version of 3.1.2-1 is planned and hopefully will be available soon.
The Mac version of 3.1.2-1 is now in production, as well as being available for download via the link in the first post.
IMPORTANT: The binaries linked to above were not properly updated last time (maybe the last two times), so if you're using any of the beta apps linked in the first post, either for PRPNet, for BOINC with app_info, or for anything else, please verify you actually have the correct version by running the binary from the command line with the -V option. Alternatively, you could re-download the binary to ensure you have the most recent version. Thank you.
____________
My lucky number is 75898524288+1 | |
|
|
I find that BOINC manager estimates CPU usage as very high for genefercuda 3.1.2-1 (Windows 32-bit CUDA).
In my case, it shows "0.919 CPUs + 1 NVIDIA GPU (device 0)",
which is no problem with a single GPU.
When I use multiple GPUs, the estimate exceeds 1 CPU.
As a result, one CPU core cannot be used for other CPU tasks.
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
I find that BOINC manager estimates CPU usage as very high for genefercuda 3.1.2-1 (Windows 32-bit CUDA).
In my case, it shows "0.919 CPUs + 1 NVIDIA GPU (device 0)",
which is no problem with a single GPU.
When I use multiple GPUs, the estimate exceeds 1 CPU.
As a result, one CPU core cannot be used for other CPU tasks.
We're aware of this problem. This is part of a larger problem involving time estimates with the new server. All I can say, at this point, is that we're investigating the cause and possible solutions.
This is really a problem related to the new server code, rather than the genefer app, so further discussion on this would be better off in this thread: http://www.primegrid.com/forum_thread.php?id=4966
____________
My lucky number is 75898524288+1 | |
|
|
New avx version: 3.1.2
1) Estimated total run time for 161278^1048576+1 is 40:28:52
2) Estimated total run time for 161278^1048576+1 is 46:41:20
3) Estimated total run time for 161278^1048576+1 is 64:46:05
4) Estimated total run time for 161278^1048576+1 is 83:42:49
5) Estimated total run time for 161278^1048576+1 is 108:04:52
6) Estimated total run time for 161278^1048576+1 is 133:14:05
7) Estimated total run time for 161278^1048576+1 is 165:56:10
8) Estimated total run time for 161278^1048576+1 is 172:32:13
old avx version: 3.1.1
1) Estimated total run time for 161278^1048576+1 is 48:48:37
2) Estimated total run time for 161278^1048576+1 is 71:17:54
3) Estimated total run time for 161278^1048576+1 is 102:29:53
4) Estimated total run time for 161278^1048576+1 is 135:59:09
5) Estimated total run time for 161278^1048576+1 is 175:40:52
6) Estimated total run time for 161278^1048576+1 is 216:33:20
7) Estimated total run time for 161278^1048576+1 is 263:28:53
8) Estimated total run time for 161278^1048576+1 is 352:03:45
current non-avx version:
1) Estimated total run time for 161278^1048576+1 is 81:49:45
2) Estimated total run time for 161278^1048576+1 is 88:16:26
3) Estimated total run time for 161278^1048576+1 is 104:23:16
4) Estimated total run time for 161278^1048576+1 is 117:35:21
5) Estimated total run time for 161278^1048576+1 is 131:49:07
6) Estimated total run time for 161278^1048576+1 is 149:25:45
7) Estimated total run time for 161278^1048576+1 is 161:55:31
8) Estimated total run time for 161278^1048576+1 is 177:15:11
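To compare these numbers it may help to turn each estimate into throughput. The Python sketch below is only an illustration: it assumes the number in front of each line is how many tasks were running at the same time, and that every one of those tasks takes the listed time. The helper names are made up for this example, and only the 1-task and 8-task lines are included.

```python
def parse_hms(text):
    """Convert an 'HHH:MM:SS' duration string to hours as a float."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def tasks_per_day(concurrent_tasks, run_time):
    """Throughput if `concurrent_tasks` tasks each need `run_time` to finish."""
    return concurrent_tasks * 24 / parse_hms(run_time)


# Estimated run times copied from the lists above (assumed: 1 and 8 tasks at once).
samples = {
    "avx 3.1.2": {1: "40:28:52", 8: "172:32:13"},
    "avx 3.1.1": {1: "48:48:37", 8: "352:03:45"},
    "non-avx": {1: "81:49:45", 8: "177:15:11"},
}

for build, times in samples.items():
    for count, run_time in sorted(times.items()):
        rate = tasks_per_day(count, run_time)
        print(f"{build}: {count} concurrent -> {rate:.2f} tasks/day")
```

Looking at tasks per day rather than time per task makes it easier to judge whether running more tasks at once is worth the longer individual run times.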
| |
|
|
I find that BOINC manager estimates CPU usage as very high for genefercuda 3.1.2-1 (Windows 32-bit CUDA).
In my case, it shows "0.919 CPUs + 1 NVIDIA GPU (device 0)",
which is no problem with a single GPU.
When I use multiple GPUs, the estimate exceeds 1 CPU.
As a result, one CPU core cannot be used for other CPU tasks.
We're aware of this problem. This is part of a larger problem involving time estimates with the new server. All I can say, at this point, is that we're investigating the cause and possible solutions.
This is really a problem related to the new server code, rather than the genefer app, so further discussion on this would be better off in this thread: http://www.primegrid.com/forum_thread.php?id=4966
A workaround is to create an app_config.xml file:
<app_config>
  <app>
    <name>genefer</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>.02</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
The app_config.xml file goes in the respective project folder, e.g. for PrimeGrid it goes into .\projects\www.primegrid.com
You need BOINC 7.0.40+ for app_config support.
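If you would rather generate the file than type it, something like the following Python sketch should produce equivalent XML. The function name build_app_config is hypothetical, and the values are simply the ones from the workaround above; adjust them for your own setup.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Return an <app_config> element with one <app> and its <gpu_versions>."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return root


# Values taken from the workaround above: 1 GPU and 0.02 CPUs per task.
config = build_app_config("genefer", 1, 0.02)
xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
# To install it, save xml_text as app_config.xml in the project folder.
```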
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
For those of you using app_info.xml for World Record tasks, there's a new Windows CUDA version of GeneferCUDA that fixes a single bug. Previously, GeneferCUDA was reading the BOINC shift override for the short tasks when running either short or world record tasks.
With this new version, GeneferCUDA will correctly read the WR preference when running WR tasks.
This new version for Windows is in production on BOINC, and can also be downloaded from the links in the first post if you're running with app_info.
Mac and Linux versions will be available (and put in production) in the near future.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
The Linux CUDA 3.2 and 5.0 builds of 3.1.2-2 are now available for download via the links in the first post and the 3.2 build is in production on BOINC.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
The Mac build of 3.1.2-2 is now available for download via the link in the first post and is in production on BOINC.
____________
My lucky number is 75898524288+1 | |
|
|
Hi. This is my first post here, and I wasn't completely sure where to post.
I run Boinc (including PrimeGrid) on one Linux machine: 3.5.0-28-generic #48~precise1-Ubuntu SMP Wed Apr 24 21:42:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux; Nvidia GeForce GTX 560 Ti, driver 304.88 (from precise-updates repository). No overclocking. Processor is "Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz".
I've installed boinc, boinc-client, boinc-manager and boinc-nvidia-cuda version 7.0.27+dfsg-5ubuntu0.12.04.1.
I'm having problems with Genefer WR. I get a computation error immediately. Genefer WR worked earlier, and this happened after I updated KDE (which may be coincidental). After that, Genefer showed "Waiting for shared memory" in boincmgr; I aborted that task and the computation errors started. PPS sieve does work on the GPU.
Relevant event log:
Wed 15 May 2013 21:30:57 EEST | PrimeGrid | Starting task genefer_wr_4194304_144099_31 using genefer_wr version 206 (cudaGFNWR) in slot 0
Wed 15 May 2013 21:30:58 EEST | PrimeGrid | Computation for task genefer_wr_4194304_144099_31 finished
Wed 15 May 2013 21:30:58 EEST | PrimeGrid | Output file genefer_wr_4194304_144099_31_0 for task genefer_wr_4194304_144099_31 absent
The application is version "primegrid_genefer_3_1_2_2_2.06_i686-pc-linux-gnu__cudaGFNWR". I have no app_info or app_config.
I don't really know how I should try to fix Genefer WR calculation so any help is appreciated. | |
|
Dave  Send message
Joined: 13 Feb 12 Posts: 3254 ID: 130544 Credit: 2,446,790,434 RAC: 4,239,105
                           
|
Are there any graphics driver crashes noted in the Linux-equivalent of eventvwr? Do you have Nvidia driver 310.xx or above? | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
The actual error you're getting is this:
Couldn't copy file: shmget() failed
That's either an error in the WU, or there's something wrong with your boinc installation.
Since results from that app version (v2.06 Linux GFN-WR) have been successfully returned by others, it's not the WU definition, so it's a configuration problem on your computer.
That error occurs when the boinc client is trying to copy files from the project directory into the slot directory. Typical reasons for that failure are that you're out of disk space or the permissions on the directory aren't allowing the file to be copied, but this time, based on the error message, it looks like a memory allocation failure. That would indicate that either something is consuming all the memory, or Linux is configured wrong.
But that assumes that the error message is accurate, and since this is boinc I wouldn't be certain of that.
This isn't an error that's specific to the GFN-WR tasks; however, the files that are copied into the slot directory are relatively large (over 30 MB), so perhaps that has something to do with it.
I've never seen that error pop up before, so perhaps someone else will have more useful advice.
____________
My lucky number is 75898524288+1 | |
|
|
Thank you, Michael.
After reading your post I realised I had changed something about configuration: I had been building a tile server for Openstreetmap using these instructions. There is a recommendation to change kernel.shmmax to a bigger value there, and after changing it back, Genefer is now working okay.
So it was a user error, not something dependent on PG or Boinc. | |
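For anyone who runs into the same shmget() failure, a quick way to see the current setting on Linux is to read it from /proc. This is a minimal sketch, assuming the usual /proc/sys/kernel/shmmax location; the helper name read_shmmax is made up, and the script does not tell you what value is right for your system.

```python
from pathlib import Path


def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Return the kernel.shmmax value in bytes, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


value = read_shmmax()
if value is None:
    print("kernel.shmmax is not readable on this system")
else:
    print(f"kernel.shmmax = {value} bytes ({value / 2**20:.1f} MiB)")
```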
|
|
It's been a long time since I stopped crunching GFN (and reading GFN threads on the forum), so please forgive me if this has already been answered.
Is the CPU production app for 64-bit Windows already using AVX, or is app_config still needed to use it? If so, can someone provide a standalone (i.e., no other sub-projects needed) app_config text?
Thanks in advance.
____________
676754^262144+1 is prime | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
It's been a long time since I stopped crunching GFN (and reading GFN threads on the forum), so please forgive me if this has already been answered.
Is the CPU production app for 64-bit Windows already using AVX, or is app_config still needed to use it? If so, can someone provide a standalone (i.e., no other sub-projects needed) app_config text?
Thanks in advance.
The production apps don't yet use AVX, so you'll have to use app_info.
____________
My lucky number is 75898524288+1 | |
|
|
Thanks Michael.
I'll try that as soon as I figure out how app_info works
____________
676754^262144+1 is prime | |
|
|
>WINDOWS:
>CUDA (aka geneferCUDA) (3.1.2-2, in production on BOINC as 2.06)
I tried running the above test app on a GTX TITAN, but the result was marked invalid when compared with the wingman's result.
It seems that a CUDA 5.0 or higher app is also necessary for the GTX TITAN.
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
>WINDOWS:
>CUDA (aka geneferCUDA) (3.1.2-2, in production on BOINC as 2.06)
I tried running the above test app on a GTX TITAN, but the result was marked invalid when compared with the wingman's result.
It seems that a CUDA 5.0 or higher app is also necessary for the GTX TITAN.
I'm certain that CUDA 5.0 is NOT required for GTX TITAN:
1) There are 110 successfully validated GFN-short results from a GTX TITAN currently in the database.
2) 109 of those 110 are using the stock app, which is not CUDA 5.0.
Your particular results show the typical "repeating-residue" result that you get from overclocking. The most likely cause of your validation failures is therefore overclocking or overheating.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
People running short GFN tasks on their AVX-capable CPUs should now receive the AVX version of Genefer instead of the SSE2 version of Genefer.
____________
My lucky number is 75898524288+1 | |
|
|
Hello.
I want to use AVX on my AMD FX 8120 CPU.
I read the posts, but I'm not sure whether this is the right text for the file.
What is the version_num?
api_version - is this the version of the BOINC client?
platform - my CPU is an AMD FX; what should the platform be?
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferAVX.exe</file_name>
<main_program/>
</file_ref>
<platform>windows_intelx86_64</platform>
</app_version>
My PC:
OS: windows 7 ultimate SP1 x64
boinc client - latest version
cpu - AMD FX 8120 | |
|
|
Now I'm trying with this:
<app_config>
<app>
<name>genefer</name>
<file_ref>
<file_name>geneferavx_windows.exe</file_name>
<main_program/>
</file_ref>
<platform>windows_x86_64</platform>
<cpu_versions>
<cpu_usage>1</cpu_usage>
</cpu_versions>
<max_concurrent>1</max_concurrent>
</app>
</app_config>
but it still doesn't download AVX work. Any help? | |
|
|
I've looked through this thread and I'm curious what these .exe files in the initial post actually do. I apologize if this is a rather newb level question (I've never run an app_info) or if it's been answered elsewhere (I did look around but couldn't find a clear answer).
I know my processor, i7-3770k, is AVX capable but from what I'm seeing Windows is not automatically detecting that. Do I need to download and install the AVX .exe link in order to take advantage of AVX on Genefer? And will this necessitate my running BOINC in a different fashion?
Thanks in advance for any help.
| |
|
|
I've looked through this thread and I'm curious what these .exe files in the initial post actually do. I apologize if this is a rather newb level question (I've never run an app_info) or if it's been answered elsewhere (I did look around but couldn't find a clear answer).
I know my processor, i7-3770k, is AVX capable but from what I'm seeing Windows is not automatically detecting that. Do I need to download and install the AVX .exe link in order to take advantage of AVX on Genefer? And will this necessitate my running BOINC in a different fashion?
Thanks in advance for any help.
Assuming you're using windows 7
You still need app_info.
Read this: http://www.primegrid.com/forum_thread.php?id=5179&nowrap=true#67856
My working app_info (copy the text to a Notepad file and save it as app_info.xml, then place it in the PrimeGrid folder, usually ProgramData/Boinc/projects/www.primegrid.com, together with the app downloaded from the first post of this thread):
<app_info>
<app>
<name>genefer</name>
<user_friendly_name>Genefer</user_friendly_name>
</app>
<file_info>
<name>geneferavx_windows.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferavx_windows.exe</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>
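One thing worth checking in a hand-written app_info.xml is that every file named in a <file_ref> also has a <file_info> entry. The sketch below is a hypothetical helper (it is not part of BOINC) that reads the XML text and lists any names that are referenced but not declared; it checks nothing else about the file.

```python
import xml.etree.ElementTree as ET


def undeclared_files(app_info_xml):
    """Return file names used in <file_ref> that have no matching <file_info>."""
    root = ET.fromstring(app_info_xml)
    declared = {node.findtext("name") for node in root.findall("file_info")}
    referenced = {ref.findtext("file_name") for ref in root.iter("file_ref")}
    # Ignore empty entries so that only real file names are reported.
    return sorted(name for name in referenced - declared if name)


# Trimmed copy of the app_info above, used here only as test input.
example = """
<app_info>
  <app><name>genefer</name></app>
  <file_info><name>geneferavx_windows.exe</name><executable/></file_info>
  <app_version>
    <app_name>genefer</app_name>
    <version_num>007</version_num>
    <file_ref><file_name>geneferavx_windows.exe</file_name><main_program/></file_ref>
  </app_version>
</app_info>
"""
print(undeclared_files(example))
```

An empty list means every referenced file is declared; it does not confirm that the .exe is actually present in the project folder.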
____________
676754^262144+1 is prime | |
|
|
I've looked through this thread and I'm curious what these .exe files in the initial post actually do. I apologize if this is a rather newb level question (I've never run an app_info) or if it's been answered elsewhere (I did look around but couldn't find a clear answer).
I know my processor, i7-3770k, is AVX capable but from what I'm seeing Windows is not automatically detecting that. Do I need to download and install the AVX .exe link in order to take advantage of AVX on Genefer? And will this necessitate my running BOINC in a different fashion?
Thanks in advance for any help.
Assuming you're using windows 7
You still need app_info.
Read this: http://www.primegrid.com/forum_thread.php?id=5179&nowrap=true#67856
My working app_info (copy the text to a Notepad file and save it as app_info.xml, then place it in the PrimeGrid folder, usually ProgramData/Boinc/projects/www.primegrid.com, together with the app downloaded from the first post of this thread):
<app_info>
<app>
<name>genefer</name>
<user_friendly_name>Genefer</user_friendly_name>
</app>
<file_info>
<name>geneferavx_windows.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>genefer</app_name>
<version_num>007</version_num>
<api_version>6.10.25</api_version>
<file_ref>
<file_name>geneferavx_windows.exe</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>
I tried this file, but my BOINC still does not recognize AVX on my FX 8120 CPU. I still can't download AVX work. I changed the api_version to 7.0.64 (because that is my BOINC client version) but still nothing. One time I downloaded 1 WU, but it was "Local: Genefer 007", not AVX. | |
|
|
I tried this file, but my BOINC still does not recognize AVX on my FX 8120 CPU. I still can't download AVX work. I changed the api_version to 7.0.64 (because that is my BOINC client version) but still nothing. One time I downloaded 1 WU, but it was "Local: Genefer 007", not AVX.
You must download the proper file from the first post of this thread.
BOINC will report "local 007" because that's the version number written in the app_info file, and it will make no reference to AVX. To check whether it is using the AVX app, look for the stderr.txt file in the slot folder (ProgramData/Boinc/slots/0, or 1 or 2, etc.). Copy stderr.txt to a different folder and see which app it is using. If you downloaded the proper .exe file you should see something like this:
Command line: projects/www.primegrid.com/geneferavx_windows.exe -boinc -q xxxx^1048576+1
This means you are using the AVX app.
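The same check can be scripted. Here is a rough sketch in Python, assuming stderr.txt contains a "Command line:" line like the one above; the function names are hypothetical, and the test simply looks for "avx" in the executable name.

```python
import re


def executable_from_stderr(stderr_text):
    """Return the program named on the 'Command line:' line, or None."""
    match = re.search(r"^Command line:\s*(\S+)", stderr_text, re.MULTILINE)
    if match is None:
        return None
    # Keep only the file name, whichever path separator was used.
    return re.split(r"[\\/]", match.group(1))[-1]


def uses_avx_build(stderr_text):
    """True when the executable name contains 'avx' (case-insensitive)."""
    name = executable_from_stderr(stderr_text)
    return name is not None and "avx" in name.lower()


# Sample line in the format shown above; replace it with the contents of
# your own copy of stderr.txt.
sample = (
    "Command line: projects/www.primegrid.com/geneferavx_windows.exe "
    "-boinc -q xxxx^1048576+1\n"
)
print(executable_from_stderr(sample), uses_avx_build(sample))
```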
____________
676754^262144+1 is prime | |
|
|
I tried this file, but my BOINC still does not recognize AVX on my FX 8120 CPU. I still can't download AVX work. I changed the api_version to 7.0.64 (because that is my BOINC client version) but still nothing. One time I downloaded 1 WU, but it was "Local: Genefer 007", not AVX.
You must download the proper file from the first post of this thread.
BOINC will report "local 007" because that's the version number written in the app_info file, and it will make no reference to AVX. To check whether it is using the AVX app, look for the stderr.txt file in the slot folder (ProgramData/Boinc/slots/0, or 1 or 2, etc.). Copy stderr.txt to a different folder and see which app it is using. If you downloaded the proper .exe file you should see something like this:
Command line: projects/www.primegrid.com/geneferavx_windows.exe -boinc -q xxxx^1048576+1
This means you are using the AVX app.
Thank you. I tried this and this is the result:
"geneferavx 3.1.2-0 (Windows 64-bit AVX)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/geneferavx_windows.exe -boinc -q 207988^1048576+1
Starting initialization...
Initialization complete (8.658 seconds).
Testing 207988^1048576+1...
BOINC client requested that we should suspend.
BOINC client requested that we should resume.
BOINC client requested that we should suspend.
BOINC client requested that we should resume.
BOINC client requested that we should suspend.
BOINC client requested that we should resume.
"
Is this OK?
| |
|
|
Your particular results show the typical "repeating-residue" result that you get from overclocking. The most likely cause of your validation failures is therefore overclocking or overheating.
I set the "GTX TITAN" clock to match the frequency of the "TESLA K20X".
The signs of overclocking or overheating disappeared, and the tasks now earn credit as expected.
Thank you for advice. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
I've put GeneferOCL into the beta download area. Feel free to test it and, if it works, use it with app_info. The first post in this thread has the download link.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
Minor update to the GeneferOCL beta app, version 3.1.2-2:
* -b3 benchmarks now run all the way to N=23.
* Benchmarks will produce correct times under Linux/Mac (when a build is available for those platforms).
It is NOT necessary to upgrade from the Windows GeneferOCL 3.1.2-1 to 3.1.2-2; however, if anyone builds a Linux or Mac version before we do, you should definitely use the 3.1.2-2 source code rather than 3.1.2-1 since the benchmarks will work correctly.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
IMPORTANT:
When turning app_info on or off, any tasks currently on your computer WILL BE LOST.
So don't turn app_info on or off when you're, for example, 90% done with an SoB or GFN-WR. You'll be a very unhappy camper.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
GeneferOCL 3.1.2-3 is now available for download. You DO want this new version if you're running GeneferOCL because it's faster!
It's got a new and improved faster transform.
It's now 32-bit rather than 64-bit, meaning it will run on more computers, and it's either the same speed as or faster than the 64-bit version, depending on your system.
-b3 GFLOP ratings now reflect actual GFLOPS rather than theoretical GFLOPS, which means they're lower numbers than in previous versions. Likewise, the "Genefer Mark" overall benchmark score printed at the end of the -b benchmarks is also lower than in previous versions.
We've decided not to be truly sadistic, so you can no longer use -d or --device to select your CPU to run GeneferOCL. It was painfully slow.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
So that people only have to look in one thread for GeneferOCL information, I'm going to move all the OpenCL-related posts to the other thread that's dedicated to only OpenCL.
Please continue the OpenCL discussion in this thread.
When new versions of GeneferOCL are available for testing, I'll announce them here, but discussion should occur in the other thread.
Thank you!
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14039 ID: 53948 Credit: 479,665,689 RAC: 431,032
                               
|
GeneferOCL 3.1.2-4 is now available for testing with app_info, and can be downloaded via the first post in this thread.
3.1.2-4 adds automatic adjusting of tuning parameters.
___________ |
|