Message boards :
Generalized Fermat Prime Search :
GCBW 1.07 BETA 1 -- testers needed
Author |
Message |
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 476,993,571 RAC: 281,271
                               
|
I have put together a beta release that attempts to fix the problem of stderr output getting truncated. It seems we're not the only ones with that problem.
For those who want to try it, you can download it here.
Please let me know if you have any further results with missing or truncated stderr text.
The only difference in 1.07 beta 1 is the fix for the stderr truncation problem, and the beta is to see if this fix works. There's no need to have a Linux build or to do a full release of 1.07 unless it appears that this fix actually corrects the problem.
____________
My lucky number is 75898^524288+1 | |
|
|
I am in
____________
| |
|
|
10 WUs finished with GeneferCUDA-boinc 1.07 beta 1:
http://www.primegrid.com/result.php?resultid=346315847
Testing b^262144+1...
679622^262144+1 is a probable composite. (RES=af415ee5d5f61a46) (1528894 digits) (err = 0.1875) (time = 1:22:35) 11:41:48
11:41:48 (5968): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346316163
Resuming b^262144+1 from a checkpoint (3342335 iterations left)
680576^262144+1 is a probable composite. (RES=e1ecf9b024430c8c) (1529054 digits) (err = 0.1875) (time = 1:22:09) 14:52:55
14:52:55 (4052): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346316760
Resuming b^262144+1 from a checkpoint (788596 iterations left)
682268^262144+1 is a probable composite. (RES=ef68b080b369c0bd) (1529337 digits) (err = 0.1875) (time = 1:22:00) 17:05:55
17:05:55 (3852): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346317237
Testing b^262144+1...
683752^262144+1 is a probable composite. (RES=ed7004340b3ec89e) (1529584 digits) (err = 0.2031) (time = 1:22:12) 18:28:17
18:28:17 (4808): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346317479
Testing b^262144+1...
684502^262144+1 is a probable composite. (RES=de079bc10503f720) (1529709 digits) (err = 0.1953) (time = 1:22:11) 19:50:31
19:50:31 (2368): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346731090
Resuming b^262144+1 from a checkpoint (360084 iterations left)
697686^262144+1 is a probable composite. (RES=b9fe645e35fea079) (1531881 digits) (err = 0.1875) (time = 1:22:44) 21:05:54
21:05:54 (3980): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=346804751
Resuming b^262144+1 from a checkpoint (2710975 iterations left)
674132^262144+1 is a probable composite. (RES=455be2fe574e2c4f) (1527971 digits) (err = 0.1875) (time = 1:22:33) 09:57:46
09:57:46 (5676): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=347125111
Testing b^262144+1...
685878^262144+1 is a probable composite. (RES=1f4e400b3d254430) (1529938 digits) (err = 0.1875) (time = 1:22:10) 22:20:02
22:20:02 (2724): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=347152337
Resuming b^262144+1 from a checkpoint (919300 iterations left)
686480^262144+1 is a probable composite. (RES=f1b098302f8a12ce) (1530037 digits) (err = 0.1875) (time = 1:22:14) 23:43:56
23:43:56 (5808): called boinc_finish
</stderr_txt>
]]>
http://www.primegrid.com/result.php?resultid=347732751
Resuming b^262144+1 from a checkpoint (3739106 iterations left)
697596^262144+1 is a probable composite. (RES=f3bfb9d10300f998) (1531866 digits) (err = 0.2031) (time = 1:27:56) 22:35:56
22:35:56 (3772): called boinc_finish
</stderr_txt>
]]>
As you can see,
4 of 10 are already validated,
6 of 10 were finished after resume
... and 10 of 10 are probable composite... ;`(
:)
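For anyone checking a batch of results by hand, a rough sketch of an automated check might look like the following. It assumes that a complete stderr log contains a result line of the form quoted above and a "called boinc_finish" line; the pattern is modeled only on the lines pasted in this post, and the function names are made up for illustration, not part of Genefer or BOINC.

```python
import re

# Pattern modeled on the result lines quoted in this thread (assumption:
# other Genefer versions may word or lay out this line differently).
RESULT_RE = re.compile(
    r"^(\d+)\^(\d+)\+1 is (.+?)\.(?: \(RES=([0-9a-f]{16})\))?",
    re.MULTILINE,
)


def summarize_stderr(text):
    """Extract the verdict and residue from a pasted stderr log, if present."""
    m = RESULT_RE.search(text)
    return {
        "has_result": m is not None,
        "verdict": m.group(3) if m else None,
        "residue": m.group(4) if m else None,
        # Every complete log quoted above ends with this marker.
        "finished": "called boinc_finish" in text,
    }


def looks_truncated(text):
    """Heuristic only: flag logs missing the result line or the finish marker."""
    s = summarize_stderr(text)
    return not (s["has_result"] and s["finished"])
```

Feeding it the text of each result page would flag any log where the result line or the finish marker is missing, which is the symptom the beta is meant to fix.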
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Looking good so far, thanks.
All of mine have finished without losing any stderr output as well.
EDIT: All of mine were composite, too. ;-)
____________
My lucky number is 75898^524288+1 | |
|
|
EDIT: All of mine were composite, too. ;-)
... probable composite ;-)
____________
| |
|
|
Actually, a very interesting question: would you continue to crunch if you knew that your number was a probable composite?
For example, my wingman has already finished:
http://www.primegrid.com/result.php?resultid=352610910
But he is using version 1.06, and it so happened that the log was truncated, so I don't know what the result is.
Michael Goetz, I think you need to hide the test result completely; the log should look the same for a probable composite and for a probable prime.
This will matter even more for the long "world record" tasks. No one wants to waste time on double-checking. Everyone wants to be a prime finder.
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Actually, a very interesting question: would you continue to crunch if you knew that your number was a probable composite?
For example, my wingman has already finished:
http://www.primegrid.com/result.php?resultid=352610910
But he is using version 1.06, and it so happened that the log was truncated, so I don't know what the result is.
Michael Goetz, I think you need to hide the test result completely; the log should look the same for a probable composite and for a probable prime.
This will matter even more for the long "world record" tasks. No one wants to waste time on double-checking. Everyone wants to be a prime finder.
If the admins would prefer it, I could remove that text. However, consider this:
You're 6 days into crunching and your wingman returns his result and it's composite. What do you do? If I abort my WU, I just threw away 6 days of GPU crunching. No credit. Also, I'm being a jerk, because I'm forcing someone else to do the crunching instead. Most people, in my experience, are pretty decent, and wouldn't do that.
But the last one is the kicker.
You may have noticed that lots of people have their GPUs overclocked and a lot of those fail on Genefer because of the overclocking.
Most of those failures result in a maxErr exceeded error, but not all do. Sometimes they complete, but fail on validation.
Your wingman reporting the composite might have suffered an error and be wrong. You could be killing the processing on a number that's actually a world record prime.
I will ask the admins what they would prefer.
____________
My lucky number is 75898^524288+1 | |
|
|
How many tasks have you seen that completed with a wrong result?
I have seen several of my own, but all of them had a trivial residue.
You're 6 days into crunching and your wingman returns his result and it's composite. What do you do? If I abort my WU, I just threw away 6 days of GPU crunching.
In the case where we both start almost simultaneously, you are right.
In fact, in subprojects with long-running tasks it often happens that one task completes but the wingman doesn't return, or even aborts, the task before the deadline, so a 3rd task is sent when the result of the 1st task is already known.
So, if I'm the 3rd, I can look at my "older brother's" log and decide not to even begin this task: abort it and receive a new one, fresher and still undecided in terms of prime / composite.
____________
| |
|
|
I would prefer removing this text and instead adding it to the work unit page, as is done, for example, for PPS LLR:
http://www.primegrid.com/workunit.php?wuid=254130819
Is prime? | 7449*2^775265+1 is not prime.
____________
| |
|
|
Back to topic. @michael: 4 days with 1.07 b1 and all stderr outputs seem to be OK. I have not found any faulty ones so far.
Regards Odi
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Back to topic. @michael: 4 days with 1.07 b1 and all stderr outputs seem to be OK. I have not found any faulty ones so far.
Regards Odi
Thanks, I'm pretty certain now that the 1.07 code does fix the truncation problem. Whenever the next release will be, that code will be in there. That release is also likely to include changes based upon the discussion regarding suppressing the "is composite" messages, as well as some changes to delay reporting some errors back to the server (in order to reduce rapid-fire WU errors.)
A substantial amount of regression testing is necessary on the code before it's released, however. Iain went and did a superb job of combining all the Genefer variants (Genefer, Genefer80, GeneferX64, and GeneferCUDA) into a single code base. So nearly everything needs to be tested before we let it go and break stuff.
____________
My lucky number is 75898^524288+1 | |
|
|
Iain went and did a superb job of combining all the Genefer variants (Genefer, Genefer80, GeneferX64, and GeneferCUDA) into a single code base.
Does that mean only one executable with all the former apps inside? Will it also be usable with PRPNet?
Regards Odi
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Iain went and did a superb job of combining all the Genefer variants (Genefer, Genefer80, GeneferX64, and GeneferCUDA) into a single code base.
Does that mean only one executable with all the former apps inside? Will it also be usable with PRPNet?
Regards Odi
No. Yes.
Second part first. The "BOINC" builds of any version of Genefer are also usable for PRPNet, and always have been. I see no reason why that would ever change. From a technical perspective, BOINC functionality is activated by including "-boinc" on the command line issued to any version of Genefer built with the new unified source code that was built with BOINC support. If "-boinc" is not included on the command line, then Genefer operates in non-boinc mode and is fully compatible with PRPNet.
First part: No, unfortunately we still need separate executables. You need different executables for Mac and Linux and Windows. While it's technically possible to create a single executable that would run Genefer, Genefer80, and GeneferCUDA, there's no easy way to combine x86 and x64 code together. Furthermore, combining the CUDA version with the CPU versions would mean that even CPU users would need to have the CUDA DLLs to run, and there's no reason to force that. Therefore, there will still be 4 separate executables for Genefer, Genefer80, GeneferX64, and GeneferCUDA. Combine the 4 variants with the three platforms and there will be 12 executables in all. (It's not necessary to have separate executables for BOINC and non-BOINC since, as mentioned above, the BOINC build serves both purposes.)
What this does mean, however, is that a lot of the same source code is now shared between the four variants, which makes their behavior more consistent and makes future maintenance easier.
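To make the arithmetic and the mode switch concrete, here is a small illustrative sketch. It is not Genefer's actual code: the variant and platform names come from this post, while the function names and the way the "-boinc" flag is checked are assumptions made purely for illustration.

```python
from itertools import product

# Names taken from the post above; the real build system may label them differently.
VARIANTS = ["Genefer", "Genefer80", "GeneferX64", "GeneferCUDA"]
PLATFORMS = ["Windows", "Linux", "Mac"]


def build_matrix():
    """Every (variant, platform) pair that would need its own executable."""
    return list(product(VARIANTS, PLATFORMS))


def is_boinc_mode(argv):
    """Hypothetical check: BOINC mode only when '-boinc' is on the command line.

    Without the flag, the same build would run in non-BOINC (PRPNet-compatible) mode.
    """
    return "-boinc" in argv


if __name__ == "__main__":
    print(len(build_matrix()), "executables in all")
```

The BOINC/non-BOINC distinction does not multiply the matrix because, as described above, it is a run-time switch rather than a separate build.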
____________
My lucky number is 75898^524288+1 | |
|
|
Thanks for the info. That's what I expected. I wondered whether it would be possible to merge these 4 apps.
Regards Odi
____________
| |
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,402,379,339 RAC: 2,512,564
                                      
|
Whenever the next release will be, that code will be in there. That release is also likely to include changes based upon the discussion...
Not big issues but could be of help if it is not time consuming to implement.
How about a -b3 switch? This would be a benchmark like -b2 (testing block sizes), but with a specific test as input, so that both b and N are known.
Example: geneferCUDA-boinc-windows.exe -b3 498^4194304+1
Output would be similar to the -b2 benchmark (ms/mul) and should also include the estimated time using the magic formula X * ((N * LOG(B) / LOG(2)) + 1)
This benchmark would give optimal block size AND real processing time of specific test.
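As a rough sketch of what that estimate amounts to, the formula can be written out as below. This assumes X is the benchmark's ms/mul figure and reads the formula with the "+ 1" inside the outer bracket; both are interpretations of the post, and the function names are invented for the example.

```python
import math


def estimated_runtime_seconds(ms_per_mul, b, n):
    """Estimate the run time of a test of b^n+1 from a benchmark figure.

    Assumes roughly n * log2(b) + 1 multiplications at ms_per_mul each,
    i.e. X * ((N * LOG(B) / LOG(2)) + 1) with X in milliseconds.
    """
    iterations = n * math.log(b) / math.log(2) + 1
    return ms_per_mul * iterations / 1000.0


def format_hms(seconds):
    """Format a duration as h:mm:ss, like the time field in Genefer's result line."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"
```

For the example above one would call estimated_runtime_seconds(x, 498, 4194304), with x being whatever ms/mul value the benchmark reports on the hardware in question.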
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
|
Example: geneferCUDA-boinc-windows.exe -b3 498^4194304+1
Output would be similar to the -b2 benchmark (ms/mul) and should also include the estimated time using the magic formula X * ((N * LOG(B) / LOG(2)) + 1)
This benchmark would give optimal block size AND real processing time of specific test.
I'm missing something; I don't see the purpose in doing that. What's the use case under which it would be useful?? (Understanding that you already have the tools to accurately predict run times, if that's what your goal is.)
Although the code is simple, this is a change that would need to be tested in up to 12 different versions of the program, by at least 3 or 4 different people. There needs to be a reason to do that.
Is it just a way to get an estimate of the run time of the current WU without double clicking on Excel? If that's what you want, it would actually be much easier to have the program ALWAYS output the estimated run time. In BOINC, you could just open up the stderr.txt file in the BOINC slot directory to see the value.
____________
My lucky number is 75898^524288+1 | |
|
Honza Volunteer moderator Volunteer tester Project scientist
|
I'm missing something; I don't see the purpose in doing that. What's the use case under which it would be useful?? (Understanding that you already have the tools to accurately predict run times, if that's what your goal is.)
It would be in one place, always accurate and up-to-date with your hardware setting, human readable, user friendly etc.
But I agree that we already have means to get such number so the time needed for implementing/testing is not worth it.
____________
My stats | |
|
Michael Goetz Volunteer moderator Project administrator
|
Whenever the next release will be, that code will be in there. That release is also likely to include changes based upon the discussion...
Not big issues but could be of help if it is not time consuming to implement.
How about a -b3 switch? This would be a benchmark like -b2 (testing block sizes), but with a specific test as input, so that both b and N are known.
Example: geneferCUDA-boinc-windows.exe -b3 498^4194304+1
Output would be similar to the -b2 benchmark (ms/mul) and should also include the estimated time using the magic formula X * ((N * LOG(B) / LOG(2)) + 1)
This benchmark would give optimal block size AND real processing time of specific test.
We're getting close to having a new release of GeneferCUDA. I'm incorporating something like this into that release.
There's no special switch for this functionality. At the beginning of ALL WUs, a message will be printed showing the estimated run time. (In BOINC this can be found by opening up the stderr.txt file in the appropriate slot directory.)
Furthermore, when run under PRPNet, the periodic progress messages will show the estimated remaining run time.
____________
My lucky number is 75898^524288+1 | |
|