stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
I'm happy to say that new software, LLR2, is installed on almost all big LLR projects.
LLR2 is a rewrite of LLR with amazing new features:
- It does not require a second task to verify the correctness of a result. Instead, a "proof of computation" is built and later validated. Validation of the proof takes just about 1% of the original task length.
- Since only one long test is required to find a prime, you will always be first. Any hardware, even old and slow, can find a prime. Prime hunting becomes more like a lottery - if you get a lucky ticket from the server, you win no matter whether you open it sooner or later.
- It can detect and correct computing errors. No more inconclusive tasks!
These new features have their price:
- New tasks will use more memory and disk space;
- The proof of computation must be uploaded to the server. The length of the proof, in bytes, is roughly equal to the "N" of the test, so a task with N=10M will need to upload about 10 MB of data. So:
- Be careful if you have a metered or paid connection;
- If the upload speed of your connection is too low (DSL or some cellular types), limit the upload speed in BOINC settings. This will not affect the speed of calculations (BOINC will start calculating the next task first, then slowly upload the results of the previous one).
- A Mac application is not ready yet. If you're using a Mac, either switch to another project (see list below) or run Linux in a virtual machine. The MacOS application on the 321 project requires MacOS 10.15 and above; applications on other projects can work on older MacOS versions. The 321 project will be updated to a compatible app when all of its old tasks are finished.
- It seems that the minimal supported MacOS version is 10.7.
Projects currently converted to LLR2:
321
CUL
ESP
PSP
SOB
TRP
WOO
GCW (2020-09-19)
SR5 (2020-09-19)
DIV (2020-09-25)
PPS-MEGA (2020-11-03)
PPS
Projects whose status is unknown:
PPSE (currently on LLR2, but may be reverted to LLR if the servers cannot handle the load)
Projects which will NOT be converted in the near future:
SGS |
|
|
|
Hi, I ran 12 small 321 tasks and got full pending credit (6,500+ credit).
Is this temporary, or a bug?
https://www.primegrid.com/workunit.php?wuid=674629315
https://www.primegrid.com/workunit.php?wuid=674629313
Thanks. |
|
|
|
Very exciting! This is an enormous step forward for PrimeGrid.
Many thanks to Pavel, Stream and everyone else involved with the implementation of LLR2 with Fast-DC! |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
Hi, I ran 12 small 321 tasks and got full pending credit (6,500+ credit).
Is this temporary, or a bug?
https://www.primegrid.com/workunit.php?wuid=674629315
https://www.primegrid.com/workunit.php?wuid=674629313
Thanks.
The pending credit probably hasn't been updated yet.
____________
My lucky number is 75898^524288+1 |
|
|
|
Ok, thanks.
After validation it is 51 or so. |
|
|
|
Fab news! You mentioned more RAM is needed - let's say you were getting a 64-core Threadripper and running 1 task per core (no hyper- or multi-threading, as there's no rush to return a result first now); how much RAM would you need running the 64 tasks? |
|
|
|
Project       b          n  depth     DC  Temporary  Upload  Download
---------  ----  ---------  -----  -----  ---------  ------  --------
321           2   16075875      7  1/128     245 MB   15 MB    1.9 MB
CUL           2   18310545      7  1/128     279 MB   20 MB    2.2 MB
ESP           2   14820654      7  1/128     226 MB   14 MB    1.8 MB
PSP           2   22513138      7  1/128     344 MB   21 MB    2.7 MB
SOB           2   32698611      7  1/128     499 MB   31 MB    3.9 MB
TRP           2   10796670      7  1/128     165 MB   10 MB    1.3 MB
WOO           2   18730248      7  1/128     286 MB   20 MB    2.2 MB
GCW         121    2022636      7  1/128     214 MB   15 MB    1.7 MB
DIV           2    6166918      7  1/128      94 MB    6 MB    0.7 MB
SR5           5    3173566      7  1/128     112 MB    7 MB    0.9 MB
PPS-MEGA      2    3337564      7  1/128      51 MB    3 MB    0.4 MB
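For other b and n you can estimate these columns yourself. A rough model, inferred from the rows above (an observation, not an official formula): one residue or checkpoint is about n*log2(b)/8 bytes, "Temporary" holds 2^depth checkpoints, "Upload" is depth+1 residue-sized products, and "Download" is a single residue. A small Python sketch:

from math import log2

def estimate_sizes_mib(b, n, depth=7):
    residue = n * log2(b) / 8          # approx. bytes per residue/checkpoint
    mib = 1024 ** 2
    return {"temporary": residue * 2 ** depth / mib,
            "upload": residue * (depth + 1) / mib,
            "download": residue / mib}

# SOB (b=2, n=32698611) -> ~499 MiB temporary, ~31 MiB upload, ~3.9 MiB download
print(estimate_sizes_mib(2, 32698611))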
|
|
|
|
Project       b          n  depth     DC  Temporary  Upload  Download
---------  ----  ---------  -----  -----  ---------  ------  --------
321           2   16075875      7  1/128     245 MB   15 MB    1.9 MB
CUL           2   18310545      7  1/128     279 MB   20 MB    2.2 MB
ESP           2   14820654      7  1/128     226 MB   14 MB    1.8 MB
PSP           2   22513138      7  1/128     344 MB   21 MB    2.7 MB
SOB           2   32698611      7  1/128     499 MB   31 MB    3.9 MB
TRP           2   10796670      7  1/128     165 MB   10 MB    1.3 MB
WOO           2   18730248      7  1/128     286 MB   20 MB    2.2 MB
GCW         121    2022636      7  1/128     214 MB   15 MB    1.7 MB
DIV           2    6166918      7  1/128      94 MB    6 MB    0.7 MB
SR5           5    3173566      7  1/128     112 MB    7 MB    0.9 MB
PPS-MEGA      2    3337564      7  1/128      51 MB    3 MB    0.4 MB
Cool!! GCW, DIV, SR5, and MEGA joined the list too? Nice!
____________
My lucky number is 6219*2^3374198+1
|
|
|
Nick  Send message
Joined: 11 Jul 11 Posts: 2298 ID: 105020 Credit: 8,356,519,392 RAC: 5,814,361
|
New tasks will use more memory
Does this change how we use FFT x 8 for tasks to run in cache? |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
Fab news! You mentioned more RAM is needed - let's say you were getting a 64-core Threadripper and running 1 task per core (no hyper- or multi-threading, as there's no rush to return a result first now); how much RAM would you need running the 64 tasks?
First of all, multi-threading must be used anyway. It will improve the overall efficiency of your PC because the internal CPU cache will be utilized better. You will finish more tasks per day with multi-threading.
As for RAM usage, some tasks may use up to 512 MB. Note that you don't need to have that much physical RAM - the RAM is used to cache some rarely used data, so having a large enough virtual memory (swap file) in your OS will be sufficient (the OS will automatically move this data to the swap file and back when necessary).
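For reference, per-task multi-threading is normally configured with an app_config.xml file in the project directory of your BOINC data folder. A minimal sketch; llrSOB is just an example app name, check your client's event log for the real names on your machine:

<app_config>
    <app_version>
        <app_name>llrSOB</app_name>
        <cmdline>-t 4</cmdline>      <!-- run 4 LLR threads per task -->
        <avg_ncpus>4</avg_ncpus>     <!-- tell BOINC each task uses 4 CPUs -->
    </app_version>
</app_config>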
|
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
New tasks will use more memory
Does this change how we use FFT x 8 for tasks to run in cache?
No. The main prime-searching part has not changed much. Everything we know about CPU cache is still valid. The extra RAM is required only to keep some rarely used intermediate data.
|
|
|
Crun-chi Volunteer tester
 Send message
Joined: 25 Nov 09 Posts: 3232 ID: 50683 Credit: 151,439,306 RAC: 133,570
|
Project       b          n  depth     DC  Temporary  Upload  Download
---------  ----  ---------  -----  -----  ---------  ------  --------
321           2   16075875      7  1/128     245 MB   15 MB    1.9 MB
CUL           2   18310545      7  1/128     279 MB   20 MB    2.2 MB
ESP           2   14820654      7  1/128     226 MB   14 MB    1.8 MB
PSP           2   22513138      7  1/128     344 MB   21 MB    2.7 MB
SOB           2   32698611      7  1/128     499 MB   31 MB    3.9 MB
TRP           2   10796670      7  1/128     165 MB   10 MB    1.3 MB
WOO           2   18730248      7  1/128     286 MB   20 MB    2.2 MB
GCW         121    2022636      7  1/128     214 MB   15 MB    1.7 MB
DIV           2    6166918      7  1/128      94 MB    6 MB    0.7 MB
SR5           5    3173566      7  1/128     112 MB    7 MB    0.9 MB
PPS-MEGA      2    3337564      7  1/128      51 MB    3 MB    0.4 MB
Cool!! GCW, DIV, SR5, and MEGA joined the list too? Nice!
It has not joined the list (at least not yet).
And yes, from this moment PG becomes fair: no matter whether you have the latest ultra-fast CPU or a slow one, if it works OK and you pick the right WU, the prime is yours!
Congratulations and thanks to all who made this happen!
I am waiting for SR5!
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
|
Cheers for the explanation on threading and RAM! |
|
|
Nita Send message
Joined: 14 Dec 18 Posts: 25 ID: 1085395 Credit: 725,616,636 RAC: 15,609
|
Great news. |
|
|
Nick  Send message
Joined: 11 Jul 11 Posts: 2298 ID: 105020 Credit: 8,356,519,392 RAC: 5,814,361
|
Cheers stream! |
|
|
Dave  Send message
Joined: 13 Feb 12 Posts: 3201 ID: 130544 Credit: 2,282,781,827 RAC: 1,025,870
|
This is fantastic, congratulations to everyone who has developed this. |
|
|
|
Greetings ... is the new LLR2 application going to be downloaded automatically?
I've done a reset and detach; however, the MT application for SoB shows version 8.10 instead of 9.00.
Should we wait?
Sincerely, Grzegorz Roman Granowski |
|
|
|
Fab news! You mentioned more RAM is needed - let's say you were getting a 64-core Threadripper and running 1 task per core (no hyper- or multi-threading, as there's no rush to return a result first now); how much RAM would you need running the 64 tasks?
First of all, multi-threading must be used anyway. It will improve the overall efficiency of your PC because the internal CPU cache will be utilized better. You will finish more tasks per day with multi-threading.
As for RAM usage, some tasks may use up to 512 MB. Note that you don't need to have that much physical RAM - the RAM is used to cache some rarely used data, so having a large enough virtual memory (swap file) in your OS will be sufficient (the OS will automatically move this data to the swap file and back when necessary).
Just to emphasise how bad an idea it is to run one task per core where multithreading is the better option, I've kicked off 16 SoB tasks on my 3950X.
If I run 2 tasks with 8 threads each, I can do roughly 2.5 per day.
After 15 minutes of running 16 tasks, the estimate was 18.5 days each.
In those 18.5 days the two-task setup would finish 18.5 * 2.5 ≈ 46 tasks versus 16, so you'll get roughly 1/3 the throughput by loading up all cores, and on a Threadripper it will likely be even worse.
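Replaying that arithmetic in Python (numbers from this post only, nothing official):

tasks_per_day_mt = 2.5                       # 2 tasks x 8 threads each
tasks_per_day_st = 16 / 18.5                 # 16 tasks, ~18.5 days each: ~0.86/day
print(tasks_per_day_st / tasks_per_day_mt)   # ~0.35, i.e. roughly 1/3 the throughput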
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
Greetings ... is the new LLR2 application going to be downloaded automatically?
I've done a reset and detach; however, the MT application for SoB shows version 8.10 instead of 9.00.
Should we wait?
Sincerely, Grzegorz Roman Granowski
Yes, it will be downloaded automatically, but there are still "old" workunits that use 8.04 in the queue that have to be sent out first.
With SoB, there are about 164 old 8.04 candidates ahead of the new 9.00 candidates.
As of right now, here's the status of each of those queues:
Woodall has about 518 old workunits in the queue.
Cullen has about 272 old workunits.
321 is sending out new 9.00 tasks.
PSP is sending out new 9.00 tasks.
SoB has about 164 old workunits.
TRP, of course, is sending out new 9.00 tasks.
ESP has about 501 old workunits.
Tasks aren't sent out exactly in order, so you may get some old tasks even when the new ones are being sent out, and you may get some new tasks even though there are still some old ones remaining.
____________
My lucky number is 75898^524288+1 |
|
|
|
Greetings ... is the new LLR2 application going to be downloaded automatically?
I've done a reset and detach; however, the MT application for SoB shows version 8.10 instead of 9.00.
Should we wait?
Sincerely, Grzegorz Roman Granowski
Yes, it will be downloaded automatically, but there are still "old" workunits that use 8.04 in the queue that have to be sent out first.
With SoB, there are about 164 old 8.04 candidates ahead of the new 9.00 candidates.
As of right now, here's the status of each of those queues:
Woodall has about 518 old workunits in the queue.
Cullen has about 272 old workunits.
321 is sending out new 9.00 tasks.
PSP is sending out new 9.00 tasks.
SoB has about 164 old workunits.
TRP, of course, is sending out new 9.00 tasks.
ESP has about 501 old workunits.
Tasks aren't sent out exactly in order, so you may get some old tasks even when the new ones are being sent out, and you may get some new tasks even though there are still some old ones remaining.
Dear Administrator, Mr. Goetz ... thank you for the information ... it is greatly appreciated ...
I am waiting for the new WUs. Sincerely, Grzegorz Roman Granowski ...
|
|
|
|
Yes, it will be doiwnloaded automatically, but there's still "old" workunits that use 8.04 in the queue that have to be sent out first.
With SoB, there's about 164 old 8.04 candidates ahead of the ne 9.00 candidates.
Mike is in a hurry....Typoes from Mike? Impossible!
HUGE THANKS TO ALL THAT MADE THIS HAPPEN!
Fair prime-searching AT LAST!
____________
My lucky number is 6219*2^3374198+1
|
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1771 ID: 37043 Credit: 783,805,084 RAC: 1,610,582
|
WOO HOO!!! A HUGE Thank You to everyone who made this happen!!! |
|
|
|
Just to emphasise how bad an idea it is to run one task per core where multithreading is the better option, I've kicked off 16 SoB tasks on my 3950X.
If I run 2 tasks with 8 threads each, I can do roughly 2.5 per day.
After 15 minutes of running 16 tasks, the estimate was 18.5 days each.
In those 18.5 days the two-task setup would finish 18.5 * 2.5 ≈ 46 tasks versus 16, so you'll get roughly 1/3 the throughput by loading up all cores, and on a Threadripper it will likely be even worse.
I am having trouble following your comment about Threadrippers. TRs have 128 MB or 256 MB of L3. That is way more L3 per core than any other non-server CPU. You should be able to run 1 task per core without coming close to filling the L3.
And if you are running 1 task per core, you don't need to worry about a task crossing CCDs.
What am I missing?
____________
Reno, NV
|
|
|
mackerel Volunteer tester
 Send message
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 358
|
I am having trouble following your comment about Threadrippers. TRs have 128 MB or 256 MB of L3. That is way more L3 per core than any other non-server CPU. You should be able to run 1 task per core without coming close to filling the L3.
I didn't get the claim either, but current Threadrippers don't have any more cache than their desktop counterparts at 16 MB/CCX. Per-core cache varies depending on how many cores are active per CCX, but at the maximum core configuration it is 4 MB/core. That's only enough for smaller tasks, maybe up to MEGA, without looking up actual current FFT sizes. |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
Woodall has about 518 old workunits in the queue.
I read somewhere that only Proth-type projects were getting LLR2 at the beginning, so this has been extended to -1 projects as well?
Why are PPS/PPSE and SGS excluded for now? Aren't they all using cllr anyway? Does it have to do with their smaller task size?
Prime hunting becomes more like a lottery - if you get a lucky ticket from the server, you win no matter whether you open it sooner or later.
Good hardware still has a bonus: you get more tickets per unit of time. The good thing is that, unlike before, it's not at the expense of slower computers!
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
Woodall has about 518 old workunits in the queue.
I read somewhere that only Proth-type projects were getting LLR2 at the beginning, so this has been extended to -1 projects as well?
Earlier, we installed the LLR2 executable on a few projects to use only one small part of LLR2's capabilities: error correction. It was a drop-in replacement, "an LLR with error correction". Error-correction mode is compatible with classic LLR only on Proth projects (on -1 projects the residues would be incompatible).
Now that LLR2 is working solo at its full power (fast double checking), we don't need residues compatible with classic LLR anymore, so LLR2 can be installed on all types of projects (+1 and -1) and any bases.
Why are PPS/PPSE and SGS excluded for now? Aren't they all using cllr anyway? Does it have to do with their smaller task size?
Yes. The high volume of these small tasks would generate too much "proof of computation" data, which would overload the server - these proofs would require too much network traffic to upload and too many CPU resources on the server to verify.
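A back-of-the-envelope illustration in Python, with made-up rates (both numbers are hypothetical and only meant to show the scale):

proof_bytes_per_task = 1_300_000   # proof size ~ N bytes; assume a small task with N ~ 1.3M
tasks_per_day = 50_000             # assumed daily throughput of a small-task project
print(proof_bytes_per_task * tasks_per_day / 1e9, "GB of proof uploads per day")  # ~65 GB/day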
|
|
|
|
Nice! This may attract a few crunchers to run conjecture projects more :) |
|
|
|
I am having trouble following your comment about Threadrippers. TRs have 128 MB or 256 MB of L3. That is way more L3 per core than any other non-server CPU. You should be able to run 1 task per core without coming close to filling the L3.
I didn't get the claim either, but current Threadrippers don't have any more cache than their desktop counterparts at 16 MB/CCX. Per-core cache varies depending on how many cores are active per CCX, but at the maximum core configuration it is 4 MB/core. That's only enough for smaller tasks, maybe up to MEGA, without looking up actual current FFT sizes.
Well, there's only one way to find out... waiting for you to report back. |
|
|
|
A Mac application is not ready yet. If you're using a Mac, either switch to another project (see list below) or run Linux in a virtual machine.
I'm running Macs; will I be sent LLR2 tasks that come back as errors, or will the server just not send me anything once all of the straggler LLR tasks are finished?
____________
Just a poor Mac user with no fancy GPUs
Tested GFN CPU transforms |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
A Mac application is not ready yet. If you're using a Mac, either switch to another project (see list below) or run Linux in a virtual machine.
I'm running Macs; will I be sent LLR2 tasks that come back as errors, or will the server just not send me anything once all of the straggler LLR tasks are finished?
It won't send you anything. However, we may have a Mac app "Real Soon Now"™. Actually, the 321 project has a Mac app right now. Why don't you go ahead and test it?
____________
My lucky number is 75898^524288+1 |
|
|
|
A Mac application is not ready yet. If you're using a Mac, either switch to another project (see list below) or run Linux in a virtual machine.
I'm running Macs; will I be sent LLR2 tasks that come back as errors, or will the server just not send me anything once all of the straggler LLR tasks are finished?
It won't send you anything. However, we may have a Mac app "Real Soon Now"™. Actually, the 321 project has a Mac app right now. Why don't you go ahead and test it?
321 (LLR) v9.00 (MT) seems to be working on my MacBook Pro with an i5, though it is running at a higher CPU % than 321 v8.04 (but that might just be me messing with my settings). My older iMac is only running the single-threaded LLRs (SGS, PPSE) and sieves, so it's not currently affected by the switch.
____________
Just a poor Mac user with no fancy GPUs
Tested GFN CPU transforms |
|
|
|
It won't send you anything. However, we may have a Mac app "Real Soon Now"™. Actually, the 321 project has a Mac app right now. Why don't you go ahead and test it?
Here is one:
https://www.primegrid.com/workunit.php?wuid=674487626
____________
Reno, NV
|
|
|
Tern Volunteer developer Volunteer tester
 Send message
Joined: 20 Sep 15 Posts: 31 ID: 421148 Credit: 465,933,839 RAC: 139,114
|
There are several Mac LLR2 321 tasks out running now - assuming they verify and nobody hits any problems, the Mac code should be good to roll out to the other subprojects. The actual LLR2 code was thoroughly tested on Mac; it was the wrapper that was a problem, and it JUST got finished yesterday. So it had, um... minimal testing... The nature of the wrapper, though, is that if it runs at all, it's probably good to go.
|
|
|
|
Greetings ...
Could I please have an update on how many tasks are left for the old application for SoB?
Sincerely, Grzegorz Roman Granowski ... ...
|
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
Could I please have an update on how many tasks are left for the old application for SoB?
+----------+----------+
| appid    | count(*) |
+----------+----------+
| WOO 3    |      513 |
| CUL 4    |      195 |
| SOB 13   |      128 |
| ESP 20   |      394 |
+----------+----------+
These are unsent tasks. So, 4 projects still have a significant number of old tasks ready.
Note that already-sent old tasks are not shown here. They may be aborted or time out, so a few resends of old tasks may eventually reappear in any project. Currently, all projects except TRP still have a lot of in-progress old tasks.
|
|
|
|
Could I please have an update on how many tasks are left for the old application for SoB?
+----------+----------+
| appid    | count(*) |
+----------+----------+
| WOO 3    |      513 |
| CUL 4    |      195 |
| SOB 13   |      128 |
| ESP 20   |      394 |
+----------+----------+
These are unsent tasks. So, 4 projects still have a significant number of old tasks ready.
Note that already-sent old tasks are not shown here. They may be aborted or time out, so a few resends of old tasks may eventually reappear in any project. Currently, all projects except TRP still have a lot of in-progress old tasks.
Thank you for the information!
Sincerely, Grzegorz Roman Granowski ...
|
|
|
|
Every single old task you're doing now brings the day of the complete transition closer. |
|
|
|
Having no luck on an i7 Mac mini, even after installing the latest BOINC and resetting the project.
https://www.primegrid.com/result.php?resultid=1129704408 |
|
|
|
Having no luck on an i7 Mac mini, even after installing the latest BOINC and resetting the project.
https://www.primegrid.com/result.php?resultid=1129704408
It looks like the LLR2 wrapper was built for Mac OS X 10.15, but the Mac you're using is running something older. |
|
|
|
That's unfortunate. Most of the Macs on PG are running something older, and many of those can never run 10.15 (Darwin 19 - Catalina). Unless a wrapper can be built that is compatible with older versions of MacOS than the very latest one, most Mac users will be cut out of this project. |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
Having no luck on an i7 Mac mini, even after installing the latest BOINC and resetting the project.
https://www.primegrid.com/result.php?resultid=1129704408
As already posted, it seems that the current minimum required version of Mac OS is 10.15. We also have tasks successfully completed on more recent versions of the OS - 12.xx and 13.xx. We're investigating this issue. We hope we can build it in a way that is compatible with somewhere around 10.5-10.7. The problem is in the LLR program itself; the wrapper is built in a compatible way and runs correctly.
|
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
A new Mac application, which should work on old versions of Mac OS down to at least 10.7, has been installed on the TRP project.
After we finish testing this version on TRP for both new and old OS versions, it will be installed on all remaining projects except 321. Unfortunately, it's not possible to update the application at 321 until all of its old workunits are finished. So, if you have an old Mac OS below 10.15, please switch to TRP and give the new application a try.
|
|
|
|
An important note for anyone who has an app setting affinity for these processes is that they're now called llr2_1.0.0_win64_200814.exe rather than primegrid_cllr.exe. |
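If you script your affinity settings, a hypothetical helper using the third-party psutil Python package might look like this (the process name is taken from the post above; adjust the CPU list and the version-specific name to your setup):

import psutil

TARGET = "llr2_1.0.0_win64_200814.exe"
for proc in psutil.process_iter(["name"]):
    if proc.info["name"] == TARGET:
        proc.cpu_affinity([0, 1, 2, 3])   # pin each LLR2 worker to the first four logical CPUs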
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1771 ID: 37043 Credit: 783,805,084 RAC: 1,610,582
|
An important note for anyone who has an app setting affinity for these processes is that they're now called llr2_1.0.0_win64_200814.exe rather than primegrid_cllr.exe.
UUUUH that's good to know!!! |
|
|
|
A new Mac application, which should work on old versions of Mac OS down to at least 10.7, has been installed on the TRP project.
After we finish testing this version on TRP for both new and old OS versions, it will be installed on all remaining projects except 321. Unfortunately, it's not possible to update the application at 321 until all of its old workunits are finished. So, if you have an old Mac OS below 10.15, please switch to TRP and give the new application a try.
TRP v9.00 seems to be working on my iMac running macOS 10.11 (Darwin 15.6).
____________
Just a poor Mac user with no fancy GPUs
Tested GFN CPU transforms |
|
|
|
Hi.
The new Mac application is doing fine on a 2008 MacBook running OS X 10.7.5.
If it doesn't crash within the next two and a half days and some hours, it should be fine.
Anyone running Snow Leopard?
I don't know if I have it installed anywhere at the moment, and I don't want to downgrade this machine.
____________
Greetings, Jens
147433824^131072+1 |
|
|
|
Greetings to the Administrators!
I wonder how many more old SoB tasks are left ...
looking forward to an answer ...
and assuring you of my continuous support ...
Grzegorz Roman Granowski ... ... |
|
|
|
Just saw this thread and looked at my 321 tasks. Same PC, and the times went from 10,264 seconds to 138.23 seconds running 4-core MT. Is that right? The units are marked valid. I also noticed a huge decrease in credit per unit.
If these are correct times, wow, that's an incredible feat, and well done to all!!!
Cheers, Rick
PS - this will make the TDP a totally different race next year |
|
|
|
Rick, that was one of the "certificate validation" (double-check) tasks. In theory, it should take 1/128 of a full task to complete. Those tasks are so fast that you need luck to actually witness one running in your client. |
|
|
|
Rick, that was one of the "certificate validation" (double-check) tasks. In theory, it should take 1/128 of a full task to complete. Those tasks are so fast that you need luck to actually witness one running in your client.
Guess I am not sure what certificate validation means... all of them are marked first, so I'm guessing they are "test" units. Either way, dang, it's fast... fast... fast... |
|
|
|
The task performing the primality test computes a proof, which is uploaded to the server. The proof is converted into a certificate. If the certificate is valid, then the test was performed correctly. Certificate validation still takes some time, so it is sent to a user as a separate task. By running those short tasks you were "double-checking" someone else's results. |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
all of them are marked first, so I'm guessing they are "test" units. Either way, dang, it's fast... fast... fast...
They are fast because they only check whether the real LLR test was correct. The actual LLR2 test takes a little longer AFAIK, but since we don't need two LLR tasks anymore, it's still way faster overall. You are always first because only one person does the error check.
So it's not a test unit but a validation. You should also get LLR tasks that take as long as usual.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
all of them are marked first, so I'm guessing they are "test" units. Either way, dang, it's fast... fast... fast...
They are fast because they only check whether the real LLR test was correct. The actual LLR2 test takes a little longer AFAIK, but since we don't need two LLR tasks anymore, it's still way faster overall. You are always first because only one person does the error check.
So it's not a test unit but a validation. You should also get LLR tasks that take as long as usual.
Great, and thanks to you and Pavel for the excellent explanation!
Cheers |
|
|
|
Here is one of the new first-check workunits (WU) "without competition". It takes about the same time as the old tasks:
granted credit: 6,604.84
minimum quorum: 1
replication: 1
Since there is only one task in the WU, you are always 1st. It will be rare to have unnoticed errors, because the application checks the sanity of the calculation as it runs and restarts from the last checkpoint if something looks bad. So even computers that glitch once in a while should be able to complete the task correctly (after a number of restarts from checkpoints). No "inconclusive" state can arise.
And here is one of the new check-certificate WUs:
granted credit: 51.60
minimum quorum: 1
replication: 1
This one should take about 1/128 of the time a first-check task takes. It only validates that the certificate produced by a first-check WU is correct. This prevents cheating. Again, there is only one task in the WU, so you are always 1st.
PS! These links will go dead after some days, when the concrete WUs I link to are purged from the server.
/JeppeSN |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
It takes about the same time as the old tasks
I don't remember where it was, but someone posted that they take about 1-1.5% longer.
Maybe this means SoB will be solved within our lifetimes... :D
Not to be demanding, but is it possible to implement this error-checking method in the genefer app?
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
Greetings to the Administrators!
I wonder how many more old SoB tasks are left ...
looking forward to an answer ...
and assuring you of my continuous support ...
Grzegorz Roman Granowski ... ...
Greetings ... I have some 9.00 SoB WUs downloaded ... so I guess there are no old 8.04 workunits left ...
Let's solve the Sierpiński Problem TOGETHER!!!
Sincerely, Grzegorz Roman Granowski .... |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
Greetings ... I have some 9.00 SoB WUs downloaded ... so I guess there are no old 8.04 workunits left ...
Indeed, as of right now, SoB, TRP, 321, and PSP are sending out the new 9.00 tasks.
Woodall still has 287 old workunits to go. Cullen still has 13 to go. Finally, ESP has 241 to go.
____________
My lucky number is 75898^524288+1 |
|
|
|
Greetings ... I have some 9.00 SoB WUs downloaded ... so I guess there are no old 8.04 workunits left ...
Indeed, as of right now, SoB, TRP, 321, and PSP are sending out the new 9.00 tasks.
Woodall still has 287 old workunits to go. Cullen still has 13 to go. Finally, ESP has 241 to go.
Dear Administrator, Mr. Goetz ... this is perfect news ...
we should shorten the time needed to solve the Sierpiński Problem ... by half ...
hopefully, the Sierpiński Problem is going to be solved ... during my lifetime; I am 38 as of now ...
or, if you please, could the Administrators make some predictions, based on internal statistics, of how much time crunchers need to solve the Sierpiński Problem?
Sincerely, Grzegorz Roman Granowski ... |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
Someone once posted a table with the probabilities of a certain n producing a prime for the remaining k's of the various Sierpinski subprojects. They were quite low for some of them.
Generally, the larger n gets, the less likely it is that the number is prime, simply because it is larger and larger... :D
Woodall still has 287 old workunits to go.
I'm currently doing Woodall on my faster CPU to finally get a gold badge, and at the current pace I might get an LLR2 task before I'm finished and move to 321.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
 Send message
Joined: 5 Feb 08 Posts: 1224 ID: 18646 Credit: 875,916,049 RAC: 316,022
|
... Finally, ESP has 241 to go.
I've been working on the old stuff since yesterday ;-)
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
|
|
|
|
This is great!
I will donate toward this achievement, but sorry for the small amount. |
|
|
KEP Send message
Joined: 10 Aug 05 Posts: 301 ID: 110 Credit: 12,352,853 RAC: 187
|
or, if you please, could the Administrators make some predictions, based on internal statistics, of how much time crunchers need to solve the Sierpiński Problem?
Hi :)
Since n=2,000,000 we have found a total of 7 primes. There were 12 k's remaining at n=2,000,000.
In the n-range 2M-4M we found 1 prime, removing 8.33% of the k's remaining.
In the n-range 4M-8M we found 3 primes, removing 27.27% of the k's remaining.
In the n-range 8M-16M we found 2 primes, removing 25.00% of the k's remaining.
In the n-range 16M-32M we found 1 prime, removing 16.66% of the k's remaining.
Over these 4 doublings of n, we have on average removed 19.318% of the candidates remaining.
Using this statistical average, it appears that n=4,096,000,000 will be the first doubling of n that brings us down to about 1 k remaining. So from a statistical point of view, we may actually be able to sieve the range needed to prove the SOB conjecture when sieving has to begin again, from n=50M to n=???
My i5-4670 at 3.x GHz takes about 2 days on an SOB workunit at n=32,000,000. This means that if I could do a test at n=4,096,000,000, I would need about 71,072 days, or roughly 194.6 years, on a single test. So yes, we can sieve the statistically needed test range, but testing that far is going to require several doublings in computation power. But hey, at least now you have a statistical range that you can use as a reference for when we can expect to have SOB proven :)
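If you want to replay the statistics yourself, here is a small Python sketch (a plain extrapolation of the averages above, starting from the 5 k's remaining at n=32M, i.e. 12 minus the 7 primes found; it is statistics, not a prediction):

remaining, n = 5, 32_000_000       # k's left at the n=32M leading edge
avg_removed = 0.19318              # average fraction of k's removed per doubling of n
while remaining >= 1:
    n *= 2
    remaining *= 1 - avg_removed
    print(f"n={n:>13,}: ~{remaining:.2f} k's expected to remain")
# the expectation drops to ~1 k somewhere in the n = 4-8 billion range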
Take care
____________
|
|
|
|
194 years? Hopefully we'll have quantum computers by then, probably sold by Elon Musk and implanted in the brain :) |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
Using this statistical average, it appears that n=4,096,000,000 will be the first doubling of n that brings us down to about 1 k remaining.
n = 4e9 means about 1 billion digits. Isn't there prize money offered for such a prime? On the other hand, the GIMPS guys will probably get there first... ;)
But I don't think it will take that long. Unfortunately, I cannot find the post I mentioned that gave the probabilities of finding a prime. On the other hand, it's all just statistics; who knows how long it will really take. The only thing that is for sure is that the probability of the leading-edge candidate being prime keeps decreasing as n grows. The remaining k's sort of missed their chance to produce a prime at lower numbers, where prime density is high, and now their chances are just awfully low. :D
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
Using this statistical average, it appears that n=4,096,000,000 will be the first doubling of n that brings us down to about 1 k remaining.
n = 4e9 means about 1 billion digits. Isn't there prize money offered for such a prime? On the other hand, the GIMPS guys will probably get there first... ;)
Yeah, that would be over 1 billion digits. You are right, there are prizes still not awarded for a 100-million-digit prime and for a 1-billion-digit prime. /JeppeSN |
|
|
|
Using this statistical average, it appears that n=4,096,000,000 will be the first doubling of n that brings us down to about 1 k remaining.
n = 4e9 means about 1 billion digits. Isn't there prize money offered for such a prime? On the other hand, the GIMPS guys will probably get there first... ;)
Yeah, that would be over 1 billion digits. You are right, there are prizes still not awarded for a 100-million-digit prime and for a 1-billion-digit prime. /JeppeSN
Would the prize for such a prime be awarded to the person who crunched the unit, or to the PrimeGrid project? Just curious..... |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
After some testing, it seems that the minimal supported MacOS version will be 10.7 (a task run on 10.6.8 failed due to an incompatible C++ library; if I remember correctly, Apple switched to the new C++ library in 10.7).
|
|
|
|
Would the prize for such a prime be awarded to the person who crunched the unit, or to the PrimeGrid project? Just curious.....
I do not think PrimeGrid has (or needs) a policy for that. We do not run any 100-million-digit candidates.
GIMPS, which has claimed the two previous prizes, does have a policy. The award will be split: some money will go to the organization GIMPS, and some money will go directly to the person who crunched that particular number.
/JeppeSN |
|
|
darkclown Volunteer tester Send message
Joined: 3 Oct 06 Posts: 331 ID: 3605 Credit: 1,441,652,181 RAC: 554,971
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
____________
My lucky #: 60133106^131072+1 (GFN 17-mega) |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
The state of DIV is currently being discussed. It's a big temptation to move it to LLR2 considering the upcoming challenge, which would receive a 2x boost. On the other hand, the challenge itself is a problem - the network bandwidth required to upload the checkpoints and the CPU resources to validate them in a reasonable time may overload our current infrastructure.
|
|
|
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
The state of DIV is currently being discussed. It's a big temptation to move it to LLR2 considering the upcoming challenge, which would receive a 2x boost. On the other hand, the challenge itself is a problem - the network bandwidth required to upload the checkpoints and the CPU resources to validate them in a reasonable time may overload our current infrastructure.
I suggest against ever moving it to LLR2.
____________
My lucky number is 6219*2^3374198+1
|
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1771 ID: 37043 Credit: 783,805,084 RAC: 1,610,582
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
The state of DIV is currently being discussed. It's a big temptation to move it to LLR2 considering the upcoming challenge, which would receive a 2x boost. On the other hand, the challenge itself is a problem - the network bandwidth required to upload the checkpoints and the CPU resources to validate them in a reasonable time may overload our current infrastructure.
One option would be to send out only the actual tasks during the challenge and then, after it's over, send out the checkpoint files. Since the checkpoint files are smaller, they will be done very quickly, and most people will get their credits for the challenge within a couple of days after it's over. There has always been "cleanup time" after a challenge anyway, so this would just be part of that. You could even put a 2- or 3-day time limit on the checkpoint tasks to ensure they get done ASAP. |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
The state of DIV is currently being discussed. It's a big temptation to move it to LLR2 considering the upcoming challenge, which would receive a 2x boost. On the other hand, the challenge itself is a problem - the network bandwidth required to upload the checkpoints and the CPU resources to validate them in a reasonable time may overload our current infrastructure.
One option would be to send out only the actual tasks during the challenge and then, after it's over, send out the checkpoint files. Since the checkpoint files are smaller, they will be done very quickly, and most people will get their credits for the challenge within a couple of days after it's over. There has always been "cleanup time" after a challenge anyway, so this would just be part of that. You could even put a 2- or 3-day time limit on the checkpoint tasks to ensure they get done ASAP.
That is the worst possible thing we could do.
The main LLR2 tasks generate large checkpoint files that must be sent to the server (that's the bandwidth problem) and then stored on the server until it processes them and creates the fast double-check tasks (that's the disk-space problem). If the creation of the double-check tasks is delayed - either because we do it intentionally or because the server falls behind - those files start filling up our disks.
The expected number of tasks processed during the DIV Cullen/Woodall challenge would consume over three times the amount of free disk space that exists on the server. If we don't process those files and create the double-check tasks promptly, the server dies. PrimeGrid would be completely shut down for an extended period while we clean up the fiasco.
Actually sending out the double-check tasks once they're created, and processing their results, has very little impact on the server.
____________
My lucky number is 75898^524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
There seems to be quite a bit of confusion as to how LLR2 fast double checking works. I'm going to describe what the process is, so you have a better understanding of what to expect, and also what the drawbacks are. I'm going to skip all the math parts that make this possible, because it's not necessary to understand the "how" in order to comprehend the "what".
OLD LLR process:
Identical tasks are sent to two computers, which run the full, long, computation. A very short result is sent back to the server, and the results from the two computers must match to be validated. We're all familiar with this paradigm.
NEW LLR2 process:
Just one full task is sent out. At the end of the computation, the same (or at least similar) short result is returned to the server.
There's an additional new step, however. At various times during the computation, LLR2 is recording checkpoints to disk. These are fairly large, and a substantial number of them are recorded. This takes up disk space on your computer, so you'll need somewhat more disk space for LLR2 than you did for LLR.
At the end of the computation, all of those checkpoints are compressed, and the compressed checkpoints are sent to the server along with normal short result. The checkpoints are then deleted from your computer, freeing up your disk space.
Sending the compressed checkpoints to the server uses a lot of bandwidth, and then the checkpoint files use a lot of disk space on the server. When we built these servers, our applications did not consume a lot of bandwidth and did not use a lot of disk space. The current servers aren't designed for LLR2's requirements, so we have to manage LLR2's rollout very carefully, or PrimeGrid will essentially die.
Once the short result and the compressed checkpoints are sent to the server, your computer tells the server that the task is completed. Your task then goes into the "pending validation" state.
The server now uses the compressed checkpoints from your full task to create the fast DC task. It decompresses the checkpoints, does a moderately lengthy FFT-style computation (single core takes a few tens of seconds for a TRP), and creates the fast DC task in a new workunit. The large checkpoint files are then deleted, freeing up disk space on the server. The new fast DC task then gets sent to a user, gets quickly processed (it's less than 1% of the main task), and sent back to the server. The server checks this result against information from the main result, and if correct, both are validated and get credit. The fast DC's credit is proportionally smaller than the credit for the main task, of course.
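For the mathematically curious: the proof appears to be built on a Pietrzak-style verification scheme, in which the saved checkpoints of the squaring chain are repeatedly folded in half using random challenges. A toy Python sketch of that folding idea (my own illustration, not PrimeGrid's actual code; the real thing operates on multi-megabyte gwnum residues):

import hashlib

def challenge(*values):
    # Fiat-Shamir: derive the folding challenge from the points themselves
    digest = hashlib.sha256("|".join(map(str, values)).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def compress(points, modulus):
    # points[i] = x^(2^(i*step)) mod modulus, a chain of 2^depth + 1 checkpoints.
    # Each fold halves the chain and emits its midpoint, so 129 checkpoints
    # compress down to 7 "products" - matching the "depth 7, DC 1/128" settings.
    products = []
    while len(points) > 2:
        half = (len(points) - 1) // 2
        r = challenge(points[0], points[-1], points[half])
        products.append(points[half])
        # merge the two half-chains into one chain of half the length
        points = [pow(points[i], r, modulus) * points[half + i] % modulus
                  for i in range(half + 1)]
    return products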
From the server's perspective, there are three problems here:
* The bandwidth used sending the compressed checkpoints from the main task up to the server.
* The CPU time consumed processing those checkpoints in order to create the fast DC tasks.
* The potential for exhausting all of the server's disk space if we fall behind in creating the fast DC tasks.
Smaller, more numerous tasks are a larger problem than a small number of larger tasks. This is why we're moving the big tasks to LLR2 but the small tasks, at least for now, are staying with the old style full double checking.
____________
My lucky number is 75898^524288+1 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,125,498,739 RAC: 2,248,736
|
On a side note - since DC tasks are much smaller, has a shorter deadline for those tasks been considered?
Not that it would solve the issue just mentioned, but while moving my resources from the already-converted TRP to the newly-converted SoB, it would be nice to have DC tasks come back sooner.
____________
My stats |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
On a side note - since DC tasks are much smaller, has a shorter deadline for those tasks been considered?
Yes, the deadline is truncated to 3 days if the project's standard deadline is more than 3 days.
This does not affect the server at all, but it allows faster validation of the main task.
|
|
|
|
The initial (large) tasks and the double-check (small) tasks all show up as 1st. Is/was this expected? |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
The initial (large) tasks and the double-check (small) tasks all show up as 1st. Is/was this expected?
Yes.
____________
My lucky number is 75898^524288+1 |
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1771 ID: 37043 Credit: 783,805,084 RAC: 1,610,582
|
Given that all of the work for DIV has been generated, would it make sense to remove, re-process, and move it to LLR2?
The state of DIV is currently being discussed. It's a big temptation to move it to LLR2 considering the upcoming challenge, which would receive a 2x boost. On the other hand, the challenge itself is a problem - the network bandwidth required to upload the checkpoints and the CPU resources to validate them in a reasonable time may overload our current infrastructure.
One option would be to send out only the actual tasks during the challenge and then, after it's over, send out the checkpoint files. Since the checkpoint files are smaller, they will be done very quickly, and most people will get their credits for the challenge within a couple of days after it's over. There has always been "cleanup time" after a challenge anyway, so this would just be part of that. You could even put a 2- or 3-day time limit on the checkpoint tasks to ensure they get done ASAP.
That is the worst possible thing we could do.
The main LLR2 tasks generate large checkpoint files that must be sent to the server (that's the bandwidth problem) and then stored on the server until it processes them and creates the fast double-check tasks (that's the disk-space problem). If the creation of the double-check tasks is delayed - either because we do it intentionally or because the server falls behind - those files start filling up our disks.
The expected number of tasks processed during the DIV challenge would consume over three times the amount of free disk space that exists on the server. If we don't process those files and create the double-check tasks promptly, the server dies. PrimeGrid would be completely shut down for an extended period while we clean up the fiasco.
Actually sending out the double-check tasks once they're created, and processing their results, has very little impact on the server.
I like your idea much better!! |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
All seven of the projects where LLR2 fast double checking is enabled should now be primarily sending out tasks using fast double checking. There will be some stragglers with the old LLR for a while, because of resends, but the old tasks have all now been sent out.
____________
My lucky number is 75898^524288+1 |
|
|
|
Just had my first LLR2 WOO task validated. I'm confident I was first because of the time the test took.
As the first test and the double check are now in two distinct workunits, is it possible to "e-meet" each other, or only when there is a prime and an official announcement? |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
As the first test and the double check are now in two distinct workunits, is it possible to "e-meet" each other, or only when there is a prime and an official announcement?
You can be your own double-checker. It's safe; no cheating is possible. It also helps people with large computing farms to concentrate on a single project with small participation. Earlier this was not possible; they stopped receiving tasks very quickly because they had no double-checkers. This problem has been eliminated.
But there is no easy way to know about such an "e-meet" except by scanning the stderr output of each task for the number being tested and comparing them.
As for prime reporting, I think only one person will be reported now (except for rare cases when a task has timed out or been resent and finally gets a second, excess result). Earlier, the double-checker did the same amount of work as the original finder; he was just less lucky and reported his find later. Now DC tasks are not only ~100 times smaller, they also have no idea about the final result of the test they're verifying.
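If you really wanted to hunt for such meetings, a hypothetical script along these lines could compare saved stderr dumps; the k*b^n+-1 regex and the idea of saved per-task dump files are my assumptions, not PrimeGrid's actual formats:

import pathlib, re, sys

CANDIDATE = re.compile(r"\b\d+\*\d+\^\d+[+-]1\b")   # e.g. 6219*2^3374198+1

seen = {}
for path in map(pathlib.Path, sys.argv[1:]):        # pass saved stderr files
    for cand in set(CANDIDATE.findall(path.read_text(errors="ignore"))):
        seen.setdefault(cand, []).append(path.name)
for cand, files in seen.items():
    if len(files) > 1:                              # same number in a main and a cert task
        print(cand, "->", files)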
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
We haven't looked at it yet, but since 321 now takes half as long to check as it did before, the optimal sieving point for the 321 sieve should be half of what it was before.
Off the top of my head, we're almost certainly now past the revised optimal sieving point. We haven't run the numbers yet, but those of you prone to panicking about getting that next/last badge whenever something is shut down should probably start thinking about your plans for 321 sieve badges.
It's entirely possible that the official 30 day warning may be given in the very near future. On the other hand, this decision has not been made yet, and there's other reasons to keep the sieve running. So it's not 100% certain that we'll be suspending the sieve soon, but it's certainly possible.
Consider this to be the pre-warning to the 30-day warning for the suspension of the 321-sieve.
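The logic, as an illustrative rule of thumb in Python (not the project's actual model): sieving stays worthwhile while removing one more candidate by sieving is cheaper than LLR-testing it, and LLR2 roughly halved the per-candidate testing cost.

def sieve_worthwhile(sec_per_factor_found, sec_per_llr_test, dc_factor):
    # dc_factor: ~2.0 with full double checks, ~1.01 with LLR2 fast DC
    return sec_per_factor_found < sec_per_llr_test * dc_factor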
____________
My lucky number is 75898^524288+1 |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
|
GCW and SR5 are on LLR2 now.
LLR2 tasks where b <> 2 run slower than standard LLR tests. Depending on the test conditions (CPU type, single- or multi-threaded LLR, etc.), the loss is from 20 to 35%. Anyway, running one task in 1.35x the original time is still much better than running two full tasks (2.00x).
Credit for these tasks has been scaled accordingly. Please note that pages which show "pending" credit may temporarily show incorrect values until the validator processes a workunit at least once. We're working on improving this behavior.
|
|
|
|
We haven't looked at it yet, but since 321 now takes half as long to check as it did before, the optimal sieving point for the 321 sieve should be half of what it was before.
Off the top of my head, we're almost certainly now past the revised optimal sieving point. We haven't run the numbers yet, but those of you prone to panicking about getting that next/last badge whenever something is shut down should probably start thinking about your plans for 321 sieve badges.
It's entirely possible that the official 30 day warning may be given in the very near future. On the other hand, this decision has not been made yet, and there's other reasons to keep the sieve running. So it's not 100% certain that we'll be suspending the sieve soon, but it's certainly possible.
Consider this to be the pre-warning to the 30-day warning for the suspension of the 321-sieve.
I've been running 321 Sieve for 7 hours now, but nearly all my tasks are pending. Is it that no one cares about this enough, or that everyone has already got their badges?
____________
My lucky number is 6219*2^3374198+1
|
|
|
|
I'm sure many care but may not be aware of the potential situation as of yet. I, for one, now have roughly 50 threads on 321-sieve. I need about 3M more credit to reach 20M. It should take 7-10 days depending on the level of participation in the THOR challenge (WCG). |
|
|
Dave  Send message
Joined: 13 Feb 12 Posts: 3201 ID: 130544 Credit: 2,282,781,827 RAC: 1,025,870
|
I need 77 days at 50k a day to reach Jadeypoos. |
|
|
tng Send message
Joined: 29 Aug 10 Posts: 482 ID: 66603 Credit: 47,205,990,064 RAC: 26,827,794
|
I dropped 321 Sieve because there appears to be no chance of another badge for me. Some people have undoubtedly done the same, while others are jumping in to get a badge. Don't know how that works out overall.
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
I'm sure many care but may not be aware of the potential situation as of yet. I, for one, now have roughly 50 threads on 321-sieve. I need about 3M more credit to reach 20M. It should take 7-10 days depending on the level of participation in the THOR challenge (WCG).
We haven't (yet) decided if or when to shut down the sieve, so you have a minimum of 30 days. When a decision is made, we'll give you 30 days' warning before it shuts down.
____________
My lucky number is 75898^524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
I need 77 days at 50k a day to reach Jadeypoos.
If it's going to take you 77 days, I wouldn't bother. Either it's going to shut down in less than 77 days, or we're going to keep it open. It doesn't seem likely that we'll take more than a few days to make a decision.
____________
My lucky number is 75898^524288+1 |
|
|
|
I've been running 321 Sieve for 7 hours now, but nearly all my tasks are pending. Is it that no one cares about this enough, or that everyone has already got their badges?
It is possible there are just a lot of tasks out there due to the Formula Boinc sprint on PrimeGrid.
I'd definitely consider the possibility of people bunkering to make their adversaries drop their guard.
____________
Greetings, Jens
147433824^131072+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
|
Please note that pages which show "pending" credit may temporarily show incorrect values until the validator processes a workunit at least once. We're working on improving this behavior.
The pending credit page, and the badges page (which also shows pending credit), have been updated to show the correct credit for LLR2 tasks.
Please let me know if you see any discrepancies in the pending credit.
____________
My lucky number is 75898^524288+1 |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
|
edit: Haha, not at all. I just realized the first one already came in three days ago. I did, however, notice that the run time increased quite a bit, from 75,000 s to 85,000 s, while the CPU time decreased. From what I read here, such a big increase is not to be expected; any idea what's wrong?
Btw, it's really nice to finally see what number was tested in the tasks tab.
My first LLR2 task! In Woodall - I'm curious when I'll get the first validation task.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
I've noticed that the new SR5 tasks get to 99.9x% with 0 time remaining, then spend around 10 minutes (an additional ~10% of the run time) doing the final <0.1%. Is that the certificate creation process?
They're also 70% slower than before (and here I'll put the obligatory credit moan, as the credit granted is only 30% more than before).
edit:
if this task is the check, then task + check is taking longer than 2x task. |
|
|
|
My AMD Ryzen 9 3900X also performs poorly with the new LLR2 app at SR5: with 4x 3-core multithreading it needs ~35,000 s per task (including ~900 s for 'Compressed 128 points to 7 products'), while the old app did a task in ~20,000 s. |
|
|
|
I've noticed the same with 321 as well, except it appears to have been fixed. On 14/9 I had a task that took 20k seconds; now they're taking 14-15k. |
|
|
|
Yes, SR5 has some performance issues. There are remedies for that in the pipeline. |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
                         
|
I've noticed that the new SR5 tasks get to 99.9x% with 0 time remaining, then spend around 10 minutes (an additional ~10% of the run time) doing the final <0.1%. Is that the certificate creation process?
Yes. (Technically, it's not a certificate; it's a compression of your 128 checkpoints to 7 "products" which will be sent to the server.)
Unfortunately, GWNUM (the library used in all LLR programs) has some surprises when b != 2.
It seems that GWNUM spends an unreasonably high amount of time when it needs to read a checkpoint file from disk. As a preventive measure, I reduced the number of checkpoints for SR5 and GCW from 128 to 64. This should make compression faster - fewer checkpoints to read. Also, if you haven't paused a test, the checkpoints can stay cached in memory, completely avoiding disk reads.
This change can only be applied to new tasks. Tasks which were already created will use the old settings.
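(As a conceptual aside: folding a chain of 2^7 = 128 checkpoint segments yields exactly 7 "products", which matches a Pietrzak-style proof of repeated squaring. The Python sketch below is an illustration only - plain modular squaring, a toy modulus and a made-up hash challenge, not LLR2's actual GWNUM code - but it shows where "128 points to 7 products" comes from and why validating the proof is so cheap.)

import hashlib

def challenge(*vals):
    # Hypothetical Fiat-Shamir challenge; LLR2's real hashing differs.
    h = hashlib.sha256()
    for v in vals:
        h.update(str(v).encode())
    return int.from_bytes(h.digest(), "big")

def compress(N, points):
    # points[i] = x^(2^(i*L)) mod N for i = 0..n, n a power of two.
    proof = []
    while len(points) > 2:                 # 128 -> 64 -> ... -> 1 segment
        half = (len(points) - 1) // 2
        mu = points[half]                  # midpoint of the current chain
        proof.append(mu)
        c = challenge(points[0], points[-1], mu)
        # Fold the two halves into one chain of half the length:
        # q[i] = p[i]^c * p[i+half], so q[i] = q[0]^(2^(i*L)) still holds.
        points = [pow(points[i], c, N) * points[i + half] % N
                  for i in range(half + 1)]
    return proof

def verify(N, x, y, L, proof):
    # Replay the folds on (x, y) alone, then do just L squarings -
    # a tiny fraction of the n*L squarings the main task performed.
    for mu in proof:
        c = challenge(x, y, mu)
        x, y = pow(x, c, N) * mu % N, pow(mu, c, N) * y % N
    return y == pow(x, 1 << L, N)

# Toy demo: a chain of n = 128 segments of L squarings each.
N, x, L, n = 1000003 * 1000033, 5, 50, 128
points = [x]
for _ in range(n):
    points.append(pow(points[-1], 1 << L, N))
proof = compress(N, points)        # "Compressed 128 points to 7 products"
assert len(proof) == 7
assert verify(N, x, points[-1], L, proof)

In this sketch, verify() plays the role of the short "c" check tasks discussed further down: it redoes the 7 folds plus one segment's worth of squarings, which is why validation takes only a small percentage of the original test.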
They're also 70% slower than before (and here I'll put the obligatory credit moan, as the credit granted is only 30% more than before).
It seems to be CPU-specific. 35% is the difference on the system used as the reference for all benchmarks and credit calculations (an i7-4770K).
Pavel has some ideas for improving this situation, but these changes are big and complex and must be tested very carefully before a new version can be used on PG.
|
|
|
|
edit:
if this task is the check then task+check is taking longer than 2xtask.
The task you link is not a check task. Check tasks have a "c" in their names, the credit is much smaller, and their stderr output mentions -pVerifyCert.
It seems some hosts, including that one of yours, get fewer than 50% check tasks by number, at least in some periods. Edit: Or maybe it only looks that way because the check tasks are purged from the server much sooner than the long main tasks?
/JeppeSN |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
                         
|
It seems some hosts, including that one of yours, get fewer than 50% check tasks by number, at least in some periods. Edit: Or maybe it only looks that way because the check tasks are purged from the server much sooner than the long main tasks?
They have higher priority than normal tasks and are always sent first. Given their small size, the first host that connects can grab all of them from the server queue.
|
|
|
|
They're also 70% slower than before (and here I'll put the obligatory credit moan, as the credit granted is only 30% more than before).
It seems to be CPU-specific. 35% is the difference on the system used as the reference for all benchmarks and credit calculations (an i7-4770K).
Pavel has some ideas for improving this situation, but these changes are big and complex and must be tested very carefully before a new version can be used on PG.
I'm running ESP now as a test and it looks like it will be somewhere between 5 and 10% slower than before, which is in line with expectations (previous tasks took 20-22k seconds; current ones are due to finish in ~23k).
There's something else wrong with SR5 for it to be taking 70% longer. Hopefully Pavel can tick the "optimise for Ryzen" box :-) (for the avoidance of doubt: yes, I realise it will be more difficult than that) |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
                
|
Since problems related to LLR2 are often posted here, I'll link my thread from CUL/WOO board: Low CPU utilization with LLR2 WOO tasks
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
Bur, it looks like your problem is OS-related. Something prevents threads from running with the correct affinity. Note that the process name is different now; tools keyed to the old LLR name may work incorrectly. |
|
|
|
j.sheridan, yes, there were some specific performance problems with the b!=2 code that revealed themselves with live SR5 tasks. The fix is coming, but we need to test it thoroughly so as not to break everything. The changes are in the very core of the test. |
|
|
|
j.sheridan, yes, there were some specific performance problems with the b!=2 code that revealed themselves with live SR5 tasks. The fix is coming, but we need to test it thoroughly so as not to break everything. The changes are in the very core of the test.
Good news. If you'd like me to help test anything, just let me know. |
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
                
|
Bur, it looks like your problem is OS-related. Something prevents threads from running with the correct affinity. Note that the process name is different now; tools keyed to the old LLR name may work incorrectly.
I set affinity for boinc.exe and it is inherited by whatever process boinc.exe runs, so the affinity for the LLR2 process is set just as before.
I also found out that no matter what I do, as soon as I change affinity in any way, the CPU usage increases to 50%. It appears to decrease again at some point.
I changed my settings to assign all cores and will see if the problem occurs again.
The interesting thing is that the CPU time is lower than with LLR1! This could mean that if I manage to keep CPU usage at 50% the whole time, the actual runtime will drop below LLR1's, which would be unexpected.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
From my experience there's no need to set affinity if you run LLR on all cores, even if HT is on. Affinity helps if you target CCXs or NUMA nodes. |
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,125,498,739 RAC: 2,248,736
                                      
|
From my experience there's no need to set affinity if you run LLR on all cores, even if HT is on. Affinity helps if you target CCXs or NUMA nodes.
Yes.
But since LLR2 is run via a softlink, I don't know how to set affinity per CCX on a 3950X, or per CPU on a dual-socket Intel box.
I figured out that SoB runs better with 1x14 threads compared to 2x7 threads.
SR5 might be a different situation; I used to run 4 instances...
____________
My stats |
|
|
|
From my experience there's no need to set affinity if you run LLR on all cores, even if HT is on. Affinity helps if you target CCXs or NUMA nodes.
Yes.
But since LLR2 is run via a softlink, I don't know how to set affinity per CCX on a 3950X, or per CPU on a dual-socket Intel box.
I figured out that SoB runs better with 1x14 threads compared to 2x7 threads.
SR5 might be a different situation; I used to run 4 instances...
It runs as "LLR2_1.0.0_win64_200814.exe".
Setting affinity is the same as before in Windows. I wrote an app to do this, but there are more refined options you can download. |
|
|
Honza Volunteer moderator Volunteer tester Project scientist Send message
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,125,498,739 RAC: 2,248,736
                                      
|
It runs as "LLR2_1.0.0_win64_200814.exe".
Setting affinity is the same as before in Windows. I wrote an app to do this, but there are more refined options you can download.
I used to have different affinities for each slot.
That setup included "Also check process pathnames for matches".
Without the path included, both/all LLR2 apps would get the same affinity.
Or I might be missing something there.
Could you post a link to the app you mentioned?
____________
My stats |
|
|
|
It runs as "LLR2_1.0.0_win64_200814.exe".
Setting affinity is the same as before in Windows. I wrote an app to do this, but there are more refined options you can download.
I used to have different affinities for each slot.
That setup included "Also check process pathnames for matches".
Without the path included, both/all LLR2 apps would get the same affinity.
Or I might be missing something there.
Could you post a link to the app you mentioned?
Process Lasso is one example that will let you set equal affinity for all apps with matching names, i.e. if you have 2 apps and 16 cores it will give them 8 each.
I made mine slightly different (it only needs to work on Ryzen), so it checks the number of threads the task is using and assigns it to a CCX or a CCD as appropriate. I've not investigated whether any of the online offerings do this.
I'm happy to share it with you, with the usual disclaimers about it not being a commercial product and possibly having unknown bugs (e.g. it probably won't like HT being switched off).
|
|
|
|
I have a similar simple AffinityWatcher: https://github.com/patnashev/primeUtils/tree/master/AffinityWatcher
It scans for processes with known names and binds them to configurable "nodes". |
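(For a rough idea of what such a watcher does, here is a minimal Python sketch built on the psutil package. The process-name prefix and the core groups are illustrative assumptions, not AffinityWatcher's actual configuration - check the real executable name, e.g. the LLR2_*.exe mentioned above, and your own CCX layout first.)

import time
import psutil

LLR_PREFIX = "llr2"                    # assumed name prefix; adjust to taste
NODES = [[0, 1, 2, 3], [4, 5, 6, 7]]   # one core list per "node", e.g. a CCX

def rebalance():
    # Find running LLR workers by name and pin one task to each node.
    workers = [p for p in psutil.process_iter(["name"])
               if (p.info["name"] or "").lower().startswith(LLR_PREFIX)]
    for proc, cores in zip(sorted(workers, key=lambda p: p.pid), NODES):
        try:
            if proc.cpu_affinity() != cores:   # only touch it if it drifted
                proc.cpu_affinity(cores)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass                               # task finished, or no rights

while True:                 # poll, since BOINC starts new tasks all the time
    rebalance()
    time.sleep(10)

Note that zip() here simply leaves any surplus workers unpinned; a real tool, like the ones described above, would also match each task's thread count to the CCX/CCD size.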
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
                
|
Is a 3rd-party app required? I just set affinity for boinc.exe and all tasks inherit that setting.
From my experience there's no need to set affinity if you run LLR on all cores
It's not run on all cores; I have "use 50% of available CPUs" set to simulate HT=off. With LLR1 I saw a 2-3% decrease in runtime with affinity set. With LLR2 it totally messes things up and runtime increases by 15-20%.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
|
Is a 3rd-party app required? I just set affinity for boinc.exe and all tasks inherit that setting.
From my experience there's no need to set affinity if you run LLR on all cores
It's not run on all cores; I have "use 50% of available CPUs" set to simulate HT=off. With LLR1 I saw a 2-3% decrease in runtime with affinity set. With LLR2 it totally messes things up and runtime increases by 15-20%.
What they mean is that if you have 4 real cores and you're running a single task with 4 threads, there should be no difference whether you set affinity or not - which doesn't mean there isn't a difference.
On Ryzen, if you don't limit tasks per CCX, it's more like a 20-30% difference in runtime, so it's quite obvious that setting affinity matters there. I didn't notice any difference between pinning to real cores and to real+HT cores, although if you believe Prime95's benchmarks, Ryzen is slightly faster with HT on.
|
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
                         
|
The DIV project has been moved to LLR2.
Regarding the upcoming DIV challenge - we have no final decision about its format yet. It'll depend on how much CPU load and network traffic the server gets during the remaining days. It's possible to run the challenge in the old format. Most probably we will start the challenge on LLR2, but we may switch to the old app if things start to get out of control.
|
|
|
Bur Volunteer tester
 Send message
Joined: 25 Feb 20 Posts: 515 ID: 1241833 Credit: 414,481,880 RAC: 295
                
|
Great to hear. Thanks for all the work behind the scenes. It pays off, though: I recently browsed T5K and the majority of entries come from PG.
I'm excited to see when the first big find with LLR2 will happen. Another Cullen or Woodall prime would be nice. I still find it amazing that they are so rare.
____________
1281979 * 2^485014 + 1 is prime ... no further hits up to: n = 5,700,000 |
|
|
KEP Send message
Joined: 10 Aug 05 Posts: 301 ID: 110 Credit: 12,352,853 RAC: 187
          
|
I'm excited to see when the first big find with LLR2 will happen. Another Cullen or Woodall prime would be nice. I still find it amazing that they are so rare.
It appeared to have happened a few days ago, but now it appears not to have happened. Sure hope that TRP or the conjectures will fall a few k's before new year. |
|
|
WezH Send message
Joined: 9 Jun 11 Posts: 126 ID: 101605 Credit: 777,810,158 RAC: 983,260
                           
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
http://www.primegrid.com/result.php?resultid=1130877910
<file_xfer_error>
<file_name>llrESP_345876363_0_r1102069354_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
It could be an old Boinc client or old Linux... Haven't tested in other LLR projects yet. |
|
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester Send message
Joined: 1 Mar 14 Posts: 1032 ID: 301928 Credit: 543,593,602 RAC: 8,480
                         
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
The computation went fine, but Boinc was unable to find all the output files.
It's difficult to diagnose; it could be a bug in Boinc, or something wrong with access rights on the slot directories. Do you have any additional messages about this in the Boinc log file?
|
|
|
|
It seems that GWNUM spends an unreasonably high amount of time when it needs to read a checkpoint file from disk. As a preventive measure, I reduced the number of checkpoints for SR5 and GCW from 128 to 64. This should make compression faster - fewer checkpoints to read.
I have started to get GCW units running under LLR2, and the run times have increased dramatically (generally from +30% to +70%).
My worst example is my i3-6100 system running only GCW (MT), where tasks went from a very steady 56,750 sec to 96,120 sec. To be totally fair, the credit has been adjusted accordingly.
I hope to have additional data next week, until I switch projects due to a badge flip for GCW. |
|
|
WezH Send message
Joined: 9 Jun 11 Posts: 126 ID: 101605 Credit: 777,810,158 RAC: 983,260
                           
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
The computation went fine, but Boinc was unable to find all the output files.
It's difficult to diagnose; it could be a bug in Boinc, or something wrong with access rights on the slot directories. Do you have any additional messages about this in the Boinc log file?
Something in stdoutdae.txt
26-Sep-2020 02:14:19 [PrimeGrid] Computation for task llrESP_345876363_0 finished
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_0 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_1 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_2 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_3 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_4 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_5 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_6 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_7 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_8 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_9 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_10 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_11 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_12 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_13 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_14 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_15 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_16 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_17 for task llrESP_345876363_0 absent
26-Sep-2020 02:14:19 [PrimeGrid] Output file llrESP_345876363_0_r1102069354_18 for task llrESP_345876363_0 absent
For permissions, slot directories are 771; the files inside are 666.
Currently there are files named from proof.0 & proof.0.md5 up to proof.109 & proof.109.md5
|
|
|
mikey Send message
Joined: 17 Mar 09 Posts: 1771 ID: 37043 Credit: 783,805,084 RAC: 1,610,582
                     
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
The computation went fine, but Boinc was unable to find all the output files.
It's difficult to diagnose; it could be a bug in Boinc, or something wrong with access rights on the slot directories. Do you have any additional messages about this in the Boinc log file?
Something in stdoutdae.txt
26-Sep-2020 02:14:19 [PrimeGrid] Computation for task llrESP_345876363_0 finished
[...19 repeated "Output file ... absent" log lines snipped; see WezH's full list above...]
For permissions, slot directories are 771; the files inside are 666.
Currently there are files named from proof.0 & proof.0.md5 up to proof.109 & proof.109.md5
Is this the PC with the OLD version of Boinc?
Boinc version 6.12.22
GenuineIntel
Intel(R) Pentium(R) CPU G2020 @ 2.90GHz [Family 6 Model 58 Stepping 9]
(2 processors) |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                               
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
http://www.primegrid.com/result.php?resultid=1130877910
<file_xfer_error>
<file_name>llrESP_345876363_0_r1102069354_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
It could be an old Boinc client or old Linux... Haven't tested in other LLR projects yet.
Are you running out of disk space? Not just actual space on the disk, but the amount of space you've configured BOINC to use? LLR2 uses a lot of disk space compared to the old LLR.
____________
My lucky number is 75898524288+1 |
|
|
WezH Send message
Joined: 9 Jun 11 Posts: 126 ID: 101605 Credit: 777,810,158 RAC: 983,260
                           
|
Is this the PC with the OLD version of Boinc?
Boinc version 6.12.22
GenuineIntel
Intel(R) Pentium(R) CPU G2020 @ 2.90GHz [Family 6 Model 58 Stepping 9]
(2 processors)
Yes, it is - the newest version I could get running; command-line interface only.
I don't dare upgrade the OS; it's running my wide-format printer's old RIP software, which cannot be upgraded because the newer version won't support my old printer... |
|
|
WezH Send message
Joined: 9 Jun 11 Posts: 126 ID: 101605 Credit: 777,810,158 RAC: 983,260
                           
|
One of my hosts has problems with LLR2 Extended Sierpinski Problem v9.00.
It crunches the WU, but then come file transfer errors. Proof-of-computation transfers?
http://www.primegrid.com/result.php?resultid=1130877910
<file_xfer_error>
<file_name>llrESP_345876363_0_r1102069354_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
It could be an old Boinc client or old Linux... Haven't tested in other LLR projects yet.
Are you running out of disk space? Not just actual space on the disk, but the amount of space you've configured BOINC to use? LLR2 uses a lot of disk space compared to the old LLR.
Don't think so; the preferences are:
Use no more than --- GB
Leave at least --- GB free
Use no more than 99% of total
and host has:
Total disk space 431.85 GB
Free Disk Space 351.09 GB |
|
|
|
Something's wrong for me too. I've just had 2 LLR2 SR5 tasks fail on a DO droplet that has never had failures before - in fact it's already run several LLR2 SR5 tasks without issues in the last few days.
Does LLR2 have a memory leak? Its memory usage keeps growing as the computation continues - the PrimeGrid website also tells me that the first task was killed due to too much RAM usage, and you can see on the left of this graph the moment it was killed. With the 2nd one, memory usage was again growing, and then when it came to generating the files, you can see that it absolutely slammed the disk before suddenly reporting that every file was missing. (link to the task: https://www.primegrid.com/result.php?resultid=1133440100)
It can't be a disk space issue because the amount of disk storage used remained steady at 12% the whole time. BOINC is allowed to use a full 25% of the disk's storage.
____________
1 PPSE (+2 DC) & 5 SGS primes |
|
|
Nick  Send message
Joined: 11 Jul 11 Posts: 2298 ID: 105020 Credit: 8,356,519,392 RAC: 5,814,361
                            
|
I am curious about the first graph - the memory builds up until just before midnight and then again until just before 12 pm. Is there something about 12/24-hour clocks? |
|
|
|
This is normal behavior. As mentioned before, we had to turn on caching of intermediate checkpoints in memory for GCW and SR5. Due to radix-conversion issues with b!=2 numbers, file reading is extremely slow (for now). Checkpoints are stored in memory in a ready-to-use format, so they can be accessed quickly. Note that they're not needed until the end of the test (the "compression" stage), so it's okay if they're swapped out to disk.
Please increase your BOINC memory limits if you're doing SR5 or GCW. |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                               
|
This is normal behavior. As mentioned before, we had to turn on caching of intermediate checkpoints in memory for GCW and SR5. Due to radix-conversion issues with b!=2 numbers, file reading is extremely slow (for now). Checkpoints are stored in memory in a ready-to-use format, so they can be accessed quickly. Note that they're not needed until the end of the test (the "compression" stage), so it's okay if they're swapped out to disk.
Please increase your BOINC memory limits if you're doing SR5 or GCW.
Typically, these droplets will have 3 cores and 1GB. If you’re running single threaded you have only about 300 MB per task. I’m not sure if virtual memory is enabled by default on these images. If not, the tasks will just fail if memory is exceeded.
____________
My lucky number is 75898524288+1 |
|
|
|
I am curious about the first graph - the memory builds up until just before midnight and then again until just before 12 pm. Is there something about 12/24-hour clocks?
The first big drop is the first LLR2 SR5 task that failed. The moment where it drops is right where it got terminated for using too much memory. After that, the second build-up is the second LLR2 SR5 task, which also eventually failed after using around the same amount of RAM.
After SR5 switched to LLR2 the tasks grew to about 12 hours each (up from around 8 hours before), which is why the graph looks like it's running from midnight to noon - a 12 hour period. In fact the time shown in that graph is in my time zone, not the VPS's time zone, so the precise hours are merely a coincidence. :)
Typically, these droplets will have 3 cores and 1GB. If you’re running single threaded you have only about 300 MB per task. I’m not sure if virtual memory is enabled by default on these images. If not, the tasks will just fail if memory is exceeded.
In my case it's a single-core droplet, so one task has almost all of the 1GB RAM to itself. I didn't expect a single task to end up using what appears to be >800MB of RAM.
Pavel, thanks for the advice about b!=2 tests. For now I've switched the droplet to TRP and 321 LLR2, hopefully things run a bit more smoothly now. :)
____________
1 PPSE (+2 DC) & 5 SGS primes |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                               
|
I am curious about the first graph - the memory builds up until just before midnight and then again until just before 12 pm. Is there something about 12/24-hour clocks?
The first big drop is the first LLR2 SR5 task that failed. The moment where it drops is right where it got terminated for using too much memory. After that, the second build-up is the second LLR2 SR5 task, which also eventually failed after using around the same amount of RAM.
After SR5 switched to LLR2 the tasks grew to about 12 hours each (up from around 8 hours before), which is why the graph looks like it's running from midnight to noon - a 12 hour period. In fact the time shown in that graph is in my time zone, not the VPS's time zone, so the precise hours are merely a coincidence. :)
Typically, these droplets will have 3 cores and 1GB. If you’re running single threaded you have only about 300 MB per task. I’m not sure if virtual memory is enabled by default on these images. If not, the tasks will just fail if memory is exceeded.
In my case it's a single-core droplet, so one task has almost all of the 1GB RAM to itself. I didn't expect a single task to end up using what appears to be >800MB of RAM.
Pavel, thanks for the advice about b!=2 tests. For now I've switched the droplet to TRP and 321 LLR2, hopefully things run a bit more smoothly now. :)
<insert lots of admin chat here>
Rytis should be able to put in a fix for this tomorrow, hopefully. The reason DO/TSC is so inexpensive is, in part, because the droplets have very little memory.
Which we never needed.
Until now.
We have a workaround. It won't slow down the calculation. It will just work.
CAUTION: On a three-core droplet, if you're going to run GCW, you MUST run -t3. GCW tasks use a lot of memory and you can't fit three of them in 1 GB. With SR5, the memory is used just for the cached checkpoints, and that's OK; you should be able to run 3 of them in 1 GB after the fix. But not GCW.
Until Rytis pushes the change, however, neither GCW nor SR5 will work on 1 or 3 core droplets.
____________
My lucky number is 75898524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                               
|
Until Rytis pushes the change, however, neither GCW nor SR5 will work on 1 or 3 core droplets.
The change is now in place, and all PG apps can be used on 1 and 3 core droplets. The only restriction is that you can only run a single GCW task at a time on the 3 core droplets. If you're running GCW on a TSC 3 core droplet, make sure to set "Multi-threading: Max # of threads for each task" to "No Limit" on PrimeGrid's Project Preferences Settings.
____________
My lucky number is 75898524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                               
|
Something's wrong for me too. I've just had 2 LLR2 SR5 tasks fail on a DO droplet that has never had failures before - in fact it's already run several LLR2 SR5 tasks without issues in the last few days.
Does LLR2 have a memory leak? Its memory usage keeps growing as the computation continues - the PrimeGrid website also tells me that the first task was killed due to too much RAM usage, and you can see on the left of this graph the moment it was killed. With the 2nd one, memory usage was again growing, and then when it came to generating the files, you can see that it absolutely slammed the disk before suddenly reporting that every file was missing. (link to the task: https://www.primegrid.com/result.php?resultid=1133440100)
It can't be a disk space issue because the amount of disk storage used remained steady at 12% the whole time. BOINC is allowed to use a full 25% of the disk's storage.
It turns out that our "fix" was actually always in place. At this point, we don't understand why your first task failed.
The behavior you see is correct for the second task on those charts. With SR5 (or GCW), to work around a problem in gwnum, we keep those large checkpoint files in memory as the calculation progresses. That's why you see the memory ramping up. It's not a memory leak; it's by design.
On a 1 GB droplet, you *almost* have room to fit those files in memory, but not quite. So it starts using the swap file to hold the excess. That's fine because we're not actively using that data until the very end. You see this on the second task, where the memory levels out and there's disk activity as the swap file becomes active.
____________
My lucky number is 75898524288+1 |
|
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 14009 ID: 53948 Credit: 427,970,802 RAC: 1,087,263
                              |