Message boards :
Extended Sierpinski Problem :
Change to PSP/SoB/ESP sieve validation criteria
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13570 ID: 53948 Credit: 249,642,052 RAC: 130,417
We've made a change to the validation criteria for the PSP/SoB/ESP sieve.
Please see this post for details: http://www.primegrid.com/forum_thread.php?id=5797&nowrap=true#78748
____________
My lucky number is 75898^524288+1
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
Joined: 1 Mar 14 Posts: 858 ID: 301928 Credit: 503,817,766 RAC: 262,675
We've made a change to the validation criteria for the PSP/SoB/ESP sieve.
Please see this post for details: http://www.primegrid.com/forum_thread.php?id=5797&nowrap=true#78748
Looks like I've been hit by an unknown BOINC glitch which caused a false validator alarm:
http://www.primegrid.com/workunit.php?wuid=405596683 and http://www.primegrid.com/result.php?resultid=572879075
For some unknown reason, BOINC reported very small runtimes (less than the validator limit). In reality, the execution time for this workunit, taken from the full log, was 1536 seconds (summed from 4 pieces because BOINC was suspended by user activity on this PC), which exactly matches the average execution time on this PC. The full task log (second link) shows that the task ran smoothly all the way to 100%.
The numbers are wrong in the local job_log file too. It can be seen that the workunit run on the second core also reported wrong times, but they fit within the validator limits:
1411107418 ue 3094.838448 ct 1460.309800 fe 9238280000000 nm ESP_sieve_39190692_1 et 1471.347155 es 0
1411109489 ue 3094.838448 ct 568.530000 fe 9238280000000 nm ESP_sieve_39190983_2 et 574.119837 es 0
1411118140 ue 3094.838448 ct 1027.984200 fe 9238280000000 nm ESP_sieve_39191215_0 et 1063.230814 es 0
1411118924 ue 3094.838448 ct 1436.519600 fe 9238280000000 nm ESP_sieve_39191322_0 et 1457.844383 es 0
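A quick way to scan a job_log file for tasks like this is sketched below. It assumes each line is a timestamp followed by alternating key/value pairs, with `ct` read as CPU time, `et` as elapsed time and `nm` as the task name. The function names and the 1400-second threshold are hypothetical and chosen only for illustration; the threshold is not the validator's actual limit.

```python
def parse_job_log_line(line):
    """Split one job_log line into a dict of its key/value fields."""
    tokens = line.split()
    record = {"timestamp": int(tokens[0])}
    # After the leading timestamp, fields are alternating key/value pairs.
    for key, value in zip(tokens[1::2], tokens[2::2]):
        record[key] = value
    return record


def suspicious_tasks(lines, min_seconds):
    """Return names of tasks whose reported elapsed time is below min_seconds."""
    flagged = []
    for line in lines:
        record = parse_job_log_line(line)
        if float(record["et"]) < min_seconds:
            flagged.append(record["nm"])
    return flagged


# The four job_log lines quoted above.
JOB_LOG = [
    "1411107418 ue 3094.838448 ct 1460.309800 fe 9238280000000 nm ESP_sieve_39190692_1 et 1471.347155 es 0",
    "1411109489 ue 3094.838448 ct 568.530000 fe 9238280000000 nm ESP_sieve_39190983_2 et 574.119837 es 0",
    "1411118140 ue 3094.838448 ct 1027.984200 fe 9238280000000 nm ESP_sieve_39191215_0 et 1063.230814 es 0",
    "1411118924 ue 3094.838448 ct 1436.519600 fe 9238280000000 nm ESP_sieve_39191322_0 et 1457.844383 es 0",
]

# Hypothetical threshold (seconds), not the real validator limit.
EXPECTED_MIN = 1400

print(suspicious_tasks(JOB_LOG, EXPECTED_MIN))
```

With this threshold the scan flags both the task that failed validation and the one from the second core that reported a shortened time but still passed.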
Unfortunately, I have no idea what it could be. No system errors or events near this time, just another working day. Let's call it another BOINC glitch. Luckily, this has happened to me only once, although I've done quite a lot of sieving.
The full log, which confirms that the true task runtime was 07:04+12:34+02:30+03:28=25:36 (1536 seconds):
19-Sep-2014 09:40:52 [PrimeGrid] Starting task ESP_sieve_39190983_2
19-Sep-2014 09:40:54 [PrimeGrid] Started upload of ESP_sieve_39190532_2_0
19-Sep-2014 09:40:56 [PrimeGrid] Finished upload of ESP_sieve_39190532_2_0
19-Sep-2014 09:41:01 [PrimeGrid] Sending scheduler request: To report completed tasks.
19-Sep-2014 09:41:01 [PrimeGrid] Reporting 1 completed tasks
19-Sep-2014 09:41:01 [PrimeGrid] Requesting new tasks for CPU and intel_gpu
19-Sep-2014 09:41:03 [PrimeGrid] Scheduler request completed: got 1 new tasks
19-Sep-2014 09:41:05 [PrimeGrid] Started download of ESP_sieve_39191322_cmd
19-Sep-2014 09:41:07 [PrimeGrid] Finished download of ESP_sieve_39191322_cmd
19-Sep-2014 09:47:56 [---] Suspending computation - computer is in use
19-Sep-2014 09:47:56 [---] Suspending network activity - computer is in use
19-Sep-2014 10:13:37 [---] Resuming computation
19-Sep-2014 10:13:37 [---] Resuming network activity
19-Sep-2014 10:16:58 [PrimeGrid] Computation for task ESP_sieve_39190692_1 finished
19-Sep-2014 10:16:58 [PrimeGrid] Starting task ESP_sieve_39191215_0
19-Sep-2014 10:17:00 [PrimeGrid] Started upload of ESP_sieve_39190692_1_0
19-Sep-2014 10:17:02 [PrimeGrid] Finished upload of ESP_sieve_39190692_1_0
19-Sep-2014 10:17:02 [PrimeGrid] Sending scheduler request: To report completed tasks.
19-Sep-2014 10:17:02 [PrimeGrid] Reporting 1 completed tasks
19-Sep-2014 10:17:02 [PrimeGrid] Requesting new tasks for CPU and intel_gpu
19-Sep-2014 10:17:04 [PrimeGrid] Scheduler request completed: got 1 new tasks
19-Sep-2014 10:17:06 [PrimeGrid] Started download of ESP_sieve_39191720_cmd
19-Sep-2014 10:17:08 [PrimeGrid] Finished download of ESP_sieve_39191720_cmd
19-Sep-2014 10:26:11 [---] Suspending computation - computer is in use
19-Sep-2014 10:26:11 [---] Suspending network activity - computer is in use
19-Sep-2014 10:34:23 [---] Resuming computation
19-Sep-2014 10:34:23 [---] Resuming network activity
19-Sep-2014 10:36:53 [---] Suspending computation - computer is in use
19-Sep-2014 10:36:53 [---] Suspending network activity - computer is in use
19-Sep-2014 10:48:00 [---] Resuming computation
19-Sep-2014 10:48:00 [---] Resuming network activity
19-Sep-2014 10:51:28 [PrimeGrid] Computation for task ESP_sieve_39190983_2 finished
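The 1536-second figure can be reproduced from the log with a short script. This is only a sketch: it assumes the task computes continuously between each "Resuming computation" and the next "Suspending computation" line, ignores any restart overhead, and relies on English month abbreviations when parsing dates. The function name `active_seconds` is hypothetical, and `LOG` contains only the lines from the log above that affect task ESP_sieve_39190983_2.

```python
from datetime import datetime


def active_seconds(log_lines, task):
    """Sum the time a task spent running, excluding suspended periods."""
    total = 0.0
    running_since = None  # when the task last started or resumed computing
    started = False
    for line in log_lines:
        date_part, time_part, message = line.split(" ", 2)
        stamp = datetime.strptime(date_part + " " + time_part, "%d-%b-%Y %H:%M:%S")
        if message.endswith("Starting task " + task):
            started = True
            running_since = stamp
        elif started and "Suspending computation" in message:
            if running_since is not None:
                total += (stamp - running_since).total_seconds()
                running_since = None
        elif started and "Resuming computation" in message:
            running_since = stamp
        elif started and ("Computation for task " + task + " finished") in message:
            if running_since is not None:
                total += (stamp - running_since).total_seconds()
            break
    return total


LOG = [
    "19-Sep-2014 09:40:52 [PrimeGrid] Starting task ESP_sieve_39190983_2",
    "19-Sep-2014 09:47:56 [---] Suspending computation - computer is in use",
    "19-Sep-2014 10:13:37 [---] Resuming computation",
    "19-Sep-2014 10:26:11 [---] Suspending computation - computer is in use",
    "19-Sep-2014 10:34:23 [---] Resuming computation",
    "19-Sep-2014 10:36:53 [---] Suspending computation - computer is in use",
    "19-Sep-2014 10:48:00 [---] Resuming computation",
    "19-Sep-2014 10:51:28 [PrimeGrid] Computation for task ESP_sieve_39190983_2 finished",
]

print(active_seconds(LOG, "ESP_sieve_39190983_2"))
```

The four running intervals are 424, 754, 150 and 208 seconds, which matches the 07:04+12:34+02:30+03:28 breakdown.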
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13570 ID: 53948 Credit: 249,642,052 RAC: 130,417
Noted. At first glance I'm not sure what we can do about that.
Based on the information you provided as well as some additional limited information we have on the server, one of two things happened:
1) For reasons unknown, amongst all the other correct tasks on that computer, this one task reported the time incorrectly, but otherwise ran correctly.
2) For reasons unknown, amongst all the other correct tasks on that computer, this one task failed for some reason and ran for a shorter period of time, but left no trace of this in the log.
The second possibility seems less likely, but in either case it's not at all obvious what the cause is or what we could do about it.
Thanks for bringing it to our attention.
____________
My lucky number is 75898^524288+1
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
Joined: 1 Mar 14 Posts: 858 ID: 301928 Credit: 503,817,766 RAC: 262,675
1) For reasons unknown, amongst all the other correct tasks on that computer, this one task reported the time incorrectly, but otherwise ran correctly.
This case, definitely.
I've checked logs on different computers and found that this case is much more common than I thought. I found quite a lot of sieve tasks which reported less CPU and elapsed time than they really took. They should be within 1400-1600 seconds, depending on CPU, but sometimes reported lower values, down to about 1000 seconds. The true execution time, according to the full log, was in the correct range. Since 1000 seconds was enough to pass validation, I didn't notice anything until I got one unlucky task which reported a really low value.
My first suspect is that all these computers are set to BOINC's "suspend on user activity" mode, so all the tasks in question were killed and restarted a few times. Maybe there is a glitch somewhere. But it doesn't happen with every restarted task. All computers were running BOINC 7.2.42 on WIN64. I'll check logs from other computers (with a different OS and without restarts) later.