Author |
Message |
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1930 ID: 352 Credit: 5,461,946,229 RAC: 5,763,797
|
Hi,
I just found my computer extremely sluggish...and found that primegrid_sr2sieve_wrapper_1.11_windows_x86_64.exe was running with AboveNormal priority. After manually changing the priority everything is fine again.
Anybody else having this issue?
It is not listed on the app page; the download of this app occurred ~24 hours ago.
____________
My stats
Badge score: 1*1 + 5*1 + 8*3 + 9*11 + 10*1 + 11*1 + 12*3 = 186 |
|
|
pschoefer Volunteer developer Volunteer tester
Joined: 20 Sep 05 Posts: 673 ID: 845 Credit: 2,495,934,707 RAC: 632,403
|
I just found my computer extremely sluggish...and found that primegrid_sr2sieve_wrapper_1.11_windows_x86_64.exe was running with AboveNormal priority. After manually changing the priority everything is fine again.
Anybody else having this issue?
It is not listed on the app page; the download of this app occurred ~24 hours ago.
There was a bug in that app version. See this thread.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Thanks for pointing me to the correct thread.
I was wondering how an app with a known issue got downloaded (I keep a 1-day cache, so it wasn't sitting there from last month).
____________
|
|
|
Rytis Volunteer moderator Project administrator
Joined: 22 Jun 05 Posts: 2653 ID: 1 Credit: 86,393,830 RAC: 14,227
|
Thanks for pointing me to the correct thread.
I was wondering how an app with a known issue got downloaded (I keep a 1-day cache, so it wasn't sitting there from last month).
The app is fine, it just needs an extra command line parameter to run under low priority. It's added for new workunits, and I mistakenly thought they were out already, so I released the 1.11. When I realized I was wrong, I pulled it (again). The new app was being sent out for a period of about 15 minutes.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Thanks Rytis, John already informed me via PM.
I guess I was just "lucky" that one of my ~25 cores running 24/7 on PPS Sieve got this.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Be aware that old WUs with AboveNormal priority may occur again.
I found 2 on one machine (out of 10 in the queue) and 2 on another (out of 40); the third one is fine so far (out of 40 WUs).
Will check other hosts later...
____________
|
|
|
|
Confirmed, I killed 7 of the above-normal priority processes today (and some a few days ago).
EDIT: Any chance of having the remaining resource-hogs sought out and killed before they are sent (or re-sent)? |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Today, I found 12 of them.
I solve this issue by editing pps_sr2sieve_?????_cmd and changing the wrong parameter (if the task has not yet started) or simply by lowering the priority in Task Manager (if it has started)...so no killing.
Brickhead - if you kill a result, wouldn't it be resent again (to me, Lennart or whoever)?
Anyway, it's time-consuming to monitor and baby-sit all machines...
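To cut down on the manual checking, something like the following could flag the affected files before they start. This is only a minimal sketch in Python, not anything PrimeGrid ships. It assumes the cmd files are plain text holding whitespace-separated arguments, that the raised-priority setting shows up as a standalone `-Z` token (the switch this thread identifies as the cause), and that you pass in the directory where the files live; the function names are made up for illustration. It only reports files and does not edit them.

```python
from pathlib import Path

# Switch reported in this thread as causing "above normal" priority.
RAISED_PRIORITY_SWITCH = "-Z"
# File name pattern as quoted in the post above; "?" matches exactly one character.
CMD_FILE_PATTERN = "pps_sr2sieve_?????_cmd"


def has_raised_priority(command_line: str) -> bool:
    """Return True if -Z appears as its own whitespace-separated token."""
    return RAISED_PRIORITY_SWITCH in command_line.split()


def find_raised_priority_files(project_dir):
    """Return the cmd files directly under project_dir that contain -Z.

    Assumption: the files are plain text; undecodable bytes are replaced
    rather than raising an error.
    """
    hits = []
    for path in sorted(Path(project_dir).glob(CMD_FILE_PATTERN)):
        text = path.read_text(errors="replace")
        if has_raised_priority(text):
            hits.append(path)
    return hits
```

Calling `find_raised_priority_files(some_dir)` returns a list of paths to look at by hand; what the parameter should be changed to is left to the person editing the file.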
____________
|
|
|
|
Today, I found 12 of them.
I solve this issue by editing pps_sr2sieve_?????_cmd and changing the wrong parameter (if the task has not yet started) or simply by lowering the priority in Task Manager (if it has started)...so no killing.
Brickhead - if you kill a result, wouldn't it be resent again (to me, Lennart or whoever)?
Anyway, it's time-consuming to monitor and baby-sit all machines...
Changing priority on those processes is not something I'm allowed to do as a mere domain admin (I've tried). How did you do that?
I am aware that any of us might get the same WU again (if it hasn't reached one of its limits), and I'd prefer to just leave those WUs alone and hope they'll finish soon. But when, for example, a domain controller takes forever to respond to authentication requests, that's not an option. |
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
There are currently 190 old WU's with the -Z switch. We'll post when they are all completed. Until then, please keep all critical systems off of the PPS sieve. The -Z switch will run at "above normal" priority and may make your systems sluggish.
As I type this, 4 have already been completed. Hopefully we'll see the end soon. Lennart has all of his cores on this and we are expecting that he'll absorb a majority of the old WU's.
We apologize for the inconvenience and appreciate your patience.
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Well, I'm not running BOINC on a domain controller - too risky. Not on an application server or Oracle either. Even though I'm tempted...perhaps sometime in the future, running only overnight...
On the other machines in the domain, changing priority in Task Manager works fine; both are Windows 2008 under domain admin.
The rest are W2K3 under local admin (not in a domain).
John: Expect 12+ to finish in <24 hours.
____________
|
|
|
|
I have 24 cores running PPS Sieve atm, and all are my own (including the DC), so nothing business critical. Just checked, and runtime priority change is disallowed in Windows 2003-64 (DC or member) and XP-64 (member). I'll continue to let any raised-priority WUs finish whenever I don't need that particular machine's services.
Oh, and it's no news that Lennart will chew through his part and then some, I expect Intel to build a separate manufacturing plant just for him :D |
|
|
|
There are currently 190 old WU's with the -Z switch. We'll post when they are all completed. Until then, please keep all critical systems off of the PPS sieve. The -Z switch will run at "above normal" priority and may make your systems sluggish.
We apologize for the inconvenience and appreciate your patience.
There are 164 left today at 11:00 UTC
/Lennart |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
|
Thanks for the update, Lennart.
Now, something for your amusement - what a stupid thing I just did.
I thought that if I set a larger cache, more WUs would download; I'd edit the above-normal priority ones and finish them first to get rid of them (and help others), and then no more baby-sitting. Sounds good, eh?
I set the cache to 5 days instead of 1, edited a single WU on the first machine and told BOINC to suspend the other tasks (using BOINCView, which is great for managing BOINC...unlike BOINC Manager).
As I suspended a task that was already running (that was the mistake), another one started, got suspended, another one started, etc. The mess ended with 33 tasks started for a short period, each creating a slot taking 400MB, resulting in a 13GB BOINC folder. This was a hard time even for my 64GB SSD on a machine that has 8GB of RAM.
I'm really not blaming anyone but myself for making such a mistake even after many years of running BOINC.
Anyway, I got 1000+ PPS Sieve WUs, enough to keep 6 quads busy for 5 days...and no above-normal WU.
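For anyone wondering where the 13GB came from, here is a rough check in Python. It is just a sketch of the arithmetic, assuming every one of the 33 slots took exactly 400MB and counting 1GB as 1000MB; the function name is made up for illustration.

```python
def slots_disk_usage_gb(task_count: int, mb_per_slot: float) -> float:
    """Total disk space used by BOINC slots, in GB (1 GB taken as 1000 MB)."""
    return task_count * mb_per_slot / 1000


# 33 started tasks, each with its own 400MB slot directory
print(slots_disk_usage_gb(33, 400))
```

That works out to a little over 13GB, which matches the folder size above.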
____________
|
|
|
John Honorary cruncher
|
There are currently 190 old WU's with the -Z switch. We'll post when they are all completed. Until then, please keep all critical systems off of the PPS sieve. The -Z switch will run at "above normal" priority and may make your systems sluggish.
We apologize for the inconvenience and appreciate your patience.
There are 164 left today at 11:00 UTC
134 as of 16:30 UTC 18 Oct 2008
____________
|
|
|
John Honorary cruncher
|
There are currently 190 old WU's with the -Z switch. We'll post when they are all completed. Until then, please keep all critical systems off of the PPS sieve. The -Z switch will run at "above normal" priority and may make your systems sluggish.
We apologize for the inconvenience and appreciate your patience.
There are 164 left today at 11:00 UTC
134 as of 16:30 UTC 18 Oct 2008
96 as of 01:15 UTC 20 Oct 2008
____________
|
|
|
|
There are currently 190 old WU's with the -Z switch. We'll post when they are all completed. Until then, please keep all critical systems off of the PPS sieve. The -Z switch will run at "above normal" priority and may make your systems sluggish.
We apologize for the inconvenience and appreciate your patience.
There are 164 left today at 11:00 UTC
134 as of 16:30 UTC 18 Oct 2008
96 as of 01:15 UTC 20 Oct 2008
79 today 10:47 21 Oct 2008 (UTC)
/Lennart |
|
|