PrimeGrid
Message boards : Sieving : Problems with longer CUDA WUs

Author Message
Profile Beyond
Joined: 20 Sep 06
Posts: 74
ID: 3518
Credit: 500,031,187
RAC: 0
PPS LLR Bronze: Earned 10,000 credits (36,394); TPS LLR (retired) Silver: Earned 100,000 credits (192,405); 321 Sieve (suspended) Amethyst: Earned 1,000,000 credits (1,997,552); PPS Sieve Double Silver: Earned 200,000,000 credits (494,307,821); Sierpinski (ESP/PSP/SoB) Sieve (suspended) Gold: Earned 500,000 credits (768,186); TRP Sieve (suspended) Bronze: Earned 10,000 credits (26,600); AP 26/27 Silver: Earned 100,000 credits (167,767); PSA Ruby: Earned 2,000,000 credits (2,228,200)
Message 31924 - Posted: 1 Feb 2011 | 17:08:48 UTC
Last modified: 1 Feb 2011 | 17:12:44 UTC

Since the double-length WUs arrived, I've been getting errors on two of my GPUs. Both run the older, shorter WUs with no errors; the double-length WUs are the problem. Both machines have 4 GB of RAM and plenty of free disk space.

The first is a GTX 260:

http://www.primegrid.com/show_host_detail.php?hostid=134752

It runs the shorter WUs with no problems but many of the longer ones fail at 4056 seconds with this message:

<core_client_version>6.12.4</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
Sieve started: 19949512000000000 <= p < 19949518000000000
Thread 0 starting
Detected GPU 0: GeForce GTX 260
Detected compute capability: 1.3
Detected 27 multiprocessors.
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75A622A1


The second problem GPU is on a dual GPU machine:

http://www.primegrid.com/show_host_detail.php?hostid=89885

The GTX 260 runs the longer WUs fine, but every long WU on the GT 240 fails at 3876 seconds with a message similar to the one above:

<core_client_version>6.12.11</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
Sieve started: 19469632000000000 <= p < 19469638000000000
Thread 0 starting
Detected GPU 1: GeForce GT 240
Detected compute capability: 1.2
Detected 12 multiprocessors.
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7D61002D
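For anyone comparing many failed results, a small script can pull the relevant fields out of output like the two dumps above. This is only a sketch: the function name `summarize_result` and the returned keys are made up for illustration, and the patterns are based solely on the two samples in this post, so other app versions may print something different.

```python
import re


def summarize_result(text):
    """Extract the client message and GPU details from a result's stderr dump.

    Works on both the one-line and the line-broken form of the output,
    because every pattern treats any whitespace (including newlines) alike.
    """
    def find(pattern):
        match = re.search(pattern, text, re.S)
        return match.groups() if match else None

    message = find(r"<message>\s*(.*?)\s*</message>")
    gpu = find(r"Detected GPU (\d+): (.*?)\s+Detected compute capability: (\d+\.\d+)")
    mps = find(r"Detected (\d+) multiprocessors")
    sieve = find(r"Sieve started: (\d+) <= p < (\d+)")

    return {
        "message": message[0] if message else None,
        "gpu_index": int(gpu[0]) if gpu else None,
        "gpu_name": gpu[1] if gpu else None,
        "compute_capability": gpu[2] if gpu else None,
        "multiprocessors": int(mps[0]) if mps else None,
        "p_start": int(sieve[0]) if sieve else None,
        "p_end": int(sieve[1]) if sieve else None,
    }


# Sample taken from the GTX 260 result quoted in this post.
SAMPLE = (
    "<core_client_version>6.12.4</core_client_version> <![CDATA[ <message> "
    "Maximum elapsed time exceeded </message> <stderr_txt> Sieve started: "
    "19949512000000000 <= p < 19949518000000000 Thread 0 starting "
    "Detected GPU 0: GeForce GTX 260 Detected compute capability: 1.3 "
    "Detected 27 multiprocessors. Unhandled Exception Detected..."
)

if __name__ == "__main__":
    for key, value in summarize_result(SAMPLE).items():
        print(f"{key}: {value}")
```

Fields that are missing from a dump come back as `None`, so truncated output like the samples here does not raise an error.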


The key seems to be "Maximum elapsed time exceeded".
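As a rough illustration of where a message like that could come from: as far as I know, a BOINC workunit carries a work bound (`rsc_fpops_bound`) and the client aborts a task once its elapsed time passes that bound divided by the speed it assumes for the device. That rule is my assumption and is not stated anywhere in this thread, and the numbers below are hypothetical, picked only so the cap lands on the 3876 seconds seen on the GT 240 host.

```python
def max_elapsed_seconds(fpops_bound, estimated_flops):
    """Elapsed-time cap implied by a work bound and a device speed estimate."""
    if estimated_flops <= 0:
        raise ValueError("estimated_flops must be positive")
    return fpops_bound / estimated_flops


def would_abort(elapsed_s, fpops_bound, estimated_flops):
    """True if a task that has run for elapsed_s seconds is past the cap."""
    return elapsed_s > max_elapsed_seconds(fpops_bound, estimated_flops)


# Hypothetical values, NOT taken from PrimeGrid's real workunit settings.
FPOPS_BOUND = 3_876 * 10**12   # assumed work bound, in floating-point ops
ESTIMATED_FLOPS = 10**12       # assumed device speed, in ops per second

if __name__ == "__main__":
    cap = max_elapsed_seconds(FPOPS_BOUND, ESTIMATED_FLOPS)
    print(f"cap: {cap:.0f} s")
    # A short WU needing 2500 s stays under the cap; a double-length one does not,
    # unless the bound is doubled along with the WU length.
    print("short WU aborted:", would_abort(2500, FPOPS_BOUND, ESTIMATED_FLOPS))
    print("long WU aborted: ", would_abort(5000, FPOPS_BOUND, ESTIMATED_FLOPS))
    print("long WU, doubled bound:", would_abort(5000, 2 * FPOPS_BOUND, ESTIMATED_FLOPS))
```

Under these assumptions, a host would run the short WUs cleanly and fail the long ones at a fixed time, which is the pattern described above.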

For now I've had to switch the GTX 260 box to Collatz, since it's a remote machine. I've kept the dual GTX 260 / GT 240 box limping along by aborting all the longer WUs. As I stated above, both machines run the shorter WUs perfectly. Are these double-length WUs a permanent change, or are they the WUs under 1P? Is it a problem with v1.38? I'd expect many other machines to be having this problem as well. Any well thought out ideas would be appreciated.

Profile Beyond
Message 31969 - Posted: 2 Feb 2011 | 1:52:09 UTC - in response to Message 31924.

Any well thought out ideas would be appreciated

Maybe I set too high a bar. Ignore the statement above ;)

Profile Tony Li
Joined: 17 Aug 10
Posts: 22
ID: 65809
Credit: 2,921,285
RAC: 0
Cullen/Woodall Sieve Bronze: Earned 10,000 credits (25,840); PPS Sieve Ruby: Earned 2,000,000 credits (2,841,437); GFN Bronze: Earned 10,000 credits (53,575)
Message 32005 - Posted: 2 Feb 2011 | 20:23:34 UTC - in response to Message 31969.

See this thread:

http://www.primegrid.com/forum_thread.php?id=3049

____________

Profile Beyond
Message 32011 - Posted: 2 Feb 2011 | 21:47:33 UTC

Thanks Tony, but I don't think it's a WU parameter problem, because I have other machines that take as long or longer and never fail the long WUs. I tried installing Win7-64 (replacing XP-64) and it made no difference. I also tried upgrading to the newest NVIDIA drivers, and the long WUs still fail. Both the GTX 260 and the GT 240 run GPUGRID and Collatz perfectly. The GTX 260 also runs MW without a hitch. I'm stumped.

Profile rroonnaalldd
Project donor
Volunteer developer
Volunteer tester
Joined: 3 Jul 09
Posts: 1213
ID: 42893
Credit: 34,634,263
RAC: 0
321 LLR Silver: Earned 100,000 credits (101,692); Cullen LLR Silver: Earned 100,000 credits (104,876); ESP LLR Silver: Earned 100,000 credits (101,979); PPS LLR Silver: Earned 100,000 credits (148,018); PSP LLR Silver: Earned 100,000 credits (140,441); SoB LLR Silver: Earned 100,000 credits (119,475); SR5 LLR Silver: Earned 100,000 credits (120,939); SGS LLR Silver: Earned 100,000 credits (122,783); TRP LLR Silver: Earned 100,000 credits (100,115); Woodall LLR Silver: Earned 100,000 credits (107,459); 321 Sieve (suspended) Silver: Earned 100,000 credits (202,757); Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (6,908,135); PPS Sieve Sapphire: Earned 20,000,000 credits (25,450,104); Sierpinski (ESP/PSP/SoB) Sieve (suspended) Silver: Earned 100,000 credits (130,966); TRP Sieve (suspended) Silver: Earned 100,000 credits (201,525); AP 26/27 Silver: Earned 100,000 credits (100,015); GFN Silver: Earned 100,000 credits (246,369); PSA Silver: Earned 100,000 credits (226,594)
Message 32022 - Posted: 2 Feb 2011 | 22:35:54 UTC - in response to Message 32011.

If you don't have too many hosts, my app_info files at http://primegrid.pytalhost.net could be a solution.
But changing the app while a unit is being calculated is not a good idea: you will most likely lose all the units in the cache on that host.
____________
Best wishes. Knowledge is power. by jjwhalen

Profile Beyond
Message 32061 - Posted: 3 Feb 2011 | 16:57:20 UTC

Did some experimenting with the machine that has both a GTX 260 and GT 240. If I start a long WU on the GTX 260, pause it and finish it on the GT 240 it completes fine as long as the time does not exceed 3876 seconds. Likewise I can start a long WU on the GT 240, pause it and finish it on the GTX 260 and it completes fine as long as the time does not exceed 3876 seconds (the GTX 260 is twice as fast as the GT 240). It seems the client is aborting WUs at 3876 seconds on this machine. Does this clue help?
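The experiment above can be modelled with a toy simulation in which the elapsed-time cap applies to the task's total run time, no matter which GPU did the work. This is a sketch under stated assumptions: the 3876 s cap and the 2:1 speed ratio come from this post, while the total work figure (6000 units, with the GT 240 doing 1 unit per second) and the function name `run_task` are invented for illustration.

```python
def run_task(total_work, limit_s, segments, speeds):
    """Run a task across devices and report how it ends.

    segments: ordered list of (device, seconds_available) pairs.
    speeds:   work units per second for each device.
    Returns ("completed", elapsed), ("aborted", limit_s) or ("unfinished", elapsed).
    """
    elapsed = 0.0
    done = 0.0
    for device, seconds in segments:
        rate = speeds[device]
        needed = (total_work - done) / rate   # time to finish on this device
        run = min(seconds, needed)
        if elapsed + run > limit_s:
            return ("aborted", limit_s)       # cap is hit before this segment ends
        elapsed += run
        done += run * rate
        if done >= total_work:
            return ("completed", elapsed)
    return ("unfinished", elapsed)


LIMIT_S = 3876                                  # abort time reported in this post
SPEEDS = {"GT 240": 1.0, "GTX 260": 2.0}        # GTX 260 is twice as fast (from the post)
TOTAL_WORK = 6000                               # hypothetical size of a long WU
FOREVER = float("inf")

if __name__ == "__main__":
    cases = {
        "GT 240 only": [("GT 240", FOREVER)],
        "GTX 260 only": [("GTX 260", FOREVER)],
        "1000 s on GT 240, rest on GTX 260": [("GT 240", 1000), ("GTX 260", FOREVER)],
        "2000 s on GT 240, rest on GTX 260": [("GT 240", 2000), ("GTX 260", FOREVER)],
    }
    for name, segments in cases.items():
        print(name, "->", run_task(TOTAL_WORK, LIMIT_S, segments, SPEEDS))
```

In this model a WU that moves between the two cards finishes only if the combined run time stays under the cap, which matches what I saw.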

John
Project donor
Honorary cruncher
Joined: 21 Feb 06
Posts: 2875
ID: 2449
Credit: 2,681,934
RAC: 0
321 LLR Bronze: Earned 10,000 credits (11,773); Cullen LLR Bronze: Earned 10,000 credits (14,945); ESP LLR Bronze: Earned 10,000 credits (26,855); PPS LLR Bronze: Earned 10,000 credits (84,876); PSP LLR Bronze: Earned 10,000 credits (15,311); SoB LLR Bronze: Earned 10,000 credits (21,440); SR5 LLR Bronze: Earned 10,000 credits (29,270); SGS LLR Bronze: Earned 10,000 credits (26,616); TPS LLR (retired) Bronze: Earned 10,000 credits (36,288); TRP LLR Bronze: Earned 10,000 credits (41,655); Woodall LLR Bronze: Earned 10,000 credits (15,807); 321 Sieve (suspended) Bronze: Earned 10,000 credits (20,014); Cullen/Woodall Sieve Bronze: Earned 10,000 credits (23,405); PPS Sieve Bronze: Earned 10,000 credits (36,192); Sierpinski (ESP/PSP/SoB) Sieve (suspended) Bronze: Earned 10,000 credits (20,306); TRP Sieve (suspended) Bronze: Earned 10,000 credits (21,738); GFN Bronze: Earned 10,000 credits (86,217); PSA Ruby: Earned 2,000,000 credits (2,143,756)
Message 32063 - Posted: 3 Feb 2011 | 17:02:27 UTC - in response to Message 32061.

Did some experimenting with the machine that has both a GTX 260 and GT 240. If I start a long WU on the GTX 260, pause it and finish it on the GT 240 it completes fine as long as the time does not exceed 3876 seconds. Likewise I can start a long WU on the GT 240, pause it and finish it on the GTX 260 and it completes fine as long as the time does not exceed 3876 seconds (the GTX 260 is twice as fast as the GT 240). It seems the client is aborting WUs at 3876 seconds on this machine. Does this clue help?

Yes, please see here.
____________

Profile Beyond
Message 32093 - Posted: 4 Feb 2011 | 1:55:04 UTC - in response to Message 32063.

Now we have attempted to adjust the old WU's in the buffer that were affected. Please provide feedback if this worked or not.

That fixed it. Now working as expected. Thanks John!

Copyright © 2005 - 2023 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 1.75, 1.62, 1.68
Generated 3 Jun 2023 | 9:03:45 UTC