PrimeGrid

Message boards : AP26 - AP27 Search : AP27 tasks being consistently invalid?

Incqption
Joined: 15 Nov 17
Posts: 11
ID: 947827
Credit: 5,399,304
RAC: 39,143
Message 130509 - Posted: 19 Jun 2019 | 23:13:48 UTC
Last modified: 19 Jun 2019 | 23:16:23 UTC

Running AP27 tasks on a Sapphire RX 550 (4GB) GPU. Nothing is overclocked, and 3/4 of my latest AP27 tasks have failed to validate. All other GPU tasks (GFN, Manual Sieving, etc.) are computing as normal. What is going on? :P

EDIT: forgot to mention: The host is running Ubuntu 18.04.2 LTS. It is equipped with the 19.20 AMD GPU driver.

stream
Volunteer developer
Volunteer tester
Joined: 1 Mar 14
Posts: 580
ID: 301928
Credit: 451,697,040
RAC: 114
Message 130515 - Posted: 20 Jun 2019 | 10:45:09 UTC

Checking the error messages from your tasks led to this thread:

https://lists.freedesktop.org/archives/mesa-dev/2019-June/219737.html

Alas, I have no idea who is right and who is wrong here, or what to do about it. Mesa developers blame AMD. People blame Mesa because this change broke a lot of software. But it seems that Mesa, starting from 18.1.4 (where this change was first introduced), and current AMD drivers are not compatible. The problem may or may not appear, depending on the complexity and style of the GPU program's source code.

There is probably also a small bug in the AP27 program, which should check that the kernel ran successfully and exit immediately on error instead of running for the full time and producing an invalid result. But I haven't looked at the source and may be wrong.
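
A minimal sketch of the kind of check meant here, written in Python rather than taken from the AP27 source (which is not shown in this thread). The `launch` callable and the `run_search` name are hypothetical stand-ins for the real kernel dispatch; the only fact assumed from OpenCL is that a status code of 0 means success.

```python
from typing import Callable, Iterable

CL_SUCCESS = 0  # OpenCL reports success as status code 0


class KernelError(RuntimeError):
    """Raised when a kernel launch reports a non-success status."""


def run_search(chunks: Iterable[int], launch: Callable[[int], int]) -> int:
    """Run `launch` on every chunk, stopping at the first failed launch.

    `launch` is a hypothetical stand-in for the real kernel dispatch; it
    returns an OpenCL-style status code (0 means success).
    Returns the number of chunks that completed successfully.
    """
    done = 0
    for chunk in chunks:
        status = launch(chunk)
        if status != CL_SUCCESS:
            # Stop right away instead of spending hours on a result
            # that cannot validate.
            raise KernelError(f"chunk {chunk} failed with status {status}")
        done += 1
    return done
```

With a check like this, a task hitting the driver problem would stop at the first failed launch rather than running to completion with a bad result.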

Incqption
Message 130516 - Posted: 20 Jun 2019 | 11:09:20 UTC - in response to Message 130515.

stream wrote: "But it seems that Mesa, starting from 18.1.4 (where this change was first introduced), and current AMD drivers are not compatible."

So should I just install an older driver?

stream wrote: "The problem may or may not appear, depending on the complexity and style of the GPU program's source code."

That makes sense, as all GFN tasks, which are also OpenCL, work perfectly fine.

stream wrote: "There is probably also a small bug in the AP27 program, which should check that the kernel ran successfully and exit immediately on error instead of running for the full time and producing an invalid result. But I haven't looked at the source and may be wrong."

That is true. The AP27 tasks actually run for the whole 5-6 hours before exiting :P

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 13048
ID: 53948
Credit: 203,040,365
RAC: 86,329
Message 130520 - Posted: 20 Jun 2019 | 12:06:07 UTC - in response to Message 130516.

stream wrote: "There is probably also a small bug in the AP27 program, which should check that the kernel ran successfully and exit immediately on error instead of running for the full time and producing an invalid result. But I haven't looked at the source and may be wrong."

Incqption wrote: "That is true. The AP27 tasks actually run for the whole 5-6 hours before exiting :P"

Actually, that behavior (detecting the error and exiting) isn't ideal either. You'll chew up thousands of tasks that way. Not good for the server, your wingmen, or (if you pay for your data connection) your wallet. Mostly, it's all the other users that get irritated.

Even better is what I do with Genefer: if it detects an environmental error, and thus suspects that subsequent tasks will also immediately fail, it pauses for an hour, and then aborts. That way your computer isn't consuming electricity for no reason and won't run through a useless task every three seconds. Genefer will also periodically check during the hour to see if the problem has resolved itself and will continue the calculation if it can.
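
A rough sketch of that pause-then-abort pattern, in Python and not taken from Genefer's source. The function name, the `env_ok` callback, and the 60-second polling interval are all assumptions for illustration; only the one-hour limit and the periodic re-check come from the description above.

```python
import time
from typing import Callable


def wait_for_environment(
    env_ok: Callable[[], bool],
    max_wait: float = 3600.0,      # one hour, as described above
    poll_interval: float = 60.0,   # assumed value, not Genefer's
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `env_ok` for up to `max_wait` seconds.

    Returns True as soon as the environment looks healthy again, so the
    caller can continue the calculation. If the time runs out, the
    environment is checked one last time; False means the caller should
    abort the task.
    """
    waited = 0.0
    while waited < max_wait:
        if env_ok():
            return True
        sleep(poll_interval)  # idle instead of burning through tasks
        waited += poll_interval
    return env_ok()
```

A task would call this once it detects an environmental error, resume if it returns True, and abort otherwise. Passing `sleep` as a parameter is just a convenience so the logic can be exercised without actually waiting an hour.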

____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

stream
Message 130521 - Posted: 20 Jun 2019 | 12:11:12 UTC - in response to Message 130516.

stream wrote: "But it seems that Mesa, starting from 18.1.4 (where this change was first introduced), and current AMD drivers are not compatible."

Incqption wrote: "So should I just install an older driver?"

This is the simplest and safest thing to try.

More dangerous things to try:

- Install an older version of the Mesa package (18.1.3 or below), if that's possible for your distro. Success will depend on whether these relocations, which Mesa does not know how to handle, are really required to run the GPU program. If they are, the GPU program may crash or hang.

- Uninstall Mesa, if that's possible for your distro, and install a generic OpenCL ICD loader (the Khronos or NVIDIA one). This may really break dependencies and the desktop on your system, so it's for "really advanced" users only.
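
Before changing any packages, it can help to see which OpenCL implementations are registered with the ICD loader. The following is a minimal sketch, assuming the conventional Linux registry directory `/etc/OpenCL/vendors`, where each `*.icd` file contains the name or path of a vendor library. The function name is made up for illustration, and the directory may differ on some systems.

```python
from pathlib import Path
from typing import Dict


def list_opencl_icds(vendors_dir: str = "/etc/OpenCL/vendors") -> Dict[str, str]:
    """Map each *.icd file name to the library name or path it contains.

    Returns an empty dict if the directory does not exist.
    """
    result: Dict[str, str] = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return result
    for icd in sorted(directory.glob("*.icd")):
        # Each .icd file is a small text file naming one vendor library.
        result[icd.name] = icd.read_text().strip()
    return result


if __name__ == "__main__":
    for name, library in list_opencl_icds().items():
        print(f"{name}: {library}")
```

Seeing both a Mesa entry and an AMD entry here would at least confirm that more than one OpenCL implementation is installed side by side.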

Incqption
Message 130522 - Posted: 20 Jun 2019 | 12:23:03 UTC - in response to Message 130521.

The problem is that older AMD drivers, for some reason, are made SPECIFICALLY for particular older Ubuntu versions and won't even run on other ones, older or newer, and I'm not really willing to downgrade my entire system. :P

stream wrote: "More dangerous things to try:

- Install an older version of the Mesa package (18.1.3 or below), if that's possible for your distro. Success will depend on whether these relocations, which Mesa does not know how to handle, are really required to run the GPU program. If they are, the GPU program may crash or hang.

- Uninstall Mesa, if that's possible for your distro, and install a generic OpenCL ICD loader (the Khronos or NVIDIA one). This may really break dependencies and the desktop on your system, so it's for 'really advanced' users only."

I actually have no idea how to do any of this, but I am willing to try, if instructed :P. I am more experienced with Windows. I have just unselected the AP27 project from my properties.

What I initially thought about, and I don't know if it makes any sense, is that AP27 and all GFN tasks both run on OpenCL. If there was a problem with my OpenCL driver, wouldn't it make all GFN tasks fail as well? (I have no history of failed GFN tasks.) Could there be an issue with the AP27 program itself?

Michael Goetz wrote: "Even better is what I do with Genefer: if it detects an environmental error, and thus suspects that subsequent tasks will also immediately fail, it pauses for an hour, and then aborts. That way your computer isn't consuming electricity for no reason and won't run through a useless task every three seconds. Genefer will also periodically check during the hour to see if the problem has resolved itself and will continue the calculation if it can."

That's an elegant solution right there :P

stream
Message 130523 - Posted: 20 Jun 2019 | 12:58:33 UTC - in response to Message 130522.

Incqption wrote: "I actually have no idea how to do any of this, but I am willing to try, if instructed :P. I am more experienced with Windows. I have just unselected the AP27 project from my properties."

Then it's safer to keep AP27 unselected until this issue is resolved officially (although I'm a bit pessimistic about that: as you can see in the Mesa mailing list thread, the first reports of this problem appeared almost a year ago, yet reasonable action was taken only a few days ago).

Incqption wrote: "What I initially thought about, and I don't know if it makes any sense, is that AP27 and all GFN tasks both run on OpenCL. If there was a problem with my OpenCL driver, wouldn't it make all GFN tasks fail as well? (I have no history of failed GFN tasks.) Could there be an issue with the AP27 program itself?"

If all your GFN tasks were validated, you have nothing to worry about. OpenCL programs are shipped as source code. A GPU driver is really a full-scale compiler which compiles this source code to machine code for the specific GPU on the fly. Since this is a very complex piece of software, it often contains hard-to-find bugs which are triggered only by specific patterns in the source code. So it's quite a common situation that program YYY compiles and runs, but program ZZZ (syntactically correct) causes a crash, an internal error, or generation of incorrect machine code.

Copyright © 2005 - 2019 Rytis Slatkevičius and PrimeGrid community.