I'm not a new user but I feel that this belongs here:
To prepare for the upcoming TdP I have been running some GFN15/16 and PPS tasks. The PPS task ran fine, but the GFN15/16 tasks always stopped about midway, and when I looked in stderr.txt I saw something like this:
Maxerr exceeded, 1.0000 >0.4500
Pausing 10 minutes to continue from last checkpoint
Does this have anything to do with my AMD driver? I am running these tasks on a Vega 8 iGPU, and I recently installed Adrenalin 19.2.2. Do I have to install a newer version? Or is this a different problem, given that other projects are running fine, such as Milkyway and even stream's GFN13 tasks?
Edit: here is one of the tasks that I aborted
http://www.primegrid.com/result.php?resultid=1056775901
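For context, a `Maxerr exceeded` message like the one quoted above usually comes from a round-off check in FFT-based multiplication: after the inverse transform every coefficient should be very close to an integer, and a large distance from the nearest integer means the result cannot be trusted. The following is only a minimal sketch of that general idea, not Genefer's actual code. The 0.45 limit is taken from the log line; the function names, the base-10 digit representation, and the error measure are assumptions made for illustration.

```python
import cmath

MAXERR_LIMIT = 0.45  # threshold copied from the log line; everything else is illustrative


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply_with_check(a_digits, b_digits):
    """Multiply two little-endian base-10 digit lists via FFT.

    Returns (product_coefficients, max_roundoff_error).
    """
    size = 1
    while size < len(a_digits) + len(b_digits):
        size *= 2
    fa = fft(a_digits + [0] * (size - len(a_digits)))
    fb = fft(b_digits + [0] * (size - len(b_digits)))
    raw = fft([x * y for x, y in zip(fa, fb)], invert=True)
    coeffs, max_err = [], 0.0
    for value in raw:
        exact = value.real / size          # unnormalized inverse, so divide by size
        nearest = round(exact)             # the coefficient should be an integer
        max_err = max(max_err, abs(exact - nearest))
        coeffs.append(nearest)
    return coeffs, max_err


def digits_to_int(coeffs):
    """Recombine (possibly un-carried) base-10 coefficients into an integer."""
    return sum(c * 10 ** i for i, c in enumerate(coeffs))


def is_reliable(max_err):
    """True when the round-off error stays under the limit."""
    return max_err < MAXERR_LIMIT


coeffs, err = multiply_with_check([5, 4, 3, 2, 1], [9, 8, 7, 6])  # 12345 * 6789
print(digits_to_int(coeffs), err, is_reliable(err))
```

On healthy hardware the error in a small case like this is tiny. A miscompiled kernel or an unstable, overheating GPU can corrupt intermediate values, which would push the error past the limit and trigger a restart from the last checkpoint.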
____________
My lucky number is 6219*2^3374198+1
|
|
|
Monkeydee Volunteer tester
Joined: 8 Dec 13 Posts: 526 ID: 284516 Credit: 1,383,056,451 RAC: 790,385
|
Update to the newest drivers and see what happens.
Check that your temperatures aren't too high. CPU below 80C should be fine.
____________
My Primes
Badge Score: 4*2 + 6*2 + 7*5 + 8*7 + 11*3 + 12*1 = 156
|
|
|
|
Update to the newest drivers and see what happens.
I tried 20.1.2 but Windows says, "This app doesn't work on your PC, please go to the distributor to find one that does"
Check that your temperatures aren't too high. CPU below 80C should be fine.
CPU runs constantly at 75-78; when GPU has a task it goes up to 82; when it doesn't, about 60.
____________
My lucky number is 6219*2^3374198+1
|
|
|
Zach
Joined: 3 May 18 Posts: 24 ID: 1010731 Credit: 45,423,135 RAC: 45,316
|
I was getting a bunch of these with Genefer WUs too. I couldn't figure out how to get my machine to let me declock the GPU, but I did reduce the max. allowable temperature to 70. This seemed to fix the issue.
I can see in GPU-Z that it's now "capping performance" to keep the temperature down. I think it kept kicking into boost mode or something, which I've effectively disallowed with the temperature cap. Might work for you too.
|
|
|
Okay, I'll try that, but it's an iGPU as I said, and I couldn't seem to throttle the temperature from GPU-Z.
____________
My lucky number is 6219*2^3374198+1
|
|
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
Joined: 1 Mar 14 Posts: 1022 ID: 301928 Credit: 543,195,188 RAC: 150
|
Unfortunately, the problem is too generic. The cause could be anywhere. It could be a driver problem: OpenCL code is compiled on the fly and, as a rough example, the compiler may produce correct GPU code for 13 iterations of some loop in GFN-13 and bad code for 16 iterations. It may be a hardware problem (overclocking, overheating), because GFN-13 units are very small (the CPU/iGPU is not stressed much) and use less memory than GFN15/16.
|
|
|
|
Unfortunately, the problem is too generic. The cause could be anywhere. It could be a driver problem: OpenCL code is compiled on the fly and, as a rough example, the compiler may produce correct GPU code for 13 iterations of some loop in GFN-13 and bad code for 16 iterations.
What do you mean when you say "compiled on the fly"?
Unfortunately, I have such a terrible network connection here in China that downloading AMD's 20.1.4 would have taken 12 hours or so, which was unrealistic, so I haven't been able to reinstall the drivers.
So AFAIK I have no drivers installed on this laptop.
It may be a hardware problem (overclocking, overheating), because GFN-13 units are very small (the CPU/iGPU is not stressed much) and use less memory than GFN15/16.
Could be, but how do I throttle down to see if it is this problem?
____________
My lucky number is 6219*2^3374198+1
|
|
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
Joined: 1 Mar 14 Posts: 1022 ID: 301928 Credit: 543,195,188 RAC: 150
|
Unfortunately, the problem is too generic. The cause could be anywhere. It could be a driver problem: OpenCL code is compiled on the fly and, as a rough example, the compiler may produce correct GPU code for 13 iterations of some loop in GFN-13 and bad code for 16 iterations.
What do you mean when you say "compiled on the fly"?
The GPU part of an OpenCL program is really C source code (it can be seen inside the executable). An OpenCL-compatible video driver is really a very complex, full-scale compiler that translates this C source into GPU-specific instructions and data. This is done every time the program is run. Writing and debugging compilers is very difficult, so these drivers/compilers may contain hidden bugs that show themselves only in certain apps under specific conditions. Such bugs may lead to the generation of incorrect GPU machine code, which in turn causes wrong calculations or crashes.
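As a rough analogy only (this is plain Python, not OpenCL), the "compiled on the fly" idea can be sketched like this: the program carries source text as a string, and that text is turned into executable code each time the program starts. In a real OpenCL host program the corresponding steps are the API calls `clCreateProgramWithSource` and `clBuildProgram`, with the GPU driver doing the compilation. The names `KERNEL_SOURCE`, `build_kernel`, and `square_all` below are hypothetical and exist only for this illustration.

```python
# Source text shipped inside the program, in the way an OpenCL kernel is
# shipped as a string inside the executable (hypothetical example kernel).
KERNEL_SOURCE = """
def square_all(values):
    return [v * v for v in values]
"""


def build_kernel(source, entry_point):
    """Compile source text at run time and return the named function.

    Python's built-in compiler stands in here for the compiler inside
    the GPU driver; a bug in that compiler would produce wrong code from
    perfectly good source.
    """
    code = compile(source, "<embedded kernel>", "exec")  # the run-time compile step
    namespace = {}
    exec(code, namespace)  # load the compiled code
    return namespace[entry_point]


square_all = build_kernel(KERNEL_SOURCE, "square_all")
print(square_all([1, 2, 3]))  # prints [1, 4, 9]
```

Because the compile step happens on the user's machine, the same application can behave differently under different driver versions, which is why updating the driver is a reasonable first thing to try.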
|
|
|
Zach
Joined: 3 May 18 Posts: 24 ID: 1010731 Credit: 45,423,135 RAC: 45,316
|
Could be, but how do I throttle down to see if it is this problem?
I believe the most common program is MSI Afterburner. It has a GUI, so you don't have to mess around with the BIOS.
|
|
|
|
I think it actually is this problem, because I've had no such errors since I started using GPU-Z, which magically throttles things down! Now both CPU and GPU are running at 70 constantly.
One more question: why does GPU-Z show 10% GPU usage when my iGPU is running a GFN task?
____________
My lucky number is 6219*2^3374198+1
|
|
|
|
Only GPU-Z works for me. I have an iGPU, and it isn't able to throttle its temperature.
For now I'll stick with GPU-Z until the driver completes its download. ;)
428/483MB, 20kB/s
____________
My lucky number is 6219*2^3374198+1
|
|
|
|
It did complete and now everything is working fine! Thank you!
____________
My lucky number is 6219*2^3374198+1
|
|
|