How often does Gerbicz error checking take place while running a unit? Per iteration? Every n iterations? Every x% of progress? If it varies by project, narrowing it to CUL/WOO would suffice for now.
Back story: I'm running a system that turned out to be unstable during the previous challenge. I'm itching to get some more work done in this challenge, so I decided to run it while simultaneously trying to fix it. I know it isn't the best idea to do this on live data, but since the errors occur infrequently, I thought I'd take that chance. Last time's attempted fix, lowering the power limit, did not help at all and actually made it worse. So this time I'm going the opposite route: increasing the voltage offset. At the same power cap, this might reduce average clocks, but I'd take stability over speed. I've been monitoring stderr while the first unit has been running, and roughly halfway through, a Gerbicz error was detected. I just made the first voltage offset increase and will see if there are more. Knowing how often Gerbicz error checks are performed would tell me how much in-progress work may be done in a potentially erroneous state after I make a configuration change.