Message boards : Sieving : DC on sieving using CPU is Only waste of TIME.
There is no reason to DC sieve WUs!
We normally sieve until the seconds per factor reaches roughly 100% of the LLR time for a candidate at about 75-80% of the highest n level in the range.
Now with DC, that seconds-per-factor figure is multiplied by 2 (or more).
That means there is no reason to sieve as deep as we would have if each WU were tested only once.
Even if we miss 10% of the factors with single tests, we still save time compared to DC.
Sieving exists only to save time when running LLR!
Unless we would lose 50% or more of the factors without DC, DC is just a waste of time.
My goal is to minimize the time it takes to find primes, so I would like to stop all DC'ing of sieve WUs.
Any thoughts?
Lennart
____________
Honza (Volunteer moderator, Volunteer tester, Project scientist)
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,143,342,172 RAC: 2,285,158
Just a thought...does it make sense to use AR?
Only a small overhead, but still high reliability.
____________
My stats
Just a thought...does it make sense to use AR?
Only a small overhead, but still high reliability.
Yes, it makes sense. I think that's a good idea, if only we can find a good way to catch as many errors as we can in an easy way.
Lennart
Scott Brown (Volunteer moderator, Project administrator, Volunteer tester, Project scientist)
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,665,844,835 RAC: 6,925,862
I thought the DC on the sieves was brought about due to a problem with some form of cheating. Am I not remembering that correctly? If I am remembering correctly, is that no longer a concern?
I thought the DC on the sieves was brought about due to a problem with some form of cheating. Am I not remembering that correctly? If I am remembering correctly, is that no longer a concern?
I don't see that as a problem. If we have an expected time check in the validator, such cheating should not be a big problem.
Lennart
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
I thought the DC on the sieves was brought about due to a problem with some form of cheating. Am I not remembering that correctly? If I am remembering correctly, is that no longer a concern?
That's one of the problems. The other problem is that the lack of double checking allowed a serious flaw in the PPS sieve program to go unnoticed for many years. Since the ATI version of the PPS-Sieve program was introduced (in 2010?) it's been missing factors and we didn't discover this until recently. It's only once we started the partial double checking and made the validator really picky about matching the results that we noticed something was wrong.
The question boils down to which should we consider to be more important:
1) Getting the maximum amount of sieving done as quickly as possible, with the goal of therefore reducing the number of LLR candidates as quickly as possible, or
2) Making sure the sieve file is as accurate as we can, i.e., providing the maximum protection against both hardware and software errors, while at the same time providing the most protection against cheaters.
____________
My lucky number is 75898^524288+1
I thought the DC on the sieves was brought about due to a problem with some form of cheating. Am I not remembering that correctly? If I am remembering correctly, is that no longer a concern?
That's one of the problems. The other problem is that the lack of double checking allowed a serious flaw in the PPS sieve program to go unnoticed for many years. Since the ATI version of the PPS-Sieve program was introduced (in 2010?) it's been missing factors and we didn't discover this until recently. It's only once we started the partial double checking and made the validator really picky about matching the results that we noticed something was wrong.
The question boils down to which should we consider to be more important:
1) Getting the maximum amount of sieving done as quickly as possible, with the goal of therefore reducing the number of LLR candidates as quickly as possible, or
2) Making sure the sieve file is as accurate as we can, i.e., providing the maximum protection against both hardware and software errors, while at the same time providing the most protection against cheaters.
The PPS sieve has nothing to do with CPU sieving.
If the goal is to get an accurate sieve file, then I must have misunderstood the goal of finding primes.
Lennart
Dave
Joined: 13 Feb 12 Posts: 3208 ID: 130544 Credit: 2,286,744,994 RAC: 758,237
Definitely #2 is most important in my book.
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
When I first saw the title of this thread I thought it was a poorly timed April Fools' joke. However, after reviewing the responses, it's for real. I looked further at the sieves and it appears that all of them are in full DC. I don't know when DC went into effect, but I've seen references to it as far back as Jan 2014. Ouch!
"Sieve DC", "pending sieve credit", "sieve replication 2" are all oxymorons. :) It's such a contradiction in terms that it must have taken extreme circumstances to reach this point of implementation...not to mention a complete failure of communication, discussion, collaboration, and problem solving. I believe it's in this last area where the real problem exists.
Honestly, there are so many things wrong with this that I don't know where to begin. Let me start off by presenting an unfair, cherry-picked example of one extreme occurrence. 2, 3, 5, and 7 are all factors of 210. If my sieve only finds the number 7 and misses the others, it would only have a 25% accuracy rate. Nevertheless, 210 would still be removed from the LLR test pool.
Now let's take another example where the sieve misses a factor for an n=28M SoB candidate. This means that this candidate will now have to be LLR'd to determine whether it is prime or not.
I understand there have been several issues with the PRS (PPS & RSP) sieves:
k=5 and others undersieve
p=35T-40T error
Range boundary omissions (both k and n boundaries)
ATI bug
However, even with all these issues, I can bet the farm, the winning lotto ticket, and the pension that the time lost in repairing these issues pales in comparison to the time wasted by converting to full DC.
A missed factor is only a penalty after the candidate has been LLR'd. Doing a full DC on the sieves in real time is very damaging to managed resources and a waste of time. It's like using 99.9% of the resources to find 0.1% of the problem. Why is there no discussion at PG about this?
The depressing thing is that the PG community has drunk this Kool-Aid for at least the past 8 months with no real discussion or problem solving. Apparently, PG is failing its secondary goal of educating the community. Does anyone see the painful irony of a PPS sieve DC and a PPS LLR AR (Adaptive Replication)?
When it comes to cheating, it appears that cobblestones have overruled common sense. The pursuit of efficiency and best use of resources has been replaced by the protection of a meaningless credit.
Don't get me wrong, cobblestones are an important part of running projects on BOINC, but when they trump finding a good solution, something is broken.
If PG wishes to continue DCing sieves, at the very least they should hold off until the winter months so the hosts can at least be effective as heaters. (Granted, sunlight passing through a window onto a thermal mass will be 100 times more effective.)
Because no good solution to the cheating issue was found, a few unscrupulous users have forced the entire PG community to waste their time and resources (if they choose to participate in a sieve).
From what I can tell, PG has GREATLY improved its internals: operations appear smooth, stats are more detailed, capacity has increased, challenges run without a hitch, and data integrity is much higher. The ship is clean and in good form; however, someone is asleep in the wheelhouse. ;)
I understand that times have changed, that I'm out of touch, and that new perspectives are needed, but c'mon, someone needs to get into the wheelhouse or at least wake up the driver!
____________
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
Joined: 1 Mar 14 Posts: 1033 ID: 301928 Credit: 543,624,271 RAC: 6,563
From my point of view, AR with strict settings could be acceptable to make cheaters' lives hard and minimize the number of possible missed factors. The AR coefficients must be tweaked in such a way that:
1. Even for the most reliable hosts, the probability of a workunit being DC'ed must be no less than 5-10%.
2. A host must return a significant number (e.g. 50-100) of valid workunits before it's considered reliable.
3. A single "inconclusive" error must immediately return the host to "unreliable" mode.
4. A host with multiple inconclusive results could be locked into "unreliable" mode or even banned from the subproject.
5. The PPS sieve still returns a significant number of factors per workunit, so results without factors must be force-DC'ed. Unfortunately, that's not possible for CPU sieves without a significant increase in workunit size.
Of course, the challenge must be completed and DC'ed before any changes are made.
Just my two cents.
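For illustration only, here is a rough Python sketch of how rules 1-4 above might be expressed as server-side logic. The class name, thresholds and structure are all invented for this example; this is not PrimeGrid's actual validator code.

import random

class HostReliability:
    MIN_VALID_WUS = 50       # rule 2: valid results required before "reliable"
    SPOT_CHECK_RATE = 0.10   # rule 1: DC at least ~10% of work even when reliable
    MAX_INCONCLUSIVES = 3    # rule 4: repeated inconclusives lock the host out

    def __init__(self):
        self.valid_count = 0
        self.inconclusive_count = 0
        self.locked_unreliable = False

    def is_reliable(self):
        return not self.locked_unreliable and self.valid_count >= self.MIN_VALID_WUS

    def needs_double_check(self):
        # Unreliable hosts are always double checked; reliable hosts still
        # receive a random spot check so cheating stays risky (rule 1).
        if not self.is_reliable():
            return True
        return random.random() < self.SPOT_CHECK_RATE

    def record_valid(self):
        self.valid_count += 1

    def record_inconclusive(self):
        # Rule 3: a single inconclusive drops the host back to "unreliable".
        self.valid_count = 0
        self.inconclusive_count += 1
        if self.inconclusive_count >= self.MAX_INCONCLUSIVES:
            self.locked_unreliable = True  # rule 4: lock out of the subproject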
Does anyone see the painful irony of a PPS sieve DC and a PPS LLR AR (Adaptive Replication)?
Yes, indeed (although when I noticed AR on PPS I considered it acceptable - PPS is not a conjecture).
BTW, another point of possible waste of resources could be SoB. I could be mistaken, but, as far as I know, the SoB (non-PG part, of course), is not DC'ed at all. To be exact, some of their early ranges had partial verification (kind of manual AR) - the result of this DC was two missed primes... I found no traces that they changed to DC or did any new manual verification, so I'm considering their recent ranges completely untrusted. I'll be glad to hear that I'm wrong.
From my point of view, AR with strict settings could be acceptable to make cheaters' lives hard and minimize the number of possible missed factors. The AR coefficients must be tweaked in such a way that:
1. Even for the most reliable hosts, the probability of a workunit being DC'ed must be no less than 5-10%.
2. A host must return a significant number (e.g. 50-100) of valid workunits before it's considered reliable.
3. A single "inconclusive" error must immediately return the host to "unreliable" mode.
4. A host with multiple inconclusive results could be locked into "unreliable" mode or even banned from the subproject.
5. The PPS sieve still returns a significant number of factors per workunit, so results without factors must be force-DC'ed. Unfortunately, that's not possible for CPU sieves without a significant increase in workunit size.
[...snip...]
Points 1, 2 and 3 make sense to me, although I somehow have the idea that the number of valid tasks currently needed to be considered reliable differs from what you suggest. Not sure though, as that's something TPTB arrange.
Point 4 is an absolute no-go. A simple update of your OS, your graphics drivers or a borked virus scanner can all cause invalid results. Not to mention failing cooling systems, power outages, uncaught bugs in the software, trying to overclock, or random bad luck.
So banning from a subproject is really a bad idea if you ask me. And getting put back in the 'not trusted' queue seems punishment enough.
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Ken_g6 (Volunteer developer)
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
Does anyone see the painful irony of a PPS sieve DC and a PPS LLR AR (Adaptive Replication)?
5. The PPS sieve still returns a significant number of factors per workunit, so results without factors must be force-DC'ed.
When the cheating happened, it was in a range where many WUs were returning empty even for PPS Sieve. As I recall, I proposed adding some kind of checksum to PPS Sieve WUs, so that WUs could reliably be checked with AR, but that idea was rejected for some reason. Of course, this would be easier if we abandoned the CPU app, which might move this out of the scope of this thread.
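To make the checksum idea concrete, here is one way such a scheme might look. This is only an illustrative sketch, not the proposal that was actually made; find_factor() is a hypothetical stand-in for the sieve's inner test.

import hashlib

def sieve_with_checksum(candidates, p_start, p_end, find_factor):
    # Report not just the factors, but also a hash over every candidate
    # actually examined. Two independent runs of the same range can then
    # be compared even when both legitimately find no factors, which makes
    # "no factors" results verifiable and spot checks / AR meaningful.
    factors = []
    digest = hashlib.sha256(f"{p_start}:{p_end}".encode())
    for k, n in candidates:
        p = find_factor(k, n, p_start, p_end)  # hypothetical: returns factor p or None
        digest.update(f"{k},{n},{p}".encode())
        if p is not None:
            factors.append((k, n, p))
    # A result that skipped candidates cannot reproduce the checksum.
    return factors, digest.hexdigest()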
____________
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
As I recall, I proposed adding some kind of checksum to PPS Sieve WUs, so that WUs could reliably be checked with AR, but that idea was rejected for some reason. Of course, this would be easier if we abandoned the CPU app, which might move this out of the scope of this thread.
I'll let Jim speak for himself with regards to the validator, but from my perspective I would certainly welcome some sort of checksum in the sieve.
I'd go so far as to say that in a perfect world, all of the apps should have such a checksum.
I too remember the prior discussion, and I don't remember why we didn't do it. It wasn't because of a lack of interest on my part. Perhaps it was merely a desire to get the existing app back into production first before thinking about a change like that.
____________
My lucky number is 75898^524288+1
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
When it comes to cheating, it appears that cobblestones have overruled common sense. The pursuit of efficiency and best use of resources has been replaced by the protection of a meaningless credit.
...the above part of his post strikes me as painfully naive regarding the social and psychological efficiency of distributed computing.
Ah man, Scott, you failed to also include the trailing paragraph connected to the one above:
Don't get me wrong, cobblestones are an important part of running projects on BOINC, but when they trump finding a good solution, something is broken.
Maybe a simpler way of putting it is that the cobblestone itself is meaningless but the value it carries in the BOINC community is priceless. ;) Or should that be vice versa? Heck, the semantics probably fail in that statement as well.
You, of all people, know my history in regard to volunteering, BOINC, and project management. I may have been out of touch for the past 2 years, but I haven't forgotten what a cobblestone is.
The goal here is to simply find a better solution.
____________
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
BTW, another point of possible waste of resources could be SoB. I could be mistaken, but, as far as I know, the SoB (non-PG part, of course), is not DC'ed at all. To be exact, some of their early ranges had partial verification (kind of manual AR) - the result of this DC was two missed primes... I found no traces that they changed to DC or did any new manual verification, so I'm considering their recent ranges completely untrusted. I'll be glad to hear that I'm wrong.
SoB had a DC effort that found the missed primes. Last I recall it was in the 12M-13M range but that was over 2 years ago. I have no clue where it is now.
____________
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
I'm curious: for the upcoming challenge project, TRP Sieve, how many results have come back as invalid in the past 8 months? And of those invalids, how many are attributed to people attempting to cheat and how many are due to other issues (truncated factors/boundary)?
____________
Scott Brown (Volunteer moderator, Project administrator, Volunteer tester, Project scientist)
Joined: 17 Oct 05 Posts: 2392 ID: 1178 Credit: 18,665,844,835 RAC: 6,925,862
I have hidden my former post that referred to John as being "painfully naive" and wish to publicly apologize for it.
Regarding the issue in this thread, my opinion is that if the cheating issues can be held in check otherwise, then the DC efforts on the sieves at PG should be eliminated.
KEP
Joined: 10 Aug 05 Posts: 301 ID: 110 Credit: 12,352,853 RAC: 103
Regarding the issue in this thread, my opinion is that if the cheating issues can be held in check otherwise, then the DC efforts on the sieves at PG should be eliminated.
Well, I do believe that a checksum should be possible to create, or at least something that works similarly to LLR. It is just a case of having the brains and the muscles work together: in this case the brains are people like Axn, who can probably come up with a checksum system that is hard for cheaters to mimic, and the muscles (Rogue, who can program the checksum), for whom it should be possible to implement without much distress and constraint. :)
Just my 2 cents :)
If I understand Lennart's first post, the only risk of faulty sieve WUs is missing one or more factors, which could cause an extra candidate having to be LLR'd.
The main reason I saw to DC sieves was kind of the opposite of that one: avoiding a prime being wrongly ruled out of LLR by a faulty sieve. This would be problematic in conjectures as well as in MEGA PPS.
If that risk does not exist, I vote for ending DC on sieves or, at least, for moving to an AR system that largely minimizes the risk of "cheating".
An "accurate" sieve file is a nice idea. However, we can expect to have much faster hardware and software in the near future, which could do a full DC, if such a file became necessary, at a fraction of the cost of a full DC at the present time. So, if pursuing that "nice idea" now means a huge waste of resources - in the sense that they could be used for deeper sieving or primality testing - it should be postponed.
I had my share of wasted resources with PrimeGrid (actually, with PRPNet) - a few hundred hours of GPU work crunching Wall-Sun-Sun that later had to be redone. That time, it was nobody's fault.
This time, we - or you - have a chance to choose to avoid this kind of waste.
There's a sieve challenge starting in a few hours. Probably, a huge amount of resources will be channeled to TRP Sieve over the next couple of days. Half of those resources will be spent doing DC work. How much deeper could we go with a single pass?
Despite the last paragraph, I think that a full DC should be kept during the challenge. Not only because the challenge series has been very competitive this year, but also because the challenge data might shed some light on the usefulness (or, hopefully, on the uselessness) of DCing sieves.
Please forgive the length of this post. I found my 13th top5000 prime today. I know that would not be possible without the sieving work some of you have been doing. But I'm also wondering if it could be the 20th prime (or a larger one) if sieving DC was off.
Cheers,
____________
676754^262144+1 is prime
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
I want to put a few misconceptions to rest.
First of all, this idea isn't up for a vote, and I'm not soliciting opinions. This is a decision that was made a while ago, for reasons I'll explain (but will not debate). Feel free to express your opinions if you wish, but please do so with the understanding that this is not something we're looking to change in the near future.
There are three reasons for the DC, with the common thread being that without it we can't *detect* a problem. There could be no problem, a little problem, or a huge 100% problem, but without the checking we have no way of knowing. Here are the events that led up to the decision:
1) The PPS Sieve cheating went on for many months completely undetected. It was a fluke that we caught it. In the end we had to resieve months of work, and we're not at all sure that huge numbers of factors weren't missed in other ranges that weren't resieved. We honestly don't even know the scope of the damage. Jim tabulated the number of extra factors found when we redid the sieve. It was around 7 or 10 million (I forget which).
2) The PPS-Sieve ATI app was broken, and it was broken from the beginning. It's been missing factors all the way back to 2010 or 2011, whenever it came online. Based on a limited number of data points, some 10% of all the factors that should have been found on ATI tasks were missed. There's no way of getting those back unless we resieve everything done in the last 4 years.
3) We discovered earlier this year that the Android sieve app, used for TRP-Sieve (and potentially for ESP-Sieve) was completely broken. It **ALWAYS** returned "no factors". 100% of the time. So every task done on Android missed 100% of the factors.
Those last two events have also led to an increased testing regimen for apps, as well as the increased focus on double checking.
In short, the real reason for double checking isn't so much to catch the occasional missed factor. It's to detect problems, such as cheating or bad software, or any other problem. Without the DC, we don't know when something has broken. The loss of efficiency is the cost necessary to improve quality by having an early warning system for problems.
Based on 2 and 3 above, I'm also considering whether we should adopt a policy of temporarily doing full DC on any project for a period of time (perhaps 1 to 3 months) following the installation of new versions of any app. At the present time, only PPS, PPSE, and SGS aren't already full DC, so this would only apply to those apps when a new version of LLR comes out. With the last LLR release for FMA, we didn't do this, but we did several months worth of testing of the LLR app.
We may reconsider the DC decision for the sieves at some future time if the conditions warrant, but for now it's a done deal.
____________
My lucky number is 75898^524288+1
I want to put a few misconceptions to rest.
First of all, this idea isn't up for a vote, and I'm not soliciting opinions. This is a decision that was made a while ago, for reasons I'll explain (but will not debate). Feel free to express your opinions if you wish, but please do so with the understanding that this is not something we're looking to change in the near future.
There are three reasons for the DC, with the common thread being that without it we can't *detect* a problem. There could be no problem, a little problem, or a huge 100% problem, but without the checking we have no way of knowing. Here are the events that led up to the decision:
1) The PPS Sieve cheating went on for many months completely undetected. It was a fluke that we caught it. In the end we had to resieve months of work, and we're not at all sure that huge numbers of factors weren't missed in other ranges that weren't resieved. We honestly don't even know the scope of the damage. Jim tabulated the number of extra factors found when we redid the sieve. It was around 7 or 10 million (I forget which).
2) The PPS-Sieve ATI app was broken, and it was broken from the beginning. It's been missing factors all the way back to 2010 or 2011, whenever it came online. Based on a limited number of data points, some 10% of all the factors that should have been found on ATI tasks were missed. There's no way of getting those back unless we resieve everything done in the last 4 years.
3) We discovered earlier this year that the Android sieve app, used for TRP-Sieve (and potentially for ESP-Sieve) was completely broken. It **ALWAYS** returned "no factors". 100% of the time. So every task done on Android missed 100% of the factors.
Those last two events have also led to an increased testing regimen for apps, as well as the increased focus on double checking.
In short, the real reason for double checking isn't so much to catch the occasional missed factor. It's to detect problems, such as cheating or bad software, or any other problem. Without the DC, we don't know when something has broken. The loss of efficiency is the cost necessary to improve quality by having an early warning system for problems.
Based on 2 and 3 above, I'm also considering whether we should adopt a policy of temporarily doing full DC on any project for a period of time (perhaps 1 to 3 months) following the installation of new versions of any app. At the present time, only PPS, PPSE, and SGS aren't already full DC, so this would only apply to those apps when a new version of LLR comes out. With the last LLR release for FMA, we didn't do this, but we did several months worth of testing of the LLR app.
We may reconsider the DC decision for the sieves at some future time if the conditions warrant, but for now it's a done deal.
What you are saying is that you put an untested new app into production??
And a second app, the ATI app, also untested??
Do you think you could test the apps before releasing them??
Lennart
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
What you are saying is that you put an untested new app into production??
And a second app, the ATI app, also untested??
Do you think you could test the apps before releasing them??
Lennart
Actually, the ATI app predates Jim and me by a few years. However, given the nature of that particular error, I can understand how the persons in charge at the time might have missed it. Errors do sometimes escape detection in testing.
I take full responsibility, however, for not adequately testing the Android app. None of us, at the time, had an Android device to personally test it, and we relied on users testing it with app_info. We'll try not to make that mistake again!
____________
My lucky number is 75898^524288+1
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
The main reason I saw to DC sieves was kind of the opposite of that one: avoiding a prime being wrongly ruled out of LLR by a faulty sieve. This would be problematic in conjectures as well as in MEGA PPS.
This risk has never existed. The problems experienced with the sieves (now resolved) were all cases of missing a factor. A missed factor only means that the candidate must be LLR'd. A candidate is never removed without first confirming it has a factor.
Also, as mentioned earlier, a missed factor is only a penalty after a candidate has been LLR'd, as the factor could have saved that primality test by removing the candidate.
____________
composite (Volunteer tester)
Joined: 16 Feb 10 Posts: 1149 ID: 55391 Credit: 1,097,478,508 RAC: 755,733
we relied on users testing it with app_info. We'll try not to make that mistake again!
Hear, hear! May I further suggest that new apps be used only for DC ranges during the initial production rollout (the "parallel testing" period)? Users for whom cobblestones have the utmost importance will not be bothered at all that they are using the latest and greatest version of an app and not doing first-pass tests, as long as they are accruing credit.
John (Honorary cruncher)
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
First of all, this idea isn't up for a vote, and I'm not soliciting opinions. This is a decision that was made a while ago, for reasons I'll explain (but will not debate). Feel free to express your opinions if you wish, but please do so with the understanding that this is not something we're looking to change in the near future.
Well that certainly dampens the mood a bit. Hopefully it doesn't deter anyone from contributing to the dialogue, maybe even brainstorming for a better solution. Heck, a good solution has already been submitted (apparently over 8 months ago as well). I'm not sure why PG doesn't encourage open discussion and collaboration to solve relatively simple issues.
And as for the topic of DCing sieves and the inefficiency of that, I thought it couldn't possibly get worse. However, I stand corrected:
In short, the real reason for double checking isn't so much to catch the occasional missed factor. It's to detect problems, such as cheating or bad software, or any other problem. Without the DC, we don't know when something has broken. The loss of efficiency is the cost necessary to improve quality by having an early warning system for problems.
The real reason isn't even to catch factors but to catch cheaters and bad software? The PG community really deserves better than this. I'm at a loss for words. That even succeeded in silencing me for now.
____________
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
First of all, this idea isn't up for a vote, and I'm not soliciting opinions. This is a decision that was made a while ago, for reasons I'll explain (but will not debate). Feel free to express your opinions if you wish, but please do so with the understanding that this is not something we're looking to change in the near future.
Well that certainly dampens the mood a bit. Hopefully it doesn't deter anyone from contributing to the dialogue, maybe even brainstorming for a better solution. Heck, a good solution has already been submitted (apparently over 8 months ago as well). I'm not sure why PG doesn't encourage open discussion and collaboration to solve relatively simple issues.
I'd like to think we do just that. Many of the things we do come from suggestions on the forums. You all are free to discuss this as you wish, and it wouldn't surprise me at all if we get some useful suggestions. However, there are some things that *I* shouldn't be discussing on the forums (such as anti-spam or anti-cheating techniques, as well as others I won't mention at all) and that limits the extent to which I can participate. So it's going to be a rather one-sided conversation, I'm afraid.
But please continue. We are listening. I just didn't want anyone to get the wrong impression and think this was a topic where we were actively looking to change something. It's not.
At this point, I'd like to comment on the title of this thread:
"DC on sieving using CPU is Only waste of TIME."
I disagree with that. It saves time. What it wastes is computing resources.
By "time", I mean it saves our time. Double checking provides a relatively easy way to deal with the problems I listed, which otherwise will occupy the admins' time. Call it laziness if you will, but from my perspective I'd rather have Jim and I working on enhancements to PrimeGrid rather than constantly putting out fires related to the sieves.
From my perspective, it's a no brainer. (Go ahead and make the joke about my not having a brain. I set you up for it, after all.) We can throw a switch and not have to worry so much about cheating and bad software, and then work on things such as improving the reliability of the server (Notice the 100% up time on the status page? Ever see that before?), improving the credit calculations, improving the estimated time calculations, bringing new projects online, and so forth.
Or we could spend a lot more time trying to build a reliable system to make sieving more robust without double checking. Please note that we HAVE already done exactly that. We're not big fans of double checking sieves either. It hasn't worked out as well as we hoped, and turning on DC was a last resort in an effort that's been going on for the last 18 months. When I say this isn't likely to change, that's because this isn't the beginning of this discussion. For us, it's been something we've been working on for a year and a half, and DC is all that's left to try.
For me, it's an easy choice, although I recognize there's people in both the user community and within PrimeGrid that disagree. We've already spent enough time on the sieve problems. Going forward, I'd rather spend time on things that will be more important. Computing time is something we have a lot of. We're very limited when it comes to human time.
Question:
(And slight change of topic.)
Clearly, if we DC a sieve, we're doing half as much sieving, in that after X number of tasks, P has advanced by half as much. That's only part of the equation, however.
I'm not an expert on the mechanics and mathematics of sieving, so I'm asking for some "crowd" help here.
In all the sieves I've looked at, as P increases, the density of factors drops off dramatically. On the GFN sieve, for example, it's very predictable: Every time you double P, you get half as many factors for the same amount of sieving. So as you get deeper and deeper into the sieve, you find fewer and fewer factors. At the beginning of the sieve, you'll find millions of factors in a task; by the end of the sieve most tasks will find no factors at all.
With DC, we'll only sieve to half the P depth, but that doesn't mean we'll only find half the factors. The part of the sieve we'll be skipping has far fewer factors than the part we'll be doing.
Here's my question. Actually, questions.
1) Do all sieves drop off the way GFN does?
2) Anyone want to try to come up with a formula for how many fewer factors we'll find because of DC? It's clearly less than 50%, and my gut feeling is it's a lot less than 50%. The question is how much less.
____________
My lucky number is 75898^524288+1
axn (Volunteer developer)
Joined: 29 Dec 07 Posts: 285 ID: 16874 Credit: 28,027,106 RAC: 0
1) Do all sieves drop off the way GFN does?
2) Anyone want to try to come up with a formula for how many fewer factors we'll find because of DC? It's clearly less than 50%, and my gut feeling is it's a lot less than 50%. The question is how much less.
1. Yes-ish.
2. I'll leave out the calculation for GFN sieve, but for the others, use this simple formula. Number of candidates sieved out by sieving from p1 to p2 is N*(1-log(p1)/log(p2)), where N is the number of candidates we have after sieving to p1.
If we take p1 = 1P (10^15) and p2 = 2P, that amounts to N*0.019, or about 1.9% of candidates sieved out. Feel free to plug in different values for p1 and p2.
Incidentally, due to the existence of this formula, we can pretty accurately estimate how many factors we're supposed to find for specific (large-ish) sieve ranges.
EDIT: The formula is not applicable to SGS, where we do a triple or quad sieve. It is not part of this discussion, but I just thought everyone should know.
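For anyone who wants to plug in numbers, here is a small worked example of axn's formula in Python (the candidate count N is made up purely for illustration):

from math import log

def fraction_removed(p1, p2):
    # Fraction of the candidates surviving at p1 that gain a factor
    # between p1 and p2, per the formula N*(1 - log(p1)/log(p2)).
    return 1 - log(p1) / log(p2)

N = 1_000_000  # hypothetical candidates remaining after sieving to p1

for p1, p2 in [(1e15, 2e15), (2e15, 4e15), (1e16, 2e16)]:
    f = fraction_removed(p1, p2)
    print(f"p1={p1:.0e} p2={p2:.0e}: {f:.2%} removed (~{N * f:,.0f} of {N:,})")

# Approximate output -- note how the fraction shrinks as p1 and p2 grow,
# matching the observation that factor density falls off at deeper p:
#   p1=1e+15 p2=2e+15: 1.97% removed (~19,674 of 1,000,000)
#   p1=2e+15 p2=4e+15: 1.93% removed (~19,294 of 1,000,000)
#   p1=1e+16 p2=2e+16: 1.85% removed (~18,467 of 1,000,000)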
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
1) Do all sieves drop off the way GFN does?
2) Anyone want to try to come up with a formula for how many fewer factors we'll find because of DC? It's clearly less than 50%, and my gut feeling is it's a lot less than 50%. The question is how much less.
1. Yes-ish.
2. I'll leave out the calculation for GFN sieve, but for the others, use this simple formula. Number of candidates sieved out by sieving from p1 to p2 is N*(1-log(p1)/log(p2)), where N is the number of candidates we have after sieving to p1.
If we take p1 = 1P (10^15) and p2 = 2P, that amounts to N*0.019, or about 1.9% of candidates sieved out. Feel free to plug in different values for p1 and p2.
Incidentally, due to the existence of this formula, we can pretty accurately estimate how many factors we're supposed to find for specific (large-ish) sieve ranges.
EDIT: The formula is not applicable to SGS, where we do a triple or quad sieve. It is not part of this discussion, but I just thought everyone should know.
Thanks, axn.
That's lower than I thought. Cutting the sieve duration in half reduces the number of factors found by less than 2 percent. Playing with the numbers a bit, it looks like the percentage goes down as p1 and p2 get higher.
That's very, very helpful. Thanks!
____________
My lucky number is 75898^524288+1
1) Do all sieves drop off the way GFN does?
2) Anyone want to try to come up with a formula for how many fewer factors we'll find because of DC? It's clearly less than 50%, and my gut feeling is it's a lot less than 50%. The question is how much less.
1. Yes-ish.
2. I'll leave out the calculation for GFN sieve, but for the others, use this simple formula. Number of candidates sieved out by sieving from p1 to p2 is N*(1-log(p1)/log(p2)), where N is the number of candidates we have after sieving to p1.
If we take p1 = 1P (10^15) and p2 = 2P, that amounts to N*0.019, or about 1.9% of candidates sieved out. Feel free to plug in different values for p1 and p2.
Incidentally, due to the existence of this formula, we can pretty accurately estimate how many factors we're supposed to find for specific (large-ish) sieve ranges.
EDIT: The formula is not applicable to SGS, where we do a triple or quad sieve. It is not part of this discussion, but I just thought everyone should know.
Thanks, axn.
That's lower than I thought. Cutting the sieve duration in half reduces the number of factors found by less than 2 percent. Playing with the numbers a bit, it looks like the percentage goes down as p1 and p2 get higher.
That's very, very helpful. Thanks!
That is not changing anything.
If sieving to 1P takes a year, it will be 2 years with DC.
Lennart
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
Today:
That is not changing anything.
If sieving to 1P takes a year, it will be 2 years with DC.
Lennart
Of course it does. You said it yourself that turning on DC will change the "optimal sieving point".
Last week:
[8/3/2014 8:31:56 AM] Lennart Vogel: if we dc siieve wu we have to lower our calculated optimal sieve deep /2
[8/3/2014 8:32:01 AM] Lennart Vogel: or more
[8/3/2014 8:32:21 AM] Lennart Vogel: that means we will stop sieving on a much lower P
You had it right the first time.
Going by what you said last week, and taking axn's calculations into account, the implication of double checking is that we'll find a bit less than 2% fewer factors.
____________
My lucky number is 75898^524288+1
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
Joined: 1 Mar 14 Posts: 1033 ID: 301928 Credit: 543,624,271 RAC: 6,563
At this point, I'd like to comment on the title of this thread:
"DC on sieving using CPU is Only waste of TIME."
I disagree with that. It saves time. What it wastes is computing resources.
I'd rather rephrase the title.
Complete DC on sieving using CPU is Only waste of TIME.
I agree that the necessity of DC is beyond question. I do have some experience with distributed computing. I know how much damage even a single overheated host or buggy application can do.
But considering the fact that sieving is not a conjecture and a few missed results are allowed, we could discuss how we could improve our computing "efficiency" from the current 50% to 80-90% while still detecting bad hosts and cheaters quickly, before they do much damage. This, and only this, could be the point of discussion.
Currently the only proposed way is adaptive replication with strict settings ("hard to gain, easy to lose"). Any other ideas?
Since we have a way to calculate the density of factors at a specific P (or we can look at recent figures), we can expect a certain number of factors per given number of tasks. Say we expect one factor to be returned per 100 tasks; then if a host doesn't return a factor after 300 WUs, we DC those tasks to see if there's a problem.
Also, at some depth of a sieve factors become so scarce that there's no point in DC; we can stop DCing and continue sieving, only double-checking suspicious hosts occasionally.
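As a rough sanity check on the one-factor-per-100-tasks trigger above (the numbers are purely illustrative): a healthy host would hit a 300-task factorless streak about 5% of the time, so such a rule would mostly flag genuinely broken hosts while still producing the occasional false alarm.

def prob_no_factor_streak(q, k):
    # Probability that a healthy host returns k tasks in a row with no
    # factor, given an assumed per-task factor probability q.
    return (1 - q) ** k

q = 0.01  # assumption: one factor per 100 tasks on average
for k in (100, 300, 1000):
    print(f"{k} factorless tasks in a row: {prob_no_factor_streak(q, k):.1%}")

# Approximate output:
#   100 factorless tasks in a row: 36.6%
#   300 factorless tasks in a row: 4.9%
#   1000 factorless tasks in a row: 0.0%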
I was also thinking about working out a point of optimal sieve depth. Since we know all factors returned for a range, is it possible to estimate time saved by sieving? Say we choose an i7-4770 as a default CPU, estimate the time it would take to LLR all the candidates removed by the factors found in a range, and compare that with the time it would take to sieve the range.
For instance, we sieve TRP n between 10M and 50M, but not all factors are found at higher n. We could get most factors at lower n values, and that could make a difference.
I think this way we could better estimate the point at which to stop sieving.
I think that GPU sieving depth is a little bit more difficult to understand. First, it's difficult to compare GPU to CPU time spent. We could compare watts used?
Second, if at some point we decide that we should stop the GPU sieve and don't want to sieve a higher n-range, then we just lose all our GPU power to other projects. Not all of our members will move their video cards to GFN. I'm saying that GPU power is "free" for PrimeGrid, and as long as we have a well-paying [cobblestones] sieve like PPS, we can freely use GPUs. But once we stop sieving, we will lose them. So we shouldn't stop sieving at any point, as it's free.
These are my 2 cents. Thanks.
Honza (Volunteer moderator, Volunteer tester, Project scientist)
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,143,342,172 RAC: 2,285,158
Since we have a way to calculate the density of factors at a specific P (or we can look at recent figures), we can expect a certain number of factors per given number of tasks. Say we expect one factor to be returned per 100 tasks; then if a host doesn't return a factor after 300 WUs, we DC those tasks to see if there's a problem.
Well, there are hosts that never return more than a couple of WUs per subproject.
Hosts get upgraded, migrated to another subproject, or removed; have no subproject preference and hence do all of them; run only a small percentage of their time on PG (or are turned on only once in a while); and there are many other reasons.
Put another way, it may be impractical to wait until a per-host recount of factors found is possible (and to store all that data in the DB alongside).
Also, at some depth of a sieve factors become so scarce that there's no point in DC; we can stop DCing and continue sieving, only double-checking suspicious hosts occasionally.
What counts as a suspicious host if we don't DC and the approach in the first paragraph can't be used?
I was also thinking about working out a point of optimal sieve depth. Since we know all factors returned for a range, is it possible to estimate time saved by sieving?
I believe it is already done this way.
I think that GPU sieving depth is a little bit more difficult to understand. First, it's difficult to compare GPU to CPU time spent. We could compare watts used?
I believe this has been taken into account in the case of GFN recently - GPU sieving arrived there relatively recently (it had previously been done only on CPUs), and the optimal sieve depth was re-estimated and changed.
Second, if at some point we decide that we should stop the GPU sieve and don't want to sieve a higher n-range, then we just lose all our GPU power to other projects. Not all of our members will move their video cards to GFN. I'm saying that GPU power is "free" for PrimeGrid, and as long as we have a well-paying [cobblestones] sieve like PPS, we can freely use GPUs. But once we stop sieving, we will lose them. So we shouldn't stop sieving at any point, as it's free.
Well, this one looks to me like an argument in support of DC.
I'm more worried about losing GPUs due to errors in an app (i.e. WWWW) than about running out of work (reaching optimal sieve depth).
In the long term, PG will have more LLR to offer compared to sieving.
Yes, we may lose some GPUs (those looking for cobblestones) but we will always be hungry for CPUs...and megaprimes!
____________
My stats
Double check everything, without exception.
--G
Double check everything, without exception.
--G
I think you need to explain why you think that way.
I also think we need to ask ourselves "why are we sieving?"
It seems that many don't know the purpose of sieving.
What do we gain by sieving?
Lennart
I find the reasoning made earlier in this thread regarding detecting errors in the software to be compelling. Sure, if we only miss 1% of the factors and therefore needlessly LLR them, OK, I can see where a sieving double-check in that case is overkill and a waste. But what if we're missing 50% of the factors? Or 90%? We don't know what that ratio is without a reasonable mechanism to notice it.
Do we put untested software out there? No, hopefully not. I've tested here myself quite a bit. But software is written by humans, and tested by humans, and humans make mistakes... no matter how much "beta" testing is done. Maybe the double-check is like a "peer review" prior to publication in academic journals (that's far from a perfect analogy, I know).
As for the "cheating" aspect, well, detecting cheating is good, just on a moral basis, and that makes me feel good and want to continue to contribute. As for *why* someone would want to cheat, I believe I understand the reasons, but they are better discussed by a psychologist than a computer programmer, and that's off-topic, I guess.
Perhaps some guidance can be provided by the number of "invalid" tasks that come out of the current TRP Sieve challenge.
Cheers,
--Gary
I find the reasoning made earlier in this thread regarding detecting errors in the software to be compelling. Sure, if we only miss 1% of the factors and therefore needlessly LLR them, OK, I can see where a sieving double-check in that case is overkill and a waste. But what if we're missing 50% of the factors? Or 90%? We don't know what that ratio is without a reasonable mechanism to notice it.
Do we put untested software out there? No, hopefully not. I've tested here myself quite a bit. But software is written by humans, and tested by humans, and humans make mistakes... no matter how much "beta" testing is done. Maybe the double-check is like a "peer review" prior to publication in academic journals (that's far from a perfect analogy, I know).
As for the "cheating" aspect, well, detecting cheating is good, just on a moral basis, and that makes me feel good and want to continue to contribute. As for *why* someone would want to cheat, I believe I understand the reasons, but they are better discussed by a psychologist than a computer programmer, and that's off-topic, I guess.
Perhaps some guidance can be provided by the number of "invalid" tasks that come out of the current TRP Sieve challenge.
Cheers,
--Gary
Valid 527333
Invalid 2
Lennart
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
Perhaps some guidance can be provided by the number of "invalid" tasks that come out of the current TRP Sieve challenge.
Cheers,
--Gary
Here's the full statistics from a few hours ago, with the relevant numbers highlighted:
Challenge: Perseid Shower
(As of 2014-08-14 06:31:57 UTC)
671864 tasks have been sent out. [CPU/GPU/anonymous_platform: 671864 (100%) / 0 (0%) / 0 (0%)]
Of those tasks that have been sent out:
5347 (1%) came back with some kind of an error. [5354 (1%) / 0 (0%) / 0 (0%)]
444985 (66%) have returned a successful result. [446178 (66%) / 0 (0%) / 0 (0%)]
222265 (33%) are still in progress. [222447 (33%) / 0 (0%) / 0 (0%)]
Of the tasks that have been returned successfully:
128778 (29%) are pending validation. [129070 (29%) / 0 (0%) / 0 (0%)]
316204 (71%) have been successfully validated. [317105 (71%) / 0 (0%) / 0 (0%)]
1 (0%) were invalid. [1 (0%) / 0 (0%) / 0 (0%)]
2 (0%) are inconclusive. [2 (0%) / 0 (0%) / 0 (0%)]
The numbers need a little bit of interpretation because of the nature of this particular sieve.
From an error detection standpoint, this very old program isn't well designed. When no factors are found, the program produces no output at all. It's therefore impossible to tell if "no factors" is actually the correct result, or the program malfunctioned in some way. When you're relatively close to the end of the sieving process, as we are now, "no factors" is actually the correct result in most of the tasks, so it's very hard to distinguish a correct result from an erroneous result. (We're considering also looking at the stderr output, but doing so has problems of its own.)
There are, in fact, more errors than are shown there, but it's difficult to know exactly how many. We use a too-short run time and CPU time to determine that a task is faulty, but that only catches some of them, because the threshold needs to be short enough that the fastest computers don't have good results invalidated. This permits slower computers that fail later on to pass validation.
Because of the way we process the results, a lot of the "returned with some kind of error" tasks were actually returned as "successful" by the BOINC client and were invalidated by some of the error checking that Jim's added to the validator. They used to get counted as valid results.
It's not a great situation, to be honest, but it's the best we have right now.
Even with DC on, a lot of bad results will still pass and get credit. It's impossible to know how many of the validated tasks were actually faulty. It's an "even a broken clock still shows the correct time twice a day" sort of thing. But at least there's two chances to find every factor.
So there are the raw statistics, but unfortunately the true answer to your question is really unknown.
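To illustrate why such a runtime floor is leaky (the thresholds and function below are invented for this sketch, not PrimeGrid's actual validator logic):

MIN_RUN_SECONDS = 600   # must sit below the *fastest* legitimate host's time
MIN_CPU_SECONDS = 550

def looks_faulty(run_time, cpu_time):
    # Flag a result as faulty only if it finished implausibly fast.
    return run_time < MIN_RUN_SECONDS or cpu_time < MIN_CPU_SECONDS

# A host that crashed almost immediately is caught:
print(looks_faulty(run_time=45, cpu_time=40))      # True
# But a slow host that failed halfway through a four-hour task still ran
# far longer than the floor, so its bad result slips past this check:
print(looks_faulty(run_time=7200, cpu_time=7100))  # False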
____________
My lucky number is 75898^524288+1
Do ESP and PPS sieves have any output other than the factors file?
____________
676754^262144+1 is prime
Michael Goetz (Volunteer moderator, Project administrator)
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,251,659 RAC: 825,321
Do ESP and PPS sieves have any output other than the factors file?
ESP is identical; it's the same software.
If I remember correctly, the PPS sieve was modified to produce an "I'm done" message of some sort earlier this year.
____________
My lucky number is 75898^524288+1
Ken_g6 (Volunteer developer)
Joined: 4 Jul 06 Posts: 940 ID: 3110 Credit: 261,913,874 RAC: 11,928
The PPS Sieve was modified some years ago to say "No factors" when no factors are found. It's just that one version wasn't upgraded at PrimeGrid for a while.
____________