Join PrimeGrid
Returning Participants
Community
Leader Boards
Results
Other
drummerslowrise

Message boards :
Sieving :
Numerical equation for GFN sieve speed at specific depth
Author 
Message 

I had lots of fun procrastinating yesterday, since I decided to analyse the sieve statistics which are available on this page: http://www.primegrid.com/sieving/gfn/
Especially, I wanted to find out what is the "removed from sieve" rate at specific depths. This "removed from sieve" number tells how much candidates under 100M were removed at the sieved range (Under 400M for GFN15GFN16). I focused on only on GFN17GFN22
This is what we see when we plot the rates logarithmically:
X axis is the sieve depth (in P, Peta), and Y axis is the removed from sieve value.
You can immediately see that the rate seems to be a logartichmic power law, most clearly between 200P and 20000P. Under this the ranges are relatively large, causing a flat interval because we only get one value for the range sieved. After 20000P there are a lot of small ranges sieved combined with inaccuracy caused by the sieverate only showing the first decimal when going to smaller rates.
So, since the rates seem to be a straight line (i.e. a power law), we can fit a line through the values between 200P and 20000P. From this fit we get an equation for the sieverate for all the 6 project, which are as follows:
GFN17: 10^(5.699895611539641  1.047809735406364 * log10(P))
GFN18: 10^(5.770189918802291  1.047054122505504 * log10(P))
GFN19: 10^(5.773619230259421  1.047460916471634 * log10(P))
GFN20: 10^(5.822252032580029  1.048505447896963 * log10(P))
GFN21: 10^(5.857281943722451  1.045637924155947 * log10(P))
GFN22: 10^(5.893575606716400  1.046124290520683 * log10(P))
So, you can use these numerical equations to estimate the "removed from sieve" rates for a specific depth P.
Using the equation, we can predict the current rates of "removed from sieve", which are as follows (latest real rate of one large sieve interval in parentheses):
GFN17: 1.607641841985315 (1.6)
GFN18: 2.157718107008705 (2.2)
GFN19: 1.307389966120157 (1.3)
GFN20: 2.321153593850324 (2.4)
GFN21: 1.110968091110168 (1.2)
GFN22: 1.024786816287452 (1.0)
This result tells how many are removed from values under b<100M for each P (=Remove density). But if we assume that the removals are evenly distributed (which is roughly the case), you can just multiple the value by the fraction you are interested in. So if you would be interested in ranges 01M or 50M51M, you would multiply this result by 0.01.
You can also predict how many removals sieving a specific range would net. Taking 150050P to 200200P of GFN19 as an example, the equation sums up to 96585 removals in this range. The real number of removals was 97266, so there was a 0.7% difference between the equation and the reality. Similarly we can predict that sieving GFN17 from 175000P to 200000P would net about 37449 removals.
This equation works quite well for approximating rates from 1P onwards, but it doesnt tell how much removals are done at the start of the sieve, since the equation doesnt work for values between 0 and 1P (log10(0)= Infinity). But it does seem to work after the first sieve range in the stats:
GFN17 predicted removals between 75P to 175000P: 2644000
GFN17 real number of removals 75P to 175000P: 2635698 (actually there's a few ranges still sieving)
GFN20 predicted removals between 50P to 160000P: 3674325
GFN20 real number of removals 50P to 160000P: 3674668 (Thats only a 0,009% error)
I dunno if there already is a some analytical equation for the stuff I've calculated (maybe related to the value of π(x)/(x / log x)? See https://en.wikipedia.org/wiki/Prime_number_theorem ), but its always nice to have numerical verification :)
This method could also work for other types of sieve, if similar stats are available somewhere.
All feedback, corrections and ideas are very welcome :)  

JimBHonorary cruncher Send message
Joined: 4 Aug 11 Posts: 918 ID: 107307 Credit: 977,945,376 RAC: 0

After 20000P there are a lot of small ranges sieved combined with inaccuracy caused by the sieverate only showing the first decimal when going to smaller rates.
You've got the exact size of the sieving range and the exact count of factors found and removed. I think you can figure out the rate yourself to as many digits as you'd like.  


After 20000P there are a lot of small ranges sieved combined with inaccuracy caused by the sieverate only showing the first decimal when going to smaller rates.
You've got the exact size of the sieving range and the exact count of factors found and removed. I think you can figure out the rate yourself to as many digits as you'd like.
Actually I mispoke. I did not use "removed density" in any of my calculations, except when comparing to the equation. Just a brain fart on my part.
The real reason for the higher variance is most likely that when the sieverate goes down, the variance in small sieve ranges (only 1P for example) will be very large. Here's a prime example:
Here is also the image from the first post, since the link seems to have stopped working:
 

Yves GallotVolunteer developer Project scientist Send message
Joined: 19 Aug 12 Posts: 803 ID: 164101 Credit: 305,701,038 RAC: 5,414

You can find the formula for the expected number of candidates at:
http://www.primegrid.com/forum_thread.php?id=4417&nowrap=true#57237
We have #candidates = e^gamma * C_n * B_max / log(p_max)
where
C_15 = 5.80266
C_16 = 11.1957
C_17 = 11.0043
C_18 = 13.0078
C_19 = 13.0724
C_20 = 14.5167
C_21 = 16.0846
C_22 = 17.4099
 


You can find the formula for the expected number of candidates at:
http://www.primegrid.com/forum_thread.php?id=4417&nowrap=true#57237
We have #candidates = e^gamma * C_n * B_max / log(p_max)
where
C_15 = 5.80266
C_16 = 11.1957
C_17 = 11.0043
C_18 = 13.0078
C_19 = 13.0724
C_20 = 14.5167
C_21 = 16.0846
C_22 = 17.4099
Ooh, nice.
So for C_20, B_max=1E8 and we obtain
#candidates = e^(0.5772156649) * (14.5167*10^8) / log(p_max)
And in log10:
#candidates = 10^8.5490 /log10(p_max)
For GFN20 50P to 160000P this formula gives a difference of
10^8.5490 /log10(50*10^15)10^8.5490 /log10(160000*10^15)
= 3677706
It is of course also possible obtain the sieverate from this or vice versa. Nice to see that my shenanigans are working.
PS: it seems my image links from dropbox keep dying...  

robishVolunteer moderator Volunteer tester
Send message
Joined: 7 Jan 12 Posts: 2197 ID: 126266 Credit: 7,324,010,506 RAC: 3,147,376

You can find the formula for the expected number of candidates at:
http://www.primegrid.com/forum_thread.php?id=4417&nowrap=true#57237
We have #candidates = e^gamma * C_n * B_max / log(p_max)
where
C_15 = 5.80266
C_16 = 11.1957
C_17 = 11.0043
C_18 = 13.0078
C_19 = 13.0724
C_20 = 14.5167
C_21 = 16.0846
C_22 = 17.4099
Ooh, nice.
So for C_20, B_max=1E8 and we obtain
#candidates = e^(0.5772156649) * (14.5167*10^8) / log(p_max)
And in log10:
#candidates = 10^8.5490 /log10(p_max)
For GFN20 50P to 160000P this formula gives a difference of
10^8.5490 /log10(50*10^15)10^8.5490 /log10(160000*10^15)
= 3677706
It is of course also possible obtain the sieverate from this or vice versa. Nice to see that my shenanigans are working.
PS: it seems my image links from dropbox keep dying...
Nice. Respect. I'll take your word for it. :) :) :)
Edit: just wondering just how many are running that number thru LLR right now!! :D
____________
My lucky numbers 1059094^{1048576}+1 and 224584605939537911+81292139*23#*n for n=0..26  

dukebgVolunteer tester
Send message
Joined: 21 Nov 17 Posts: 242 ID: 950482 Credit: 23,670,125 RAC: 0

PS: it seems my image links from dropbox keep dying...
Yeah, dropbox was never intended for hotlinking like that. You should probably throw the images on imgur. Or, alternatively funkyimg.com also provides image hosting.  

Message boards :
Sieving :
Numerical equation for GFN sieve speed at specific depth 