I had lots of fun procrastinating yesterday, since I decided to analyse the sieve statistics which are available on this page: http://www.primegrid.com/sieving/gfn/
Especially, I wanted to find out what is the "removed from sieve" rate at specific depths. This "removed from sieve" number tells how much candidates under 100M were removed at the sieved range (Under 400M for GFN15-GFN16). I focused on only on GFN17-GFN22
This is what we see when we plot the rates logarithmically:
X axis is the sieve depth (in P, Peta), and Y axis is the removed from sieve value.
You can immediately see that the rate seems to be a logartichmic power law, most clearly between 200P and 20000P. Under this the ranges are relatively large, causing a flat interval because we only get one value for the range sieved. After 20000P there are a lot of small ranges sieved combined with inaccuracy caused by the sieverate only showing the first decimal when going to smaller rates.
So, since the rates seem to be a straight line (i.e. a power law), we can fit a line through the values between 200P and 20000P. From this fit we get an equation for the sieverate for all the 6 project, which are as follows:
GFN17: 10^(5.699895611539641 - 1.047809735406364 * log10(P))
GFN18: 10^(5.770189918802291 - 1.047054122505504 * log10(P))
GFN19: 10^(5.773619230259421 - 1.047460916471634 * log10(P))
GFN20: 10^(5.822252032580029 - 1.048505447896963 * log10(P))
GFN21: 10^(5.857281943722451 - 1.045637924155947 * log10(P))
GFN22: 10^(5.893575606716400 - 1.046124290520683 * log10(P))
So, you can use these numerical equations to estimate the "removed from sieve" rates for a specific depth P.
Using the equation, we can predict the current rates of "removed from sieve", which are as follows (latest real rate of one large sieve interval in parentheses):
GFN17: 1.607641841985315 (1.6)
GFN18: 2.157718107008705 (2.2)
GFN19: 1.307389966120157 (1.3)
GFN20: 2.321153593850324 (2.4)
GFN21: 1.110968091110168 (1.2)
GFN22: 1.024786816287452 (1.0)
This result tells how many are removed from values under b<100M for each P (=Remove density). But if we assume that the removals are evenly distributed (which is roughly the case), you can just multiple the value by the fraction you are interested in. So if you would be interested in ranges 0-1M or 50M-51M, you would multiply this result by 0.01.
You can also predict how many removals sieving a specific range would net. Taking 150050P to 200200P of GFN19 as an example, the equation sums up to 96585 removals in this range. The real number of removals was 97266, so there was a 0.7% difference between the equation and the reality. Similarly we can predict that sieving GFN17 from 175000P to 200000P would net about 37449 removals.
This equation works quite well for approximating rates from 1P onwards, but it doesnt tell how much removals are done at the start of the sieve, since the equation doesnt work for values between 0 and 1P (log10(0)= -Infinity). But it does seem to work after the first sieve range in the stats:
GFN17 predicted removals between 75P to 175000P: 2644000
GFN17 real number of removals 75P to 175000P: 2635698 (actually there's a few ranges still sieving)
GFN20 predicted removals between 50P to 160000P: 3674325
GFN20 real number of removals 50P to 160000P: 3674668 (Thats only a 0,009% error)
I dunno if there already is a some analytical equation for the stuff I've calculated (maybe related to the value of π(x)/(x / log x)? See https://en.wikipedia.org/wiki/Prime_number_theorem ), but its always nice to have numerical verification :)
This method could also work for other types of sieve, if similar stats are available somewhere.
All feedback, corrections and ideas are very welcome :)