Message boards : Sieving : We need your help with GFN Sieving!
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14043 ID: 53948 Credit: 481,332,491 RAC: 513,371
We Need Your Help!
In advance of the Alan Turing Year challenge (June 20-23), we are inviting everyone to do a little bit of CUDA sieving to help prepare the GFN search for its next steps.
While the Alan Turing Year challenge itself is going to use the short (N=524288) WUs, we need more sieving done on the long (N=4194304) WUs. Fortunately, AXN just wrote a CUDA based sieve that is ridiculously fast. It's so fast that with a little help we should be able to complete the sieve -- to DOUBLE the original goal -- before the challenge starts.
So if you're currently using your GPU(s) to crunch GFN (or anything else) -- especially if you're currently crunching short N=524288 WUs -- please consider switching over to GFN sieving until the challenge starts. Let's save those short WUs for the challenge.
The new sieve program is extremely easy to use, so if you've never done manual sieving before and/or want to get some easy PSA credits, this is the way to do it.
For more information please see the Alan Turing Year Challenge and the Generalized Fermat Number Search.
About the Sieving
Manual PSA credits are available for this sieve.
Within a month, we're going to use up all of the N=4194304 work that's been generated, i.e., all candidates for b < 5000. New work with b starting at 5000 will be generated before that happens.
Work is created from the sieve files. A sieve file is the list of all candidates that have not yet been eliminated by the sieving process. The numbers in the sieve file, therefore, are the numbers that we end up testing with GeneferCUDA and GenefX64. With the new fast CUDA sieve program, we can eliminate candidates from the sieve file over 100 times faster than we can test them with GeneferCUDA, so it's clearly beneficial to do as much sieving as possible.
Currently, the sieve is up to 6000P. (After half a year of sieving on CPUs, it had gotten to about 4000P. That was 7 days ago, when we started using the new CUDA sieve. Now it's at 6000P, after one week.) We need to get to 18446P before generating new work. Given the speed of this sieve, that seems easily attainable with a few more people sieving.
Looking beyond that, there's also sieving to be done at N=524288, N=1048576, and N=2097152. Depending on how many WUs get crunched during the challenge, the next sieving priority after N=4194304 will probably be either N=524288 or N=1048576.
As always, of course, you are free to use your computing resources as you see fit and crunch or sieve whatever you wish.
The sieving itself is coordinated over on the Prime Search Forums. Instructions can be located there, as can lots of help if you need it. You can also ask questions here, of course.
NOTE: By default, the sieve program uses b7 (block size), which gives decent speed and almost no screen lag. Most people prefer to run at b11, which is significantly faster, but does have noticeable lag. It's easy and fast to switch back to b7 when you're doing something on the computer where the lag is intrusive, such as watching video.
____________
My lucky number is 75898^524288+1
Nice to see more GPU apps becoming available :)
Just started cranking out factors on my GTX 570. For comparison, the speedometer goes back and forth between 56P/day and 62P/day.
Resourcewise it uses about 80% GPU without causing lag. CPU use is negligible.
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Michael Goetz Volunteer moderator Project administrator
Nice to see more GPU apps becoming available :)
Just started cranking out factors on my GTX 570. For comparison, the speedometer goes back and forth between 56P/day and 62P/day.
Resourcewise it uses about 80% GPU without causing lag. CPU use is negligible.
What I do (and I think this is also done by most, if not all of the others) is to use the b11 switch to crank up the block size as high as it will go.
This DOES cause some lag -- tolerable for typing, horrible for watching video -- but it significantly increases the speed of the sieve.
Since this program starts and stops so easily, it's easy enough to stop it and restart without the b11 switch if I want to do something on the computer, then restart with b11 again when I'm done.
For a dedicated cruncher, use b11 all the time.
____________
My lucky number is 75898^524288+1
Thanks for the tip. I'll kick it into b11/overdrive overnight :)
*bleep* That's one helluva difference by the way... B11 gives me ~87P/day.
B10 still leaves video playable and gives about 83P/day.
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Michael Goetz Volunteer moderator Project administrator
*bleep* That's one helluva difference by the way... B11 gives me ~87P/day.
Yeah. :)
That's why everyone likes to run at b11, despite the lag. It's easy enough to switch to b7 when watching video.
____________
My lucky number is 75898^524288+1
Is this particular page down for maintenance or is it just me?
http://uwin.mine.nu/sieves/gfn/instructions/generalized_fermat_number_prime_gfnsvcuda.htm
Michael Goetz Volunteer moderator Project administrator
Is this particular page down for maintenance or is it just me?
http://uwin.mine.nu/sieves/gfn/instructions/generalized_fermat_number_prime_gfnsvcuda.htm
Looks ok to me.
EDIT: It's possible the server is down and I was seeing a cached copy of the page.
It's probably going to be down for another few hours given that it's the middle of the night there.
____________
My lucky number is 75898^524288+1
Immediate crash on Linux:
gary$ ./GFNSvCUDA-0_3b-linux64 22 6400 6405 B11
GFNSvCUDA v0.3b (c) 2012 Anand Nair (anand.s.nair AT gmail)
GFN Sieve for k^4194304+1 [k == 2 to 100000000]
Using factor file 'f22_6400P_6405P.txt'
Using checkpoint file 'c22_6400P_6405P.txt'
Floating point exception
gary$
Ubuntu 11.04, nvidia driver 295.53. Tried using both the libcudart.so that was in the .zip and the one I already had on my box. Same result. Tried n=20 and n=22, with different p ranges. No luck; identical failure.
Back to wwwwcl. :-(
--Gary
axn Volunteer developer
Joined: 29 Dec 07 Posts: 285 ID: 16874 Credit: 28,027,106 RAC: 0
Immediate crash on Linux:
gary$ ./GFNSvCUDA-0_3b-linux64 22 6400 6405 B11
GFNSvCUDA v0.3b (c) 2012 Anand Nair (anand.s.nair AT gmail)
GFN Sieve for k^4194304+1 [k == 2 to 100000000]
Using factor file 'f22_6400P_6405P.txt'
Using checkpoint file 'c22_6400P_6405P.txt'
Floating point exception
gary$
Ubuntu 11.04, nvidia driver 295.53. Tried using both the libcudart.so that was in the .zip and the one I already had on my box. Same result. Tried n=20 and n=22, with different p ranges. No luck; identical failure.
The linux client is completely untested. Only the windows client is operational currently.
The linux client is completely untested. Only the windows client is operational currently.
If you need someone to do test builds / runs on Linux, let me know.
--Gary
So do you have to manually sieve the long tasks or can you use the BOINC client?
I have a Quadro 600 which should be compatible, although it is on a 64 bit OS.
If it is possible to use the BOINC client what settings would I put on it in my preferences tab?
Michael Goetz Volunteer moderator Project administrator
So do you have to manually sieve the long tasks or can you use the BOINC client?
I have a Quadro 600 which should be compatible, although it is on a 64 bit OS.
If it is possible to use the BOINC client what settings would I put on it in my preferences tab?
This is a purely manual sieve and you don't use BOINC. You post on the PST boards what range you're going to sieve, you download the command-line sieve program, and run the sieve program in a command window. When it's done, you upload the output file.
It's actually a lot easier than it sounds. :)
Quadro 600 should be fine for sieving. The requirements for the sieve are lower than for GeneferCUDA, so if you can run the GFN Work Units on BOINC, you can run the sieve.
As long as you are running Windows, anyway. There's a problem with the Linux client. That will get fixed, but right now the Windows client is the only one that works.
For more info, go to the Prime Search Team forums.
____________
My lucky number is 75898^524288+1
first try with manual sieving.
reserved a 100P range. running smooth with ~85P/day at my non oc gtx580.
could be a little tricky to keep the gpu busy all the time ...
How do you guys keep your gpus loaded?
if i am home and my current range is finished at maybe 2 o'clock in the morning my gpu is idle till the next morning.
Of course i could reserve a new range and start the new range before i go to bed and finish the last range when im home ...
interested to hear some other experiences.
____________
axn Volunteer developer
first try with manual sieving.
reserved a 100P range. running smooth with ~85P/day at my non oc gtx580.
could be a little tricky to keep the gpu busy all the time ...
How do you guys keep your gpus loaded?
if i am home and my current range is finished at maybe 2 o'clock in the morning my gpu is idle till the next morning.
Of course i could reserve a new range and start the new range before i go to bed and finish the last range when im home ...
Reserve new range(s), create a batch file, and enter the command lines for each of the ranges one after the other. Then run the batch file.
You can always kill the batch file to add new ranges, and remove completed ranges from the batch file.
EDIT: There is no penalty for killing the program and restarting it multiple times. The program checkpoints when you Ctrl-C a run, and restart is instantaneous since there is no sieve file to read/write.
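For illustration only, a batch file along these lines chains several reserved ranges back to back. The executable name and ranges below are placeholders; use the actual binary name from the zip you downloaded and the ranges you actually reserved:
REM run three reserved n=22 ranges one after the other
GFNSvCUDA.exe 22 8200 8300 b11
GFNSvCUDA.exe 22 8300 8400 b11
GFNSvCUDA.exe 22 8400 8500 b11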
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1963 ID: 352 Credit: 6,418,364,641 RAC: 2,723,356
How do you guys keep your gpus loaded?
if i am home and my current range is finished at maybe 2 o'clock in the morning my gpu is idle till the next morning.
You can make two folders and run two instances. One would run with a higher block size, the other one with a smaller value like 7.
As a result you get higher overall GPU usage. The one with the higher block size runs faster (and can finish sooner) while the other one keeps your GPU busy in the meantime.
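For illustration (folder names, executable name, and ranges are placeholders, and each range still has to be reserved separately), the two command windows might look like this:
C:\gfnsieve\main> GFNSvCUDA.exe 22 8100 8200 b11
C:\gfnsieve\filler> GFNSvCUDA.exe 22 8200 8210 b7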
____________
My stats
Michael Goetz Volunteer moderator Project administrator
first try with manual sieving.
reserved a 100P range. running smooth with ~85P/day at my non oc gtx580.
could be a little tricky to keep the gpu busy all the time ...
How do you guys keep your gpus loaded?
Another thing to keep in mind is that you get to choose how big the "work units" are. Once you get a decent idea about how fast you can crunch, there's no reason you can't reserve a larger chunk so you don't have to start crunching a new range as often.
Personally, I think using a batch file is the best method. It keeps the GPU busy and gives you the most flexibility.
____________
My lucky number is 75898^524288+1
Thanks all! I actually got it to work!
I'm not sure what range I should reserve and in the guide there was the suggestion of using "22 4000 4100". What does the 22 stand for?
How do I change it so I get less screenlag when I'm using the computer?
And lastly, is it ok to close the command window and will it resume where I was?
edit:
Also, would it be unwise to leave BOINC running as well while doing the sieving (and then just keep it only crunching on CPUs)?
____________
Honza Volunteer moderator Volunteer tester Project scientist
Please read the reservation thread as suggested; it's all in there.
If using GFNSvCUDA (GPU), please visit the Generalized Fermat Prime Search (GPU Sieving) page for instructions.
Try to leave a single CPU core free for GFN sieving; the others can run BOINC.
____________
My stats
Michael Goetz Volunteer moderator Project administrator
I'm not sure what range I should reserve and in the guide there was the suggestion of using "22 4000 4100". What does the 22 stand for?
The 22 is the 'n' in b^2^n+1. 2^22 is 4194304, the exponent for the World Record work units.
If you look at the bottom of the GFN Sieve Reservation Page, you will see all the current reservations. As of this moment, the last reservation in the N=4194304 (n=22) range is Honza's reservation from 8000P to 8100P. So the next reservation should start at 8100P. If you also wanted to reserve a 100P range to sieve (about 55 hours on a GTX 460), you would reserve 8100P-8200P. Your command line arguments would therefore be 22 8100 8200.
IMPORTANT: Prior to starting to sieve, you MUST post a message in that thread indicating the range that you're sieving. The format is as follows:
N=4194304, 8100P-8200P Michael_Goetz
IMPORTANT: Likewise, don't just look at the main (first) post to see what the latest sieve reservation is. You also need to check below that for users who have posted reservations but that haven't been moved to the main post yet.
Then go and fire up your GPU and have fun sieving!
When the sieve is done, upload the result file ("f22_####_####.txt") to the upload directory. The link for uploading is near the top of the reservation page. Then post another message in that thread like this:
N=4194304, 8100P-8200P Michael_Goetz COMPLETE
The user name is used to find your BOINC account so I can properly assign credit here for the sieving. Note: If this is the first time you've ever done any PSA work, it wouldn't be a terrible idea to drop me a PM with your BOINC ID number (Mine is 53948 -- it's the number that appears in the user information to the left of the message board posts). Usernames in BOINC can be duplicated, but the IDs are unique.
How do I change it so I get less screenlag when I'm using the computer?
That's controlled by the "b#" parameter. For example, "22 8100 8200 b11". "B7" is the default. "B11" is the fastest, but has more screen lag. "B5" has the least screen lag, but is slower.
And lastly, is it ok to close the command window and will it resume where I was?
Yes, although it might be slightly better to hit CTRL-C to stop the sieve and then close the window. Either way, the program should be able to seamlessly resume where it left off. Use the exact same command line when restarting, EXCEPT that you are free to vary the "b#" parameter as you wish. The first three parameters must be the same.
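For example, assuming a placeholder executable name and the same range as above:
GFNSvCUDA.exe 22 8100 8200 b11     (first run: fastest, but laggy)
(hit Ctrl-C when you need the screen back)
GFNSvCUDA.exe 22 8100 8200 b7      (resumes from the checkpoint with less lag)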
Also, would it be unwise to leave BOINC running as well while doing the sieving (and then just keep it only crunching on CPUs)?
That will work, but it's possible they might slow each other down, slightly. Many (most?) crunchers do work on the GPU and CPU simultaneously, so the general answer to your question is no, it would not be unwise. Feel free to run BOINC on the CPU while running the sieve on the GPU.
____________
My lucky number is 75898^524288+1
Awesome! Put in a slightly shorter range, 8100P-8150P to test it out.
Also dropped you a PM.
____________
I have put both of my GPUs to work here for some nice and easy PSA credit :) We'll see how much work they can complete at 60P/day before the challenge... (I will be crunching shorter intervals, so there may be some down time while I am asleep or at work)
____________
Running on a GTX470 @65P/day (B10 setting), which keeps the GPU happy with 88% load.
Is it just me, or are there a lot of factors in the search range?
Something like 50 factors for every 0.5P?
Michael Goetz Volunteer moderator Project administrator
Running on a GTX470 @65P/day (B10 setting), which keeps the GPU happy with 88% load.
Is it just me, or are there a lot of factors in the search range?
Something like 50 factors for every 0.5P?
Yes, and no.
A lot of factors are being found, but remember that the program is sieving a huge range of 2 <= b <= 100,000,000. The actual range we'll be searching with GeneferCUDA is limited to about 0.5% of that, approximately 2 <= b <= 500,000.
Also, because this is a "dat-less" sieve, it's constantly finding factors for numbers that were already removed earlier in the sieving process.
If you take a look on the stats page (http://home.comcast.net/~jimb_primegrid/#R4194304) and look at my sieving of 6050P-6150P, you'll see that the sieve found 18,167 factors. Of those, 8,867 were removing candidates from the sieve, meaning that 9,300 were duplicates of numbers previously sieved out.
Of those 8,867 that were new, 45 were below 500,000 and actually represent numbers that we won't have to crunch with GeneferCUDA. The other 8,822 are above the range of numbers usable by GeneferCUDA. We may eventually search those using some other method.
The bottom line is that of the 18,167 factors found in that sieving run, only 45 actually removed candidates from the BOINC World Record work pool. That might not seem like a lot -- but those 45 numbers represent 1.75 years of crunching on my GPU, or really double that (3.5 years) if you include the double checking. 3.5 years of crunching saved by 2 days of sieving.
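Putting the numbers from that 6050P-6150P range side by side:
18,167 factors reported in total
9,300 of them duplicated candidates already removed earlier (18,167 - 8,867)
8,822 of the new ones have b > 500,000, outside the current GeneferCUDA range (8,867 - 45)
45 removed candidates from the BOINC work pool, roughly 0.25% of all factors reported
Yet those 45 are what turn about 2 days of sieving into roughly 3.5 GPU-years of (double-checked) testing avoided.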
____________
My lucky number is 75898^524288+1
Good explanation on those factors found, thanks Michael!
Michael Goetz Volunteer moderator Project administrator
Good explanation on those factors found, thanks Michael!
You're welcome -- and thanks for asking an excellent question!
____________
My lucky number is 75898^524288+1
Damn, that's a _lot_ of savings there. That's nice :)
Let me preface this question by clearly stating I know next to nothing about code optimization and the likes: I know next to nothing about code optimization and the likes.
Now that you know...
Seeing how many factors are discovered, but a relatively low number (<1%) actually does anything, would it not save a lot of resources if a dat-file was used?
I say this with the following assumptions/reasoning:
1) We're testing a range 2 <= b <= 100,000,000
2) The counted numbers in the "removed from sieve" column in http://home.comcast.net/~jimb_primegrid/#R4194304 are within that range and are proven not prime
3) The sum of said column is 27,462,822 numbers which are now being checked again and again by people
4) My card cranks out about 5300 numbers per second.
5) 27462822 numbers / 5300 numbers per second =~ 5181 seconds / 60 seconds =~ 80 minutes I need to spend less (per P?)
If anyone can point the fallacy in my logic, please do! :)
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
4) My card cranks out about 5300 numbers per second.
5) 27462822 numbers / 5300 numbers per second =~ 5181 seconds / 60 seconds =~ 80 minutes I need to spend less (per P?)
If anyone can point the fallacy in my logic, please do! :)
I know absolutely nothing about code optimization and the likes, but:
My card (a 550ti) can sieve about 40P per day. That's about 36 minutes per P. I assume that my card is slower than yours, which means that by your formula, a dat-file would save me more than 80 minutes per P, which is obviously impossible...
I think that, unlike the cpu app (which does use a dat-file), the gpu app does not, to make sieving simpler and error free.
I do not know if there would be any optimization if a dat-file were used: fewer factors to check, yes, but also higher memory use, which could slow down crunching.
Hehe, good point. Calculating stuff in negative time would be awesome though! :P Got some numbers switched around. Derp!
Yeah I figure memory costs would be involved, but then again modern cards have truck loads of the stuff. 1200MB on mine, of which a shocking 9.4% is now being used.
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Michael Goetz Volunteer moderator Project administrator
Seeing how many factors are discovered, but a relatively low number (<1%) actually does anything, would it not save a lot of resources if a dat-file was used?
Nope. Read this: http://primesearchteam.com/showpost.php?p=5732&postcount=136
____________
My lucky number is 75898^524288+1
axn Volunteer developer
The sieve algorithm is such that the computation is independent of both the range of candidates (b=2 to 100,000,000) as well as the actual list of candidates (27,462,822) that would be there in a dat file. This is true of both the CUDA version as well as the CPU version (AthGfnXXX). The only thing a dat file achieves is to limit the factor output to only those candidates that you're interested in. It doesn't affect the speed of the program (one way or the other).
The 5300 numbers that you're processing per second are the potential (prime) factors that the program is testing against these candidates (b=2 to 100,000,000). And no, the program doesn't test each of them one by one against all the candidates -- remember the part where 'computation is independent of candidates'?
Bottom line -- dat-file means less redundant factors, but no speed difference.
Those who are of a mathematical persuasion might peruse http://fatphil.org/maths/GFN/maths.html
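For those wondering why the candidate list can't speed things up, here is a rough sketch of the number theory (the page above has the real treatment): an odd prime p can divide b^2^n+1 only if p is of the form k*2^(n+1)+1, because
p | b^2^n+1  <=>  b^2^n ≡ -1 (mod p),
and that congruence is solvable only when 2^(n+1) divides p-1. For such p the admissible b values fall into exactly 2^n residue classes mod p, determined by p alone. Roughly speaking, a sieve of this kind works prime by prime, deriving those classes from p and reporting whichever b in 2..100,000,000 lands in one, so the work per prime is the same whether the dat file lists 27 million candidates or just a handful.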
JimB Honorary cruncher
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,982,426 RAC: 47,348
Exactly. Honza and I resieved n=1048576 from 0P to 1130P using AthGfn64 after an early problem left the factor file missing 935,075 entries. During resieving we used a factor file containing only those entries. Speed was not affected at all. All numbers in the range were still tested, but only those appearing in the sieve were actually output.
The only change I made with the dat-less sieving was to add columns to the GFN Sieve Status page. One set lists raw factors found and their density, the other one lists factors newly removed from the sieve and that density. Dat-less sieving affected the first two of those, but not the last two.
Ah nice, thanks for clearing that up guys. :)
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Just a practical question before I'm going to reserve some more P's for sieving. How badly do you want it to be done before the challenge? I.e. would you prefer me to reserve an amount that will most certainly be done before the challenge starts (but leave my GPU unused for some time) or is it also ok if it'll take a couple more hours of computing after the challenge?
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
Just a practical question before I'm going to reserve some more P's for sieving. How badly do you want it to be done before the challenge? I.e. would you prefer me to reserve an amount that will most certainly be done before the challenge starts (but leave my GPU unused for some time) or is it also ok if it'll take a couple more hours of computing after the challenge?
I think Michael Goetz already answered that question here:
http://primesearchteam.com/showthread.php?t=40
____________
676754^262144+1 is prime
Michael Goetz Volunteer moderator Project administrator
Just a practical question before I'm going to reserve some more P's for sieving. How badly do you want it to be done before the challenge? I.e. would you prefer me to reserve an amount that will most certainly be done before the challenge starts (but leave my GPU unused for some time) or is it also ok if it'll take a couple more hours of computing after the challenge?
My personal preference would be for people to keep reserving sieve ranges like they've been doing so far, suspend them during the challenge, and resume them after the challenge.
(Due to impending scorching weather, I'll be suspending crunching for a few days after my sieve range finishes tomorrow, but otherwise I'd be reserving another 100P which would span the challenge days.)
I don't expect to complete the sieve before the challenge, although we'll be close. I do think there's a good chance that we'll get to the point where all the available sieve reservations at n=22 are exhausted; they just won't all be returned. We've got about 2000P to go with a bit more than a day to go, and the sieve reservation leading edge has been advancing at about 2000P per day. (The awesome thing is that the first 4000P took about 6 months; now we're advancing at 2000P per day. Wow.)
____________
My lucky number is 75898^524288+1
Since the CW sieve has been completed for now, would it be possible to bring the GFN sieve to BOINC? Some time ago it was quite popular, but now with the new n=20 genefer tasks/challenge people have lost interest. Maybe with a new BOINC project/badge people will renew their interest. And let us not forget the Kepler GPUs are more efficient at sieving than at Genefer.
What I found in the GFN sieve discussion: http://primesearchteam.com/showthread.php?t=40&page=17
Optimal sieve point for n=20 is ~ 100E (100,000P), currently sieved to 26,000P
Optimal sieve point for n=21 is ~ 400E (400,000P), currently sieved to 20,200P
Optimal sieve point for n=22 is ~ 1000E (=1,000,000P), currently sieved to 41,300P
So about 74E + 380E + 959E = 1413E is still needed to reach optimal sieve depth for these N's, while only 87E has been completed so far. To put that in perspective, that is more than 5000 million BOINC credits worth left to sieve.
Michael Goetz Volunteer moderator Project administrator
Since the CW sieve has been completed for now, would it be possible to bring the GFN sieve to BOINC? Some time ago it was quite popular, but now with the new n=20 genefer tasks/challenge people have lost interest. Maybe with a new BOINC project/badge people will renew their interest. And let us not forget the Kepler GPUs are more efficient at sieving than at Genefer.
What I found in the GFN sieve discussion: http://primesearchteam.com/showthread.php?t=40&page=17
Optimal sieve point for n=20 is ~ 100E (100,000P), currently sieved to 26,000P
Optimal sieve point for n=21 is ~ 400E (400,000P), currently sieved to 20,200P
Optimal sieve point for n=22 is ~ 1000E (=1,000,000P), currently sieved to 41,300P
So about 74E + 380E + 959E = 1413E is still needed to reach optimal sieve depth for these N's, while only 87E has been completed so far. To put that in perspective, that is more than 5000 million BOINC credits worth left to sieve.
It would be possible, yes. It might not be desirable, however. If we start a new project, we'll create a new badge, of course. And everyone will want to get that badge, right?
GFN sieving is ONLY available as an Nvidia GPU application. There's no CPU program for sieving anymore. (We used to sieve on the CPU, but we're way beyond the range that could be handled by that program.) So this project will absolutely, positively exclude everyone who doesn't have an Nvidia GPU.
It will also exclude everyone with a Mac.
I don't recall off the top of my head if there's a Linux version either, so it might be limited to only Windows and only Nvidia.
I don't think that's going to go over very well with the Mac, Linux, or AMD crowds.
____________
My lucky number is 75898^524288+1
Roger Volunteer developer Volunteer tester
Joined: 27 Nov 11 Posts: 1138 ID: 120786 Credit: 268,668,824 RAC: 0
April 2015 update:
n=20 sieved to 37,400P, ultimate goal 100,000P, 37.4% sieved to optimal level, sieve suspended
n=21 sieved to 44,600P, ultimate goal 400,000P, 11.1% sieved to optimal level, current goal 50,000P
n=22 sieved to 105,200P, ultimate goal 1,000,000P, 10.5% sieved to optimal level, current goal 120,000P
So about 62,600 + 355,400 + 894,800 = 1,312,800P still needed to reach optimal sieve depth, while 187,200P has been completed so far.
axn Volunteer developer
April 2015 update:
n=20 sieved to 37,400P, ultimate goal 100,000P, 37.4% sieved to optimal level, sieve suspended
n=21 sieved to 44,600P, ultimate goal 400,000P, 11.1% sieved to optimal level, current goal 50,000P
n=22 sieved to 105,200P, ultimate goal 1,000,000P, 10.5% sieved to optimal level, current goal 120,000P
So about 62,600 + 355,400 + 894,800 = 1,312,800P still needed to reach optimal sieve depth, while 187,200P has been completed so far.
For n=20, the optimal depth was calculated when the CUDA range was untested. Now that the range is mostly tested, the optimal level has come down, and hence the sieve was suspended. It might get picked up again if/when it gets moved to PRPNet and tested using genefer80 (as was done for n=18, recently). At that time, a new optimal point will have to be computed (which might be higher than 100,000P!)
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 918 ID: 370496 Credit: 610,974,903 RAC: 732,751
Okay so, now that OCL4 can do up to 400.000.000 B on any range.... shouldn't we change the sieving program to add those candidates as well? Or does that go beyond the sieve algorithm?
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
Joined: 1 Mar 14 Posts: 1051 ID: 301928 Credit: 563,881,725 RAC: 768
Okay so, now that OCL4 can do up to 400.000.000 B on any range.... shouldn't we change the sieving program to add those candidates as well? Or does that go beyond the sieve algorithm?
It's an interesting question, because the sieve algorithm itself does not use this upper limit (which is currently simply hardcoded to 100,000,000) at all. The limit value is only used to determine which "B" to print to the output file, and nowhere else. "B" values above this limit are silently discarded, although they all form correct factors. The algorithm itself seems to always search quite a big range of B, at least 32-bit (i.e. up to 4,294,967,295); it just doesn't report most of the results found. But my knowledge of this algorithm is not good enough; there could be some limitations in the calculations which I'm not aware of.
I think the real question should be "Does it make sense for us to search any GFN above 100,000,000?" Only if the answer is yes must the sieve program be updated ASAP.
Michael Goetz Volunteer moderator Project administrator
Okay so, now that OCL4 can do up to 400.000.000 B on any range.... shouldn't we change the sieving program to add those candidates as well? Or does that go beyond the sieve algorithm?
This puts us in a somewhat similar (but not identical) situation with GFN as we've always been with LLR. All the sieves for LLR projects have an upper limit. That limit is based not upon the absolute upper limit of the capability of the LLR program, but upon which numbers we expect to be testing in the foreseeable future.
The new abilities of the OCL4 program put us in a similar situation with GFN. It used to be that sieving to 100M was absolutely sufficient because not only did we not expect to ever search that high, but there was no version of genefer that could even test numbers beyond b=100M.
Now we have a GPU version of Genefer that can go higher. But will we ever need it? It's unlikely that we'll see n=17 or above ever get above 100M, or anywhere close to it. 15, and to a lesser degree 16, could possibly get that high after quite a few years. In theory. Remember, however, that there's no CPU version of Genefer that can go that high -- the CPU version of Genefer's limits, with the x87 transform, are similar to the OCL2 limits. For part of the push to 100M, therefore, it would be just GPUs doing the work. And that's assuming we want to run a GPU-only project, which is something we prefer not to do.
The bottom line is that just because OCL4 can test numbers up to 400M doesn't imply we need to sieve that high, and there's certainly no urgency about it. I don't expect us to ever need a sieve file b>100M for anything other than n=15 and maybe n=16. N=15 isn't currently being sieved, so the only sieve currently running where we would even theoretically be able to use those factors is n=16. That need is a long, long way off.
____________
My lucky number is 75898^524288+1
Rafael Volunteer tester
Okay so, now that OCL4 can do up to 400.000.000 B on any range.... shouldn't we change the sieving program to add those candidates as well? Or does that go beyond the sieve algorithm?
This puts us in a somewhat similar (but not identical) situation with GFN as we've always been with LLR. All the sieves for LLR projects have an upper limit. That limit is based not upon the absolute upper limit of the capability of the LLR program, but upon which numbers we expect to be testing in the foreseeable future.
The new abilities of the OCL4 program put us in a similar situation with GFN. It used to be that sieving to 100M was absolutely sufficient because not only did we not expect to ever search that high, but there was no version of genefer that could even test numbers beyond b=100M.
Now we have a GPU version of Genefer that can go higher. But will we ever need it? It's unlikely that we'll see n=17 or above ever get above 100M, or anywhere close to it. 15, and to a lesser degree 16, could possibly get that high after quite a few years. In theory. Remember, however, that there's no CPU version of Genefer that can go that high -- the CPU version of Genefer's limits, with the x87 transform, are similar to the OCL2 limits. For part of the push to 100M, therefore, it would be just GPUs doing the work. And that's assuming we want to run a GPU-only project, which is something we prefer not to do.
The bottom line is that just because OCL4 can test numbers up to 400M doesn't imply we need to sieve that high, and there's certainly no urgency about it. I don't expect us to ever need a sieve file b>100M for anything other than n=15 and maybe n=16. N=15 isn't currently being sieved, so the only sieve currently running where we would even theoretically be able to use those factors is n=16. That need is a long, long way off.
Ironically, I only asked this question because you said earlier "we get these candidates for free, even if we never intend to test them".
Who knows, we might make a "GFN 22 - Mega" with those huge numbers. Or someone figures out a miraculously faster way to test candidates. Unlikely, I know. But if it really comes at no cost, might as well do it....
Michael Goetz Volunteer moderator Project administrator
But if it really comes at no cost, might as well do it....
Indeed, there's probably no compelling reason not to do it, because except for some additional disk I/O we'd get it for free. But it's not something that's very urgent.
____________
My lucky number is 75898^524288+1
GDB
Joined: 15 Nov 11 Posts: 304 ID: 119185 Credit: 4,289,577,535 RAC: 1,745,492
We're approaching the 120,000P sieving limit for n=22. Is it going to be increased?
JimB Honorary cruncher
We're approaching the 120,000P sieving limit for n=22. Is it going to be increased?
There's no such thing as an upper p limit in manual sieving. It's just me trying not to have too much blank space on the results graph and I make the results table agree with the graph.
But yes, I'll increase it the next time I'm in there.
Rafael Volunteer tester
Just wondering a few things:
1- To do sieving on n=13 for external testing, I had to use AthGfn64. And it got me thinking: couldn't we use it to allow manual sieve on the CPU?
2- Why is the CUDA app limited to n>= 15? Why can't it do 14 and below?
3- Can we add an "optimal sieve depth" field somewhere? Whether it be on the stats page or reservation page, or anywhere else, it would be pretty nice to know.
JimB Honorary cruncher
Just wondering a few things:
1- To do sieving on n=13 for external testing, I had to use AthGfn64. And it got me thinking: couldn't we use it to allow manual sieve on the CPU?
2- Why is the CUDA app limited to n>= 15? Why can't it do 14 and below?
We're not testing n<15. And the lower the n, the slower the GFN sieve runs. I'm not sure it's worth using a GPU there. Also, I don't believe the algorithm used has been tested at those n's. Plus they're so fast to primality-test that the optimal sieving level would be quite low.
3- Can we add an "optimal sieve depth" field somewhere? Whether it be on the stats page or reservation page, or anywhere else, it would be pretty nice to know.
Optimal depth varies depending on which genefer transform is being used and which of the three sieving algorithms is used (varies by how high p is). I had some rough approximations at one point, but they were all based on OCL3 and OCL2. I'd have to do a bunch of benchmarking to get new numbers. There are estimates of optimal sieving depth by others in this thread but they're also out of date. Suffice it to say that the optimal level is at least twice the current level for all open sieves. I don't know offhand about the suspended sieves (n=18, 19, 20).
Plus, some time in the week or so we're coming out with 64-bit GFN sieving program for both CUDA and OpenCL under both Windows and Linux. The 64-bit code has a pretty decent speed advantage over the 32-bit plus it has an option to use a lot less CPU for higher n's. That would render any current optimal point obsolete. I might possibly recalculate the optimal sieving depth after that.
Rafael Volunteer tester
We're not testing n<15.
Yes, PG isn't. I am, though. Trying to help fill these gaps.
And the lower the n, the slower the GFN sieve runs. I'm not sure it's worth using a GPU there. Also, I don't believe the algorithm used has been tested at those n's.
I once tried just running the CUDA sieve at n=13. After all, if it would have a removal rate of ~6 candidates per min or more, then it would be better to run the sieve rather than PRP. But the app immediately threw an "n must be 15 min" error message.
It's not even a matter of "being worth it" or not, it's straight up "the program doesn't work at that range". That's why I'm asking: why?
Plus, some time in the week or so we're coming out with 64-bit GFN sieving program for both CUDA and OpenCL under both Windows and Linux. The 64-bit code has a pretty decent speed advantage over the 32-bit plus it has an option to use a lot less CPU for higher n's. That would render any current optimal point obsolete. I might possibly recalculate the optimal sieving depth after that.
YAY!
.... but it's still GPU software, people would need graphics cards to run it. AthGFNsv, on the other hand, is a CPU program, which fills the gap. That was the actual point of the question: COULD we use that for, say, n=16? Even if it was only going to remove 1 candidate every few minutes, it would still be better to run the CPU sieve (with AthGFNsv) rather than genefer CPU.
Putting it in other words, if I reserved a range and ran it with the CPU program instead of the GPU app, would it work (much like I can run "amd genefer OCL" on an Intel GPU, as the output is the exact same)?
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
It's not even a matter of "being worth it" or not, it's straight up "the program doesn't work at that range". That's why I'm asking: why?
I think only the author could answer you. All I know is that there is a comment in the source code:
// For lower n's there is not enough iterations available.
From a programmer's point of view, the code immediately following this comment (which calculates the GPU grid size) has no problems with overflows, but something may happen in the sieving algorithm itself. Unfortunately, my math skills are not high enough to figure this out.
Second, how fast is the CPU sieve for GFN-13/14 working on your computer? The speed of the GPU program halves each time N decreases. This is a table of the speed limits of the GPU sieving program, in P/day. It means that the CPU (4770K @ 4000MHz) cannot feed the GPU faster than the shown value:
-------------+--------------------------------------------------------------
Program | GFN-15 16 17 18 19 20 21 22
-------------+--------------------------------------------------------------
gfnsvsim_w32 | 13.9 27.8 51.7 102.4 204.2 401.0 774.0 1453.0
gfnsvsim_w64 | 20.9 41.8 76.8 152.7 301.6 591.0 1145.0 2157.0
-------------+--------------------------------------------------------------
So the maximum speed of GPU GFN-14 sieving will be about 10.4 P/day, and of GFN-13 about 5.2 P/day. That seems to be very close to the speed at which AthGFNsv could work on this CPU.
.... but it's still GPU software, people would need graphics cards to run it. AthGFNsv, on the other hand, is a CPU program, which fills the gap. That was the actual point of the question: COULD we use that for, say, n=16?
Although it's technically possible, I think it's not widespread because it's more difficult to support. The GUI program is harder to set up: you have to enter everything manually into dialog boxes, and it's very easy to make a typo entering something huge like 200000000000000000 as a starting range. Since it's probably applicable only to N=16, making this feature public just isn't worth the effort.
Putting it in other words, if I reserved a range and ran it with the CPU program instead of the GPU app, would it work (much like I can run "amd genefer OCL" on an Intel GPU, as the output is the exact same)?
I suggest you benchmark AthGFNsv at different N's first and compare the results with the table above, then decide whether the speed is acceptable for you or not (considering that the speed of a cool and cheap GTX 750 Ti is 50-55 P/day).
Plus, some time in the week or so we're coming out with 64-bit GFN sieving program for both CUDA and OpenCL under both Windows and Linux. The 64-bit code has a pretty decent speed advantage over the 32-bit plus it has an option to use a lot less CPU for higher n's. That would render any current optimal point obsolete. I might possibly recalculate the optimal sieving depth after that.
On behalf of those of us who are similarly situated as myself (most of our Nvidia horsepower is on Macs, and our Windows-based AMD horsepower is older Tahiti stuff, and therefore not so great for sieving), I would like to put in a plug for at least CUDA on OS X.
Failing that, is the source going to be available?
GDB
I returned an n=22 factor file for range 122667P-122867P. The factor file had 1808 factors, but the results table shows 1807 factors found. Why the difference? Usually they're the same. The only difference that I noticed in the factors for this range was that one factor (2962) is unusually small (<10000). Maybe small factors aren't counted?
JimB Honorary cruncher
I returned an n=22 factor file for range 122667P-122867P. The factor file had 1808 factors, but the results table shows 1807 factors found. Why the difference? Usually they're the same. The only difference that I noticed in the factors for this range was that one factor (2962) is unusually small (<10000). Maybe small factors aren't counted?
Because I remove duplicates as a matter of course, or rather my scripts do. Your file had two lines that looked like this:
122726188122258800641 | 87829630^4194304+1
122726188122258800641 | 87829630^4194304+1
The duplicate line got removed. I wouldn't normally care about duplicates, they don't hurt anything, but I take them out because of the results table. Sometimes there are hundreds of duplicate lines, though it's more common in PPR12M sieving.
GDB
It might have come when I restarted after a reboot.
Rafael Volunteer tester
Plus, some time in the week or so we're coming out with 64-bit GFN sieving program for both CUDA and OpenCL under both Windows and Linux. The 64-bit code has a pretty decent speed advantage over the 32-bit plus it has an option to use a lot less CPU for higher n's. That would render any current optimal point obsolete. I might possibly recalculate the optimal sieving depth after that.
So, I've just randomly remembered about this... obviously it has been more than a week. Any news / ETA?
JimB Honorary cruncher
Plus, some time in the week or so we're coming out with 64-bit GFN sieving program for both CUDA and OpenCL under both Windows and Linux. The 64-bit code has a pretty decent speed advantage over the 32-bit plus it has an option to use a lot less CPU for higher n's. That would render any current optimal point obsolete. I might possibly recalculate the optimal sieving depth after that.
So, I've just randomly remembered about this... obviously it has been more than a week. Any news / ETA?
I'd forgotten I ever mentioned that in public. Plus what I mentioned was incomplete. I should go back and reread my own posts once in a while. I'll have something to say by the end of the weekend.