Message boards :
Sieving :
GC sieving (for CPU) is now available for five bases
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
We've opened five bases for Generalized Cullen sieving. They are b=29, b=41, b=47, b=49 and b=53. We're sieving for n=1M-2M, as no prime was found for n < 1M.
There are some differences between these and other current sieving. If you run separate instances, each one needs its own directory, as otherwise there will be conflicting filenames. You'll need to download a sieve file. While you only need one copy, it's probably easier to copy the file into each directory that you sieve in. The sieve file is different for each base. If you've done GC sieving in the past, you MUST download the new sieve file before starting. Older sieves were for different n ranges, even though the names may be similar.
At this point we don't know where the optimal sieving level will be, but suspect it's at least p=30T for all bases involved. These are larger candidates than for previous instances of these bases, therefore we'll sieve higher. A sieve is considered optimal when it takes the same or less time to test each candidate on PRPNet than to remove candidates by sieving. It's expected that the minimum reservation of 100G will take somewhere between 40 and 50 hours on a single core, depending on how fast the CPU is. Different bases take slightly different times, hence the varying credit per reservation.
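As a rough illustration of the two estimates above, here is a minimal Python sketch. The function names are made up for this example, the 668698 p/sec rate is simply the figure one participant reports further down this thread, and the per-factor and per-test timings are hypothetical placeholders, not measurements.

```python
def hours_for_range(range_size_p: float, p_per_sec: float) -> float:
    """Estimate wall-clock hours to sieve a range of the given width in p."""
    return range_size_p / p_per_sec / 3600.0


def sieving_still_worthwhile(secs_per_factor: float, secs_per_test: float) -> bool:
    """True while removing a candidate by sieving is cheaper than testing it."""
    return secs_per_factor < secs_per_test


if __name__ == "__main__":
    # A 100G reservation at a rate reported later in this thread.
    print(f"{hours_for_range(100e9, 668698):.1f} hours for 100G")
    # Hypothetical timings: 5000 s per factor found vs 20000 s per primality test.
    print(sieving_still_worthwhile(secs_per_factor=5000.0, secs_per_test=20000.0))
```

At that rate a 100G range lands inside the 40 to 50 hour estimate. The second function only restates the stopping rule: once finding a factor costs as much as a test, further sieving no longer pays off.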
For what I believe is the first time, we have a Mac version of the sieve program. While I haven't personally tested it (I don't own a Mac), I believe this program should work just fine.
GC sieving doesn't benefit from newer processor features, so it's ideal for older processors. If you're running on multiple cores, you'll get more throughput with HT (hyper-threading) on.
Please post any questions here, so that everyone can benefit from the answers. | |
|
Rafael Volunteer tester
 Send message
Joined: 22 Oct 14 Posts: 918 ID: 370496 Credit: 605,031,068 RAC: 563,489
                         
|
Uh... is this okay?
04/06/16 01:21:53 WARNING: --pmin=550000000000 from command line overrides pmin=500000000000 from `gc29_500G.txt' | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
Uh... is this okay?
04/06/16 01:21:53 WARNING: --pmin=550000000000 from command line overrides pmin=500000000000 from `gc29_500G.txt'
Yes, because it's not a single process doing all the sieving. You're starting that process at 550G and presumably another different one at 500G. You're not going to be producing a new sieve file, so it doesn't matter. I'll use the factor files to create new sieves here as necessary. | |
|
|
Is one instance needed per thread, or is there a command line switch to control how many threads to devote to an instance? I want to make sure I reserve an appropriately sized range based on this.
____________
| |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
Is one instance needed per thread, or is there a command line switch to control how many threads to devote to an instance? I want to make sure I reserve an appropriately sized range based on this.
Good question, I'd forgotten about that. The -t flag is available in Linux and possibly in the Mac version. So in Linux, to use four (virtual or real) cores on one job you'd add -t4 to the command line. The Windows compiler this code was built with did not support the fork() call, which the code needs, so -t is not available in the Windows build.
README-threads wrote: MULTITHREADING
==============
As of version 1.2.0 the switch `-t --threads N' can be used to allow one
instance of the program to use N threads for the main sieving loop.
This feature is experimental, known problems are:
1. Running one multi-threaded instance of the program is LESS productive
than running multiple separate non-threaded instances.
2. The fork() system call is required, but is not supported by the MinGW
compiler used to build the Windows executables.
3. CPU-time statistics are no longer supported, all reported times and
speeds are based on elapsed time, except the cpu_secs field in the
checkpoint file which is in CPU time but will be inaccurate when the -t
switch is used.
4. Intermediate progress reports and checkpoints are based on whichever
thread has made the least progress, instead of waiting for all threads to
synchronise.
5. If one child thread is terminated then the whole program will terminate,
instead of redistributing the work over the remaining threads.
THREAD AFFINITY
===============
To set the CPU affinity for individual threads, use the `-A --affinity N'
switch once for each thread. For example:
gcwsieve -t2 -A0 -A2 ...
will start two child threads, one with affinity to CPU 0 and the other with
affinity to CPU 2.
I'll look into adding that to the instructions page. It just needs to be clear that it absolutely, positively won't work under Windows.
Later edit: Those changes have now been made. | |
|
John Honorary cruncher
 Send message
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
                 
|
We've opened five bases for Generalized Cullen sieving. They are b=29, b=41, b=47, b=49 and b=53. We're sieving for n=1M-2M, as no prime was found for n < 1M.
Have y’all thought about GC bases 32, 75, 106, 115? They technically do not have GC primes: primes of the form n*b^n+1 with n+2 > b. Also, b=149 is still a primeless base for both GW and GC.
BTW, I was not able to access Daniel Hermle’s GC 101-200 info. Anyone able to access the link???
However, I did find it through archive.org.
____________
| |
|
|
Have y’all thought about GC bases 32, 75, 106, 115? They technically do not have GC primes: primes of the form n*b^n+1 with n+2 > b.
So, for each of those four b values you mention, there is one or more very small n which makes n*b^n+1 a prime. However, no n is known for these four b such that n+2>b.
Here is a dumb question: Why is the convention, n+2 > b? Why is this natural? Why not for example n > b or n-2 > b or something else?
/JeppeSN | |
|
John Honorary cruncher
 Send message
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
                 
|
Here is a dumb question: Why is the convention, n+2 > b? Why is this natural? Why not for example n > b or n-2 > b or something else?
From the Prime Pages: Generalized Cullen
The Generalized Cullen primes are the primes of the form n*b^n + 1 with n+2 > b. The reason for the restriction on the exponent n is simple, without some restriction every prime p would be a generalized Cullen because:
p = 1*(p-1)^1 + 1.
____________
| |
|
|
Here is a dumb question: Why is the convention, n+2 > b? Why is this natural? Why not for example n > b or n-2 > b or something else?
From the Prime Pages: Generalized Cullen
The Generalized Cullen primes are the primes of the form n*b^n + 1 with n+2 > b. The reason for the restriction on the exponent n is simple, without some restriction every prime p would be a generalized Cullen because:
p = 1*(p-1)^1 + 1.
I see the point that 1*b^1 ± 1 is trivial (any number has those forms). But I was more interested in why:
n+2 > b
was preferred over other possibilities like:
n > b
n-2 > b
n+1 > b
or similar. As an example (see A271718 (recent entry by myself)), we consider:
81*82^81 - 1
a generalized Woodall prime (n is smaller than b, but not too much smaller(?)), while the prime number:
110*112^110 - 1
is not considered a generalized Woodall because, according to the convention we have, n is too small compared to b.
So what motivates this "plus two" rule?
/JeppeSN | |
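To make the convention concrete, here is a small Python sketch that checks only the inequality n+2 > b for the pairs mentioned in this post. The function name is invented for illustration, and the code does not test primality at all.

```python
def meets_plus_two_convention(n: int, b: int) -> bool:
    """Return True when (n, b) satisfies the conventional restriction n + 2 > b."""
    return n + 2 > b


if __name__ == "__main__":
    # (n, b) pairs taken from the discussion above.
    for n, b in [(81, 82), (110, 112), (1, 2)]:
        verdict = "qualifies" if meets_plus_two_convention(n, b) else "does not qualify"
        print(f"n={n}, b={b}: {verdict}")
```

The pair (110, 112) fails by the narrowest possible margin, since 110 + 2 equals 112 rather than exceeding it.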
|
John Honorary cruncher
 Send message
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
                 
|
So what motivates this "plus two" rule?
I sought out others who might know the answer and so far, the only response I've received is that there's nothing special about using n+2. A restriction was needed and that's what was used.
The Prime Pages states "...a few authors have defined..." I'm still in search of "a few authors".
____________
| |
|
|
So what motivates this "plus two" rule?
I sought out others who might know the answer and so far, the only response I've received is that there's nothing special about using n+2. A restriction was needed and that's what was used.
The Prime Pages states "...a few authors have defined..." I'm still in search of "a few authors".
Maybe they chose n+2 > b because they wanted a criterion that the Cullen number:
1*2^1 + 1
could meet.
If they wanted a simple linear inequality, then:
n+2 > b
is the most restrictive choice that still allows the combination b=2 ("classical" Cullen primes) and n=1.
Just speculating.
/JeppeSN | |
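The speculation above can be checked with a short Python sketch. It assumes the family of candidate rules is n + c > b for an integer offset c (an assumption made only for this illustration) and searches for the smallest c that still admits n=1, b=2.

```python
def admits(n: int, b: int, c: int) -> bool:
    """Return True when the rule n + c > b accepts the pair (n, b)."""
    return n + c > b


def smallest_admitting_offset(n: int, b: int, candidates: range) -> int:
    """Smallest offset c in `candidates` for which n + c > b holds."""
    return min(c for c in candidates if admits(n, b, c))


if __name__ == "__main__":
    # A smaller c means a stricter rule, so the minimum is the most restrictive choice.
    print(smallest_admitting_offset(1, 2, range(-5, 6)))
```

With n=1 and b=2 the search returns 2, which matches the "plus two" in the convention.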
|
|
Wow, Generalized Cullen b=29 is already closed. It was super fast. ;) I thought that 30T is the limit. | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
Wow, Generalized Cullen b=29 is already closed. It was super fast. ;) I thought that 30T is the limit.
Yeah, so did I (though I didn't think I ever posted about it). After I do that big range (should take another 7.5 days unless I split it onto multiple machines) I'll look at the factor density and reopen if necessary. Density doesn't decline smoothly; there are stretches where it's artificially high or low.
If it's reopened, it might well be with a larger minimum reservation (like 200G), especially if we're at the point where there might not be a factor in any given 100G range (something I'm already seeing occasionally). When we were doing this sieving on primesearchteam.com the minimum reservation size was 1T. I decided we could go smaller here because then a) they take less time and b) the server prevents overlapping or duplicate reservations if two people reserve at the same time. | |
|
|
It would help ( in my case ) if the application were multithread aware. I have many threads, but they aren't high clocked.
100G takes 2.6 days per thread. | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
I've reopened reservations on b=29. I've found 20 factors in the first 588G of the reservation. I've also added b=55 (the last time I looked on PRPNet there was one job still pending, but I ran it locally and it's composite). | |
|
|
Need help with Linux. What should the start.sh look like?
Nevermind. It's done. :) | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
I just made some changes to the sieving code, I can now define a maximum sieving level (maximum p) per project. The only one defined at the moment is p=30T for GCW b=29. While I may reopen b=29 (much) later if we run out of sieving, at the moment sieving the other bases is more productive.
b=69 for n=500K-1M is almost complete on PRPNet, with 7 candidates remaining. Those tests look abandoned and I'll let them recycle automatically. Sieving for b=69 for n=1M-2M will be added next week after those tests finish. My initial sieving to p=500G on that base will be complete in another 3 days. | |
|
Rafael Volunteer tester
 Send message
Joined: 22 Oct 14 Posts: 918 ID: 370496 Credit: 605,031,068 RAC: 563,489
                         
|
I just made some changes to the sieving code, I can now define a maximum sieving level (maximum p) per project. The only one defined at the moment is p=30T for GCW b=29. While I may reopen b=29 (much) later if we run out of sieving, at the moment sieving the other bases is more productive.
b=69 for n=500K-1M is almost complete on PRPNet, with 7 candidates remaining. Those tests look abandoned and I'll let them recycle automatically. Sieving for b=69 for n=1M-2M will be added next week after those tests finish. My initial sieving to p=500G on that base will be complete in another 3 days.
It seems to have broken the other sieves, though, as GC Sieve is the only one that's showing up when I click the Manual Sieve page.... | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
It seems to have broken the other sieves, though, as GC Sieve is the only one that's showing up when I click the Manual Sieve page....
Interestingly, I didn't touch that program. I'll either have it fixed or revert back to older code within the hour.
Later: I've now reverted back to earlier code. The strange part of this is that I'd actually seen that myself and was so tired that it didn't strike me as wrong.
Final: Fixed the problem and the new code is back in place. | |
|
|
b=69 for n=500K-1M is almost complete on PRPNet, with 7 candidates remaining.
Now there are 4! :) | |
|
|
b=29 is reserved up to 30G. The button for reserving ranges is still active. | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
b=29 is reserved up to 30G. The button for reserving ranges is still active.
I hadn't really thought about having show_sieving.php test for available ranges. At the time, in my sleep-deprived state, I thought it was acceptable to have it work that way. I've now changed the code and any project which is technically still open for new reservations will now show "No new reservations" if in fact there are no ranges available. | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
A 500G range in gc29 was cancelled by the user, which is why it's again showing the ability to create new reservations.
Later edit: And now that's been re-reserved. | |
|
|
I'm getting "permission denied" when I try to run the file on Mac...
EDIT: Ok, for anyone who sees this later: You need to "chmod u+x" the file so that it can actually be executed.
(More details here: http://stackoverflow.com/a/18960752) | |
|
|
A weird thing when running in the Mac terminal is that it doesn't put further progress on new lines; rather, it just replaces the existing line. However, it overwrites the existing text without actually clearing it, so if the new output is shorter than the old one, you get something like this:
p=3672393134099, 668698 p/sec, 9 factors, 72.4% done, ETA 04 Jun 23:29ctor
That's because, before this, it said x sec/factor, but the new output was shorter than what was already there, so it left "ctor" from "factor" behind. | |
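The leftover characters come from redrawing a line with a carriage return without blanking what was there before. Below is a minimal Python sketch of the general fix (padding the new text to the old length). It is not the sieve program's actual code, and the status strings are hypothetical.

```python
import sys


def redraw(new_text: str, previous_len: int) -> str:
    """Return a string that rewrites the current line and blanks any leftover tail."""
    # "\r" moves the cursor to column 0; ljust pads with spaces up to the old length.
    return "\r" + new_text.ljust(previous_len)


if __name__ == "__main__":
    first = "progress: 12 sec/factor"   # hypothetical longer status line
    second = "progress: ETA soon"       # hypothetical shorter status line
    sys.stdout.write("\r" + first)
    sys.stdout.write(redraw(second, len(first)))
    sys.stdout.write("\n")
```

Without the padding, the terminal would keep the tail of the longer line, which is the same effect as the stray "ctor" above.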
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
I've just hidden GC b=29, we'd finished all the reservations. You can still see work that you did on it, but the project won't show on the manual sieving project list any more. b=41 has been completely reserved, I'm just waiting on people to finish. New bases (73, 79, 101, 109, 116, 121) will appear either when their current tests on PRPNet finish or when we run out of other sieving, whichever comes first. Initial sieving for those bases has already been done, so they can start at any time.
Please feel free to help us finish up b=47 and then the rest of the bases. I'm doing a lot of behind-the-scenes sieving at the moment, but when I get some cores free I'll go back to making GC reservations there. | |
|
JimB Honorary cruncher Send message
Joined: 4 Aug 11 Posts: 920 ID: 107307 Credit: 989,369,137 RAC: 7,509
                     
|
Sieve Files Updated
I've just replaced all the current GC sieve files. The new ones are smaller, with all factors found having been applied to those files. This means that any sieving using those files will be substantially faster, yet you'll get the same amount of credit.
Those new sieve files are now displayed on the instructions page for sieving. So when you make a new reservation the new file is where the old file used to be. For anyone with sieving in progress, if you click on the Instructions page for your reservation you can both see the new file and see how the command line changed to use it. Basically the new name just replaces the old name. It's perfectly acceptable to stop a range in progress, get the new sieve, then resume again having changed the command line to change just the -i part of it that specifies the input file. I do it all the time.
The difference in using the new file is that your computer doesn't have to test candidates that previous sieving has removed. If you're looking at the stats chart, factors found should be closer or even equal to candidates removed. It's more efficient.
I'll continue to replace those files on occasion, but probably won't mention it here again. The effect is greatest at low p levels where factors are densest. One or two candidates removed is hardly noticeable timewise, but when there are hundreds it can really make a difference.
Example - old command line
-p75e11 -P76e11 -igc47_500G.txt -ffgc47_7500G-7600G.txt -q -zz
New command line
-p75e11 -P76e11 -igc47_5800Gi.txt -ffgc47_7500G-7600G.txt -q -zz
Before anyone asks, an "i" at the end of the name denotes an incomplete sieve file. The sieve file is named for the highest range completed, but the "i" means that there are unreported ranges below that. For instance, in the example above, gc47 is currently missing 3800G-4800G. | |
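For anyone scripting the switch, here is a hedged Python sketch of the two ideas above: reading a sieve file name and replacing only the -i argument. The helper names and the regular expression are assumptions based solely on the file names shown in this post, not an official naming specification.

```python
import re
from typing import List, NamedTuple


class SieveFile(NamedTuple):
    base: int          # the b in gc<b>
    level_g: int       # highest completed sieve level, in G
    incomplete: bool   # True when the name ends in "i"


# Pattern inferred from names like gc47_500G.txt and gc47_5800Gi.txt.
_SIEVE_NAME = re.compile(r"^gc(\d+)_(\d+)G(i?)\.txt$")


def parse_sieve_name(name: str) -> SieveFile:
    """Split a sieve file name into base, level and the incomplete marker."""
    match = _SIEVE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised sieve file name: {name}")
    return SieveFile(int(match.group(1)), int(match.group(2)), match.group(3) == "i")


def swap_input_file(args: List[str], new_name: str) -> List[str]:
    """Return a copy of the arguments with only the -i<file> entry replaced."""
    return ["-i" + new_name if arg.startswith("-i") else arg for arg in args]


if __name__ == "__main__":
    old = "-p75e11 -P76e11 -igc47_500G.txt -ffgc47_7500G-7600G.txt -q -zz".split()
    print(" ".join(swap_input_file(old, "gc47_5800Gi.txt")))
    print(parse_sieve_name("gc47_5800Gi.txt"))
```

Every other argument, including the factor file name given with -f, is left untouched, which mirrors the advice to change just the -i part.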
|