I have to amend my original answer. It's been so long since I set this up (six years or so) that I forgot some of the details. For any GFN n value, I have two completely separate sieve files. While I think I've gone through the process before in the big manual sieving system thread, it probably bears repeating here so I can point people to a short thread.
When I'm about to validate manual sieving, I do the following:
1) Update the actual reservations themselves. In the very beginning I used to do this manually but it was error-prone and as more sieving happened it became a huge pain. So now there's a program that synchronizes the reservations between the PrimeGrid server and my local server (which generates the GFN stats and graphs). For those with a technical bent, I keep a tunnel open through each SSH client to the mysql port on each server. So my local workstation can run queries and update my local server quickly and efficiently.
2) Download the actual factor files. Again, this used to be done manually, but I wrote a small program that connects to the server and retrieves all the pending factor files. It just grabs every file in each reservation upload directory without regard to what kind of extension it has.
3) I run a program that I call "normalize". It does the following:
a) If the file is in .zip, .7z or .rar format, unpack it.
b) Test every factor to make sure it's valid. n value is tested to make sure it matches the filename.
c) Sorts the entries in factor,n order. The sieving program's output can be out of order.
d) Remove duplicate entries from the factor file. If you restart after a crash, there can be duplicates. That's mostly to keep the stats accurate as duplicate factors don't matter otherwise.
e) The output is always in "factor | candidate" format like 23803926529 | 3480^65536+1, as opposed to early versions of David Underbakke's GFN sieving program, which produced output like 1*3480^65536 + 1 factor : 23803926529. The program can read both styles. That's how it originally got the name "normalize".
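For the curious, the core of a step like this can be sketched in a few lines of Python. Everything here is my own reconstruction from the examples above, not the real program: the regular expressions, the function names and the exact old-style line layout are all guesses.

```python
import re

# Accept both factor-line styles, validate, then emit everything as
# "factor | b^n+1", sorted with duplicates removed.
NEW_STYLE = re.compile(r'^\s*(\d+)\s*\|\s*(\d+)\^(\d+)\+1\s*$')
OLD_STYLE = re.compile(r'^\s*1\*(\d+)\^(\d+)\s*\+\s*1\s+factor\s*:\s*(\d+)\s*$')

def parse_line(line):
    """Return (factor, b, n) or None if the line is not a factor line."""
    m = NEW_STYLE.match(line)
    if m:
        return int(m.group(1)), int(m.group(2)), int(m.group(3))
    m = OLD_STYLE.match(line)
    if m:
        return int(m.group(3)), int(m.group(1)), int(m.group(2))
    return None

def normalize(lines, expected_n):
    entries = set()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        p, b, n = parsed
        if n != expected_n:
            raise ValueError(f"n={n} does not match expected n={expected_n}")
        # p divides b^n + 1  iff  b^n is congruent to -1 (mod p)
        if pow(b, n, p) != p - 1:
            raise ValueError(f"{p} does not divide {b}^{n}+1")
        entries.add((p, b, n))          # the set silently drops duplicates
    return [f"{p} | {b}^{n}+1" for (p, b, n) in sorted(entries)]
```

Note that the same factor written in both styles collapses to a single output line, which is exactly what step 3d wants.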
4) I run a program that resieves from the last factor appearing in each factor file to the end of the range (as given by the filename). It's quite common for GFN22 factor files to not have factors for up to 0.12P without it being an error. For any file apparently missing more than 0.1P of sieving the program throws up a message. Unless I interrupt it, it'll finish sieving on each range. This is where I find most of the problems with uploaded factor files - they end far earlier than they should. If I'm doing processing around the 0400 UTC deadline for the system to give credit (more on that below), then I interrupt processing and remove that factor file from consideration, writing a PM to the user involved. If doing it at a different time of day and the range is not huge, I may let my workstation finish the sieving.
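A minimal sketch of that end-of-range test. The 0.1P threshold and the P (peta, 10^15) unit come from the description above; everything else, including the helper's name, is made up for illustration:

```python
P = 10**15  # the "P" unit used in GFN22 sieving ranges

def missing_tail(range_end, last_factor, threshold=0.1 * P):
    """Return the unsieved-looking tail length and whether it's suspicious.

    A factor-free stretch at the end of a file is normal up to roughly
    0.1P; anything bigger is worth resieving to check for truncation.
    """
    gap = range_end - last_factor
    return gap, gap > threshold

gap, suspicious = missing_tail(range_end=50 * P, last_factor=int(49.85 * P))
# a 0.15P tail: bigger than the usual factor-free stretch, so resieve it
```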
5) Once #4 is finished, the factor files are automatically copied to their appropriate directories on both my workstation and my home server. Each n has its own directory (names are 32768, 65536, 131072 etc. so it's harder to accidentally be in the wrong directory than names like GFN15, GFN16 etc.).
6) The local workstation is done first. For each n in which there are new factors, I run a program that opens the old sieve, reads every factor file and applies it to the sieve. As part of that process, the following tests are done:
a) The header line is tested to make sure it's appropriate for the file. n and bmax values must be valid. Any file missing a header line is flagged.
b) First and last factors are checked to make sure they correspond with the filename.
c) Every factor is again tested to make sure it really divides the candidate in question.
d) Gaps between successive candidates are looked at and flagged if they're too far apart.
e) Early sieving with Underbakke's program could have continuations. Special care was taken to ensure there was no gap around a continuation. As there was no checkpoint file, the user involved had to re-enter all the parameters of the search and often made mistakes: the factor value could have gaps, the n value could completely change, a different range could be appended, etc.
f) Every newly-generated factor file is expected to have b values above 100M if the user is running the right program. Any factor file that doesn't is flagged here.
Those tests are all run on my local workstation where the only output is the new sieve and nothing else can get screwed up by bad files. Any file that doesn't pass testing here is removed from my local server.
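As an aside, the divisibility test in 6c never needs to build the full multi-megadigit candidate: p divides b^n+1 exactly when b^n is congruent to -1 (mod p), which three-argument pow() checks almost instantly. A sketch (the function name is mine):

```python
def divides_gfn(p, b, n):
    """True iff p divides the generalized Fermat candidate b^n + 1.

    pow(b, n, p) does the exponentiation modulo p, so the enormous
    candidate itself is never constructed.
    """
    return pow(b, n, p) == p - 1

# Small worked case: 8^4 + 1 = 4097 = 17 * 241
assert divides_gfn(17, 8, 4)
assert divides_gfn(241, 8, 4)
```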
7) A similar program is run on my local server. This program doesn't do all the checking that happened on my workstation, but talks to my local database. It makes certain that factor files exist to completely cover each reservation. Factor counts, sieve removals and the values of the removals themselves are all recorded in database files. At the end of each run (one per n) two lists are printed. One is the list of newly-removed candidates that are currently loaded on PrimeGrid and should be removed. The other is a list of missing ranges that haven't yet been submitted (the gaps in the sieving). It's painful and time-consuming to remove bad data from the database, which is why this program is only run after the one on my workstation completes without errors.
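That coverage check amounts to a simple interval-gap scan. A sketch, assuming ranges are represented as (lo, hi) pairs — the post doesn't specify the real representation:

```python
def missing_ranges(reservation, covered):
    """Return the sub-ranges of a reservation not covered by any factor file.

    reservation: (lo, hi) pair for the whole reserved range.
    covered: list of (lo, hi) pairs, one per submitted factor file.
    """
    lo, hi = reservation
    gaps = []
    pos = lo
    for a, b in sorted(covered):
        if a > pos:                      # nothing covers [pos, a)
            gaps.append((pos, min(a, hi)))
        pos = max(pos, b)                # advance past this file's range
        if pos >= hi:
            break
    if pos < hi:                         # uncovered tail
        gaps.append((pos, hi))
    return gaps

# Files covering 0-40 and 60-90 of a 0-100 reservation leave two gaps
assert missing_ranges((0, 100), [(0, 40), (60, 90)]) == [(40, 60), (90, 100)]
```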
8) I have a web page where I copy and paste the factors to remove sieved-out work already loaded on the server. While this could be automated, it's helpful for me to see what shows up. This web page either removes candidates entirely if not yet turned into a workunit or cancels the workunit, turns it to quorum 1 (any finished job validates immediately) and sets the residue field to "FACTOR FOUND". A factor is better than a genefer test result and that value will not be overwritten by the validator.
9) After doing all current n values on the local server, I run another command there that generates the stats and regenerates any graph where the data has changed. That program automatically copies those updated files to PrimeGrid's server when it finishes.
10) Somewhere in all of this, usually as each n is done testing on my workstation, I manually validate each pending manual sieving reservation that I downloaded factors for. It's not unusual for more uploads to happen during this processing and those are either left until the next time I "do" manual sieving or downloaded immediately and processed before step 9 above. Each factor file moves from its upload directory to the factor file directory for that n.
11) At 0400 UTC each day, credit is moved from the PSA badge pending (PRPNet and manual sieving) into actual PrimeGrid credit. The amount transferred is up to 80% of your current Recent Average Credit (RAC). Of course this has the effect of boosting your RAC so if you have too much credit to transfer all at once, the amount transferred the next day is much larger.
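In other words, each day's transfer is roughly min(pending, 0.8 × RAC). A toy illustration of the arithmetic — deliberately ignoring how BOINC actually recomputes RAC, which decays exponentially toward recent output:

```python
def daily_transfer(pending, rac):
    """Move up to 80% of current RAC from PSA-pending into PrimeGrid credit."""
    moved = min(pending, 0.8 * rac)
    return pending - moved, moved

pending, moved = daily_transfer(pending=10_000, rac=2_000)
# moved is 1600; the transfer itself then raises your RAC, so the next
# day's 80% cap is bigger and a large backlog drains faster and faster
```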
Back to the sieve files: The sieves on my local workstation are for the full b range for that n. For example, on GFN19 (524288) early sieving only went to b=100M so that's what my sieve goes to. On GFN15, GFN16, GFN17 and soon GFN18 sieving went to b=2G from the beginning and sieves go that high too. Those sieves on my local workstation also have candidates removed due to algebraic factors (some candidates can't possibly be prime as they have known divisors that won't be found by our sieving). Those workstation sieves are the ones used to produce new work to be loaded on PrimeGrid.
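For those wondering what an algebraic factor looks like here: one well-known class (a general number-theory fact — I'm not claiming it's the only rule the sieves use) is that if b is a perfect odd power, say b = m^k with odd k > 1, then m^n + 1 divides b^n + 1, because x + 1 divides x^k + 1 for odd k (take x = m^n). A sketch:

```python
def odd_power_form(b):
    """Return (m, k) if b = m^k with odd k > 1, else None.

    Such a b makes every b^n + 1 divisible by m^n + 1, so those
    candidates can never be prime and are dropped from the sieve.
    """
    for k in range(3, b.bit_length() + 1, 2):
        m = round(b ** (1.0 / k))
        for cand in (m - 1, m, m + 1):       # guard against float rounding
            if cand > 1 and cand ** k == b:
                return cand, k
    return None

# Example: 8 = 2^3, so 2^4 + 1 = 17 divides 8^4 + 1 = 4097 (= 17 * 241)
assert odd_power_form(8) == (2, 3)
```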
Sieve files on my local server are only for the stats and graphs. They all end at either b=100M or b=400M. But it's also useful as another copy of the factor files involved. We keep at least four copies of every factor file. One is on the PrimeGrid web server box, one is on the PrimeGrid database server box which autosyncs with the web server, one is on my workstation and one is on my server. Additionally, every three months I make a backup of my entire sieving directory structure (565 gigabytes at the moment) onto a completely different local box. I have a year's worth of those. And when we finish sieving any project, I burn a copy to DVDR. We're serious about not losing data. Technically I don't need the factors after a new sieve has been generated, but if there are ever questions about whether sieving was done properly, those factors are invaluable.
Finally, bear in mind it takes a lot longer to talk (or read) about this processing than it takes to do it. Most of the tests don't ever find anything wrong, but we can't have improperly-eliminated candidates.