PrimeGrid
drummers-lowrise
1) Message boards : Problems and Help : I call BS (Busted WU?) (Message 41864)
Posted 3624 days ago by roberto4u2*Project donor
Maybe someone can enlighten me as to how my host (214753) which has not had any invalid WU's or changes in configuration in a month (and is not used for anything other than PrimeGrid really) will suddenly pop an invalid result on just this one (heavily contested) Woodall WU from the midst of the last competition:

http://www.primegrid.com/workunit.php?wuid=213600922

How can so many generally stable hosts (look at their task history too) all come up with different answers until a random two magically agree on an answer?

If the WU is OK and the answer is mathematically correct, shouldn't every stable host come up with the same answer? If the answer is not mathematically correct, wouldn't you expect unstable hosts to be the culprit?

So, if I have a stable host, and so do the rest of the guys from the looks of it, what happened here? How do multiple otherwise stable machines get pegged for just this one WU simultaneously? Is it a massive coincidence that all of them have a problem on just this one WU, or is there something borked with the WU itself?

This isn't the first report of problems similar to this on the Woodall WUs either, and it's quite frustrating as ranking in the competition and 160,000+ seconds of crunching time are on the line with just this one WU alone... let alone any other ones experiencing this same problem.

And just because it is a common answer, telling me my host must be suffering from some sort of overclocking or cooling problem isn't going to fly. My whole machine maintains a 72 degree temp maximum (measured at the chips) and I would have seen at least one other invalid task in the thousands of tasks before or after that one WU if that was the case. And I'm sure that these other guys which were marked invalid on just this one WU would have seen at least one other invalid on their hosts at some point as well.

Here is a link to one of the forum posts covering this and other contested WU's:

http://www.primegrid.com/forum_thread.php?id=3659

I think this needs a deeper inspection as it does not appear to be symptomatic of just one machine randomly firing off an invalid WU.

Thanks,
Robert

P.S. Also, it appears that Conan and I validate differently from each other on some WU's (including the one listed here), if you need two machines to compare residues between.
2) Message boards : Number crunching : What's wrong with this WU ? (Message 41827)
Posted 3626 days ago by roberto4u2*Project donor
We will eventually get a couple of winners.


That is the part that concerns me a bit. How do so many previously "reliable" machines seem to be getting contested results on these workunits? Machines known to validate all of their results before and after these batches without issues suddenly fail validation on a handful of workunits on which few machines can agree?

If there is a buried bug somewhere which gives repeatable but false results... wouldn't this look like the symptoms (until two machines end up "accidentally" with the same results)?
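To make that concern concrete, here is a minimal sketch of quorum-style validation, assuming (as this thread describes) that a WU is accepted once two hosts report the same residue. The function name and the residue strings are hypothetical, and this is not the actual PrimeGrid or BOINC validator.

```python
from collections import Counter


def first_quorum(residues, quorum=2):
    """Return the first residue reported `quorum` times, or None if none has yet."""
    counts = Counter()
    for residue in residues:
        counts[residue] += 1
        if counts[residue] >= quorum:
            return residue
    return None


# Hypothetical residues: five hosts, all different until the fifth matches the second.
reports = ["A1", "B2", "C3", "D4", "B2"]
winner = first_quorum(reports)

# A repeatable but wrong value reaches quorum just as easily as a correct one.
buggy_winner = first_quorum(["BAD", "GOOD", "BAD"])
```

The check only counts agreement, so under these assumptions it cannot tell a repeatable wrong residue from a correct one.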

Hopefully this isn't any sort of wider problem. Or if it is a wider problem it is caught relatively soon. Those WU's take quite a while to crunch after all, and that is a lot for someone to crunch only to have the WU invalidated through no fault of their own (or their machine).

- Robert

3) Message boards : Problems and Help : Can I force BOINC to request GPU wu's (Message 41677)
Posted 3630 days ago by roberto4u2*Project donor
That is due to the scheduler mechanics.

It still has the allocated time for the high priority tasks reserved and will not try to get more work when it thinks it will run out of time on the tasks it already has.

Good to see you figured it out!

Thanks,
Robert
4) Message boards : Problems and Help : Can I force BOINC to request GPU wu's (Message 41651)
Posted 3631 days ago by roberto4u2*Project donor
No, a list won't be necessary. But try just subscribing to the two sieves by themselves and not to the GPU LLR. Between PPS and CW Sieves there will be plenty of work.

Also, if you suspend CPU/GPU crunching due to using the machine (set in your preferences) or by clicking on the tray icon and clicking "Snooze" it will suspend the GPU tasks as well. CPU must be allowed for GPU to work but not vice versa.

If that doesn't ring a bell, the important part is to see what the last message is in your client log once you stop getting more work. It will have an error message we can follow up on.

Thanks,
Robert
5) Message boards : Number crunching : Is this an Over Clocking problem with this Host? (Message 41647)
Posted 3631 days ago by roberto4u2*Project donor
And another:

http://www.primegrid.com/show_host_detail.php?hostid=220484

Pages and pages of all sorts of invalid WU's.
6) Message boards : Number crunching : What's wrong with this WU ? (Message 41642)
Posted 3631 days ago by roberto4u2*Project donor
Add this one to the mix:

http://www.primegrid.com/workunit.php?wuid=213600922

5 inconclusive, 1 abort, and 1 error and the WU is still pending.

Mine is host 214753 and has not failed validation for any of the others.

Thanks,
Robert
7) Message boards : Problems and Help : Can I force BOINC to request GPU wu's (Message 41641)
Posted 3631 days ago by roberto4u2*Project donor
Is it your Core i7 machine or the Pentium D machine?

Which sub-projects are you subscribed to?

How many GPU projects are you subscribed to and "allowing" tasks on?

Some more information is needed.

Thanks,
Robert
8) Message boards : General discussion : Process to -19 Linux (Message 41358)
Posted 3637 days ago by roberto4u2*Project donor
I've never tried, but I suppose you could use the command line the LLR application is being launched with to calculate the same ranges over and over again. (i.e. you could load the same "job" on all cores at once). But as far as how to go about this exactly, well, I have no clue. I've never actually looked at the source code or really what the client is doing under the hood...

I just noticed that it seems like all of the parameters for what to check are just options passed to the LLR application. I don't think there really is an "input" file with data in it per se, but I might be mistaken. If someone with knowledge of this could pop in with some information...

In any case, once you get a static workload which does the same work every time, and you run the machine through comparison tests multiple times to get a good sample, then you should be able to use that to get a general idea if there is any real change or not.
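A rough sketch of that comparison, assuming you can wrap the fixed job in a callable: time it several times per configuration and only call it a change if the means differ by more than the combined spread. `fixed_workload` is a hypothetical stand-in, not the LLR application, and the threshold is a crude rule of thumb rather than a proper statistical test.

```python
import statistics
import time


def time_runs(workload, repeats=5):
    """Run the same workload `repeats` times and return the wall-clock samples."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def looks_different(a, b):
    """Crude check: do the two means differ by more than the combined spread?"""
    spread = statistics.stdev(a) + statistics.stdev(b)
    return abs(statistics.mean(a) - statistics.mean(b)) > spread


def fixed_workload():
    # Hypothetical stand-in for a job that does identical work every run.
    return sum(i * i for i in range(200_000))


samples = time_runs(fixed_workload, repeats=3)
```

You would collect one list of samples per setup (for example, one per OS or per priority setting) and pass both lists to `looks_different`.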




9) Message boards : General discussion : Process to -19 Linux (Message 41334)
Posted 3637 days ago by roberto4u2*Project donor
This was THE ANSWER :) (and that is not meant in a sarcastic way)
Thanks.
Meanwhile I found that setting the app to -19 was a bad idea; it was the slowest of all, so I abandoned that.
What I want to find out is whether Linux is faster than Windows in PrimeGrid. Since it is pure mathematics, I tried over 30 Linux distributions, from Puppy Linux to Ubuntu 11.04, and found that the difference is less than 5% (and even that is in question) in favor of Linux. I also tried many Linux kernels optimized for AMD and even tried compiling a kernel myself.
But on the other hand, that is how I learned about Linux :)
I start all Linux distros with init 3 and use plain boinc (not boinc manager) under Linux, since I have the GUI manager on a Windows host :)
I also found that Linux distro size (or kernel size) doesn't affect speed. The smallest distros / kernels were the slowest because they are built on old kernels (or made for old computers).
And finally, it is nearly unbelievable that such a small percentage of PrimeGrid users have Windows/Linux hosts. And if I ask which is faster on the same computer, I always get the answer: I have never tried to compare them :)


Glad I could help!

The reason you won't see much of a difference is that the applications don't use much of the OS resources except for a few relatively low level libraries. Everything else depends on the speed and capabilities of the hardware itself.

Obviously the drivers and OS can limit features available via those libraries (i.e. cannot run 64-bit code on a 32-bit OS on a 64-bit CPU normally) or cause problems and bugs, but the performance gain you mostly see will be in how many or how few background tasks are running on the machine. But low level OS libraries tend to be well tested and decently optimized to start with.

Linux can be tuned heavily for the specific type of workload being put on it. Windows can be tuned too, just not with as much freedom. With Windows you cannot easily shut down the GUI, turn off the printer daemon, or boot into different init levels.

Unfortunately, advanced tuning requires experience, benchmarking, trial-and-error, and considerable time for both operating systems. So in light of that fact, getting 5% just from using one over the other is actually quite a large boost for not much fuss, especially from an application which depends MOST on your hardware speed, capabilities, and load and LEAST on the OS functionality.

Thanks,
Robert
10) Message boards : General discussion : Process to -19 Linux (Message 41276)
Posted 3639 days ago by roberto4u2*Project donor
As these applications are not latency sensitive you will see a reduced benefit from changing the priorities (and may mess up the kernel a bit if you put too much at system/real-time). Priorities help more for short term responsiveness (tasks taking < 1 second), not long term performance (tasks taking > 1 second). My suggestion is not to solve the problem as you state it but rather do something else for your desired performance boost.

Think of it this way:

You have 200,000 people in a line. You have 10,000 people who need to be put into the line. In what order would you put the 10,000 people into the line where it would ultimately impact how long it takes to get all of the (now) 210,000 people through the line?

Answer: It doesn't really matter. Unless, of course, someone in that group needs to get through the line FIRST. Then it matters what the priority is otherwise all 210,000 people need to get through the line eventually anyways.

So in computers, let's say you have 200,000 tasks to do. Then someone moves the mouse cursor on screen and clicks a button (let's just say it's an extra 10,000 tasks). Obviously you won't want to wait until all 200,000 other tasks are complete just to register a mouse click so you would give those 10,000 tasks priority over the 200,000.

What you need to do if you want to get every ounce of performance out of the machine without driving yourself mad is to reduce the "extra" tasks running on the machine (the number of extra people going into the line), lock tasks to cores (making sure people aren't hopping from line to line as it takes time to coordinate that), or change the performance of the computer by raising the clock speed (how fast the line moves).
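The line analogy can be checked with a tiny simulation. This is only a sketch, scaled down to 200 and 10 tasks of one time unit each on a single core with no switching overhead. Those numbers and names are assumptions for illustration.

```python
def finish_times(durations):
    """Return the time at which each task completes when run back to back."""
    elapsed = 0
    finished = []
    for d in durations:
        elapsed += d
        finished.append(elapsed)
    return finished


background = [1] * 200  # stand-in for the 200,000 long-running tasks
click = [1] * 10        # stand-in for the 10,000 mouse-click tasks

click_last = finish_times(background + click)
click_first = finish_times(click + background)

total_last = click_last[-1]    # time to finish everything, click tasks at the back
total_first = click_first[-1]  # time to finish everything, click tasks at the front

click_done_last = click_last[-1]                # click tasks done only at the very end
click_done_first = click_first[len(click) - 1]  # click tasks done almost immediately
```

The total comes out the same either way. Only the moment the click tasks finish changes, which is the part priorities actually affect.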

Obviously, if you are new to computers or running on a tight budget (can't afford dead hardware) overclocking your PC is NOT recommended! Also, setting tasks to cores can be an exercise in frustration for the same reason as trying to set the priorities can be (they reset when a new task starts).
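For anyone who still wants to try locking a task to a core, here is a minimal sketch using Python's `os.sched_setaffinity`, which exists only on Linux (hence the `hasattr` guard). The helper name is hypothetical, and as written it pins only the process you name, so it would have to be applied again for every new task.

```python
import os


def pin_to_one_core(pid=0):
    """Pin a process (0 means this one) to a single allowed core.

    Returns the resulting core set, or None where the call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available on this platform (for example Windows)
    allowed = sorted(os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, {allowed[0]})  # keep only the first allowed core
    return os.sched_getaffinity(pid)


pinned = pin_to_one_core()
```

Pinning another user's process generally needs elevated privileges, so treat this as an experiment rather than a fix.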

I won't really go into too much detail in this post about it, but if you want the most bang for the buck:

Kill extra tasks like SAMBA (file sharing), CUPS (printing daemon), Portmap (remote RPC for NFS and others), NFS (Linux network filesharing), KDE / GNOME (use the CLI), and especially the BOINC manager application when you aren't using it (it uses 1-3% cpu just to show the task list).

Then, if you have a "primary" Windows computer with Linux hosts reachable over the network you can install Xming on the Windows box (free), and then use PuTTY (also free) to SSH into the hosts (start Xming first). With Xming running you can then run "boincmgr" from the CLI and the graphical window for it will pop up on your Windows workstation (this works using all default settings from a clean install). This gives you the ability to use the graphical interface without having to tinker with the BOINC settings for remote access on each and every machine or leave a GUI running on any of them either.

If all you have are Linux machines you can leave one of them running KDE / Gnome and use the same display forwarding as Xming (it's a built-in feature of Linux, so you don't need Xming installed) instead of leaving the GUI running on all of them. Just type 'echo $DISPLAY' in your ssh session. If you get a line of output then you should be ready to just run the manager. If not then you will need to enable this functionality (I won't cover that here but it's easy).

Another option you have if you really want to control priorities and CPU affinity (core binding) is to install vmware server / workstation / player (or another virtual machine server) and install a very minimal Linux install with no GUI or services except for SSH onto the virtual machine. Then bind the virtual machine (which always stays running) onto the cores you want with the priority you want (you only have to set this once per reboot of the virtual machine, or in the configuration files for the virtual machine depending on version of the server application). The only thing this doesn't do for you is allow you to run GPU tasks.

In any case, for these types of applications where latency doesn't matter, only total work load, you really don't want to muck with priorities too much. It will only leave you frustrated.

Thanks,
Robert


Copyright © 2005 - 2021 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 1.98, 2.38, 2.63
Generated 19 Sep 2021 | 17:29:14 UTC