PrimeGrid
1) Message boards : Number crunching : Katherine Johnson Memorial Challenge (Message 142195)
Posted 52 days ago by xii5ku
Dave wrote:
Any way of seeing what the FFT distribution is at this time? Would it be heavily skewed towards the high end or more of a classic Poisson?

Here are two samples of 10^4 tasks each.

Sunday noon:
3 % 768K, 1 % 864K, 9 % 896K, 15 % 960K, 32 % 1M, 40 % zero padded 1120K

Thursday night:
2 % 768K, 0 % 864K, 3 % 896K, 13 % 960K, 32 % 1M, 12 % zero padded 1120K, 38 % 1120K

This is from hosts which run the FMA3 FFT. Hosts which employ other implementations will use different FFT lengths for the same workunits.

You can also look up the current k's and n's at stats_trp_llr.php and check their FFT lengths this way:

llr.exe -d -q"${k}*2^${n}-1"
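For example, with illustrative values k = 12345 and n = 1000003 (placeholders of mine, not taken from the stats page), the lookup becomes:

llr.exe -d -q"12345*2^1000003-1"

The diagnostic output then reports, among other things, the FFT length which LLR selects for this candidate.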
2) Message boards : Problems and Help : Requesting new work -- "got 1 new tasks" (Message 139391)
Posted 170 days ago by xii5ku
Good point; this built-in "intelligence" of the client (or server?) needs to be kept in mind when assessing the work buffer depth settings.

Though it doesn't apply in this case:

  • Affected were and are clients with different <on_frac> and friends, anywhere between ~50% and 99.999%.
  • When the client has got n logical CPUs idle, it should request work for n CPUs, and should receive it in one go* even with poor <on_frac>.** And indeed these clients do receive work for n CPUs in a single request in such a situation at other projects, and did so at PrimeGrid too in the past.



--------
*) except if the project or application is new to the client, then the 1st request fetches one task, but the next request would get several tasks
**) because the case of several idle logical CPUs removes work buffer depth from the equation
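A side note on footnote **: this can be observed in the request file sched_request_www.primegrid.com.xml which the client writes before each scheduler contact. A sketch with illustrative values (field names as I remember them, so treat them as an assumption): the instance count is filled in from the idle CPUs, independently of the seconds derived from the buffer depth.

<cpu_req_secs>0.0</cpu_req_secs>
<cpu_req_instances>23</cpu_req_instances>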

3) Message boards : Number crunching : Sophie Germain's Birthday Challenge (Message 139390)
Posted 170 days ago by xii5ku
Bur wrote:
Just to make sure: the setting you suggest is the "Max # of threads for each task"?

Yes, I mean the setting which is currently called "Multi-threading: Max # of threads for each task" but would better be labeled "Multi-threading: # of threads for each task", AFAICT.

Bur wrote:
So basically, set the number of cores to use to the percentage of physical cores vs logical. And "max # of threads" to 1/n (where n = #physicalCores/#logicalCores)? Spoken like the true mathematician I am not. ;)

As I don't have access to the client right now I probably cannot change the # of cores setting?

Note, I suggested this not as an optimization for your hardware, but for testing purposes (to learn more about where the throughput optimum of this Sandy Bridge i3 really is, and why). That is, I meant it more as something which could be tried if and when there is spare time, outside of a contest (unless you had already run such a test, which I now understand you haven't).

Two more configs which could be tried:

  • 3 single-threaded tasks concurrently
  • 2 single-threaded tasks and 1 dual-threaded task concurrently (more difficult to set up, but possible)

That's because of the 1+ MB cache footprint per task, while this i3 has got 3 MB L3 cache.
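(Side note: on Linux, the cache size to budget against can be queried on the command line; getconf is part of glibc, so this should work on most distributions. It reports bytes, e.g. 3145728 for a 3 MB part.)

getconf LEVEL3_CACHE_SIZE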

Bur wrote:
Apart from the SGS challenge, are those settings generally recommend? I see the HT warning for a lot of categories. I will do some more testing on monday in that case when I continue working towards the PPS silver badge. :)

I meant it as a question, not as a recommendation, especially since I misunderstood your performance report at my first reading. :-)

I suspect the postings which recommend avoiding Intel Hyperthreading in LLR based PrimeGrid subprojects come mostly from people who never did any measurements of their own, didn't measure in recent years, or applied poor testing methods. Proper advice on whether or not to use Hyperthreading needs to take into account the particular hardware and subproject, the optimization goal (throughput, run time, efficiency), and whether LLR runs exclusively or together with other workloads.

Alas, I have very little test coverage of my own for processors similar to yours, hence can't give a precise recommendation.
4) Message boards : Number crunching : Sophie Germain's Birthday Challenge (Message 139364)
Posted 170 days ago by xii5ku
Bur wrote:
Setting the number of simultaneous tasks to the number of physical cores slowed down things for me. 4 simultaneous WUs took 33-34 min each, 2 WUs were 19 min each.

So in total it resulted in less WU/t.

Your count of 4 "physical cores" is incorrect: the processor has got only two physical cores, each with two hardware threads.

But your finding that your processor had lower throughput with 4 concurrent tasks than with 2 is entirely expected:

Each SGS-LLR task needs to hold more than 1.0 MB of data in the processor caches, otherwise its performance will tank due to a high rate of accesses to main memory. The i3-2120 has got 3 MB level-3 cache, hence cannot support more than two simultaneous SGS-LLR tasks without heavy memory accesses.

Now the question remains whether 2 single-threaded tasks or 2 dual-threaded tasks give higher throughput. I know the answer for some other processor architectures ("it's a wash"), but not for Sandy Bridge.


Oops, my reading comprehension could stand improvement.

Did you compare 4 single-threaded tasks with 2 single-threaded tasks, or with 2 dual-threaded tasks?
Bur wrote:
I didn't run the 50% test for long,

Sounds like 2 single-threaded tasks then. Hence maybe 2 dual-threaded tasks could improve on that.

Re: my prior statement on heavy memory access: On processors with high core count and very large cache, I actually have measured just a gradual decline in throughput as soon as the combined cache use of concurrent tasks began to exceed the cache size. Still, somewhat more than 4 MB hot data on a chip with 3 MB cache looks like too much. Though on second thought, it's merely 1 MB+ which needs to be pulled in and out all the time. That's tiny in comparison with most other LLR subprojects at PrimeGrid.
5) Message boards : Problems and Help : Requesting new work -- "got 1 new tasks" (Message 139363)
Posted 170 days ago by xii5ku
Thanks to all who looked at my question, and special thanks to those who responded so far.
Apologies that I make your life harder by not showing my hosts. (I am a shy one.)

Eudy Silva wrote:
Just a shot in the dark, but how much free space in your hard disk ?

Hmm, let's see. The 5 clients on 5 hosts which are currently running PG, and which all show this behaviour, report this as "free, available to BOINC":
66.49 GB, 62.24 GB, 60.47 GB, 59.10 GB, 196.99 GB
So this should be plenty.
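(In case it's quicker than the web site: such disk figures can also be pulled from a running client on the command line, via boinccmd's disk usage query.)

boinccmd --get_disk_usage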

Some other random info:

  • The client versions are "well hung": 7.8.3 on all of the 5 hosts.
  • These clients didn't do this at PG in older times. As mentioned, I think the change in behavior more or less coincided with the "mt" plan class implementation at primegrid.com.
  • I don't recall having seen this behavior at other projects. Granted, most of the time I set larger buffers when I run other projects, but sometimes I set such small buffers at other projects too.

6) Message boards : Problems and Help : Requesting new work -- "got 1 new tasks" (Message 139336)
Posted 171 days ago by xii5ku
Thanks, I actually forgot to check whether or not it's at 0 % (which I rarely, but indeed sometimes, used).
But no, resource share is 100 %, and has been already before I joined the SGS challenge.

(But even at 0 % resource share, the client used to pull 23 tasks at once if 23 CPUs were idly waiting for work. Not so now anymore.)
7) Message boards : Problems and Help : Requesting new work -- "got 1 new tasks" (Message 139334)
Posted 171 days ago by xii5ku
Then what else is limiting?

My hosts don't have one core.

Ah wait. I had set the buffers to 0.02 + 0.02 days and similar. Which is 1/2 hour + 1/2 hour.
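(For reference, these are the two fields in question — a minimal global_prefs_override.xml sketch with those 0.02 + 0.02 values:)

<global_preferences>
   <work_buf_min_days>0.02</work_buf_min_days>
   <work_buf_additional_days>0.02</work_buf_additional_days>
</global_preferences>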

Now picture a 128 CPUs host set to use 64 CPUs, which completes SGS tasks in 13 minutes. I certainly expect to receive a metric boatload of tasks with such a setting, per request. But no, it gives 1 task per request.

And now I tried something out of this world: a buffer of 0.1 days + 0.01 days. Now I get... 5 (five!) new tasks per request.

Yay!

:-(

I used to run PG with 0.00 + 0.00 day buffers. Why doesn't this work reasonably anymore?
8) Message boards : Problems and Help : Requesting new work -- "got 1 new tasks" (Message 139326)
Posted 171 days ago by xii5ku
Hi,

Roughly since the "mt" plan class was implemented here at PrimeGrid, the client's scheduler requests for new work have been answered with no more than 1 new task each.

Now, with the SGS-LLR challenge going on, my best host can finish tasks faster than it can request tasks normally. And for sure there are hosts around here with even higher throughput than mine.

I am aware of two workarounds:

  • If the client's automatic request retry rate is just a tad too low to keep the host saturated, it is possible and sufficient to force the client into slightly more frequent scheduler requests than it makes on its own (with its built-in retry latency combined with PG's current <request_delay> of 7.0 seconds); a sketch follows below this list.
  • If the host's computational throughput significantly exceeds the client's possible request rate, then an obvious workaround is to operate more than one client on the host.
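A minimal sketch of the first workaround, assuming boinccmd can reach the client, with a 10 second period to stay above the 7.0 second <request_delay>:

while true; do
    boinccmd --project http://www.primegrid.com/ update   # forces a scheduler RPC
    sleep 10
done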


Of course, neither workaround fixes the actual problem that a wastefully high rate of scheduler requests has to be maintained per host.

So, is there any way to get more than 1 new task per scheduler request?

PS, my job control settings are:
Max # of simultaneous PrimeGrid tasks --- No limit
Multi-threading: Max # of threads for each task --- 1

PPS, at subprojects which benefit from multithreaded tasks, I set the threadcount locally per app_config.
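(A minimal sketch of such an app_config.xml, here assuming the application name llrPPS and 4 threads per task; the exact app names can be looked up in client_state.xml:)

<app_config>
   <app_version>
      <app_name>llrPPS</app_name>
      <plan_class>mt</plan_class>
      <avg_ncpus>4</avg_ncpus>
      <cmdline>-t 4</cmdline>
   </app_version>
</app_config>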

9) Message boards : Problems and Help : Looks like my TeAm got hijacked! (Message 139269)
Posted 172 days ago by xii5ku
Thank you, ILW8.
10) Message boards : Number crunching : Better multi-threading (Message 134065)
Posted 337 days ago by xii5ku
xii5ku wrote:
What does the setting "Max # of simultaneous PrimeGrid tasks" mean?
    a) per-host limit of tasks in progress
    b) same as app_config::project_max_concurrent
    c) something else


Michael Goetz wrote:
Both A & B.

No, it's only a).

I fetched two tasks, started both, then suspended one.
Then I reduced max_jobs to 1 in the web preferences, triggered a project update, and checked in sched_reply_www.primegrid.com.xml that the client received this new setting.
Then I resumed the second task, and it was immediately put into running state again by the client.

Naturally, the client doesn't do this with app_config::project_max_concurrent = 1.

PS,
the client with which I tested already had an app_config.xml for PrimeGrid when it was started, though without a project_max_concurrent line in it.
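(For comparison, the client-side setting which does enforce such a cap, in app_config.xml:)

<app_config>
   <project_max_concurrent>1</project_max_concurrent>
</app_config>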

