PrimeGrid
Message boards : Number crunching : A test for hyper threading

mackerel
Message 88397 - Posted: 23 Sep 2015 | 17:20:23 UTC

This is just a random thought I had.

Case 1: If we are doing LLR then the general rule is that HT should not be used, either by disabling it completely or only running the number of simultaneous tasks up to the number of physical cores. The software is already making near enough 100% potential use of the CPU so we can't get any more out of it from HT.

Case 2: If we are doing sieve, we do gain from HT and should run an instance on as many threads as the CPU can simultaneously handle. For whatever technical reasons there may be in the background, sieve can do more per core running two threads with HT than one.

Case 3: Can we do both? Imagine a scenario where for each physical core, it is running one LLR task and one sieve task through HT. I believe they use different parts of the processor, so is there potential to get more work out this way? As far as is my understanding, LLR uses AVX (floating point) instructions, and sieve uses integer instructions. So there is potential for them to not get in each other's way. At least for the execution units, but other shared resources may become limiting. I'm no better at the fine details of microprocessor architecture than I am at writing software, so consider this an informed guess.

Whether the above is of benefit could be tested by simply doing it. Which, on further thought, is not so simple. I don't believe there is any way to tell a single BOINC install to download 4 units from each subproject, nor is there any way to make sure they end up paired up appropriately on each physical core.

I think that leaves two more options:
A: run two boinc installs on one system. I believe this is possible with some modification to the install. But I don't think this offers an easy way around the affinity problem.
B: run a 2nd instance of boinc in a virtual machine, which has fixed affinity to separate cores. Then hope the process scheduler on the host is smart enough to put its load on the remaining free cores. A possible hindrance to this approach is that in some initial research, it sounds like for vbox the guest OS will only run if there are enough simultaneous cores free, which could be a major choke point. If this is the case, then the only workaround I can think of is to run single core guest OSes but this may become a management nightmare.

And at that point I wonder if all this is worth it to gain some extra work from a processor...

Note: I am assuming in the above we're talking about modern Intel CPUs. AMD CPUs don't have hyper threading. Ancient P4s had HT too but they would not be a serious consideration here.

Crun-chi
Message 88402 - Posted: 23 Sep 2015 | 18:35:55 UTC - in response to Message 88397.

I don't agree with you. You cannot squeeze a CPU and get 110% of the juice: you will always get only 100%.
So if your physical core runs an LLR task and your HT core runs a sieving task, you will not succeed in the manner you want. Your time, if not doubled, will be near that. And with that CPU you will not be able to do anything else: even mouse movement will be jumpy.
Also, it is impossible to put the sieve task on the HT core and the LLR task on the physical core on Windows; affinity doesn't work as well there as on Linux.
____________
93*10^1029523-1 REPDIGIT MEGA PRIME :) :) :)
57*2^3339932-1 MEGA PRIME :)
10994460^131072+1 GENERALIZED FERMAT :)
31*332^367560+1 CRUS PRIME :)
Proud member of team Aggie The Pew. Go Aggie!

mackerel
Message 88408 - Posted: 23 Sep 2015 | 19:25:16 UTC - in response to Message 88402.

Ok, to be precise I should have said LLR uses near enough 100% of the parts it needs. It does not mean it is using 100% of everything. This is where a possible gain might come from. We know HT gains for sieve, but not for LLR. But what about other combinations?

Picking made-up numbers for illustration only, let's say you can do 10 LLR units a day on a core, OR 10 sieve units a day. But with HT, you get 14 sieve units a day (I find that in practice you get around a 40% throughput advantage from HT on CPU sieve). If you could run my proposed mix, what might happen? Maybe LLR slows a little, but you gain on sieve: LLR <= 10 and LLR + sieve > 10, ideally >= 14. That is the hope. Worst case, you don't gain anything: LLR + sieve = 10. But we don't know that.
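
The made-up numbers above can be turned into a small check. This is only a sketch: it assumes throughput is counted in units per day per physical core and that one LLR unit is weighted the same as one sieve unit. The function name and the sample inputs are hypothetical and not measurements.

```python
def classify_mix(llr_per_day, sieve_per_day,
                 single_baseline=10.0, sieve_ht_baseline=14.0):
    """Classify a measured LLR + sieve mix on one physical core.

    The default baselines are the illustrative figures from the post:
    10 units/day for one task per core, 14 units/day for sieve with HT.
    """
    total = llr_per_day + sieve_per_day
    if total >= sieve_ht_baseline:
        return "ideal: at least as good as sieve with HT"
    if total > single_baseline:
        return "gain: better than one task per core"
    return "no gain"


# Hypothetical outcomes, chosen only to exercise each branch.
print(classify_mix(9.0, 5.5))   # LLR slows a little, sieve fills the gap
print(classify_mix(8.0, 3.0))   # some gain, but less than sieve with HT
print(classify_mix(6.0, 4.0))   # the worst case described above
```

Whether credit or wall-clock work would be a fairer weighting than raw unit counts is left open here.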

Also the claim about the mouse becoming jumpy is pure rubbish. Tasks are run at low priority and will give way to routine user tasks. In the same way, you don't see the mouse go jumpy just from running the CPU at 100% on tasks. The only time I have seen reduced mouse pointer performance is when running heavy tasks on the GPU, but that is not applicable in this case.

Rafael
Message 88409 - Posted: 23 Sep 2015 | 19:29:45 UTC - in response to Message 88408.

But... wasn't it you that showed LLR tasks are currently RAM (and maybe cache?) bottlenecked? If that's the case, I can only imagine running a sieve + LLR to be even worse.

mackerel
Message 88411 - Posted: 23 Sep 2015 | 19:38:39 UTC - in response to Message 88409.

For big tasks, yes, but not for small ones like PPSE and SGS. 4 of those can fit easily on the processor cache of i5/i7 CPUs, although I don't know what impact sieve might have on that.

A quick check shows a random SGS unit is 128k FFT, so that would take about 1MB of cache. Don't have a PPSE on hand to check that at the moment.
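
The 1MB estimate follows if each FFT element is stored as an 8-byte double. That storage format is an assumption in this sketch, and the real working set of LLR is likely somewhat larger because of extra tables and buffers. The 6 MiB cache size in the example is also just a placeholder.

```python
BYTES_PER_DOUBLE = 8  # assumption: one double-precision value per FFT element


def fft_data_bytes(fft_length_k):
    """Approximate size of the FFT data for a length given in K (1K = 1024) elements."""
    return fft_length_k * 1024 * BYTES_PER_DOUBLE


def fits_in_cache(fft_length_k, tasks, cache_mib):
    """True if `tasks` FFTs of this length fit in a shared cache of `cache_mib` MiB."""
    return tasks * fft_data_bytes(fft_length_k) <= cache_mib * 1024 * 1024


print(fft_data_bytes(128) / 2**20)                # size of a 128K FFT in MiB
print(fits_in_cache(128, tasks=4, cache_mib=6))   # four such tasks, placeholder cache size
```

This ignores whatever a sieve task sharing the core would pull into the same cache, which is exactly the open question in the post.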

Michael Goetz
Message 88412 - Posted: 23 Sep 2015 | 19:50:14 UTC - in response to Message 88411.
Last modified: 23 Sep 2015 | 19:52:19 UTC

For big tasks, yes, but not for small ones like PPSE and SGS. 4 of those can fit easily on the processor cache of i5/i7 CPUs, although I don't know what impact sieve might have on that.

A quick check shows a random SGS unit is 128k FFT, so that would take about 1MB of cache. Don't have a PPSE on hand to check that at the moment.


PPSE currently uses FFT sizes of 96K and 128K.

EDIT: Unless it doesn't. The FFT size can vary somewhat depending on the CPU type, so it's possible that a specific task on a specific CPU might use an FFT size other than 96K or 128K.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

mackerel
Message 88416 - Posted: 23 Sep 2015 | 20:34:04 UTC - in response to Message 88412.

Ran some units on a Sandy Bridge, and I'm seeing 96k and 120k units on that.


Giving the matter some more thought, I have a rubbish idea to test this out.

Step 1: get a baseline measurement by separately running a single long duration LLR unit (say, some hours), sieve without HT, sieve with HT, and for each case work out an average results per day. The LLR unit just has to be long enough for several sieve units to run in the same time to get a good average.

Step 2: start a single long duration LLR unit as before, but this time run sieve with HT on all remaining cores. This way, the long duration LLR unit should not be held back by ram bandwidth, plus will have to be paired up with sieve. At the end of the LLR unit again work out the average results per day. Compare against the values in step 1. This obviously has to be done over a short amount of time so that WU lengths don't change significantly. It might be safer to manually run the same LLR test instead of getting live work, but sieve seems more stable so that can be real work.
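
The comparison in steps 1 and 2 comes down to turning unit counts and elapsed times into results per day and then looking at the relative change. A rough sketch, with hypothetical function names and placeholder timings rather than real measurements:

```python
SECONDS_PER_DAY = 86_400


def results_per_day(completed_units, elapsed_seconds):
    """Average number of completed units per day over a measurement window."""
    return completed_units * SECONDS_PER_DAY / elapsed_seconds


def relative_change(baseline_rate, test_rate):
    """Fractional change of test_rate against baseline_rate (negative = slower)."""
    return test_rate / baseline_rate - 1.0


# Placeholder values: one LLR unit taking 6 hours alone and 6h40m when paired with sieve.
llr_alone = results_per_day(1, 6 * 3600)
llr_paired = results_per_day(1, 6 * 3600 + 40 * 60)
print(f"LLR alone:  {llr_alone:.2f} units/day")
print(f"LLR paired: {llr_paired:.2f} units/day")
print(f"change:     {relative_change(llr_alone, llr_paired):+.1%}")
```

The same two functions apply to the sieve runs, using the number of sieve units that finished inside the LLR unit's run time.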

Thinking even more, if I can disable cores to leave just a single one with HT, I can easily pair up the work manually. Taking it further, I should even run the sieve manually to maximise repeatability.

I might try this on the i3 box I'm hoping to build this weekend. This would at least give an indication if I'm onto something here, or just wasting my time.

Rafael
Message 88417 - Posted: 23 Sep 2015 | 20:41:52 UTC - in response to Message 88416.

Shouldn't you just get, say, 3 days of sieve work, then change preferences to up to 10 days of work, but with big tasks? That way you can download all WUs at once, so their lengths will be approximately the same. It can also help you cherry-pick your WUs, just in case you get 5-month-old WUs that are yet to be validated (and thus shorter than current work).

It can also be very helpful for running only as many tasks as you want, as you can manually suspend / resume stuff whenever you please.

stream
Message 88434 - Posted: 24 Sep 2015 | 17:19:49 UTC

Well, you've raised an interesting question. I've thought about this too, but decided not to mess with HT because using it on anything except sieving significantly raises temperatures on my overclocked CPU.

At first glance, it sounds like a great idea. E.g. on Haswell, all AVX operations can go only to execution port number 5, while all other CPU parts remain idle. Running something _integer_ in parallel looks promising. But...

1. In reality, sr2sieve does not use only the standard integer registers. A big part of it is SSE2. So everybody is waiting for port 5 again.
2. A TRP Sieve task requires 150 MEGABYTES of memory. I'm not sure which access pattern it uses, but there is a high chance that the cache will be poisoned and even a small-FFT LLR task will lose performance dramatically.
3. The sieve Linux wrapper is not fixed yet, so you'll end up with a "Text file busy" error.

Results may vary significantly depending on CPU architecture. For example, if AVX/SSE instructions could go to two execution ports, it might change a lot. So it's up to you to test.

Note that you must find the relationship between logical and physical core numbers. In Linux 3.13-3.19, logical cores 0-3 are "main" and 4-7 are the corresponding hyperthreaded "shadows". Other kernel versions may differ. I don't know about Windows. To find the correct order, you can pin two tasks to the logical pair under test and check the temperature of the physical cores.
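
As an alternative to the temperature method, Linux also exposes the pairing in sysfs, in the `thread_siblings_list` file under each CPU's `topology` directory. The sketch below reads those files; it is not something from the post itself, the function names are made up for illustration, and it returns nothing useful on systems without that sysfs tree.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-1,8-9" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(sysfs_root="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that share a physical core."""
    groups = set()
    for path in Path(sysfs_root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(groups)


if __name__ == "__main__":
    # Each printed tuple is one physical core and its logical CPU numbers.
    for group in sibling_groups():
        print(group)
```

Each group lists the logical CPUs to pair one LLR task and one sieve task on.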

Benchmarking is really not so hard. Assume an 8-HT-core system. Get only 4 big tasks (e.g. SoB) and let them run. Pin the affinity of each process to a single main core (in Linux use the "taskset" program, in Windows use Task Manager). Let them run for some time (e.g. 24 hours, to get more exact results). Check their progress after this period, so you can calculate "percent per hour" for clean LLR. (Note that it may be better to wait for 24 hours of _CPU_ time as reported by BOINC, not elapsed time.) Now switch the project config to sieve and let 4 more tasks be downloaded and run. No more action is required; in most cases the CPU scheduler is smart enough to use the "free" cores for these tasks. After another 24 hours, calculate (progress48-progress24)/24 and you'll have "percent per hour" for LLR with sieve. Everything else can be calculated and estimated from these two values.
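
The arithmetic described above, written out as a sketch. The progress readings are placeholders standing in for the percentages BOINC shows at 0, 24 and 48 hours, and the `slowdown` helper is an addition for convenience rather than part of the original recipe.

```python
def percent_per_hour(progress_start, progress_end, hours):
    """Progress rate in percent per hour between two readings."""
    return (progress_end - progress_start) / hours


def slowdown(clean_rate, mixed_rate):
    """Fraction of LLR speed lost when sieve shares the core (0.0 means no loss)."""
    return 1.0 - mixed_rate / clean_rate


# Placeholder readings of one LLR task's progress in percent.
progress0, progress24, progress48 = 0.0, 12.0, 22.8

clean = percent_per_hour(progress0, progress24, 24)    # LLR alone
mixed = percent_per_hour(progress24, progress48, 24)   # LLR with sieve on the sibling threads
print(f"LLR alone: {clean:.3f} %/h, LLR with sieve: {mixed:.3f} %/h, "
      f"slowdown: {slowdown(clean, mixed):.1%}")
```

Averaging the rates of all four LLR tasks should give a steadier figure than a single task.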

Running two BOINCs on a single machine is possible; I've described this on the forum earlier (although you have to start it via the command line, and the whole concept seems to be a bit fragile, e.g. never use the automatic "Merge computers" button in your PG account).

Setting affinity is trivial under Linux using the standard "taskset" utility. Something like "taskset 0x0F boinc ...." for the first copy and "taskset 0xF0 boinc ..." for the second. All tasks run by boinc will automatically inherit this setting and run on the allowed cores only. I think a similar tool should exist under Windows too.
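
The two masks are just bit sets over logical CPU numbers, with bit N standing for logical CPU N. A small sketch of how they are built, plus an optional way to apply the same restriction from Python on Linux via `os.sched_setaffinity`; the helper names are hypothetical, and `pin_current_process` is defined but deliberately not called here.

```python
import os


def cpu_mask(cpus):
    """Bit mask with one bit set per logical CPU number, in the form taskset accepts."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_current_process(cpus):
    """Restrict this process to `cpus`; processes it starts afterwards inherit the setting.

    Returns False where os.sched_setaffinity is not available (e.g. Windows, macOS).
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))
        return True
    return False


print(f"{cpu_mask(range(0, 4)):#04x}")   # logical CPUs 0-3
print(f"{cpu_mask(range(4, 8)):#04x}")   # logical CPUs 4-7
```

To pair one LLR task and one sieve task on a single physical core instead, the mask would cover one sibling pair, for example logical CPUs 0 and 4 under the numbering described above.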

And a bit of good news to end with. Some things which I've already tried, and which really work:

- Running 3 LLR + 1 sieve on a single host. (I think you know that on some hardware combinations 4 big LLR tasks work no better, and sometimes even worse, than 3, so one core stays idle.) Performance of the LLR tasks drops by only a few percent, which is almost unnoticeable.

- Running 4 LLR + 4 sieve on an AMD FX CPU. This CPU is not truly hyperthreaded but shares a single FPU/SSE unit between two cores, so running all 8 LLRs just halves their performance, while running many sieves has almost no speed penalty. I tried 4 Cullen + 4 TRP sieve on this PC and overall performance looked good, but LLR itself is slow on this CPU, and I also had to fight "Text file busy" errors after each reboot (I was using a single copy of BOINC), so after one batch I reverted this system to sieve only.

Crun-chi
Message 88435 - Posted: 24 Sep 2015 | 18:06:53 UTC - in response to Message 88434.

When a single LLR task passes 288K on a CPU with 8MB of L3 cache (so only the i7 series), performance is the same for 3 or 4 tasks in parallel. So I think that is the only place (combination) where you can run a mix of sieve and LLR tasks at the same time.
And yes, since AMD CPUs are different from Intel CPUs, it is well known that there is no performance penalty from running more sieve tasks or more LLR tasks at once.
So you can try running 3 MEGA LLR tasks and a few sieve tasks in parallel. That may turn out to be the sweet spot. But you cannot do anything else on that box. It is an LLR box and does only that, nothing else.

mackerel
Message 88437 - Posted: 24 Sep 2015 | 18:23:18 UTC

I got all the parts on hand now for another build so will put that together, and if there is time remaining tonight I might try some testing. Somehow. Not sure how yet!

mackerel
Message 88444 - Posted: 24 Sep 2015 | 23:21:31 UTC - in response to Message 88434.

Stream, I found your reply thought provoking. I had tried to look up the execution units myself previously, but since I don't know what is used where, I didn't get far. On the charts I saw though, FMA goes to ports 0 and 1 depending on the instruction. So the question there might be what proportion of the AVX instructions are FMA?

I think I have found the option in the bios which lets me reduce the number of cores to 1, so for the test I will then know both logical processors are from the same physical core.

I didn't have time after all to run any testing. The basic build is done, and I've thrown on a non-activated Win7 to play with as I'm more familiar with monitoring tools there than any other OS. Once set up the plan is to switch to Linux.

Anyway, while downloading CPU-Z and HWMonitor, I noticed they also had a 3rd program I never tried called Perfmonitor2. All three have some overlap but with different functions. I like CPU-Z just to know the running state e.g. clocks and ram configuration, as well as check on mobo bios versions for example.

HWMonitor is something I only looked at recently, and there's a particular gem in there: it breaks out the power usage into Package, IA Cores, Uncore and DRAM. The DRAM value is the one I find interesting, as it gives an insight into the degree of RAM activity going on.

Perfmonitor2 has some interesting values reported also. I only ran this for the first time today so I'm not at all familiar with it yet, but two functions I found interesting are cache hit rates and IPC. There's separate cache hit indicators for L2 and L3. IPC is shown per logical processor as well as what appears to be the average. It will require some experimentation to find out exactly what these do. By observation, I find L2 hit rate goes near max when running either PSP sieve units or SGS LLR, but L3 is usually lower. The system is still updating itself from Windows Update so this isn't purely down to work here, but I hope to narrow it down through more observation soon.

I couldn't find my spare heatsinks so I had to use one that was bundled with an i5-4570S, which is about half the height of the ones from i7-2600k for example. Under LLR this was running 65C ball park so I'll have to get a better cooler on order before I leave it running LLR continuously. For now I'll leave it stewing at lower temps on sieve.

stream
Volunteer developer
Volunteer tester
Joined: 1 Mar 14
Posts: 527
ID: 301928
Credit: 443,459,850
RAC: 20,482
Discovered 1 mega primeFound 1 prime in the 2018 Tour de PrimesFound 1 prime in the 2019 Tour de Primes321 LLR Turquoise: Earned 5,000,000 credits (9,919,609)Cullen LLR Turquoise: Earned 5,000,000 credits (9,934,320)ESP LLR Turquoise: Earned 5,000,000 credits (9,909,084)Generalized Cullen/Woodall LLR Turquoise: Earned 5,000,000 credits (5,921,052)PPS LLR Turquoise: Earned 5,000,000 credits (6,962,917)PSP LLR Turquoise: Earned 5,000,000 credits (5,089,560)SoB LLR Turquoise: Earned 5,000,000 credits (5,824,522)SR5 LLR Turquoise: Earned 5,000,000 credits (5,399,087)SGS LLR Turquoise: Earned 5,000,000 credits (5,382,419)TRP LLR Turquoise: Earned 5,000,000 credits (9,911,706)Woodall LLR Turquoise: Earned 5,000,000 credits (5,011,851)321 Sieve Jade: Earned 10,000,000 credits (12,139,151)Generalized Cullen/Woodall Sieve Sapphire: Earned 20,000,000 credits (20,047,667)PPS Sieve Sapphire: Earned 20,000,000 credits (20,866,490)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,043,271)TRP Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,015,177)AP 26/27 Sapphire: Earned 20,000,000 credits (20,045,194)GFN Emerald: Earned 50,000,000 credits (50,737,162)PSA Double Silver: Earned 200,000,000 credits (200,301,443)
Message 88448 - Posted: 25 Sep 2015 | 6:31:10 UTC - in response to Message 88444.

mackerel wrote: "Stream, I found your reply thought provoking. I had tried to look up the execution units myself previously, but since I don't know what is used where, I didn't get far. On the charts I saw though, FMA goes to ports 0 and 1 depending on the instruction. So the question there might be what proportion of the AVX instructions are FMA?"

There are no FMA instructions as such. FMA (in the LLR sense) is ONE instruction which calculates A=B*C+D in one step. It has a lot of variants with different operand signs, but generally they're all the same. An FMA instruction does indeed go to port 0 or 1 (whichever is free), which is why it gives such a nice performance boost: you can execute two complex calculations per clock without waiting for a busy port 5. But you have to do something else before and after the FMA. My reply was a bit simplified; there are other XMM/YMM instructions which use other ports, but a bottleneck on port 5 is quite common. I have written or tried to improve a few math routines myself. When I passed my code through the Intel code analyzer, high port 5 pressure was always the main problem.

Speaking theoretically, hyperthreading can make things worse only when it leads to a dramatically increased number of slow memory accesses (cache misses). When everything works from cache, you'll always have some overall performance gain, from 0% (no free CPU resources) to 100% (the two tasks are using different execution ports at every moment).
I happened to confirm the first case. When I ran 8 small (PPSE or SGS) tasks, the execution time of each task was exactly twice the usual. Double the work in double the time, 0% gain. It also sent the CPU temperature sky-high, somewhere close to 95C. (I've never seen more than 80C without HT.)
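
To make the 0% to 100% range concrete, here is a minimal sketch that compares throughput (tasks finished per unit of time) with and without HT. The function name `ht_gain` is made up for illustration, the times are normalised so that the usual runtime is 1.0, and a 4-core, 8-thread CPU is assumed since the post only mentions running 8 tasks.

```python
def ht_gain(tasks_without, time_without, tasks_with, time_with):
    """Relative throughput gain from HT (0.0 = no gain, 1.0 = doubled)."""
    throughput_without = tasks_without / time_without
    throughput_with = tasks_with / time_with
    return throughput_with / throughput_without - 1.0


# Case described above (assumed 4 cores): 4 tasks at the usual time
# versus 8 tasks that each take exactly twice as long.
print(f"{ht_gain(4, 1.0, 8, 2.0):.0%}")

# Ideal case: 8 tasks that each still finish in the usual time.
print(f"{ht_gain(4, 1.0, 8, 1.0):.0%}")
```

Any real measurement should land somewhere between those two extremes as long as the tasks are not fighting over memory.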

What will happen with sieve, with its high memory footprint, only testing can say. I haven't tried this scenario.

SteveRC
Project donor
Joined: 22 Mar 10
Posts: 146
ID: 57364
Credit: 433,330,409
RAC: 81,482
Discovered 1 mega primeEliminated 2 conjecture "k"s321 LLR Turquoise: Earned 5,000,000 credits (5,229,962)Cullen LLR Turquoise: Earned 5,000,000 credits (5,756,791)ESP LLR Turquoise: Earned 5,000,000 credits (5,917,101)Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (4,688,089)PPS LLR Jade: Earned 10,000,000 credits (16,359,448)PSP LLR Turquoise: Earned 5,000,000 credits (6,289,793)SoB LLR Jade: Earned 10,000,000 credits (12,275,142)SR5 LLR Jade: Earned 10,000,000 credits (10,451,390)SGS LLR Jade: Earned 10,000,000 credits (11,277,440)TRP LLR Turquoise: Earned 5,000,000 credits (9,514,131)Woodall LLR Turquoise: Earned 5,000,000 credits (6,452,373)321 Sieve Silver: Earned 100,000 credits (245,924)Cullen/Woodall Sieve (suspended) Turquoise: Earned 5,000,000 credits (6,474,263)Generalized Cullen/Woodall Sieve Sapphire: Earned 20,000,000 credits (25,971,807)PPS Sieve Double Bronze: Earned 100,000,000 credits (119,451,276)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Jade: Earned 10,000,000 credits (10,557,924)TRP Sieve (suspended) Jade: Earned 10,000,000 credits (10,896,883)AP 26/27 Jade: Earned 10,000,000 credits (13,596,609)GFN Double Bronze: Earned 100,000,000 credits (145,556,181)PSA Turquoise: Earned 5,000,000 credits (6,367,890)
Message 88465 - Posted: 26 Sep 2015 | 10:53:20 UTC

I have done some fiddling with manually pairing LLR and sieve tasks on cores in Windows. This is pretty easy to do without any additional apps. Here's how, assuming a hyperthreaded Core i7 (8 tasks):
1. Download 4 (nice long!) LLR tasks.
2. Right-click the taskbar and select Task Manager.
3. Select [Processes] (Win 7 and earlier) or [Details] (Win 8+).
4. Right-click on each LLR task (the ones showing the sizeable CPU%!) and set [Affinity] to a single core (1, 3, 5, 7) for each task respectively. (I usually also set [Priority] to [Above Normal] unless I need the PC for "real" work!!!)
5. Now you can simply set BOINC to download (short!) sieve tasks and leave it to 'fill' the other available cores in the usual way!

I worked out the proper core 'pairing' with the aid of TThrottle, which clearly shows that the paired cores are:
0-1 2-3 4-5 6-7 in Windows.
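
The pairing and the affinity choice in step 4 can be sketched as bitmasks, which is the form Windows uses internally for processor affinity. This is only an illustration: the helper names `core_pairs` and `affinity_mask` are made up, and the adjacent 0-1, 2-3 numbering is an assumption taken from the observation above, so it may not hold on other systems.

```python
def core_pairs(n_logical):
    """Group logical CPUs into (0, 1), (2, 3), ... sibling pairs.

    Assumes hyperthread siblings are numbered adjacently, as observed
    above on Windows; other systems may number them differently.
    """
    return [(i, i + 1) for i in range(0, n_logical, 2)]


def affinity_mask(cpus):
    """Return a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


pairs = core_pairs(8)
llr_cpus = [second for _, second in pairs]   # 1, 3, 5, 7 as in step 4
sieve_cpus = [first for first, _ in pairs]   # the siblings left over

for cpu in llr_cpus:
    print(f"LLR task   -> CPU {cpu}, mask {affinity_mask([cpu]):#04x}")
print(f"left over  -> CPUs {sieve_cpus}, mask {affinity_mask(sieve_cpus):#04x}")
```

In the manual method the sieve tasks are not pinned at all; the leftover mask is shown only to make clear which logical CPUs remain free once each LLR task holds one half of a pair.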

I would love for someone to do some definitive tests with this pairing strategy to determine whether there's any real overall gain compared to simply running the (un-threaded) LLR and (hyperthreaded) sieve tasks separately. I'm afraid I'm simply too lazy/forgetful to collect the required data!

mackerel
Project donor
Volunteer tester
Message 88468 - Posted: 26 Sep 2015 | 14:28:29 UTC
Last modified: 26 Sep 2015 | 14:29:55 UTC

I have results! This might not be easy to follow...

First I got baseline data for normal unit runtimes on an i3-4150T system. This is 3 GHz, 3 MB cache, 2 cores and 4 threads with hyper threading, with RAM running at 1600. The system has an add-in GPU so there should be no impact from integrated graphics.

I ran SGS on 2 cores, which gave an average unit time of 833 seconds.
I ran ESP sieve on 2 cores, which gave an average unit time of 1453 seconds.
I ran ESP sieve on 4 threads (both cores with HT), which gave an average unit time of 2178 seconds.
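
As a rough illustration, the three baselines can be turned into throughput figures (units per hour), which makes them easier to compare than raw unit times. The dictionary layout and the name `units_per_hour` are just for this sketch; the numbers are the rounded averages quoted above.

```python
# (simultaneous tasks, average seconds per unit) from the baseline runs
BASELINES = {
    "SGS LLR, 2 tasks": (2, 833),
    "ESP sieve, 2 tasks": (2, 1453),
    "ESP sieve, 4 tasks": (4, 2178),
}


def units_per_hour(tasks, seconds_per_unit):
    """Units completed per hour when `tasks` units run side by side."""
    return tasks * 3600 / seconds_per_unit


rates = {name: units_per_hour(*cfg) for name, cfg in BASELINES.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} units/hour")

# How much more sieve work gets done with 4 tasks instead of 2
sieve_ht_gain = rates["ESP sieve, 4 tasks"] / rates["ESP sieve, 2 tasks"] - 1
print(f"Sieve gain from running 4 tasks instead of 2: {sieve_ht_gain:.0%}")
```

On these figures, running sieve on all 4 threads gets about a third more units done per hour than running it on 2.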

The experiment was done the lazy way. I selected only SGS LLR and ESP sieve in the project preferences, and let it get on with it on all 4 threads. The elapsed time from the average time the first 4 units were sent out to when the last units were returned was 47085 seconds, or a bit over 13 hours. During this time I had no influence over what work the computer was sent, or which task ran on which thread, so the matching will not be optimal. The CPU still had all 4 threads active when the last counted units completed, so there won't be any boost from fewer active cores as it winds down.

At the end of the experiment, 84 LLR tasks had been done with an average unit time of 1421 seconds, and 31 sieve tasks with an average time of 2286 seconds. That works out at about 63% of the time spent on LLR.
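
For anyone wanting to check the 63% figure, here is a small sketch using the rounded numbers above. The variable names are made up, and the "average threads" line simply assumes all 4 threads were busy the whole time.

```python
llr_tasks, llr_avg = 84, 1421        # LLR units done, average seconds each
sieve_tasks, sieve_avg = 31, 2286    # sieve units done, average seconds each
threads = 4                          # assumed fully loaded throughout

llr_seconds = llr_tasks * llr_avg
sieve_seconds = sieve_tasks * sieve_avg
llr_share = llr_seconds / (llr_seconds + sieve_seconds)

print(f"LLR share of thread time: {llr_share:.1%}")
print(f"Average threads on LLR:   {llr_share * threads:.2f}")
```

So on average roughly two and a half of the four threads were running LLR at any one time.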

Now how do we work out if that is better or worse than expected?

If we ran nothing but LLR by themselves, that would take 84*833/2 = 35006 seconds.

That leaves 47085-35006 = 12079 seconds for the sieve work. If we ran sieve normally by itself, how many units could we get done in that time? 4 * 12079 / 2178 = 22.2 units. But we actually did 31, so the bonus work is equivalent to 8.8 sieve units! If you prefer that in seconds, 8.8 / 4 * 2178 = 4798 seconds of bonus sieve work, which works out as a 10% bonus.

Note: above values are shown rounded to nearest second so please be aware of rounding differences if you use the numbers above to check my calculation.
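
The whole calculation can be sketched as a short script. The function name `mixed_run_bonus` and its parameters are made up for illustration, and it uses the rounded figures quoted in the post, so the intermediate values differ slightly from the ones above (for example 34986 rather than 35006 seconds for LLR alone).

```python
def mixed_run_bonus(wall_seconds, llr_done, sieve_done,
                    llr_time_2, sieve_time_4, cores=2, threads=4):
    """Estimate the bonus sieve work from a mixed LLR + sieve run.

    llr_time_2:   average LLR unit time with one task per physical core
    sieve_time_4: average sieve unit time with all threads on sieve
    """
    # Time the LLR units alone would need at one task per core
    llr_alone = llr_done * llr_time_2 / cores
    # Whatever is left of the wall-clock time is available for sieve
    remaining = wall_seconds - llr_alone
    # Sieve units expected in that time with all threads on sieve
    sieve_expected = threads * remaining / sieve_time_4
    bonus_units = sieve_done - sieve_expected
    bonus_seconds = bonus_units / threads * sieve_time_4
    return llr_alone, sieve_expected, bonus_units, bonus_seconds


wall = 47085
llr_alone, expected, bonus_units, bonus_seconds = mixed_run_bonus(
    wall_seconds=wall, llr_done=84, sieve_done=31,
    llr_time_2=833, sieve_time_4=2178)

print(f"LLR alone:            {llr_alone:.0f} s")
print(f"Sieve units expected: {expected:.1f}")
print(f"Bonus sieve units:    {bonus_units:.1f}")
print(f"Bonus sieve time:     {bonus_seconds:.0f} s")
print(f"Bonus:                {bonus_seconds / wall:.1%}")
```

Even with the rounded inputs the bonus still comes out at about 10%, so the conclusion does not hinge on the rounding.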

The potential benefit may vary depending on the tasks being run, and other system differences. For example, the i3 has the same dual channel RAM interface as the i5 and mainstream i7, but only half the physical cores. You may run into different limits on different systems, especially if you run the bigger LLR units.

The experiment's average LLR time of 1421 seconds is in between that of running 2 instances (833s) and the predicted doubling from running 4 instances (1667s). The individual unit minimum and maximum were 1229s and 1720s respectively. That maximum is actually worse than double a normal unit average, but I should probably caution that the test system was a normal Win10 install, so there may also be background tasks slowing things down randomly.

Sieve was more consistent. The 4 unit average runtime was 2178s. During the experiment, the average was 2285 so hardly any difference. Also there was little variability with a minimum and maximum of 2203 and 2404 respectively.

I think this is enough to say there is a benefit and that it is not just down to measurement errors, but at 10% it isn't going to change the world. Maybe if there was an easy way to manage the projects and cores to make sure they match up, the benefit would be higher. The way I crunch is to focus on particular tasks at a given time, so mixing and matching like this isn't something I'd choose to do anyway. It might be interesting for those optimising for credits gained in these projects.

Don't ask if this also applies to bigger LLR tasks... that could take a long time and I'm not in the mood to repeat this all over again...

SteveRC
Project donor
Message 88469 - Posted: 26 Sep 2015 | 22:23:28 UTC - in response to Message 88468.

I think those results can be viewed as extremely encouraging, considering the BOINC scheduler is exceptionally bad at 'pairing' tasks, preferring to run them in 'batches' of all one type until they run out!
As can be seen in the experiment, the BOINC scheduler preferred to spend almost 2/3 of the core-time on the LLR tasks, which would imply using 3 cores together for LLR on average, incurring that big hyperthreading penalty we know to be true for LLR tasks. So even a 10% 'bonus' under these circumstances is remarkable, and I would expect a MUCH better result from a properly paired experiment! But clearly that is not so easy to manage manually with short tasks!!

mackerel
Project donor
Volunteer tester
Message 88665 - Posted: 3 Oct 2015 | 11:14:55 UTC

In light of the HT and LLR testing I've been doing in another thread, I should add a comment here. In short, running 1 LLR task per physical core on Windows with HT enabled shows increased runtimes relative to either HT off, or affinity set to ensure the tasks are actually on separate cores. The speed difference measured on this system for SGS is over 10%, which is comparable to the apparent benefit I thought I was seeing from running mixed units.

I can't say for sure without extra testing, but based on the new information, I probably wasn't getting increased performance from HT after all. Instead, loading every thread of the CPU to 100% was recovering that loss.

Copyright © 2005 - 2019 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 0.93, 0.96, 1.06
Generated 19 Jun 2019 | 3:42:44 UTC