PrimeGrid

Message boards : Number crunching : Setting core affinities: Linux, GFN multi-threading

River~~
Joined: 17 Mar 07
Posts: 342
ID: 6533
Credit: 15,792,075
RAC: 0
Message 123949 - Posted: 23 Dec 2018 | 19:15:54 UTC
Last modified: 23 Dec 2018 | 19:19:10 UTC

k4m1k4z3 asked in the recent challenge thread

.... Is there an easy way to limit each task to one physical processor? I am not very skilled at linux so the word easy is important.


I thought my answer was probably too detailed to count as "on-topic" where the question was asked.

Do you have the taskset command installed? (Enter the command on its own and see whether it prints brief usage instructions or an error.) NB: tasksel is a different program, so don't accept it as an alternative if it is offered!

If it is not installed, Ubuntu and some Ubuntu derivatives will tell you which package to install. On Debian / Ubuntu / Linux Mint, try this before you start looking the hard way:

sudo apt-get install util-linux schedtool


Having installed the relevant commands, the next hardest thing is to figure out which apparent cores are doubled up into one physical core. On a four-core machine with HT on, logical cores 0 and 4 are typically the same physical core, as are 1 and 5, 2 and 6, and 3 and 7. I do not know how that works with two sockets. Perhaps someone else can enlighten us? With a multi-socket motherboard, does it depend on the motherboard as well as on the processors?

This Page might help you find out how the numbering works on your system. It refers to a command called lstopo, and you might need to figure out which package includes that, if it is not already installed. Good luck with that!
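
On most Linux systems the kernel also reports the pairing directly under /sys, which may save installing anything. A minimal sketch, assuming the usual sysfs layout; the function name list_siblings and its optional directory argument are made up for illustration.

```shell
# Print "cpuN: <siblings>" for every logical CPU, e.g. "cpu0: 0,4".
# The optional argument (a hypothetical convenience) points the function
# at another directory; the default is the usual sysfs location.
list_siblings() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue      # skip if the glob matched nothing
        cpu="${f#"$base"/}"          # strip the leading directory
        cpu="${cpu%%/*}"             # keep only "cpuN"
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}
```

Calling list_siblings with no argument lists every logical CPU next to the CPUs it shares a physical core with. Two lines showing the same sibling list are the two halves of one physical core.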


In the following I will assume that adding 20 to (or subtracting 20 from) the number of a virtual core brings you back to the same physical core. I am imagining that cores 0-9 are on socket A, 10-19 on socket B, then 20-29 back on socket A and 30-39 on socket B again. But there are other possibilities.

Assuming you have figured out which core numbers share a physical core, the rest is straightforward.

My suggestion is to force each Genefer thread into its own core, and make it the higher number of the pair. So I would want to force the genefer tasks onto logical cores 20 upwards (if numbering was as I am guessing).

When genefer is already running, from the command line, run
top -H


This shows the ID of each thread and its CPU usage, among other things. If app_config said 10 threads, you will likely see 11. This is because the -nt count does not include an "overhead" thread that is always there in practice but uses no CPU. You should see 10 threads each using 99% or more CPU; those are the IDs you want to change. Press q to exit; the info you wanted from top usually stays on screen.
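
The same thread IDs can be collected without copying them off the screen, since ps can list threads non-interactively. A sketch only: busy_threads and its 90% default are made-up choices, and the pattern "genefer" in the usage line is an assumption about how the process appears on your system. Note that ps reports CPU use averaged over the thread's lifetime, not the instantaneous figure top shows.

```shell
# Read "TID %CPU" pairs on stdin and print the TIDs whose %CPU is at or
# above the threshold given as the first argument (default 90).
busy_threads() {
    awk -v min="${1:-90}" '$2 + 0 >= min + 0 { print $1 }'
}

# Intended use (commented out so that pasting the block runs nothing;
# LC_ALL=C keeps the decimal point a "."):
#   LC_ALL=C ps -L -o tid=,pcpu= -p "$(pgrep -of genefer)" | busy_threads 90
```

The overhead thread drops out of the list because its CPU use is close to zero.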

The easiest syntax for the taskset command is

taskset -pc 20 <pid1>
taskset -pc 21 <pid2>
...


where 20, 21, etc. are the core numbers you want each thread to run on, and <pidN> is the corresponding ID shown by top.
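
With ten threads, typing one taskset line per thread gets tedious, so the lines can be generated instead. A minimal sketch: pin_plan is a hypothetical helper that only prints the commands, so they can be read before anything is run. The thread IDs in the example are made up.

```shell
# Print one "taskset -pc CORE TID" line per thread ID, starting at the
# given core and counting upwards. Nothing is executed here.
pin_plan() {
    core="$1"
    shift
    for tid in "$@"; do
        printf 'taskset -pc %s %s\n' "$core" "$tid"
        core=$((core + 1))
    done
}

# Example with made-up thread IDs:
#   pin_plan 20 4711 4713 4714
```

Once the printed lines look right, piping them to a shell applies them, e.g. pin_plan 20 4711 4713 4714 | sudo sh (sudo is likely needed when the task runs under a different user such as boinc).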

For the remainder of that Linux thread's life, it will only run on the specified core. The Linux kernel will notice that the core is busy and will not run anything else alongside it unless it has already filled all the other cores.

Each thread will retain its local context in the local cache of its own core. This affinity setting survives the task being suspended in memory, but will be lost if the task is suspended to disk or restarts from a checkpoint.

BAD NEWS

You need to do this for each new CPU task, after it starts. Without altering the genefer threading code there seems to be no *easy* way to automate the affinity setting. There is no urgency about this: the taskset command gives you a small extra edge, not an enormous one. Do the taskset stuff when you are checking the machine anyway, but (I suggest) don't get up in the night specially to do it.

Bonus info

The above means that each genefer thread stays in its "own" hypercore for the rest of its life, eliminating migrations that destroy the cache.

We can do even better: we can ensure that nothing else runs in that hypercore (it might run in the same physical core of course). To do this we confine the Linux kernel to using only one apparent core out of each physical one. If we never issue a taskset command, this will be just like disabling HT in BIOS, except that we retain the ability to move task threads into the excluded zone.

You need to edit a system file as root. First make a backup copy of it as it is:

sudo cp /etc/default/grub{,-bkp}


then open it in an editor, for example

sudo gedit /etc/default/grub


or use "nano" or "xed" instead of gedit. CLI gurus will want to use vim or emacs.

Find the line that starts with GRUB_CMDLINE_LINUX_DEFAULT; it should already have a string in quotes. Just before the closing quote, insert a space and then the following:

isolcpus=20-39


This "isolates" logical cores 20-39 from the Linux scheduler, but allows us to use them via taskset.

To include this in your boot configuration, run the following from the command line, then reboot so that the new kernel parameter takes effect:

sudo update-grub
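
If hand-editing the file feels risky, the change can be previewed first. A sketch under assumptions: add_isolcpus is a made-up helper that prints the edited file to standard output and never writes to it, and it assumes the GRUB_CMDLINE_LINUX_DEFAULT line is double-quoted and does not already contain an isolcpus entry.

```shell
# Print FILE with " isolcpus=RANGE" added just before the closing quote of
# the GRUB_CMDLINE_LINUX_DEFAULT line. The file itself is not modified.
# Usage: add_isolcpus FILE RANGE
add_isolcpus() {
    sed -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"/\\1 isolcpus=$2\"/" "$1"
}

# Preview the change as a diff (commented out so pasting runs nothing):
#   add_isolcpus /etc/default/grub 20-39 | diff /etc/default/grub -
```

After the reboot, cat /proc/cmdline shows the parameters the kernel was actually started with, so isolcpus=20-39 should appear there if the edit took effect.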



I hope that helps: how easy it is depends on your previous experience, but I have made it as easy as I know how.

Expect at best a 1% improvement -- as has been said, even without this fine-tuning the Linux scheduler gets it right most of the time, and xii5ku says it is not worth his while to do this. In my case I enjoy tweaking so I did it.

It did not affect my challenge standing: I would have been able to submit exactly the same number of WUs without it, as that 1% was not enough to squeeze in an extra qualifying task.

With one exception, the above should not make things worse.

If the performance drops drastically, the most likely cause is that you have misunderstood which apparent cores are paired up, and I can't really give you any help with that.


Hope that is useful to someone
____________
My computers found:

9831*2^1441403+1 is a quadhectokilo prime, ie >400,000 digits ;)

252031090528237591 + 65521*149*23*19*17*13*11*7*5*3*2*n is prime for every n in { 0..20 } (an arithmetic progression of 21 primes)

Eric Nietering
Joined: 30 Mar 09
Posts: 10
ID: 37742
Credit: 709,532,567
RAC: 96,373
Message 123963 - Posted: 24 Dec 2018 | 15:49:37 UTC - in response to Message 123949.

I've had a dual CPU machine for about 2 years, and have had a bit of experience with this sort of thing. I use the taskset and top commands that River~~ mentioned, as well as mpstat, which tells you which logical CPUs are busy. The lstopo program mentioned is also instrumental in determining which CPUs are paired together. In Debian and Ubuntu based systems, it's available in the hwloc package.

My computer is this one. It has two hex-core Intel Xeons, with hyperthreading, for a total of 24 virtual CPUs presented to the operating system.

Using lstopo, I found that the physical CPU cores are formed by the logical core pairs (0,12), (1,13), (2,14), ... (11,23). The first six of these pairs are found on physical CPU 0, and the remaining six pairs on physical CPU 1. So if we want to ignore the extra virtual cores, we can limit things to logical CPUs 0-11. As River~~ mentioned, this may be CPU and/or motherboard dependent (and perhaps Linux version dependent as well), so be sure to check.

On my AMD systems, the cores tend to be in pairs like (0, 1), (2, 3), (4, 5)..., so to fill every physical core, we need to assign threads to only the even or odd numbered logical CPUs, but not both.
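
Since the pairing differs between machines like this, it may be easier to let the system say which logical CPUs to use. A sketch, assuming lscpu from util-linux is available; one_cpu_per_core is a made-up name, and it reads the output of lscpu -p=CPU,CORE,SOCKET on standard input.

```shell
# Read "CPU,CORE,SOCKET" lines (the format printed by
# `lscpu -p=CPU,CORE,SOCKET`) on stdin and print the first logical CPU seen
# for each physical core, as a comma-separated list suitable for taskset -c.
one_cpu_per_core() {
    awk -F, '
        /^#/ { next }                       # skip the header comments
        !seen[$3 "," $2]++ {                # first CPU of this socket/core
            list = list (list == "" ? "" : ",") $1
        }
        END { print list }
    '
}

# Intended use (commented out so that pasting the block runs nothing):
#   lscpu -p=CPU,CORE,SOCKET | one_cpu_per_core
```

On a layout with adjacent pairs such as (0, 1), (2, 3) this picks the even-numbered CPUs. On a layout like (0,12), (1,13) it picks the lower half.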

The mpstat command gives details about the usage of each virtual CPU. It can do long term averaging, and give you details of only particular CPUs, and all sorts of things. An easy way to start, however, is:

mpstat -P ALL 1 1

This will print out (twice, once for the one-second sample and once as the average) the list of all CPUs (including a total at the top), and their usage broken down into the various bins that Linux keeps track of. It is very useful for verifying that your configuration is working properly. On my system, with everything working right, I see CPUs 0-11 loaded at 100%, and 12-23 generally pretty lightly loaded, but often not quite zero, depending on what the OS is doing or what other programs are running.
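
On a machine with many logical CPUs the table is long, so a small filter can reduce it to the busy ones. A sketch: busy_cpus is a made-up helper that relies on %idle being the last column of the "Average:" lines, which is how sysstat's mpstat prints them; the 10% default threshold is arbitrary.

```shell
# Read mpstat output on stdin and print the logical CPUs whose average
# %idle (the last column of the "Average:" lines) is below a threshold.
busy_cpus() {
    awk -v max_idle="${1:-10}" '
        $1 == "Average:" && $2 ~ /^[0-9]+$/ && $NF + 0 < max_idle + 0 { print $2 }
    '
}

# Intended use (LC_ALL=C keeps the decimal point a "."):
#   LC_ALL=C mpstat -P ALL 1 1 | busy_cpus 10
```

If the pinning is working as described, the numbers printed should be exactly the logical CPUs the tasks were assigned to.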

I have written a handful of Perl scripts to attempt to manage this sort of thing for me. They differ only in the number of tasks I want running at once. For instance, for large LLR tasks I want to be running two at 6 threads each. For medium LLR tasks, four at 3 threads apiece is good, and for the smaller ones, 6 at 2 threads each. I also have a setup for twelve at 1 thread each, though that isn't used much anymore. I also have one for 24 tasks at 1 thread each, for sieving challenges.

They all operate in the same basic format. They keep track of which of the slots are occupied, wake up every 30 seconds to see if any new tasks have started, and assign them to the newly opened slots. The timing can be adjusted, of course - you need to balance running frequently (to handle new tasks sooner) against running only occasionally (to not waste as much CPU time).
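
The Perl scripts themselves are not shown in the thread, so the following is not their code. It is a much simplified shell sketch of the slot idea, with made-up names (slot_cpus, slot_plan). It assumes slot k simply owns the next block of consecutive logical CPUs, which only makes sense where those numbers are distinct physical cores, as with CPUs 0-11 on the machine described above.

```shell
# CPU range for task slot SLOT when every task runs THREADS threads:
# slot 0 gets 0..THREADS-1, slot 1 the next THREADS CPUs, and so on.
# Usage: slot_cpus SLOT THREADS
slot_cpus() {
    first=$(( $1 * $2 ))
    printf '%s-%s\n' "$first" $(( first + $2 - 1 ))
}

# Read PIDs on stdin (one per line) and print a "taskset -a -pc RANGE PID"
# command for each: the first PID gets slot 0, the second slot 1, ...
# (-a applies the affinity to every thread of the process.)
# Usage: slot_plan THREADS
slot_plan() {
    threads="$1"
    slot=0
    while read -r pid; do
        printf 'taskset -a -pc %s %s\n' "$(slot_cpus "$slot" "$threads")" "$pid"
        slot=$((slot + 1))
    done
}
```

A polling loop in the spirit of the description would then be something like: while :; do pgrep -f llr | slot_plan 6 | sh; sleep 30; done, where the pattern llr is only a guess at the process name. Unlike the scripts described, this sketch does not remember which slots are occupied, so a running task can be moved to a different slot when an earlier one finishes.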

There's a difference I noticed due to this challenge, between the CPU GFN21 program and the LLR program that is used elsewhere. As River~~ mentioned, the GFN21 application has an "overhead" thread. This thread is the second thread created, I've found. When running, say, six computation threads, there will be seven threads under the parent process. The first, and third through seventh threads are the computation threads, and the second thread is the overhead thread. This is different to how LLR works, where there's a separate process (the executable has "wrapper" in the name) that handles the overhead, and only the compute threads are handled by the second (non-wrapper) process.

As River~~ mentioned, if the Linux scheduler is working properly, this doesn't add up to much benefit. However, if by default the scheduler is doing some not so smart stuff, the gains can be much greater.

composite
Volunteer tester
Joined: 16 Feb 10
Posts: 700
ID: 55391
Credit: 540,253,014
RAC: 295,098
Message 123970 - Posted: 25 Dec 2018 | 2:51:49 UTC - in response to Message 123963.

They all operate in the same basic format. They keep track of which of the slots are occupied, wake up every 30 seconds to see if any new tasks have started, and assign them to the newly opened slots. The timing can be adjusted, of course - you need to balance running frequently (to handle new tasks sooner) against running only occasionally (to not waste as much CPU time).

You will want to check out the inotify filesystem event interface.
man inotify

There are lots of ways to access this, for example using the inotify-tools package in Mint/Ubuntu/Debian.

In your case, you might prefer to use a Perl module like Filesys::Notify::Simple, one of a few options you will uncover with the search terms notify perl in Synaptic.

Eric Nietering
Message 124062 - Posted: 31 Dec 2018 | 19:00:40 UTC - in response to Message 123970.


Thanks! I'll have to check that out. I've used something that appears to be similar at work, but in the .NET land from Microsoft.

If/when I've gotten something working, I'll post back with some details.

Copyright © 2005 - 2019 Rytis Slatkevičius and PrimeGrid community.
Generated 21 Jul 2019 | 7:44:15 UTC