API - Pinning
Pinning
ThreadPinning.pinthreads - Function
pinthreads(cpuids;
    nthreads = nothing,
    force = true,
    warn = is_first_pin_attempt(),
    threadpool = :default
)
Pin Julia threads to an explicit or implicit list of CPU IDs. The list can be specified in three ways:
- by passing one of several predefined symbols (e.g. pinthreads(:cores) or pinthreads(:sockets)),
- by providing a logical specification via helper functions (e.g. pinthreads(numa(2, 1:4))),
- explicitly (e.g. 0:3 or [0,12,4]).
See ??pinthreads for more information on these variants and keyword arguments.
Keyword arguments
If set, the keyword argument nthreads serves as a cutoff, that is, the first min(length(cpuids), nthreads) Julia threads will get pinned.
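To make the cutoff concrete, here is a minimal sketch in plain Julia that computes which threads would be pinned under the rule above. It does not call ThreadPinning. The helper name pinning_plan is hypothetical, and the sketch assumes that the i-th Julia thread is assigned the i-th entry of cpuids.

```julia
# Hypothetical helper (not part of ThreadPinning): models the `nthreads` cutoff.
# Assumption: the i-th Julia thread is assigned the i-th entry of `cpuids`.
function pinning_plan(cpuids::AbstractVector{<:Integer}, nthreads::Integer)
    npin = min(length(cpuids), nthreads)   # number of threads that get pinned
    return [tid => cpuids[tid] for tid in 1:npin]
end

# Four CPU IDs are given, but only the first two threads are considered.
plan = pinning_plan([0, 12, 4, 7], 2)
println(plan)
```

If nthreads exceeds length(cpuids), the length of the list is the limiting factor instead.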
The keyword argument threadpool can be used to indicate the pool of Julia threads that should be considered. Supported values are :default (default), :interactive, or :all. On Julia >= 1.11, there is also experimental support for :gc.
If force=false, threads will only get pinned if this is the very first pin attempt (otherwise the call is a no-op). This is particularly useful for packages that merely want to specify a "default pinning" that the user can override.
The option warn toggles general warnings, such as unwanted interference with BLAS thread settings.
Extended help
1) Predefined Symbols
- :cputhreads or :compact: successively pin to all available CPU-threads.
- :cores: spread threads across all available cores, only use hyperthreads if necessary.
- :sockets: spread threads across sockets (round-robin), only use hyperthreads if necessary. Set compact=true to get compact pinning within each socket.
- :numa: spread threads across NUMA/memory domains (round-robin), only use hyperthreads if necessary. Set compact=true to get compact pinning within each NUMA/memory domain.
- :random: pin threads randomly to CPU-threads.
- :current: pin threads to the CPU-threads they are currently running on.
- :firstn: pin threads to CPU-threads in order according to their OS index.
- :affinitymask: pin threads to different CPU-threads in accordance with the process affinity. By default, hyperthreads_last=true.
2) Logical Specification
The functions node, socket, numa, and core can be used to specify CPU IDs of/within a certain domain. Moreover, the functions sockets and numas can be used to express a round-robin scatter policy between sockets or NUMA domains, respectively.
Examples (domains):
pinthreads(socket(1, 1:3)) # pin to the first 3 cores in the first socket
pinthreads(socket(1, 1:3; compact=true)) # pin to the first 3 CPU-threads in the first socket
pinthreads(numa(2, [2,4,6])) # pin to the second, the fourth, and the sixth cores in the second NUMA/memory domain
pinthreads(node(ncores():-1:1)) # pin threads to cores in reverse order (starting at the end of the node)
pinthreads(sockets()) # scatter threads between sockets, cores before hyperthreads
Different domains can be concatenated by providing them in a vector or as separate arguments to pinthreads.
Examples (concatenation):
pinthreads([socket(1, 1:3), numa(2, 4:6)])
pinthreads(socket(1, 1:3), numa(2, 4:6))
3) Explicit
Simply provide an AbstractVector{<:Integer} of CPU IDs. These IDs are expected to be "physical" OS indices (e.g. from hwloc or lscpu) that start at zero!
ThreadPinning.pinthread - Function
pinthread(cpuid::Integer; threadid = Threads.threadid())
Pin a Julia thread to the given CPU-thread.
ThreadPinning.with_pinthreads - Function
with_pinthreads(f::F, args...;
    soft = false,
    kwargs...
)
Runs the function f with the specified pinning and restores the previous thread affinities afterwards. Typically used in combination with do-syntax.
By default (soft=false), before the thread affinities are restored, the Julia threads will be pinned to the CPU-threads they were running on previously.
Example
julia> getcpuids()
4-element Vector{Int64}:
7
75
63
4
julia> with_pinthreads(:cores) do
           getcpuids()
       end
4-element Vector{Int64}:
0
1
2
3
julia> getcpuids()
4-element Vector{Int64}:
7
75
63
4
ThreadPinning.unpinthreads - Function
unpinthreads(; threadpool::Symbol = :default)
Unpins all Julia threads by setting the affinity mask of every thread to all ones. Afterwards, the OS is free to move any Julia thread from one CPU-thread to another.
ThreadPinning.unpinthread - Function
unpinthread(; threadid::Integer = Threads.threadid())
Unpins the given Julia thread by setting its affinity mask to all ones. Afterwards, the OS is free to move the Julia thread from one CPU-thread to another.
ThreadPinning.setaffinity - Function
setaffinity(mask; threadid = Threads.threadid())
Set the affinity of a Julia thread based on the given mask (a vector of ones and zeros).
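As an illustration of what such a mask looks like, the sketch below builds one from zero-based CPU IDs in plain Julia. The helper cpuids_to_mask and the total of 8 CPU-threads are assumptions made for this example. The exact mask length and element type that setaffinity expects are not stated here, so treat this as a sketch rather than a drop-in input.

```julia
# Hypothetical helper (not part of ThreadPinning): build a vector of ones and
# zeros from zero-based CPU IDs. `ncputhreads` is the assumed number of
# CPU-threads on the machine.
function cpuids_to_mask(cpuids, ncputhreads::Integer)
    mask = zeros(Int, ncputhreads)
    for id in cpuids
        0 <= id < ncputhreads || throw(ArgumentError("CPU ID $id out of range"))
        mask[id + 1] = 1   # CPU IDs start at zero, Julia indices at one
    end
    return mask
end

# Allow the CPU-threads with the IDs 1, 3, and 5 on an assumed 8-thread machine.
mask = cpuids_to_mask([1, 3, 5], 8)
println(mask)
```

A mask consisting only of ones corresponds to the unpinned state described for unpinthread.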
ThreadPinning.setaffinity_cpuids - Function
Set the affinity of a Julia thread to the given CPU-threads.
Examples:
setaffinity_cpuids(socket(1)) # set the affinity to the first socket
setaffinity_cpuids(numa(2)) # set the affinity to the second NUMA domain
setaffinity_cpuids(socket(1, 1:3)) # set the affinity to the first three cores in the first socket
setaffinity_cpuids([1,3,5]) # set the affinity to the CPU-threads with the IDs 1, 3, and 5.
Pinning - OpenBLAS
ThreadPinning.openblas_pinthreads - Function
openblas_pinthreads(cpuids; nthreads = BLAS.get_num_threads())
Pin the OpenBLAS threads to the given CPU IDs. The optional keyword argument nthreads serves as a cutoff.
ThreadPinning.openblas_pinthread - Function
openblas_pinthread(cpuid; threadid)
Pin the OpenBLAS thread with the given threadid to the given CPU-thread (cpuid).
ThreadPinning.openblas_unpinthreads - Function
openblas_unpinthreads(; threadpool = :default)
Unpins all OpenBLAS threads by setting their affinity masks to all ones. Afterwards, the OS is free to move any OpenBLAS thread from one CPU-thread to another.
ThreadPinning.openblas_unpinthread - Function
openblas_unpinthread(; threadid)
Unpins the OpenBLAS thread with the given threadid by setting its affinity mask to all ones. Afterwards, the OS is free to move the OpenBLAS thread from one CPU-thread to another.
ThreadPinning.openblas_setaffinity - Function
openblas_setaffinity(mask; threadid)
Set the affinity of the OpenBLAS thread with the given threadid to the given mask.
The input mask should be one of the following:
- a BitArray to indicate the mask directly
- a vector of cpuids (in which case the mask will be constructed automatically)
ThreadPinning.openblas_setaffinity_cpuids - Function
Set the affinity of the OpenBLAS thread to the given CPU-threads.
Examples:
openblas_setaffinity_cpuids(socket(1)) # set the affinity to the first socket
openblas_setaffinity_cpuids(numa(2)) # set the affinity to the second NUMA domain
openblas_setaffinity_cpuids(socket(1, 1:3)) # set the affinity to the first three cores in the first socket
openblas_setaffinity_cpuids([1,3,5]) # set the affinity to the CPU-threads with the IDs 1, 3, and 5.
Pinning - MPI
ThreadPinning.mpi_pinthreads - Function
mpi_pinthreads(symbol; compact, kwargs...)
Pin the Julia threads of MPI ranks in a round-robin fashion to specific domains (e.g. sockets). Supported domains (symbol) are :sockets, :numa, and :cores.
When calling this function on all MPI ranks, the Julia threads of the latter will be distributed in a round-robin fashion among the specified domains and will be pinned to non-overlapping ranges of CPU-threads within the domains.
A multi-node setup, where MPI ranks are hosted on different nodes, is supported.
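The round-robin policy with non-overlapping ranges can be pictured with the following plain-Julia model. It is only a sketch of the policy described above, not ThreadPinning's implementation. The helper roundrobin_plan, the zero-based rank numbering, the fixed number of threads per rank, and the two 8-thread sockets are all assumptions made for the example.

```julia
# Hypothetical model (not ThreadPinning's implementation).
# Assumptions: ranks are numbered from zero, every rank runs `nthreads` Julia
# threads, and each domain is given as a vector of CPU IDs.
function roundrobin_plan(domains::AbstractVector, nranks::Integer, nthreads::Integer)
    plan = Dict{Int, Vector{Int}}()
    ndomains = length(domains)
    for rank in 0:(nranks - 1)
        d = rank % ndomains + 1    # domain index, assigned round-robin
        slot = rank ÷ ndomains     # number of earlier ranks in the same domain
        idx = (slot * nthreads + 1):((slot + 1) * nthreads)
        last(idx) <= length(domains[d]) || error("domain $d has too few CPU-threads")
        plan[rank] = domains[d][idx]   # non-overlapping slice of the domain
    end
    return plan
end

# Two hypothetical sockets with 8 CPU-threads each; 4 ranks with 2 threads each.
example_sockets = [collect(0:7), collect(8:15)]
plan = roundrobin_plan(example_sockets, 4, 2)
for rank in sort(collect(keys(plan)))
    println("rank ", rank, " => ", plan[rank])
end
```

In this model, ranks 0 and 2 share the first socket and ranks 1 and 3 share the second, each with its own slice of CPU IDs.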
If compact=false (default), physical cores are occupied before hyperthreads. Otherwise, CPU-cores - with potentially multiple CPU-threads - are filled up one after another (compact pinning).
Example:
using ThreadPinning
using MPI
MPI.Init()
mpi_pinthreads(:sockets)
Pinning - Distributed.jl
ThreadPinning.distributed_pinthreads - Function
distributed_pinthreads(symbol;
    include_master = false,
    compact = false,
    nthreads_per_proc = Threads.nthreads(),
    kwargs...)
Pin the Julia threads of Julia workers in a round-robin fashion to specific domains (e.g. sockets). Supported domains (symbol) are :sockets, :numa, and :cores.
When calling this function, the Julia threads of all Julia workers will be distributed in a round-robin fashion among the specified domains and will be pinned to non-overlapping ranges of CPU-threads within the domains.
A multi-node setup, where Julia workers are hosted on different nodes, is supported.
If include_master=true, the master process (Distributed.myid() == 1) will be pinned as well.
If compact=false (default), physical cores are occupied before hyperthreads. Otherwise, CPU-cores - with potentially multiple CPU-threads - are filled up one after another (compact pinning).
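The difference between the two orderings can be shown on a small, made-up topology. The sketch below is plain Julia and does not query real hardware. The four cores with two CPU-threads each, the CPU ID numbering, and the helper names order_cores_first and order_compact are assumptions for illustration.

```julia
# Hypothetical topology: each inner vector lists the CPU-threads (OS IDs) of
# one core, with the hyperthread as the second entry.
example_cores = [[0, 4], [1, 5], [2, 6], [3, 7]]

# compact=false: take the first CPU-thread of every core, then the hyperthreads.
function order_cores_first(cores)
    nsmt = maximum(length, cores)
    return [core[i] for i in 1:nsmt for core in cores if i <= length(core)]
end

# compact=true: fill up one core after another.
order_compact(cores) = reduce(vcat, cores)

println(order_cores_first(example_cores))
println(order_compact(example_cores))
```

With four threads to place, the first ordering uses four distinct cores, whereas the compact ordering uses only two cores and both of their CPU-threads.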
Example:
using Distributed
addprocs(3)
@everywhere using ThreadPinning
distributed_pinthreads(:sockets)
ThreadPinning.distributed_unpinthreads - Function
Unpin all threads on all Julia workers.
If include_master=true, the master process (Distributed.myid() == 1) will be unpinned as well.
Pinning - LIKWID
Besides pinthreads, we offer pinthreads_likwidpin which, ideally, should handle all inputs that are supported by the -c option of likwid-pin (e.g. S0:1-3@S1:2,4,5 or E:N:4:2:4). If you encounter an input that doesn't work as expected, please file an issue.
ThreadPinning.pinthreads_likwidpin - Function
pinthreads_likwidpin(str::AbstractString; onebased = false)
Pins Julia threads to CPU-threads based on the given likwid-pin compatible string. Check out the LIKWID documentation for more information.
If the keyword argument onebased is set to true, logical indices as well as domain indices start at one instead of zero (zero-based indexing is the likwid-pin default). Note, though, that this doesn't affect the explicit pinning mode, where "physical" CPU IDs always start at zero.
Examples
pinthreads_likwidpin("S0:0-3")
pinthreads_likwidpin("M1:0,2,4")
pinthreads_likwidpin("S:scatter")
pinthreads_likwidpin("E:N:4:1:2")
ThreadPinning.likwidpin_to_cpuids - Function
likwidpin_to_cpuids(lpstr::AbstractString; onebased = false)
Convert the given likwid-pin compatible string into a CPU ID list. See pinthreads_likwidpin for more information.
ThreadPinning.likwidpin_domains - Function
likwidpin_domains(; onebased = false)
The likwid-pin compatible domains that are available for the system.