API - Pinning

Pinning

ThreadPinning.pinthreads - Function
pinthreads(cpuids;
    nthreads   = nothing,
    force      = true,
    warn       = is_first_pin_attempt(),
    threadpool = :default
)

Pin Julia threads to an explicit or implicit list of CPU IDs. The latter can be specified in three ways:

  1. by passing one of several predefined symbols (e.g. pinthreads(:cores) or pinthreads(:sockets)),
  2. by providing a logical specification via helper functions (e.g. pinthreads(numa(2, 1:4))),
  3. explicitly (e.g. 0:3 or [0,12,4]).

See ??pinthreads for more information on these variants and keyword arguments.

Keyword arguments

If set, the keyword argument nthreads serves as a cutoff, that is, the first min(length(cpuids), nthreads) Julia threads will get pinned.
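
The cutoff rule can be sketched in plain Julia. This is only a toy model of the pairing between Julia threads and CPU IDs, not ThreadPinning's implementation; `pin_plan` is a hypothetical helper introduced for illustration, and it does no actual pinning.

```julia
# Toy model of the `nthreads` cutoff (assumption: Julia thread i is paired
# with cpuids[i]). `pin_plan` is hypothetical and only computes the pairing.
function pin_plan(cpuids::AbstractVector{<:Integer}, nthreads::Integer)
    n = min(length(cpuids), nthreads)   # number of threads that get pinned
    return [tid => cpuids[tid] for tid in 1:n]
end

plan_a = pin_plan(0:7, 4)       # more CPU IDs than threads: 4 pairs
plan_b = pin_plan([0, 12], 4)   # fewer CPU IDs than threads: 2 pairs
println(plan_a)
println(plan_b)
```

In both calls the shorter of the two lengths decides how many threads are covered.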

The keyword argument threadpool can be used to indicate the pool of Julia threads that should be considered. Supported values are :default (default), :interactive, or :all. On Julia >= 1.11, there is also experimental support for :gc.

If force=false, threads will only get pinned if this is the very first pin attempt (otherwise the call is a no-op). This may be particularly useful for packages that merely want to specify a "default pinning" that can be overridden by the user.

The option warn toggles general warnings, such as unwanted interference with BLAS thread settings.

Extended help

1) Predefined Symbols

  • :cputhreads or :compact: successively pin to all available CPU-threads.
  • :cores: spread threads across all available cores, only use hyperthreads if necessary.
  • :sockets: spread threads across sockets (round-robin), only use hyperthreads if necessary. Set compact=true to get compact pinning within each socket.
  • :numa: spread threads across NUMA/memory domains (round-robin), only use hyperthreads if necessary. Set compact=true to get compact pinning within each NUMA/memory domain.
  • :random: pin threads randomly to CPU-threads
  • :current: pin threads to the CPU-threads they are currently running on
  • :firstn: pin threads to CPU-threads in order according to their OS index.
  • :affinitymask: pin threads to different CPU-threads in accordance with the process affinity. By default, hyperthreads_last=true.
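
To make the difference between compact pinning and the "hyperthreads only if necessary" policy concrete, here is a small sketch in plain Julia. The topology (4 cores with 2 CPU-threads each, and the ID layout) is an assumption chosen for illustration; the code models the resulting orderings and is not how the library computes them.

```julia
# Hypothetical topology: 4 cores, each with 2 CPU-threads (SMT siblings).
# cores[i] lists the OS IDs of the CPU-threads that belong to core i.
cores = [[0, 4], [1, 5], [2, 6], [3, 7]]

# Compact style: fill one core completely before moving to the next.
compact_order = reduce(vcat, cores)

# :cores style: first CPU-thread of every core, hyperthreads only afterwards.
cores_order = [core[k] for k in 1:2 for core in cores]

println(compact_order)
println(cores_order)
```

With four Julia threads, the compact ordering would occupy only two cores of this toy machine, whereas the second ordering would occupy all four.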

2) Logical Specification

The functions node, socket, numa, and core can be used to specify CPU IDs of/within a certain domain. Moreover, the functions sockets and numas can be used to express a round-robin scatter policy between sockets or NUMA domains, respectively.

Examples (domains):

  • pinthreads(socket(1, 1:3)) # pin to the first 3 cores in the first socket
  • pinthreads(socket(1, 1:3; compact=true)) # pin to the first 3 CPU-threads in the first socket
  • pinthreads(numa(2, [2,4,6])) # pin to the second, the fourth, and the sixth cores in the second NUMA/memory domain
  • pinthreads(node(ncores():-1:1)) # pin threads to cores in reverse order (starting at the end of the node)
  • pinthreads(sockets()) # scatter threads between sockets, cores before hyperthreads
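
The two ideas used above, selecting logical indices within one domain and scattering round-robin between domains, can be modelled in a few lines of plain Julia. The machine layout and the helper `scatter` are assumptions made for this sketch; the real functions query the system topology instead.

```julia
# Hypothetical machine: 2 sockets with 4 cores each (one CPU ID per core).
socket_cores = [[0, 1, 2, 3], [4, 5, 6, 7]]

# Domain selection in the spirit of socket(1, 1:3):
# first socket, logical cores 1 to 3 (one-based logical indices).
first_three = socket_cores[1][1:3]

# Round-robin scatter: take the k-th core of every domain before the (k+1)-th.
function scatter(domains::Vector{Vector{Int}})
    order = Int[]
    for k in 1:maximum(length, domains), d in domains
        k <= length(d) && push!(order, d[k])
    end
    return order
end

scattered = scatter(socket_cores)
println(first_three)
println(scattered)
```

Note that the logical indices are one-based while the resulting CPU IDs in this sketch are zero-based OS indices.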

Different domains can be concatenated by providing them in a vector or as separate arguments to pinthreads.

Examples (concatenation):

  • pinthreads([socket(1, 1:3), numa(2, 4:6)])
  • pinthreads(socket(1, 1:3), numa(2, 4:6))

3) Explicit

Simply provide an AbstractVector{<:Integer} of CPU IDs. The latter are expected to be "physical" OS indices (e.g. from hwloc or lscpu) that start at zero!

ThreadPinning.pinthread - Function
pinthread(cpuid::Integer; threadid = Threads.threadid())

Pin a Julia thread to the given CPU-thread.

ThreadPinning.with_pinthreads - Function
with_pinthreads(f::F, args...;
    soft = false,
    kwargs...
)

Runs the function f with the specified pinning and restores the previous thread affinities afterwards. Typically to be used in combination with do-syntax.

By default (soft=false), before the thread affinities are restored, the Julia threads will be pinned to the CPU-threads they were running on previously.

Example

julia> getcpuids()
4-element Vector{Int64}:
  7
 75
 63
  4

julia> with_pinthreads(:cores) do
           getcpuids()
       end
4-element Vector{Int64}:
 0
 1
 2
 3

julia> getcpuids()
4-element Vector{Int64}:
  7
 75
 63
  4
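
The save, apply, and restore behaviour shown above follows a common pattern that can be sketched without touching real thread affinities. Everything below is a stand-in: `current_cpuids` replaces the OS-level state and `with_temporary_cpuids` is a hypothetical helper, so this illustrates the pattern only and may differ from what the library does internally.

```julia
# Toy stand-in for the per-thread pinning state; a real implementation
# would query and set OS affinity masks instead.
const current_cpuids = Ref([7, 75, 63, 4])

# Hypothetical helper mirroring the save / apply / restore pattern.
function with_temporary_cpuids(f, new_cpuids)
    saved = current_cpuids[]         # remember the previous state
    current_cpuids[] = new_cpuids    # apply the temporary pinning
    try
        return f()
    finally
        current_cpuids[] = saved     # restore, even if f throws (sketch's choice)
    end
end

inside = with_temporary_cpuids([0, 1, 2, 3]) do
    current_cpuids[]
end
println(inside)
println(current_cpuids[])
```

Taking the function as the first argument is what makes the do-syntax possible.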

ThreadPinning.unpinthreads - Function
unpinthreads(; threadpool::Symbol = :default)

Unpins all Julia threads by setting the affinity mask of each thread to all ones. Afterwards, the OS is free to move any Julia thread from one CPU-thread to another.

ThreadPinning.unpinthread - Function
unpinthread(; threadid::Integer = Threads.threadid())

Unpins the given Julia thread by setting its affinity mask to all ones. Afterwards, the OS is free to move the Julia thread from one CPU-thread to another.

ThreadPinning.setaffinity - Function
setaffinity(mask; threadid = Threads.threadid())

Set the affinity of a Julia thread based on the given mask (a vector of ones and zeros).
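
As a rough illustration of what such a mask looks like, the sketch below builds one from a list of zero-based CPU IDs. The helper `cpuids_to_mask`, the element type `Int8`, and the convention that mask position i corresponds to CPU ID i-1 are assumptions made for this example, not statements about the library's internals.

```julia
# Hypothetical helper: build a 0/1 mask from zero-based CPU IDs.
function cpuids_to_mask(cpuids, ncputhreads::Integer)
    mask = zeros(Int8, ncputhreads)
    for id in cpuids
        0 <= id < ncputhreads || throw(ArgumentError("CPU ID $id out of range"))
        mask[id + 1] = 1   # CPU IDs start at zero, Julia arrays at one
    end
    return mask
end

mask = cpuids_to_mask([1, 3, 5], 8)   # allow CPU-threads 1, 3, and 5 only
unpinned = ones(Int8, 8)              # all ones: every CPU-thread is allowed
println(mask)
println(unpinned)
```

A mask with a single one pins a thread to exactly one CPU-thread, while a mask of all ones corresponds to an unpinned thread.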

ThreadPinning.setaffinity_cpuids - Function

Set the affinity of a Julia thread to the given CPU-threads.

Examples:

  • setaffinity_cpuids(socket(1)) # set the affinity to the first socket
  • setaffinity_cpuids(numa(2)) # set the affinity to the second NUMA domain
  • setaffinity_cpuids(socket(1, 1:3)) # set the affinity to the first three cores in the first socket
  • setaffinity_cpuids([1,3,5]) # set the affinity to the CPU-threads with the IDs 1, 3, and 5.

Pinning - OpenBLAS

ThreadPinning.openblas_pinthreads - Function
openblas_pinthreads(cpuids; nthreads = BLAS.get_num_threads())

Pin the OpenBLAS threads to the given CPU IDs. The optional keyword argument nthreads serves as a cutoff.

ThreadPinning.openblas_unpinthreads - Function
openblas_unpinthreads(; threadpool = :default)

Unpins all OpenBLAS threads by setting their affinity masks to all ones. Afterwards, the OS is free to move any OpenBLAS thread from one CPU-thread to another.

ThreadPinning.openblas_unpinthread - Function
openblas_unpinthread(; threadid)

Unpins the OpenBLAS thread with the given threadid by setting its affinity mask to all ones. Afterwards, the OS is free to move the OpenBLAS thread from one CPU-thread to another.

ThreadPinning.openblas_setaffinity - Function
openblas_setaffinity(mask; threadid)

Set the affinity of the OpenBLAS thread with the given threadid to the given mask.

The input mask should be one of the following:

  • a BitArray to indicate the mask directly
  • a vector of cpuids (in which case the mask will be constructed automatically)

ThreadPinning.openblas_setaffinity_cpuids - Function

Set the affinity of the OpenBLAS thread to the given CPU-threads.

Examples:

  • openblas_setaffinity_cpuids(socket(1)) # set the affinity to the first socket
  • openblas_setaffinity_cpuids(numa(2)) # set the affinity to the second NUMA domain
  • openblas_setaffinity_cpuids(socket(1, 1:3)) # set the affinity to the first three cores in the first socket
  • openblas_setaffinity_cpuids([1,3,5]) # set the affinity to the CPU-threads with the IDs 1, 3, and 5.

Pinning - MPI

ThreadPinning.mpi_pinthreads - Function
mpi_pinthreads(symbol; compact, kwargs...)

Pin the Julia threads of MPI ranks in a round-robin fashion to specific domains (e.g. sockets). Supported domains (symbol) are :sockets, :numa, and :cores.

When calling this function on all MPI ranks, the Julia threads of the latter will be distributed in a round-robin fashion among the specified domains and will be pinned to non-overlapping ranges of CPU-threads within the domains.
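
One way to picture "round-robin among domains, non-overlapping ranges within a domain" is the toy model below. The node layout, the helper `rank_cpuids`, and the exact assignment rule are assumptions made for illustration; the library's actual distribution logic may differ in its details.

```julia
# Hypothetical node: 2 sockets with 4 CPU IDs each.
domains = [[0, 1, 2, 3], [4, 5, 6, 7]]

# Toy rule: rank r (zero-based, as in MPI) goes to domain r mod ndomains and
# takes the next free block of `nthreads` CPU IDs inside that domain.
function rank_cpuids(rank::Integer, nthreads::Integer, domains)
    nd = length(domains)
    d = domains[rank % nd + 1]
    slot = div(rank, nd)            # number of earlier ranks in this domain
    lo = slot * nthreads + 1
    hi = lo + nthreads - 1
    hi <= length(d) || throw(ArgumentError("domain is oversubscribed"))
    return d[lo:hi]
end

# Four ranks with two Julia threads each.
assignment = [rank_cpuids(r, 2, domains) for r in 0:3]
println(assignment)
```

In this model, ranks 0 and 2 share the first socket and ranks 1 and 3 share the second, and no CPU ID is handed out twice.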

A multi-node setup, where MPI ranks are hosted on different nodes, is supported.

If compact=false (default), physical cores are occupied before hyperthreads. Otherwise, CPU-cores - with potentially multiple CPU-threads - are filled up one after another (compact pinning).

Example:

using ThreadPinning
using MPI
MPI.Init()
mpi_pinthreads(:sockets)

Pinning - Distributed.jl

ThreadPinning.distributed_pinthreads - Function
distributed_pinthreads(symbol;
    include_master = false,
    compact = false,
    nthreads_per_proc = Threads.nthreads(),
    kwargs...)

Pin the Julia threads of Julia workers in a round-robin fashion to specific domains (e.g. sockets). Supported domains (symbol) are :sockets, :numa, and :cores.

When calling this function, the Julia threads of all Julia workers will be distributed in a round-robin fashion among the specified domains and will be pinned to non-overlapping ranges of CPU-threads within the domains.

A multi-node setup, where Julia workers are hosted on different nodes, is supported.

If include_master=true, the master process (Distributed.myid() == 1) will be pinned as well.

If compact=false (default), physical cores are occupied before hyperthreads. Otherwise, CPU-cores - with potentially multiple CPU-threads - are filled up one after another (compact pinning).

Example:

using Distributed
addprocs(3)
@everywhere using ThreadPinning
distributed_pinthreads(:sockets)

Pinning - LIKWID

Besides pinthreads, we offer pinthreads_likwidpin which, ideally, should handle all inputs that are supported by the -c option of likwid-pin (e.g. S0:1-3@S1:2,4,5 or E:N:4:2:4). If you encounter an input that doesn't work as expected, please file an issue.

ThreadPinning.pinthreads_likwidpin - Function
pinthreads_likwidpin(str::AbstractString; onebased = false)

Pins Julia threads to CPU-threads based on the given likwid-pin compatible string. Check out the LIKWID documentation for more information.

If the keyword argument onebased is set to true, logical indices as well as domain indices start at one instead of zero (zero-based indexing is the likwid-pin default). Note, though, that this doesn't affect the explicit pinning mode, where "physical" CPU IDs always start at zero.

Examples

  • pinthreads_likwidpin("S0:0-3")
  • pinthreads_likwidpin("M1:0,2,4")
  • pinthreads_likwidpin("S:scatter")
  • pinthreads_likwidpin("E:N:4:1:2")
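
To illustrate how the index part of a string such as "S0:0-3" or "M1:0,2,4" can be read, and what the onebased option changes, here is a minimal sketch. `parse_index_list` is a hypothetical helper written for this example; it covers only comma-separated indices and ranges, not the full likwid-pin syntax (for instance, it does not handle "S:scatter" or the "E:..." expression form).

```julia
# Hypothetical helper: expand "0-3", "0,2,4", or a mix into integers.
function parse_index_list(str::AbstractString; onebased::Bool = false)
    idcs = Int[]
    for part in split(str, ',')
        if occursin('-', part)
            lo, hi = parse.(Int, split(part, '-'))
            append!(idcs, lo:hi)
        else
            push!(idcs, parse(Int, part))
        end
    end
    # Normalise to zero-based indices if the input was written one-based.
    return onebased ? idcs .- 1 : idcs
end

domain, list = split("S0:0-3", ':')   # domain part and index part
zero_based = parse_index_list(list)
one_based = parse_index_list("1-4"; onebased = true)
println(domain, " => ", zero_based)
println(one_based)
```

Under these assumptions, the index list "1-4" with onebased = true selects the same logical indices as "0-3" with the default.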