The current/main R session counts as one, meaning the minimum number of cores available is always at least one.
availableCores(
  constraints = NULL,
  methods = getOption2("parallelly.availableCores.methods", c("system", "cgroups.cpuset",
    "cgroups.cpuquota", "cgroups2.cpu.max", "nproc", "mc.cores", "BiocParallel",
    "_R_CHECK_LIMIT_CORES_", "Bioconductor", "LSF", "PJM", "PBS", "SGE", "Slurm",
    "fallback", "custom")),
  na.rm = TRUE,
  logical = getOption2("parallelly.availableCores.logical", TRUE),
  default = c(current = 1L),
  which = c("min", "max", "all"),
  omit = getOption2("parallelly.availableCores.omit", 0L)
)
constraints - An optional character specifying under what constraints ("purposes") we are requesting the values. For instance, on systems where multicore processing is not supported (i.e. Windows), using constraints = "multicore" will force a single core to be reported. Using constraints = "connections" will append "connections" to the methods argument. It is possible to specify multiple constraints, e.g. constraints = c("connections", "multicore").
methods - A character vector specifying how to infer the number of available cores.
na.rm - If TRUE, only non-missing settings are considered/returned.
logical - Passed to detectCores(logical = logical), which, if supported, returns the number of logical CPUs (TRUE) or physical CPUs/cores (FALSE). At least as of R 4.2.2, detectCores() ignores this argument on Linux. This argument is only used if argument methods includes "system".
default - The default number of cores to return if no non-missing settings are available.
which - A character specifying which settings to return. If "min" (default), the minimum value is returned. If "max", the maximum value is returned (be careful!). If "all", all values are returned.
omit - (integer; non-negative) Number of cores to not include.
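As a rough illustration of the constraints argument, the sketch below requests the number of cores for different purposes. It assumes the parallelly package is installed in a version that supports constraints = "connections"; the values depend on the machine and on how many R connections are currently free, so no output is shown.

```r
library(parallelly)

## Cap the result by the number of free R connections, which matters
## when launching socket-based cluster nodes
n_conn <- availableCores(constraints = "connections")

## Report a single core where forked ("multicore") processing is
## not supported, e.g. on Windows
n_fork <- availableCores(constraints = "multicore")

## Both constraints at once
n_both <- availableCores(constraints = c("connections", "multicore"))

message("connections: ", n_conn, ", multicore: ", n_fork, ", both: ", n_both)
```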
Returns a positive (>= 1) integer. If which = "all", then more than one value may be returned. Together with na.rm = FALSE, missing values may also be returned.
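The shapes of the return value can be inspected with a short sketch like the one below. It assumes the parallelly package is installed; the actual numbers differ between machines and job environments, so no output is shown.

```r
library(parallelly)

## Default (which = "min"): a single, conservative value
n_min <- availableCores()

## Every non-missing setting that the queried methods report
n_all <- availableCores(which = "all")
print(n_all)

## Also keep methods that report nothing; these appear as NA
n_all_na <- availableCores(which = "all", na.rm = FALSE)
print(n_all_na)
```

Printing n_all_na is a quick way to see which methods are active in the current environment and which are not.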
The following settings ("methods") for inferring the number of cores are supported:
"system"
-
Query detectCores(logical = logical)
.
"cgroups.cpuset"
-
On Unix, query control group (cgroup v1) value cpuset.set
.
"cgroups.cpuquota"
-
On Unix, query control group (cgroup v1) value
cpu.cfs_quota_us
/ cpu.cfs_period_us
.
"cgroups2.cpu.max"
-
On Unix, query control group (cgroup v2) values cpu.max
.
"nproc"
-
On Unix, query system command nproc
.
"mc.cores"
-
If available, returns the value of option
mc.cores
.
Note that mc.cores
is defined as the number of
additional R processes that can be used in addition to the
main R process. This means that with mc.cores = 0
all
calculations should be done in the main R process, i.e. we have
exactly one core available for our calculations.
The mc.cores
option defaults to environment variable
MC_CORES
(and is set accordingly when the parallel
package is loaded). The mc.cores
option is used by for
instance mclapply()
of the parallel
package.
"connections"
-
Query the current number of available R connections per
freeConnections()
. This is the maximum number of socket-based
parallel cluster nodes that are possible launch, because each
one needs its own R connection.
The exception is when freeConnections()
is zero, then 1L
is
still returned, because availableCores()
should always return a
positive integer.
"BiocParallel"
-
Query environment variable BIOCPARALLEL_WORKER_NUMBER
(integer),
which is defined and used by BiocParallel (>= 1.27.2).
If the former is set, this is the number of cores considered.
"_R_CHECK_LIMIT_CORES_"
-
Query environment variable _R_CHECK_LIMIT_CORES_
(logical or
"warn"
) used by R CMD check
and set to true by
R CMD check --as-cran
. If set to a non-false value, then a maximum
of 2 cores is considered.
"Bioconductor"
-
Query environment variable IS_BIOC_BUILD_MACHINE
(logical)
used by the Bioconductor (>= 3.16) build and check system. If set to
true, then a maximum of 4 cores is considered.
"LSF"
-
Query Platform Load Sharing Facility (LSF) environment variable
LSB_DJOB_NUMPROC
.
Jobs with multiple (CPU) slots can be submitted on LSF using
bsub -n 2 -R "span[hosts=1]" < hello.sh
.
"PJM"
-
Query Fujitsu Technical Computing Suite (that we choose to shorten
as "PJM") environment variables PJM_VNODE_CORE
and
PJM_PROC_BY_NODE
.
The first is set when submitted with pjsub -L vnode-core=8 hello.sh
.
"PBS"
-
Query TORQUE/PBS environment variables PBS_NUM_PPN
and NCPUS
.
Depending on PBS system configuration, these resource
parameters may or may not default to one.
An example of a job submission that results in this is
qsub -l nodes=1:ppn=2
, which requests one node with two cores.
"SGE"
-
Query Sun Grid Engine/Oracle Grid Engine/Son of Grid Engine (SGE)
and Univa Grid Engine (UGE) environment variable NSLOTS
.
An example of a job submission that results in this is
qsub -pe smp 2
(or qsub -pe by_node 2
), which
requests two cores on a single machine.
"Slurm"
-
Query Simple Linux Utility for Resource Management (Slurm)
environment variable SLURM_CPUS_PER_TASK
.
This may or may not be set. It can be set when submitting a job,
e.g. sbatch --cpus-per-task=2 hello.sh
or by adding
#SBATCH --cpus-per-task=2
to the hello.sh
script.
If SLURM_CPUS_PER_TASK
is not set, then it will fall back to
use SLURM_CPUS_ON_NODE
if the job is a single-node job
(SLURM_JOB_NUM_NODES
is 1), e.g. sbatch --ntasks=2 hello.sh
.
To make sure all tasks are assign to a single node, specify
--nodes=1
, e.g. sbatch --nodes=1 --ntasks=16 hello.sh
.
"custom"
-
If option
parallelly.availableCores.custom
is set and a function,
then this function will be called (without arguments) and it's value
will be coerced to an integer, which will be interpreted as a number
of available cores. If the value is NA, then it will be ignored.
It is safe for this custom function to call availableCores()
; if
done, the custom function will not be recursively called.
For any other value of a methods element, the R option with the same name is queried. If that is not set, the system environment variable with the same name is queried. If neither is set, a missing value is returned.
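To make the lookup rules above more concrete, here is a hedged sketch. It assumes the parallelly package is installed. MY_APP_CORES is a made-up name used purely for illustration, and setting SLURM_CPUS_PER_TASK by hand only mimics what Slurm would do inside a real job.

```r
library(parallelly)

## Mimic a Slurm job that was submitted with --cpus-per-task=2
Sys.setenv(SLURM_CPUS_PER_TASK = "2")
n_slurm <- availableCores(methods = "Slurm")

## A name that is not a built-in method (MY_APP_CORES is hypothetical):
## with no R option of that name, the environment variable is used
Sys.setenv(MY_APP_CORES = "3")
n_env <- availableCores(methods = "MY_APP_CORES")

## Once an R option of the same name is set, it is queried first
options(MY_APP_CORES = 5L)
n_opt <- availableCores(methods = "MY_APP_CORES")

message("Slurm: ", n_slurm, ", env var: ", n_env, ", option: ", n_opt)

## Clean up the illustrative settings
options(MY_APP_CORES = NULL)
Sys.unsetenv(c("SLURM_CPUS_PER_TASK", "MY_APP_CORES"))
```

Restricting methods to a single entry, as done here, is mainly useful for checking what one particular source reports; in normal use the default set of methods is queried together.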
Note that some machines might have a limited number of cores, or the R process runs in a container or a cgroup that only provides a small number of cores. In such cases:
ncores <- availableCores() - 1
may return zero, which is often not intended and is likely to give an error downstream. Instead, use:
ncores <- availableCores(omit = 1)
to put aside one of the cores from being used. Regardless of how many cores you put aside, this function is guaranteed to return at least one core.
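The difference between the two calls can be sketched in base R. The helper cores_after_omit() below is hypothetical and only mimics the documented guarantee (subtract, then never go below one); it is not the package's implementation.

```r
## Hypothetical helper: mimic 'omit' with a floor of one core
cores_after_omit <- function(available, omit = 0L) {
  stopifnot(available >= 1L, omit >= 0L)
  max(1L, as.integer(available) - as.integer(omit))
}

cores_after_omit(8L, omit = 1L)  # 7
cores_after_omit(1L, omit = 1L)  # 1, whereas 1L - 1L would be 0
cores_after_omit(2L, omit = 5L)  # 1
```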
It is possible to override the maximum number of cores on the machine as reported by availableCores(methods = "system"). This can be done by first specifying options(parallelly.availableCores.methods = "mc.cores") and then the number of cores to use, e.g. options(mc.cores = 8).
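Put together, the override might look like the following sketch. It assumes the parallelly package is installed; the previous option values are saved and restored so that the change does not leak into the rest of the session.

```r
library(parallelly)

## Consult only the 'mc.cores' option, and set it to eight cores.
## options() returns the previous values, which are kept for restoring.
old <- options(
  parallelly.availableCores.methods = "mc.cores",
  mc.cores = 8L
)

n <- availableCores()
message("Number of cores available: ", n)

## Restore the previous settings
options(old)
```

Note that this makes availableCores() ignore what the machine, a container, or a job scheduler reports, so it is best reserved for cases where the number of cores is known with certainty.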
To get the set of available workers regardless of machine, see availableWorkers().
message(paste("Number of cores available:", availableCores()))
#> Number of cores available: 8
if (FALSE) { # \dontrun{
  options(mc.cores = 2L)
  message(paste("Number of cores available:", availableCores()))
} # }
if (FALSE) { # \dontrun{
  ## IMPORTANT: availableCores() may return 1L
  options(mc.cores = 1L)
  ncores <- availableCores() - 1 ## ncores = 0
  ncores <- availableCores(omit = 1) ## ncores = 1
  message(paste("Number of cores to use:", ncores))
} # }
if (FALSE) { # \dontrun{
  ## Use 75% of the cores on the system but never more than four
  options(parallelly.availableCores.custom = function() {
    ncores <- max(parallel::detectCores(), 1L, na.rm = TRUE)
    ncores <- min(as.integer(0.75 * ncores), 4L)
    max(1L, ncores)
  })
  message(paste("Number of cores available:", availableCores()))
  ## Use 50% of the cores according to availableCores(), e.g.
  ## allocated by a job scheduler or cgroups.
  ## Note that it is safe to call availableCores() here.
  options(parallelly.availableCores.custom = function() {
    0.50 * parallelly::availableCores()
  })
  message(paste("Number of cores available:", availableCores()))
} # }