The futurize package allows you to easily turn sequential code
into parallel code by piping the sequential code to the futurize()
function. Easy!
library(plyr)
library(futurize)
plan(multisession)
slow_fcn <- function(x) {
  Sys.sleep(0.1)  # emulate work
  x^2
}
xs <- 1:1000
ys <- llply(xs, slow_fcn) |> futurize()
This vignette demonstrates how to use this approach to parallelize plyr
functions such as llply(), maply(), and ddply().
The plyr llply() function is commonly used to apply a function to
the elements of a list and return a list. For example,
library(plyr)
xs <- 1:1000
ys <- llply(xs, slow_fcn)
Here llply() evaluates sequentially, but we can easily make it
evaluate in parallel by using:
library(futurize)
library(plyr)
xs <- 1:1000
ys <- xs |> llply(slow_fcn) |> futurize()
This distributes the calculations across the available parallel workers, provided that parallel workers have been set up, e.g.
plan(multisession)
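The number of workers can also be set explicitly. The following is a minimal sketch, assuming the future package (which provides plan() and nbrOfWorkers()) is installed; the choice of two workers is arbitrary and only for illustration.

```r
library(future)

# Assumption: two local background R sessions; adjust to your machine
plan(multisession, workers = 2)

# Number of parallel workers provided by the current plan
n_workers <- nbrOfWorkers()

# Switch back to sequential processing when done
plan(sequential)
```

Resetting to plan(sequential) at the end shuts down the background sessions, which is a reasonable habit in scripts that should not leave workers running.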
The built-in multisession backend parallelizes on your local
computer and it works on all operating systems. There are [other
parallel backends] to choose from, including alternatives to
parallelize locally as well as distributed across remote machines,
e.g.
plan(future.mirai::mirai_multisession)
and
plan(future.batchtools::batchtools_slurm)
Another example is:
library(plyr)
library(futurize)
plan(future.mirai::mirai_multisession)
ys <- llply(baseball, summary) |> futurize()
The futurize() function supports parallelization of the following plyr functions:
- a_ply(), aaply(), adply(), alply()
- d_ply(), daply(), ddply(), dlply()
- l_ply(), laply(), ldply(), llply()
- m_ply(), maply(), mdply(), mlply()
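The same piping pattern should carry over to the other functions in this list. The sketch below applies it to ddply() and maply(), assuming the plyr and futurize packages are installed. The helper mean_mpg(), the use of the built-in mtcars data set, and the choice of two workers are illustrative assumptions, not part of the package.

```r
library(plyr)
library(futurize)
plan(multisession, workers = 2)  # assumption: two local workers

# Hypothetical helper: mean miles-per-gallon for one group of rows
mean_mpg <- function(df) data.frame(mpg = mean(df$mpg))

# ddply(): split mtcars by number of cylinders, summarize each group
seq_res <- ddply(mtcars, "cyl", mean_mpg)                # sequential
par_res <- ddply(mtcars, "cyl", mean_mpg) |> futurize()  # parallel

# maply(): call a function once per row of a data frame of arguments
args <- expand.grid(x = 1:3, y = 1:2)
prod_xy <- maply(args, function(x, y) x * y) |> futurize()

# Switch back to sequential processing when done
plan(sequential)
```

Because the futurized call only changes how the work is scheduled, seq_res and par_res are expected to hold the same values; comparing them with all.equal() is a simple sanity check.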