R Parallel
https://docs.rc.fas.harvard.edu/kb/r-parallel/
Tue, 19 Aug 2025 20:59:30 +0000

Description

Here, we briefly explain different ways to use R in parallel on the FASRC Cannon cluster. The best place for information on R Parallel is our training session.

Parallel computing may be necessary to speed up code or to handle large datasets. The workload is divided into chunks, and each worker (i.e., a core) takes one chunk. The goal is to reduce the total computation time by having the workers process their chunks at the same time.
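The chunking idea can be sketched in a few lines of shell. This is only an illustration of how a workload might be split evenly, not how any particular R package does it; the function name `chunk_bounds` and the numbers (10 items, 4 workers) are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first and last item (1-based) of the
# contiguous chunk assigned to one worker. Workers are numbered from 0.
chunk_bounds() {
  local n_tasks=$1 n_workers=$2 worker=$3
  local base=$(( n_tasks / n_workers ))    # minimum chunk size
  local extra=$(( n_tasks % n_workers ))   # leftover items, one each to the first workers
  local start=$(( worker * base + (worker < extra ? worker : extra) + 1 ))
  local size=$(( base + (worker < extra ? 1 : 0) ))
  echo "$start $(( start + size - 1 ))"
}

# Assumed example: 10 items shared among 4 workers
for w in 0 1 2 3; do
  echo "worker $w: items $(chunk_bounds 10 4 "$w")"
done
```

With these assumed numbers, the first two workers receive three items each (1 to 3 and 4 to 6) and the last two receive two each (7 to 8 and 9 to 10), so no worker holds more than one item beyond any other.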

Usage

Request an interactive node

salloc -p test --time=0:30:00 --mem=4000
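The request above does not ask for a specific number of cores. For parallel work you would normally add Slurm's `-c` (`--cpus-per-task`) option. The sketch below shows one way to read the allocated core count back inside the session. The helper name `worker_count` and the value of 4 cores are assumptions for illustration, not FASRC recommendations.

```shell
#!/bin/bash
# Example request (assumption: 4 cores on the test partition):
#   salloc -p test --time=0:30:00 --mem=4000 -c 4

# Hypothetical helper: number of R workers this session could start.
# Slurm sets SLURM_CPUS_PER_TASK only when -c/--cpus-per-task was requested,
# so fall back to a single worker when it is unset.
worker_count() {
  echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "R workers to start: $(worker_count)"
```

Sizing the worker pool from the allocation, rather than from the total number of cores on the node, helps keep a job within the resources Slurm actually granted it.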

Load required software modules.

# Compiler, MPI, and R libraries
# Use `module spider NAME` to find the correct version 
module load gcc/x.x.x openmpi/x.x.x R/x.x.x

Examples

User Codes has a summary of R parallel packages that can be used on Cannon. You can find a complete list of available packages at CRAN.

Processing large datasets

Single-node, multi-core (shared memory)

Multi-node, distributed memory

Hybrid: Multi-node + shared-memory

Using nested futures and the future.batchtools package, we can run a multi-node, multi-core job.

Resources
