Podman

Introduction

Podman is an Open Containers Initiative (OCI) container toolchain developed by RedHat. Unlike its popular OCI cousin Docker, it is daemonless, making it easier to use with resource schedulers like Slurm. Podman maintains a command line interface (CLI) that is very similar to Docker's. On the FASRC cluster the docker command runs podman under the hood, and many docker commands just work with podman, though with some exceptions. Note that this document uses the term container to mean OCI container. Besides Podman containers, FASRC also supports Singularity.

Normally podman requires privileged access. However, on the FASRC clusters we have enabled rootless podman, alleviating that requirement. We recommend reading our document on rootless containers before proceeding further so you understand how it works and its limitations.

Podman Documentation

The official Podman Documentation provides the latest information on how to use Podman. On this page we will merely highlight useful commands along with features and quirks specific to the FASRC cluster. You can get command line help by running man podman or podman --help.

Working with Podman

To start working with podman, first get an interactive session either via salloc or via Open OnDemand. Once you have that session then you can start working with your container image. The basic commands we will cover here are:

  • pull: Download a container image from a container registry
  • images: List downloaded images
  • run: Run a command in a new container
  • build: Create a container image from a Dockerfile/Containerfile
  • push: push a container image to a container registry

For these examples we will use the lolcow and ubuntu images from DockerHub.

pull

podman pull fetches the specified container image and extracts it into node-local storage (/tmp/container-user-<uid> by default on the FASRC cluster). This step is optional, as podman will automatically download an image specified in a podman run or podman build command.

[jharvard@holy8a26601 ~]$ podman pull docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done | 
Copying blob 9fb6c798fa41 done | 
Copying blob 3b61febd4aef done | 
Copying blob 9d99b9777eb0 done | 
Copying blob d010c8cf75d7 done | 
Copying blob 7fac07fb303e done | 
Copying config 577c1fe8e6 done | 
Writing manifest to image destination
577c1fe8e6d84360932b51767b65567550141af0801ff6d24ad10963e40472c5

images

podman images lists the images that are already available on the node (in /tmp/container-user-<uid>).

[jharvard@holy8a26601 ~]$ podman images
REPOSITORY                  TAG         IMAGE ID      CREATED      SIZE
docker.io/godlovedc/lolcow  latest      577c1fe8e6d8  7 years ago  248 MB

run

A Podman container image may define an entrypoint script that executes when the container is run. To run the container:

[jharvard@holy8a26601 ~]$ podman run -it docker://godlovedc/lolcow
 _______________________________________
/ Your society will be sought by people \
\ of taste and refinement.              /
 ---------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

To view the entrypoint script for a podman container:

[jharvard@holy8a26601 ~]$ podman inspect -f 'Entrypoint: {{.Config.Entrypoint}}\nCommand: {{.Config.Cmd}}' lolcow
Entrypoint: [/bin/sh -c fortune | cowsay | lolcat]
Command: []

shell

To start a shell inside a new container, specify the podman run -it --entrypoint bash options. -it effectively provides an interactive session, while --entrypoint bash invokes the bash shell (bash can be substituted with another shell program that exists in the container image).

[jharvard@holy8a26601 ~]$ podman run -it --entrypoint bash docker://godlovedc/lolcow
root@holy8a26601:/#

GPU Example

First, start an interactive job on a gpu partition. Then invoke podman run with the --device nvidia.com/gpu=all option:

[jharvard@holygpu7c26306 ~]$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Wed Jan 22 15:41:58 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:CA:00.0 Off |                   On |
| N/A   27C    P0             66W /  400W |                    N/A |      N/A     Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                             |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  0    2   0   0  |              37MiB / 19968MiB    | 42      0 |  3   0    2    0    0 |
|                  |               0MiB / 32767MiB    |           |                       |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                               |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
WARN[0001] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus

Batch Jobs

Podman containers can also be run as part of a normal batch job, as you would any other command. Simply include the podman command in your sbatch script. As an example, here is a sample podman.sbatch:

#!/bin/bash
#SBATCH -J podman_test
#SBATCH -o podman_test.out
#SBATCH -e podman_test.err
#SBATCH -p test
#SBATCH -t 0-00:10
#SBATCH -c 1
#SBATCH --mem=4G

# Podman command line options
podman run docker://godlovedc/lolcow

When submitted to the cluster as a batch job:

[jharvard@holylogin08 ~]$ sbatch podman.sbatch

This generates podman_test.out, which contains:

[jharvard@holylogin08 ~]$ cat podman_test.out
 ____________________________________
< Don't read everything you believe. >
 ------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Accessing Files

Each podman container operates within its own isolated filesystem tree in /tmp/container-user-<uid>/storage. However, host filesystems can be explicitly shared with containers by using the --volume option when starting a container (this is unlike Singularity, which automatically bind-mounts several default filesystems). This option bind-mounts a directory or file from the host into the container, granting the container access to that path. For instance, to access netscratch from the container:

[jharvard@holy8a26602 ~]$ podman run -it --entrypoint bash --volume /n/netscratch:/n/netscratch docker://ubuntu
root@holy8a26602:/# df -h
Filesystem                                         Size  Used Avail Use% Mounted on
overlay                                            397G  6.5G  391G   2% /
tmpfs                                               64M     0   64M   0% /dev
netscratch-ib01.rc.fas.harvard.edu:/netscratch/C   3.6P  1.8P  1.9P  49% /n/netscratch
/dev/mapper/vg_root-lv_scratch                     397G  6.5G  391G   2% /run/secrets
shm                                                 63M     0   63M   0% /dev/shm
devtmpfs                                           504G     0  504G   0% /dev/tty

Ownership, as seen from the host, of files created by a process in the container depends on the user ID (UID) of the creating process in the container. The owner is either:

  • The host (cluster) user, if:
    • the container user is root (UID 0) – this is often the default
    • podman run --userns=keep-id is specified, so the host user's UID and primary group ID are used for the container user (similar to SingularityCE in its default native mode)
    • podman run --userns=keep-id:uid=<container-uid>,gid=<container-gid> is specified, using the specified UID/GID for the container user and mapping it to the host/cluster user's UID/GID. E.g., in the following example, the “node” user in the container (UID=1000, GID=1000) creates a file that is (as seen from the host) owned by the host user:
      $ podman run -it --rm --user=node --entrypoint=id docker.io/library/node:22
      uid=1000(node) gid=1000(node) groups=1000(node)
      $ podman run -it --rm --volume /n/netscratch:/n/netscratch --userns=keep-id:uid=1000,gid=1000 --entrypoint=bash docker.io/library/node:22
      node@host:/$ touch /n/netscratch/jharvard_lab/Lab/jharvard/myfile
      node@host:/$ ls -l /n/netscratch/jharvard_lab/Lab/jharvard/myfile
      -rw-r--r--. 1 node node 0 Apr 7 16:05 /n/netscratch/jharvard_lab/Lab/jharvard/myfile
      node@host:/$ exit
      $ ls -ld /n/netscratch/jharvard_lab/Lab/jharvard/myfile
      -rw-r--r--. 1 jharvard jharvard_lab 0 Apr 7 12:05 /n/netscratch/jharvard_lab/Lab/jharvard/myfile
  • Otherwise, the file is owned by the subuid/subgid associated with the container UID/GID (see rootless containers), as shown in the sketch below. Only filesystems that can resolve your subuids can be written to from a podman container (e.g., NFS filesystems like /n/netscratch and home directories, or node-local filesystems like /scratch or /tmp; but not Lustre filesystems like holylabs), and only locations with “other” read/write/execute permissions can be used (e.g., the Everyone directory).
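
For illustration, here is a minimal sketch of the subuid case (the target directory and the numeric IDs shown are placeholders; your assigned subuid range will differ). A file created by the non-root “node” user (UID/GID 1000) without --userns=keep-id shows up on the host as owned by one of your subuids/subgids rather than by your username:

[jharvard@holy8a26602 ~]$ podman run --rm --user=node --volume /n/netscratch/jharvard_lab/Everyone:/data --entrypoint=touch docker.io/library/node:22 /data/subuid_test
[jharvard@holy8a26602 ~]$ ls -l /n/netscratch/jharvard_lab/Everyone/subuid_test
-rw-r--r--. 1 1021000 1021000 0 Apr 7 16:10 /n/netscratch/jharvard_lab/Everyone/subuid_test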

Environment Variables

A Podman container does not inherit environment variables from the host environment. Any environment variables that are not defined by the container image must be explicitly set with the --env option:

[jharvard@holy8a26602 ~]$ podman run -it --rm --env MY_VAR=test python:3.13-alpine python3 -c 'import os; print(os.environ["MY_VAR"])'
test

Building Your Own Podman Container

You can build or import a Podman container in several different ways. Common methods include:

  1. Download an existing OCI container image located in Docker Hub or another OCI container registry (e.g., quay.io, NVIDIA NGC Catalog, GitHub Container Registry).
  2. Build a Podman image from a Containerfile/Dockerfile.

Images are stored by default at /tmp/containers-user-<uid>/storage. You can find out more about the specific paths by running the podman info command.

Since the default path is in /tmp, images will only exist for the duration of the job, after which the system cleans up the space. If you want to keep images for longer, you will need to override the default configuration. You can do this by putting configuration settings in $HOME/.config/containers/storage.conf. Note that due to subuids you will need to select a storage location that your subuids can access. It should also be noted that the version of NFS the cluster runs does not currently support xattrs, meaning that NFS storage mounts will not work. This, plus the subuid restrictions, means that the vast majority of network storage will not work for this purpose. Documentation for storage.conf can be found here.
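
If you do override the storage location, a minimal sketch of storage.conf looks like the following (the graphroot path is a placeholder: it must be on a filesystem that your subuids can access and that supports xattrs, which rules out most network storage as noted above):

mkdir -p $HOME/.config/containers
cat > $HOME/.config/containers/storage.conf <<'EOF'
[storage]
driver = "overlay"
graphroot = "/path/your/subuids/can/access/containers/storage"
EOF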

Downloading OCI Container Image From Registry

To download an OCI container image from a registry, simply use the pull command:

[jharvard@holy8a26602 ~]$ podman pull docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done | 
Copying blob 9fb6c798fa41 done | 
Copying blob 3b61febd4aef done | 
Copying blob 9d99b9777eb0 done | 
Copying blob d010c8cf75d7 done | 
Copying blob 7fac07fb303e done | 
Copying config 577c1fe8e6 done | 
Writing manifest to image destination
577c1fe8e6d84360932b51767b65567550141af0801ff6d24ad10963e40472c5
WARN[0006] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus 
[jharvard@holy8a26602 ~]$ podman image ls
REPOSITORY                  TAG         IMAGE ID      CREATED      SIZE
docker.io/godlovedc/lolcow  latest      577c1fe8e6d8  7 years ago  248 MB

Build Podman Image From Dockerfile/Containerfile

Podman can build container images from a Dockerfile/Containerfile (podman prefers the generic term Containerfile, and podman build without -f <file> will first check for a Containerfile in the current working directory, falling back to Dockerfile if it does not exist). To build, first write your Containerfile:

FROM ubuntu:22.04

RUN apt-get -y update \
  && apt-get -y install cowsay lolcat\
  && rm -rf /var/lib/apt/lists/*
ENV LC_ALL=C PATH=/usr/games:$PATH 

ENTRYPOINT ["/bin/sh", "-c", "date | cowsay | lolcat"]

Then run the build command (assuming a Containerfile or Dockerfile is in the current working directory):

[jharvard@holy8a26602 ~]$ podman build -t localhost/lolcow
STEP 1/4: FROM ubuntu:22.04
Resolved "ubuntu" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/ubuntu:22.04...
Getting image source signatures
Copying blob 6414378b6477 done | 
Copying config 97271d29cb done | 
Writing manifest to image destination
STEP 2/4: RUN apt-get -y update && apt-get -y install cowsay lolcat

... omitted output ...

Running hooks in /etc/ca-certificates/update.d...
done.
--> a41765f5337a
STEP 3/4: ENV LC_ALL=C PATH=/usr/games:$PATH
--> e9eead916e20
STEP 4/4: ENTRYPOINT ["/bin/sh", "-c", "date | cowsay | lolcat"]
COMMIT
--> 51e919dd571f
51e919dd571f1c8a760ef54c746dcb190659bdd353cbdaa1d261ba8d50694d24

Saving/loading a container image on another compute node

Podman container images are stored on a node-local filesystem (/tmp/container-user-<uid>). Any container images built or pulled on one node that are needed on another node must be saved to a storage location that is accessible to all compute nodes in the FASRC cluster. The podman save command can be used to accomplish this.

[jharvard@holy8a26602 ~]$ podman save --format oci-archive -o lolcow.tar localhost/lolcow
[jharvard@holy8a26602 ~]$ ls -lh lolcow.tar 
-rw-r--r--. 1 jharvard jharvard_lab 57M Jan 27 11:37 lolcow.tar

Note: omitting --format oci-archive saves the file in the docker-archive format, which is uncompressed and thus faster to save/load, though larger in size.

From another compute node, podman load extracts the docker- or oci-archive to the node-local /tmp/container-user-<uid>, where it can be used by podman:

[jharvard@holy8a26603 ~]$ podman images
REPOSITORY  TAG         IMAGE ID    CREATED     SIZE
[jharvard@holy8a26603 ~]$ podman load -i lolcow.tar
Getting image source signatures
Copying blob 163070f105c3 done   | 
Copying blob f88085971e43 done   | 
Copying config e9749e43bc done   | 
Writing manifest to image destination
Loaded image: localhost/lolcow:latest
[jharvard@holy8a26603 ~]$ podman images
REPOSITORY        TAG         IMAGE ID      CREATED        SIZE
localhost/lolcow  latest      e9749e43bc74  6 minutes ago  172 MB

Pushing a container image to a container registry

To make a container image built on the FASRC cluster available outside the FASRC cluster, the container image can be pushed to a container registry. Popular container registries with a free tier include Docker Hub and the GitHub Container Registry.

This example illustrates the use of the GitHub Container Registry and assumes you have a GitHub account.

Note: The GitHub Container Registry is part of the GitHub Packages ecosystem.

  1. Create a Personal access token (classic) with write:packages scope (this implicitly adds read:packages for pulling private container images):
    https://github.com/settings/tokens/new?scopes=write:packages
  2. Authenticate to ghcr.io, using the authentication token generated in step 1 as the “password” (replace “<GITHUB_USERNAME>” with your GitHub username):
    [jharvard@holy8a26603 ~]$ podman login -u <GITHUB_USERNAME> ghcr.io
    Password: <paste authentication token>
    Login succeeded!
  3. Ensure the image is named ghcr.io/<OWNER>/<image>:<tag> (where “<OWNER>” is either your GitHub username or an organization that you are a member of and have permission to publish
    packages to), using the podman tag command to add a name to an existing local image if needed (Note: the GitHub owner must be all lower-case (e.g., jharvard instead of JHarvard)):

    [jharvard@holy8a26603 ~]$ podman tag localhost/lolcow:latest ghcr.io/<OWNER>/lolcow:latest
  4. Push the image to the container registry:
[jharvard@holy8a26603 ~]$ podman push ghcr.io/<OWNER>/lolcow:latest
Getting image source signatures
Copying blob 2573e0d81582 done   |
… 
Writing manifest to image destination

By default, the container image will be private. To change the visibility to “public”, access the package from the list at https://github.com/GITHUB_OWNER?tab=packages and configure the package settings (see Configuring a package’s access control and visibility).

Containers

Introduction

Containers have become the industry standard method for managing complex software environments, especially ones with bespoke configuration options. In brief, a container is a self-contained environment and software stack that runs on the host operating system (OS). Containers allow users to use a variety of base operating systems (e.g., Ubuntu) and their software packages other than the host OS that the cluster nodes run (Rocky Linux). One can even impersonate root inside the container to allow for highly customized builds and a high level of control over the environment.

While containers allow for the management of sophisticated software stacks, they are not a panacea. As light as containers are, they still incur a performance penalty: the more layers you put between the code and the hardware, the more inefficiencies pile up. In addition, host filesystem access and various other permission issues can be tricky. Other incompatibilities can arise between the OS of the container and the OS of the compute node.

Still, with these provisos in mind, containers are an excellent tool for software management. Containers exist for many software packages, making software installation faster and more trouble-free. Containers also make it easy to record and share the exact stack of software required for a workflow, allowing other researchers to more easily reproduce research results in order to validate and extend them.

Types of Containers

There are two main types of containers. The first is the industry standard OCI (Open Container Initiative) container, popularized by Docker. Docker uses a client-server architecture, with one (usually) privileged background server process (or “daemon” process, called “dockerd”) per host. If run on a multi-tenant system (e.g., an HPC cluster such as Cannon), this results in a security issue in that users who interact with the privileged daemon process could access files owned by other users. Additionally, on an HPC cluster, the docker daemon process does not integrate with Slurm resource allocation facilities.

Podman, a daemonless OCI container toolchain developed by RedHat to address these issues, is installed on the FASRC cluster. The Podman CLI (command-line interface) was designed to be largely compatible with the Docker CLI, and on the FASRC cluster, the docker command runs podman under the hood. Many docker commands will just work with podman, though there are some differences.

The second is Singularity. Singularity grew out of the need for additional security in shared-user contexts (like you find on a cluster). Since Docker normally requires the user to run as root, Singularity was created to alleviate this requirement and bring the advantages of containerization to a broader context. There are a couple of implementations of Singularity; on the cluster, we use SingularityCE (the other implementation is Apptainer). Singularity has the ability to convert OCI (docker) images into Singularity Image Format (SIF) files. Singularity images have the advantage of being distributable as a single read-only file which, on an HPC cluster, can be located on a shared filesystem and easily launched by processes on different nodes. Additionally, Singularity containers can run as the user who launched them, without elevated privileges.

Rootless

Normally, building a container requires root permissions, and in the case of Podman/Docker, the containers themselves would ordinarily be launched by the root user. While this may be fine in a cloud context, it is not in a shared resource context like an HPC cluster. Rootless is the solution to this problem.

Rootless essentially allows the user to spoof being root inside the container. It does this via a Linux feature called subuid (short for Subordinate User ID) and subgid (Subordinate Group ID). This feature allows a range of uids (a uid is a unique integer assigned to each user name, used for permissions identification) and gids (unique integers for groups) to be subordinated to another uid. An example is illustrative. Let's say you are userA with a uid of 20000. You are assigned the subuid range 1020001-1021000. When you run your container, the following mapping happens:

In the Container [username(uid)]    Outside the Container [username(uid)]
root(0)                             userA(20000)
apache(48)                          1020048
ubuntu(1000)                        1021000

Thus, you can see that while you are inside the container, you pretend to be another user and have all the privileges of that user in the container. Outside the container, though, you are acting as your own user and the uids subordinated to your user.

A few notes are important here:

  1. The subuid/subgid range assigned to each user does not overlap the uid/gid or subuid/subgid range assigned to any other user or group.
  2. While you may be spoofing a specific user inside of the container, the process outside the container sees you as your normal uid or subuid. Thus, if you use normal Linux tools like top or ps outside the container, you will notice that the IDs that show up are your uid and subuids.
  3. Filesystems, since they are external, also see you as your normal uid/gid and subuid/subgid. So files created as root in the container will show up on the storage as owned by your uid/gid. Files created by other users in the container will show up as their mapped subuid/subgid.

Rootless is very powerful and allows you both to build containers on the cluster and to run Podman/Docker containers right out of the box. If you want to see what your subuid mapping is, you can find the mappings in /etc/subuid and /etc/subgid. You can find your uid by running the id command, which you can then use to look up your map (e.g., with the command: grep "^$(id -u):" /etc/subuid).
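
For example (a sketch; the uid and range shown are illustrative and will differ for your account), each entry in /etc/subuid has the form uid:start_of_range:count:

[jharvard@holy8a24601 ~]$ grep "^$(id -u):" /etc/subuid
21442:1020001:1000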

Rootless and Filesystems

Two more crucial notes about filesystems. The first is that since subuids are not part of our normal authentication system, filesystems that cannot resolve subuids will not permit them access. In particular, Lustre (e.g., /n/holylabs) does not recognize subuids, and since it cannot resolve them, it will deny them access. NFS filesystems (e.g., /n/netscratch) do not have this problem.

The second is that even if you can get into the filesystem, you may not be able to traverse into locations that do not have world access (o+rx) enabled. This is because the filesystem cannot resolve your user group or user name, does not see you as a valid member of the group, and thus will reject you. As such, it is imperative to test and validate filesystem access for filesystems you intend to map into the container and ensure that access is achievable. A simple way to ensure this is to utilize the Everyone directory which exists for most filesystems on the cluster. Note that your home directory is not world accessible for security reasons and thus cannot be used.
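
A quick way to do this is to try creating a file in the intended location from inside a container before relying on it (a sketch; the lab share path is a placeholder — substitute your own lab's Everyone directory):

[jharvard@holy8a24601 ~]$ podman run --rm --volume /n/netscratch/jharvard_lab/Everyone:/data docker://ubuntu touch /data/access_test
[jharvard@holy8a24601 ~]$ ls -l /n/netscratch/jharvard_lab/Everyone/access_test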

Getting Started

The first step in utilizing a container on the cluster is to submit a job. Login nodes are not appropriate places for development. If you are just beginning, the easiest method is either to get a command line interactive session via salloc or to launch an Open OnDemand (OOD) session.

Once you have a session, you can then launch your container:

Singularity

[jharvard@holy8a26602 ~]$ singularity run docker://godlovedc/lolcow
INFO: Downloading library image to tmp cache: /scratch/sbuild-tmp-cache-701047440
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
INFO: Fetching OCI image...
45.3MiB / 45.3MiB [============================================================================================================================] 100 % 21.5 MiB/s 0s
53.7MiB / 53.7MiB [============================================================================================================================] 100 % 21.5 MiB/s 0s
INFO: Extracting OCI image...
2025/01/09 10:49:52 warn rootless{dev/agpgart} creating empty file in place of device 10:175
2025/01/09 10:49:52 warn rootless{dev/audio} creating empty file in place of device 14:4
2025/01/09 10:49:52 warn rootless{dev/audio1} creating empty file in place of device 14:20
INFO: Inserting Singularity configuration...
INFO: Creating SIF file...
 _________________________________________
/ Q: What do you call a principal female  \
| opera singer whose high C               |
|                                         |
| is lower than those of other principal  |
\ female opera singers? A: A deep C diva. /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Podman

[jharvard@holy8a24601 ~]$ podman run docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done |
Copying blob 9fb6c798fa41 done |
Copying blob 3b61febd4aef done |
Copying blob 9d99b9777eb0 done |
Copying blob d010c8cf75d7 done |
Copying blob 7fac07fb303e done |
Copying config 577c1fe8e6 done |
Writing manifest to image destination
 _____________________________
< Give him an evasive answer. >
 -----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Shell

If you want to get a shell prompt in a container, do the following:

Singularity

[jharvard@holy8a26602 ~]$ singularity shell docker://godlovedc/lolcow
Singularity>

Podman

[jharvard@holy8a26601 ~]$ podman run --rm -it --entrypoint bash docker://godlovedc/lolcow
root@holy8a26601:/#

GPU

If you want to use a GPU in a container, first start a job reserving a GPU on a gpu node. Then do the following:

Singularity

You will want to add the --nv flag for singularity:

[jharvard@holygpu7c26306 ~]$ singularity exec --nv docker://godlovedc/lolcow /bin/bash
Singularity> nvidia-smi
Fri Jan 10 15:50:20 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:4B:00.0 Off |                   On |
| N/A   24C    P0             43W /  400W |       74MiB / 40960MiB |      N/A     Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                             |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  0    1   0   0  |              37MiB / 19968MiB    | 42      0 |  3   0    2    0    0 |
|                  |               0MiB / 32767MiB    |           |                       |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                               |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Singularity>

Podman

For podman you need to add --device nvidia.com/gpu=all:

[jharvard@holygpu7c26305 ~]$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Fri Jan 10 20:26:57 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:31:00.0 Off |                   On |
| N/A   25C    P0             47W /  400W |                    N/A |      N/A     Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                             |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  0    2   0   0  |              37MiB / 19968MiB    | 42      0 |  3   0    2    0    0 |
|                  |               0MiB / 32767MiB    |           |                       |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                               |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
WARN[0005] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus

Docker Rate Limiting

Docker Hub limits the number of pulls anonymous accounts can make. If you hit either of the following errors:

ERROR: toomanyrequests: Too Many Requests.

or

You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limits.

you will need to create a Docker account to increase your limit. See the Docker documentation for more details.

Once you have a Docker account, you can authenticate to Docker Hub with your Docker Hub account (not your FASRC account) and then run Docker containers.

Singularity

singularity remote login --username <dockerhub_username> docker://docker.io

Podman

podman login docker.io

Advanced Usage

For advanced usage tips, such as how to build your own containers, see our specific container software pages for Podman and Singularity.

Singularity

Introduction

Singularity enables users to have full control of their operating system (OS) environment. This allows a non-privileged user (i.e., without root, sudo, or administrator rights) to “swap out” the Linux operating system and environment on the host machine (i.e., the cluster's OS) for another Linux OS and computing environment that they can control (i.e., the container's OS). For instance, the host system runs Rocky Linux but your application requires CentOS or Ubuntu Linux with a specific software stack. You can create a CentOS or Ubuntu image containing your software and its dependencies, and run your application on that host in its native CentOS or Ubuntu environment. Singularity leverages the resources of the host system, such as high-speed interconnects (e.g., InfiniBand), high-performance parallel file systems (e.g., the /n/netscratch and /n/holylfs filesystems), GPUs, and other resources (e.g., licensed Intel compilers).

Note for Windows and MacOS: Singularity only supports Linux containers. You cannot create images that use Windows or MacOS (this is a restriction of the containerization model rather than Singularity).

Docker/Podman vs. Singularity

Podman (a Docker-compatible software tool for managing containers) is also supported on FASRC clusters. There are some important differences between Docker/Podman and Singularity:

  • Singularity allows running containers as a regular cluster user.
  • Docker/Podman and Singularity have their own container formats.
  • Docker/Podman (Open Container Initiative) containers may be imported to run via Singularity.

Singularity, SingularityCE, Apptainer

SingularityCE (Singularity Community Edition) and Apptainer are branches/children of the deprecated Singularity. SingularityCE is maintained by Sylabs while Apptainer is maintained by the Linux Foundation. By and large the two are interoperable with slightly different feature sets. The cluster uses SingularityCE, which we will refer to in this document as Singularity.

Singularity Glossary

  • SingularityCE or Apptainer or Podman or Docker: the containerization software
    • as in “SingularityCE 3.11” or “Apptainer 1.0”
  • Image: a compressed, usually read-only file that contains an OS and specific software stack
  • Container
    • The technology, e.g. “containers vs. virtual machines”
    • An instance of an image, e.g. “I will train a model using a Singularity container of PyTorch.”
  • Host: computer/supercomputer where the image is run

Singularity on the cluster

To use Singularity on the cluster one must first start an interactive, Open OnDemand, or batch job. Then you simply run singularity:

[jharvard@holy2c04309 ~]$ singularity --version
singularity-ce version 4.2.2-1.el8

SingularityCE Documentation

The SingularityCE User Guide has the latest documentation. You can also see the most up-to-date help on SingularityCE from the command line:

[jharvard@holy2c04309 ~]$ singularity --help

Linux container platform optimized for High Performance Computing (HPC) and
Enterprise Performance Computing (EPC)

Usage:
  singularity [global options...]

Description:
  Singularity containers provide an application virtualization layer enabling
  mobility of compute via both application and environment portability. With
  Singularity one is capable of building a root file system that runs on any
  other Linux system where Singularity is installed.

Options:
  -c, --config string   specify a configuration file (for root or
                        unprivileged installation only) (default
                        "/etc/singularity/singularity.conf")
  -d, --debug           print debugging information (highest verbosity)
  -h, --help            help for singularity
      --nocolor         print without color output (default False)
  -q, --quiet           suppress normal output
  -s, --silent          only print errors
  -v, --verbose         print additional information
      --version         version for singularity

Available Commands:
  build       Build a Singularity image
  cache       Manage the local cache
  capability  Manage Linux capabilities for users and groups
  completion  Generate the autocompletion script for the specified shell
  config      Manage various singularity configuration (root user only)
  delete      Deletes requested image from the library
  exec        Run a command within a container
  help        Help about any command
  inspect     Show metadata for an image
  instance    Manage containers running as services
  key         Manage OpenPGP keys
  oci         Manage OCI containers
  overlay     Manage an EXT3 writable overlay image
  plugin      Manage Singularity plugins
  pull        Pull an image from a URI
  push        Upload image to the provided URI
  remote      Manage singularity remote endpoints, keyservers and OCI/Docker registry credentials
  run         Run the user-defined default command within a container
  run-help    Show the user-defined help for an image
  search      Search a Container Library for images
  shell       Run a shell within a container
  sif         Manipulate Singularity Image Format (SIF) images
  sign        Add digital signature(s) to an image
  test        Run the user-defined tests within a container
  verify      Verify digital signature(s) within an image
  version     Show the version for Singularity

Examples:
  $ singularity help <command> [<subcommand>]
  $ singularity help build
  $ singularity help instance start


For additional help or support, please visit https://www.sylabs.io/docs/

Working with Singularity Images

Singularity uses a portable, single-file container image format known as the Singularity Image Format (SIF). You can scp or rsync these to the cluster as you would do with any other file. See Copying Data to & from the cluster using SCP or SFTP for more information. You can also download them from various container registries or build your own.
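
For example, from your local machine you could copy a SIF file to the cluster with scp (a sketch; the image name and destination directory are placeholders, and jharvard should be replaced with your FASRC username):

$ scp lolcow_latest.sif jharvard@login.rc.fas.harvard.edu:/n/holylabs/LABS/jharvard_lab/Users/jharvard/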

When working with images you can run an interactive shell inside them, execute custom commands, or launch their runscript, as shown in the sections below.

For more examples and details, see SingularityCE quick start guide.

Working with Singularity Images Interactively

Singularity syntax

singularity <command> [options] <container_image.sif>

Commands
  • shell: run an interactive shell inside the container
  • exec: execute a command
  • run: launch the runscript

For this example, we will use the laughing cow Singularity image from Sylabs library.

First, request an interactive job (for more details, see our documentation on interactive jobs on Cannon and on FASSE) and download the laughing cow lolcow_latest.sif Singularity image:

# request interactive job
[jharvard@holylogin01 ~]$ salloc -p test -c 1 -t 00-01:00 --mem=4G

# pull image from Sylabs library
[jharvard@holy2c02302 sylabs_lib]$ singularity pull library://lolcow
INFO:    Downloading library image
90.4MiB / 90.4MiB [=====================================] 100 % 7.6 MiB/s 0s

shell

With the shell command, you can start a new shell within the container image and interact with it as if it were a small virtual machine.

Note that the shell command does not source ~/.bashrc and ~/.bash_profile. Therefore, the shell command is useful if customizations in your ~/.bashrc and ~/.bash_profile should not be sourced within the Singularity container.

# launch container with shell command
[jharvard@holy2c02302 sylabs_lib]$ singularity shell lolcow_latest.sif

# test some linux commands within container
Singularity> pwd
/n/holylabs/LABS/jharvard_lab/Users/jharvard/sylabs_lib
Singularity> ls -l
total 95268
-rwxr-xr-x 1 jharvard jharvard_lab  2719744 Mar  9 14:27 hello-world_latest.sif
drwxr-sr-x 2 jharvard jharvard_lab     4096 Mar  1 15:21 lolcow
-rwxr-xr-x 1 jharvard jharvard_lab 94824197 Mar  9 14:56 lolcow_latest.sif
drwxr-sr-x 2 jharvard jharvard_lab     4096 Mar  1 15:23 ubuntu22.04
Singularity> id
uid=21442(jharvard) gid=10483(jharvard_lab) groups=10483(jharvard_lab)
Singularity> cowsay moo
 _____
< moo >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

# exit the container
Singularity> exit
[jharvard@holy2c02302 sylabs_lib]$

exec

The exec command allows you to execute a custom command within a container by specifying the image file. For instance, to execute the cowsay program within the lolcow_latest.sif container:

[jharvard@holy2c02302 sylabs_lib]$ singularity exec lolcow_latest.sif cowsay moo
 _____
< moo >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
[jharvard@holy2c02302 sylabs_lib]$ singularity exec lolcow_latest.sif cowsay "hello FASRC"
 _____________
< hello FASRC >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

run

Singularity containers may contain runscripts. These are user defined scripts that define the actions a container should perform when someone runs it. The runscript can be triggered with the run command, or simply by calling the container as though it were an executable.

Using the run command:

[jharvard@holy2c02302 sylabs_lib]$ singularity run lolcow_latest.sif
 _____________________________
< Thu Mar 9 15:15:56 UTC 2023 >
 -----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Running the container as though it were an executable file:

[jharvard@holy2c02302 sylabs_lib]$ ./lolcow_latest.sif
 _____________________________
< Thu Mar 9 15:17:06 UTC 2023 >
 -----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

To view the runscript of a Singularity image:

[jharvard@holy2c02302 sylabs_lib]$ singularity inspect -r lolcow_latest.sif

#!/bin/sh

    date | cowsay | lolcat

GPU Example

First, start an interactive job in the gpu or gpu_test partition and then download the Singularity image.

# request interactive job on gpu_test partition
[jharvard@holylogin01 gpu_example]$ salloc -p gpu_test --gres=gpu:1 --mem 8G -c 4 -t 60

# build singularity image by pulling container from Docker Hub
[jharvard@holygpu7c1309 gpu_example]$ singularity pull docker://tensorflow/tensorflow:latest-gpu
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob 521d4798507a done
Copying blob 2798fbbc3b3b done
Copying blob 4d8ee731d34e done
Copying blob 92d2e1452f72 done
Copying blob 6aafbce389f9 done
Copying blob eaead16dc43b done
Copying blob 69cc8495d782 done
Copying blob 61b9b57b3915 done
Copying blob eac8c9150c0e done
Copying blob af53c5214ca1 done
Copying blob fac718221aaf done
Copying blob 2047d1a62832 done
Copying blob 9a9a3600909b done
Copying blob 79931d319b40 done
Copying config bdb8061f4b done
Writing manifest to image destination
Storing signatures
2023/03/09 13:52:18  info unpack layer: sha256:eaead16dc43bb8811d4ff450935d607f9ba4baffda4fc110cc402fa43f601d83
2023/03/09 13:52:19  info unpack layer: sha256:2798fbbc3b3bc018c0c246c05ee9f91a1ebe81877940610a5e25b77ec5d4fe24
2023/03/09 13:52:19  info unpack layer: sha256:6aafbce389f98e508428ecdf171fd6e248a9ad0a5e215ec3784e47ffa6c0dd3e
2023/03/09 13:52:19  info unpack layer: sha256:4d8ee731d34ea0ab8f004c609993c2e93210785ea8fc64ebc5185bfe2abdf632
2023/03/09 13:52:19  info unpack layer: sha256:92d2e1452f727e063220a45c1711b635ff3f861096865688b85ad09efa04bd52
2023/03/09 13:52:19  info unpack layer: sha256:521d4798507a1333de510b1f5474f85d3d9a00baa9508374703516d12e1e7aaf
2023/03/09 13:52:21  warn rootless{usr/lib/x86_64-linux-gnu/gstreamer1.0/gstreamer-1.0/gst-ptp-helper} ignoring (usually) harmless EPERM on setxattr "security.capability"
2023/03/09 13:52:54  info unpack layer: sha256:69cc8495d7822d2fb25c542ab3a66b404ca675b376359675b6055935260f082a
2023/03/09 13:52:58  info unpack layer: sha256:61b9b57b3915ef30727fb8807d7b7d6c49d7c8bdfc16ebbc4fa5a001556c8628
2023/03/09 13:52:58  info unpack layer: sha256:eac8c9150c0e4967c4e816b5b91859d5aebd71f796ddee238b4286a6c58e6623
2023/03/09 13:52:59  info unpack layer: sha256:af53c5214ca16dbf9fd15c269f3fb28cefc11121a7dd7c709d4158a3c42a40da
2023/03/09 13:52:59  info unpack layer: sha256:fac718221aaf69d29abab309563304b3758dd4f34f4dad0afa77c26912aed6d6
2023/03/09 13:53:00  info unpack layer: sha256:2047d1a62832237c26569306950ed2b8abbdffeab973357d8cf93a7d9c018698
2023/03/09 13:53:15  info unpack layer: sha256:9a9a3600909b9eba3d198dc907ab65594eb6694d1d86deed6b389cefe07ac345
2023/03/09 13:53:15  info unpack layer: sha256:79931d319b40fbdb13f9269d76f06d6638f09a00a07d43646a4ca62bf57e9683
INFO:    Creating SIF file...

Run the container with GPU support, see available GPUs, and check if tensorflow can detect them:

# run the container
[jharvard@holygpu7c1309 gpu_example]$ singularity shell --nv tensorflow_latest-gpu.sif
Singularity> nvidia-smi
Thu Mar  9 18:57:53 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   35C    P0    25W / 250W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  On   | 00000000:2F:00.0 Off |                    0 |
| N/A   36C    P0    23W / 250W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  On   | 00000000:86:00.0 Off |                    0 |
| N/A   35C    P0    25W / 250W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-PCIE...  On   | 00000000:D8:00.0 Off |                    0 |
| N/A   33C    P0    23W / 250W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# check if `tensorflow` can see GPUs
Singularity> python
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
2023-03-09 19:00:15.107804: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> print(device_lib.list_local_devices())
2023-03-09 19:00:20.010087: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-09 19:00:24.024427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /device:GPU:0 with 30960 MB memory:  -> device: 0, name: Tesla V100-PCIE-32GB, pci bus id: 0000:06:00.0, compute capability: 7.0
2023-03-09 19:00:24.026521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /device:GPU:1 with 30960 MB memory:  -> device: 1, name: Tesla V100-PCIE-32GB, pci bus id: 0000:2f:00.0, compute capability: 7.0
2023-03-09 19:00:24.027583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /device:GPU:2 with 30960 MB memory:  -> device: 2, name: Tesla V100-PCIE-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0
2023-03-09 19:00:24.028227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /device:GPU:3 with 30960 MB memory:  -> device: 3, name: Tesla V100-PCIE-32GB, pci bus id: 0000:d8:00.0, compute capability: 7.0

... omitted output ...

incarnation: 3590943835431918555
physical_device_desc: "device: 3, name: Tesla V100-PCIE-32GB, pci bus id: 0000:d8:00.0, compute capability: 7.0"
xla_global_id: 878896533
]

Running Singularity Images in Batch Jobs

You can also use Singularity images within a non-interactive batch script as you would any other command. If your image contains a run-script then you can use singularity run to execute the run-script in the job. You can also use singularity exec to execute arbitrary commands (or scripts) within the image.

Below is an example batch-job submission script using the laughing cow lolcow_latest.sif to print out information about the native OS of the image.

File singularity.sbatch:

#!/bin/bash
#SBATCH -J singularity_test
#SBATCH -o singularity_test.out
#SBATCH -e singularity_test.err
#SBATCH -p test
#SBATCH -t 0-00:10
#SBATCH -c 1
#SBATCH --mem=4G

# Singularity command line options
singularity exec lolcow_latest.sif cowsay "hello from slurm batch job"

Submit a slurm batch job:

[jharvard@holy2c02302 jharvard]$ sbatch singularity.sbatch

Upon the job completion, the standard output is located in the file singularity_test.out:

 [jharvard@holy2c02302 jharvard]$ cat singularity_test.out
  ____________________________
< hello from slurm batch job >
 ----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

GPU Example Batch Job

File singularity_gpu.sbatch (ensure to include the --nv flag after singularity exec):

#!/bin/bash
#SBATCH -J singularity_gpu_test
#SBATCH -o singularity_gpu_test.out
#SBATCH -e singularity_gpu_test.err
#SBATCH -p gpu
#SBATCH --gres=gpu:1
#SBATCH -t 0-00:10
#SBATCH -c 1
#SBATCH --mem=8G

# Singularity command line options
singularity exec --nv lolcow_latest.sif nvidia-smi

Submit a slurm batch job:

[jharvard@holy2c02302 jharvard]$ sbatch singularity_gpu.sbatch

Upon the job completion, the standard output is located in the file singularity_gpu_test.out:

$ cat singularity_gpu_test.out
Thu Mar  9 20:40:24 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   35C    P0    25W / 250W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Accessing Files

Files and directories on the cluster are accessible from within the container. By default, directories under /n, $HOME, $PWD, and /tmp are available at runtime inside the container.

See these variables on the host operating system:

[jharvard@holy2c02302 jharvard]$ echo $PWD
/n/holylabs/LABS/jharvard_lab/Lab/jharvard
[jharvard@holy2c02302 jharvard]$ echo $HOME
/n/home01/jharvard
[jharvard@holy2c02302 jharvard]$ echo $SCRATCH
/n/netscratch

The same variables within the container:

[jharvard@holy2c02302 jharvard]$ singularity shell lolcow_latest.sif
Singularity> echo $PWD
/n/holylabs/LABS/jharvard_lab/Lab/jharvard
Singularity> echo $HOME
/n/home01/jharvard
Singularity> echo $SCRATCH
/n/netscratch

You can specify additional directories from the host system so that they are accessible from the container. This process is called bind mounting and is done with the --bind option.

For instance, first create a file hello.dat in the /scratch directory on the host system. Then you can access it from within the container by bind mounting /scratch to the /mnt directory inside the container:

[jharvard@holy2c02302 jharvard]$ echo 'Hello from file in mounted directory!' > /scratch/hello.dat
[jharvard@holy2c02302 jharvard]$ singularity shell --bind /scratch:/mnt lolcow_latest.sif
Singularity> cd /mnt/
Singularity> ls
cache  hello.dat
Singularity> cat hello.dat
Hello from file in mounted directory!

If you don’t use the --bind option, the file will not be available in the directory /mnt inside the container:

[jharvard@holygpu7c1309 sylabs_lib]$ singularity shell lolcow_latest.sif
Singularity> cd /mnt/
Singularity> ls
Singularity>

Submitting Jobs Within a Singularity Container

Note: Submitting jobs from within a container may or may not work out of the box. This is due to possible environment variable mismatches, as well as operating system and image library issues. It is important to validate that submitted jobs are properly constructed and operating as expected. If possible, it is best to submit jobs outside the container in the host environment.

If you would like to submit slurm jobs from inside the container, you can bind the directories where the slurm executables and their dependencies are. The environment variable SINGULARITY_BIND stores the directories of the host system that are accessible from inside the container. Thus, slurm commands can be made accessible by adding the following code to your slurm batch script before the singularity execution (see the sketch after the code block):

export SINGULARITY_BIND=$(tr '\n' ',' <<END
/etc/nsswitch.conf
/etc/slurm
/etc/sssd/
/lib64/libnss_sss.so.2:/lib/libnss_sss.so.2
/slurm
/usr/bin/sacct
/usr/bin/salloc
/usr/bin/sbatch
/usr/bin/scancel
/usr/bin/scontrol
/usr/bin/scrontab
/usr/bin/seff
/usr/bin/sinfo
/usr/bin/squeue
/usr/bin/srun
/usr/bin/sshare
/usr/bin/sstat
/usr/bin/strace
/usr/lib64/libmunge.so.2
/usr/lib64/slurm
/var/lib/sss
/var/run/munge:/run/munge
END
)
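
With those bind paths exported, a subsequent singularity command in the same batch script can submit jobs from inside the container. A minimal sketch (the image and inner job script names are placeholders; as noted above, validate that the submitted job behaves as expected):

singularity exec lolcow_latest.sif sbatch my_inner_job.sbatch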

Build Your Own Singularity Container

You can build or import a Singularity container in different ways. Common methods include:

  1. Download an existing container from the SingularityCE Container Library or another image repository. This will download an existing Singularity image to the FASRC cluster.
  2. Build a SIF from an OCI container image located in Docker Hub or another OCI container registry (e.g., quay.io, NVIDIA NGC Catalog, GitHub Container Registry). This will download the OCI container image and convert it into a Singularity container image on the FASRC cluster.
  3. Build a SIF file from a Singularity definition file directly on the FASRC cluster.
  4. Build an OCI-SIF from a local Dockerfile using option --oci. The resulting image can be pushed to an OCI container registry (e.g., Docker Hub) for distribution/use by other container runtimes such as Docker.

NOTE: for all of the options above, you need to be on a compute node. The Singularity on the cluster section above shows how to request an interactive job on Cannon and FASSE.

Download Existing Singularity Container from Library or Registry

Download the laughing cow (lolcow) image from Singularity library with singularity pull:

[jharvard@holy2c02302 ~]$ singularity pull lolcow.sif library://lolcow
INFO:    Starting build...
INFO:    Using cached image
INFO:    Verifying bootstrap image /n/home05/jharvard/.singularity/cache/library/sha256.cef378b9a9274c20e03989909930e87b411d0c08cf4d40ae3b674070b899cb5b
INFO:    Creating SIF file...
INFO:    Build complete: lolcow.sif

Download a custom JupyterLab and Seaborn image from the Seqera Containers registry (which builds/hosts OCI and Singularity container images comprising user-selected conda and Python packages):

[jharvard@holy2c02302 ~]$ singularity pull oras://community.wave.seqera.io/library/jupyterlab_seaborn:a7115e98a9fc4dbe
INFO:    Downloading oras
287.0MiB / 287.0MiB [=======================================] 100 % 7.0 MiB/s 0s
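
The pulled image can then be used like any other SIF file. For example, assuming the default file name produced by singularity pull and that the JupyterLab CLI is on the container's PATH:

[jharvard@holy2c02302 ~]$ singularity exec jupyterlab_seaborn_a7115e98a9fc4dbe.sif jupyter lab --version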

Download Existing Container from Docker Hub

Build the laughing cow (lolcow) image from Docker Hub:

[jharvard@holy2c02302 ~]$ singularity pull lolcow.sif docker://sylabsio/lolcow
INFO:    Starting build...
Getting image source signatures
Copying blob 5ca731fc36c2 done
Copying blob 16ec32c2132b done
Copying config fd0daa4d89 done
Writing manifest to image destination
Storing signatures
2023/03/01 10:29:37  info unpack layer: sha256:16ec32c2132b43494832a05f2b02f7a822479f8250c173d0ab27b3de78b2f058
2023/03/01 10:29:38  info unpack layer: sha256:5ca731fc36c28789c5ddc3216563e8bfca2ab3ea10347e07554ebba1c953242e
INFO:    Creating SIF file...
INFO:    Build complete: lolcow.sif

Build the latest Ubuntu image from Docker Hub:

[jharvard@holy2c02302 ~]$ singularity pull ubuntu.sif docker://ubuntu
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
INFO:    Fetching OCI image...
INFO:    Extracting OCI image...
INFO:    Inserting Singularity configuration...
INFO:    Creating SIF file...
[jharvard@holy2c02302 ~]$ singularity exec ubuntu.sif head -n 1 /etc/os-release
PRETTY_NAME="Ubuntu 24.04.1 LTS"

Note that to build images from Docker Hub or another OCI registry, you can use either the build or the pull command.
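
For example, the pull command above could equivalently be written with build, which also produces a SIF file from the OCI image:

[jharvard@holy2c02302 ~]$ singularity build lolcow.sif docker://sylabsio/lolcow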

Build a Singularity Container from Singularity Definition File

Singularity supports building images from definition files using --fakeroot. This feature leverages rootless containers.

Step 1: Write/obtain a definition file. You will need a definition file specifying environment variables, packages, etc. Your SingularityCE image will be based on this file. See SingularityCE definition file docs for more details.

This is an example of the laughing cow definition file:

Bootstrap: docker
From: ubuntu:22.04

%post
    apt-get -y update
    apt-get -y install cowsay lolcat

%environment
    export LC_ALL=C
    export PATH=/usr/games:$PATH

%runscript
    date | cowsay | lolcat

Step 2: Build SingularityCE image

Build laughing cow image.

[jharvard@holy8a26602 jharvard]$ singularity build --fakeroot lolcow.sif lolcow.def
INFO: Starting build...
INFO: Fetching OCI image...
28.2MiB / 28.2MiB [=================================================================================================================================================] 100 % 27.9 MiB/s 0s
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Running post scriptlet
... omitted output ...

Running hooks in /etc/ca-certificates/update.d...
done.
INFO:    Adding environment to container
INFO:    Adding runscript
INFO:    Creating SIF file...
INFO:    Build complete: lolcow.sif
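
The resulting image can then be run; singularity run executes the %runscript defined in the definition file:

[jharvard@holy8a26602 jharvard]$ singularity run lolcow.sif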

Building a Singularity container from a Dockerfile: OCI mode

SingularityCE supports building containers from Dockerfiles in OCI mode, using a bundled version of the BuildKit container image builder found in recent versions of Docker. The result is an OCI-SIF image file (as opposed to native mode, which builds a SIF image from a Singularity definition file). OCI mode also enables the Docker-like --compat flag, which enforces a greater degree of isolation between the container and host environments for Docker/Podman/OCI compatibility.

An example OCI Dockerfile:

FROM ubuntu:22.04 

RUN apt-get -y update \ 
 && apt-get -y install cowsay lolcat

ENV LC_ALL=C PATH=/usr/games:$PATH 

ENTRYPOINT ["/bin/sh", "-c", "date | cowsay | lolcat"]

Build the OCI-SIF (Note that on the FASRC cluster the XDG_RUNTIME_DIR environment variable currently needs to be explicitly set to a node-local user-writable directory, such as shown below):

[jharvard@holy2c02302 ~]$ XDG_RUNTIME_DIR=$(mktemp -d) singularity build --oci lolcow.oci.sif Dockerfile
INFO:    singularity-buildkitd: running server on /tmp/tmp.fCjzW2QnfV/singularity-buildkitd/singularity-buildkitd-3709445.sock
... omitted output ...
INFO:    Terminating singularity-buildkitd (PID 3709477)
WARNING: removing singularity-buildkitd temporary directory /tmp/singularity-buildkitd-2716062861                                                                           
INFO:    Build complete: lolcow.oci.sif

To run the ENTRYPOINT command (equivalent to the Singularity definition file runscript):

[jharvard@holy2c02302 ~]$ singularity run --oci lolcow.oci.sif
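
Other commands can be run in the OCI-SIF with singularity exec. For instance, since the Dockerfile above adds /usr/games to PATH, the following sketch should invoke cowsay directly:

[jharvard@holy2c02302 ~]$ singularity exec --oci lolcow.oci.sif cowsay "Hello from OCI mode"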

OCI mode limitations

  • (As of SingularityCE 4.2) If the Dockerfile contains “USER root” as the last USER instruction, the singularity exec/run --fakeroot or --no-home options must be specified to use the OCI-SIF, or a tmpfs error will result (see the sketch after this list).
  • Portability note: Apptainer does not support OCI mode, and OCI-SIF files cannot be used with Apptainer.
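
As a hypothetical sketch of the first limitation above (the image name is illustrative), an OCI-SIF built from a Dockerfile ending in USER root could be run with either of:

$ singularity run --oci --fakeroot myimage.oci.sif
$ singularity run --oci --no-home myimage.oci.sif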

BioContainers

Cluster nodes automount the CernVM File System (CVMFS) at /cvmfs/singularity.galaxyproject.org/. This provides a universal file system namespace for the Singularity images of the BioContainers project, which comprises container images automatically generated from Bioconda software packages. The Singularity images are organized into a directory hierarchy following the convention:

/cvmfs/singularity.galaxyproject.org/FIRST_LETTER/SECOND_LETTER/PACKAGE_NAME:VERSION--CONDA_BUILD

For example:

singularity exec /cvmfs/singularity.galaxyproject.org/s/a/samtools:1.13--h8c37831_0 samtools --help

The Bioconda package index lists all software available in /cvmfs/singularity.galaxyproject.org/, while the BioContainers registry provides a searchable interface.

NOTE: There will be a 10-30 second delay when first accessing /cvmfs/singularity.galaxyproject.org/ on a compute node on which it is not currently mounted; in addition, there will be a delay when accessing a Singularity image on a compute node where it has not already been accessed and cached to node-local storage.
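
To see which versions of a given package are available, you can list the corresponding directory. For example, for samtools:

$ ls /cvmfs/singularity.galaxyproject.org/s/a/ | grep samtools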

BioContainer Images in Docker Hub

A small number of BioContainers images are available only in DockerHub under the biocontainers organization, and are not available on Cannon under /cvmfs/singularity.galaxyproject.org/.

See BioContainers GitHub for a complete list of BioContainers images available in DockerHub (note that many of the applications listed in that GitHub repository have since been ported to Bioconda, and are thus available in /cvmfs/singularity.galaxyproject.org, but a subset are still only available in DockerHub).

These images can be fetched and built on Cannon using the singularity pull command:

singularity pull docker://biocontainers/<image>:<tag>

For example, for the container cellpose with tag 2.1.1_cv1 (cellpose Docker Hub page):

[jharvard@holy2c02302 bio]$ singularity pull --disable-cache docker://biocontainers/cellpose:2.1.1_cv1
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
2023/03/13 15:58:16  info unpack layer: sha256:a603fa5e3b4127f210503aaa6189abf6286ee5a73deeaab460f8f33ebc6b64e2
INFO:    Creating SIF file...

The SIF image file cellpose_2.1.1_cv1.sif will be created:

[jharvard@holy2c02302 bio]$ ls -lh
total 2.5G
-rwxr-xr-x 1 jharvard jharvard_lab 2.4G Mar 13 15:59 cellpose_2.1.1_cv1.sif
-rwxr-xr-x 1 jharvard jharvard_lab  72M Mar 13 12:06 lolcow_latest.sif
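
The image can then be used like any other SIF file, for example (assuming the cellpose command-line entry point is included in the image; this is an illustrative sketch):

[jharvard@holy2c02302 bio]$ singularity exec cellpose_2.1.1_cv1.sif cellpose --help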

BioContainer and Package Tips

  • The registry https://biocontainers.pro may be slow
  • We recommend first checking the Bioconda package index, as it quickly provides a complete list of Bioconda packages, all of which have a corresponding BioContainers image in /cvmfs/singularity.galaxyproject.org/
  • If an image doesn’t exist there, there is a small chance one has been generated from a Dockerfile in the BioContainers GitHub
  • If your package is listed in the BioContainers GitHub, search for the package in Docker Hub under the biocontainers organization (e.g., search for biocontainers/<package>)

Parallel computing with Singularity

Singularity is capable of both OpenMP and MPI parallelization. OpenMP is mostly trivial: you simply need an OpenMP-enabled code and compiler, and then set the usual environment variables (such as OMP_NUM_THREADS). We have an example code on our User Codes repo. MPI, on the other hand, is much more involved.
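
As a minimal sketch of the OpenMP case (the image and executable names are illustrative), a batch script can request cores from Slurm and set OMP_NUM_THREADS accordingly before invoking the container:

#!/bin/bash
#SBATCH -p test
#SBATCH -c 8
#SBATCH -t 30
#SBATCH --mem=4000

# Use the cores allocated by Slurm for the OpenMP threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# omp_test.sif and /usr/bin/omp_test.x are illustrative names
singularity exec omp_test.sif /usr/bin/omp_test.x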

MPI Applications

The goal of the following instructions is to help you run Message Passing Interface (MPI) programs using Singularity containers on the FASRC cluster. The MPI standard is used to implement distributed parallel applications across compute nodes of a single HPC cluster, such as Cannon, or across multiple compute systems. The two major open-source implementations of MPI are MPICH (and its derivatives, such as MVAPICH) and OpenMPI. The most widely used MPI implementation on Cannon is OpenMPI.

There are several ways of developing and running MPI applications using Singularity containers, where the most popular method relies on the MPI implementation available on the host machine. This approach is named Host MPI or the Hybrid model since it uses both the MPI implementation on the host and the one in the container.

The key idea behind the Hybrid method is that when you execute a Singularity container with an MPI application, you call mpiexec, mpirun, or srun (the latter when using the Slurm scheduler) on the singularity command itself. The MPI process outside of the container then works together with the MPI inside the container to initialize the parallel job. Therefore, it is very important that the MPI flavor and version inside the container match those on the host.
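
One way to verify the match is to compare the MPI version on the host with the one inside the image (here using the openmpi_test.simg image built later on this page, where OpenMPI is installed under /opt/ompi):

$ module load gcc/10.2.0-fasrc01 openmpi/4.1.1-fasrc01
$ mpirun --version
$ singularity exec openmpi_test.simg /opt/ompi/bin/mpirun --version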

Code examples below can be found on our User Codes repo.

Example MPI Code

To illustrate how Singularity can be used with MPI applications, we will use a simple MPI code implemented in Fortran 90, mpitest.f90:

!=====================================================
! Fortran 90 MPI example: mpitest.f90
!=====================================================
program mpitest
  implicit none
  include 'mpif.h'
  integer(4) :: ierr
  integer(4) :: iproc
  integer(4) :: nproc
  integer(4) :: i
  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,iproc,ierr)
  do i = 0, nproc-1
     call MPI_BARRIER(MPI_COMM_WORLD,ierr)
     if ( iproc == i ) then
        write (6,*) 'Rank',iproc,'out of',nproc
     end if
  end do
  call MPI_FINALIZE(ierr)
  if ( iproc == 0 ) write(6,*)'End of program.'
  stop
end program mpitest

Singularity Definition File

To build Singularity images you need to write a definition file, whose exact contents will depend on the MPI flavor available on the host machine.

OpenMPI

If you intend to use OpenMPI, the definition file could look like, e.g., the one below:

Bootstrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum

%files
  mpitest.f90 /home/

%environment
  export OMPI_DIR=/opt/ompi
  export SINGULARITY_OMPI_DIR=$OMPI_DIR
  export SINGULARITYENV_APPEND_PATH=$OMPI_DIR/bin
  export SINGULARITYENV_APPEND_LD_LIBRARY_PATH=$OMPI_DIR/lib

%post
  yum -y install vim-minimal
  yum -y install gcc
  yum -y install gcc-gfortran
  yum -y install gcc-c++
  yum -y install which tar wget gzip bzip2
  yum -y install make
  yum -y install perl

  echo "Installing Open MPI ..."
  export OMPI_DIR=/opt/ompi
  export OMPI_VERSION=4.1.1
  export OMPI_URL="https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-$OMPI_VERSION.tar.bz2"
  mkdir -p /tmp/ompi
  mkdir -p /opt
  # --- Download ---
  cd /tmp/ompi
  wget -O openmpi-$OMPI_VERSION.tar.bz2 $OMPI_URL && tar -xjf openmpi-$OMPI_VERSION.tar.bz2
  # --- Compile and install ---
  cd /tmp/ompi/openmpi-$OMPI_VERSION
  ./configure --prefix=$OMPI_DIR && make -j4 && make install
  # --- Set environmental variables so we can compile our application ---
  export PATH=$OMPI_DIR/bin:$PATH
  export LD_LIBRARY_PATH=$OMPI_DIR/lib:$LD_LIBRARY_PATH
  export MANPATH=$OMPI_DIR/share/man:$MANPATH
  # --- Compile our application ---
  cd /home
  mpif90 -o mpitest.x mpitest.f90 -O2

MPICH

If you intend to use MPICH, the definition file could look like, e.g., the one below:

Bootstrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum

%files
  mpitest.f90 /home/

%environment
  export SINGULARITY_MPICH_DIR=/usr

%post
  yum -y install vim-minimal
  yum -y install gcc
  yum -y install gcc-gfortran
  yum -y install gcc-c++
  yum -y install which tar wget gzip
  yum -y install make
  cd /root/
  wget http://www.mpich.org/static/downloads/3.1.4/mpich-3.1.4.tar.gz
  tar xvfz mpich-3.1.4.tar.gz
  cd mpich-3.1.4/
  ./configure --prefix=/usr && make -j2 && make install
  cd /home
  mpif90 -o mpitest.x mpitest.f90 -O2
  cp mpitest.x /usr/bin/

Building Singularity Image

You can use the below commands to build your Singularity images, e.g.:

# --- Building the OpenMPI based image ---
$ singularity build openmpi_test.simg openmpi_test_centos7.def
# --- Building the MPICH based image ---
$ singularity build mpich_test.simg mpich_test.def

These will generate the Singularity image files openmpi_test.simg and mpich_test.simg respectively.

Executing MPI Applications with Singularity

On the FASRC cluster the standard way to execute MPI applications is through a batch-job submission script. Below are two examples: one using OpenMPI and another using MPICH.

OpenMPI
#!/bin/bash
#SBATCH -p test
#SBATCH -n 8
#SBATCH -J mpi_test
#SBATCH -o mpi_test.out
#SBATCH -e mpi_test.err
#SBATCH -t 30
#SBATCH --mem-per-cpu=1000

# --- Set up environment ---
export UCX_TLS=ib
export PMIX_MCA_gds=hash
export OMPI_MCA_btl_tcp_if_include=ib0
module load gcc/10.2.0-fasrc01 
module load openmpi/4.1.1-fasrc01

# --- Run the MPI application in the container ---
srun -n 8 --mpi=pmix singularity exec openmpi_test.simg /home/mpitest.x

Note: Please notice that the version of the OpenMPI implementation used on the host needs to match the one in the Singularity container. In this case it is version 4.1.1.

If the above script is named run.sbatch.ompi, the MPI Singularity job is submitted as usual with:

$ sbatch run.sbatch.ompi

MPICH
#!/bin/bash
#SBATCH -p test
#SBATCH -n 8
#SBATCH -J mpi_test
#SBATCH -o mpi_test.out
#SBATCH -e mpi_test.err
#SBATCH -t 30
#SBATCH --mem-per-cpu=1000

# --- Set up environment ---
module load python/3.8.5-fasrc01
source activate python3_env1

# --- Run the MPI application in the container ---
srun -n 8 --mpi=pmi2 singularity exec mpich_test.simg /usr/bin/mpitest.x

If the above script is named run.sbatch.mpich, the MPI Singularity job is submitted as usual with:

$ sbatch run.sbatch.mpich

Note: Please notice that we don’t have MPICH installed as a software module on the FASRC cluster, and therefore this example assumes that MPICH is installed in your user or lab environment. The easiest way to do this is through a conda environment. You can find more information on how to set up conda environments in our documentation.

Provided you have set up and activated a conda environment named, e.g., python3_env1, Mpich version 3.1.4 can be installed with:

$ conda install mpich==3.1.4
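
After installation, you can confirm the MPICH version available in the activated environment, e.g.:

$ source activate python3_env1
$ mpirun --version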

Example Output

$ cat mpi_test.out
 Rank           0 out of           8
 Rank           1 out of           8
 Rank           2 out of           8
 Rank           3 out of           8
 Rank           4 out of           8
 Rank           5 out of           8
 Rank           6 out of           8
 Rank           7 out of           8
 End of program.

Compiling Code with OpenMPI inside Singularity Container

To compile inside the Singularity container, we need to request a compute node to run Singularity:

$ salloc -p test --time=0:30:00 --mem=1000 -n 1

Using the file compile_openmpi.sh, you can compile mpitest.f90 by executing bash compile_openmpi.sh inside the container openmpi_test.simg:

$ cat compile_openmpi.sh
#!/bin/bash

export PATH=$OMPI_DIR/bin:$PATH
export LD_LIBRARY_PATH=$OMPI_DIR/lib:$LD_LIBRARY_PATH

# compile fortran program
mpif90 -o mpitest.x mpitest.f90 -O2

# compile c program
mpicc -o mpitest.exe mpitest.c

$ singularity exec openmpi_test.simg bash compile_openmpi.sh

In compile_openmpi.sh, we also included the compilation command for a C program (mpitest.c).
