Podman – FASRC DOCS

Podman

RC Admin — Thu, 23 Jan 2025 15:49:49 +0000

Introduction

Podman is an Open Containers Initiative (OCI) container toolchain developed by RedHat. Unlike its popular OCI cousin Docker, it is daemonless making it easier to use with resource schedulers like Slurm. Podman maintains command line interface (CLI) that is very similar to Docker. On the FASRC cluster the docker command runs podman under the hood and many docker commands just work with podman though with some exceptions. Note that this document uses the term container to mean OCI container. Besides Podman containers FASRC also supports Singularity.

Normally podman requires privileged access. However on the FASRC clusters we have enabled rootless podman, alleviating the requirement. We recommend reading our document on rootless containers before proceeding further so you understand how it works and its limitations.

Podman Documentation

The official Podman Documentation provides the latest information on how to use Podman. On this page we will merely highlight specific useful commands and features/quirks specific to the FASRC cluster. You can get command line help pages by running man podman or podman --help.

Working with Podman

To start working with podman, first get an interactive session either via salloc or via Open OnDemand. Once you have that session then you can start working with your container image. The basic commands we will cover here are:

pull: Download a container image from a container registry
images: List downloaded images
run: Run a command in a new container
build: Create a container image from a Dockerfile/Containerfile
push: push a container image to a container registry

For these examples we will use the lolcow and ubuntu images from DockerHub.

WARNING: We do not recommend overriding the default configurations using: $HOME/.config/containers/storage.conf If you do so, it is at your own risk.

pull

podman pull fetches the specified container image and extracts it into node-local storage (/tmp/container-user- by default on the FASRC cluster). This step is optional, as podman will automatically download an image specified in a podman run, podman build, or podman shell command.

[jharvard@holy8a26601 ~]$ podman pull docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done | 
Copying blob 9fb6c798fa41 done | 
Copying blob 3b61febd4aef done | 
Copying blob 9d99b9777eb0 done | 
Copying blob d010c8cf75d7 done | 
Copying blob 7fac07fb303e done | 
Copying config 577c1fe8e6 done | 
Writing manifest to image destination
577c1fe8e6d84360932b51767b65567550141af0801ff6d24ad10963e40472c5

images

podman images lists the images that are already available on the node (in /tmp/container-user-)

[jharvard@holy8a26601 ~]$ podman images
REPOSITORY                  TAG         IMAGE ID      CREATED      SIZE
docker.io/godlovedc/lolcow  latest      577c1fe8e6d8  7 years ago  248 MB

run

Podman containers may contain an entrypoint script that will execute when the container is run. To run the container:

[jharvard@holy8a26601 ~]$ podman run -it docker://godlovedc/lolcow
_______________________________________
/ Your society will be sought by people \
\ of taste and refinement. /
---------------------------------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||

To view the entrypoint script for a podman container:

[jharvard@holy8a26601 ~]$ podman inspect -f 'Entrypoint: {{.Config.Entrypoint}}\nCommand: {{.Config.Cmd}}' lolcow
Entrypoint: [/bin/sh -c fortune | cowsay | lolcat]
Command: []

shell

To start a shell inside a new container, specify the podman run -it --entrypoint bash options. -it effectively provides an interactive session, while --entrypoint bash invokes the bash shell (bash can be substituted with another shell program that exists in the container image).

[jharvard@holy8a26601 ~]$ podman run -it --entrypoint bash docker://godlovedc/lolcow
root@holy8a26601:/#

GPU Example

First, start an interactive job on a gpu partition. Then invoke podman run with the --device nvidia.com/gpu=all option:

[jharvard@holygpu7c26306 ~]$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Wed Jan 22 15:41:58 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-SXM4-40GB On | 00000000:CA:00.0 Off | On |
| N/A 27C P0 66W / 400W | N/A | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 0 2 0 0 | 37MiB / 19968MiB | 42 0 | 3 0 2 0 0 |
| | 0MiB / 32767MiB | | |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
WARN[0001] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus

Batch Jobs

Podman containers can also be executed as part of a normal batch job as you would any other command. Simply include the command as part of the sbatch script. As an example here is a sample podman.sbatch:

#!/bin/bash
#SBATCH -J podman_test
#SBATCH -o podman_test.out
#SBATCH -e podman_test.err
#SBATCH -p test
#SBATCH -t 0-00:10
#SBATCH -c 1
#SBATCH --mem=4G

# Podman command line options
podman run docker://godlovedc/lolcow

When submitted to the cluster as a batch job:

[jharvard@holylogin08 ~]$ sbatch podman.sbatch

Generates the podman_test.out which contains:

[jharvard@holylogin08 ~]$ cat podman_test.out
____________________________________
< Don't read everything you believe. >
------------------------------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||

Accessing Files

Each podman container operates within its own isolated filesystem tree in /tmp/container-user-/storage. However, if needed, host file systems can be explicitly shared with containers by using the --volume option when starting a container (this is unlike Singularity which is set up to automatically bind several default filesystems). This option allows you to bind-mount a directory or file from the host into the container, granting the container access to that path. To access files on the host from inside the container, bind host file(s)/directory(ies) into the container using the --volume option. For instance, to access netscratch from the container:

[jharvard@holy8a26602 ~]$ podman run -it --entrypoint bash --volume /n/netscratch:/n/netscratch docker://ubuntu
root@holy8a26602:/# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 397G 6.5G 391G 2% /
tmpfs 64M 0 64M 0% /dev
netscratch-ib01.rc.fas.harvard.edu:/netscratch/C 3.6P 1.8P 1.9P 49% /n/netscratch
/dev/mapper/vg_root-lv_scratch 397G 6.5G 391G 2% /run/secrets
shm 63M 0 63M 0% /dev/shm
devtmpfs 504G 0 504G 0% /dev/tty

Ownership of files as seen from the host that are created by a process in the container depend on the user ID (UID) of the creating process in the container, either:

The host (cluster) user, if the container user is:

root (UID 0) – this is often the default
podman run --userns=keep-id is specified, so the host user and primary group ID are used for the container user (similar to SingularityCE in the default native mode)

podman run --userns=keep-id:uid=,gid= is specified, using the specified UID/GID for the container user and mapping it to the host/cluster user’s UID/GID.E.g., in the following example, the “node” user in the container (UID=1000, GID=1000) creates a file that is (as seen from the host) owned by the host user:

$ podman run -it --rm --user=node --entrypoint=id docker.io/library/node:22
uid=1000(node) gid=1000(node) groups=1000(node)
$ podman run -it --rm --volume /n/netscratch:/n/netscratch --userns=keep-id:uid=1000,gid=1000 --entrypoint=bash docker.io/library/node:22
node@host:/$ touch /n/netscratch/jharvard_lab/Lab/jharvard/myfile
node@host:/$ ls -l /n/netscratch/jharvard_lab/Lab/jharvard/myfile
-rw-r--r--. 1 node node 0 Apr 7 16:05 /n/netscratch/jharvard_lab/Lab/jharvard/myfile
node@host:/$ exit
$ ls -ld /n/netscratch/jharvard_lab/Lab/jharvard/myfile
-rw-r--r--. 1 jharvard jharvard_lab 0 Apr 7 12:05 /n/netscratch/jharvard_lab/Lab/jharvard/myfile

Otherwise the subuid/subgid is associated with the container-uid/container-gid (see rootless containers). Only filesystems that can resolve your subuid’s can be written to from a podman container (e.g. NFS file systems like /n/netscratch and home directories, or node-local filesystems like /scratch or /tmp; but not Lustre filesystems like holylabs) and only locations with “other” read/write/execute permissions can be utilized (e.g. the Everyone directory).

Environment Variables

A Podman container does not inherit environment variables from the host environment. Any environment variables that are not defined by the container image must be explicitly set with the –env option:

[jharvard@holy8a26602 ~]$ podman run -it --rm --env MY_VAR=test python:3.13-alpine python3 -c 'import os; print(os.environ["MY_VAR"])'
test

Building Your Own Podman Container

You can build or import a Podman container in several different ways. Common methods include:

Download an existing OCI container image located in Docker Hub or another OCI container registry (e.g., quay.io, NVIDIA NGC Catalog, GitHub Container Registry).
Build a Podman image from a Containerfile/Dockerfile.

Images are stored by default at /tmp/containers-user-/storage. You can find out more about the specific paths by running the podman info command. Since the default path is in /tmp that means that containers will only exist for the duration of the job and then the system will clean up the space.

Downloading OCI Container Image From Registry

To download a OCI container image from a registry simply use the pull command:

[jharvard@holy8a26602 ~]$ podman pull docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done | 
Copying blob 9fb6c798fa41 done | 
Copying blob 3b61febd4aef done | 
Copying blob 9d99b9777eb0 done | 
Copying blob d010c8cf75d7 done | 
Copying blob 7fac07fb303e done | 
Copying config 577c1fe8e6 done | 
Writing manifest to image destination
577c1fe8e6d84360932b51767b65567550141af0801ff6d24ad10963e40472c5
WARN[0006] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus 
[jharvard@holy8a26602 ~]$ podman image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/godlovedc/lolcow latest 577c1fe8e6d8 7 years ago 248 MB

Build Podman Image From Dockerfile/Containerfile

Podman can build container images from a Dockerfile/Containerfile (podman prefers the generic term Containerfile, and podman build without -f will first check for the existence of Containerfile in the current working directory, falling back to Dockerfile if one doesn’t exist). To build first write your Containerfile:

FROM ubuntu:22.04

RUN apt-get -y update \
  && apt-get -y install cowsay lolcat\
  && rm -rf /var/lib/apt/lists/*
ENV LC_ALL=C PATH=/usr/games:$PATH 

ENTRYPOINT ["/bin/sh", "-c", "date | cowsay | lolcat"]

Then run the build command (assuming Dockerfile or Containerfile in the current working directory):

[jharvard@holy8a26602 ~]$ podman build -t localhost/lolcow
STEP 1/4: FROM ubuntu:22.04
Resolved "ubuntu" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/ubuntu:22.04...
Getting image source signatures
Copying blob 6414378b6477 done | 
Copying config 97271d29cb done | 
Writing manifest to image destination
STEP 2/4: RUN apt-get -y update && apt-get -y install cowsay lolcat

... omitted output ...

Running hooks in /etc/ca-certificates/update.d...
done.
--> a41765f5337a
STEP 3/4: ENV LC_ALL=C PATH=/usr/games:$PATH
--> e9eead916e20
STEP 4/4: ENTRYPOINT ["/bin/sh", "-c", "date | cowsay | lolcat"]
COMMIT
--> 51e919dd571f
51e919dd571f1c8a760ef54c746dcb190659bdd353cbdaa1d261ba8d50694d24

Saving/loading a container image on another compute node

Podman container images are stored on a node-local filesystem (/tmp/container-user-). Any container images built / pulled on one node that are needed on another node must be saved to a data storage location that is accessible to all compute nodes in the FAS RC cluster. The podman save command can be used to accomplish this.

[jharvard@holy8a26602 ~]$ podman save --format oci-archive -o lolcow.tar localhost/lolcow
[jharvard@holy8a26602 ~]$ ls -lh lolcow.tar 
-rw-r--r--. 1 jharvard jharvard_lab 57M Jan 27 11:37 lolcow.tar

Note: omitting --format oci-archive saves the file in the docker-archive format, which is uncompressed, and thus faster to save/load though larger in size.

From another compute node, podman load extracts the docker- or oci-archive to the node-local /tmp/container-user-, where it can be used by podman:

[jharvard@holy8a26603 ~]$ podman images
REPOSITORY  TAG         IMAGE ID    CREATED     SIZE
[jharvard@holy8a26603 ~]$ podman load -i lolcow.tar
Getting image source signatures
Copying blob 163070f105c3 done   | 
Copying blob f88085971e43 done   | 
Copying config e9749e43bc done   | 
Writing manifest to image destination
Loaded image: localhost/lolcow:latest
[jharvard@holy8a26603 ~]$ podman images
REPOSITORY        TAG         IMAGE ID      CREATED        SIZE
localhost/lolcow  latest      e9749e43bc74  6 minutes ago  172 MB

Pushing a container image to a container registry

To make a container image built on the FASRC cluster available outside the FASRC cluster, the container image can be pushed to a container registry. Popular container registries with a free tier include Docker Hub and the GitHub Container Registry.

This example illustrates the use of the GitHub Container Registry, and assumes a GitHub account.

Note: The GitHub Container Registry is a part of the GitHub Packages ecosystem

Create a Personal access token (classic) with write:packages scope (this implicitly adds read:packages for pulling private container images):
https://github.com/settings/tokens/new?scopes=write:packages
Authenticate to ghcr.io, using the authentication token generated in step 1 as the “password” (replace “
” with your GitHub username):
```
[jharvard@holy8a26603 ~]$ podman login -u  ghcr.io
Password: 
Login succeeded!
```
Ensure the image has been named ghcr.io//: (where “
” is either your GitHub username, or an organization that you are a member of and have permission to
using the podman tag command to add a name to an existing local image if it needed (Note: the GitHub owner must be all lower-case (e.g., jharvard instead of JHarvard)):
```
[jharvard@holy8a26603 ~]$ podman tag localhost/lolcow:latest ghcr.io//lolcow:latest
```
Push the image to the container registry:

[jharvard@holy8a26603 ~]$ podman push ghcr.io//lolcow:latest
Getting image source signatures
Copying blob 2573e0d81582 done   |
… 
Writing manifest to image destination

By default, the container image will be private. To change the visibility to “public”, access the package from the list at https://github.com/GITHUB_OWNER?tab=packages and configure the package settings (see Configuring a package’s access control and visibility).

Podman Compose: Run multi-container apps

podman compose can be used to run multi-container applications (e.g. a web service + database) on a single compute node. The containers to be managed and their relationships are defined in a Compose file (typically `compose.yaml` or `docker-compose.yaml`).

Using Docker Compose with Podman

podman compose is a wrapper that uses a compose provider. Two compose providers supported are:

podman-compose – The podman-compose compose provider is still under active development, and lacks some functionality found in the original Docker Compose. podman-compose is preinstalled on the FASRC cluster (/usr/bin/podman-compose).
docker-compose For greater compatibility with some Compose files, podman compose can use Docker Compose as its compose provider. The setup involves installing the docker-compose CLI plugin in your home directory:mkdir -p ~/.docker/cli-plugins curl -o ~/.docker/cli-plugins/docker-compose -L https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 chmod +x ~/.docker/cli-plugins/docker-compose
Before running podman compose with the docker-compose compose provider on a compute node, a podman-system-service must be started on that compute node to provide a Docker-daemon-compatible API for the docker-compose compose provider:[jharvard@holy8a24101 ~]$ podman system service -t 0 &

Example: HPC-hosted Private LLM Service

The following example deploys a private LLM service using Open WebUI front-end web interface connected to a llama.cpp HTTP server backend hosting the Meta Llama-3.2-1B-Instruct model. Only CPU resources are used for convenient deployment.

Launch an Open OnDemand Remote Desktop with 4 CPUs and 8 GB memory.
(Alternatively, SSH tunneling or VS Code with port forwarding can be used to access web applications running on FAS RC compute nodes)

Save the following Compose configuration to a file called compose.yaml on the FASRC cluster:

services:
  llama-server:
    image: ghcr.io/ggml-org/llama.cpp:server
    tty: true
    command: >
      --hf-repo QuantFactory/Llama-3.2-1B-Instruct-GGUF
      --hf-file Llama-3.2-1B-Instruct.Q8_0.gguf
      --threads ${SLURM_CPUS_PER_TASK:-1}
      --ctx-size 8192
    environment:
      LLAMA_CACHE: /cache
    volumes:
      - cache:/cache

open-webui:
  image: ghcr.io/open-webui/open-webui:main
  depends_on:
    - llama-server
  environment:
    OPENAI_API_BASE_URL: http://llama-server:8080/v1
    WEBUI_SECRET_KEY:
  volumes:
    - openwebui:/app/backend/data
# set the OPEN_WEBUI_PORT environment variable to use a different port
  ports:
    - "127.0.0.1:${OPEN_WEBUI_PORT:-8080}:8080"

volumes:
  cache: {}
  openwebui: {}

Open a terminal in the Remote Desktop session, navigate to the directory containing the compose.yaml file, and issue the following command to start the containers:
podman compose up
Note: works with the default podman-compose compose provider; see above for additional setup to use docker-compose instead
Once the logs indicate the open-webui service is ready (i.e., the text [open-webui] | INFO: Started server process [1] appears in the logs), open a web browser on the remote desktop (Applications > Internet > FireFox) and access http://localhost:8080/. Follow the prompt to create an admin account before accessing the Open WebUI chat interface to interact with the LLM.

Known Limitations

The following are known limitations of rootless Podman on the FASRC cluster.

–cpuset-cpus option is unsupported

Workarounds include:

Slurm srun -c, –cpus-per-task option within a Slurm job to launch podman with CPU binding to a specified number of allocated CPUs; e.g.
srun -c 2 podman run ...
The taskset command to set the CPU affinity:
user@holy8a24101 ~]$ taskset -cp $$
pid 368917’s current affinity list: 102-105
user@holy8a24101 test]$ taskset -c 102-103 podman run …

`podman rm` doesn’t work

Symptom: a container is stuck in the “stopping” state and cannot be removed.

$ podman ps
$ podman ps -a
CONTAINER ID ... STATUS ...
2eaa0ca21480 ... Stopping ...
$ podman rm 2eaa0ca21480
...
Error: cannot remove container 2eaa0ca21480... as it is stopping - running or paused containers cannot be removed without force: container state improper

Solution: Use podman kill before podman rm

$ podman kill 2eaa0ca21480
2eaa0ca21480
$ podman rm 2eaa0ca21480
2eaa0ca21480
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Online Trainings Materials

FASRC Training Materials

References

Containers

RC Admin — Fri, 10 Jan 2025 20:28:52 +0000

Introduction

Containers have become the industry standard method for managing complex software environments, especially ones with bespoke configuration options. In brief, a container is a self-contained environment and software stack that runs on the host operating system (OS). They can allow users to use a variety of base operating systems (e.g., Ubuntu) and their software packages aside from the host OS the cluster nodes run (Rocky Linux). One can even impersonate root inside the container to allow for highly customized builds and a high level of control of the environment.

While containers allow for the management of sophisticated software stacks, they are not a panacea. As light as containers are, they still create a performance penalty for use, as the more layers you put between the code and the hardware, the more inefficiencies pile up. In addition, host filesystem access and various other permission issues can be tricky. Other incompatibilities can arise between the OS of the container and the OS of the compute node.

Still, with these provisos in mind, containers are an excellent tool for software management. Containers exist for many software packages, making software installation faster and more trouble-free. Containers also make it easy to record and share the exact stack of software required for a workflow, allowing other researchers to more-easily reproduce research results in order to validate and extend them.

Types of Containers

There are two main types of containers. The first is the industry standard OCI (Open Container Initiative) container, popularized by Docker. Docker uses a client-server architecture, with one (usually) privileged background server process (or “daemon” process, called “dockerd”) per host. If run on a multi-tenant system (e.g., HPC cluster such as Cannon), this results in a security issue in that users who interact with the privileged daemon process could access files owned by other users. Additionally, on an HPC cluster, the docker daemon process does not integrate with Slurm resource allocation facilities.

Podman, a daemonless OCI container toolchain developed by RedHat to address these issues, is installed on the FASRC cluster. The Podman CLI (command-line interface) was designed to be largely compatible with the Docker CLI, and on the FASRC cluster, the docker command runs podman under the hood. Many docker commands will just work with podman, though there are some differences.

The second is Singularity. Singularity grew out of the need for additional security in shared user contexts (like you find on a cluster). Since Docker normally requires the user to run as root, Singularity was created to alleviate this requirement and bring the advantages of containerization to a broader context. There are a couple of implementations of Singularity, and on the cluster, we use SingularityCE (the other implementation is Apptainer). Singularity has the ability to convert OCI (docker) images into Singularity Image Format (SIF) files. Singularity images have the advantage of being distributable as a single read-only file; on an HPC cluster, this can be located on a shared filesystem, which can be easily launched by processes on different nodes. Additionally, Singularity containers can run as the user who launched them without elevated privileges.

Rootless

Normally, building a container requires root permissions, and in the case of Podman/Docker, the containers themselves would ordinarily be launched by the root user. While this may be fine in a cloud context, it is not in a shared resource context like a HPC cluster. Rootless is the solution to this problem.

Rootless essentially allows the user to spoof being root inside the container. It does this via a Linux feature called subuid (short for Subordinate User ID) and subgid (Subordinate Group ID). This feature allows a range of uid’s (a unique integer assigned to each user name used for permissions identification) and gid’s (unique integer for groups) to be subordinated to another uid. An example is illustrative. Let’s say you are userA with a uid of 20000. You are assigned the subuid range of 1020001-1021000. When you run your container, the following mapping happens:

In the Container [username(uid)]	Outside the Container [username(uid)]
root(0)	userA(20000)
apache(48)	1020048
ubuntu(1000)	1021000

Thus, you can see that while you are inside the container, you pretend to be another user and have all the privileges of that user in the container. Outside the container, though, you are acting as your user, and the uid’s subordinated to your user.

A few notes are important here:

The subuid/subgid range assigned to each user does not overlap the uid/gid or subuid/subgid range assigned to any other user or group.
While you may be spoofing a specific user inside of the container, the process outside the container sees you as your normal uid or subuid. Thus, if you use normal Linux tools like top or ps outside the container, you will notice that the id’s that show up are your uid and subuid.
Filesystems, since they are external, also see you as your normal uid/gid and subuid/subgid. So files created as root in the container will show up on the storage as owned by your uid/gid. Files created by other users in the container will show up as their mapped subuid/subgid.

Rootless is very powerful and allows you to both build containers on the cluster, as well as running Podman/Docker containers right out of the box. If you want to see what your subuid mapping is, you can find the mappings at /etc/subuid and /etc/subgid. You can find your uid by running the id command, which you can then use to look up your map (e.g., with the command: grep "^$(id -u):" /etc/subuid).

Rootless and Filesystems

Two more crucial notes about filesystems. The first is that since subuids are not part of our normal authentication system, it means that filesystems that cannot resolve subids will not permit them access. In particular Lustre (e.g., /n/holylabs) does not recognize subuids and since it cannot resolve them, it will not permit them. NFS filesystems (e.g., /n/netscratch) do not have this problem.

The second is that even if you can get into the filesystem, you may not be able to traverse into locations that do not have world access (o+rx) enabled. This is because the filesystem cannot resolve your user group or user name, does not see you as a valid member of the group, and thus will reject you. As such, it is imperative to test and validate filesystem access for filesystems you intend to map into the container and ensure that access is achievable. A simple way to ensure this is to utilize the Everyone directory which exists for most filesystems on the cluster. Note that your home directory is not world accessible for security reasons and thus cannot be used.

Getting Started

The first step in utilizing a container on the cluster is to submit a job. Login nodes are not appropriate places for development. If you are just beginning, the easiest method is to either get a command line interactive session via salloc, or launch an OOD session.

Once you have a session, you can then launch your container:

Singularity

[jharvard@holy8a26602 ~]$ singularity run docker://godlovedc/lolcow
INFO: Downloading library image to tmp cache: /scratch/sbuild-tmp-cache-701047440
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
INFO: Fetching OCI image...
45.3MiB / 45.3MiB [============================================================================================================================] 100 % 21.5 MiB/s 0s
53.7MiB / 53.7MiB [============================================================================================================================] 100 % 21.5 MiB/s 0s
INFO: Extracting OCI image...
2025/01/09 10:49:52 warn rootless{dev/agpgart} creating empty file in place of device 10:175
2025/01/09 10:49:52 warn rootless{dev/audio} creating empty file in place of device 14:4
2025/01/09 10:49:52 warn rootless{dev/audio1} creating empty file in place of device 14:20
INFO: Inserting Singularity configuration...
INFO: Creating SIF file...
_________________________________________
/ Q: What do you call a principal female \
| opera singer whose high C |
| |
| is lower than those of other principal |
\ female opera singers? A: A deep C diva. /
-----------------------------------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||

Podman

[jharvard@holy8a24601 ~]$ podman run docker://godlovedc/lolcow
Trying to pull docker.io/godlovedc/lolcow:latest...
Getting image source signatures
Copying blob 8e860504ff1e done |
Copying blob 9fb6c798fa41 done |
Copying blob 3b61febd4aef done |
Copying blob 9d99b9777eb0 done |
Copying blob d010c8cf75d7 done |
Copying blob 7fac07fb303e done |
Copying config 577c1fe8e6 done |
Writing manifest to image destination
_____________________________
< Give him an evasive answer. >
-----------------------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||

Shell

If you want to get a shell prompt in a container do:

Singularity

[jharvard@holy8a26602 ~]$ singularity shell docker://godlovedc/lolcow
Singularity>

Podman

[jharvard@holy8a26601 ~]$ podman run --rm -it --entrypoint bash docker://godlovedc/lolcow
root@holy8a26601:/#

GPU

If you want to use a GPU in a container first start a job reserving a GPU on a gpu node. Then do the following:

Singularity

You will want to add the --nv flag for singularity:

[jharvard@holygpu7c26306 ~]$ singularity exec --nv docker://godlovedc/lolcow /bin/bash
Singularity> nvidia-smi
Fri Jan 10 15:50:20 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-SXM4-40GB On | 00000000:4B:00.0 Off | On |
| N/A 24C P0 43W / 400W | 74MiB / 40960MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 0 1 0 0 | 37MiB / 19968MiB | 42 0 | 3 0 2 0 0 |
| | 0MiB / 32767MiB | | |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Singularity>

Podman

For podman you need to add --device nvidia.com/gpu=all:

[jharvard@holygpu7c26305 ~]$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Fri Jan 10 20:26:57 2025 
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-SXM4-40GB On | 00000000:31:00.0 Off | On |
| N/A 25C P0 47W / 400W | N/A | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 0 2 0 0 | 37MiB / 19968MiB | 42 0 | 3 0 2 0 0 |
| | 0MiB / 32767MiB | | |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
WARN[0005] Failed to add pause process to systemd sandbox cgroup: dbus: couldn't determine address of session bus

Docker Rate Limiting

Docker Hub limits the number of pulls anonymous accounts can make. If you hit either an error of:

ERROR: toomanyrequests: Too Many Requests.

You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limits.

you will need to create a Docker account to increase your limit. See the Docker documentation for more details.

Once you have a Docker account, you can authenticate with Docker Hub with your Docker Hub account (not FASRC account) and then run a Docker container.

Singularity

singularity remote login --username <dockerhub_username> docker://docker.io

Podman

podman login docker.io

Advanced Usage

For advanced usage tips such as how to build your own containers, see our specific container software pages:

Podman – FASRC DOCS

Podman

Introduction

Podman Documentation

Working with Podman

pull

images

run

shell

GPU Example

Batch Jobs

Accessing Files

Environment Variables

Building Your Own Podman Container

Downloading OCI Container Image From Registry

Build Podman Image From Dockerfile/Containerfile

Saving/loading a container image on another compute node

Pushing a container image to a container registry

Podman Compose: Run multi-container apps

Using Docker Compose with Podman

Example: HPC-hosted Private LLM Service

Known Limitations

–cpuset-cpus option is unsupported

podman rm doesn’t work

Online Trainings Materials

References

Containers

Introduction

Types of Containers

Rootless

Rootless and Filesystems

Getting Started

Shell

GPU

Docker Rate Limiting

Advanced Usage

`podman rm` doesn’t work