Python Package Installation
https://docs.rc.fas.harvard.edu/kb/python-package-installation/

Description

Python packages on the cluster are primarily managed with Mamba.  Direct use of pip, outside of a virtual environment, is discouraged on the FASRC clusters.

Mamba is a package manager that is a drop-in replacement for Conda, and is generally faster and better at resolving dependencies:

  • Speed: Mamba is written in C++, which makes it faster than Conda. Mamba uses parallel processing and efficient code to install packages faster.
  • Compatibility: Mamba is fully compatible with Conda, so it can use the same commands, packages, and environment configurations.
  • Cross-platform support: Mamba works on Mac, Linux and Windows.
  • Dependency resolution: Mamba is better at resolving dependencies than Conda.
  • Environment creation: Mamba is faster at creating environments, especially large ones.
  • Package repository: Mamba defaults to conda-forge (via the Miniforge/Mambaforge distribution), which provides up-to-date, community-maintained packages.

Important:
Anaconda is currently reviewing its Terms of Service for Academia and Research and is expected to conclude the update by the end of 2024. It is possible that Conda will no longer be free for non-profit academic research at institutions with more than 200 employees, and that downloading packages through Anaconda’s Main channel will incur costs. We therefore recommend switching to the open-source conda-forge channel for package distribution whenever possible. Our python module is built with the Miniforge3 distribution, which has conda-forge set as its default channel.

Mamba is a drop-in replacement for Conda and uses the same commands and configuration options; you can swap almost all commands between conda and mamba. By default, mamba uses the free conda-forge package repository. (In this doc, we will generally refer only to mamba.)
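If you would like to confirm which channels your installation uses, you can check with conda's configuration tooling (an optional sanity check; the conda command is included in the Miniforge-based python module):

$ module load python
$ conda config --show channels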

Usage

mamba is available on the FASRC cluster as a software module, either as Mambaforge or as python/3*, which is aliased to mamba. One can access it by loading the corresponding module, for example:

$ module load python/{PYTHON_VERS}-fasrc01
$ python -V
Python {PYTHON_VERS}

Environments

You can create virtual environments with mamba in the same way as with conda. However, it is important to start an interactive session before creating an environment and installing the desired packages, in the following manner:

$ salloc --partition test --nodes=1 --cpus-per-task=2 --mem=4GB --time=0-02:00:00

$ module load python/{PYTHON_VERS}-fasrc01

Create an environment using mamba: $ mamba create -n <ENV_NAME>

You can also install packages with the create command, which can speed up your setup time significantly. For example:

$ mamba create -n <ENV_NAME> <PACKAGES> 
$ mamba create -n python_env1 python={PYTHON_VERS} pip wheel

You must activate an environment in order to use it or install any packages in it. To activate and use an environment: $ mamba activate python_env1

To deactivate an active environment: $ mamba deactivate
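To list all of the environments you have created (helpful if you forget a name):

$ mamba env list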

You can list the packages currently installed in the active mamba or conda environment with: $ mamba list

To install new packages in the environment with mamba using the default channel:

 $ mamba install -y <PACKAGES>

For example: $ mamba install -y numpy 

To install a package from a specific channel instead:

$ mamba install --channel <CHANNEL-NAME> -y <PACKAGE>

For example: $ mamba install --channel conda-forge boltons

To uninstall packages: $ mamba uninstall <PACKAGES>

For additional features, please refer to the Mamba documentation.

Pip Installs

Avoid using pip outside of a mamba environment on any FASRC cluster. If you run pip install outside of a mamba environment, the installed packages will be placed in your $HOME/.local directory, which can lead to package conflicts and may cause some packages to fail to install or load correctly via mamba.

For example, if your environment name is python_env1:

$ module load python
$ mamba activate python_env1
$ pip install <package_name>
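To confirm that pip is installing into the active environment rather than $HOME/.local, you can check which pip executable is being used (a quick, optional sanity check):

$ which pip        # should point inside your mamba environment
$ pip --version    # shows the site-packages location pip will use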

Best Practices

Use mamba environment in Jupyter Notebooks

If you would like to use a mamba environment as a kernel in a Jupyter Notebook on Open OnDemand (Cannon OOD or FASSE OOD), you have to install the packages ipykernel and nb_conda_kernels. These packages allow Jupyter to detect mamba environments that you created from the command line.

For example, if your environment name is python_env1:

$ module load python
$ mamba activate python_env1
$ mamba install ipykernel nb_conda_kernels
After these packages are installed, launch a new Jupyter Notebook job (existing Jupyter Notebook jobs will fail to “see” this environment). Then:
  1. Open a Jupyter Notebook (a .ipynb file)
  2. On the top menu, click Kernel -> Change kernel -> select the conda environment

Mamba environments in holylabs space

With mamba, use the -p or --prefix option to write environment files to a holylabs share location. Don’t use your home directory, which has very low performance due to filesystem latency. Using a lab share location also lets you share your conda environment with other people on the cluster. Keep in mind that you will need to create the destination directory and specify the Python version to use.
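For example, first make sure the destination directory exists (the path below is illustrative; substitute your own lab share):

$ mkdir -p /n/holylabs/LABS/{YOUR_LAB}/Lab/envs

Then create and activate the environment there: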

$ mamba create --prefix /n/holylabs/LABS/{YOUR_LAB}/Lab/envs python={PYTHON_VERS}

$ mamba activate /n/holylabs/LABS/{YOUR_LAB}/Lab/envs

Troubleshooting

Interactive vs. batch jobs

If your code works in an interactive job but fails in a Slurm batch job, check the following:

  1. You are submitting your jobs from within a mamba environment.
    Solution 1: Deactivate your environment with the command mamba deactivate and submit the job or
    Solution 2: Open another terminal and submit the job from outside the environment.

  2. Check if your ~/.bashrc or ~/.bash_profile files have a section of conda initialize or a source activate command. The conda initialize section is known to create issues on the FASRC clusters.
    Solution: Delete the section between the two conda initialize markers (see the example block below). If you have source activate in those files, delete it or comment it out.
    For more information on ~/.bashrc files, see https://docs.rc.fas.harvard.edu/kb/editing-your-bashrc/
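For reference, the block written by conda init in ~/.bashrc typically looks like the following (exact contents vary by installation; remove everything between and including the markers):

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
# ... initialization commands specific to your installation ...
# <<< conda initialize <<<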

Jupyter Notebook or JupyterLab on Open OnDemand/VDI problems

See Jupyter Notebook or JupyterLab on Open OnDemand/VDI troubleshooting section.

R and RStudio on the FASRC clusters
https://docs.rc.fas.harvard.edu/kb/r-and-rstudio/

What is R?

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

There are several options for using R on the FASRC clusters.

We recommend using RStudio Server on Open OnDemand because it is the simplest way to install R packages (see RStudio Server). We only recommend the R module and RStudio Desktop if you:

  • plan to run MPI/multi-node jobs
  • need to choose specific compilers for R package installation
  • are an experienced user and know how to install software

RStudio Server

RStudio Server is our go-to RStudio app because it contains a wide range of precompiled R packages from bioconductor and rocker/tidyverse. This means that installing R packages in RStudio Server is straightforward. Most of the time, it is sufficient to simply run:

> install.packages("package_name")

This simplicity is possible because RStudio Server runs inside a Singularity container, meaning that it does not use the host operating system (OS). For more information on Singularity, refer to our Singularity on the cluster docs.

Important notes:

  • User-installed R libraries will be installed in ~/R/ifxrstudio/<IMAGE_TAG>
  • This app contains many pre-compiled packages from bioconductor and rocker/tidyverse.
  • FAS RC environment modules (e.g. module load) and Slurm (e.g. sbatch) are not accessible from this app.
  • For RStudio with environment module and Slurm support, see RStudio Desktop

This app is useful for most applications, including multi-core jobs. However, it is not suitable for multi-node jobs. For multi-node jobs, the recommended app is RStudio Desktop.


FASSE cluster additional settings

If you are using FASSE Open OnDemand and need to install R packages in RStudio Server, you will likely need to set the proxies as explained in our Proxy Settings documentation. Before installing packages, execute these two commands in RStudio Server:

> Sys.setenv(http_proxy="http://rcproxy.rc.fas.harvard.edu:3128")
> Sys.setenv(https_proxy="http://rcproxy.rc.fas.harvard.edu:3128")

Package Seurat

In RStudio Server Release 3.18, the default version for umap-learn is 0.5.5. However, this version contains a bug. To resolve this issue, downgrade to umap-learn version 0.5.4:

> install.packages("Seurat")
> reticulate::py_install(packages = c("umap-learn==0.5.4","numpy<2"))

And test with

> library(Seurat)
> data("pbmc_small")
> pbmc_small <- RunUMAP(object = pbmc_small, dims = 1:5, metric='correlation', umap.method='umap-learn')
UMAP(angular_rp_forest=True, local_connectivity=1, metric='correlation', min_dist=0.3, n_neighbors=30, random_state=RandomState(MT19937) at 0x14F205B9E240, verbose=True)
Wed Jul 3 17:22:55 2024 Construct fuzzy simplicial set
Wed Jul 3 17:22:56 2024 Finding Nearest Neighbors
Wed Jul 3 17:22:58 2024 Finished Nearest Neighbor Search
Wed Jul 3 17:23:00 2024 Construct embedding
Epochs completed: 100%| ██████████ 500/500 [00:00]
Wed Jul 3 17:23:01 2024 Finished embedding

R, CRAN, and RStudio Server pinned versions

To ensure R packages compatibility, R, CRAN, and RStudio Server versions are pinned to a specific date. For more details see Rocker project which is the base image for FASRC’s RStudio Server.

Use R packages from RStudio Server in a batch job

The RStudio Server OOD app hosted on Cannon at rcood.rc.fas.harvard.edu and FASSE at fasseood.rc.fas.harvard.edu runs RStudio Server in a Singularity container (see Singularity on the cluster). The path to the Singularity image on both Cannon and FASSE clusters is the same:

/n/singularity_images/informatics/ifxrstudio/ifxrstudio:RELEASE_<VERSION>.sif

Where <VERSION> corresponds to the Bioconductor version listed in the “R version” dropdown menu. For example:

R 4.2.3 (Bioconductor 3.16, RStudio 2023.03.0)

uses the Singularity image:

/n/singularity_images/informatics/ifxrstudio/ifxrstudio:RELEASE_3_16.sif

As mentioned above, when using the RStudio Server OOD app, user-installed R packages by default go in:

~/R/ifxrstudio/RELEASE_<VERSION>

This is an example of a batch script named runscript.sh that executes R script myscript.R inside the Singularity container RELEASE_3_16:

#!/bin/bash
#SBATCH -c 1 # Number of cores (-c)
#SBATCH -t 0-01:00 # Runtime in D-HH:MM
#SBATCH -p test # Partition to submit to
#SBATCH --mem=1G # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH -o myoutput_%j.out # File to which STDOUT will be written, %j inserts jobid
#SBATCH -e myerrors_%j.err # File to which STDERR will be written, %j inserts jobid

# set R packages and rstudio server singularity image locations
my_packages=${HOME}/R/ifxrstudio/RELEASE_3_16
rstudio_singularity_image="/n/singularity_images/informatics/ifxrstudio/ifxrstudio:RELEASE_3_16.sif"

# run myscript.R using the RStudio Server singularity image
singularity exec --cleanenv --env R_LIBS_USER=${my_packages} ${rstudio_singularity_image} Rscript myscript.R

To submit the job, execute the command:

sbatch runscript.sh

Advanced Users

These options are for users familiar with software installation from source, where you choose compilers and set your environmental variables. If you are not familiar with these concepts, we highly recommend using RStudio Server instead.

R module

To use the R module, you should first have taken our Introduction to the FASRC training and be familiar with running jobs on the cluster. R modules come with some basic R packages. If you use a module, you will likely have to install most of the R packages that you need.

To use R on the FASRC clusters, load R via our module system. For example, this command will load the latest R version:

module load R

If you need a specific version of R, you can search with the command

module spider R

To load a specific version

module load R/4.2.2-fasrc01

For more information on modules, see the Lmod Modules page.

To use R from the command line, you can use an R shell for interactive work. For batch jobs, you can use the R CMD BATCH and Rscript commands. Note that these commands behave differently (see the example after this list):

  • R CMD BATCH
    • output will be directed to a .Rout file unless you specify otherwise
    • prints out input statements
    • cannot output to STDOUT
  • Rscript
    • output and errors are directed to STDOUT and STDERR, respectively, like most other programs
    • does not print input statements
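For example, assuming an R module is loaded and a script named myscript.R exists in the current directory, the two commands are invoked like this:

R CMD BATCH --no-save myscript.R     # output is written to myscript.Rout
Rscript myscript.R                   # output goes to STDOUT, errors to STDERR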

For Slurm batch examples and details on how to run R from the command line, refer to the FASRC User_Codes GitHub repository.

R Module + RStudio Desktop

RStudio Desktop depends on an R module. Although it includes some precompiled R packages that come with the R module, the list is much more limited than in the RStudio Server app.

RStudio Desktop runs on the host operating system (OS), the same environment as when you ssh to Cannon or FASSE.

This app is particularly useful for running multi-node/MPI applications because you can specify the exact modules, compilers, and packages that you need to load.

See the RStudio Desktop documentation for how to launch it.

R in Jupyter

To use R in Jupyter, you will need to create a conda/mamba virtual environment and install the packages jupyter and rpy2, which allow you to use R in Jupyter.

Step 1:  Request an interactive job

salloc --partition test --time 02:00:00 --ntasks=1 --mem 10000

Step 2: Load python module, set environmental variables, and create an environment with the necessary packages:

module load python/3.10.13-fasrc01
export PYTHONNOUSERSITE=yes
mamba create -n rpy2_env jupyter numpy matplotlib pandas scikit-learn scipy rpy2 r-ggplot2 -c conda-forge -y

See Python instructions for more details on Python and mamba/conda environments.

After creating the mamba/conda environment, you will need to load that environment by selecting the corresponding kernel on the Jupyter Notebook to start using R in the notebook.

Step 3: Launch the Jupyter app on the OpenOnDemand VDI portal using these instructions.

You may need to load certain modules for package installations. For example, the R package lme requires cmake. You can load cmake by adding the module name in the “Extra Modules” field.

Step 4: Open your Jupyter notebook. In the top right corner, click on the kernel name (typically “Python 3”, though it may differ in your notebook) and select the created conda environment “Python [conda env:conda-rpy2_env]”.

Alternatively, you can use the top menu: Kernel -> Change Kernel -> Python [conda env:conda-rpy2_env]

Step 5: Install R packages using a Jupyter Notebook

Refer to the example Jupyter Notebook on FASRC User_Codes Github.

R with Spack

Step 1: Install Spack by following our Spack Install and Setup instructions.

Step 2: Install the R packages with Spack from the command line. For all R package installations with Spack, ensure you are on a compute node by requesting an interactive job (if you are already in an interactive job, there is no need to request another):

[jharvard@holylogin spack]$ salloc --partition test --time 4:00:00 --mem 16G -c 8

Installing R packages with Spack is fairly simple. The main steps are:

[jharvard@holy2c02302 spack]$ spack install package_name  # install software
[jharvard@holy2c02302 spack]$ spack load package_name     # load software to your environment
[jharvard@holy2c02302 spack]$ R                           # launch R
> library(package_name)                                   # load package within R

For specific examples, refer to the FASRC User_Codes GitHub repository.

R and RStudio on Windows

See our R and RStudio on Windows page.

Troubleshooting

Files that may configure R package installations

  • ~/.Rprofile
  • ~/.Renviron
  • ~/.bashrc
  • ~/.bash_profile
  • ~/.profile
  • ~/.config/rstudio/rstudio-prefs.json
  • ~/.R/Makevars

References

Macaulay2
https://docs.rc.fas.harvard.edu/kb/macaulay2/

Description

Macaulay2 is a software system for research in algebraic geometry and commutative algebra. Its creation and development have been funded by the National Science Foundation since 1992.

Macaulay2 on the cluster

Macaulay2 is available on the cluster via Singularity containers. We recommend working on a compute node, which you can get to by requesting an interactive job. For example:

salloc --partition test --time 01:00:00 --cpus-per-task 4 --mem-per-cpu 2G

You can pull (i.e. download) a container with the command

singularity pull docker://unlhcc/macaulay2:latest

Start a shell inside the Singularity container

singularity shell macaulay2_latest.sif

The prompt will change to Singularity>. Then, type M2 to start Macaulay2. You should see a prompt with i1:

Singularity> M2
Macaulay2, version 1.15
--storing configuration for package FourTiTwo in /n/home01/jharvard/.Macaulay2/init-FourTiTwo.m2
--storing configuration for package Topcom in /n/home01/jharvard/.Macaulay2/init-Topcom.m2
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases, PrimaryDecomposition, ReesAlgebra, TangentCone,
Truncations

i1 :

For examples, we recommend visiting Macaulay2 documentation.

Resources

Mathematica
https://docs.rc.fas.harvard.edu/kb/mathematica/

Description

Mathematica is a powerful computational software system that provides a comprehensive environment for technical computing. Developed by Wolfram Research, it offers a wide range of capabilities spanning symbolic and numerical computation, visualization, and programming. Mathematica’s symbolic engine allows for the manipulation of mathematical expressions, equations, and functions, making it particularly useful for tasks such as calculus, algebra, and symbolic integration. Its vast library of built-in functions covers various areas of mathematics, science, and engineering, enabling users to tackle diverse problems efficiently. Moreover, Mathematica’s interactive interface and high-level programming language facilitate the creation of custom algorithms and applications, making it an indispensable tool for researchers, educators, and professionals in countless fields.

Mathematica is available on the FASRC Cannon cluster as software modules. Currently, the following modules/versions are available:

mathematica/12.1.1-fasrc01 and mathematica/13.3.0-fasrc01

Examples

To start using Mathematica on the FASRC cluster, please look at the examples on our Users Code repository.
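For non-interactive use, a minimal Slurm batch script sketch is shown below. It assumes the module exposes the standard command-line kernel (math) and that you have a Wolfram Language script named myscript.wl; adjust resources and versions as needed:

#!/bin/bash
#SBATCH -J my_mathematica       # job name
#SBATCH -c 1                    # number of cores
#SBATCH -t 0-01:00              # runtime in D-HH:MM
#SBATCH -p test                 # partition
#SBATCH --mem=4G                # memory
#SBATCH -o mathematica_%j.out   # standard output file
#SBATCH -e mathematica_%j.err   # standard error file

module load mathematica/13.3.0-fasrc01

# run the Wolfram Language script non-interactively
math -script myscript.wl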

Resources

Gaussian
https://docs.rc.fas.harvard.edu/kb/gaussian/

Access

Please contact us if you require Gaussian access. It is controlled on a case-by-case basis and requires membership in a security group.

If you are not a member of this security group, you can still load the module, but you will not be able to run Gaussian.

FASRC provides the module and basic instructions on how to launch Gaussian, but we do not provide specific support on how to run Gaussian itself. For that, refer to the Gaussian documentation or your department.

Running Gaussian

Example batch file runscript.sh:

#!/bin/bash
#SBATCH -J my_gaussian # job name
#SBATCH -c 1 # number of cores
#SBATCH -t 01:00:00 # time in HH:MM:SS
#SBATCH -p serial_requeue # partition
#SBATCH --mem-per-cpu=800 # memory per core
#SBATCH -o rchelp.out # standard output file
#SBATCH -e rchelp.err # standard error file

module load gaussian/16-fasrc04

g16 CH4_s.gjf

To submit the job:

sbatch runscript.sh

Versions

You can search for gaussian modules with the command module spider gaussian:

[jharvard@boslogin02 ~]$ module spider gaussian

-----------------------------------------------------------------------------------------------------------------------------------------
gaussian:
-----------------------------------------------------------------------------------------------------------------------------------------
Description:
Gaussian, a computational chemistry software program

Versions:
gaussian/16-fasrc01
gaussian/16-fasrc02
gaussian/16-fasrc03
gaussian/16-fasrc04

And to see the details about a particular module, use commands module spider or module display:

[jharvard@boslogin02 ~]$ module spider gaussian/16-fasrc04

-----------------------------------------------------------------------------------------------------------------------------------------
gaussian: gaussian/16-fasrc04
-----------------------------------------------------------------------------------------------------------------------------------------
Description:
Gaussian, a computational chemistry software program

This module can be loaded directly: module load gaussian/16-fasrc04

Help:
gaussian-16-fasrc04
Gaussian, a computational chemistry software program

[jharvard@boslogin02 ~]$ module display gaussian/16-fasrc04
-----------------------------------------------------------------------------------------------------------------------------------------
/n/sw/helmod-rocky8/modulefiles/Core/gaussian/16-fasrc04.lua:
-----------------------------------------------------------------------------------------------------------------------------------------
help([[gaussian-16-fasrc04
Gaussian, a computational chemistry software program
]], [[
]])
whatis("Name: gaussian")
whatis("Version: 16-fasrc04")
whatis("Description: Gaussian, a computational chemistry software program")
setenv("groot","/n/sw/g16_sandybridge")
setenv("GAUSS_ARCHDIR","/n/sw/g16_sandybridge/g16/arch")
setenv("G09BASIS","/n/sw/g16_sandybridge/g16/basis")
setenv("GAUSS_SCRDIR","/scratch")
setenv("GAUSS_EXEDIR","/n/sw/g16_sandybridge/g16/bsd:/n/sw/g16_sandybridge/g16/local:/n/sw/g16_sandybridge/g16/extras:/n/sw/g16_sandybridge/g16")
setenv("GAUSS_LEXEDIR","/n/sw/g16_sandybridge/g16/linda-exe")
prepend_path("PATH","/n/sw/g16_sandybridge/g16/bsd:/n/sw/g16_sandybridge/g16/local:/n/sw/g16_sandybridge/g16/extras:/n/sw/g16_sandybridge/g16")
prepend_path("PATH","/n/sw/g16_sandybridge/nbo6_x64_64/nbo6/bin")

GaussView

RC users can download these clients from our Downloads page. You must be connected to the FASRC VPN to access this page. Your FASRC username and password are required to log in.

On macOS: Move the downloaded file to the ‘Applications’ folder, unarchive it, and double-click on the gview icon located in gaussview16_A03_macOS_64bit.

On Windows: Unarchive the file in the Downloads folder itself.

A pop-up will appear saying “Gaussian is not installed”. Click OK; the gview interface will then open.

If GaussView doesn’t open on macOS, do the following:

Go to the Applications folder > gaussview16 folder > right-click on gview and choose “Show Package Contents”.

Go to the Contents folder of gview > MacOS folder > right-click on the gview executable and choose “Open”.

A pop-up will appear saying “Gaussian is not installed”. Click OK; the gview interface will then open.

 

Note: We do not have a license for GaussView on the cluster. It needs to be run locally.

Running Singularity image with CentOS 7 on Rocky 8
https://docs.rc.fas.harvard.edu/kb/centos7-singularity/

If you absolutely need to still use CentOS 7 after the OS upgrade to Rocky 8, you can use it with SingularityCE. For more information on SingularityCE, see our Singularity documentation and GitHub User Codes.

We have a Singularity image with CentOS 7 and the same environment as the compute nodes (as of March 29, 2023). In addition, you can access CentOS 7 modules from within the Singularity container. The image is stored at:

/n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif

You can execute this image and/or copy it, but you cannot modify it in its original location. See below for how to modify this image.

Run Singularity with CentOS 7

To get a bash shell on CentOS 7 environment, you can run:

[jharvard@holy7c12102 ~]$ singularity run /n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif
Singularity>

or

[jharvard@holy7c12102 ~]$ singularity exec /n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif /bin/bash
Singularity>

NOTE: The command singularity shell is also an option; however, it does not allow you to access modules as explained in the Load modules section below.

Load modules

You can still load modules that were available on CentOS 7 from inside the Singularity container:

[jharvard@holy7c12102 ~]$ singularity exec /n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif /bin/bash
Singularity> module load gcc
Singularity> module load matlab
Singularity> module list

Currently Loaded Modules:
  1) gmp/6.2.1-fasrc01   2) mpfr/4.1.0-fasrc01   3) mpc/1.2.1-fasrc01   4) gcc/12.1.0-fasrc01   5) matlab/R2022b-fasrc01

Note that the modules are from the CentOS 7 environment:

Singularity> module display matlab/R2022b-fasrc01
-----------------------------------------------------------------------------------------------------------------------------------------------------------
   /n/helmod/modulefiles/centos7/Core/matlab/R2022b-fasrc01.lua:
-----------------------------------------------------------------------------------------------------------------------------------------------------------
help([[matlab-R2022b-fasrc01
a high-level language and interactive environment for numerical computation, visualization, and programming

]], [[
]])
whatis("Name: matlab")
whatis("Version: R2022b-fasrc01")
whatis("Description: a high-level language and interactive environment for numerical computation, visualization, and programming")
setenv("MATLAB_HOME","/n/helmod/apps/centos7/Core/matlab/R2022b-fasrc01")
prepend_path("PATH","/n/helmod/apps/centos7/Core/matlab/R2022b-fasrc01/bin")
setenv("MLM_LICENSE_FILE","27000@rclic1")
setenv("ZIENA_LICENSE_NETWORK_ADDR","10.242.113.134:8349")

Submit slurm jobs

If you need to submit a job rather than getting to a shell, you have to do the following steps in the appropriate order:

  1. launch the singularity image
  2. load modules
  3. (optional) compile code
  4. execute code

If you try to load modules before launching the image, it will try to load modules from the Rocky 8 host system.

To ensure that steps 2-4 run within the Singularity container, they are placed between the END heredoc markers (see the Slurm batch script below).

NOTE: You cannot submit slurm jobs from inside the container, but you can submit a slurm job that will execute the container.

Example with a simple hello_world.f90 fortran code:

program hello
  print *, 'Hello, World!'
end program hello

Slurm batch script run_singularity_centos7.sh:

#!/bin/bash
#SBATCH -J sing_hello           # Job name
#SBATCH -p rocky                # Partition(s) (separate with commas if using multiple)
#SBATCH -c 1                    # Number of cores
#SBATCH -t 0-00:10:00           # Time (D-HH:MM:SS)
#SBATCH --mem=500M              # Memory
#SBATCH -o sing_hello_%j.out    # Name of standard output file
#SBATCH -e sing_hello_%j.err    # Name of standard error file

# start a bash shell inside singularity image
singularity run /n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif <<END

# load modules
module load gcc/12.1.0-fasrc01
module list

# compile code
gfortran hello_world.f90 -o hello.exe

# execute code
./hello.exe
END

To ensure that the commands run within the Singularity container, they are placed between the END markers.

To submit the slurm batch script:

sbatch run_singularity_centos7.sh

Another option is to have a bash script containing steps 2-4 and then use singularity run to execute that script. For example, script_inside_container.sh:

#!/bin/bash

# load modules
module load gcc/12.1.0-fasrc01
module list

# compile code
gfortran hello_world.f90 -o hello.exe

# execute code
./hello.exe

And the slurm batch script run_singularity_centos7_script.sh becomes:

#!/bin/bash
#SBATCH -J sing_hello           # Job name
#SBATCH -p rocky                # Partition(s) (separate with commas if using multiple)
#SBATCH -c 1                    # Number of cores
#SBATCH -t 0-00:10:00           # Time (D-HH:MM:SS)
#SBATCH --mem=500M              # Memory
#SBATCH -o sing_hello_%j.out    # Name of standard output file
#SBATCH -e sing_hello_%j.err	# Name of standard error file

# start a bash shell inside singularity image
singularity run /n/singularity_images/FAS/centos7/compute-el7-noslurm-2023-03-29.sif script_inside_container.sh

You can submit a batch job with:

sbatch run_singularity_centos7_script.sh

Modify SingularityCE image with CentOS 7

If you need to run your codes in the former operating system (pre-June 2023), you can build a custom SingularityCE image with proot. The base image is the FASRC CentOS 7 compute node image, and you can add your own software/libraries/packages under the %post header in the Singularity definition file.

Step 1: Follow steps 1 and 2 in the proot setup instructions.

Step 2: Copy the CentOS 7 compute image to your holylabs (or home) directory. The base image file needs to be copied to a directory to which you have read/write access; otherwise, building your custom image will fail.

[jharvard@holy2c02302 ~]$ cd /n/holylabs/LABS/jharvard_lab/Users/jharvard
[jharvard@holy2c02302 jharvard]$ cp /n/holystore01/SINGULARITY/FAS/centos7/compute-el7-noslurm-2023-02-15.sif .

Step 3: In the definition file centos7_custom.def, set Bootstrap: localimage and put the path of the copied image in the From: field. Then add the packages/software/libraries that you need. Here, we add cowsay:

Bootstrap: localimage
From: compute-el7-noslurm-2023-02-15.sif

%help
    This is CentOS 7 Singularity container based on the Cannon compute node with my added programs.

%post
    yum -y update
    yum -y install cowsay

Step 4: Build the SingularityCE image

[jharvard@holy2c02302 jharvard]$ singularity build centos7_custom.sif centos7_custom.def
INFO:    Using proot to build unprivileged. Not all builds are supported. If build fails, use --remote or --fakeroot.
INFO:    Starting build...
INFO:    Verifying bootstrap image compute-el7-noslurm-2023-02-15.sif
WARNING: integrity: signature not found for object group 1
WARNING: Bootstrap image could not be verified, but build will continue.
INFO:    Running post scriptlet
+ yum -y update

... omitted output ...

Running transaction
  Installing : cowsay-3.04-4.el7.noarch                   1/1
  Verifying  : cowsay-3.04-4.el7.noarch                   1/1

Installed:
  cowsay.noarch 0:3.04-4.el7

Complete!
INFO:    Adding help info
INFO:    Creating SIF file...
INFO:    Build complete: centos7_custom.sif
Spack software package management
https://docs.rc.fas.harvard.edu/kb/spack/

Overview

Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputer centers, where many users and application teams share common installations of software on clusters with exotic architectures, using non-standard libraries. Spack is non-destructive: installing a new version does not break existing installations. In this way several configurations can coexist on the same system.

Most importantly, Spack is simple. It offers a simple spec syntax so that users can specify versions and configuration options concisely. Spack is also simple for package authors: package files are written in pure Python, and specs allow package authors to maintain a single file for many different builds of the same package.

This page provides a brief guide to the usage of Spack. For more detailed instructions, please check out FASRC User Codes and the official Spack documentation.

Installation

To install Spack do:

git clone -c feature.manyFiles=true https://github.com/spack/spack.git

We recommend cloning into your lab directory on holylabs, or other lab storage if holylabs is not available, in a common directory (e.g. /n/holylabs/LABS/jharvard_lab/Lab/software). This way you can have a collective Spack installation for your lab to use and you get the superior performance of holylabs over your home directory. Other locations work as well, but you should pick a location that has decent performance and is mounted to the entire cluster.

Setup

To set up Spack, go to your Spack directory and run . share/spack/setup-env.sh. This will modify your environment and put Spack on your PATH. If you want to have Spack always available, update your .bashrc with source <PATHTOSPACK>/share/spack/setup-env.sh
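For example, assuming Spack was cloned into a lab software directory as suggested above (the path is illustrative):

cd /n/holylabs/LABS/jharvard_lab/Lab/software/spack
. share/spack/setup-env.sh
spack --version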

Package Management

Below is a list of convenient commands for Spack. Note: depending on the package, installation can take hours, so plan accordingly by using screen/tmux or a Remote Desktop session on Open OnDemand.

Command                       Purpose
spack list                    List all available packages
spack list <package>          List specific package
spack versions <package>      List available versions of package
spack install <package>       Install package
spack uninstall <package>     Remove package
spack find                    List installed packages
spack load <package>          Load package for use
spack unload <package>        Unload package from environment
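For instance, a typical install-and-use sequence looks like the following (fftw is used here purely as an illustrative package name):

spack install fftw       # build and install the package
spack load fftw          # make it available in the current shell
spack find               # confirm it appears among installed packages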

Further Reading

FASRC Github User Codes Repository
https://docs.rc.fas.harvard.edu/kb/github-user-codes/

Overview

FASRC maintains a GitHub repository with codes and examples for many applications and processes. We endeavor to add links to these codes on the relevant documents here in our documentation, but in the interest of catching edge cases, here is a copy of the current list of items and subjects in that repository. Note that this listing is not necessarily complete. We invite you to check out the GitHub repo and use the search functions there. Community contributions, comments, pull requests, and corrections are welcome and appreciated.

Building Software

Singularity Containers

Languages

Numerical and I/O Libraries

Artificial Intelligence/Machine Learning (AI/ML) Frameworks

Parallel Computing

Applications

Performance

Example Recipes

 

 

LANDIS
https://docs.rc.fas.harvard.edu/kb/landis/

You can find documentation on LANDIS in our github repo: https://github.com/fasrc/User_Codes/tree/master/Applications/LANDIS

Rocky 8 Transition Guide
https://docs.rc.fas.harvard.edu/kb/rocky-8-transition-guide/

Overview

Please Note: This document is being preserved for historical purposes. This transition took place in 2023.

As part of our June 5-8, 2023 MGHPCC Downtime, FASRC will be upgrading the cluster operating system from CentOS 7 to Rocky 8. Details as to why the transition is taking place are provided on the downtime page. This page is intended to go over the test environment for users prior to the downtime as well as a guide for different aspects of the upgrade.

  • Town hall presentation slides
  • Town Hall presentation video

 

OOD/VDI (Open OnDemand) changes: The Open OnDemand instances for Cannon (and later for FASSE) will involve a change to each user’s settings and the location of the folder containing those settings and other OOD-specific files. The new location in your home directory will be ~/.fasrcood. This also means that values you had filled out in the OOD start form will not be retained; form values will revert to the defaults.

Any “development mode” (aka “sandbox”) apps that you have installed to ~/fasrc/dev (approximately 78 users have done so) will no longer be visible through your OOD Dashboard and will need to be moved to ~/.fasrcood/dev before they can be used on the new OOD. The old folder at ~/fasrc will no longer be used after June 8th and, assuming you have nothing in dev to move, can be removed from your home directory (this directory may have grown large, so it is not advised to keep it around unnecessarily).

 

Software

Warning

The change in operating system means that most user software built on CentOS 7 will not work and will need to be rebuilt. Even if the code does work without being rebuilt, the change in underlying libraries may impact code execution and results.

Users should test and verify that their codes are producing the expected results on the new operating system. This is a significant change with a host of bug fixes, performance upgrades, and numerical methods changes. These can change results, so users need to be sure to test and validate their codes.

Overview

Below you will find discussion and links to various software documentation. In general, if you are using a package manager, you should work within that manager. If you use Spack, you should stick to installing packages via Spack, even if you are using Python, Julia, or R. If you are using R, then stick to R and do not use Spack. If you are using conda or pip, then do not use Spack. If you are using Julia, do not use Spack. Mixing package managers can cause problems and weird dependencies and should be avoided unless absolutely necessary. Modules can mix with everything, so there should be no concern with using those.

Software Overview

Building

While software can be built on the login nodes, we recommend users start an interactive session via salloc. This is especially true if you want to build code optimized for specific chipsets. We run a variety of hardware and the login nodes are of a different architecture than the compute nodes.

Modules

FASRC has traditionally provided modules of different software packages for users. For Rocky 8 we are significantly reducing the number of modules we will support. Only modules considered core operational codes (like compilers, MPI, or software environments) or licensed software (e.g. matlab, mathematica) will be built. You can find the list of modules we provide by doing: module avail. Note that module avail only shows the modules that can currently be loaded; it does not show those modules built against specific compilers and versions of MPI. To see those modules, you must load the compiler and MPI versions you want to use. To search the modules, do: module spider <name>. CentOS 7 modules will not be available in Rocky 8. For modules formerly provided in CentOS 7, we recommend that users use Spack instead.

Available modules list

Spack

FASRC Spack Guide

Singularity

FASRC Singularity Guide

CentOS 7 Singularity Image

For those who cannot upgrade to Rocky 8 we are providing a CentOS 7 Singularity image which contains our full CentOS 7 environment as well as access to our old CentOS 7 modules. A guide for using that environment can be found here: Running Singularity image with CentOS 7 on Rocky 8

Julia

FASRC Julia Guide

Python

FASRC Python Guide

Python 2

Python 2 has been deprecated since 2020. Users are encouraged to migrate to Python 3. For those who need Python 2 for historic codes, we recommend using either a Singularity container or the python/2.7.16-fasrc01 module (which uses Anaconda2).

PyTorch

Note that the rocky_gpu partition on the Rocky 8 test cluster is set up with the Multi-Instance GPU (MIG) feature of NVIDIA A100s. Due to MIG, PyTorch may not work. If you would like to test PyTorch on rocky_gpu, please send us a ticket.

R

FASRC R Guide

Other

FASRC User Codes

FAQ

Partitions

Based on a thorough analysis of usage patterns, many partitions’ time limits have changed from 7 days to 3 days. Further explanation can be found here:

Legacy CentOS 7 Support

FASRC provides a container of our full CentOS 7 environment for those who require it for their code. Beyond that, support for CentOS 7 will not be maintained for the compute environment. Slurm support for CentOS 7 will be dropped with the next major Slurm upgrade; if you have a host that connects to Slurm and is running CentOS 7, contact FASRC to discuss migration. Virtual machines and other systems running CentOS 7 and older will need to migrate to other hosting options or coordinate upgrades with FASRC.

I don’t see the module/software I use in modules, Spack, Singularity, or User Codes

Users are always welcome to build their own codes on the cluster or download compatible binaries. We provide guides for how to accomplish this in our documentation. If you have trouble doing this, or you think a module is missing, contact rchelp@rc.fas.harvard.edu

SSH key or ‘DNS spoofing’ errors when connecting to login or other nodes

You may see WARNING: POSSIBLE DNS SPOOFING DETECTED! and/or The RSA host key for login.rc.fas.harvard.edu has changed error messages. After an update of nodes, the SSH key fingerprint of a node may change. This will, in turn, cause an error the next time you try to log into that node, as your locally stored key will no longer match. SSH uses this as a way to spot man-in-the-middle attacks or DNS spoofing. However, when a key changes for a valid reason, you do need to clear out the one stored on your computer in order to be able to re-connect.

See this FAQ page for further instructions: SSH key error, DNS spoofing message
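As a quick local fix, you can remove the stale key from your known_hosts file with ssh-keygen (substitute the hostname shown in your error message):

ssh-keygen -R login.rc.fas.harvard.edu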

Modules in your .bashrc no longer work or give errors on login

If you have edited your .bashrc file to include module loads at login, you may find that some CentOS 7 modules will not be found or may not work on Rocky 8. You will need to edit your .bashrc and comment out or remove any such lines going forward. If you can no longer log in because of something in your .bashrc, contact us and we can rename your .bashrc and copy in a default version for you.

If you’d like to start from scratch, a default .bashrc contains the following:

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi

# User specific aliases and functions below

My alternate shell (csh, tcsh, etc.) doesn’t work right

Having a non-standard default shell will cause problems and does not allow us to set global environmental defaults for everyone. We will no longer change the default shell on any account or support the use of alternate shells as default login shell. Users who do not have bash as their default login shell will need to change back to bash. Users can, of course, still launch an alternate shell once logged in.

My module won’t load in a batch job

If you are getting an error similar to this when loading a module in a batch job:

"/usr/bin/lua: /n/helmod/apps/lmod/7.7.32/libexec/lmod:61: module 'posix' not found:
no field package.preload['posix']
no file '/usr/share/lua/5.1/posix.lua'
no file '/usr/share/lua/5.1/posix/init.lua'
no file '/usr/lib64/lua/5.1/posix.lua'
no file '/usr/lib64/lua/5.1/posix/init.lua'
no file '/usr/lib64/lua/5.1/posix.so'
no file '/usr/lib64/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'require'
/n/helmod/apps/lmod/7.7.32/libexec/lmod:61: in main chunk
[C]: in ?
/var/slurmd/spool/slurmd/job53240333/slurm_script: line 27: python: command not found
"

Then you are submitting your job from a CentOS 7 host, such as boslogin, holylogin, or a CentOS 7 compute node. You must submit the job from a Rocky 8 host (e.g. rockylogin).

Processes run on a login node are restricted

We have moved away from the old pkill process on login nodes, which killed processes using too much CPU or memory (RAM) in order to maintain a fair balance for all (for example, someone running Mathematica on a login node instead of a compute node). The Rocky 8 login nodes use a system-level mechanism called cgroups, which limits each logged-in account to 1 core and 4GB of memory. This should eliminate memory or CPU hogging on login nodes. Please use an interactive session if you need to run more involved processes that require more memory or CPU. Login nodes are meant as gateways for launching and monitoring jobs and for running light processes to prepare your jobs or environment.

Should you run a process on a login node that runs afoul of cgroup policies, the process will be killed. Please be aware that there is no warning mechanism; such processes will be killed without warning, so please err on the side of caution when choosing to run something on a login node versus an interactive session.

Passwordless ssh not working

The permissions on your ~/.ssh directory may be incorrect. It is now required that your ~/.ssh directory be accessible only by you. You can ensure this by running chmod -R g-rwx,o-rwx ~/.ssh. It is also worth double-checking the permissions on your home directory. In general, it should only be accessible to you; if it is accessible to others, they should not have write access.

 
