
FAS RC Research Data Retention and Deletion Policy

Purpose:

This policy defines FAS RC standards and procedures for the retention and deletion of research data, outputs, temporary files, and associated digital resources managed by the FAS RC in support of research activities.

Scope:

This policy applies to all research data stored, processed, or managed on servers, workstations, cloud resources, storage systems, or backup media provisioned by the FAS Research Computing Service Group.

Data Retention:

Following the departure of faculty from the University, the associated primary department will assume responsibility for the maintenance, storage, and cost of housing the remaining research data.

Home Directories:

Aligning with the University Research Data Security Policy and the Retention and Maintenance of Research Records and Data Frequently Asked Questions (“FAQs”), home directories will be retained for no more than 7 years following a researcher’s departure from the University or the deactivation of their FASRC account. The researcher’s last login to their FASRC account will be used to track compliance.

Project Data:

Principal Investigators (PIs) should notify FAS RC 60 days prior to their departure from the University, stating the duration of any continuing appointments (courtesy or associate) and providing instructions and next steps for remaining datasets.

For research data associated with completed or inactive research projects and/or departed faculty where no notice has been given to FAS RC as to where the research data should be stored:

The PI’s Harvard-affiliated primary department becomes responsible for the storage and cost of the research data. Closure of the PI’s group and project in FAS RC will be used to track compliance.
The research data will be retained in the source storage directory for 2 years following project completion or inactivity. Completion of a project occurs after:
1. final reporting to the research sponsor
2. final financial close-out of a sponsored research award segment
3. final publication of research results
4. cessation of academic or scientific activity on a specific research project, regardless of whether its results are published,
whichever is later.
Following 2 years of inactivity, data will be migrated to FASRC Long-Term Storage. The data will be retained for an additional 5 years to meet the University Data Retention guidelines. Following the completion of 5 years, the data can be deleted. Departments will be notified via email prior to the deletion.
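The timeline above can be sketched as simple date arithmetic. This is only an illustration under stated assumptions: the function names and the sample dates are hypothetical, the clock is assumed to start at project completion (the latest of the listed events), and FAS RC's actual tracking may differ.

```python
from datetime import date
from typing import Dict, List


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_timeline(completion_events: List[date]) -> Dict[str, date]:
    """Derive milestone dates from the project completion events."""
    completed = max(completion_events)   # "whichever is later"
    migrate = add_years(completed, 2)    # 2 years in the source storage directory
    deletable = add_years(migrate, 5)    # 5 more years in long-term storage
    return {"completed": completed, "migrate": migrate, "deletable": deletable}


# Hypothetical example: final report in mid-2024, final publication in early 2025.
timeline = retention_timeline([date(2024, 6, 30), date(2025, 1, 15)])
print(timeline["migrate"], timeline["deletable"])  # 2027-01-15 2032-01-15
```

In this sketch, the 2 + 5 years add up to the 7-year total used for home directories.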

Temporary and Scratch Storage:

Data stored in scratch or temporary directories may be deleted after 90 days without notice to maximize available resources.
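To see which of your own scratch files might fall past the 90-day threshold, a check along the following lines could be used. It is a sketch only: it assumes age is measured from a file's modification time, which the policy does not specify, and the function names are hypothetical.

```python
import os
import time
from typing import List, Optional

SCRATCH_MAX_AGE_DAYS = 90  # threshold stated in the policy
SECONDS_PER_DAY = 86400


def is_stale(mtime: float, now: float, max_age_days: int = SCRATCH_MAX_AGE_DAYS) -> bool:
    """True if a file last modified at `mtime` is older than the threshold."""
    return (now - mtime) > max_age_days * SECONDS_PER_DAY


def stale_files(root: str, now: Optional[float] = None) -> List[str]:
    """List files under `root` whose modification time exceeds the threshold.

    Assumption: age is measured from st_mtime; the policy does not say
    which timestamp is actually used.
    """
    now = time.time() if now is None else now
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if is_stale(os.lstat(path).st_mtime, now):
                    found.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(found)
```

Calling `stale_files()` on your own scratch directory returns a sorted list of candidate paths to copy elsewhere before they are removed.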

Deletion Procedures:

Faculty and/or departments will be notified in advance of research data being deleted, per the timelines above. If PIs or Faculty are no longer associated with the University, the relevant department leadership will be notified via email.
Data will be deleted using secure erasure methods in accordance with institutional IT security standards.
Requests for retention extension can be made in writing and are subject to approval by FASRC and the department; individuals requesting the extension will be responsible for all associated storage costs.

Ownership and Roles:

University: Harvard University owns all research data generated through projects conducted under its authority or using its resources. While PIs and researchers manage and safeguard the data, the University is ultimately responsible for compliance with legal and sponsor requirements, ensuring confidentiality and security.
Principal Investigators: Principal Investigators (PIs) are stewards of research data. If PIs choose to delegate responsibility within their research groups, the PI remains accountable to the University for stewardship of the data. Principal Investigators are responsible for ensuring proper data management, storage, and accessibility, meeting all University, legal, and sponsor requirements. This involves setting up procedures for data retention, confidentiality, and sharing while respecting data use agreements.
Departments: In the case that a PI has left the University without delegating responsibility for data, the associated primary department of the departed PI takes on the role of steward.
Researchers: Harvard community members who assist with management of data created, analyzed, and stored on FAS RC systems.
FAS RC: Responsible for executing deletions as outlined, maintaining logs of deletion actions, and responding to extension or exception requests.

Policy Review:

This policy will be reviewed and updated annually or as required by regulatory or operational changes.

Last modification date: 2025-12-02

Related Policies and Information

FASRC Cluster Storage Policy

Cluster storage offered and maintained by FASRC should only be used for research taking place on FASRC clusters.

Examples of data that can be stored on FASRC storage are:

Datasets
Code
Scientific software
Research results

Examples of data that should not be stored on FASRC storage include:

Clerical or lab administrative data
Data related to personnel, grant proposals, business operations, or general lab management
Data with personally identifiable or financial information

FASRC storage filesystems are only approved for Data Security Level 1 (DSL1) and DSL2 research data on the Cannon cluster. DSL3 data must be stored in the approved FASSE cluster project. Research data containing information classified as DSL 4 must be stored on an appropriate storage solution that is approved for DSL4 sensitive data.*

*A limited number of DSL4 projects exist in their own isolated environments

If it comes to the attention of FASRC staff that non-research-related data is being stored on FASRC systems, we will alert the lab’s PI.

To view alternative storage options for administrative data, please refer to the FASRC website. Additional information is also provided on the Harvard Security website regarding Data Security levels.

Data Security Levels

What is a Data Security Level (DSL)?

Harvard groups data into 5 data security levels depending on the sensitivity of the data. The DSL for data determines how that data must be managed.

DISCLAIMER: The information on this page relates only to the FASRC clusters and our current understanding of Harvard policy. Please refer to the Harvard Security Data Security Levels page for up-to-date university policies and information.

Cluster Data Security Level Ratings

Public

Public information (Level 1/DSL1): The FASRC Cannon cluster is rated only for DSL 1 and DSL 2 data.

Low Risk

Low Risk (Level 2/DSL2): The FASRC Cannon cluster is rated only for DSL 1 and DSL 2 data.

Medium Risk

Medium Risk (Level 3/DSL3): Only the FASRC FASSE (FAS Secure Environment) cluster is rated for DSL3 data.

High Risk

High Risk (Level 4/DSL4): For DSL4 projects, please contact University RC (URC) for options.

Extreme Risk

Extreme Risk (Level 5/DSL5): FASRC has no systems rated for DSL5 data.
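The ratings above can be summarized as a small lookup table. This is a sketch for illustration only; the names `DSL_PLACEMENT` and `placement_for` are hypothetical, and the Harvard Security Data Security Levels page remains the authoritative source.

```python
# Placement summary taken from the ratings listed above.
DSL_PLACEMENT = {
    1: "Cannon",
    2: "Cannon",
    3: "FASSE",
    4: None,  # no general FASRC option; contact University RC (URC)
    5: None,  # FASRC has no systems rated for DSL5
}


def placement_for(dsl: int) -> str:
    """Return a short placement note for a Data Security Level (1-5)."""
    if dsl not in DSL_PLACEMENT:
        raise ValueError(f"unknown data security level: {dsl!r}")
    cluster = DSL_PLACEMENT[dsl]
    if cluster is not None:
        return f"DSL{dsl}: {cluster} cluster"
    if dsl == 4:
        return "DSL4: contact University RC (URC) for options"
    return "DSL5: not supported on FASRC systems"


for level in range(1, 6):
    print(placement_for(level))
```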

Web Scraping Policy

Web scraping is a contentious issue within research. While it is true that fair use provides for many uses of data gleaned from the Internet, in general this is applied to human information gathering, not programmatic machine scraping. That distinction makes the act of brute-force scraping an issue separate from fair use.

You, as a representative of Harvard, are not just using the source’s data, but also their servers, bandwidth, etc. in a way the source may not approve. This can lead to IP blacklisting and even legal action. So please tread carefully as your actions could negatively affect others.

If in doubt or in need of more authoritative guidance, please contact the Harvard Office of the General Counsel or Office of the Vice Provost for Research.

If you are scraping for the purpose of training a GAI model, contact the Harvard Office of the General Counsel or Office of the Vice Provost for Research.

Please be aware that merely being involved in academic pursuits does not exempt you from the usage policies of social media and other Internet platforms like Facebook, Twitter, etc.

Sensitive Data

If the data you are acquiring is considered sensitive, confidential, or contains human data, you will need to have this data reviewed for compliance before placing it on the FASRC cluster. If in doubt, you should always err on the side of caution and contact the Office of the Vice Provost for Research.

Scraping data for use on the FASRC Cluster

If your research requires you to scrape content from the web, please review the following guidelines and suggestions.

We highly discourage using the cluster itself to scrape data. Due to its size and ease of parallelization of processes, the cluster is easily weaponized and your actions could have consequences for other researchers. Please seek another avenue for data acquisition first.

You should contact FASRC before commencing any scraping activity using the FASRC cluster.

It is highly preferable that you do the scraping elsewhere and then bring the data to the FASRC cluster for processing. If the data is sensitive, confidential, contains human data, or it is unclear, then this is a requirement. See ‘Sensitive Data’ above.

Also, if you are scraping for the purpose of training a GAI/LLM model, you should respect that site’s policies on this practice (this may be posted on the site, contained in a robots.txt file, or explicitly stated in their ToS). Even if you are doing the scraping manually, you should consider yourself the same as a bot and, if a site excludes GAI/AI bots, this also applies to you. Merely being an academic does not exempt you from following the wishes of a site and/or its members; your exfiltrated data could end up in other models thereby nullifying the source’s right to exclusivity/ownership. Please contact the Harvard Office of the General Counsel or Office of the Vice Provost for Research for further guidance.
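As one way to check a site's robots.txt rules before collecting anything, the Python standard library's `urllib.robotparser` can evaluate a robots.txt file for a given user agent. The sketch below parses an inline sample rather than fetching a real site; the sample rules, the user-agent names, and the `is_allowed` helper are all hypothetical. A robots.txt check does not replace reading the site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bans one AI crawler and hides /private/ from everyone.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleAIBot", "https://example.org/data/page.html"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-research-crawler", "https://example.org/data/page.html"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-research-crawler", "https://example.org/private/x"))
```

If a site excludes GAI/AI bots and your purpose is model training, treat the rule for that bot as applying to you, whatever user agent your tool reports.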

Source Permission

If you are in doubt or have questions, please contact the Harvard Office of the Vice Provost for Research.

Data on the Internet should not be programmatically (or ‘brute-force’) scraped using FASRC computing resources, even for academic research purposes, unless FASRC has given permission to proceed using the cluster or some system tied to the cluster, and:

A) The source provides an API for this purpose and any requirements they impose have been met.

B) The source allows/does not prohibit scraping in their terms of service or other public notice.

C) The source is the United States government and the data in question was generated with public funds and is publicly available without encumbrance. Even then, the site must not be scraped by brute-force means if an API is provided.

D) The source has given you explicit permission in writing or via a secondary document spelling out that permission.

E) The source does not exclude/forbid your use-case, such as GAI or LLM training.

Data cannot be programmatically scraped using FASRC computing resources if the source has explicitly forbidden scraping in their terms of service and written permission to do so cannot be obtained. In such a case, you should investigate other options for acquiring this or similar data.

Throttling and Blacklisting

Scraping content from websites using highly parallelized processes, even with unfettered permission from the source, should be avoided. Doing so runs the risk of having the cluster’s, or even the university’s, IP range blacklisted. This could have an undesirable effect on other network and cluster users. Please ensure your processes pull data at a reasonable rate unless you explicitly have written approval from the data source to download more aggressively and assurance that this will not lead to blacklisting from them or their upstream provider.
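One simple way to keep to a reasonable rate is to process requests sequentially with a minimum delay between them. The generator below is a minimal sketch of that idea; the name `throttled` and its parameters are hypothetical, and what counts as a reasonable interval depends on the source and is not defined here.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def throttled(
    items: Iterable[str],
    min_interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield items one at a time, no faster than one per `min_interval_s` seconds."""
    last: Optional[float] = None
    for item in items:
        if last is not None:
            wait = min_interval_s - (clock() - last)
            if wait > 0:
                sleep(wait)  # pause until the minimum interval has elapsed
        last = clock()
        yield item


# Hypothetical usage: handle one URL at a time with at least 5 seconds between them.
# for url in throttled(list_of_urls, 5.0):
#     fetch(url)  # `fetch` stands in for whatever permitted download step you use
```

Because the loop is sequential, it also avoids the highly parallelized access pattern that the paragraph above warns against.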

Harvard Office of the Vice Provost for Research

US Data.gov Data Harvesting Information

Archive.org Scraping

Disabled Accounts

FASRC does not delete accounts once they are granted; we simply disable an account to make it inactive.

Users can have only a single account. If needed, we will move the sponsorship or upgrade the account to a more privileged role; we never issue new accounts once you are in our database. Your account can be “rehydrated” again later.

Disabled Accounts

If you can’t log in, it might be because your account has been disabled. Accounts can go into the “Disabled” state for a number of reasons. Most commonly:

your account has been idle for some time because you have not logged in to one of the FASRC services (ssh into the cluster, log into SPINAL, use OOD, etc.)
your PI retired or your sponsoring PI asked us to remove you from their lab. Without a valid, active sponsor, an account will be disabled
your account had an expiration date on it and that date has passed
your account has been compromised or we were asked to disable it for some other reason

In order to have your account re-enabled and rehydrated, we will need approval from your sponsor. Ideally, have your sponsor contact us and indicate that they wish your account to be re-enabled. You may also contact us, but bear in mind that we will still need to contact your sponsor for approval, so this will take slightly longer than if they contact us directly.

Please allow time for us to process your request. See FASRC Support Hours.

Adding Groups, Cluster Access, or Changing Labs

See Adding additional lab groups or cluster access for details and instructions. A new account is not required to add groups, add access, or change labs.

Again, signing up for an additional account if you already have or have ever had a FASRC account is never the correct answer. See: Add or Change Lab Groups

Account Sharing

Sharing accounts or account credentials is against university security policy. See: Sharing Accounts

Onboarding Policies and Procedures

This document outlines FAS Research Computing’s policies and procedures related to the onboarding of researchers and PIs. The document is structured as a checklist, to be used by researchers and PIs as they enter the university or join a new lab. The document also notes differences between the onboarding of researchers and faculty (PIs).

Onboarding Checklist: Faculty

Familiarize yourself with FAS Research Computing Services
Review Harvard policies and procedures
- Research Data Ownership Policy
- Harvard Research Data Security Policy (HRDSP)
- Harvard University General Records Schedule
- Harvard Data Security Levels
- Retention and Maintenance of Research Records and Data Frequently Asked Questions (“FAQs”)
- Harvard Data Safety Website
- PI Responsibilities on the Cluster
  - Identify a data manager for the lab
- FASRC Data Ownership and Access Policy
Get a FASRC account using the account request tool
- Set your FASRC password
- Set up OpenAuth for two-factor authentication (2FA)
- Set up FASRC VPN
- Link FASRC account to HarvardKey
  - If you have a HarvardKey, but are denied access to approve new accounts, visit and complete the FAS Onboard tool for approvers.
- Faculty can sponsor FASRC accounts for any researcher working in their lab, including external collaborators. If a collaborator does not have a HarvardKey account, they may apply for an external FASRC account. External accounts need to be re-enabled every 90 days, so PIs will need to request an extension every 90 days to prevent the account from being suspended.
Learn how to utilize the High Performance Compute cluster
- User Quick Start Guide
- Review Running Jobs
- Review Cluster Customs and Responsibilities
- Command line access with Terminal (login nodes)
- Understand Fairshare on the cluster
Review FASRC storage offerings
- Request a new storage allocation in Coldfront
  - Coldfront is a resource allocation management system FASRC adapted to manage allocations on the FASRC cluster. The platform enables the viewing and management of lab groups (Projects), and storage and cluster allocations (Allocations).
  - FAS Secure Environment
Connect to your new storage folder from a desktop computer
Transfer data to your new storage folder
Create a well-defined storage workflow
- Develop streamlined directory structures
- Establish consistent file naming conventions
View information about storage folders associated with your group/lab
- Utilize the Starfish Zones tool to view key information about your group’s storage folders. The Starfish Zone User Interface is a self-service visual tool that enables users to view group storage amounts and locations. Users can navigate folder structures to access detailed information about files and storage. Labs and groups are strongly recommended to utilize this tool to assist with their data organization and cleanup efforts.
Installing and using software on FASRC cluster
Attend FASRC introductory trainings
- Training calendar
Contact
- Email rchelp@fas.harvard.edu if you have any questions
- Virtual Office Hours: Wednesdays 12-3PM

Onboarding Checklist: Researchers

Familiarize yourself with FAS Research Computing Services
Review Harvard policies and procedures
- Research Data Ownership Policy
- Harvard Research Data Security Policy (HRDSP)
- Harvard University General Records Schedule
- Harvard Data Security Levels
- Retention and Maintenance of Research Records and Data Frequently Asked Questions (“FAQs”)
- Harvard Data Safety Website
Get a FASRC account using the account request tool
- Set your FASRC password
- Set up OpenAuth for two-factor authentication (2FA)
- Set up FASRC VPN
Learn how to utilize the High Performance Compute cluster
Review FASRC storage offerings
- FAS Secure Environment
Connect to your new storage folder from a desktop computer
Transfer data to your new storage folder
View information about storage folders associated with your group/lab
- Utilize the Starfish Zones tool to view key information about your group’s storage folders. The Starfish Zone User Interface is a self-service visual tool that enables users to view group storage amounts and locations. Users can navigate folder structures to access detailed information about files and storage. Labs and groups are strongly recommended to utilize this tool to assist with their data organization and cleanup efforts.
Attend FASRC introductory trainings
- Training calendar
Contact
- Email rchelp@fas.harvard.edu if you have any questions
- Virtual Office Hours: Wednesdays 12-3PM

Virtual Machines & Virtual Hosting

As of December 2024, FASRC does not provide a general virtual machine service as part of its core services. It has in the past attempted to fill this gap when no other options were available, but (1) there was no funding for hardware or support for this service, and its infrastructure is old and being retired, and (2) other options, within and outside Harvard, now exist.

If you require a VM for web hosting or other needs or for hosting or sharing data sets, please see the following options.

Harvard-based options:

Harvard Dataverse: hosting for data sets https://dataverse.harvard.edu
Harvard Web Publishing group: Web publishing and consulting https://hwp.harvard.edu
Research Support options https://researchsupport.harvard.edu/dissemination-preservation
More data sharing and publishing options https://researchsupport.harvard.edu/data-sharing-and-publishing
Digital Scholarship Support Group: Web consulting services https://dssg.fas.harvard.edu/initiatives/research
HUIT’s Cloud services: Help with using AWS and other cloud resources https://cloud.huit.harvard.edu/services

Self-service, pay as you go, managed by you:

Amazon Web Services (AWS): https://aws.amazon.com
Google Cloud Platform (GCP): https://cloud.google.com
Digital Ocean: https://www.digitalocean.com
NERC: Self-service VM provisioning and other services https://nerc.mghpcc.org (this service is no longer run by Harvard)

Please note that PIs and other data owners are responsible for following Harvard Information Security Policy and all other applicable Harvard policies and requirements. This includes knowing your data and following the requirements for Data Security Level for servers and Research Data Management Security and Ownership Policies.

FASSE / Protected Data Transfers

To preface this: You are responsible for knowing and complying with applicable Harvard Information Security Policy (controls that apply to DSL3 and lower), Harvard Research Data Security Policy, and any applicable contracts / data use agreements.

FASSE data transfers generally work the same as transfers for other environments. For example:

When connected to the FASSE VPN realm, you can copy files to and from the FASSE cluster, assuming this meets policy/DUA compliance requirements.
While on FASSE nodes (compute, login, etc.) and the FASSE VPN, you have full access to the Internet through a proxy.
- Generally, this means that you can push to or pull from any HTTPS, SFTP, or other service that supports a proxy.
- For example, this means you should be able to pull data from data providers that provide an HTTPS, SFTP, or other service. You may need to adjust certain configurations and workflows to use the proxy – Some details on this here

With that said, given that FASSE is rated for data security level (DSL) 3 data:

Do not store DSL 3 / FASSE data in your home directory.
If you have a DUA that requires encryption at rest, you must not use scratch for any data that the DUA applies to. Neither local scratch nor our global scratch supports encryption at rest.
FASSE VPN, login, compute, and VDI environments use a proxy. Some transfer solutions do not work through a proxy. If you run into this:
- Please ensure you have tried to use a proxy, and if you still run into trouble,
- Open a ticket with rchelp@rc.fas.harvard.edu indicating
  - What you have tried
  - What you expected to happen
  - What actually happened
  - Include specific commands, where these ran, and output messages including all errors.
Data security level 3 / FASSE storage is intentionally not included in Globus by default. If you would like your FASSE project to be exposed through Globus, consider the following:
- If any data in this project is governed by a contract / data use agreement (DUA), please review the DUA to ensure Globus is compliant. You might consult your School Security Officer for this.
  - An example scenario where Globus would not be compliant: DUAs indicating that a VPN or private network must be used for all access to the data. Globus makes data available over the Internet without a VPN or private network
- Please submit a ticket to rchelp@rc.fas.harvard.edu as follows:
  - This must include the path to the project to add to Globus (e.g. “/n/piname_project_l3”)
  - This must indicate that the PI attests to Globus being compliant with any contracts/DUAs governing the data in this project storage
  - This must be from, or receive a reply directly from the PI for this project confirming this information
FASSE storage is intentionally not exposed through SMB shares by default. If you need your FASSE project exposed through an SMB share, consider the following:
- Please submit a ticket to rchelp@rc.fas.harvard.edu as follows:
  - This must include the path to the project (e.g. “/n/piname_project_l3”)
  - This must indicate that the PI attests to understanding and accepting the risks of enabling SMB access to this data, given that any system or network that can talk to this tiered storage, could access this data if the credentials from an account in the project were used. Some example scenarios:
    - Someone with access to your storage accesses it / copies data down to an unmanaged lab computer without data security level controls
    - Someone with access to your storage accidentally clicks the wrong link on a computer with access to this storage. Their computer is compromised, malware identifies SMB access to your data, and compromises the confidentiality, integrity, and/or availability of your data – maybe ransomware, stealing the data, etc.
  - This must include a brief explanation of why SMB access is needed, and from where you will use this SMB access
  - This must be from, or receive a reply directly from the PI for this project confirming this information

If you have any questions or concerns, please do not hesitate to consult us at security@rc.fas.harvard.edu, although in some cases we may end up pulling in or pointing you to your school privsec officer.

PI Responsibilities at FAS RC

Overview

PIs have a variety of responsibilities at Harvard University. This document will cover the responsibilities specific to FAS Research Computing, especially around information security and risk.

PIs are individuals who are given continuous or limited PI rights by the university and who control their own funding in a school that FAS RC supports. Co-Investigators are not considered PIs.

Responsibilities

PIs are responsible for following all applicable Harvard University policies, including but not limited to Harvard Research Data Security Policy and Harvard Information Security Policy, as well as any requirements in data use agreements (DUAs) or contracts that impact them.
- PIs are responsible for ensuring all accounts they sponsor follow all applicable Harvard University policies, including but not limited to Harvard Research Data Security Policy and Harvard Information Security Policy, as well as any requirements in data use agreements or contracts that impact them.
PIs are responsible for creating and maintaining accurate data documentation in the Harvard Compliance System, as required by University policies, and complying with approved data security and management plans. See guidance on which applications are needed for your data.
PIs are responsible for submitting FASSE project requests for any data security level (DSL) 3 data they plan to use at FAS RC and keeping associated data in the specific FASSE storage provided for these projects.
PIs are responsible for informing FAS RC of any changes to Research Administration applications (e.g. DAT12-1234, DUA12-1234, IRB12-1234) governing data they plan to use for their FASSE projects, before moving new data to FAS RC storage for these projects. This includes informing FASRC before adding data from a new application (e.g. DUA12-1234) to an existing FASSE project.
PIs are responsible for ensuring that any access they approve complies with all applicable Harvard University policies and DUA or compliance regimes. For example, among many other scenarios:
- If a DUA requires informing or obtaining approval from the data provider before providing access to the data, the PI must ensure this is done before they approve the associated FAS RC access
- If a DUA states that only Harvard staff may have access to the data, the PI is responsible for ensuring they never approve access to non-Harvard members to that data (e.g. external collaborators)
PIs are responsible for informing FAS RC when an account they have sponsored should be disabled (i.e., if the person has left or the account should otherwise be disabled)
PIs are responsible for informing FAS RC when any accounts should be removed from groups they manage
PIs are responsible for informing FAS RC if and when data needs secure disposal/sanitization, either as required by Harvard University policy or a DUA

Upcoming Responsibilities

Coming soon: PIs are responsible for reviewing accounts they sponsor on an annual basis [1]
Coming soon: PIs are responsible for reviewing access to groups they manage on an annual basis [1]

[1] If you would like to review spreadsheets of accounts you sponsor and group memberships for groups you approve, please contact rchelp@rc.fas.harvard.edu and ask for account and access review spreadsheets.

Open OnDemand (OOD/VDI) Remote Desktop: How to open software

Introduction

In this document, you can see how to launch different software in the Open OnDemand (OOD) Remote Desktop app (available at rcood.rc.fas.harvard.edu)

Step 1: Connect to the FASRC VPN (see VPN setup documentation)

Step 2: Launch the Remote Desktop app

Cannon cluster
FASSE cluster

Step 3: When the Remote Desktop app opens, click the terminal icon to launch a terminal (or click Applications -> Terminal Emulator).

Step 4: Below, you can follow the instructions to launch various software.

Keep in mind that, for the most part, the terminal window must remain open. If the terminal window is closed, the software launched via the terminal will also be closed.

Training Session: FASRC Open On Demand Users Training

Remote Desktop login

To comply with Harvard’s security policy, if the Remote Desktop session becomes idle, the Remote Desktop session will lock. You need to enter your FASRC password to log back in.

Abaqus

In the terminal, type the commands to load the modules and launch Abaqus

[jharvard@holy7c24102 ~]$ module load abaqus
[jharvard@holy7c24102 ~]$ export LANG=en_US
[jharvard@holy7c24102 ~]$ abaqus cae -mesa cpus=$SLURM_CPUS_PER_TASK &

You can see all versions of Abaqus with module spider abaqus. For more details, see the modules page.

The Abaqus license is restricted to SEAS. For more information, see our Abaqus docs.

Comsol

In the terminal, type the commands to load the modules and launch Comsol

[jharvard@holy7c24102 ~]$ module load comsol
[jharvard@holy7c24102 ~]$ export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
[jharvard@holy7c24102 ~]$ comsol -3drend sw -np $SLURM_CPUS_PER_TASK &

You can see all versions of Comsol with module spider comsol. For more details, see the modules page.

The Comsol license is restricted to SEAS. For more information, see our Comsol docs.

For how to set the Comsol temporary directory, see our Comsol Troubleshooting doc.

Jupyter Notebook

(optional) Creating and loading a mamba/conda environment

Note: this is a one-time setup to ensure that your conda environment can be loaded in Jupyter Notebook.

See our Python documentation on how to create a conda environment.

Then, in order to see your conda environment in Jupyter Notebook, ensure that you have installed the packages ipykernel and nb_conda_kernels. To do so, launch a terminal in the Remote Desktop and type the commands:

[jharvard@holy7c24102 ~]$ module load python
[jharvard@holy7c24102 ~]$ source activate my_conda_environment
[jharvard@holy7c24102 ~]$ mamba install ipykernel
[jharvard@holy7c24102 ~]$ mamba install nb_conda_kernels
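To confirm that both packages are importable from the activated environment, a short Python check such as the following could be run with that environment's interpreter. It is a sketch only; the helper name `missing_packages` is hypothetical, and it tests importability rather than whether Jupyter will list the kernel.

```python
from importlib.util import find_spec

# Packages the instructions above ask you to install.
REQUIRED = ("ipykernel", "nb_conda_kernels")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all required packages found")
```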

For more information on creating conda environments for TensorFlow and PyTorch, see our GitHub documentation:

You can see all versions of Python with module spider python. For more details, see the modules page.

Launching Jupyter Notebook

In the Remote Desktop terminal, type the commands to load the modules and launch Jupyter Notebook:

[jharvard@holy7c24102 ~]$ module load python
# (optional) load conda environment
[jharvard@holy7c24102 ~]$ source activate my_conda_environment
# launch jupyter notebook
[jharvard@holy7c24102 ~]$ jupyter notebook

After you run the jupyter notebook command, the terminal may appear to hang for a few seconds. Be patient; a Firefox window will open soon after.

To select my_conda_environment as the kernel, go to Kernel -> Change kernel, and select the kernel (i.e. conda environment) of your choice.

Note: If you prefer to launch Jupyter Lab, note that conda environments cannot be loaded when using Jupyter Lab. Only the base environment is available.

Cleanly close Jupyter Notebook

These are instructions to shut down your Jupyter server so that you can exit the job cleanly.

First, close each Jupyter Notebook you have open: click on File -> Close and Halt.

Then, from the Jupyter Notebook Home Page (where you can browse files and folders), on the top right corner, click on “Quit”. Close the Firefox window.

KNIME

In the terminal, type the following commands to load the module and launch Knime.

[jharvard@holy7c24102 ~]$ module load knime
[jharvard@holy7c24102 ~]$ knime &

You can see all versions of KNIME with module spider knime. For more details, see the modules page.

LibreOffice

LibreOffice is a free and open source suite that is compatible with a wide range of formats, including those from Microsoft Word (.doc, .docx), Excel (.xls, .xlsx), PowerPoint (.ppt, .pptx) and Publisher.

LibreOffice is available on the FASRC cluster (both Cannon and FASSE) through a Singularity image. Therefore, LibreOffice is only available through the Remote Desktop app. LibreOffice does not work in the Containerized Remote Desktop app.

In the terminal, run the following command to pull the LibreOffice container and build a Singularity image from it. You only need to run this command once.

[jharvard@holy7c24102 ~]$ singularity pull docker://linuxserver/libreoffice

To launch LibreOffice, in the terminal, run the command

[jharvard@holy7c24102 ~]$ singularity exec --cleanenv --env DISPLAY=$DISPLAY libreoffice_latest.sif soffice
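
Because the pull is only needed once, a small helper can decide whether it still has to run. The sketch below is an illustration only: the function name libreoffice_commands is hypothetical, and it assumes the image file libreoffice_latest.sif sits in the directory from which you launch LibreOffice. It returns the commands to type and does not execute them.

```python
from pathlib import Path

# Image file name and commands copied from the instructions above.
IMAGE_NAME = "libreoffice_latest.sif"
PULL = "singularity pull docker://linuxserver/libreoffice"
LAUNCH = (
    "singularity exec --cleanenv --env DISPLAY=$DISPLAY "
    "libreoffice_latest.sif soffice"
)


def libreoffice_commands(workdir="."):
    """Return the commands to run from workdir, skipping the pull if the image exists."""
    commands = []
    if not (Path(workdir) / IMAGE_NAME).exists():
        commands.append(PULL)
    commands.append(LAUNCH)
    return commands


if __name__ == "__main__":
    for command in libreoffice_commands():
        print(command)
```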

Lumerical

In the terminal, type the commands to load the modules and launch Lumerical

[jharvard@holy7c24102 ~]$ module load lumerical-seas
[jharvard@holy7c24102 ~]$ launcher

The Lumerical license is restricted to SEAS. For more information, see our Lumerical docs.

You can see all versions of Lumerical with module spider lumerical. For more details, see the modules page.

Mathematica

In the terminal, type the commands to load the modules and launch Mathematica

[jharvard@holy7c24102 ~]$ module load mathematica
[jharvard@holy7c24102 ~]$ mathematica

You can see all versions of Mathematica with module spider mathematica. For more details, see the modules page.

Matlab

In the terminal, type the commands to load the modules and launch Matlab

[jharvard@holy7c24102 ~]$ module load matlab
[jharvard@holy7c24102 ~]$ matlab -desktop -softwareopengl

You can see all versions of Matlab with module spider matlab. For more details, see the modules page.

MOE

In the terminal, type the commands to load the modules and launch MOE

[jharvard@holy7c24102 ~]$ module load moe
[jharvard@holy7c24102 ~]$ moe

You can see all versions of MOE with module spider moe. For more details, see the modules page.

MOE databases

FASRC has MOE databases available in two locations:

Most of the MOE Auxiliary Databases are available to everyone with cluster access in /n/holylabs/rc_admin/Everyone/moe_databases.
Databases are also available in the $MOE/project folder. To open them, go to File -> Open and type $MOE/project in the address bar.

RStudio Desktop

In the terminal, type the commands to load modules

[jharvard@holy7c24102 ~]$ module load R
[jharvard@holy7c24102 ~]$ module load rstudio

Set environmental variables

[jharvard@holy7c24102 ~]$ unset R_LIBS_SITE
[jharvard@holy7c24102 ~]$ mkdir -p $HOME/apps/R_version
[jharvard@holy7c24102 ~]$ export R_LIBS_USER=$HOME/apps/R_version:$R_LIBS_USER

Launch RStudio Desktop

[jharvard@holy7c24102 ~]$ rstudio

# vanilla option (combines --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ)
[jharvard@holy7c24102 ~]$ rstudio --vanilla

You can see all versions of R and RStudio with module spider R and module spider rstudio, respectively. For more details, see the modules page.

Remoteviz Partition

The “FAS-RC Remote Visualization” Open OnDemand (or VDI) app has been decommissioned.

SageMath

You can use Sage either in an interactive shell through the command line interface or by launching a Jupyter Notebook with the SageMath kernel. To launch a Jupyter Notebook, in the terminal, type the commands to load the modules and launch Jupyter

[jharvard@holy7c24102 ~]$ module load sage
[jharvard@holy7c24102 ~]$ sage -n jupyter

Ensure that you have “SageMath” kernel selected. If not, go to Kernel -> Change kernel, and select SageMath.

For examples, see the Sage documentation:

interactive shell examples
Jupyter Notebook examples

You can see all versions of SageMath with module spider sage. For more details, see the modules page.

SAS

In the terminal, type the commands to load the modules and launch SAS

[jharvard@holy7c24102 ~]$ module load sas
[jharvard@holy7c24102 ~]$ sas &

Stata

In the terminal, type the commands to load the module and launch Stata

[jharvard@holy7c24102 ~]$ module load stata/17.0-fasrc01

# if you are using single-core jobs
[jharvard@holy7c24102 ~]$ xstata-se

# if you are using multi-core jobs
[jharvard@holy7c24102 ~]$ xstata-mp "set processors $SLURM_CPUS_PER_TASK"
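
The choice between the two launchers can be sketched as a function of the cores Slurm allocated to the job. This is an illustration only: the helper name stata_command is hypothetical, and treating an unset SLURM_CPUS_PER_TASK as a single core is an assumption. It returns the command and its argument without running them.

```python
import os


def stata_command(env=None):
    """Pick the Stata launcher from the number of cores allocated to the job."""
    env = os.environ if env is None else env
    # Assumption: an unset SLURM_CPUS_PER_TASK means a single-core job.
    cpus = int(env.get("SLURM_CPUS_PER_TASK", "1"))
    if cpus > 1:
        # Multi-core job: tell Stata/MP how many processors it may use.
        return ["xstata-mp", f"set processors {cpus}"]
    # Single-core job: use Stata/SE.
    return ["xstata-se"]


if __name__ == "__main__":
    print(stata_command())
```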

TensorBoard

For TensorBoard, you will first need to create a conda environment (Step 1). You only need to create a conda environment once. If you have created one, you can skip to Step 2. Or, if you have your own environment, make sure you install the TensorBoard package, and then you can skip to Step 2.

Step 1: Create conda environment

In a terminal, load the Mambaforge or Python module, create a conda environment, activate it, and install TensorBoard inside the environment.

[jharvard@holy7c24102 ~]$ module load python
[jharvard@holy7c24102 ~]$ module load cuda/11.7.1-fasrc01
[jharvard@holy7c24102 ~]$ module load cudnn/8.5.0.96_cuda11-fasrc01
[jharvard@holy7c24102 ~]$ conda create -n tb_tf2.10_cuda11 python=3.10 pip numpy six wheel scipy pandas matplotlib seaborn h5py jupyterlab
[jharvard@holy7c24102 ~]$ source activate tb_tf2.10_cuda11
[jharvard@holy7c24102 ~]$ conda install -c conda-forge tensorboard
[jharvard@holy7c24102 ~]$ conda install -c conda-forge tensorflow

You can see different versions of Mambaforge or Python in our modules page.

Step 2: Activate conda environment and launch TensorBoard

In a terminal, set up the variables for TensorBoard. Make sure that the data you need to visualize in TensorBoard is located in the log directory MY_TB_LOGDIR. You can either use the suggested path below or use another location that better suits your workflow.

# Find available port to run server on (does not output anything to screen)
[jharvard@holy7c24102 ~]$ for myport in {6818..11845}; do ! nc -z localhost ${myport} && break; done

# setup tensorboard environmental variables
[jharvard@holy7c24102 ~]$ export MY_TB_PORT=${myport}
[jharvard@holy7c24102 ~]$ export MY_TB_BASEURL=/node/${host}/${myport}/
[jharvard@holy7c24102 ~]$ export MY_TB_LOGDIR=$HOME/.tensorboard/log/$SLURM_JOBID
[jharvard@holy7c24102 ~]$ mkdir -p $MY_TB_LOGDIR

# load module, activate conda environment, and launch tensorboard
[jharvard@holy7c24102 ~]$ module load python
[jharvard@holy7c24102 ~]$ module load cuda/11.7.1-fasrc01
[jharvard@holy7c24102 ~]$ module load cudnn/8.5.0.96_cuda11-fasrc01
[jharvard@holy7c24102 ~]$ source activate tb_tf2.10_cuda11 
(tb_tf2.10_cuda11) tensorboard --host localhost --port ${MY_TB_PORT} --logdir ${MY_TB_LOGDIR} --path_prefix ${MY_TB_BASEURL}

Right-click on the link that starts with “http://localhost” and click on “Open Link”. This will open a Firefox browser, where you can view your results.
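
The port search in Step 2 takes the first port in the range 6818 to 11845 on which nothing is listening. A rough Python equivalent is sketched below for readers who want to see the logic spelled out. The function names are hypothetical, and unlike nc -z it only tests IPv4 connections, which is an assumption of this sketch.

```python
import socket


def port_in_use(port, host="localhost"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


def find_free_port(first=6818, last=11845, host="localhost"):
    """Return the first port in [first, last] with no listener, like the nc loop."""
    for port in range(first, last + 1):
        if not port_in_use(port, host):
            return port
    raise RuntimeError(f"no free port between {first} and {last}")


if __name__ == "__main__":
    print(find_free_port())
```

As with the shell loop, another process could claim the port between the check and the moment TensorBoard starts, so rerun the search if TensorBoard reports that the port is taken.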

Example

Using the environment created in Step 1, run the small program tb_test.py in a directory of your choice and visualize its results.

Source code of tb_test.py:

import os
import tensorflow as tf
import datetime

def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

logdir = os.getenv('MY_TB_LOGDIR')
print(logdir)

tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir, histogram_freq=1)
model.fit(x=x_train, 
          y=y_train, 
          epochs=5, 
          validation_data=(x_test, y_test), 
          callbacks=[tensorboard_callback])
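
Note that tb_test.py reads its log directory from MY_TB_LOGDIR, and os.getenv returns None when that variable has not been exported. One way to fail early with a clear message is sketched below. The helper name require_logdir is hypothetical and is not part of tb_test.py; creating the directory if it is missing is a choice made for this sketch.

```python
import os
from pathlib import Path


def require_logdir(env=None):
    """Return the TensorBoard log directory from MY_TB_LOGDIR, creating it if needed."""
    env = os.environ if env is None else env
    value = env.get("MY_TB_LOGDIR")
    if not value:
        raise RuntimeError(
            "MY_TB_LOGDIR is not set; export it before running tb_test.py"
        )
    logdir = Path(value)
    # Same effect as `mkdir -p $MY_TB_LOGDIR` in the shell.
    logdir.mkdir(parents=True, exist_ok=True)
    return logdir
```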

Set up variables and run tb_test.py

# go to the directory that contains your tb_test.py file
[jharvard@holy7c24102 ~]$ cd tb_example

# Find available port to run server on (does not output anything to screen)
[jharvard@holy7c24102 tb_example]$ for myport in {6818..11845}; do ! nc -z localhost ${myport} && break; done

# setup tensorboard environmental variables
[jharvard@holy7c24102 tb_example]$ export MY_TB_PORT=${myport}
[jharvard@holy7c24102 tb_example]$ export MY_TB_BASEURL=/node/${host}/${myport}/

# this command will set MY_TB_LOGDIR to your current working directory
[jharvard@holy7c24102 tb_example]$ export MY_TB_LOGDIR=$PWD

# load modules and activate conda environment
[jharvard@holy7c24102 tb_example]$ module load python
[jharvard@holy7c24102 tb_example]$ module load cuda/11.7.1-fasrc01
[jharvard@holy7c24102 tb_example]$ module load cudnn/8.5.0.96_cuda11-fasrc01
[jharvard@holy7c24102 tb_example]$ source activate tb_tf2.10_cuda11

# run python code
(tb_tf2.10_cuda11) python tb_test.py

# launch tensorboard
(tb_tf2.10_cuda11) tensorboard --host localhost --port ${MY_TB_PORT} --logdir ${MY_TB_LOGDIR} --path_prefix ${MY_TB_BASEURL}

Right-click on the link that starts with “http://localhost” and click on “Open Link”. This will open a Firefox browser, where you can view your results.
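
If the browser opens but shows no data, it can help to confirm that the run actually wrote event files into the log directory. The sketch below is an illustration only: the helper name find_event_files is hypothetical. It searches recursively for files whose names start with events.out.tfevents, the prefix TensorBoard uses for its event files.

```python
import os
from pathlib import Path


def find_event_files(logdir):
    """Return TensorBoard event files found anywhere under logdir, sorted by path."""
    return sorted(Path(logdir).rglob("events.out.tfevents.*"))


if __name__ == "__main__":
    # Falls back to the current directory when MY_TB_LOGDIR is not set.
    for path in find_event_files(os.getenv("MY_TB_LOGDIR", ".")):
        print(path)
```

With the Keras callback used in tb_test.py, the event files typically appear in train and validation subdirectories of the log directory, which is why the search is recursive.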

TotalView

TotalView is a debugging tool particularly suitable for parallel applications. The modules you need to load depend on the compilers used to build the code you are trying to debug. Because of this compiler dependency, see our more detailed TotalView documentation.

Visual Studio Code

In the terminal, type the commands to load the modules and launch Visual Studio Code

[jharvard@holy7c24102 ~]$ module load vscode
[jharvard@holy7c24102 ~]$ code --user-data-dir $HOME/.vscode/data/ &

You can see all versions of Visual Studio Code with module spider vscode. For more details, see the modules page.
