Software – FASRC DOCS (https://docs.rc.fas.harvard.edu)

Distributed MultiThreaded CheckPointing
https://docs.rc.fas.harvard.edu/distributed-multithreaded-checkpointing/
Tue, 29 Jul 2025 19:41:25 +0000

Overview

Distributed MultiThreaded CheckPointing (DMTCP) is a library that adds checkpointing to your code without requiring a code rewrite. DMTCP is designed to work with codes that are serial or threaded, allowing users to create restart points on the fly. DMTCP will not work with GPU or MPI codes. Make sure you have sufficient storage space for any checkpoint dumps created by DMTCP.
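
One way to act on the storage advice is to check free space before starting a checkpointed run. The sketch below is only an illustration: the helper names free_kib and has_room_for_checkpoint are hypothetical, and the amount of space you need is an estimate you supply yourself (a checkpoint image is typically on the order of the memory footprint of the running process, but this varies).

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of DMTCP.

# Print the free space, in KiB, on the filesystem that holds directory $1.
# "df -Pk" gives POSIX-format output in 1024-byte units; column 4 is "Available".
free_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed if directory $1 has at least $2 KiB free, fail otherwise.
has_room_for_checkpoint() {
    local dir="$1" needed_kib="$2"
    [ "$(free_kib "$dir")" -ge "$needed_kib" ]
}

# Example (not run automatically): require roughly 8 GiB in the current directory.
#   has_room_for_checkpoint "$PWD" $((8 * 1024 * 1024)) || echo "Not enough space"
```

If the run writes several checkpoints over time, budget for the number of images you intend to keep, not just one.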

Usage

DMTCP is provided as a module and can be loaded using module load dmtcp. It is recommended that users select a specific version of DMTCP and note which version they are using, as different versions of DMTCP may not be compatible with each other. For more information, see the DMTCP documentation.
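
The workflow described above might look like the following minimal sketch. It makes several assumptions: the program ./my_program, the checkpoint directory, and the one-hour interval are placeholders; the version string must be one actually listed by module avail dmtcp; and the name/version module format is the usual Lmod convention, not something confirmed here. dmtcp_launch and dmtcp_restart are the standard DMTCP commands, but check the DMTCP documentation for the options supported by the version you load.

```shell
#!/bin/bash
# Sketch only: the version, checkpoint directory, and program are placeholders.

# Build a pinned module name of the form "dmtcp/VERSION".
# Falls back to the default module when no version is given.
dmtcp_module_name() {
    local version="$1"
    if [ -n "$version" ]; then
        printf 'dmtcp/%s\n' "$version"
    else
        printf 'dmtcp\n'
    fi
}

# Load the chosen DMTCP module and start a program under checkpoint control.
# Usage: run_with_checkpoints VERSION CKPT_DIR COMMAND [ARGS...]
run_with_checkpoints() {
    local version="$1" ckpt_dir="$2"
    shift 2
    module load "$(dmtcp_module_name "$version")"
    mkdir -p "$ckpt_dir"
    # --ckptdir: directory where checkpoint images are written
    # -i: interval between automatic checkpoints, in seconds
    dmtcp_launch --ckptdir "$ckpt_dir" -i 3600 "$@"
}

# Example (not run automatically):
#   run_with_checkpoints "$DMTCP_VERSION" "$PWD/ckpt" ./my_program
# To resume later, load the same module version and restart from the images:
#   dmtcp_restart "$PWD"/ckpt/ckpt_*.dmtcp
```

Recording the version in a variable alongside the job script makes it easier to restart with the same DMTCP version that wrote the checkpoint.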
