catena: a Python Utility for Submitting Work to a SLURM Cluster
catena is a Python library for interacting with SLURM through its REST API. It focuses on submitting jobs to a SLURM cluster, either locally or remotely, and can be extended to other schedulers that expose a suitable API.
📝 Note: catena currently works only when run locally on a SLURM HPC cluster; remote submission is not yet supported.
📋 Key Features
- Provides 'job' classes that let work be orchestrated through SLURM programmatically in Python
- Defines schemas with sensible defaults and validators for the SLURM /job/submit request
- Lets end users orchestrate multiple jobs written in various programming languages using Job Manifests
- Supports building pipelines of interdependent jobs (DAGs), written in various programming languages, and running them on a SLURM HPC using Job Pipelines
- Provides the ability to share and cache job results between almost any programming languages
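To illustrate what a schema with sensible defaults and validators for a job-submit request might look like, here is a minimal sketch using only the standard library. The class name `JobSubmitSketch`, its fields, its default values, and the payload layout are all hypothetical; they are not catena's actual API and have not been checked against any particular slurmrestd version. It assumes Python 3.7+ for `dataclasses`.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class JobSubmitSketch:
    """Hypothetical stand-in for a job-submit schema; not catena's actual class."""

    script: str                    # full batch script, including the shebang line
    name: str = "catena-job"       # assumed default job name
    partition: str = "batch"       # assumed default partition; this is site-specific
    time_limit_minutes: int = 60   # assumed default wall-time limit
    environment: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # Validate eagerly so a bad request fails before it reaches the scheduler.
        if not self.script.startswith("#!"):
            raise ValueError("script must start with a shebang line, e.g. #!/bin/bash")
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if self.time_limit_minutes <= 0:
            raise ValueError("time_limit_minutes must be positive")

    def to_payload(self):
        # Illustrative body layout only; verify field names against your slurmrestd version.
        return {
            "script": self.script,
            "job": {
                "name": self.name,
                "partition": self.partition,
                "time_limit": self.time_limit_minutes,
                "environment": dict(self.environment),
            },
        }


if __name__ == "__main__":
    request = JobSubmitSketch(script="#!/bin/bash\necho hello")
    print(request.to_payload())
```

Validating in `__post_init__` means an invalid request raises `ValueError` at construction time rather than after a round trip to the cluster.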
📝 Note: This project is still under development.
Quickstart
Clone the repository and try out some examples
📍 Move into the repo and load anaconda3/2021.05 (or your favourite version, as long as Python >= 3.6)
$ cd catena/examples
$ module load anaconda3/2021.05
📍 Install all requirements for catena
$ pip3 install --user -r requirements.txt
TODO
- Local submission of a single job script in any language, programmatically in Python
- Local submission of many job scripts in various languages using Job Manifests
- Unit test all components involved in the above
- Local submission of many interdependent job scripts represented as a DAG using Job Pipelines
- Build out custom loggers using loguru (e.g. job monitor, verbose vs. silent/to file)
- Unit test all components involved in the above
- Set up GitHub Actions for CI/CD
- Extend all of the above to remote job execution; this will require the ability to mount remote files, possibly using a custom variation on squashfs mixed with gRPC
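As a rough illustration of the DAG item in the list above, interdependent jobs could be ordered for submission with a topological sort, and each job's parents expressed as a SLURM `afterok` dependency string. This is only a sketch under stated assumptions: the function names `submission_order` and `dependency_flag`, the example job names, and the job IDs are hypothetical and do not describe how catena's Job Pipelines are implemented.

```python
from collections import deque


def submission_order(dependencies):
    """Return job names so that every job comes after the jobs it depends on.

    `dependencies` maps a job name to the list of job names it depends on.
    Uses Kahn's algorithm and raises ValueError if the graph has a cycle.
    """
    # Collect every job, including ones that appear only as a dependency.
    jobs = set(dependencies)
    for deps in dependencies.values():
        jobs.update(deps)

    # Count unmet dependencies per job and record who depends on whom.
    remaining = {job: len(set(dependencies.get(job, ()))) for job in jobs}
    dependents = {job: [] for job in jobs}
    for job, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(job)

    # Start with jobs that have no dependencies; sort for a stable order.
    ready = deque(sorted(job for job, count in remaining.items() if count == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in sorted(dependents[job]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(jobs):
        raise ValueError("dependency cycle detected")
    return order


def dependency_flag(parent_job_ids):
    """Build a dependency string in sbatch's afterok form, e.g. 'afterok:101:102'."""
    if not parent_job_ids:
        return ""
    return "afterok:" + ":".join(str(job_id) for job_id in parent_job_ids)


if __name__ == "__main__":
    # Hypothetical pipeline: each key depends on the jobs in its list.
    pipeline = {"align": ["fetch"], "stats": ["align"], "plot": ["align", "stats"]}
    print(submission_order(pipeline))
    print(dependency_flag([101, 102]))
```

Submitting in this order means each job's parents already have SLURM job IDs by the time the job itself is submitted, so the dependency string can be filled in as the pipeline is walked.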