
catena: a Python Utility for Submitting Work to a SLURM Cluster

catena is a Python library for interacting with SLURM through its REST API. It focuses on submitting jobs to a SLURM cluster, either locally or remotely, and can be extended to other schedulers that expose a suitable API.

📝 Note: catena currently only works when run locally on a SLURM HPC; remote job execution is planned.


📋 Key Features

  • Provides 'job' classes that allow work to be orchestrated through SLURM programmatically in Python

  • Defines schemas with sensible defaults and validators for the SLURM /job/submit request

  • Affords end-users the ability to orchestrate multiple jobs in various programming languages using Job Manifests

  • Allows pipelines of interdependent jobs (DAGs), written in various programming languages, to be built and run on a SLURM HPC using Job Pipelines

  • Provides the ability to share and cache job results between almost any programming language
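To make the "schemas with sensible defaults and validators" idea concrete, here is a minimal sketch of how a /job/submit request body could be assembled and checked before it is sent. It is not catena's actual API: the function `build_submit_payload` and its defaults are hypothetical, and the field names (`script`, `job`, `name`, `current_working_directory`, `environment`, `partition`) are assumptions based on common versions of the SLURM REST API, whose exact layout varies between API versions.

```python
import json


def build_submit_payload(script, name="catena-job", cwd="/tmp",
                         partition=None, environment=None):
    """Build a dict shaped like a SLURM REST /job/submit body.

    Hypothetical helper: the field names are assumptions and may
    differ between SLURM REST API versions.
    """
    # Validator: a batch script needs an interpreter line.
    if not script.startswith("#!"):
        raise ValueError("job script must start with a shebang line")
    # Validator: reject empty or whitespace-only job names.
    if not name or not name.strip():
        raise ValueError("job name must not be empty")

    job = {
        "name": name.strip(),
        "current_working_directory": cwd,
        # Default: a minimal environment so the job can find basic tools.
        "environment": dict(environment) if environment else {"PATH": "/bin:/usr/bin"},
    }
    # Only include the partition when the caller asked for one, so the
    # cluster's default partition applies otherwise.
    if partition is not None:
        job["partition"] = partition

    return {"script": script, "job": job}


if __name__ == "__main__":
    payload = build_submit_payload("#!/bin/bash\necho hello", name="demo")
    print(json.dumps(payload, indent=2))
```

Validating before submission means a malformed job fails fast in Python rather than after a round trip to the scheduler.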

📝 Note: This project is still under development.


Quickstart

Clone the repository and try out some examples

📍 Move into the repo and load anaconda3/2021.05 (or your favourite version, as long as it provides Python >= 3.6)

$ cd catena/examples
$ module load anaconda3/2021.05

📍 Install all requirements for catena

$ pip3 install --user -r requirements.txt

TODO

  • Local submission of a single job script in any language, programmatically in Python
  • Local submission of many job scripts of various languages using Job Manifests
  • Unit test all components involved in above
  • Local submission of many interdependent job scripts represented as a DAG using Job Pipelines
  • Build out custom loggers using loguru (e.g. job monitor, verbose vs. silent/to file)
  • Unit test all components involved in above
  • Set up GitHub Actions for CI/CD
  • Extend all of the above to remote job execution: this will require the ability to mount remote files, possibly using a custom variation on squashfs mixed with gRPC.
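The Job Pipelines item above describes interdependent job scripts represented as a DAG. As a rough sketch of what that involves (not catena's implementation), the jobs can be ordered with Kahn's algorithm so that every job is submitted after the jobs it depends on. The helper names `topological_order` and `dependency_flag` and the example pipeline are hypothetical; the only SLURM-specific detail used is sbatch's `--dependency=afterok:<jobid>[:<jobid>...]` syntax.

```python
from collections import deque


def topological_order(dependencies):
    """Return job names so that each job follows all of its parents.

    `dependencies` maps a job name to the list of job names it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    # Collect every job mentioned, including ones that only appear as parents.
    jobs = set(dependencies)
    for parents in dependencies.values():
        jobs.update(parents)

    indegree = {job: 0 for job in jobs}
    children = {job: [] for job in jobs}
    for job, parents in dependencies.items():
        for parent in parents:
            indegree[job] += 1
            children[parent].append(job)

    # Kahn's algorithm: repeatedly emit jobs with no unmet dependencies.
    ready = deque(sorted(job for job in jobs if indegree[job] == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in sorted(children[job]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(jobs):
        raise ValueError("dependency cycle detected")
    return order


def dependency_flag(parent_job_ids):
    """Format an sbatch --dependency option from parent SLURM job ids."""
    if not parent_job_ids:
        return ""
    return "--dependency=afterok:" + ":".join(str(i) for i in parent_job_ids)


if __name__ == "__main__":
    # Hypothetical pipeline: each key runs after the jobs in its list.
    pipeline = {"align": ["fetch"], "stats": ["align"], "plot": ["align", "stats"]}
    print(topological_order(pipeline))
    print(dependency_flag([101, 102]))
```

Submitting in this order lets each job's SLURM id be known before its children are submitted, so the children can reference it in their dependency option.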