Running snakebids apps on Compute Canada and CBS Server

switt4 · April 13, 2023, 5:30pm

Below are instructions for running snakebids (https://github.com/akhanf/snakebids) apps on Compute Canada and the CBS server. This tutorial uses denoise-fmri as an example, but the instructions can easily be extrapolated to other snakebids apps and, ultimately, to other snakemake pipelines.

This tutorial does not cover more nuanced details, such as editing a snakemake or snakebids config file or running anything other than the full pipeline.

Step 1: Clone snakebids app repository

The snakebids app you want to run will most likely be hosted in a GitHub repository. Clone the repository to your home directory, or to your personal folder within your PI’s allocation (on Compute Canada) or your PI’s datashare (on the CBS Server).

For the example case of denoise-fmri, this step would be:
git clone https://github.com/akhanf/denoise-fmri

Step 2: Set up the virtual environment

Set up a venv virtual environment and install snakebids (complete instructions here: Creating general purpose python virtual environment on CBS server).

Set up a virtual environment: python3.9 -m venv ~/venv
Activate it: source ~/venv/bin/activate
Update pip: pip install --upgrade pip
Install snakebids: pip install snakebids

Depending on which snakebids app you are using, you may also need to install app-specific dependencies. Check whether the repository has a requirements.txt or setup.py file. The denoise-fmri repository has a setup.py file that lists all of the necessary dependencies.

Navigate to wherever you cloned the snakebids app repository and make sure you are in the directory that contains the setup.py file.

Install the dependencies: pip install .
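The check for a requirements.txt or setup.py file can also be scripted. The sketch below is only an illustration: install_hint is a hypothetical helper name, not something provided by snakebids, and it prints the pip command it would suggest instead of installing anything.

```shell
# Hypothetical helper (not part of snakebids): print the pip command that
# matches the dependency file found in a cloned app repository.
# It only prints a suggestion; nothing is installed.
install_hint() {
    repo_dir="$1"
    if [ -f "$repo_dir/setup.py" ]; then
        # setup.py declares the dependencies, so install the package itself
        echo "pip install $repo_dir"
    elif [ -f "$repo_dir/requirements.txt" ]; then
        # plain list of requirements
        echo "pip install -r $repo_dir/requirements.txt"
    else
        echo "no setup.py or requirements.txt in $repo_dir" >&2
        return 1
    fi
}
```

Called with the path to the cloned denoise-fmri repository, this should suggest a pip install of that directory, since the repository ships a setup.py.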

When finished with the virtual environment (keep it active while you run the steps below, since the app needs the packages installed in it):

Close out the virtual environment: deactivate

Step 3: Dry-run the snakebids app

With snakemake pipelines and snakebids apps, it is always good practice to perform a dry-run before running the job with data.

For the denoise-fmri example, a dry-run command would look like:

/full/path/to/cloned_repository/denoise_fmri/run.py /full/path/to/input/data /full/path/to/output/directory participant -np

It is generally best to specify full paths rather than relative paths, especially when running on Compute Canada. The option -np combines two snakemake flags: -n performs a dry-run, and -p prints the shell commands of each rule, so the rules and commands that would be executed are listed in the terminal without anything actually running. As long as everything looks correct and there are no error messages, you can safely run on real data.
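Because full paths are recommended, it can help to resolve relative paths before building the dry-run command. The following is a minimal sketch under that assumption; abs_dir and dry_run_cmd are hypothetical helper names, and the function prints the command instead of running it so you can inspect it first.

```shell
# Hypothetical helper: turn an existing directory path into an absolute path.
abs_dir() {
    (cd "$1" && pwd)
}

# Hypothetical helper: print the dry-run command for a snakebids app.
# $1 = path to run.py, $2 = input directory, $3 = output directory
dry_run_cmd() {
    run_py="$1"
    in_dir=$(abs_dir "$2") || return 1
    # The output directory may not exist yet, so resolve its parent instead
    out_parent=$(abs_dir "$(dirname "$3")") || return 1
    out_dir="$out_parent/$(basename "$3")"
    echo "$run_py $in_dir $out_dir participant -np"
}
```

For example, running dry_run_cmd /full/path/to/cloned_repository/denoise_fmri/run.py fmriprep_output denoise_output from the directory that holds fmriprep_output prints the full command with both directories expanded to absolute paths.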

Step 4a: Running on Compute Canada

Since Compute Canada relies on a SLURM scheduler, we need to create a shell script to submit to the scheduler.

The contents of the shell script for denoise-fmri should look something like:

#!/bin/bash

source /<path_to_snakebids_venv>/bin/activate
/<path_to_local_repository_clone>/denoise-fmri/denoise_fmri/run.py /<path_to_fmriprep_output>/fmriprep_output /<path_to_output>/denoise_output participant -c1

The shell script first activates the virtual environment containing the snakebids and denoise-fmri dependencies. It then runs denoise-fmri on an existing fMRIPrep output directory and writes the results to denoise_output. Since denoise-fmri is not a resource-intensive pipeline, the number of cores is set to 1, as indicated by -c1 at the end of the command. For a default regularSubmit job, you can increase the number of cores up to 8 (-c8).

Before submitting the script through Compute Canada’s SLURM system, you will need to grant all users execute permission on the script. From the directory where the script is stored, run the following at the terminal prompt:

chmod a+x my_script.sh

Once you have created and saved your shell script and set the execute permissions, you can use regularSubmit contained in neuroglia-helpers (https://github.com/khanlab/neuroglia-helpers) to submit your script as a SLURM job to the queue.

regularSubmit /full/path/to/my_script.sh

regularSubmit will submit the snakebids app as a standard job with 8 cores, 32 GB of memory, and a 24-hour time limit.
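If neuroglia-helpers is not available, a similar request can be written directly as SLURM directives. The sketch below is an assumption about a roughly equivalent plain sbatch job, not a description of what regularSubmit does internally: it mirrors the 8-core, 32G, 24-hour request, keeps the placeholder paths from the script above, and uses the hypothetical file name denoise_job.sh. Your cluster may additionally require an --account directive.

```shell
# Write a plain SLURM job script. The quoted 'EOF' stops the shell from
# expanding anything inside, so the placeholders are written out verbatim
# and must be edited before the job is submitted.
cat > denoise_job.sh <<'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=24:00:00

source /<path_to_snakebids_venv>/bin/activate
# SLURM sets SLURM_CPUS_PER_TASK to the value requested above; fall back to 1
/<path_to_local_repository_clone>/denoise-fmri/denoise_fmri/run.py /<path_to_fmriprep_output>/fmriprep_output /<path_to_output>/denoise_output participant -c"${SLURM_CPUS_PER_TASK:-1}"
EOF
chmod a+x denoise_job.sh
# After replacing the placeholder paths, submit with: sbatch denoise_job.sh
```

Passing the core count through SLURM_CPUS_PER_TASK keeps the snakemake -c value in step with the cores actually requested from the scheduler.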

Step 4b: Running on the CBS Server

Since the CBS server does not have a scheduler, you can run the commands directly from the terminal.

/full/path/to/cloned_repository/denoise_fmri/run.py /full/path/to/input /full/path/to/output participant -c1

Again, we set the number of cores to 1 (-c1); however, you can increase this to 4 (-c4) on CBS Basic and to 12 (-c12) on CBS Heavy.
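If you switch between the two server tiers, the core limits above can be captured in a small helper. This is only a sketch: cbs_cores_flag and the tier labels basic and heavy are hypothetical names chosen for illustration, and the numbers are the upper limits quoted above.

```shell
# Hypothetical helper: print the snakemake core flag for a CBS server tier,
# using the upper limits quoted above (4 for Basic, 12 for Heavy).
cbs_cores_flag() {
    case "$1" in
        basic) echo "-c4" ;;
        heavy) echo "-c12" ;;
        *)     echo "-c1" ;;   # unknown tier: fall back to a single core
    esac
}
```

The flag can then be substituted into the run command, for example by ending it with participant $(cbs_cores_flag heavy) instead of participant -c1.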