Using SLURM for Large-Scale Analysis

This guide covers deploying HiTMicTools on SLURM clusters for high-throughput processing of large microscopy datasets.

Overview

SLURM (Simple Linux Utility for Resource Management) enables:

  • Parallel Processing: Process hundreds of images simultaneously across multiple nodes

  • Job Arrays: Split large experiments into manageable batches

  • Resource Management: Dedicated GPU/CPU allocation for each task

  • Queue System: Automatic job scheduling and execution

  • Scalability: Handle experiments too large for local workstations

Workflow Summary

  1. Split your dataset into batches using split-files

  2. Generate a SLURM submission script using generate-slurm

  3. Customize the script for your cluster configuration

  4. Submit the job array to SLURM

  5. Monitor progress and collect results

1. Prerequisites

Cluster Access

Ensure you have:

  • SSH access to your SLURM cluster

  • Conda environment set up on the cluster

  • HiTMicTools installed in the cluster environment

  • Model collection files accessible on the cluster (shared filesystem or local copy)

Environment Setup on Cluster

# SSH to cluster
ssh username@your-cluster.domain

# Create conda environment (one-time setup)
conda create -n hitmictools python=3.9
conda activate hitmictools

# Install HiTMicTools
pip install git+https://github.com/phisanti/HiTMicTools

# Optional: Install btrack for tracking support
git clone https://github.com/quantumjot/btrack.git
cd btrack && bash build.sh && pip install . && cd ..

File Organization

Set up your project on the cluster:

/your/cluster/path/project/
├── data/                          # Input images
│   ├── experiment_001.nd2
│   ├── experiment_002.nd2
│   └── ...
├── results/                       # Output directory (created automatically)
├── config/
│   └── analysis_config.yml       # Your configuration file
├── models/
│   └── model_collection_tracking_20250529.zip
├── temp/                          # File blocks (created by split-files)
└── SLURM_jobs_report/            # Job logs (created automatically)
    └── job_name/
        ├── jobid_HiTMicTools.out
        └── jobid_HiTMicTools.err

2. Splitting Files into Batches

The split-files command divides your dataset into manageable chunks for parallel processing.

Basic Usage

# Split 100 images into 10 batches (10 images each)
hitmictools split-files \
    --target-folder ./data \
    --n-blocks 10 \
    --output-dir ./temp

This creates files:

temp/
├── file_block_0.txt    # Files 1-10
├── file_block_1.txt    # Files 11-20
├── ...
└── file_block_9.txt    # Files 91-100

Advanced Options

# Split with filtering and full paths
hitmictools split-files \
    --target-folder /path/to/images \
    --n-blocks 20 \
    --output-dir ./temp \
    --file-pattern "experiment_A.*" \
    --file-extension ".nd2" \
    --return-full-path

Parameters:

  • --target-folder: Directory containing files to split (required)

  • --n-blocks: Number of batches to create (required)

  • --output-dir: Where to save block files (default: ./temp)

  • --file-pattern: Regex pattern to filter files (optional)

  • --file-extension: File extension filter (e.g., .nd2, .tiff)

  • --return-full-path: Write full paths vs. filenames only (default: True)

  • --no-return-full-path: Write only filenames (requires working dir change in SLURM script)

Determining Block Count

Choose n-blocks based on:

  • Total files: More files = more blocks for better parallelization

  • Cluster limits: Check maximum array size (scontrol show config | grep MaxArraySize)

  • Processing time: Aim for 1-6 hours per block (optimal for queue priority)

  • Memory: More blocks = more concurrent jobs = more total memory needed

Example calculations:

  • 100 files, 30 min each → 10 blocks (5 hours/block)

  • 500 files, 10 min each → 25 blocks (3.3 hours/block)

  • 50 files, 2 hours each → 5 blocks (20 hours/block)

3. Generating SLURM Scripts

The generate-slurm command creates a submission script tailored to your cluster.

Basic Command

hitmictools generate-slurm \
    --job-name 'my_analysis' \
    --file-blocks \
    --n-blocks 10 \
    --conda-env 'hitmictools' \
    --config-file './config/analysis_config.yml'

This creates a script that:

  • Processes 10 file blocks as an array job

  • Uses the hitmictools conda environment

  • Runs the analysis defined in analysis_config.yml

All Available Options

hitmictools generate-slurm \
    --job-name 'experiment_001' \
    --config-file './config/analysis_config.yml' \
    --file-blocks \
    --n-blocks 10 \
    --conda-env 'hitmictools' \
    --email 'your.email@domain.com' \
    --partition 'rtx4090' \
    --qos 'gpu6hours' \
    --time '06:00:00' \
    --memory '25G' \
    --gpu-count 1 \
    --cpu-count 4 \
    --work-dir '/path/to/project'

Parameters:

Parameter

Description

Default

--job-name

SLURM job name

Required

--config-file

Path to config YAML

Required

--file-blocks

Enable array job mode

False

--n-blocks

Number of array tasks

10

--conda-env

Conda environment name

img_analysis

--email

Email for notifications

your.email@unibas.ch

--partition

SLURM partition

rtx4090

--qos

Quality of service

gpu6hours

--time

Max walltime

06:00:00

--memory

RAM per CPU

25G

--gpu-count

GPUs per task

1

--cpu-count

CPUs per task

4

--work-dir

Project directory

Current directory

Understanding the Generated Script

A typical generated script looks like:

#!/bin/bash

#SBATCH --job-name=my_analysis
#SBATCH --mail-user=your.email@unibas.ch
#SBATCH --mail-type=END,FAIL
#SBATCH --time=06:00:00
#SBATCH --qos=gpu6hours
#SBATCH --mem-per-cpu=25G
#SBATCH --partition=rtx4090
#SBATCH --gres=gpu:1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --array=0-9                  # 10 tasks (0-9)
#SBATCH --output=./SLURM_jobs_report/my_analysis/%A_%a_HiTMicTools.out
#SBATCH --error=./SLURM_jobs_report/my_analysis/%A_%a_HiTMicTools.err

# Load modules
module load Python
module load CUDA
module load jobstats

# Create log directory
mkdir -p ./SLURM_jobs_report/my_analysis

# Check GPU availability
if command -v nvidia-smi &> /dev/null; then
    echo "GPU information:"
    nvidia-smi
else
    echo "No GPU available"
fi

# Check CPU
echo "CPU information:"
lscpu | egrep 'Model name|Socket|Thread|NUMA|CPU\(s\)'

# Activate conda environment
source ~/.bashrc
conda init
conda activate hitmictools

# Change to project directory
cd '/path/to/project'

# Set variables
CONFIG_FILE="./config/analysis_config.yml"
BLOCK_NUM=${SLURM_ARRAY_TASK_ID}
FILELIST="./temp/file_block_${BLOCK_NUM}.txt"

# Display configuration
echo "Config file contents:"
cat "$CONFIG_FILE"

# Run analysis
echo "Executing command:"
echo "hitmictools run --config $CONFIG_FILE --worklist $FILELIST"
hitmictools run --config $CONFIG_FILE --worklist $FILELIST

# Display resource usage
sstat --format=JobID,AveCPU,AveRSS,MaxRSS -j $SLURM_JOBID.batch
sacct -o JobID,CPUTime -j $SLURM_JOBID

4. Customizing SLURM Scripts

Adjusting Resource Requests

Time Limits

Match --time to your queue’s QoS:

# For short jobs (< 6 hours)
--qos gpu6hours --time 05:00:00

# For long jobs (< 24 hours)
--qos gpu24hours --time 20:00:00

# For very long jobs
--qos gpu1week --time 7-00:00:00  # 7 days

Memory Requirements

Estimate memory needs:

  • Basic analysis: 20-25G per CPU

  • With tracking: 30-35G per CPU

  • Large images (>4K x 4K): 40-50G per CPU

# Low memory (small images, no tracking)
--memory 20G --cpus-per-task 4  # Total: 80GB

# High memory (large images, tracking)
--memory 40G --cpus-per-task 4  # Total: 160GB

GPU Selection

Specify GPU type and count:

# Single RTX 4090 (most common)
--partition rtx4090 --gres gpu:1

# Single A100 (for very large models)
--partition a100 --gres gpu:1

# Multiple GPUs (advanced, requires code modification)
--gres gpu:2

Understanding Partitions and QoS (Quality of Service)

For wet lab biologists new to computing clusters:

Think of the SLURM cluster as a shared resource pool, similar to booking time on a shared microscope facility. Partitions are like different types of equipment (e.g., confocal microscope vs. widefield), each with specific capabilities. QoS (Quality of Service) is like booking time slots - you can reserve shorter time slots (30 minutes to 6 hours) which are processed faster, or longer slots (1 day to 2 weeks) for extensive analyses.

The key difference: shorter time slots get higher priority in the queue, similar to how “quick scans” might be prioritized over “overnight acquisitions” on a microscope booking system. Choose the shortest time that fits your analysis to get results faster.

Available Partitions (Computing Resources)

Each partition provides different hardware optimized for specific tasks:

Partition

Hardware

GPUs

Memory

Best For

rtx4090

Latest NVIDIA RTX 4090 GPUs

8 per node

1 TB

HiTMicTools standard - Fast, modern GPUs ideal for image analysis

a100

NVIDIA A100 GPUs (40 GB)

4 per node

1 TB

Very large models or high-memory tasks

a100-80g

NVIDIA A100 GPUs (80 GB)

4 per node

1 TB

Extremely large models (rarely needed for HiTMicTools)

titan

Older NVIDIA Titan GPUs

7 per node

512 GB

Legacy partition (avoid if rtx4090 available)

scicore

CPU-only

None

512 GB - 1 TB

CPU-only processing (slower)

bigmem

CPU-only, high memory

None

1-2 TB

Very large memory needs without GPU

Recommendation for HiTMicTools: Use rtx4090 for almost all analyses - it provides the best performance-to-availability ratio.

Available QoS (Time Limits)

QoS determines how long your job can run. Choose based on your expected processing time:

QoS

Maximum Runtime

When to Use

Example Use Case

gpu30min

30 minutes

Quick tests, small datasets

Testing config on 1-2 images

gpu6hours

6 hours

Most common

Standard batch (10-50 movies)

gpu1day

24 hours

Large batches

Processing 100-200 movies

gpu1week

7 days

Very large experiments

Processing 500+ movies or tracking

30min

30 minutes

CPU-only quick tasks

Testing without GPU

6hours

6 hours

CPU-only standard

Standard analysis without GPU

1day

24 hours

CPU-only long tasks

Large CPU-only analysis

1week

7 days

CPU-only very long

Rarely needed

2weeks

14 days

CPU-only extended

Very rarely needed

Resource Limits by QoS

The cluster enforces limits to ensure fair sharing among all users. These limits control how many resources (CPUs, GPUs, memory) you can use simultaneously across all your jobs.

GPU QoS Limits:

QoS

Max Runtime

Total Cluster Limit

Per Account Limit

Per User Limit

gpu30min

30 minutes

3,300 CPUs, 170 GPUs, 26 TB

2,400 CPUs, 136 GPUs, 22 TB

2,600 CPUs, 136 GPUs, 22 TB

gpu6hours

6 hours

3,000 CPUs, 150 GPUs, 24 TB

2,000 CPUs, 100 GPUs, 16 TB

2,000 CPUs, 100 GPUs, 16 TB

gpu1day

24 hours

2,500 CPUs, 120 GPUs, 20 TB

1,250 CPUs, 60 GPUs, 10 TB

1,250 CPUs, 60 GPUs, 10 TB

gpu1week

7 days

1,500 CPUs, 48 GPUs, 12 TB

750 CPUs, 24 GPUs, 6 TB

750 CPUs, 24 GPUs, 6 TB

CPU-only QoS Limits:

QoS

Max Runtime

Total Cluster Limit

Per Account Limit

Per User Limit

30min

30 minutes

12,000 CPUs, 68 TB

10,000 CPUs, 50 TB

10,000 CPUs, 50 TB

6hours

6 hours

11,500 CPUs, 64 TB

7,500 CPUs, 40 TB

7,500 CPUs, 40 TB

1day

24 hours

9,000 CPUs, 60 TB

4,500 CPUs, 30 TB

4,500 CPUs, 30 TB

1week

7 days

3,800 CPUs, 30 TB

2,000 CPUs, 15 TB

2,000 CPUs, 15 TB

2weeks

14 days

1,300 CPUs, 10 TB

128 CPUs, 2 TB

128 CPUs, 2 TB

What this means in practice:

  • If you submit 10 jobs with gpu6hours QoS, each requesting 1 GPU and 4 CPUs, you’ll use 10 GPUs and 40 CPUs total

  • This is well within the per-user limit of 100 GPUs and 2,000 CPUs for gpu6hours

  • Shorter QoS options allow more concurrent jobs but less time per job

  • Longer QoS options allow fewer concurrent jobs but more time per job

Modifying Array Size

Change the array range in the script header:

# For 20 blocks (0-19)
#SBATCH --array=0-19

# For 50 blocks (0-49)
#SBATCH --array=0-49

# Process subset of blocks (e.g., only blocks 10-20)
#SBATCH --array=10-20

Email Notifications

Control when you receive emails:

# All events
#SBATCH --mail-type=ALL

# Only failures
#SBATCH --mail-type=FAIL

# Start and end
#SBATCH --mail-type=BEGIN,END

# Disable emails
# Remove or comment out --mail-user and --mail-type lines

5. Submitting and Managing Jobs

Submitting Jobs

# Submit the job array
sbatch run_analysis.sh

# Submit with dependency (wait for job 12345 to complete)
sbatch --dependency=afterok:12345 run_analysis.sh

# Submit with limited array size (max 10 concurrent)
sbatch --array=0-99%10 run_analysis.sh

Expected output:

Submitted batch job 67890

Monitoring Jobs

# Check your jobs in the queue
squeue -u $USER

# Detailed view of a specific job
scontrol show job 67890

# Check array job status
squeue -u $USER -t RUNNING,PENDING

# View job progress
tail -f SLURM_jobs_report/my_analysis/67890_0_HiTMicTools.out

Queue Status Output

JOBID    PARTITION  NAME          USER   ST  TIME  NODES  NODELIST
67890_0  rtx4090    my_analysis   user   R   1:23  1      node042
67890_1  rtx4090    my_analysis   user   R   1:22  1      node043
67890_2  rtx4090    my_analysis   user   PD  0:00  1      (Resources)

Status codes:

  • R - Running

  • PD - Pending (waiting for resources)

  • CG - Completing (job finishing)

  • CD - Completed

  • F - Failed

Canceling Jobs

# Cancel a specific array task
scancel 67890_5

# Cancel entire job array
scancel 67890

# Cancel all your jobs
scancel -u $USER

# Cancel pending jobs only
scancel -u $USER -t PENDING

Checking Resource Usage

# After job completes, view statistics
sacct -j 67890 --format=JobID,JobName,Partition,Elapsed,State,MaxRSS,MaxVMSize

# For array jobs, see all tasks
sacct -j 67890 --format=JobID,State,MaxRSS,Elapsed

# Detailed efficiency report
seff 67890_0

6. Configuration for SLURM

Adapting Your Config File

Your config file should use absolute paths on the cluster:

input_data:
  input_folder: "/cluster/path/to/data"
  output_folder: "/cluster/path/to/results"
  file_type: ".nd2"
  export_labelled_masks: false
  export_aligned_image: false

pipeline_setup:
  name: "ASCT_focusrestore"
  parallel_processing: false        # Use SLURM arrays instead
  num_workers: 1                    # Single worker per SLURM task
  reference_channel: 0
  pi_channel: 1
  focus_correction: true
  align_frames: true
  method: "basicpy_fl"
  tracking: true

models:
  model_collection: "/cluster/path/to/models/model_collection_tracking_20250529.zip"

tracking:
  parameters_override: null

Important notes:

  • Set parallel_processing: false (SLURM handles parallelism)

  • Set num_workers: 1 (each SLURM task processes one block)

  • Use absolute paths for portability

  • Keep export_labelled_masks: false to save disk space

Testing Configuration

Before submitting large jobs, test with a single file:

# Create a test worklist with one file
echo "test_image.nd2" > test_worklist.txt

# Run locally on a cluster node (interactive session)
srun --partition=rtx4090 --gres=gpu:1 --mem=25G --cpus-per-task=4 --pty bash
conda activate hitmictools
hitmictools run --config config/analysis_config.yml --worklist test_worklist.txt
exit

7. Best Practices

Resource Optimization

  1. Right-size your requests:

    • Don’t request more memory/time than needed (wastes resources)

    • Request slightly more than expected (avoid job failures)

  2. Optimize block size:

    • Aim for 2-6 hour jobs (good queue priority)

    • Avoid jobs < 30 min (overhead) or > 24 hours (risky)

  3. Use appropriate QoS:

    • Short jobs → gpu6hours

    • Standard jobs → gpu24hours

    • Long jobs → gpu1week (but minimize use)

Data Management

  1. Store data efficiently:

    • Keep raw data in shared filesystem

    • Write results to high-speed scratch space if available

    • Move final results to permanent storage after completion

  2. Cleanup strategy:

    # Remove temporary file blocks after completion
    rm -rf temp/
    
    # Archive logs
    tar -czf logs_${SLURM_JOB_ID}.tar.gz SLURM_jobs_report/my_analysis/
    
  3. Monitor disk usage:

    # Check quota
    quota -s
    
    # Find large files
    du -sh results/*/ | sort -h
    

Error Handling

  1. Check logs regularly:

    # Find failed jobs
    grep -l "Error" SLURM_jobs_report/my_analysis/*.err
    
    # Count successful completions
    ls results/*.csv | wc -l
    
  2. Rerun failed tasks:

    # If task 5 failed, rerun just that block
    sbatch --array=5 run_analysis.sh
    
    # Rerun multiple failed tasks
    sbatch --array=3,5,7,12 run_analysis.sh
    
  3. Common failure causes:

    • Out of memory → Increase --memory

    • Timeout → Increase --time or reduce block size

    • Missing files → Check paths in config

    • CUDA errors → Check --gres=gpu:1 is set

8. Example Workflows

Workflow 1: Standard Analysis (100 files)

# 1. Split files
hitmictools split-files --target-folder ./data --n-blocks 10 --output-dir ./temp

# 2. Generate script
hitmictools generate-slurm \
    --job-name 'exp001' \
    --file-blocks \
    --n-blocks 10 \
    --conda-env 'hitmictools' \
    --config-file './config/analysis.yml'

# 3. Submit
sbatch slurm_script.sh

# 4. Monitor
squeue -u $USER
watch -n 30 'ls results/*.csv | wc -l'  # Count completed files

# 5. Check completion
ls results/*.csv | wc -l  # Should be 100

Workflow 2: Large-Scale Tracking (500 files)

# 1. Split into 25 blocks
hitmictools split-files --target-folder ./data --n-blocks 25 --output-dir ./temp

# 2. Generate script with more resources
hitmictools generate-slurm \
    --job-name 'tracking_exp' \
    --file-blocks \
    --n-blocks 25 \
    --conda-env 'hitmictools' \
    --config-file './config/tracking_config.yml' \
    --memory '35G' \
    --time '08:00:00' \
    --qos 'gpu24hours'

# 3. Submit with max 5 concurrent jobs
sbatch --array=0-24%5 slurm_script.sh

# 4. Monitor progress
watch -n 60 'squeue -u $USER; echo ""; ls results/*.csv | wc -l'

Workflow 3: Reprocessing Subset

# Reprocess only specific files
cat > reprocess_list.txt << EOF
data/image_042.nd2
data/image_103.nd2
data/image_205.nd2
EOF

# Run without array job
hitmictools run --config config/analysis.yml --worklist reprocess_list.txt

9. Troubleshooting

Job Pending Forever

# Check why job is pending
scontrol show job 67890 | grep Reason

# Common reasons:
# - Resources: No available nodes with requested resources
# - Priority: Other jobs have higher priority
# - QOSMaxCpuPerUserLimit: You've exceeded CPU limit

Solutions:

  • Reduce resource requests

  • Choose different partition

  • Wait for resources to free up

  • Cancel competing jobs if appropriate

Out of Memory Errors

Check logs:

grep -i "memory" SLURM_jobs_report/my_analysis/*.err
grep -i "killed" SLURM_jobs_report/my_analysis/*.err

Solutions:

  • Increase --memory in SLURM script

  • Reduce num_workers in config

  • Set export_labelled_masks: false

  • Process smaller file blocks

GPU Not Available

# Check SLURM script has GPU request
grep "gres=gpu" slurm_script.sh

# Verify GPU in job
srun --jobid=67890_0 nvidia-smi

Solutions:

  • Add #SBATCH --gres=gpu:1 to script

  • Verify partition supports GPUs

  • Check QoS allows GPU access

Files Not Found

# Check file paths in error logs
grep "FileNotFoundError" SLURM_jobs_report/my_analysis/*.err

# Verify paths are accessible from compute nodes
srun --partition=rtx4090 --pty bash
ls /cluster/path/to/data
exit

Solutions:

  • Use absolute paths in config

  • Ensure shared filesystem is mounted

  • Check file permissions

10. Advanced Topics

Dependency Chains

Run multiple analyses in sequence:

# Job 1: Preprocess
JOB1=$(sbatch --parsable preprocess.sh)

# Job 2: Analysis (waits for Job 1)
JOB2=$(sbatch --parsable --dependency=afterok:$JOB1 analysis.sh)

# Job 3: Postprocess (waits for Job 2)
sbatch --dependency=afterok:$JOB2 postprocess.sh

Checkpoint and Resume

For very long analyses:

# In your config, process files one at a time
input_data:
  file_list:
    - "long_timeseries_001.nd2"

Then use multiple shorter SLURM jobs instead of one long job.

Custom SLURM Directives

Add additional directives to the generated script:

# After generation, edit the script to add:
#SBATCH --exclusive          # Exclusive node access
#SBATCH --constraint=gpu40   # Specific GPU type
#SBATCH --account=proj_name  # Billing account

Summary

SLURM workflow with HiTMicTools:

  1. Use split-files to create file blocks

  2. Use generate-slurm to create submission script

  3. Customize script for cluster resources

  4. Submit with sbatch

  5. Monitor with squeue and log files

  6. Handle failures by rerunning specific array indices

For help with your specific cluster, consult:

  • Your cluster’s documentation

  • man sbatch and man squeue

  • Cluster support team