# Launching the Analysis Pipeline

This guide walks you through running HiTMicTools end-to-end on a local workstation or standard lab GPU node for basic image analysis without tracking.

## Overview

The basic HiTMicTools workflow involves:
1. Setting up your environment and folder structure
2. Configuring the analysis pipeline via YAML file
3. Running the analysis with the CLI
4. Examining the outputs

## 1. Prerequisites

### Environment Setup
Activate the recommended Conda environment:
```bash
conda activate hitmictools  # or img_analysis for development
```

### Installation
For production use:
```bash
pip install git+https://github.com/phisanti/HiTMicTools
```

For updating HiTMicTools code inside an already working environment, avoid changing the dependency stack:
```bash
pip install --force-reinstall --no-deps git+https://github.com/phisanti/HiTMicTools
```

For development (editable install):
```bash
git clone https://github.com/phisanti/HiTMicTools
cd HiTMicTools
pip install -e . --no-deps
```

For rebuilding an environment reproducibly, use a matching constraint file from `constraints/`:
```bash
pip install --extra-index-url https://download.pytorch.org/whl/cu121 -c constraints/scicore-py39-cu121.txt .
```

### Dependency Note
HiTMicTools bounds the scientific Python stack and pins known fragile integration points, including `hyperactive==4.8.0`, `gradient-free-optimizers==1.7.2`, `jax==0.4.23`, `jaxlib==0.4.23`, and the Basicpy/jetraw-tools Git commits. This prevents code-only updates from accidentally upgrading core packages into untested combinations.

### Required Assets
- **Model Collection**: Download or locate the appropriate model bundle (e.g., `model_collection_tracking_20250529.zip`)
  - These files contain all necessary neural network weights for segmentation, classification, and focus restoration
  - Store in your project directory or a centralized models folder
  - **Note**: Never commit model bundles to git due to their size

## 2. Project Structure

Set up your analysis project with the following structure:

```
your_project/
├── data/                           # Input microscopy files
│   └── experiment_001/
│       ├── image001.nd2
│       ├── image002.nd2
│       └── ...
├── results/                        # Output folder (created automatically)
├── config/
│   └── analysis_config.yml        # Your configuration file
└── models/
    └── model_collection_tracking_20250529.zip
```

## 3. Configuration File

HiTMicTools uses YAML configuration files to define all analysis parameters. The modern approach uses **model collections** for simplified setup.

### Basic Configuration Example

Create a file `analysis_config.yml`:

```yaml
input_data:
  input_folder: "./data/experiment_001"     # Path to input images
  output_folder: "./results/experiment_001" # Output directory
  file_list: null                           # null = process all files
  file_type: ".nd2"                         # File extension (.nd2, .tiff, .p.tiff)
  file_pattern: ""                          # Optional: filter files by pattern
  export_labelled_masks: false              # Export labeled segmentation masks
  export_aligned_image: false               # Export aligned/corrected images

pipeline_setup:
  name: "ASCT_focusrestore"                 # Pipeline type
  parallel_processing: true                 # Enable parallel processing
  num_workers: 3                            # Number of parallel workers
  reference_channel: 0                      # Brightfield channel (usually 0)
  pi_channel: 1                             # Fluorescence channel (usually 1)
  focus_correction: true                    # Apply focus restoration
  align_frames: true                        # Align frames across time
  method: "basicpy_fl"                      # Background correction method
  tracking: false                           # Disable tracking (see tracking guide)

models:
  model_collection: "./models/model_collection_tracking_20250529.zip"
```

### Configuration Parameters Explained

#### Input Data Section
- **input_folder**: Directory containing your microscopy files
- **output_folder**: Where results will be saved (created if doesn't exist)
- **file_list**: Optional list of specific files to process. If `null`, all files matching `file_type` are processed
- **file_type**: File extension to process
  - `.nd2` - Nikon ND2 files
  - `.tiff` / `.tif` - TIFF files
  - `.p.tiff` - Jetraw-compressed TIFF (requires Jetraw license)
- **file_pattern**: Optional regex pattern to filter files (e.g., `"experiment_A.*"`)
- **export_labelled_masks**: Set to `true` to save segmentation masks as images (useful for troubleshooting)
- **export_aligned_image**: Set to `true` to save processed/aligned images (8-bit compressed)

#### Pipeline Setup Section
- **name**: Pipeline type (see Available Pipelines below)
- **parallel_processing**: Process multiple images simultaneously
- **num_workers**: Number of parallel processes (recommend 2-4 based on available RAM/VRAM)
- **reference_channel**: Channel index for brightfield segmentation (typically 0)
- **pi_channel**: Channel index for fluorescence measurements (typically 1)
- **focus_correction**: Apply deep learning focus restoration (recommended)
- **align_frames**: Register frames across time series (required for tracking)
- **method**: Background correction strategy:
  - `"basicpy_fl"` - Recommended: BaSiC correction for fluorescence, standard for brightfield
  - `"basicpy"` - BaSiC correction for all channels
  - `"standard"` - Difference of Gaussians method

#### Models Section
- **model_collection**: Path to the bundled model ZIP file (recommended approach)
  - This single file contains all required models
  - Simplifies deployment and ensures version consistency

### Available Pipelines

- **ASCT_focusrestore**: Focus restoration + segmentation + classification (most common)
- **ASCT_scsegm**: Single-cell instance segmentation with RT-DETR
- **ASCT_zaslavier**: Specialized pipeline for Zaslavier lab workflow

### Processing Single Files

To process specific files instead of the entire folder:

```yaml
input_data:
  file_list:
    - "image001.nd2"
    - "image003.nd2"
    - "image005.nd2"
```

## 4. Running the Analysis

### Basic Command

From your project directory:

```bash
hitmictools run --config config/analysis_config.yml
```

### Using a Worklist

For better control over which files to process, use a worklist file:

```bash
# Create a worklist (text file with one filename per line)
echo "image001.nd2" > worklist.txt
echo "image002.nd2" >> worklist.txt

# Run with worklist
hitmictools run --config config/analysis_config.yml --worklist worklist.txt
```

### CLI Help

View all available options:

```bash
hitmictools --help
hitmictools run --help
```

## 5. Understanding the Output

### Output Files

The analysis creates the following in your `output_folder`:

```
results/experiment_001/
├── image001_analysis_results.csv          # Measurement data
├── image002_analysis_results.csv
├── image001_labeled_mask.tiff            # (optional) Segmentation masks
├── image001_aligned.tiff                 # (optional) Processed images
└── analysis_logs/
    └── processing_log.txt
```

### CSV Output Columns

Typical columns in the results CSV:
- **frame**: Time point index
- **label**: Cell/object ID within the frame
- **area**: Object area in pixels
- **centroid_0**, **centroid_1**: Object center coordinates
- **mean_intensity**: Mean pixel intensity
- **cell_class**: Classification result (e.g., "single-cell", "clump", "noise")
- **pi_positive**: PI staining classification (if applicable)

## 6. Monitoring and Performance

### Parallel Processing

Enable parallel processing for faster analysis:
- Set `parallel_processing: true` in config
- Set `num_workers` to 2-4 (more workers = more memory usage)
- Monitor RAM/VRAM to avoid out-of-memory errors

### Resource Usage Tips

- **Local workstation**: Start with `num_workers: 2`
- **GPU node**: Can use 3-4 workers if VRAM > 16GB
- **Large images**: Reduce workers to avoid memory exhaustion
- Set `export_labelled_masks: false` and `export_aligned_image: false` for production runs

### Progress Monitoring

The CLI outputs progress information:
```
Processing file 1/10: image001.nd2
  - Focus restoration: complete
  - Segmentation: complete
  - Classification: complete
  - Results saved
```

## 7. Troubleshooting

### Common Issues

**"Model file not found"**
- Check that `model_collection` path is correct
- Ensure the ZIP file exists and is not corrupted
- Use absolute paths if relative paths fail

**"Out of memory" errors**
- Reduce `num_workers` in config
- Close other applications using GPU/RAM
- Process files in smaller batches

**"btrack not installed" (when tracking: true)**
- Either disable tracking (`tracking: false`) or install btrack
- See the [tracking guide](launch_analysis_with_tracking.md) for btrack installation

**Wrong file type detected**
- Verify `file_type` matches your files exactly
- Check `file_pattern` isn't filtering out files unintentionally

**Background correction fails**
- Try different `method` values
- For very noisy images, use `"standard"` instead of `"basicpy_fl"`

### Getting Help

- Check log files in `output_folder/analysis_logs/`
- Verify configuration with a single test image first
- Review sample configs in the `config/` directory of the repository

## 8. Advanced Configuration

### Using Individual Models (Alternative to Model Collections)

If you need fine-grained control, specify models individually:

```yaml
# Instead of model_collection, specify each component:
bf_focus:
  model_path: "/path/to/models/bf_focus/model.pth"
  model_metadata: "/path/to/models/bf_focus/model_metadata.json"
  inferer_args:
    scale_method: "range01"
    patch_size: 256
    overlap_ratio: 0.25
    half_precision: true

fl_focus:
  model_path: "/path/to/models/fl_focus/model.pth"
  model_metadata: "/path/to/models/fl_focus/model_metadata.json"
  inferer_args:
    scale_method: "fixed_range"
    patch_size: 256

segmentation:
  model_path: "/path/to/models/segmentation/model.pth"
  model_metadata: "/path/to/models/segmentation/model_metadata.json"

cell_classifier:
  model_path: "/path/to/models/cell_classifier/model.pth"
  model_metadata: "/path/to/models/cell_classifier/model_metadata.json"
  classes:
    0: "single-cell"
    1: "clump"
    2: "noise"
    3: "off-focus"
    4: "joint-cell"
```

This approach is useful for:
- Development and testing of new models
- Benchmarking different model versions
- Custom pipeline modifications

See [models.md](models.md) for more details on model management.