Modalkit Documentation¶

A framework layer for deploying ML models on Modal

What is Modalkit?¶

Modalkit is a lightweight framework that sits on top of Modal to provide ML-specific patterns and conveniences. Think of it as a set of best practices and utilities for deploying machine learning models on Modal's excellent serverless infrastructure.

Why use Modalkit?¶

While Modal provides powerful primitives for serverless compute, Modalkit adds:

🎯 ML-Specific Patterns: Standardized inference pipeline (preprocess → predict → postprocess)
🔧 Configuration Management: YAML-based config for Modal deployments
🔐 Built-in Auth: Easy authentication setup for ML APIs
📦 Type Safety: Pydantic models for request/response validation
🔄 Queue Integration: Async inference with SQS/Taskiq support

How it works¶

Modalkit wraps your ML model in a Modal-compatible structure:

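# Your ML code
class MyModel(InferencePipeline):
    def predict(self, inputs):
        # `model` stands for your already-loaded ML model, defined elsewhere
        return model.generate(inputs)

# Modalkit handles the Modal integration
# (`app` is your Modal app and `modal_config` your Modalkit config object,
# both created elsewhere in app.py)
@app.cls(**modal_config.get_app_cls_settings())
class MyApp(ModalService):
    inference_implementation = MyModel

# Deploy with Modal CLI
# modal deploy app.py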

Under the hood, Modalkit:

1. Configures Modal container specs from YAML
2. Sets up FastAPI endpoints
3. Handles authentication middleware
4. Manages batch processing
5. Integrates with Modal's volume and secret systems

Prerequisites¶

Python 3.9+
Modal account and CLI installed
Basic familiarity with Modal concepts

Quick Start¶

Getting Started: Build your first ML deployment on Modal
Configuration: Learn how Modalkit configures Modal resources
Deployment: Deploy to Modal's infrastructure
Examples: Real-world ML deployment examples

Core Concepts¶

InferencePipeline¶

Your model inherits from InferencePipeline and implements three methods:

- preprocess(): Prepare raw inputs for your model
- predict(): Run inference
- postprocess(): Format outputs
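The three-stage flow can be illustrated without Modalkit installed. The sketch below is only an approximation of the pattern: `SketchPipeline`, its `run()` method, and the toy `WordCounter` are hypothetical names invented for this example, and the real InferencePipeline class may use different signatures and types.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchPipeline(ABC):
    """Hypothetical stand-in for InferencePipeline (not Modalkit's real class)."""

    @abstractmethod
    def preprocess(self, raw: Any) -> Any:
        """Prepare raw inputs for the model."""

    @abstractmethod
    def predict(self, prepared: Any) -> Any:
        """Run inference on the prepared inputs."""

    @abstractmethod
    def postprocess(self, prediction: Any) -> Any:
        """Format the raw prediction for the caller."""

    def run(self, raw: Any) -> Any:
        # Chain the stages in order: preprocess -> predict -> postprocess.
        return self.postprocess(self.predict(self.preprocess(raw)))


class WordCounter(SketchPipeline):
    """Toy 'model' that counts words, used only to exercise the three stages."""

    def preprocess(self, raw: str) -> list:
        return raw.lower().split()

    def predict(self, prepared: list) -> int:
        return len(prepared)

    def postprocess(self, prediction: int) -> dict:
        return {"word_count": prediction}


result = WordCounter().run("Deploy ML models on Modal")
print(result)  # {'word_count': 5}
```

Keeping each stage in its own method means input cleaning and output formatting can be tested separately from the model call itself.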

Modalkit automatically configures:

- Container images and dependencies
- GPU resources
- Secrets and volumes
- Concurrency limits
- CloudBucketMounts for S3/GCS access

Configuration¶

A single YAML file configures your entire deployment:

app_settings:
  deployment_config:
    gpu: "T4"
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-models"
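To make the nesting of the fragment above explicit, here is the same structure as the Python dictionary a YAML parser would typically produce, together with a small function that reads it. This is a sketch under assumptions: `summarize_deployment` is a hypothetical helper written for this example, not part of Modalkit, and it does not show how Modalkit itself consumes the file.

```python
# Dictionary mirroring the YAML fragment above, as a YAML parser would load it.
config = {
    "app_settings": {
        "deployment_config": {
            "gpu": "T4",
            "cloud_bucket_mounts": [
                {"mount_point": "/mnt/models", "bucket_name": "my-models"},
            ],
        }
    }
}


def summarize_deployment(cfg: dict) -> dict:
    """Hypothetical helper: extract the GPU type and a mount_point -> bucket map."""
    deployment = cfg["app_settings"]["deployment_config"]
    mounts = {
        mount["mount_point"]: mount["bucket_name"]
        for mount in deployment.get("cloud_bucket_mounts", [])
    }
    return {"gpu": deployment.get("gpu"), "mounts": mounts}


summary = summarize_deployment(config)
print(summary)  # {'gpu': 'T4', 'mounts': {'/mnt/models': 'my-models'}}
```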

When to use Modalkit¶

✅ Use Modalkit when you want:

- Quick ML model deployment on Modal
- Standardized API patterns
- Configuration-driven deployments
- Built-in auth and validation

❌ Use Modal directly when you need:

- Non-ML workloads
- Custom networking requirements
- Fine-grained control over containers

Learn More¶

Modal Documentation - Understand the platform
Modalkit GitHub - Source code and issues
Examples - Working code examples

Modalkit is an open-source project • Not affiliated with Modal Labs