Modalkit Documentation

A framework layer for deploying ML models on Modal

What is Modalkit?

Modalkit is a lightweight framework that sits on top of Modal to provide ML-specific patterns and conveniences. Think of it as a set of best practices and utilities for deploying machine learning models on Modal's excellent serverless infrastructure.

Why use Modalkit?

While Modal provides powerful primitives for serverless compute, Modalkit adds:

  • 🎯 ML-Specific Patterns: Standardized inference pipeline (preprocess → predict → postprocess)
  • 🔧 Configuration Management: YAML-based config for Modal deployments
  • 🔐 Built-in Auth: Easy authentication setup for ML APIs
  • 📦 Type Safety: Pydantic models for request/response validation
  • 🔄 Queue Integration: Async inference with SQS/Taskiq support

How it works

Modalkit wraps your ML model in a Modal-compatible structure:

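# Your ML code
# (preprocess() and postprocess() omitted for brevity)
class MyModel(InferencePipeline):
    def predict(self, inputs):
        # `model` stands for your already-loaded model object
        return model.generate(inputs)

# Modalkit handles the Modal integration
# (`app` is your modal.App; `modal_config` is your Modalkit config object)
@app.cls(**modal_config.get_app_cls_settings())
class MyApp(ModalService):
    inference_implementation = MyModel

# Deploy with Modal CLI
# modal deploy app.py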

Under the hood, Modalkit:

  1. Configures Modal container specs from YAML
  2. Sets up FastAPI endpoints
  3. Handles authentication middleware
  4. Manages batch processing
  5. Integrates with Modal's volume and secret systems

Prerequisites

  • Python 3.9+
  • Modal account and CLI installed
  • Basic familiarity with Modal concepts

Quick Start

  • Getting Started: Build your first ML deployment on Modal
  • Configuration: Learn how Modalkit configures Modal resources
  • Deployment: Deploy to Modal's infrastructure
  • Examples: Real-world ML deployment examples

Core Concepts

InferencePipeline

Your model inherits from InferencePipeline and implements three methods:

  • preprocess(): Prepare raw inputs for your model
  • predict(): Run inference
  • postprocess(): Format outputs
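The three-stage flow can be illustrated with a small, self-contained sketch. It deliberately does not import Modalkit: PipelineSketch, run(), and UppercaseModel are hypothetical names invented for this example, and the real InferencePipeline method signatures may differ (for instance, they may operate on batches of requests). The sketch only shows how preprocess, predict, and postprocess chain together.

```python
from abc import ABC, abstractmethod
from typing import Any


class PipelineSketch(ABC):
    """Hypothetical stand-in for InferencePipeline (not Modalkit's class)."""

    @abstractmethod
    def preprocess(self, raw: Any) -> Any:
        """Prepare raw inputs for the model."""

    @abstractmethod
    def predict(self, prepared: Any) -> Any:
        """Run inference on prepared inputs."""

    @abstractmethod
    def postprocess(self, prediction: Any) -> Any:
        """Format model outputs for the caller."""

    def run(self, raw: Any) -> Any:
        # Chain the stages: preprocess -> predict -> postprocess.
        return self.postprocess(self.predict(self.preprocess(raw)))


class UppercaseModel(PipelineSketch):
    """Toy 'model' that uppercases text, standing in for real inference."""

    def preprocess(self, raw: str) -> str:
        return raw.strip()

    def predict(self, prepared: str) -> str:
        return prepared.upper()

    def postprocess(self, prediction: str) -> dict:
        return {"result": prediction}


result = UppercaseModel().run("  hello  ")
print(result)
```

Keeping each stage in its own method means input cleaning and output formatting can be tested without loading the model itself.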

Modalkit automatically configures:

  • Container images and dependencies
  • GPU resources
  • Secrets and volumes
  • Concurrency limits
  • CloudBucketMounts for S3/GCS access

Configuration

A single YAML file configures your entire deployment:

app_settings:
  deployment_config:
    gpu: "T4"
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-models"
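To make the nesting concrete, here is a rough sketch of how a parsed version of this YAML could be mapped onto typed settings. It is not Modalkit's own loader: BucketMount, DeploymentConfig, and parse_deployment_config are hypothetical names, and the raw_config dict simply mirrors what a YAML parser would be expected to return for the fragment above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BucketMount:
    """One entry under cloud_bucket_mounts (hypothetical type)."""
    mount_point: str
    bucket_name: str


@dataclass
class DeploymentConfig:
    """Typed view of app_settings.deployment_config (hypothetical type)."""
    gpu: Optional[str] = None
    cloud_bucket_mounts: list[BucketMount] = field(default_factory=list)


def parse_deployment_config(raw: dict) -> DeploymentConfig:
    # Walk the same nesting the YAML uses: app_settings -> deployment_config.
    section = raw["app_settings"]["deployment_config"]
    mounts = [BucketMount(**m) for m in section.get("cloud_bucket_mounts", [])]
    return DeploymentConfig(gpu=section.get("gpu"), cloud_bucket_mounts=mounts)


# Dict equivalent of the YAML fragment shown above.
raw_config = {
    "app_settings": {
        "deployment_config": {
            "gpu": "T4",
            "cloud_bucket_mounts": [
                {"mount_point": "/mnt/models", "bucket_name": "my-models"},
            ],
        }
    }
}

config = parse_deployment_config(raw_config)
print(config.gpu, config.cloud_bucket_mounts[0].mount_point)
```

Parsing into typed objects like this surfaces a misspelled or missing key when the config is loaded rather than at deploy time.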

When to use Modalkit

Use Modalkit when you want:

  • Quick ML model deployment on Modal
  • Standardized API patterns
  • Configuration-driven deployments
  • Built-in auth and validation

Use Modal directly when you need:

  • Non-ML workloads
  • Custom networking requirements
  • Fine-grained control over containers

Modalkit is an open-source project • Not affiliated with Modal Labs