Modalkit Documentation
A framework layer for deploying ML models on Modal
What is Modalkit?
Modalkit is a lightweight framework that sits on top of Modal to provide ML-specific patterns and conveniences. Think of it as a set of best practices and utilities for deploying machine learning models on Modal's excellent serverless infrastructure.
Why use Modalkit?
While Modal provides powerful primitives for serverless compute, Modalkit adds:
- 🎯 ML-Specific Patterns: Standardized inference pipeline (preprocess → predict → postprocess)
- 🔧 Configuration Management: YAML-based config for Modal deployments
- 🔐 Built-in Auth: Easy authentication setup for ML APIs
- 📦 Type Safety: Pydantic models for request/response validation
- 🔄 Queue Integration: Async inference with SQS/Taskiq support
How it works
Modalkit wraps your ML model in a Modal-compatible structure:
# Your ML code
class MyModel(InferencePipeline):
    def predict(self, inputs):
        return model.generate(inputs)

# Modalkit handles the Modal integration
@app.cls(**modal_config.get_app_cls_settings())
class MyApp(ModalService):
    inference_implementation = MyModel

# Deploy with Modal CLI
# modal deploy app.py
Under the hood, Modalkit:
1. Configures Modal container specs from YAML
2. Sets up FastAPI endpoints
3. Handles authentication middleware
4. Manages batch processing
5. Integrates with Modal's volume and secret systems
Prerequisites
- Python 3.9+
- Modal account and CLI installed
- Basic familiarity with Modal concepts
Quick Start
- Getting Started - Build your first ML deployment on Modal
- Configuration - Learn how Modalkit configures Modal resources
- Deployment - Deploy to Modal's infrastructure
- Examples - Real-world ML deployment examples
Core Concepts
InferencePipeline
Your model inherits from InferencePipeline and implements three methods:
- preprocess(): Prepare raw inputs for your model
- predict(): Run inference
- postprocess(): Format outputs
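The three-stage flow can be illustrated with a minimal, self-contained sketch. It does not import Modalkit: `SketchPipeline`, its `run()` helper, and the method signatures are hypothetical stand-ins chosen for illustration, and the real InferencePipeline methods may take different arguments. The "model" is a toy word counter.

```python
from typing import Any


class SketchPipeline:
    """Hypothetical stand-in for InferencePipeline (not Modalkit's real class)."""

    def preprocess(self, raw: dict[str, Any]) -> list[str]:
        # Prepare raw inputs: trim whitespace and lowercase each text.
        return [text.strip().lower() for text in raw["texts"]]

    def predict(self, inputs: list[str]) -> list[int]:
        # Run "inference": a toy model that counts words per text.
        return [len(text.split()) for text in inputs]

    def postprocess(self, outputs: list[int]) -> dict[str, Any]:
        # Format outputs into a response payload.
        return {"word_counts": outputs}

    def run(self, raw: dict[str, Any]) -> dict[str, Any]:
        # Chain the stages: preprocess -> predict -> postprocess.
        return self.postprocess(self.predict(self.preprocess(raw)))


result = SketchPipeline().run({"texts": ["  Hello World ", "one"]})
print(result)
```

Keeping each stage in its own method means input cleaning and response formatting can change without touching the inference step.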
Modal Integration
Modalkit automatically configures:
- Container images and dependencies
- GPU resources
- Secrets and volumes
- Concurrency limits
- CloudBucketMounts for S3/GCS access
Configuration
A single YAML file configures your entire deployment:
app_settings:
  deployment_config:
    gpu: "T4"
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-models"
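To make the shape of this configuration concrete, here is a rough sketch of how the parsed YAML might be turned into typed settings. It is an illustration under assumptions, not Modalkit's implementation: `BucketMount`, `DeploymentConfig`, and `parse_deployment_config` are hypothetical names, and the input dict stands in for what a YAML parser would return for the file above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class BucketMount:
    # Hypothetical type mirroring one cloud_bucket_mounts entry.
    mount_point: str
    bucket_name: str


@dataclass
class DeploymentConfig:
    # Hypothetical type mirroring the deployment_config section.
    gpu: Optional[str] = None
    cloud_bucket_mounts: list[BucketMount] = field(default_factory=list)


def parse_deployment_config(parsed: dict[str, Any]) -> DeploymentConfig:
    # Walk down to app_settings.deployment_config and build typed objects.
    section = parsed["app_settings"]["deployment_config"]
    mounts = [BucketMount(**m) for m in section.get("cloud_bucket_mounts", [])]
    return DeploymentConfig(gpu=section.get("gpu"), cloud_bucket_mounts=mounts)


# Stand-in for the result of parsing the YAML shown above.
parsed_yaml = {
    "app_settings": {
        "deployment_config": {
            "gpu": "T4",
            "cloud_bucket_mounts": [
                {"mount_point": "/mnt/models", "bucket_name": "my-models"}
            ],
        }
    }
}

config = parse_deployment_config(parsed_yaml)
print(config.gpu, config.cloud_bucket_mounts[0].mount_point)
```

Typed objects like these surface a misspelled or missing key when the config is loaded rather than at deploy time.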
When to use Modalkit
✅ Use Modalkit when you want:
- Quick ML model deployment on Modal
- Standardized API patterns
- Configuration-driven deployments
- Built-in auth and validation
❌ Use Modal directly when you need:
- Non-ML workloads
- Custom networking requirements
- Fine-grained control over containers
Learn More
- Modal Documentation - Understand the platform
- Modalkit GitHub - Source code and issues
- Examples - Working code examples
Modalkit is an open-source project • Not affiliated with Modal Labs