Getting Started¶
This guide will walk you through creating your first ML deployment with Modalkit on Modal.
Prerequisites¶
-
Install Modal CLI
Follow the prompts to authenticate with your Modal account. -
Install Modalkit
Project Structure¶
Create a new directory for your project:
my-ml-project/
├── app.py # Modal app definition
├── model.py # Your ML model code
├── modalkit.yaml # Configuration
└── requirements.txt # Dependencies
Step 1: Define Your Model¶
Create model.py
with your inference logic:
from modalkit.inference_pipeline import InferencePipeline
from pydantic import BaseModel
from typing import List
# Define input/output schemas
class TextInput(BaseModel):
text: str
language: str = "en"
class TextOutput(BaseModel):
translated_text: str
confidence: float
# Implement your model
class TranslationModel(InferencePipeline):
def __init__(self, model_name: str, all_model_data_folder: str, common_settings: dict, *args, **kwargs):
super().__init__(model_name, all_model_data_folder, common_settings)
# Load your model here
# self.model = load_model(...)
def preprocess(self, input_list: List[TextInput]) -> dict:
"""Prepare batch of inputs for the model"""
texts = [item.text for item in input_list]
languages = [item.language for item in input_list]
return {"texts": texts, "languages": languages}
def predict(self, input_list: List[TextInput], preprocessed_data: dict) -> dict:
"""Run batch inference"""
# Your actual model inference
translations = [f"Translated: {text}" for text in preprocessed_data["texts"]]
scores = [0.95] * len(translations)
return {"translations": translations, "scores": scores}
def postprocess(self, input_list: List[TextInput], raw_output: dict) -> List[TextOutput]:
"""Format outputs"""
return [
TextOutput(translated_text=text, confidence=score)
for text, score in zip(raw_output["translations"], raw_output["scores"])
]
Step 2: Configure Your Deployment¶
Create modalkit.yaml
:
app_settings:
app_prefix: "translation-demo"
auth_config:
# For development - use hardcoded key
api_key: "dev-key-123"
auth_header: "x-api-key"
build_config:
image: "python:3.11-slim"
tag: "latest"
workdir: "/app"
deployment_config:
# No GPU for this example
gpu: null
concurrency_limit: 10
container_idle_timeout: 300
# Use Modal proxy auth in production
secure: false
batch_config:
max_batch_size: 8
wait_ms: 50
model_settings:
local_model_repository_folder: "./models"
common:
device: "cpu"
model_entries:
translation_model:
version: "1.0"
Step 3: Create Modal App¶
Create app.py
:
import modal
from modalkit.modal_service import ModalService, create_web_endpoints
from modalkit.modal_config import ModalConfig
from model import TranslationModel, TextInput, TextOutput
# Initialize Modalkit
modal_config = ModalConfig()
app = modal.App(name=modal_config.app_name)
# Define Modal app class
@app.cls(**modal_config.get_app_cls_settings())
class TranslationApp(ModalService):
inference_implementation = TranslationModel
model_name: str = modal.parameter(default="translation_model")
modal_utils: ModalConfig = modal_config
# Create API endpoints
@app.function(**modal_config.get_handler_settings())
@modal.asgi_app(**modal_config.get_asgi_app_settings())
def web_endpoints():
return create_web_endpoints(
app_cls=TranslationApp,
input_model=TextInput,
output_model=TextOutput
)
Step 4: Test Locally¶
Run your app locally with Modal:
Test with curl:
# Sync endpoint
curl -X POST http://localhost:8000/predict_sync \
-H "Content-Type: application/json" \
-H "x-api-key: dev-key-123" \
-d '{"text": "Hello world", "language": "en"}'
# Batch endpoint
curl -X POST http://localhost:8000/predict_batch \
-H "Content-Type: application/json" \
-H "x-api-key: dev-key-123" \
-d '[{"text": "Hello"}, {"text": "World"}]'
Step 5: Deploy to Modal¶
Deploy your app:
Your app is now live! Modal will provide the URL:
What's Next?¶
Add GPU Support¶
Use Cloud Storage¶
cloud_bucket_mounts:
- mount_point: "/mnt/models"
bucket_name: "my-model-bucket"
secret: "aws-credentials"
Enable Modal Proxy Auth¶
Add Async Processing¶
See the Async Example for queue-based base_inference.
Troubleshooting¶
Common Issues¶
- Import errors: Make sure all dependencies are in your container image
- Auth failures: Check your API key configuration
- GPU issues: Ensure your Modal workspace has GPU access
Getting Help¶
- Check Modal's documentation
- View Modalkit examples
- Open an issue on GitHub