# 🔮 Automagic Instrumentation
Tracelet's most powerful feature is automagic instrumentation - automatic detection and logging of machine learning hyperparameters with zero configuration. Just enable automagic mode and Tracelet intelligently captures your experiment parameters using advanced heuristics.
## Overview
Traditional experiment tracking requires manual logging of every hyperparameter:
```python
# Traditional approach - tedious and error-prone
experiment.log_params({
    "learning_rate": 0.001,
    "batch_size": 64,
    "epochs": 100,
    "dropout": 0.3,
    "hidden_layers": [256, 128, 64],
    "optimizer": "adam",
    # ... 20+ more parameters
})
```
With automagic instrumentation, this becomes:
```python
# Automagic approach - zero configuration
learning_rate = 0.001
batch_size = 64
epochs = 100
dropout = 0.3
hidden_layers = [256, 128, 64]
optimizer = "adam"
# All parameters automatically captured! ✨
```
## Quick Start

### Basic Automagic Usage
```python
from tracelet import Experiment

# Enable automagic mode
experiment = Experiment(
    name="automagic_experiment",
    backend=["mlflow"],
    automagic=True  # ✨ Enable automagic instrumentation
)
experiment.start()

# Define hyperparameters normally - they're captured automatically
learning_rate = 3e-4
batch_size = 128
epochs = 50
dropout_rate = 0.1
num_layers = 6

# Your training code here...
# Automagic captures all relevant variables!

experiment.end()
```
### Framework Integration
Automagic automatically hooks into popular ML frameworks:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from tracelet import Experiment

# Enable automagic
experiment = Experiment("pytorch_training", automagic=True)
experiment.start()

# Model hyperparameters (automatically captured)
learning_rate = 0.001
weight_decay = 1e-4
momentum = 0.9
batch_size = 32
epochs = 10
input_size = 784
hidden_size = 128
num_classes = 10

# Create a simple model (automatically instrumented)
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(hidden_size, num_classes)
)

optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
criterion = nn.CrossEntropyLoss()

# Create dummy training data for demonstration
train_data = torch.randn(1000, input_size)
train_targets = torch.randint(0, num_classes, (1000,))
train_dataset = TensorDataset(train_data, train_targets)
dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Training loop - metrics captured via framework hooks
for epoch in range(epochs):
    for batch_idx, (inputs, targets) in enumerate(dataloader):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()  # Learning rate automatically logged
        # Loss and gradient norms captured automatically (if track_model_gradients=True)

        # Optional: Break after a few batches for demonstration
        if batch_idx >= 2:
            break

experiment.end()
```
## How Automagic Works

### Intelligent Parameter Detection
Automagic uses sophisticated heuristics to identify ML-relevant parameters:
#### 1. Name Pattern Recognition
```python
# These are automatically detected by name patterns
learning_rate = 0.001  # Contains "rate"
batch_size = 64        # Contains "size"
num_layers = 5         # Starts with "num_"
hidden_dim = 256       # Contains "dim"
max_epochs = 100       # Contains "epoch"
```
#### 2. Value Range Analysis
```python
# Detected by typical ML value ranges
learning_rate = 3e-4  # 0.00001 - 0.1 range
dropout = 0.3         # 0 - 1 range for rates
batch_size = 128      # 1 - 1024 range (power of 2)
temperature = 2.0     # Typical softmax temperature range
```
#### 3. Data Type Classification
```python
# Boolean hyperparameters
use_layer_norm = True    # Boolean flags
enable_dropout = False   # use_*, enable_*, has_*

# String configurations
optimizer = "adamw"      # Optimizer names
activation = "gelu"      # Activation functions
lr_scheduler = "cosine"  # Scheduler types
```
#### 4. Keyword Detection
```python
# ML-specific keywords automatically recognized
alpha = 0.7       # Regularization parameter
beta1 = 0.9       # Optimizer beta
epsilon = 1e-8    # Numerical stability
patience = 10     # Early stopping
threshold = 1e-4  # Convergence threshold
```
### Framework Hooks
Automagic automatically instruments popular ML frameworks:
#### PyTorch Integration
- Optimizer hooks: Automatically log learning rates and gradient norms (see the sketch below)
- Loss function hooks: Capture loss values during forward passes. Note: this only works with `nn.Module`-based losses such as `nn.CrossEntropyLoss()`, not with functional equivalents like `torch.nn.functional.cross_entropy`
- Model hooks: Track model architecture and parameter counts
- Checkpoint hooks: Monitor model saving and loading
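To picture how such a hook can work, the sketch below uses PyTorch's public `register_step_post_hook` API (PyTorch 1.13+) to read the learning rate after each `optimizer.step()`. It is an illustrative approximation, not Tracelet's actual implementation; `log_lr` is a hypothetical callback:

```python
# Illustrative sketch of an optimizer step hook (not Tracelet's internals).
import torch
import torch.optim as optim

def log_lr(optimizer, args, kwargs):
    # Called after every optimizer.step(); reads the current learning rate(s)
    for i, group in enumerate(optimizer.param_groups):
        print(f"param_group[{i}] lr = {group['lr']}")

param = torch.nn.Parameter(torch.zeros(3))
opt = optim.SGD([param], lr=0.1)
opt.register_step_post_hook(log_lr)  # requires PyTorch >= 1.13

param.grad = torch.ones(3)
opt.step()  # hook fires: prints "param_group[0] lr = 0.1"
```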
#### Scikit-learn Integration
- Estimator hooks: Capture model hyperparameters during `.fit()` calls (see the sketch below)
- Dataset hooks: Log training set size and feature dimensions
- Prediction hooks: Track inference statistics
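A minimal sketch of the estimator-hook idea: wrap `.fit()` and record the estimator's own `get_params()` at call time. The `capture_on_fit` helper is hypothetical, shown only to make the mechanism concrete:

```python
# Sketch: wrap fit() to capture hyperparameters via get_params()
# (hypothetical helper, not Tracelet's implementation).
from functools import wraps

import numpy as np
from sklearn.linear_model import LogisticRegression

def capture_on_fit(estimator, on_params):
    original_fit = estimator.fit

    @wraps(original_fit)
    def fit(X, y=None, **fit_kwargs):
        on_params(estimator.get_params())  # e.g. {"C": 0.5, "penalty": "l2", ...}
        return original_fit(X, y, **fit_kwargs)

    estimator.fit = fit  # shadow the bound method on this instance
    return estimator

clf = capture_on_fit(LogisticRegression(C=0.5), on_params=print)
clf.fit(np.random.randn(40, 3), np.random.randint(0, 2, 40))
```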
#### XGBoost Integration
- Training hooks: Capture boosting parameters and evaluation metrics (see the sketch below)
- Parameter extraction: Automatic detection of tree-specific settings
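The boosting parameters a training hook records are the ones handed to `xgb.train`. A hedged sketch of the manual equivalent, on dummy data:

```python
# Sketch: the boosting parameters an XGBoost training hook would capture
# are exactly those passed to xgb.train (logged manually here).
import numpy as np
import xgboost as xgb

params = {"max_depth": 4, "eta": 0.1, "objective": "binary:logistic"}
dtrain = xgb.DMatrix(np.random.randn(200, 5), label=np.random.randint(0, 2, 200))

print("boosting params:", params)  # what a hook would log automatically
booster = xgb.train(params, dtrain, num_boost_round=10)
```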
### Smart Filtering
Automagic intelligently excludes non-relevant variables:
```python
# ❌ Automatically excluded
i = 0                    # Loop variables
model = nn.Sequential()  # Complex objects
device = "cuda"          # System variables
tmp_value = 123          # Temporary variables
_private_var = "test"    # Private variables

# ✅ Automatically included
learning_rate = 0.001    # ML hyperparameter
batch_size = 64          # Training parameter
use_attention = True     # Boolean configuration
```
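The inclusion/exclusion rules can be pictured as a simple predicate. The sketch below is a rough model of the heuristics described in this section, not Tracelet's actual filter:

```python
# Rough model of automagic's filtering heuristics (illustrative only).
ML_NAME_HINTS = ("rate", "size", "dim", "epoch", "num_", "use_", "enable_", "has_")

def looks_like_hyperparam(name: str, value) -> bool:
    if name.startswith("_") or len(name) <= 2:  # private and loop variables
        return False
    if not isinstance(value, (bool, int, float, str, list, tuple)):
        return False  # complex objects such as models
    return any(hint in name for hint in ML_NAME_HINTS)

print(looks_like_hyperparam("learning_rate", 0.001))  # True
print(looks_like_hyperparam("i", 0))                  # False (loop variable)
print(looks_like_hyperparam("_private_var", "test"))  # False (private)
```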
## Configuration

### Automagic Settings
Control automagic behavior through configuration:
```python
from tracelet import Experiment
from tracelet.automagic import AutomagicConfig

config = AutomagicConfig(
    # Hyperparameter detection
    detect_function_args=True,      # Function argument scanning
    detect_class_attributes=True,   # Class attribute detection
    detect_argparse=True,           # Command-line argument parsing
    detect_config_files=True,       # Configuration file parsing

    # Model tracking
    track_model_architecture=True,  # Model structure capture
    track_model_checkpoints=True,   # Checkpoint monitoring
    track_model_gradients=False,    # Gradient tracking (expensive)

    # Dataset tracking
    track_dataset_info=True,        # Dataset statistics
    track_data_samples=False,       # Data sample logging (privacy)

    # Training monitoring
    monitor_training_loop=True,     # Training progress detection
    monitor_loss_curves=True,       # Loss trend analysis
    monitor_learning_rate=True,     # LR schedule tracking

    # Resource monitoring
    monitor_gpu_memory=True,        # GPU usage tracking
    monitor_cpu_usage=True,         # CPU utilization

    # Framework selection
    frameworks={"pytorch", "sklearn", "xgboost"}
)

experiment = Experiment(
    "configured_experiment",
    automagic=True,
    automagic_config=config
)
```
### Environment Variables
Configure automagic via environment variables:
```bash
# Enable/disable automagic
export TRACELET_ENABLE_AUTOMAGIC=true

# Select frameworks to instrument
export TRACELET_AUTOMAGIC_FRAMEWORKS="pytorch,sklearn"

# Control detection scope
export TRACELET_DETECT_FUNCTION_ARGS=true
export TRACELET_DETECT_CLASS_ATTRIBUTES=true
export TRACELET_TRACK_MODEL_ARCHITECTURE=true
```
## Advanced Usage

### Manual Hyperparameter Capture
Force an immediate capture of the hyperparameters in the current scope:
```python
from tracelet.automagic import capture_hyperparams

# Capture current scope variables and log them automatically
# Note: This function logs to the experiment itself - no need to call log_params
hyperparams = capture_hyperparams(experiment)

# Alternative: Use the experiment's built-in method
hyperparams = experiment.capture_hyperparams()
```
### Custom Detection Rules
Automagic uses built-in detection patterns that can be customized through configuration:
```python
from tracelet import Experiment
from tracelet.automagic import AutomagicConfig

# Configure detection patterns through AutomagicConfig
config = AutomagicConfig(
    # Enable/disable detection methods
    detect_function_args=True,
    detect_argparse=True,
    detect_config_files=True,

    # Framework-specific detection
    frameworks={"pytorch", "sklearn", "xgboost"}
)

experiment = Experiment(
    "custom_detection",
    automagic=True,
    automagic_config=config
)
```
### Integration with Existing Code
Automagic works seamlessly with existing tracking:
experiment = Experiment("mixed_tracking", automagic=True)
experiment.start()
# Automagic captures these automatically
learning_rate = 0.001
batch_size = 64
# Manual logging still works
experiment.log_params({
"model_name": "custom_transformer",
"dataset_version": "v2.1"
})
# Both automatic and manual tracking combined!
## Performance Considerations

### Overhead
Automagic is designed for minimal performance impact:
- Frame inspection: ~0.1ms per variable check
- Hook installation: One-time cost at experiment start
- Metric capture: Asynchronous, non-blocking
- Memory usage: <10MB for typical experiments
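The frame-inspection figure can be sanity-checked with a micro-benchmark along these lines (numbers vary by hardware and Python version; this is not Tracelet's actual code path):

```python
# Micro-benchmark: cost of scanning a caller's local variables once.
import inspect
import time

def scan_caller_locals():
    frame = inspect.currentframe().f_back
    return {k: v for k, v in frame.f_locals.items() if not k.startswith("_")}

def run():
    learning_rate = 0.001  # a variable for the scan to find
    n = 1000
    start = time.perf_counter()
    for _ in range(n):
        scan_caller_locals()
    per_check_ms = (time.perf_counter() - start) / n * 1000
    print(f"~{per_check_ms:.3f} ms per variable scan")

run()
```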
### Best Practices
- Scope management: Define hyperparameters at function/class level
- Naming conventions: Use descriptive, ML-specific variable names
- Framework integration: Let automagic handle metric capture
- Selective enabling: Disable expensive features if not needed
```python
# ✅ Good practice
def train_model():
    learning_rate = 0.001  # Clear scope
    batch_size = 64        # Descriptive name
    experiment = Experiment("training", automagic=True)
    # Automagic captures hyperparameters from this scope

# ❌ Avoid
learning_rate = 0.001  # Global scope (harder to detect)
lr = 0.001             # Ambiguous name
```
## Troubleshooting

### Common Issues
**Hyperparameters not detected:**
```python
# Check variable scope and naming
def train():
    learning_rate = 0.001  # ✅ Function scope
    lr = 0.001             # ❌ Ambiguous name

# Ensure automagic is enabled
experiment = Experiment("test", automagic=True)  # ✅
```
**Framework hooks not working:**
```python
# Import frameworks after starting the experiment
experiment.start()
import torch  # ✅ Hooks installed after this import

# Or restart the Python session if hooks conflict
```
**Performance concerns:**
```python
from tracelet.automagic import AutomagicConfig

# Disable expensive features
config = AutomagicConfig(
    track_model_gradients=False,  # Expensive
    track_data_samples=False,     # Privacy risk
    monitor_cpu_usage=False       # High frequency
)
```
### Debug Mode
Enable detailed logging to understand automagic behavior:
```python
import logging

logging.getLogger("tracelet.automagic").setLevel(logging.DEBUG)

experiment = Experiment("debug", automagic=True)
# Detailed logs show the detection process
```
## Examples
For comprehensive automagic usage patterns and best practices, see the examples documentation and the examples directory.
## API Reference
For detailed API documentation, refer to the automagic module source code and docstrings.