Distributed Rate Limiting Guide
Phlow includes distributed rate limiting to protect your agents from abuse and ensure fair resource usage across multiple instances.
Overview
The rate limiting system provides:
- Distributed limiting - Works across multiple instances using Redis
- Automatic fallback - Falls back to in-memory limiting if Redis is unavailable
- Sliding window algorithm - Accurate request counting over time
- Multiple limit types - Different limits for authentication and role requests
Configuration
Basic Setup (In-Memory)
By default, Phlow uses in-memory rate limiting, which is sufficient for a single instance:
from phlow import PhlowMiddleware, PhlowConfig
# Rate limits are configured automatically:
# - Authentication: 60 requests per minute
# - Role requests: 10 requests per minute
# `config` is the PhlowConfig instance you construct for your agent
middleware = PhlowMiddleware(config)
Distributed Setup (Redis)
For production deployments with multiple instances, enable Redis:
# Set Redis URL in environment
export REDIS_URL="redis://localhost:6379/0"
# Or with authentication
export REDIS_URL="redis://:password@redis-server:6379/0"
Phlow will automatically detect and use Redis when available.
Redis Deployment Options
1. Local Redis
# Start Redis locally
docker run -d -p 6379:6379 redis:alpine
# Configure Phlow
export REDIS_URL="redis://localhost:6379/0"
2. Redis Cloud
# Use Redis Cloud connection string
export REDIS_URL="redis://:password@redis-cloud-endpoint:16379/0"
3. AWS ElastiCache
# Use ElastiCache endpoint
export REDIS_URL="redis://my-cache-cluster.abc123.cache.amazonaws.com:6379/0"
Rate Limit Configuration
Default Limits
Operation | Limit | Window | Identifier |
---|---|---|---|
Authentication | 60/min | 1 minute | Token hash |
Role Requests | 10/min | 1 minute | Agent ID |
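The identifiers in the table map naturally onto namespaced keys such as the `phlow:ratelimit:auth:abc123` key shown in the Redis monitoring section. The following is a minimal sketch of how such keys could be derived; `rate_limit_key`, the `role` operation name, and the 16-character SHA-256 prefix are assumptions for illustration, not Phlow's actual key scheme.

```python
import hashlib


def rate_limit_key(operation: str, identifier: str, hash_identifier: bool = False) -> str:
    """Build a namespaced rate limit key (hypothetical helper, not Phlow's API)."""
    if hash_identifier:
        # Hash secrets such as tokens so the raw value never appears in a key.
        digest = hashlib.sha256(identifier.encode("utf-8")).hexdigest()
        identifier = digest[:16]  # truncated prefix; the length is an assumption
    return f"phlow:ratelimit:{operation}:{identifier}"


# Authentication is limited per token hash, role requests per agent ID.
auth_key = rate_limit_key("auth", "example-bearer-token", hash_identifier=True)
role_key = rate_limit_key("role", "agent-42")
```

Hashing the token keeps credentials out of Redis key listings and logs while still giving each token its own counter.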
Custom Rate Limiters
Create custom rate limiters for your endpoints:
import os

from fastapi import FastAPI, Request  # assumes a FastAPI app; adapt to your framework
from phlow.distributed_rate_limiter import DistributedRateLimiter

app = FastAPI()

# Create a custom rate limiter
api_limiter = DistributedRateLimiter(
    max_requests=100,  # 100 requests
    window_ms=60_000,  # per minute
    redis_url=os.getenv("REDIS_URL"),
)

# Use in your endpoint
@app.post("/api/generate")
async def generate(request: Request):
    # Rate limit by IP address
    client_ip = request.client.host
    api_limiter.check_and_raise(client_ip)
    # Process request...
How It Works
Sliding Window Algorithm
The rate limiter uses a sliding window algorithm for accurate counting:
- Each request is timestamped and stored
- Old requests outside the window are removed
- Current window requests are counted
- New request is allowed if under limit
This provides smooth rate limiting without sudden resets.
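The four steps above can be sketched in a few lines of plain Python. This is an illustrative in-memory version, not Phlow's implementation; the class name `SlidingWindowLimiter` and the optional `now_ms` parameter (used here to make the example deterministic) are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Illustrative in-memory sliding window limiter (single process, not thread-safe)."""

    def __init__(self, max_requests: int, window_ms: int) -> None:
        self.max_requests = max_requests
        self.window_ms = window_ms
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def is_allowed(self, key: str, now_ms: Optional[int] = None) -> bool:
        if now_ms is None:
            now_ms = int(time.monotonic() * 1000)
        hits = self._hits[key]
        cutoff = now_ms - self.window_ms
        # Remove requests that have fallen outside the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        # Count what remains and reject if the limit is already reached.
        if len(hits) >= self.max_requests:
            return False
        # Timestamp and store the new request.
        hits.append(now_ms)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_ms=1_000)
results = [limiter.is_allowed("client", now_ms=t) for t in (0, 100, 200, 1_050)]
print(results)  # [True, True, False, True]
```

The third request is rejected because two requests already sit inside the one-second window. The fourth is accepted because the request at `t=0` has expired by then, even though no fixed window boundary has been crossed.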
Redis Implementation
When Redis is available:
- Uses sorted sets for efficient storage
- Atomic operations prevent race conditions
- Automatic key expiration for cleanup
- Handles Redis failures gracefully
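One common way to combine sorted sets, atomic execution, and key expiration is a MULTI/EXEC pipeline, sketched below with redis-py style calls. This is an assumption about the general technique, not a description of Phlow's internals; the function name and parameters are hypothetical, and `client` is expected to be a `redis.Redis` instance (for example from `redis.Redis.from_url`).

```python
import time
import uuid
from typing import Any, Optional


def redis_sliding_window_allow(
    client: Any,
    key: str,
    max_requests: int,
    window_ms: int,
    now_ms: Optional[int] = None,
) -> bool:
    """Sliding window check on a Redis sorted set (illustrative sketch)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # A unique member so two requests in the same millisecond are both stored.
    member = f"{now_ms}-{uuid.uuid4().hex}"

    # transaction=True wraps the queued commands in MULTI/EXEC so they run atomically.
    pipe = client.pipeline(transaction=True)
    pipe.zremrangebyscore(key, 0, now_ms - window_ms)  # drop entries outside the window
    pipe.zadd(key, {member: now_ms})                   # record this request, score = timestamp
    pipe.zcard(key)                                    # count entries in the window
    pipe.pexpire(key, window_ms)                       # let idle keys expire on their own
    _, _, count, _ = pipe.execute()
    return count <= max_requests
```

In this variant a rejected request is still recorded, so a client that keeps hammering the endpoint stays limited until it slows down. If rejected requests should not count, the check and the insert can be moved into a Lua script instead.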
Fallback Behavior
If Redis is unavailable:
- Automatically falls back to in-memory limiting
- Logs warning about degraded functionality
- Continues protecting your service
- No code changes required
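The fallback behavior can be pictured as a thin wrapper around two limiters. The sketch below is a hypothetical illustration of that pattern, not Phlow's code; `FallbackLimiter` and its parameters are assumed names. With redis-py you would typically pass `errors=(redis.exceptions.RedisError,)`, since its exceptions do not derive from the built-in `OSError`.

```python
import logging
from typing import Callable, Tuple, Type

logger = logging.getLogger("ratelimit.fallback")


class FallbackLimiter:
    """Try a primary (e.g. Redis-backed) check, fall back to a local one on failure."""

    def __init__(
        self,
        primary: Callable[[str], bool],
        fallback: Callable[[str], bool],
        errors: Tuple[Type[BaseException], ...] = (OSError,),
    ) -> None:
        self._primary = primary
        self._fallback = fallback
        self._errors = errors
        self.degraded = False

    def is_allowed(self, key: str) -> bool:
        try:
            allowed = self._primary(key)
            self.degraded = False  # primary is healthy again
            return allowed
        except self._errors as exc:
            if not self.degraded:
                # Log once per outage rather than once per request.
                logger.warning("Primary rate limiter unavailable, using in-memory fallback: %s", exc)
                self.degraded = True
            return self._fallback(key)


def broken_primary(key: str) -> bool:
    raise ConnectionError("redis down")  # simulates an unreachable Redis


limiter = FallbackLimiter(broken_primary, fallback=lambda key: True)
allowed = limiter.is_allowed("client")
```

While degraded, each instance enforces the limit on its own, so the effective global limit is roughly the per-instance limit multiplied by the number of instances.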
Monitoring
Rate Limit Headers
Add rate limit headers to responses:
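import time

from fastapi import FastAPI, Request  # assumes a FastAPI app; adapt to your framework
from fastapi.responses import JSONResponse
from phlow.distributed_rate_limiter import DistributedRateLimiter

app = FastAPI()

# Create the limiter once at module level so its state is shared across requests
limiter = DistributedRateLimiter(100, 60_000)

@app.post("/api/endpoint")
async def endpoint(request: Request):
    client_id = request.client.host  # or a user ID or API key
    # Check if allowed
    if not limiter.is_allowed(client_id):
        # Return rate limit headers
        return JSONResponse(
            status_code=429,
            content={"error": "Rate limit exceeded"},
            headers={
                "X-RateLimit-Limit": "100",
                "X-RateLimit-Remaining": "0",
                "X-RateLimit-Reset": str(int(time.time()) + 60),
            },
        )
    # Process request...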
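To avoid hard-coding header values in each endpoint, the headers can be built by a small helper. This is a sketch under stated assumptions: `rate_limit_headers` is a hypothetical function, the caller supplies the remaining count, and the reset time is a conservative upper bound of one full window from now. The standard `Retry-After` header is added on exhaustion so clients know how long to wait.

```python
import math
import time
from typing import Dict, Optional


def rate_limit_headers(
    limit: int,
    remaining: int,
    window_ms: int,
    now: Optional[float] = None,
) -> Dict[str, str]:
    """Build rate limit response headers (hypothetical helper, not Phlow's API)."""
    now = time.time() if now is None else now
    reset_in = math.ceil(window_ms / 1000)  # seconds until a full window has passed
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(int(now) + reset_in),  # Unix timestamp
    }
    if remaining <= 0:
        # Tell well-behaved clients how many seconds to wait before retrying.
        headers["Retry-After"] = str(reset_in)
    return headers


exhausted = rate_limit_headers(limit=100, remaining=0, window_ms=60_000, now=1_700_000_000)
```

With a sliding window, capacity usually frees up earlier than the reported reset time, so clients that honor it will err on the safe side.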
Redis Monitoring
Monitor Redis rate limiting:
# Connect to Redis
redis-cli
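# List rate limit keys (KEYS blocks the server while it scans; prefer SCAN on busy instances)
KEYS phlow:ratelimit:*
# Inspect the timestamps stored for a specific rate limit key
ZRANGE phlow:ratelimit:auth:abc123 0 -1 WITHSCORES
# Watch all commands in real time (MONITOR adds overhead; run it only briefly)
MONITOR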
Best Practices
- Use Redis in Production - In-memory limiting doesn't work across instances
- Set Reasonable Limits - Balance security with user experience
- Monitor Usage - Track rate limit hits to adjust limits
- Use Different Identifiers - Rate limit by user ID, IP, or API key as appropriate
- Handle 429 Responses - Implement exponential backoff in clients
- Configure Redis Persistence - Ensure rate limits survive Redis restarts
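For the client side of "Handle 429 Responses", a minimal sketch of exponential backoff with jitter follows. The function name, the `send` callable returning a `(status, body)` tuple, and the default delays are all assumptions for illustration; plug in whatever HTTP client you use.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,  # must be at least 1
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry `send` on HTTP 429 with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429 to the caller
        # The delay cap doubles each attempt: 0.5s, 1s, 2s, ... up to max_delay.
        cap = min(max_delay, base_delay * (2 ** attempt))
        # Full jitter spreads retries out so clients do not retry in lockstep.
        sleep(random.uniform(0, cap))
    return status, body


# Simulated server: two rate-limited responses, then success.
responses = iter([(429, ""), (429, ""), (200, "ok")])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
```

Injecting `sleep` keeps the example fast to run and makes the retry schedule easy to inspect. If the server sends a `Retry-After` header, prefer that value over the computed delay.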
Troubleshooting
Redis Connection Failed
If you see "Failed to connect to Redis, falling back to in-memory rate limiting":
- Check Redis is running:
redis-cli ping
- Verify connection string:
redis-cli -u "$REDIS_URL" ping
- Check network/firewall rules
- Verify Redis version >= 5.0
Rate Limits Not Working Across Instances
- Ensure all instances use the same Redis
- Check Redis connectivity from each instance
- Verify REDIS_URL is set correctly
- Monitor Redis keys to confirm usage
Performance Issues
- Use Redis connection pooling (automatic)
- Consider Redis cluster for high load
- Tune Redis memory settings
- Use local Redis replicas for read-heavy workloads
Advanced Usage
Custom Storage Backend
Implement custom storage for rate limiting:
import boto3

from phlow.distributed_rate_limiter import DistributedRateLimiter, RateLimitBackend

class DynamoDBRateLimitBackend(RateLimitBackend):
    def __init__(self, table_name: str):
        self.table = boto3.resource("dynamodb").Table(table_name)

    def is_allowed(self, key: str, max_requests: int, window_ms: int) -> bool:
        # Implement using DynamoDB
        raise NotImplementedError

    def reset(self, key: str) -> None:
        # Clear rate limit for key
        raise NotImplementedError

# Use custom backend
limiter = DistributedRateLimiter(
    max_requests=100,
    window_ms=60_000,
    backend=DynamoDBRateLimitBackend("rate-limits"),
)
Rate Limit Strategies
Different strategies for different use cases:
from phlow.distributed_rate_limiter import DistributedRateLimiter

# Per-user limiting
user_limiter = DistributedRateLimiter(1000, 3_600_000)  # 1000/hour
user_limiter.check_and_raise(f"user:{user_id}")
# Per-IP limiting
ip_limiter = DistributedRateLimiter(100, 60_000)  # 100/min
ip_limiter.check_and_raise(f"ip:{client_ip}")
# Per-API key limiting
key_limiter = DistributedRateLimiter(10_000, 86_400_000)  # 10k/day
key_limiter.check_and_raise(f"key:{api_key}")
# Combined limiting (most restrictive wins): the first exceeded limit raises.
# Use this loop instead of the individual calls above, otherwise each request is counted twice.
for limiter, identifier in [
    (user_limiter, f"user:{user_id}"),
    (ip_limiter, f"ip:{client_ip}"),
    (key_limiter, f"key:{api_key}"),
]:
    limiter.check_and_raise(identifier)