A comprehensive overview of modern ML training infrastructure, covering cloud agnosticism, spot instances, on-premise solutions, heterogeneous hardware, distributed training, and emerging GPU cloud providers.
Blog
Machine Learning Infrastructure, ML Training Infrastructure, Cloud Agnostic ML, Spot Training ML, On-Premise ML Training, Heterogeneous Hardware ML, Distributed Training ML, GPU Cloud Providers, Skypilot, AI Infrastructure, MLOps, Cloud Computing for ML, Cost-Effective ML Training, Scalable ML Infrastructure, Modern ML Training
A practical guide for startup founders on when and how to invest in MLOps, from early-stage flexibility to scaling infrastructure, with key principles and pitfalls to avoid.
MLOps, Machine Learning, Startups, Infrastructure, AI Engineering, ML Engineering, Model Deployment, LLMOps, ML Tooling, Data Pipelines, Observability, Training Infrastructure, AI in Startups, Scaling ML Teams
Explore why building tech for India means rethinking everything from the ground up—addressing local habits, scale, price sensitivity, and cultural diversity with first-principles design.
Tech in India, First Principles Thinking, Product Localization, Digital India, AI in India
The history of full-stack development and machine learning, and how their convergence led to the emergence of the full-stack ML engineer.
Machine Learning, Ecosystem, Full stack, ML frameworks, Cloud infrastructure
Problems in the ML ecosystem: how fragmentation in machine learning keeps preventing the stack from growing higher, and how I took a stab at the issue.
Machine Learning, PyTorch, Ecosystem, Fragmentation, ML frameworks
A guide to setting up a productive Python environment using Pipenv and Pyenv.
Python, Pipenv, Pyenv, Development, Productivity