Sandeep Yadav
AI Engineer
New Delhi, India
4+ years building everything from scratch.
From fine-tuning LLMs to shipping agentic workflows handling real-world traffic — I've owned the full lifecycle.
One day, I stopped chasing titles. I started chasing clarity.
What drives me doesn't fit on a resume.
Building systems that last.
12+ open-source contributions. 3 production ML pipelines. Millions of inference requests served.
Training, experiment tracking, inference at scale, and the tooling that holds it all together.
Bigger problems. Harder systems. End-to-end.
Ready for what's next.
Start Here
New here? These are the best places to begin.
A HuggingFace Transformers contribution that powers thousands of inference pipelines.
A deep dive into building reliable, production-grade ML infrastructure.
Thoughts on practical AI engineering and shipping models to production.
Experience
Building LLM-powered applications and agentic workflows. Deploying inference pipelines on AWS.
- Shipped 3 production LLM apps serving 100K+ daily users
- Reduced inference latency by 40% with a custom vLLM deployment
Designed ML pipelines and real-time feature stores for high-volume prediction serving.
- Built real-time feature store serving 50M+ predictions/day
- Reduced model training time by 60% with distributed training
Full-stack development with Python and React. Led migration of monolith to microservices on Kubernetes.
- Led monolith-to-microservices migration on Kubernetes
- Built CI/CD pipelines that reduced deploy time from hours to minutes
Open Source
Added efficient batch decoding for streaming inference pipelines.
Implemented custom sampling strategies for domain-specific generation.
Opinionated ML pipeline toolkit for rapid experimentation and deployment.
Speaking
Skills
ML / AI
MLOps & Data
Programming
Infra & Cloud
Soft Skills
Spoken Languages
Education
Focus on Machine Learning and Natural Language Processing.
Graduated with honors. Thesis on deep learning for medical imaging.
Led by Dr. Marily Nika (ex-Google PM). Completed capstone project.
Awards
Built a multi-agent document understanding pipeline in 48 hours.
Recognized for sustained contributions to ML ecosystem projects.
Testimonials
"One of the most thoughtful engineers I've worked with. Takes complex ML problems and delivers clean, production-ready solutions."
"Their open-source contributions to our inference pipeline saved us weeks of work. Clear code, excellent documentation."
"Rare combination of deep ML knowledge and strong engineering fundamentals. Ships reliable systems, not just notebooks."
FAQ
Yes — I take on select projects involving LLM applications, ML infrastructure, and AI strategy. Reach out via email to discuss.
Python + PyTorch for ML, HuggingFace for models, FastAPI for serving, Docker + K8s for deployment, and AWS for cloud infrastructure.
Actively. I contribute to HuggingFace Transformers and vLLM, and I maintain a few of my own tools. Check the Open Source section above.
Use the Calendly link on the contact page, or send me an email. I typically respond within 48 hours.