As an AI Engineer at Acme AI, I work across the full LLM application lifecycle, from model fine-tuning to production deployment.
Key Achievements
- RAG platform — built enterprise RAG system serving 10K+ queries/day with 94% accuracy
- Fine-tuning pipeline — established QLoRA fine-tuning workflow that ships 3 domain models/quarter
- Inference optimization — migrated to vLLM, reducing serving costs by 65%
- Agent framework — designed LangGraph-based agent orchestration adopted across 3 product teams
Tech Stack
Python, PyTorch, LangGraph, LangChain, HuggingFace, Unsloth, vLLM, MLflow, Qdrant, FastAPI, AWS