Talk: “Beyond Accuracy — Monitoring LLMs in Production”
Most teams ship LLMs with basic latency and error-rate monitoring and call it done. This talk covered the monitoring patterns that catch problems before users complain.
Topics Covered
- Semantic drift detection — embedding-based monitoring to catch when input distributions shift
- Response quality scoring — lightweight LLM-as-judge pipelines that run async on sampled outputs
- Cost attribution — tracing token usage back to features and user segments
- Alert fatigue prevention — adaptive thresholds that account for natural usage pattern changes
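
To make the first and last topics concrete, here is a minimal sketch of how embedding-based drift detection could feed an adaptive alert threshold. It is not the speaker's implementation. The function names, the centroid-plus-cosine-distance drift score, and the "mean plus k standard deviations" threshold are all assumptions chosen for illustration, and the embeddings are assumed to come from whatever model you already use.

```python
import math
from statistics import mean, stdev


def centroid(vectors):
    """Component-wise mean of a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity. Assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def drift_score(reference, current):
    """Hypothetical drift metric: distance between the centroid of a
    reference window of input embeddings and the centroid of the
    current window."""
    return cosine_distance(centroid(reference), centroid(current))


def adaptive_threshold(history, k=3.0):
    """Hypothetical adaptive threshold: mean of recent drift scores plus
    k standard deviations, so the alert level follows normal variation.
    Needs at least two past scores."""
    return mean(history) + k * stdev(history)


def is_drifting(score, history, k=3.0):
    """True when the latest score exceeds the adaptive threshold."""
    return score > adaptive_threshold(history, k)


# Toy 2-D "embeddings"; real ones would have hundreds of dimensions.
reference_window = [[1.0, 0.0], [0.9, 0.1]]
current_window = [[0.1, 0.9], [0.0, 1.0]]
recent_scores = [0.010, 0.020, 0.015, 0.012]

score = drift_score(reference_window, current_window)
print(is_drifting(score, recent_scores))
```

Comparing centroids is cheap but only catches shifts in the average input. A production setup would likely also track spread or use a per-cluster comparison, and would recompute the threshold over a sliding window so that weekly or seasonal usage patterns do not trigger alerts.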
Organizer Role
This was the 24th edition of our monthly meetup. As co-organizer, I handle venue logistics, speaker sourcing, and post-event content publishing. We’ve grown from 20 to 85 regular attendees over two years.