DevOps with love

"We're running AI/ML workloads and don't know how to optimize cost or latency."

From inference serving on Kubernetes to GPU and spot-instance usage and autoscaling: I right-size resources, mix spot with on-demand capacity, and add observability to AI pipelines. I've optimized the infrastructure AI workloads run on so that cost and performance are predictable instead of a black box.
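As a concrete example of what "right-sizing" and a "spot mix" can look like, here is a minimal Kubernetes Deployment sketch for an inference workload. Everything in it is illustrative: the names, the image, and the spot node label/taint (shown here in GKE's form) are assumptions, not taken from a real cluster.

```yaml
# Illustrative sketch only: names, image, and node-pool labels/taints
# are assumptions and will differ per cluster and cloud provider.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server          # hypothetical inference workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-server
  template:
    metadata:
      labels:
        app: inference-server
    spec:
      # Spot mix: prefer cheaper spot/preemptible nodes; assumes the
      # node pool is labeled and tainted this way (GKE-style example).
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      tolerations:
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: server
          image: example.com/inference:latest   # placeholder image
          resources:
            # Right-sizing: requests set from profiling, not guesswork.
            requests:
              cpu: "2"
              memory: 4Gi
              nvidia.com/gpu: 1
            limits:
              nvidia.com/gpu: 1
```

For interruption-tolerant batch inference the whole deployment can run on spot; for latency-sensitive serving, a common pattern is two deployments (one on-demand baseline plus a spot overflow pool) behind the same Service.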

© 2010 - 2026 Kirill Kazakov · Powered by Hugo & Coder.