Blog — 99x.dev

Jan 2025 LLMs

Building Production RAG Systems: Lessons from the Trenches

Retrieval-augmented generation sounds simple in theory. In practice, getting it right requires careful attention to chunking strategies, embedding models, and retrieval pipelines.

Dec 2024 Infrastructure

Scaling ML Inference: From Single GPU to Distributed Systems

When your model outgrows a single machine, the real engineering challenges begin. A practical guide to distributed inference with Ray and Kubernetes.

Nov 2024 Architecture

Actor-Critic in Production: Reinforcement Learning Beyond Research

Moving RL from notebooks to production systems requires rethinking everything from training loops to deployment strategies. Here's what we learned.

Oct 2024 Engineering

The Case for Boring AI Infrastructure

Not every problem needs a cutting-edge solution. Sometimes PostgreSQL, Redis, and well-structured Python get you further than the latest framework.
