Updates
Written thoughts and team deep-dives on modern AI inference, orchestration, and production serving, by PipeShift.
Posts

How to Deploy Gemma 4
Google's best open model, without the complexity.
4/8/26

How to Deploy Gemma 3
Google's open model, self-hosted and production-ready.
3/30/26

How to Deploy Whisper v3
OpenAI's best transcription model, yours to self-host.
3/26/26

How to Deploy DeepSeek v3.2
Run DeepSeek V3.2 on your own infrastructure.
3/24/26

The Black Box Trap: How Rented AI Kills Margins at Scale
What rented inference costs you and how to avoid it.
3/9/26

Understanding Latency in AI Model Deployment
The components that make up latency in AI agents.
3/4/26