Deployment Patterns

How to structure an AI deployment that won’t collapse under its own weight.

For Solo Implementers

One person, no budget, no dedicated IT team. The most common scenario and the most fragile.

Key concepts:

Self-Hosted AI — Reduce dependency on external APIs
Human-in-the-Loop — Keep a safety net until you’re confident
Scope Creep — Start narrow, expand slowly
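
As a rough illustration of the human-in-the-loop safety net, the sketch below routes low-confidence outputs to a review queue instead of applying them automatically. The `ReviewGate` class, the 0.9 threshold, and the `confidence` score are all hypothetical; how a confidence value is obtained depends on the model and task, and is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewGate:
    """Routes model outputs to auto-approval or a human review queue."""

    # Hypothetical cutoff; start high and lower it only as trust builds.
    auto_approve_threshold: float = 0.9
    pending: list = field(default_factory=list)

    def submit(self, output: str, confidence: float) -> str:
        """Return the routing decision for a single model output."""
        if confidence >= self.auto_approve_threshold:
            return "auto_approved"
        # Anything below the cutoff waits for a person to look at it.
        self.pending.append(output)
        return "needs_review"


gate = ReviewGate()
print(gate.submit("Draft reply to a routine billing question", confidence=0.95))
print(gate.submit("Cancel customer account", confidence=0.40))
print(len(gate.pending))
```

Starting with a high threshold keeps most decisions in front of a person; the threshold can be relaxed once the review queue shows the model is reliable for a given task.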

For Small Teams (2-10 people)

Shared workflows, overlapping responsibilities, and the first governance challenges.

Key concepts:

Shadow AI — Govern what you can’t see
Adoption Stall — Drive usage or watch it die
Silent Failure — Catch problems before they compound

Layer Hubs

Infrastructure Layer — Self-hosting vs cloud, GPU sizing
Automation Layer — Agents, workflows, HITL
Strategy & Planning — Budgets, scope, TCO

Medium Team Patterns (10-50 people)

Multiple users hitting shared AI infrastructure, the first real governance requirements, and a need for per-team cost visibility.

LiteLLM — Route traffic across models; enforce per-team spend limits with virtual keys
Langfuse — Trace agent decisions across multiple users and sessions; catch drift before it compounds
Open WebUI — Give teams a ChatGPT-like interface over self-hosted models without individual API access
Vendor Lock-In — Lock-in costs become real at medium-team size; evaluate exit paths before committing
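
The per-team spend limit mentioned for LiteLLM can be sketched as a request to the proxy's key-generation endpoint. This is a minimal sketch, assuming a LiteLLM proxy at `http://localhost:4000` and the `/key/generate` fields `team_id`, `models`, `max_budget`, and `budget_duration`; verify the field names against the LiteLLM version you deploy. The URL, master key, team name, and model alias are placeholders.

```python
import json
import urllib.request


def build_team_key_request(proxy_url, master_key, team_id, models,
                           max_budget_usd, budget_duration="30d"):
    """Build (but do not send) a request for a budget-capped virtual key."""
    payload = {
        "team_id": team_id,                  # groups spend by team
        "models": models,                    # models this key may call
        "max_budget": max_budget_usd,        # spend cap in USD
        "budget_duration": budget_duration,  # how often the budget resets
    }
    return urllib.request.Request(
        url=f"{proxy_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your own proxy URL, key, and model alias.
req = build_team_key_request(
    "http://localhost:4000", "sk-master-placeholder",
    team_id="support-team", models=["local-llama"], max_budget_usd=50.0,
)
print(req.full_url)
print(json.loads(req.data))
# To create the key against a running proxy: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect or log before any key is issued.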

Scaling Considerations

Model Quantization — Squeeze more users from the same GPU by choosing the right quantization level
Infrastructure Layer — Self-hosting vs cloud, GPU sizing, and the breakeven math
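
The quantization and breakeven tradeoffs above come down to simple arithmetic. The sketch below is a back-of-the-envelope estimate only: it counts model weights alone (no KV cache, activations, or format overhead), and every dollar figure and model size is a hypothetical input, not a measurement.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for model weights alone, in GB.

    params (billions) * bits / 8 gives bytes in billions, i.e. GB.
    """
    return params_billions * bits_per_weight / 8


def breakeven_months(hardware_cost, monthly_api_cost, monthly_selfhost_cost):
    """Months until self-hosting pays back the up-front hardware spend.

    Returns None when self-hosting never breaks even.
    """
    monthly_savings = monthly_api_cost - monthly_selfhost_cost
    if monthly_savings <= 0:
        return None
    return hardware_cost / monthly_savings


# Hypothetical 7B-parameter model at 16-bit vs 4-bit weights.
print(weight_memory_gb(7, 16))  # 14.0
print(weight_memory_gb(7, 4))   # 3.5

# Hypothetical costs: $3000 GPU, $400/month API bill, $100/month power and upkeep.
print(breakeven_months(3000, 400, 100))  # 10.0
```

Memory freed by a lower-bit quantization is what becomes available for serving more concurrent requests on the same GPU, at some cost in output quality that has to be checked per model and task.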

Each pattern involves real tradeoffs between cost, control, and complexity. The right pattern for a 15-person team looks nothing like the right pattern for a solo implementer.

WyrdWerk Deployment Wiki
