Backend-prefix taxonomy so the Open WebUI picker is self-documenting and a
model name can't lie about where it routes:
local_* -> Anvil/Ollama (free) e.g. local_qwen2.5-72b
proxy_* -> Claude via Meridian/Max e.g. proxy_claude-sonnet-4-6
direct_* -> metered OpenAI/Gemini e.g. direct_gpt-4o, direct_gemini-2.0-flash
Drops the redundant -max suffix (proxy_ already implies Max). api_base is now
emitted only when a model defines it, so direct_* hit the provider default
endpoint instead of Meridian. direct_* are SCAFFOLDED (no live keys): litellm.env
writes a placeholder so the proxy boots; deploy.sh pulls OPENAI_API_KEY/
GEMINI_API_KEY from Infisical /meridian if present (non-fatal). They 401 until
real keys land.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
LiteLLM sits in front of Meridian for clients that can't talk Anthropic's
/v1/messages format (Pulse OpenAI provider, paperless-ai, etc.). Routes
OpenAI-shaped requests to localhost:3456 (Meridian) which forwards to the
Max sub.
- New roles/litellm/ — Python venv, pip install litellm[proxy], systemd
- vars/main.yml — model map (haiku/sonnet/opus) + LITELLM_MASTER_KEY env lookup
- site.yml — adds litellm role + sanity-check assert
- deploy.sh — pulls LITELLM_MASTER_KEY from Infisical (/meridian/) on the
controller and exports it for the playbook
- New Infisical secret /meridian/vault_litellm_master_key
Smoke: Pulse → LiteLLM /v1/chat/completions → Meridian /v1/messages → Max sub
returns "pong" through both the LiteLLM master key auth and the Claude Code
SDK OAuth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>