homelab-ansible-lxc-meridian/roles/litellm/templates/litellm-config.yaml.j2
Your Name 211d26cc63 litellm: re-index models with local_/proxy_/direct_ prefixes + scaffold OpenAI+Gemini
Backend-prefix taxonomy so the Open WebUI picker is self-documenting and a
model name can't lie about where it routes:
  local_*  -> Anvil/Ollama (free)        e.g. local_qwen2.5-72b
  proxy_*  -> Claude via Meridian/Max     e.g. proxy_claude-sonnet-4-6
  direct_* -> metered OpenAI/Gemini       e.g. direct_gpt-4o, direct_gemini-2.0-flash

Drops the redundant -max suffix (proxy_ already implies Max). api_base is now
emitted only when a model defines it, so direct_* models hit the provider's
default endpoint instead of Meridian. direct_* models are SCAFFOLDED (no live
keys): litellm.env writes a placeholder key so the proxy boots; deploy.sh pulls
OPENAI_API_KEY and GEMINI_API_KEY from Infisical /meridian if present
(non-fatal). Requests to direct_* models return 401 until real keys land.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 12:05:55 -04:00
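
For illustration, the prefix taxonomy in the commit message could map to litellm_models entries like the ones below. This is a sketch, not the repository's actual vars: only the keys (name, backend, api_base, api_key) come from the template; the backend strings, the Meridian URL, and the /v1 suffix on the Ollama URL are assumptions.

```yaml
# Hypothetical litellm_models vars; all values are illustrative.
litellm_models:
  # local_*: Anvil's Ollama through its OpenAI-compatible API. Explicit api_base.
  - name: local_qwen2.5-72b
    backend: openai/qwen2.5:72b              # assumed backend string
    api_base: http://192.168.1.150:11434/v1  # assumed /v1 suffix
  # proxy_*: Claude through Meridian on the same host. Explicit api_base;
  # api_key is omitted, so the template's placeholder default applies.
  - name: proxy_claude-sonnet-4-6
    backend: anthropic/claude-sonnet-4-6     # assumed backend string
    api_base: http://127.0.0.1:3456          # assumed URL for "same host, :3456"
  # direct_*: metered providers. No api_base, so LiteLLM uses the provider default.
  - name: direct_gpt-4o
    backend: openai/gpt-4o
    api_key: os.environ/OPENAI_API_KEY
  - name: direct_gemini-2.0-flash
    backend: gemini/gemini-2.0-flash
    api_key: os.environ/GEMINI_API_KEY
```

Note that a direct_* entry without api_key would fall through to the template's Meridian placeholder string, so each direct_* entry presumably needs an explicit os.environ/<PROVIDER>_API_KEY reference.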

31 lines
1.1 KiB
Django/Jinja

# {{ ansible_managed }}
#
# LiteLLM proxy config. Routes OpenAI-shaped requests to backends by the
# model-name prefix set in vars (litellm_models):
#   - proxy_*  → Meridian's /v1/messages (same host, :3456), which ignores the
#                upstream API key (placeholder); the Max-OAuth sub pays.
#                Explicit api_base.
#   - local_*  → Anvil's Ollama (OpenAI-compatible, http://192.168.1.150:11434).
#                Explicit api_base.
#   - direct_* → a public provider (OpenAI/Gemini). NO api_base → LiteLLM uses
#                the provider default endpoint; api_key reads
#                os.environ/<PROVIDER>_API_KEY.
# api_base is emitted only when a model defines it; omit it to reach a provider
# default.
model_list:
{% for m in litellm_models %}
  - model_name: {{ m.name }}
    litellm_params:
      model: {{ m.backend }}
{% if m.api_base is defined %}
      api_base: {{ m.api_base }}
{% endif %}
      api_key: {{ m.api_key | default('placeholder-meridian-ignores-this') }}
{% endfor %}

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY

litellm_settings:
  drop_params: true  # tolerate clients sending unsupported params
  set_verbose: false
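
To make the conditional api_base concrete, here is roughly what the loop would render for two hypothetical entries. It assumes Ansible's default trim_blocks behaviour (so the {% ... %} lines leave no blank lines behind) and the nesting a valid LiteLLM config needs. The input entries in the leading comment are assumptions, not the repository's vars.

```yaml
# Assumed input (hypothetical):
#   - { name: proxy_claude-sonnet-4-6, backend: anthropic/claude-sonnet-4-6,
#       api_base: "http://127.0.0.1:3456" }
#   - { name: direct_gpt-4o, backend: openai/gpt-4o,
#       api_key: os.environ/OPENAI_API_KEY }
model_list:
  # api_base defined, api_key omitted: placeholder default is used
  - model_name: proxy_claude-sonnet-4-6
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_base: http://127.0.0.1:3456
      api_key: placeholder-meridian-ignores-this
  # api_base not defined: the line is skipped, so the provider default applies
  - model_name: direct_gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```

The second entry carries no api_base line at all, which is what lets a direct_* model reach the provider's own endpoint rather than Meridian.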