litellm: fix direct_* model IDs — gemini 2.5, drop o3-mini
Verified the direct_* providers end-to-end after billing was enabled.

- OpenAI direct_gpt-4o / direct_gpt-4o-mini: working.
- Gemini: gemini-2.0-flash 404s (LiteLLM 1.55.10 rewrites it to a retired
  experimental name) and gemini-1.5-pro is retired -> switch to the current
  GA gemini-2.5-flash / gemini-2.5-pro (both verified).
- Drop direct_o3-mini: o-series needs max_completion_tokens, which 1.55.10
  won't translate from the max_tokens clients (Open WebUI) send -> 400.
  Re-add after a LiteLLM bump.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
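The o3-mini failure comes down to a single request field. A minimal sketch of the rename a client or shim would have to perform until the proxy is bumped, assuming an OpenAI-style chat-completions JSON body. `needs_completion_tokens` and `translate_token_limit` are hypothetical helper names, not LiteLLM functions, and the model-ID pattern is an assumption about how o-series IDs are spelled:

```python
import re

# Assumption: o-series IDs look like "o1" or "o3-mini", optionally prefixed
# with the "openai/" provider prefix used in the backend field.
O_SERIES = re.compile(r"^(?:openai/)?o\d+(?:-|$)")


def needs_completion_tokens(model: str) -> bool:
    """True when the model ID matches the assumed o-series pattern."""
    return bool(O_SERIES.match(model))


def translate_token_limit(body: dict) -> dict:
    """Return a copy of the request body with max_tokens renamed to
    max_completion_tokens for o-series models; other models are untouched."""
    out = dict(body)  # shallow copy so the caller's dict is not mutated
    if needs_completion_tokens(out.get("model", "")) and "max_tokens" in out:
        limit = out.pop("max_tokens")
        # Keep an explicit max_completion_tokens if the client already sent one.
        out.setdefault("max_completion_tokens", limit)
    return out
```

Requests for gpt-4o and the Gemini models pass through unchanged, so a shim like this would only affect a re-added direct_o3-mini entry.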
+13 -10
@@ -108,25 +108,28 @@ litellm_models:
     backend: anthropic/claude-opus-4-7
     api_base: "http://127.0.0.1:{{ meridian_port }}"
   # ---- direct_* → public provider APIs, METERED, keys from Infisical /meridian ----
-  # SCAFFOLDED 2026-06-05 with NO real keys: these 401 on call until
-  # OPENAI_API_KEY / GEMINI_API_KEY land in Infisical /meridian and deploy.sh
-  # pulls them (litellm.env carries a placeholder so the proxy still boots).
+  # LIVE 2026-06-05 (OpenAI + Gemini keys verified end-to-end). Keys come from
+  # Infisical /meridian via deploy.sh; litellm.env carries a placeholder so the
+  # proxy still boots if a key is absent (that provider's models then 401).
   # No api_base → LiteLLM uses each provider's default endpoint; routing is by
   # the backend's provider prefix (openai/, gemini/).
+  #
+  # Gemini must be CURRENT model IDs: LiteLLM 1.55.10 rewrites gemini-2.0-flash
+  # to a retired experimental name (404) and gemini-1.5-pro is itself retired.
+  # 2.5-flash / 2.5-pro pass through clean. o-series (o3-mini) is intentionally
+  # absent: it needs max_completion_tokens, which 1.55.10 won't translate from
+  # the max_tokens that clients (Open WebUI) send → 400. Re-add after a bump.
   - name: direct_gpt-4o
     backend: openai/gpt-4o
     api_key: os.environ/OPENAI_API_KEY
   - name: direct_gpt-4o-mini
     backend: openai/gpt-4o-mini
     api_key: os.environ/OPENAI_API_KEY
-  - name: direct_o3-mini
-    backend: openai/o3-mini
-    api_key: os.environ/OPENAI_API_KEY
-  - name: direct_gemini-2.0-flash
-    backend: gemini/gemini-2.0-flash
+  - name: direct_gemini-2.5-flash
+    backend: gemini/gemini-2.5-flash
     api_key: os.environ/GEMINI_API_KEY
-  - name: direct_gemini-1.5-pro
-    backend: gemini/gemini-1.5-pro
+  - name: direct_gemini-2.5-pro
+    backend: gemini/gemini-2.5-pro
     api_key: os.environ/GEMINI_API_KEY
 # Master key is required by LiteLLM. Pulled at deploy time from Infisical
 # /meridian/vault_litellm_master_key and passed via -e on the playbook
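A small pre-deploy check could keep the retired and unsupported IDs from creeping back into `litellm_models`. This is a minimal sketch, assuming the list has already been loaded into Python dicts with the `name` and `backend` keys used in this file. `BLOCKED_ON_1_55_10`, `find_blocked`, and `current_models` are hypothetical names, and the table only restates what this commit reports for LiteLLM 1.55.10:

```python
# Backends this commit reports as unusable on LiteLLM 1.55.10, with the reason
# given in the commit message. Assumption: revisit this table after any bump.
BLOCKED_ON_1_55_10 = {
    "gemini/gemini-2.0-flash": "404: rewritten to a retired experimental name",
    "gemini/gemini-1.5-pro": "model retired upstream",
    "openai/o3-mini": "400: max_tokens is not translated to max_completion_tokens",
}


def find_blocked(models):
    """Return (name, reason) pairs for entries whose backend is in the table."""
    return [
        (m["name"], BLOCKED_ON_1_55_10[m["backend"]])
        for m in models
        if m.get("backend") in BLOCKED_ON_1_55_10
    ]


# The direct_* entries as they stand after this change.
current_models = [
    {"name": "direct_gpt-4o", "backend": "openai/gpt-4o"},
    {"name": "direct_gpt-4o-mini", "backend": "openai/gpt-4o-mini"},
    {"name": "direct_gemini-2.5-flash", "backend": "gemini/gemini-2.5-flash"},
    {"name": "direct_gemini-2.5-pro", "backend": "gemini/gemini-2.5-pro"},
]

for name, reason in find_blocked(current_models):
    print(f"{name}: {reason}")
```

With the entries above the loop prints nothing; re-adding any of the three removed backends without bumping LiteLLM would be flagged by name.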