Your Name a39323db70 litellm: fix direct_* model IDs — gemini 2.5, drop o3-mini
Verified the direct_* providers end-to-end after billing was enabled.
- OpenAI direct_gpt-4o / direct_gpt-4o-mini: working.
- Gemini: gemini-2.0-flash 404s (LiteLLM 1.55.10 rewrites it to a retired
  experimental name) and gemini-1.5-pro is retired -> switch to the current GA
  gemini-2.5-flash / gemini-2.5-pro (both verified).
- Drop direct_o3-mini: o-series needs max_completion_tokens, which 1.55.10 won't
  translate from the max_tokens clients (Open WebUI) send -> 400. Re-add after a
  LiteLLM bump.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 12:39:17 -04:00

homelab-ansible-lxc-meridian

Ansible config for the Meridian + LiteLLM LXC (CTID 457 on pve01, 192.168.1.164).

What it is

Two services on one LXC, sharing one Claude Max OAuth subscription:

  • Meridian (rynfar/meridian, port 3456) — local Anthropic API server backed by the Claude Code SDK. Translates /v1/messages calls into Claude Code SDK query() calls. No auth at this layer — LAN reachability is the gate.
  • LiteLLM (berriai/litellm, port 4000) — OpenAI-compatible proxy that fronts Meridian. Lets clients that only speak OpenAI (Pulse, paperless-ai, etc.) ride the same Max sub. Master-key auth required.
  Anthropic-format client  ─────────────────────►  :3456  Meridian  ─►  Claude Max
                                                                       (OAuth)
  OpenAI-format client     ─►  :4000  LiteLLM  ─►  127.0.0.1:3456 ─────►   ↑
                              (master key)

Wired today: Pulse (Settings → AI → OpenAI provider). Planned: paperless-ai and a HAOS conversation agent (via a custom_component fork that adds CONF_BASE_URL).

Architecture

  • Debian 12 LXC, no Docker (everything native)
  • Meridian: Node 22 from NodeSource apt + npm i -g @rynfar/meridian. systemd unit meridian.service as user meridian, bound to 0.0.0.0:3456, HOME=/opt/meridian.
  • LiteLLM: Python venv at /opt/litellm/venv + pip install 'litellm[proxy]'. systemd unit litellm.service as user litellm, bound to 0.0.0.0:4000. Requires=meridian.service so it can't outlive the backend.
  • OAuth credentials at /opt/meridian/.claude/ (mode 0700, owned by meridian).
  • LITELLM_MASTER_KEY at /opt/litellm/litellm.env (mode 0600, owned by litellm). Source of truth in Infisical /meridian/vault_litellm_master_key. Pulled by deploy.sh on the controller and exported for the playbook to consume.
  • No Caddy, no Cloudflare. Both ports exposed via the same UDM alias meridian.lan.balders.ca → .164.
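
The ownership and mode requirements above can be spot-checked after a deploy. A minimal sketch, assuming GNU stat (as on Debian 12); check_mode is a hypothetical helper, not something the playbook ships:

```shell
# Hypothetical audit helper; not part of the roles in this repo.
# Prints "ok <path>" when the octal mode matches, otherwise reports the mismatch.
check_mode() {
  local path=$1 want=$2 have
  # GNU stat: %a prints the octal permission bits, e.g. 600.
  have=$(stat -c '%a' "$path") || return 2
  if [ "$have" = "$want" ]; then
    echo "ok $path"
  else
    echo "BAD $path: mode $have, expected $want"
    return 1
  fi
}

# On the LXC, as root (the service homes are not world-readable):
#   check_mode /opt/meridian/.claude 700
#   check_mode /opt/litellm/litellm.env 600
```

The helper only compares permission bits; ownership would need a separate check.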

Bootstrap

  1. Provision the LXC via homelab-terraform/lxc (terraform apply).
  2. Run the LXC bootstrap one-liner from feedback_lxc_bootstrap_user:
    ssh root@192.168.1.164 'apt-get update && apt-get install -y sudo && useradd -m -s /bin/bash cbalders && echo "cbalders ALL=(ALL) NOPASSWD:ALL" >/etc/sudoers.d/90-cbalders && chmod 440 /etc/sudoers.d/90-cbalders'
    
    (Plus authorized_keys for cbalders.)
  3. Local first deploy (Semaphore can't reach a fresh host):
    ./deploy.sh
    
    Expect: Node 22 installed, @rynfar/meridian installed, systemd unit deployed and enabled but not started (no creds yet — claude_creds.stat.exists gates the start task).
  4. OAuth bootstrap — run claude auth login --claudeai directly on the LXC via the bundled binary. Do not scp ~/.claude/ from your Mac — macOS keeps the refresh token in the Keychain and the snapshot 401s as soon as the short-lived access token expires (incident write-up: 2026-05-17 → 2026-05-19, see project_meridian).
    # Stop the service so it's not racing the auth writer.
    ssh cbalders@192.168.1.164 sudo systemctl stop meridian
    
    # Paste-code flow as the meridian user (needs -t for TTY).
    ssh -t cbalders@192.168.1.164 \
      'sudo -u meridian -H /usr/lib/node_modules/@rynfar/meridian/node_modules/@anthropic-ai/claude-code/bin/claude.exe auth login --claudeai'
    # → prints https://claude.com/cai/oauth/authorize?... — paste into a Mac
    #   browser, log in with the Max account, paste the code back.
    # → ends with: Login successful.
    
    ssh cbalders@192.168.1.164 sudo systemctl start meridian
    
    # Verify (expect loggedIn: true, subscriptionType: max):
    ssh cbalders@192.168.1.164 \
      'sudo -u meridian -H /usr/lib/node_modules/@rynfar/meridian/node_modules/@anthropic-ai/claude-code/bin/claude.exe auth status'
    
  5. Smoke from a LAN host (Anthropic format, direct):
    curl http://192.168.1.164:3456/v1/messages \
      -H 'Content-Type: application/json' \
      -H 'anthropic-version: 2023-06-01' \
      -d '{"model":"claude-haiku-4-5","max_tokens":40,"messages":[{"role":"user","content":"reply with the single word: pong"}]}'
    
  6. Smoke via LiteLLM (OpenAI format, master-key auth):
    KEY=$(infisical secrets get vault_litellm_master_key --env prod --path /meridian --plain)
    curl http://192.168.1.164:4000/v1/chat/completions \
      -H "Authorization: Bearer $KEY" -H 'Content-Type: application/json' \
      -d '{"model":"claude-haiku-4-5","max_tokens":40,"messages":[{"role":"user","content":"reply with the single word: pong"}]}'
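
Steps 5 and 6 send the same request body to two different endpoints, so they can be folded into one helper that passes or fails on the HTTP status. A rough sketch; the function names and the HOST variable are made up for illustration, while the address, ports, headers, and body come from the steps above:

```shell
# Hypothetical helper names; address, ports, and body are taken from steps 5 and 6.
HOST=${HOST:-192.168.1.164}

# Build the shared request body for a given model alias.
smoke_body() {
  printf '{"model":"%s","max_tokens":40,"messages":[{"role":"user","content":"reply with the single word: pong"}]}' "$1"
}

# Anthropic format, straight to Meridian. Succeeds only on HTTP 200.
smoke_anthropic() {
  local code
  code=$(curl -sS -o /dev/null -w '%{http_code}' "http://$HOST:3456/v1/messages" \
    -H 'Content-Type: application/json' -H 'anthropic-version: 2023-06-01' \
    -d "$(smoke_body "${1:-claude-haiku-4-5}")")
  [ "$code" = 200 ]
}

# OpenAI format via LiteLLM. $1 is the master key, $2 an optional model alias.
smoke_openai() {
  local code
  code=$(curl -sS -o /dev/null -w '%{http_code}' "http://$HOST:4000/v1/chat/completions" \
    -H "Authorization: Bearer $1" -H 'Content-Type: application/json' \
    -d "$(smoke_body "${2:-claude-haiku-4-5}")")
  [ "$code" = 200 ]
}
```

Usage from a LAN host: smoke_anthropic && smoke_openai "$KEY" && echo "both paths up". If the first passes and the second fails, the fault is on the LiteLLM side (service down or wrong key), not in Meridian or the OAuth session.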
    

Wiring a client

Client type                                      Endpoint                                Auth
Anthropic-native (HAOS, Cline, Aider, OpenCode)  http://meridian.lan.balders.ca:3456     any x-api-key (ignored)
OpenAI-native (Pulse, paperless-ai, Open WebUI)  http://meridian.lan.balders.ca:4000/v1  Authorization: Bearer $LITELLM_MASTER_KEY

Available model aliases (same on both endpoints, all backed by Claude Max): claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-7.
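
A client-wiring script can catch a mistyped alias before any request is sent. A small sketch, assuming the three Claude aliases listed above are the full set; MODEL_ALIASES and is_model_alias are illustrative names:

```shell
# The three Claude Max aliases listed above; extend if the config grows.
MODEL_ALIASES="claude-haiku-4-5 claude-sonnet-4-6 claude-opus-4-7"

# Return 0 if $1 is one of the known aliases, 1 otherwise.
is_model_alias() {
  local m
  for m in $MODEL_ALIASES; do
    [ "$m" = "$1" ] && return 0
  done
  return 1
}

# Example guard at the top of a wiring script:
#   is_model_alias "$MODEL" || { echo "unknown model: $MODEL" >&2; exit 1; }
```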

Pulse (proven 2026-05-19)

PULSE_ADMIN_TOKEN=$(infisical secrets get vault_pulse_admin_token --env prod --path /pulse --plain)
LITELLM_KEY=$(infisical secrets get vault_litellm_master_key --env prod --path /meridian --plain)
curl -X POST https://pulse.balders.ca/api/settings/ai/update \
  -H "X-API-Token: $PULSE_ADMIN_TOKEN" -H 'Content-Type: application/json' \
  -d "{\"provider\":\"openai\",\"openai_api_key\":\"$LITELLM_KEY\",\"openai_base_url\":\"http://meridian.lan.balders.ca:4000/v1\",\"model\":\"claude-haiku-4-5\",\"enabled\":true}"
# verify
curl -X POST https://pulse.balders.ca/api/ai/test -H "X-API-Token: $PULSE_ADMIN_TOKEN" \
  -d '{"provider":"openai","model":"claude-haiku-4-5"}'   # → {"success":true,...}

Operations

  • Subsequent deploys: via Semaphore template "Meridian Deploy" (scheduled Sun 02:55 EDT). LITELLM_MASTER_KEY is auto-reconciled into Semaphore environment 4 by homelab-ansible-lxc-semaphore/scripts/sync-semaphore-state.py (merge-only ENVIRONMENT_KEYS step).
  • Token refresh: handled automatically by the Claude Code SDK. Manual fallback: sudo -u meridian /usr/bin/meridian refresh-token.
  • Restart after creds change: sudo systemctl restart meridian (LiteLLM follows automatically via Requires=).
  • Rotate master key: update /meridian/vault_litellm_master_key in Infisical, redeploy, update consumers (Pulse, paperless-ai, etc.).
  • Logs: journalctl -u meridian -f / journalctl -u litellm -f.
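
For the rotation step, this README does not say how a new master key is generated. One possible approach, offered as an assumption rather than the established procedure, is a random hex string with the sk- prefix that LiteLLM expects on master keys:

```shell
# Hypothetical helper: print a fresh key of the form sk-<64 hex chars>.
new_master_key() {
  # 32 bytes from the kernel CSPRNG, hex-encoded with separators stripped.
  printf 'sk-%s\n' "$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
}
```

Store the output at /meridian/vault_litellm_master_key in Infisical, then redeploy and update consumers as described above.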

Files

roles/meridian/        Node 22 + npm i @rynfar/meridian + systemd unit
roles/litellm/         Python venv + pip install litellm[proxy] + systemd unit
roles/node_exporter/   Prometheus exporter for fleet metrics
vars/main.yml          base packages, ssh keys, meridian + litellm config
site.yml               playbook entrypoint (sanity-check assert on LITELLM_MASTER_KEY)
inventory.ini          single host (192.168.1.164)
deploy.sh              wrapper for local first-run; pulls LITELLM_MASTER_KEY from Infisical

Memory pointers

  • project_meridian — overall design, OAuth model, consumers
  • feedback_local_dns_only — DNS convention (no public CF for services)
  • feedback_lxc_bootstrap_user — root bootstrap pattern for fresh LXCs
  • feedback_fresh_host_bootstrap — Semaphore can't reach fresh hosts

Logging

Only the systemd journal is shipped (there is no Docker on this LXC). Grafana Alloy forwards it to Loki at observe.lan.balders.ca:3100. Alloy is installed bare-metal from the Grafana apt repo rather than as a container, because Meridian and LiteLLM are systemd services. In Grafana Explore, query {host="meridian", unit="meridian.service"} or {host="meridian", unit="litellm.service"}.
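
The same selectors work from a terminal against Loki's HTTP API, which is handy when Grafana is not open. A sketch assuming Loki is reachable from the LAN without auth; LOKI_URL, loki_selector, and loki_tail are illustrative names:

```shell
# Assumes Loki accepts unauthenticated queries from the LAN.
LOKI_URL=${LOKI_URL:-http://observe.lan.balders.ca:3100}

# Build the LogQL stream selector for one systemd unit on this host.
loki_selector() {
  printf '{host="meridian", unit="%s"}' "$1"
}

# Fetch recent log lines for a unit as JSON.
# With no start/end given, query_range covers the last hour.
loki_tail() {
  curl -sS -G "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$(loki_selector "$1")" \
    --data-urlencode "limit=${2:-20}"
}
```

Usage: loki_tail litellm.service 50 returns up to 50 entries for the LiteLLM unit.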
