homelab-ansible-lxc-meridian

Author	SHA1	Message	Date
Your Name and Claude Opus 4.8	02c2f4ee2d	litellm: pull /meridian secrets in-playbook from Infisical (runner-agnostic) Replaces the deploy.sh env-var hand-off (which only worked locally and would have made Semaphore write placeholder keys, regressing direct_*) with the standard in-playbook Infisical pull used by dawarich/mcp/cloudflared: - site.yml pre_tasks: login via the shared 828d2cc8 machine identity, read /meridian as_dict, set_fact litellm_master_key + the openai/gemini keys. - vars/vault.yml: shared ansible-vault client secret (copied from sibling repo). - requirements.yml: + infisical.vault. - deploy.sh: drop the infisical-CLI pulls; add --ask-vault-pass. Same secret path for Semaphore and local — no per-template env wiring. Deploy prereqs: attach the ansible-vault password to Semaphore template 27, and ensure the 828d2cc8 identity can read /meridian (env prod). Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>	2026-06-05 13:00:54 -04:00
Your Name and Claude Opus 4.8	a39323db70	litellm: fix direct_* model IDs — gemini 2.5, drop o3-mini Verified the direct_* providers end-to-end after billing was enabled. - OpenAI direct_gpt-4o / direct_gpt-4o-mini: working. - Gemini: gemini-2.0-flash 404s (LiteLLM 1.55.10 rewrites it to a retired experimental name) and gemini-1.5-pro is retired -> switch to the current GA gemini-2.5-flash / gemini-2.5-pro (both verified). - Drop direct_o3-mini: o-series needs max_completion_tokens, which 1.55.10 won't translate from the max_tokens clients (Open WebUI) send -> 400. Re-add after a LiteLLM bump. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>	2026-06-05 12:39:17 -04:00
Your Name and Claude Opus 4.8	211d26cc63	litellm: re-index models with local_/proxy_/direct_ prefixes + scaffold OpenAI+Gemini Backend-prefix taxonomy so the Open WebUI picker is self-documenting and a model name can't lie about where it routes: local_* -> Anvil/Ollama (free), e.g. local_qwen2.5-72b; proxy_* -> Claude via Meridian/Max, e.g. proxy_claude-sonnet-4-6; direct_* -> metered OpenAI/Gemini, e.g. direct_gpt-4o, direct_gemini-2.0-flash. Drops the redundant -max suffix (proxy_ already implies Max). api_base is now emitted only when a model defines it, so direct_* hit the provider default endpoint instead of Meridian. direct_* are SCAFFOLDED (no live keys): litellm.env writes a placeholder so the proxy boots; deploy.sh pulls OPENAI_API_KEY/GEMINI_API_KEY from Infisical /meridian if present (non-fatal). They 401 until real keys land. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>	2026-06-05 12:05:55 -04:00
Your Name and Claude Opus 4.7	b64a95a71b	litellm: drop claude-/gpt- shadow aliases Honest model names only — local picks up real Ollama names (qwen2.5-72b, llama-3.3-70b, llama-3.1-8b, nomic-embed-text), Claude via *-max only. The shadows were briefly useful (paperless-ai wizard probe quirk) and then briefly used to make the ALL-LOCAL cutover transparent to clients, but having "claude-sonnet-4-6" silently route to llama3.3:70b in the Open WebUI picker was a constant foot-gun. Pulse re-pointed to a clean alias in its UI prior to this push; paperless-ai was already on qwen2.5-72b. Trade-off captured in [[litellm-openai-alias-shadowing]]. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-29 20:07:49 -04:00
Your Name and Claude Opus 4.7	e866d0c89f	litellm: add qwen2.5-72b alias (Anvil) as the best-quality local model Replaces the short-lived mistral-large alias. Backed by ollama_chat/qwen2.5:72b on Anvil. Consumers (paperless-ai, RAG chat, HA, morning-report) target this. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-28 22:26:34 -04:00
Your Name and Claude Opus 4.7	c29e24b51b	litellm: route all homelab LLM load to Anvil/Ollama by default Per-model api_base/api_key overrides in the template (default stays Meridian's local port). All standard aliases (claude-, gpt-) now point at Anvil's Ollama (mini/haiku-class -> llama3.1:8b, rest -> llama3.3:70b). Claude/Max reachable only via new *-max escape-hatch aliases. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-28 11:16:46 -04:00
Your Name	bee546cea8	alloy: cutover prometheus.exporter.unix to standard job names Drops the _canary suffix on alloy_prom_job. Prometheus retired its static node_* scrape jobs in the same release; Alloy's remote_write fills the gap with identical job/instance/group/hostname labels.	2026-05-21 20:52:50 -04:00
Your Name	40af073d9c	alloy: add prometheus.exporter.unix canary (Track A fleet rollout) Embeds node_exporter inside Alloy alongside Loki shipping; pushes metrics via remote_write to observe Prom with job=node_lxc_canary to run side-by-side with the existing node_exporter scrape until cutover. See homelab-docs/docs/audit/alloy-consolidation-2026-05-21.md.	2026-05-21 19:21:22 -04:00
Your Name	03d1d4630f	alloy: bare-metal systemd shipper for journald → Loki Meridian + LiteLLM both run as systemd services on this LXC (no docker) so the Docker-container Alloy pattern from other repos doesn't apply. Apt-install grafana/alloy via apt.grafana.com, journald-only scrape, ships to Loki on observe.lan.balders.ca. Side benefit: Meridian.service + LiteLLM.service logs (including the gpt-* alias shadowing requests from paperless-ai) now searchable in Loki, not just journalctl on the LXC.	2026-05-19 22:49:44 -04:00
Your Name and Claude Opus 4.7	49c6e10574	litellm: shadow gpt-4o-mini / gpt-4o / gpt-4-turbo aliases onto Claude backends paperless-ai's setup wizard validates the OpenAI provider by hardcoding model=gpt-4o-mini in the probe, regardless of the OPENAI_MODEL env. Without the alias LiteLLM 400s ("Invalid model name") and the wizard rejects the key. Shadow common OpenAI names onto our Claude backends so any client that probes gpt-* gets a healthy response (and routes to the Max sub). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-19 13:39:51 -04:00
Your Name and Claude Opus 4.7	a6b26c500f	litellm: add OpenAI→Meridian shim role (venv + systemd, port 4000) LiteLLM sits in front of Meridian for clients that can't talk Anthropic's /v1/messages format (Pulse OpenAI provider, paperless-ai, etc.). Routes OpenAI-shaped requests to localhost:3456 (Meridian) which forwards to the Max sub. - New roles/litellm/ — Python venv, pip install litellm[proxy], systemd - vars/main.yml — model map (haiku/sonnet/opus) + LITELLM_MASTER_KEY env lookup - site.yml — adds litellm role + sanity-check assert - deploy.sh — pulls LITELLM_MASTER_KEY from Infisical (/meridian/) on the controller and exports it for the playbook - New Infisical secret /meridian/vault_litellm_master_key Smoke: Pulse → LiteLLM /v1/chat/completions → Meridian /v1/messages → Max sub returns "pong" through both the LiteLLM master key auth and the Claude Code SDK OAuth. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-19 11:23:52 -04:00
Your Name and Claude Opus 4.7	5e16fee73b	initial scaffold: Meridian LXC (Node 22 + npm @rynfar/meridian + systemd) Deploys @rynfar/meridian on a Debian 12 LXC, bound to 0.0.0.0:3456. OAuth credentials transferred manually after first deploy (claude login on Mac, scp ~/.claude to /opt/meridian/.claude). systemd unit is enabled but gated on credentials.json existence so the first deploy doesn't crash-loop. LXC has no auth layer — security model is LAN-only reachability. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>	2026-05-17 21:20:41 -04:00
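
Commit 211d26cc63 describes two rules: a model's prefix (local_, proxy_, direct_) names its backend, and api_base is emitted only when a model defines one. The Python below is a minimal sketch of those two rules, not the repo's actual template or vars. The function names, the Anvil URL, and the upstream model strings for proxy_ and direct_ entries are assumptions made for illustration; only the prefixes, the Meridian port (localhost:3456), and ollama_chat/qwen2.5:72b come from the commit messages.

```python
from typing import Optional

# Prefix -> backend, following the taxonomy described in commit 211d26cc63.
BACKENDS = {
    "local_": "Anvil/Ollama",
    "proxy_": "Claude via Meridian/Max",
    "direct_": "metered OpenAI/Gemini",
}


def backend_for(model_name: str) -> str:
    """Return the backend a model name claims to route to, based on its prefix."""
    for prefix, backend in BACKENDS.items():
        if model_name.startswith(prefix):
            return backend
    raise ValueError(f"model name {model_name!r} has no known backend prefix")


def render_entry(model_name: str, upstream: str, api_base: Optional[str] = None) -> dict:
    """Build one model-list entry; api_base is included only when defined."""
    backend_for(model_name)  # reject names without a known prefix
    params = {"model": upstream}
    if api_base is not None:
        params["api_base"] = api_base
    return {"model_name": model_name, "litellm_params": params}


# Hypothetical entries: the Anvil URL and the anthropic/openai model strings
# are placeholders, not values taken from the repo.
entries = [
    render_entry("local_qwen2.5-72b", "ollama_chat/qwen2.5:72b", "http://anvil.example:11434"),
    render_entry("proxy_claude-sonnet-4-6", "anthropic/claude-sonnet-4-6", "http://localhost:3456"),
    render_entry("direct_gpt-4o", "openai/gpt-4o"),  # no api_base: provider default endpoint
]
```

With these rules, an unprefixed name such as the old claude-sonnet-4-6 shadow alias is rejected at render time rather than silently routed, and a direct_* entry carries no api_base, so it cannot fall back to Meridian's port.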