Skip to content

Hermes Docker + NewAPI Gateway Implementation Plan

Goal: Deploy Hermes agents as Docker containers using the official nousresearch/hermes-agent image, routed through the self-hosted NewAPI LLM gateway at llm.fsagent.cc.

Status: Components 1 and 3 done. Component 2 (SearXNG) pending.


Architecture

User (WeChat Pay)

provisioner / spawn-profile.sh

docker run nousresearch/hermes-agent  ← per-user container
       ↓                    ↓
  /opt/data               hermes gateway :8642
  (profile dir)                ↓
  .env (LLM_API_KEY)       MCP + web search
  config.yaml
  skills/

  llm.fsagent.cc/v1   ← NewAPI gateway (Vultr Japan)
  http://host.docker.internal:9100/mcp   ← twilight-mcp-tushare
  http://127.0.0.1:8088/search [TODO]   ← SearXNG (not yet deployed)

Component 1: LLM — NewAPI Gateway ✅ DONE

Commit: fbd83f2 feat(profile): switch LLM provider to NewAPI gateway

What changed

  • config.yaml.template: base_url: https://llm.fsagent.cc/v1, api_key: ${LLM_API_KEY}, model: openrouter/free
  • Removed fallback_providers block — NewAPI handles channel pooling internally
  • spawn-profile.sh: input LLM_API_KEY replaces SILICONFLOW_KEY / OPENROUTER_KEY
  • secrets.schema.json + .env.example updated to match

Available models via Hermes_Test key (sk-J6ItPNKnNSxWnvjgWxOsG5J1XK2zuOnGJpK6R4nBDe4JIhMM)

All :free tier, routed through openrouter/free by default:

  • openrouter/free (default — gateway picks best available free model)
  • nvidia/nemotron-3-super-120b-a12b:free
  • nvidia/nemotron-3-nano-30b-a3b:free
  • nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free
  • arcee-ai/trinity-large-thinking:free
  • liquid/lfm-2.5-1.2b-thinking:free
  • nvidia/nemotron-nano-12b-v2-vl:free
  • nvidia/nemotron-nano-9b-v2:free
  • poolside/laguna-m.1:free
  • poolside/laguna-xs.2:free
  • inclusionai/ring-2.6-1t:free
  • baidu/cobuddy:free
  • liquid/lfm-2.5-1.2b-instruct:free

Per-user provisioning

When a user subscribes, provisioner must:

  1. Call NewAPI API to create a per-user channel key (or issue from a key pool)
  2. Pass LLM_API_KEY=<key> to spawn-profile.sh

NewAPI user creation API:

bash
# Create channel via NewAPI admin API (requires admin key)
POST https://llm.fsagent.cc/api/user/register   # or admin/token create

Component 2: Web Search — SearXNG ⬜ TODO

Plan (implement after SearXNG keys are shared)

  1. Add searxng service to deploy/compose.yml (warehouse stack)

    • Image: searxng/searxng:latest
    • Bind: 127.0.0.1:8088:8080
    • Mount: SearXNG config at /etc/searxng/settings.yml
    • Configure engines: Google Custom Search (API key), Bing, DuckDuckGo
  2. Add deploy/searxng/settings.yml — SearXNG config with enabled engines

  3. Update config.yaml.template:

    yaml
    web:
      backend: searxng
      url: ${SEARXNG_URL}
  4. Update spawn-profile.sh — write SEARXNG_URL based on tier:

    • prefab (host network): http://127.0.0.1:8088/search
    • container (isolated): http://host.docker.internal:8088/search
  5. Add SEARXNG_URL= to .env.example

Note on Exa/Tavily: Not standard SearXNG engines. Options:

  • (A) Add as MCP tools in mcp_servers block of config.yaml.template
  • (B) Skip for now — SearXNG + DuckDuckGo covers basic needs

Component 3: Hermes Docker Container ✅ DONE

Commits: b1a48f6 (config.yaml materialization), fbd83f2 (LLM key)

How it works

spawn-profile.sh already handles the full container lifecycle:

bash
# Prefab tier (host network, moderate limits)
spawn-profile.sh <name> prefab LLM_API_KEY=<key>

# Container tier (isolated network, tighter limits)
spawn-profile.sh <name> container LLM_API_KEY=<key>

Container details:

  • Image: nousresearch/hermes-agent:latest
  • Profile dir → /opt/data (HOME)
  • config.yaml materialized at spawn time from template
  • {{MCP_URL}} substituted: prefab=http://127.0.0.1:9100/mcp, container=http://host.docker.internal:9100/mcp
  • Health: curl http://127.0.0.1:8642/health every 30s
  • Runs: gateway run

What container tier needs from each profile

/opt/data/
├── config.yaml       ← materialized from template (spawn-profile.sh handles)
├── .env              ← written by spawn-profile.sh
└── skills/
    └── research/
        └── stock-research/   ← copied from repo at spawn time

Test a profile manually

bash
# On ECS (or locally with Docker)
cd ~/twilight/source
./scripts/admin/spawn-profile.sh test-user-1 container \
  LLM_API_KEY=sk-J6ItPNKnNSxWnvjgWxOsG5J1XK2zuOnGJpK6R4nBDe4JIhMM \
  DASHSCOPE_KEY=<your-key>

# Check logs
docker logs -f hermes-test-user-1

# Check health
docker inspect hermes-test-user-1 --format '{{.State.Health.Status}}'

Open Questions

  1. NewAPI per-user key issuance: How does provisioner create a key per user? Need NewAPI admin API docs or admin key.
  2. SearXNG keys: User has Google Custom Search, Exa, Tavily keys — share in a file to complete component 2.
  3. Exa/Tavily routing: SearXNG MCP engine vs direct MCP tools — decision pending.

团队内部文档