Hermes Profile + Container Hybrid Test Plan

Current State (2026-05-13)

Component	Location	State
FastAPI backend	Alibaba ECS `:8081`	Running, healthy (`/healthz` ok)
MCP Tushare	Alibaba ECS `:9100`	Running, 1.6 GB DuckDB warehouse (2 years of data backfilled)
Postgres 16	Alibaba ECS container	Running
React frontend	Alibaba ECS `:8082`	Running
NestJS billing	Local (not deployed)	Built, WeChat Pay wired, provisioning stub only
Profile template	ECS `~/twilight/source/profile/`	Exists, not instantiated

Problem

When a payment succeeds, the system needs to: (1) provision a Hermes profile, (2) seed its secrets, and (3) start it. Currently the provisioner is a stub: it creates DB rows but does nothing on the filesystem.

We want to test a hybrid approach: some profiles run as directory-based instances, some as Docker containers. Validate isolation, scaling, and upgrade path before building the real product.

Phase 0 — Prefab: Spawn 5 Test Profiles on ECS

Goal: Prove one profile can run end-to-end, then prove 5 can coexist with isolation.

0.1 Profile Spawner Script

Write scripts/admin/spawn-profile.sh (the location given under File Structure). Usage:

bash

spawn-profile.sh <profile-name> <tier>

The script:

Copies profile/template-stock-research-pro/ to ~/hermes-profiles/<profile-name>/
Generates unique per-profile secrets: SILICONFLOW_API_KEY, TWILIGHT_API_TOKEN
Plants .env from the .env.example template
Writes config.yaml from config.yaml.template with profile-specific settings
Creates a systemd user unit (prefab tier) or a docker-compose service (container tier)
Returns the profile path, status URL, and connection info
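The first three steps could be prototyped in Python before the shell script is written. This is only a sketch under assumptions: `spawn_profile_files` is a hypothetical helper, the token is a random placeholder rather than a real bearer token, and SILICONFLOW_API_KEY is left out because how that key is obtained is not specified here.

```python
import secrets
import shutil
import stat
from pathlib import Path


def spawn_profile_files(template_dir: Path, profiles_root: Path, name: str) -> Path:
    """Copy the template and plant a private .env for one profile (hypothetical helper)."""
    target = profiles_root / name
    if target.exists():
        raise FileExistsError(f"profile already exists: {target}")
    shutil.copytree(template_dir, target)  # also creates profiles_root if missing

    # Start from .env.example when the template ships one.
    example = target / ".env.example"
    base = example.read_text() if example.exists() else ""
    if base and not base.endswith("\n"):
        base += "\n"

    # Assumption: a random token stands in for the real TWILIGHT_API_TOKEN.
    token = secrets.token_urlsafe(32)
    env_path = target / ".env"
    env_path.write_text(f"{base}TWILIGHT_API_TOKEN={token}\n")
    env_path.chmod(stat.S_IRUSR | stat.S_IWUSR)  # 0600: owner read/write only
    return target
```

Refusing to overwrite an existing directory keeps a repeated spawn from silently replacing another profile's secrets.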

0.2 Two Tiers of Running Mode

Tier	Mechanism	Isolation	Use Case
`prefab` (Profile)	Directory under `~/hermes-profiles/<name>`, launched via systemd	Process-level, shared OS	Fast spawn, low overhead, internal testing
`container`	`docker run` with bind-mounted profile dir	Full container isolation	Production-ready, future per-user instances
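For the prefab tier, the systemd user unit might be rendered along these lines. The directives are standard systemd ones, but the sketch makes assumptions: `render_user_unit` is a hypothetical helper, and the command that actually launches a Hermes profile is not given in this plan, so it is passed in as `exec_start`.

```python
from pathlib import Path


def render_user_unit(name: str, profile_dir: Path, exec_start: str) -> str:
    """Render a systemd user unit for one prefab profile (hypothetical helper)."""
    return "\n".join([
        "[Unit]",
        f"Description=Hermes profile {name}",
        "",
        "[Service]",
        f"WorkingDirectory={profile_dir}",
        # Secrets are loaded from the profile's own .env, not from the unit file.
        f"EnvironmentFile={profile_dir}/.env",
        f"ExecStart={exec_start}",  # assumption: the real launch command is unknown
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=default.target",
        "",
    ])
```

The rendered text would go under `~/.config/systemd/user/` and be started with `systemctl --user enable --now <unit>`. User units stop when the user's last session ends unless lingering is enabled with `loginctl enable-linger`.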

0.3 Test Profiles (5 accounts)

#	Name	Tier	Purpose
1	`test-basic-1`	prefab	Baseline: can a fresh profile run stock research?
2	`test-basic-2`	prefab	Same tier, verify no cross-talk between profiles
3	`test-parallel-1`	prefab	Run concurrent queries, verify no data corruption
4	`test-container-1`	container	First containerized profile, verify isolation
5	`test-upgrade-1`	prefab→container	Start as prefab, migrate to container mid-run

0.4 Verification Checklist

For each profile:

[ ] Query: fetch_price.py 600519 — returns data with valid cite
[ ] Query: fetch_fundamentals.py 000858 --period 20251231 — returns cite envelope
[ ] Multi-turn conversation: 3+ turns, no context bleed
[ ] .env isolation: Profile A cannot read Profile B's secrets
[ ] Data source: all queries go through MCP (:9100) or Tushare, never cached from another profile
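Part of the .env isolation item can be checked automatically. The sketch below is a rough aid rather than a full isolation test: `isolation_problems` is a hypothetical helper that flags .env files readable beyond their owner and secret values reused across profiles, assuming simple KEY=VALUE files.

```python
import stat
from pathlib import Path


def parse_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    pairs = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            pairs[key.strip()] = value.strip()
    return pairs


def isolation_problems(profiles_root: Path, secret_keys: tuple[str, ...]) -> list[str]:
    """Report loose .env permissions and secrets shared between profiles."""
    problems = []
    seen: dict[tuple[str, str], str] = {}  # (key, value) -> first profile using it
    for env_path in sorted(profiles_root.glob("*/.env")):
        profile = env_path.parent.name
        mode = stat.S_IMODE(env_path.stat().st_mode)
        if mode & 0o077:  # any group/other permission bit set
            problems.append(f"{profile}: .env mode {mode:o} is readable beyond the owner")
        env = parse_env(env_path)
        for key in secret_keys:
            value = env.get(key)
            if not value:
                continue
            owner = seen.setdefault((key, value), profile)
            if owner != profile:
                problems.append(f"{profile}: {key} duplicates the value used by {owner}")
    return problems
```

File modes only protect against other Unix users. Prefab profiles that all run under the same account can still read each other's files, so that case needs separate accounts or the container tier.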

Phase 1 — Payment → Profile Provisioning Bridge

Goal: When a WeChat payment succeeds, automatically spawn a profile.

1.1 Provisioner Enhancement

Extend backend/src/modules/provisioning/provisioning.service.ts:

typescript

async provisionPaidOrder(orderId: string) {
  // ... existing DB work ...
  // userId, planCode, and capturedChannel are assumed to come from the order
  // loaded in the existing DB work above.

  // NEW: call the spawn-profile script on ECS
  //   Option A: SSH exec from NestJS (simple, works now)
  //   Option B: HTTP endpoint on ECS that the NestJS backend calls
  //   Option C: Docker API (if profiles run as containers)

  // For Prefab: SSH exec (Option A)
  await execSpawnProfile({
    host: process.env.ECS_HOST,
    user: 'twilight',
    // Timestamp suffix keeps names unique if the same user is provisioned twice
    profileName: `user-${userId}-${Date.now()}`,
    tier: planCode === 'dedicated_pro' ? 'container' : 'prefab',
    secrets: {
      SILICONFLOW_API_KEY: generateApiKey(),
      TWILIGHT_API_TOKEN: issueBearerToken(userId),
      WEIXIN_HOME_CHANNEL: capturedChannel,
    },
  });
}

1.2 SSH Key for Provisioning

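Generate a dedicated deploy key (NestJS → ECS)
Store the private key in the keychain as twilight-provisioner-ssh
On ECS, add the public key to ~twilight/.ssh/authorized_keys with a forced command (command="...") so the key can only run the spawn script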
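One way to implement the restricted command is a small gate script named in the `command="..."` option of the authorized_keys entry. sshd puts whatever the client asked to run into the `SSH_ORIGINAL_COMMAND` environment variable, which the gate can validate before calling the spawn script. Everything else in this sketch is an assumption: the `SPAWN_SCRIPT` path, the allowed name pattern, and the two-argument request format.

```python
import os
import re
import sys

# Assumption: where spawn-profile.sh ends up on ECS.
SPAWN_SCRIPT = "/home/twilight/twilight/source/scripts/admin/spawn-profile.sh"
# Assumption: lowercase letters, digits and hyphens, at most 63 characters.
NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")
TIERS = {"prefab", "container"}


def parse_spawn_request(original_command: str) -> tuple[str, str]:
    """Validate '<profile-name> <tier>' and return the pair, or raise ValueError."""
    parts = original_command.split()
    if len(parts) != 2:
        raise ValueError("expected: <profile-name> <tier>")
    name, tier = parts
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid profile name: {name!r}")
    if tier not in TIERS:
        raise ValueError(f"invalid tier: {tier!r}")
    return name, tier


def main() -> None:
    """Entry point for the forced command: validate, then exec the spawn script."""
    try:
        name, tier = parse_spawn_request(os.environ.get("SSH_ORIGINAL_COMMAND", ""))
    except ValueError as exc:
        sys.exit(f"rejected: {exc}")
    # Fixed script path and an argument list, so no shell ever parses client input.
    os.execv(SPAWN_SCRIPT, [SPAWN_SCRIPT, name, tier])
```

With this in place the deploy key cannot open a shell or run arbitrary commands. Adding the `restrict` option to the same authorized_keys entry also disables port forwarding and PTY allocation. Secrets are better sent over stdin than on the command line, which this sketch does not cover.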

Phase 2 — Data MCP Health + Profile Integration

Goal: Profiles use MCP Tushare as primary data source instead of direct Tushare API.

2.1 MCP Tushare Status Check

Current state: container running, 1.6GB warehouse, WAL active. Need to verify:

[ ] MCP endpoint actually responds with tools list (currently returning 404)
[ ] Scheduler backfill still running (check trade_cal refresh)
[ ] New daily data appearing in r_stock_daily table
[ ] Warehouse read routes (/warehouse/*) still functional
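A first-pass reachability probe for the checklist above might look like the following. It only separates "server down" from "route missing" (the 404 case noted above). It does not speak the MCP protocol, so confirming the tools list still needs a real MCP client. The function names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Optional


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status for a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # a 4xx/5xx reply still means the server answered
    except (urllib.error.URLError, OSError):
        return None


def describe(status: Optional[int]) -> str:
    """Turn a probe result into a short label for the checklist."""
    if status is None:
        return "unreachable"
    if status == 404:
        return "route missing"
    if 200 <= status < 300:
        return "ok"
    return f"unexpected status {status}"
```

It could be pointed at `http://127.0.0.1:9100/mcp` and at one of the `/warehouse/*` routes. A non-2xx reply on the MCP path is not necessarily a failure, since an MCP endpoint may reject a plain GET.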

2.2 Profile MCP Wiring

Update config.yaml.template to declare MCP Tushare. The URL depends on the tier.

For prefab profiles, which run directly on the host, use localhost:

yaml

mcp_servers:
  tushare:
    url: http://127.0.0.1:9100/mcp

For containerized profiles, use Docker network DNS (the profile container must be attached to the same user-defined Docker network as the MCP Tushare container):

yaml

mcp_servers:
  tushare:
    url: http://twilight-mcp-tushare:9100/mcp
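Since the MCP URL is the only part that differs between tiers, the spawner could pick it from a small lookup when it writes config.yaml. A minimal sketch, assuming the two URLs given in this section and a hypothetical `render_mcp_config` helper:

```python
# URLs taken from this section; the helper itself is hypothetical.
MCP_URLS = {
    "prefab": "http://127.0.0.1:9100/mcp",
    "container": "http://twilight-mcp-tushare:9100/mcp",
}


def render_mcp_config(tier: str) -> str:
    """Return the mcp_servers YAML fragment for the given tier."""
    try:
        url = MCP_URLS[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
    return f"mcp_servers:\n  tushare:\n    url: {url}\n"
```

Raising on an unknown tier means a typo in the tier name fails at spawn time instead of producing a profile with no data source.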

2.3 Skills Directory Sharing

Current skills live at ~/twilight/source/skills/research/stock-research/. Options:

Approach	Pros	Cons
Bind mount skills into each profile	Single source of truth, easy updates	Shared mutable state
Clone skills per profile	Full isolation	Update requires propagation
Symlink to central skills dir	Low overhead, easy updates	Breaks if skills dir moves

Recommendation: symlink for Prefab, bind mount for containers.
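For the prefab side of that recommendation, linking could be done idempotently so that re-running the spawner is safe. This is a sketch under assumptions: `link_skills` is a hypothetical helper, and the `skills` subdirectory name inside a profile is a guess.

```python
from pathlib import Path


def link_skills(central_skills: Path, profile_dir: Path) -> Path:
    """Point <profile>/skills at the central skills directory via a symlink."""
    link = profile_dir / "skills"  # assumption: the profile reads skills from here
    target = central_skills.resolve()
    if link.is_symlink():
        if link.resolve() == target:
            return link  # already correct, nothing to do
        link.unlink()  # stale link, e.g. after the central directory moved
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(target, target_is_directory=True)
    return link
```

Replacing a stale link on each run softens the "breaks if skills dir moves" drawback from the table. For the container tier, the counterpart would be a read-only bind mount (`-v <src>:<dst>:ro`).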

Phase 3 — Unified Management Layer

Goal: Manage all 5+ profiles from a single interface.

3.1 Profile Registry

SQLite/Postgres table tracking:

sql

CREATE TABLE hermes_profiles (
  id TEXT PRIMARY KEY,
  name TEXT UNIQUE,
  tier TEXT,          -- 'prefab' | 'container'
  user_id TEXT,       -- NULL for test profiles
  status TEXT,        -- 'running' | 'stopped' | 'error'
  pid INTEGER,        -- for prefab
  container_id TEXT,  -- for container
  profile_path TEXT,
  created_at TIMESTAMP,
  last_health_check TIMESTAMP
);
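A minimal sketch of how the spawner and health checker might use this table with SQLite. The schema is the one above. The helper names and the choice to start new profiles in the `stopped` state are assumptions.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS hermes_profiles (
  id TEXT PRIMARY KEY,
  name TEXT UNIQUE,
  tier TEXT,
  user_id TEXT,
  status TEXT,
  pid INTEGER,
  container_id TEXT,
  profile_path TEXT,
  created_at TIMESTAMP,
  last_health_check TIMESTAMP
);
"""


def _now() -> str:
    """UTC timestamp as ISO 8601 text."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def open_registry(path: str) -> sqlite3.Connection:
    """Open the registry, creating the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def register_profile(conn: sqlite3.Connection, name: str, tier: str,
                     profile_path: str, user_id: Optional[str] = None) -> None:
    """Insert a new profile row; a duplicate name raises sqlite3.IntegrityError."""
    conn.execute(
        "INSERT INTO hermes_profiles"
        " (id, name, tier, user_id, status, profile_path, created_at)"
        " VALUES (?, ?, ?, ?, 'stopped', ?, ?)",
        (uuid.uuid4().hex, name, tier, user_id, profile_path, _now()),
    )
    conn.commit()


def set_status(conn: sqlite3.Connection, name: str, status: str) -> None:
    """Record a status change and the time it was observed."""
    cur = conn.execute(
        "UPDATE hermes_profiles SET status = ?, last_health_check = ? WHERE name = ?",
        (status, _now(), name),
    )
    conn.commit()
    if cur.rowcount == 0:
        raise KeyError(name)
```

The UNIQUE constraint on `name` does the duplicate detection, so the spawner does not need a separate existence check before inserting.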

3.2 Management CLI

bash

hermes-admin list              # Show all profiles
hermes-admin start <name>      # Start a stopped profile
hermes-admin stop <name>       # Stop a running profile
hermes-admin health <name>     # Check profile health
hermes-admin migrate <name>    # prefab → container migration
hermes-admin logs <name>       # Tail profile logs
hermes-admin exec <name> cmd   # Run command inside profile
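The command surface above maps directly onto argparse subcommands. The following is a parser skeleton only. The implementation language of hermes-admin is not decided in this plan, so Python here is an assumption, and the handlers are left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the hermes-admin argument parser (handlers not included)."""
    parser = argparse.ArgumentParser(prog="hermes-admin")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("list", help="Show all profiles")

    # Subcommands that take exactly one profile name.
    for cmd, help_text in [
        ("start", "Start a stopped profile"),
        ("stop", "Stop a running profile"),
        ("health", "Check profile health"),
        ("migrate", "Migrate a prefab profile to a container"),
        ("logs", "Tail profile logs"),
    ]:
        sub.add_parser(cmd, help=help_text).add_argument("name")

    # exec passes everything after the profile name through untouched.
    exec_parser = sub.add_parser("exec", help="Run command inside profile")
    exec_parser.add_argument("name")
    exec_parser.add_argument("cmd", nargs=argparse.REMAINDER)
    return parser
```

Using `argparse.REMAINDER` for `cmd` lets flags meant for the inner command (such as `ls -la`) pass through without being interpreted by hermes-admin itself.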

3.3 Prometheus Metrics

Each profile exposes /metrics:

hermes_query_total{profile="name"}
hermes_query_duration_seconds{profile="name"}
hermes_cite_verification_failures_total{profile="name"}
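To show what a scrape of these metrics would return, the sketch below renders counters in the Prometheus text exposition format using only the standard library. It is illustrative only: a real profile would more likely use a Prometheus client library, and `hermes_query_duration_seconds` would need a histogram or summary, which is not covered here.

```python
def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counters(profile: str, counters: dict[str, int]) -> str:
    """Render counters for one profile as Prometheus exposition text."""
    lines = []
    for metric in sorted(counters):
        lines.append(f"# TYPE {metric} counter")
        lines.append(f'{metric}{{profile="{escape_label(profile)}"}} {counters[metric]}')
    return "\n".join(lines) + "\n"
```

Keeping the profile name as a label rather than part of the metric name lets one query aggregate across all profiles or filter down to a single one.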

File Structure

repo/
├── scripts/
│   ├── admin/          # ops: profile lifecycle, token, secrets
│   │   ├── spawn-profile.sh        # NEW: create + start a profile
│   │   ├── install-skill.sh        # moved from scripts/
│   │   ├── issue-token.py          # moved from scripts/
│   │   ├── with-hermes-secrets.sh  # moved from scripts/
│   │   └── with-cf-token.sh        # moved from scripts/
│   ├── data/           # data pipeline
│   │   ├── backfill_tushare_t1.py  # moved from scripts/
│   │   └── seed_trade_cal.py       # moved from scripts/
│   └── deploy/         # deployment helpers
│       ├── backend.sh              # moved from deploy/
│       ├── frontend.sh             # moved from deploy/
│       └── install.sh              # moved from deploy/
│
├── deploy/             # static configs + templates only
│   ├── compose.yml
│   ├── Dockerfile      # updated: COPY scripts/{data,admin}/
│   ├── *.service
│   ├── *.template
│   └── *.md
│
├── profile/            # templates only
│   └── template-stock-research-pro/
│       ├── AGENTS.md, SOUL.md, config.yaml.template
│       ├── .env.example, secrets.schema.json
│       └── legacy/
│
├── ops/                # NEW: profile runtime state (on ECS)
│   ├── profiles.db     # SQLite registry (created by spawn-profile.sh)
│   ├── logs/           # per-profile log files
│   └── configs/        # generated profile configs
│
├── docs/planning/
│   ├── hermes-profile-hybrid-test.md   # this file
│   ├── 00-09/*.md                      # existing planning docs
│   ├── plans/                          # dated plans
│   └── superpowers/                    # superpowers specs
│
├── core/               # Python shared library (citation, verifier, etc.)
├── skill/              # Hermes skills (stock-research)
├── src/                # FastAPI backend (Python)
├── backend/            # NestJS billing + auth (TypeScript)
├── frontend/           # React UI
├── mcp/                # MCP servers (tushare_mcp)
└── install.sh          # root entry point (updated path → scripts/admin/)

Build Order

Week 1 (now)
├── 0.1 Write spawn-profile.sh
├── 0.2 Spawn 5 test profiles
├── 0.3 Verify single profile works (price + fundamentals queries)
└── 2.1 Check MCP Tushare health (fix 404 if needed)

Week 2
├── 0.4 Verify 5 profiles coexist with isolation
├── 0.5 Test prefab → container migration
├── 1.1 Wire payment → provisioning bridge
└── 2.2 Wire profiles to MCP data source

Week 3
├── 3.1 Profile registry
├── 3.2 Management CLI
├── 3.3 Metrics + health dashboard
└── Integration test: full payment → provision → query flow

Risks

Risk	Impact	Mitigation
SiliconFlow API rate limits with 5 profiles	Queries fail or slow	Use different API keys per profile, or pool through backend
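DuckDB WAL contention (MCP + profiles reading)	Data corruption or lock errors	Profiles read through MCP (:9100) instead of opening the DuckDB file; MCP is the only writer. DuckDB allows either one read-write process or several read-only processes on a file, not both at once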
Hermes LLM quality (Nous model)	Poor research output	Acceptable for internal testing; swap to SiliconFlow Qwen model
Profile spawn time	Slow user onboarding	Profile copy takes <1s; container start ~3s
