Runbook — W4 `/price` cutover

Operator playbook for executing the v0.3.0 W4 breaking-change cutover on tvps (the Vultr Japan VPS that runs api.fsagent.cc).

This runbook expands the cutover playbook in docs/planning/superpowers/specs/2026-05-11-w4-price-route-cutover.md §7–§8 into concrete commands, expected output, and decision branches.

Audience: the human running the cutover (currently: project owner). Read end-to-end before starting. Total budget: 60 minutes including verification, plus a 30-minute rollback window if needed.

0. Prerequisites — must be true before T+00:00

If any item is NO, do not start. Resolve first.

[ ] W4 implementation PR(s) merged to main; commit pinned in ~/twilight/source on tvps
[ ] All W4 dependency PRs merged (see spec §10): #38, #39, #41, #42, #43, #44, #46, #47, #48 and T2.7 lifespan injection
[ ] core/tests/integration/test_service_price_route.py green in CI on the cutover commit
[ ] Tushare token still valid (TUSHARE_TOKEN in tvps ~/twilight/.env)
[ ] Hermes profile install script scripts/install-skill.sh is on the latest revision in the user's local repo
[ ] No active client traffic that would be disrupted by a 60-second /price blip — agent in idle, no scheduled batch

Two terminals open:

# Terminal A (local repo)
cd ~/path/to/twilight-drive
git fetch origin main
git checkout main && git pull

# Terminal B (tvps)
ssh tvps
cd ~/twilight/source

1. Cutover timeline

T+00:00 — Pre-flight snapshot (5 min)

In Terminal B (tvps):

# Verify current state
docker compose -f deploy/compose.yml ps
# Expected: twilight-backend running (Up X minutes), restart count 0

# Snapshot the DuckDB warehouse for rollback
cp ~/twilight/data/warehouse.duckdb ~/twilight/data/warehouse.duckdb.pre-w4
ls -lh ~/twilight/data/warehouse.duckdb*
# Expected: both files, same size

# Record current image digest for fast rollback
docker compose -f deploy/compose.yml images | tee /tmp/w4-pre-images.txt
# Expected: prints the IMAGE column with current digest; SAVE the digest

Decision: if docker compose ps shows anything other than healthy + running → abort, investigate first. The cutover assumes the current service is in good shape.

T+00:05 — Tag and push (server deploy) (10 min)

In Terminal A (local):

git tag v0.3.0-w4-cutover
git push origin v0.3.0-w4-cutover
# CI will build + publish ghcr.io/lacatfly/twilight-drive:v0.3.0-w4-cutover

Wait for the build to publish. Watch with:

gh run watch --exit-status
# Or: open https://github.com/LaCatFly/twilight-drive/actions

When the image is published, in Terminal B:

# Set the version and pull
export TWILIGHT_VERSION=v0.3.0-w4-cutover
docker compose -f deploy/compose.yml pull
docker compose -f deploy/compose.yml up -d

# Wait for healthcheck
sleep 5
docker compose -f deploy/compose.yml ps
# Expected: twilight-backend (healthy), restart count 0

T+00:15 — Server smoke test (5 min)

Still in Terminal B:

# Pull a token for testing (env or vault, do NOT hardcode)
TOK="$(cat ~/twilight/.tokens/operator-test)"

# Test the new shape on a known liquid stock
curl -s -H "Authorization: Bearer $TOK" \
  "https://api.fsagent.cc/price?code=600519.SH" | jq .

Expected (new shape, all fields):

jsonc

{
  "ts_code": "600519.SH",
  "trade_date": "2026-05-09",
  "close_raw": 1700.5,           // number, not null
  "close_qfq": 1700.5,           // number when adj_factor present; null OK if Tushare returned no factor
  "close_hfq": 1701.2,
  "open_raw": 1695.0,
  "high_raw": 1710.0,
  "low_raw": 1690.0,
  "volume": 1234567,             // 手
  "amount": 2100000.0,           // 千元
  "adj_factor": 1.0,
  "adj_factor_latest": 1.0,
  "as_of": "2026-05-09T07:30:00+00:00",
  "freshness_seconds": 42,
  "source": "tushare-daily",     // or akshare-spot during OPEN
  "source_chain": ["tushare-daily"],
  "market_state": "closed_final",
  "units": { "close_raw": "yuan", "volume": "hand", "amount": "kyuan", ... },
  "cite": { "kind": "tool", "tool_call_id": "tc_...", ... }
}

Red flags (rollback immediately if any):

close_raw is null or absent — Composer didn't return a Quote
Response is the old envelope (value / metric keys) — pull didn't take
HTTP 5xx — Composer or AuditLog is failing; check docker compose logs backend --tail=200

T+00:25 — Hermes profile reinstall (10 min)

In Terminal A (local):

bash scripts/install-skill.sh
# Expected: copies skill/stock-research/ files into Hermes config dir,
# prints "skill installed" or similar

In the user's main Hermes shell (whatever launched Hermes):

# Restart Hermes so it reloads the new skill
hermes restart
# Or whatever the operator's actual restart command is

Verify the profile loaded by checking Hermes startup log for the stock-research skill register line. If unsure, run a trivial agent prompt and confirm the skill is listed in its tool list.

T+00:35 — Live agent smoke test (15 min)

Use the Hermes agent end-to-end:

Latest-quote path. Ask the agent: "What is the current price of 贵州茅台 (600519)?"
- Expected: Agent emits a price with close_qfq or close_raw, mentions freshness_seconds < 120, and the market_state matches reality (OPEN / CLOSED_FINAL).
- Red flag: agent says "price unavailable" or quotes a stale value (freshness_seconds > 86400 outside CLOSED_FINAL).
Historical path. Ask: "What was the close of 600519 on 2024-01-02?"
- Expected: Agent returns a number with trade_date: "2024-01-02", close_qfq populated (factors are static for historical dates).
- Red flag: close_qfq is null but close_raw is present — adj_factor lookup is broken.
Cache freshness sanity. Same query twice, 90 seconds apart, during OPEN market state:
- Expected: Second call's freshness_seconds is ≥ 60 less than the first (cache TTL is 60s during OPEN; after expiry, a new fetch resets it).
- Red flag: freshness_seconds jumps backward in time, or stays at 0 across calls (cache not respecting TTL).
Unknown code error path. Ask: "What is the price of 999999.SH?"
- Expected: Agent reports the code is unknown or unavailable. HTTP 502 (not 500) in docker compose logs backend.

T+00:50 — Decision point

All four agent checks green + curl smoke green: continue to T+00:55.
Any red flag and no <5-min forward fix: execute rollback (§2).

T+00:55 — Status doc + close out (5 min)

In Terminal A:

# Update status index
$EDITOR docs/status/index.md
# Add: "v0.3.0 W4 /price cutover completed YYYY-MM-DD; link to spec"

git add docs/status/index.md
git commit -m "docs(status): W4 /price cutover complete"
git push

Move the AuditLog forward-baseline: the cutover anchor is now live. Old DailyCache paths are dead code (deleted as part of the W4 PR per spec §5.2).

2. Rollback procedure

Trigger conditions (any of):

New /price returns 5xx on >5% of agent traffic
close_raw consistently null for known-liquid codes
Hermes profile reports value parse error from any prompt
AuditLog shows >10 critical diffs in 10 minutes (cross-source disagreement spike — only relevant if W7 TDX auditor is also live; ignore in v0.3.0)
Operator judgment: anything that doesn't have a forward-fix in <60 minutes

Budget: 30 minutes total.

2.1 Server revert (10 min)

In Terminal B (tvps), reverse the deploy:

# Use the digest captured at T+00:00 from /tmp/w4-pre-images.txt
PREV_DIGEST="ghcr.io/lacatfly/twilight-drive@sha256:..."  # paste from snapshot

# Edit compose to pin to that digest (or use the version tag of the
# previous release, e.g. v0.2.0)
export TWILIGHT_VERSION=v0.2.0   # or whatever was running pre-cutover

docker compose -f deploy/compose.yml pull
docker compose -f deploy/compose.yml up -d

# Verify
sleep 5
docker compose -f deploy/compose.yml ps
curl -s -H "Authorization: Bearer $TOK" \
  "https://api.fsagent.cc/price?code=600519.SH" | jq 'keys'
# Expected: keys include "value", "metric", "code" (old shape)

2.2 Hermes profile revert (10 min)

In Terminal A (local):

git checkout v0.2.0 -- skill/stock-research/
bash scripts/install-skill.sh
# In Hermes shell:
hermes restart

2.3 DuckDB revert (5 min, only if schema regressed)

W4 cutover does not migrate data; the snapshot is a defense-in-depth measure. Only restore if the running service touched the warehouse schema in a way that the v0.2.0 image can no longer read.

In Terminal B:

docker compose -f deploy/compose.yml stop backend
cp ~/twilight/data/warehouse.duckdb.pre-w4 ~/twilight/data/warehouse.duckdb
docker compose -f deploy/compose.yml up -d backend

2.4 Post-rollback verification (5 min)

Same as §1 T+00:15 smoke test, but expecting the old envelope shape. File a follow-up issue describing the rollback trigger; do not retry the cutover the same day.

3. Monitoring queries (post-cutover, first 48h)

The new envelope writes one row per /price call to provider_audit_log via the lifespan-injected AuditLog. Run these from tvps to track health:

docker compose -f deploy/compose.yml exec backend python - <<'PY'
import duckdb
con = duckdb.connect("/data/warehouse.duckdb", read_only=True)

# Error rate, last hour
print(con.execute("""
    SELECT
      COUNT(*) AS total,
      SUM(CASE WHEN error IS NOT NULL THEN 1 ELSE 0 END) AS errors,
      ROUND(100.0 * SUM(CASE WHEN error IS NOT NULL THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 2) AS err_pct
    FROM provider_audit_log
    WHERE ts > now() - INTERVAL 1 HOUR
""").fetchall())

# Provider chain distribution (was the fallback exercised more than baseline?)
print(con.execute("""
    SELECT served_by, COUNT(*) AS n
    FROM provider_audit_log
    WHERE ts > now() - INTERVAL 1 HOUR
    GROUP BY served_by ORDER BY n DESC
""").fetchall())

# p95 latency
print(con.execute("""
    SELECT quantile_cont(latency_ms, 0.95) AS p95_ms
    FROM provider_audit_log
    WHERE ts > now() - INTERVAL 1 HOUR
""").fetchall())
PY

Healthy baseline (from v0.2.0):

Metric	Healthy	Investigate
err_pct (1h)	< 1%	> 3%
p95 latency	< 400 ms	> 600 ms
`served_by="akshare-spot"` share during OPEN	50–90%	falls to 0% (akshare provider broken)

4. Known failure modes — preauthored responses

Symptom	Likely cause	Operator action
`close_raw` null but `close_qfq` populated	adj_factor math returning when raw not — should not happen	Pull logs, file bug; rollback if recurring
`freshness_seconds` always 0 across calls	Cache not hit (TTL bug or always-CLOSING state)	Check `MarketState` resolver: `select market_state, count(*) from provider_audit_log group by market_state` — if all "closing", the trade_cal seed is stale
HTTP 502 on liquid codes during OPEN	Composer exhausted chain (akshare + tushare both failing)	Check `docker compose logs backend --tail=200`; if upstream issue, rollback won't help — write status page note and wait
HTTP 500 with `CanonicalUnitViolation`	Provider returned data in wrong units (spec §1 incidents 4-5 class)	Cutover is exposing a real bug — do not roll back to mask it. File issue, fix forward
`source_chain` length 1 always	Composer never falls back; could be by design (chain ends on first success) — verify against spec §6.2	Check spec §6.2 chain entries vs. observed; no action unless a chain entry is provably skipped
Hermes agent emits both `value` and `close_raw`	Old prompt template still has `value` mention	Search Hermes prompts for hardcoded `value`; update SKILL.md back-compat usage

5. Cross-references

Spec (design): docs/planning/superpowers/specs/2026-05-11-w4-price-route-cutover.md
Target schema: 02-tushare-integration-design.md §9.2
Lifespan prerequisite: 2026-05-10-lifespan-integration-spec.md §4 (W2.7)
Composer contract: 02-tushare-integration-design.md §6
Field mapping (old → new): spec §4
Deploy config: deploy/compose.yml, deploy/install.sh

If this runbook drifts from the spec, the spec wins — file a PR to re-sync this doc rather than acting on stale steps.

Runbook — W4 /price cutover ​

0. Prerequisites — must be true before T+00:00 ​

1. Cutover timeline ​

T+00:00 — Pre-flight snapshot (5 min) ​

T+00:05 — Tag and push (server deploy) (10 min) ​

T+00:15 — Server smoke test (5 min) ​

T+00:25 — Hermes profile reinstall (10 min) ​

T+00:35 — Live agent smoke test (15 min) ​

T+00:50 — Decision point ​

T+00:55 — Status doc + close out (5 min) ​

2. Rollback procedure ​

2.1 Server revert (10 min) ​

2.2 Hermes profile revert (10 min) ​

2.3 DuckDB revert (5 min, only if schema regressed) ​

2.4 Post-rollback verification (5 min) ​

3. Monitoring queries (post-cutover, first 48h) ​

4. Known failure modes — preauthored responses ​

5. Cross-references ​

Runbook — W4 `/price` cutover

0. Prerequisites — must be true before T+00:00

1. Cutover timeline

T+00:00 — Pre-flight snapshot (5 min)

T+00:05 — Tag and push (server deploy) (10 min)

T+00:15 — Server smoke test (5 min)

T+00:25 — Hermes profile reinstall (10 min)

T+00:35 — Live agent smoke test (15 min)

T+00:50 — Decision point

T+00:55 — Status doc + close out (5 min)

2. Rollback procedure

2.1 Server revert (10 min)

2.2 Hermes profile revert (10 min)

2.3 DuckDB revert (5 min, only if schema regressed)

2.4 Post-rollback verification (5 min)

3. Monitoring queries (post-cutover, first 48h)

4. Known failure modes — preauthored responses

5. Cross-references