Twilight Drive Phase 1 — Hosted SaaS MVP

Date: 2026-05-07
Status: Draft (pending user review)
Source: Consolidated from docs/planning/plans/archive/2026-04-30-p1-service.md and /Users/syone/prd/MASTER_PLAN.md

Background

Twilight Drive is an A-share research agent with mandatory citations and deterministic verification. The core package (twilight-drive-core 0.1.0) is shipped with citation protocol, verifier, Tushare adapter, and stock-research skill scripts. Two Hermes profiles are running on a Mac Mini (Main/crypto + Stock-Research).

The product vision is a hosted SaaS: users pay ¥198/month, scan a WeChat QR, and get a personal stock-research bot — no tokens, no keys, no config. Phase 1 turns the spec into a shippable paid product.

Goal

Ship a paid A-share research bot on WeChat in four milestones (P1.0 → P1.3):

  1. P1.0 — FastAPI backend on Vultr with DuckDB cache + bearer auth (closed alpha)
  2. P1.1 — Landing page + WeChat Pay + provisioning flow (paid product)
  3. P1.2 — Multi-source data layer (pytdx3 + akshare + Yahoo)
  4. P1.3 — Web search proxy (DashScope → Google fallback)

Success criteria at the end: user pays ¥198 → scans QR → names bot → sends "查 600519 的 P/E" → gets a cited answer in WeChat, having never touched a single token or key.

Non-Goals

  • Lite tier ($49/mo) enforcement — displayed but disabled in MVP
  • Real-time intraday data stream
  • Research report PDF ingestion (deferred to P2)
  • Cross-source factor generation (P2)
  • Per-user dashboards / billing portals (P2)
  • Open-source distribution of the service code (undecided)

Architecture

                        ┌──────────────────────────────────────┐
                        │         Vultr ($12/mo, 4GB)          │
                        │                                      │
   WeChat User ──QR──►  │  ┌────────────┐    ┌───────────────┐ │
   (Weixin iLink)       │  │  Hermes    │    │  FastAPI      │ │
                        │  │  Gateway   │───►│  +Pydantic    │ │
                        │  │ (per-user) │    │  /price       │ │
                        │  └────────────┘    │  /fundamentals│ │
                                              │  /search      │ │
                        ┌────────────┐        │               │ │
                        │  WeChat    │        │  ┌─────────┐  │ │
                        │  Pay       │──webhook──►│Provisioner│ │
                        │  Webhook   │        │  └─────────┘  │ │
                        └────────────┘        │               │ │
                                              │  ┌─────────┐  │ │
                        ┌────────────┐        │  │DuckDB   │  │ │
                        │  Postgres  │◄───────┼──│cache    │  │ │
                        │  users/    │        │  │         │  │ │
                        │  billing   │        │  └────┬────┘  │ │
                        └────────────┘        │       │miss   │ │
                                              │       ▼       │ │
                                              │  ┌─────────┐  │ │
                                              │  │Tushare  │  │ │
                                              │  │Pro (paid)│  │ │
                                              │  └─────────┘  │ │
                                              └────────────────┘

Two different components are both called "gateway":

  • FastAPI service: the HTTP server that skills call (/price, /fundamentals, /search). One process serves all users.
  • Hermes gateway: a per-user agent process that holds a WeChat iLink connection and runs the agent loop. Each paying user gets one.

§1 — P1.0: MVP Backend (Closed Alpha)

1.1 FastAPI Service

Endpoint surface:

GET  /healthz                          → {"ok": true, "version": "0.2.0"}
GET  /price?code=600519.SH&trade_date=20260430
                                       → {value, metric, code, as_of, cite{...}}
GET  /fundamentals?code=600519.SH&period=20251231
                                       → {code, as_of, claims:[...]}
GET  /search?query=...&max_results=10  → 501 Not Implemented
GET  /reports/search?code=...          → 501 Not Implemented

Bearer token auth:

  • Each alpha user gets a TWILIGHT_API_TOKEN generated by admin script
  • Token stored as sha256(token) in a SQLite file (upgrade to Postgres in P1.1)
  • Middleware: Authorization: Bearer <token> required on all /price, /fundamentals, /search
  • Missing token → 401; invalid → 401; revoked → 403
  • Token issued via CLI: python scripts/admin.py issue-token --user alpha-001
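
The auth rules above can be sketched roughly as follows. This is a minimal illustration and not the actual auth.py: the tokens table layout and the function names (init_db, issue_token, auth_status) are assumptions made for the example, and the real service would wire the check into FastAPI middleware.

```python
import hashlib
import secrets
import sqlite3
from typing import Optional


def hash_token(token: str) -> str:
    """Return the hex sha256 digest that is stored instead of the raw token."""
    return hashlib.sha256(token.encode()).hexdigest()


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical table layout; the spec only says hashes live in a SQLite file.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens ("
        "token_hash TEXT PRIMARY KEY, "
        "user TEXT NOT NULL, "
        "revoked INTEGER NOT NULL DEFAULT 0)"
    )


def issue_token(conn: sqlite3.Connection, user: str) -> str:
    """Create a token, persist only its hash, and return the raw value once."""
    raw = secrets.token_urlsafe(32)
    conn.execute(
        "INSERT INTO tokens (token_hash, user) VALUES (?, ?)",
        (hash_token(raw), user),
    )
    return raw


def auth_status(conn: sqlite3.Connection, authorization: Optional[str]) -> int:
    """Map an Authorization header value to the status codes listed above."""
    if not authorization or not authorization.startswith("Bearer "):
        return 401  # missing token
    token = authorization[len("Bearer "):]
    row = conn.execute(
        "SELECT revoked FROM tokens WHERE token_hash = ?",
        (hash_token(token),),
    ).fetchone()
    if row is None:
        return 401  # token not known
    return 403 if row[0] else 200  # revoked vs. valid
```

Because only the digest is stored, a leaked database file does not reveal usable tokens; the raw value exists only at issue time and in the caller's environment.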

DuckDB cache:

  • Single file at /var/lib/twilight/cache.duckdb
  • Table daily_cache(code VARCHAR, trade_date VARCHAR, close DOUBLE, fetched_at TIMESTAMP)
  • Cache key: (code, trade_date) — same-day queries hit cache
  • TTL: 24 hours (86400 seconds)
  • Miss → Tushare HTTP call → insert into DuckDB → return
  • Success criterion: 100 sequential same-code requests → 1 Tushare call total
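
A read-through cache with the 24-hour TTL described above might look like the sketch below. To stay self-contained it uses the standard-library sqlite3 module as a stand-in for DuckDB and stores fetched_at as epoch seconds; get_close and the fetch_upstream callback are hypothetical names, with the callback standing in for the Tushare HTTP call.

```python
import sqlite3
import time
from typing import Callable, Optional

TTL_SECONDS = 86400  # 24 hours, as specified above


def init_cache(conn: sqlite3.Connection) -> None:
    # Same columns as daily_cache above; the primary key mirrors the cache key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_cache ("
        "code TEXT, trade_date TEXT, close REAL, fetched_at REAL, "
        "PRIMARY KEY (code, trade_date))"
    )


def get_close(
    conn: sqlite3.Connection,
    code: str,
    trade_date: str,
    fetch_upstream: Callable[[str, str], float],
    now: Optional[float] = None,
) -> float:
    """Return the close price, calling upstream only on a miss or expired row."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT close, fetched_at FROM daily_cache "
        "WHERE code = ? AND trade_date = ?",
        (code, trade_date),
    ).fetchone()
    if row is not None and now - row[1] < TTL_SECONDS:
        return row[0]  # fresh cache hit, no upstream call
    close = fetch_upstream(code, trade_date)  # miss or stale: go upstream
    conn.execute(
        "INSERT OR REPLACE INTO daily_cache "
        "(code, trade_date, close, fetched_at) VALUES (?, ?, ?, ?)",
        (code, trade_date, close, now),
    )
    return close
```

Passing now explicitly keeps the TTL logic testable without sleeping, which is how the "100 requests, 1 upstream call" criterion can be checked in a unit test.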

1.2 Cite Envelope

Every response wraps data in the existing core/core/citation.py schema:

json
{
  "value": 1850.25,
  "metric": "close",
  "code": "600519.SH",
  "as_of": "2026-04-30",
  "cite": {
    "kind": "tool",
    "source": "tushare",
    "served_by": "twilight-drive-backend",
    "served_version": "0.2.0",
    "table": "daily",
    "fetched_at": "2026-04-30T11:00:00Z",
    "cache_age_seconds": 86400,
    "tool_call_id": "tc_..."
  }
}
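
As a rough illustration of what "cite envelope intact" could mean in a test, the sketch below checks a response for the fields that appear in the example above. The set of required fields is an assumption read off that example; the authoritative schema is core/core/citation.py, which is not reproduced here, and missing_fields is a hypothetical helper.

```python
from typing import Any, Dict, List

# Field names copied from the example envelope (assumed to be required).
REQUIRED_TOP = ("value", "metric", "code", "as_of", "cite")
REQUIRED_CITE = (
    "kind", "source", "served_by", "served_version",
    "table", "fetched_at", "cache_age_seconds", "tool_call_id",
)


def missing_fields(payload: Dict[str, Any]) -> List[str]:
    """Return dotted names of absent fields; an empty list means intact."""
    missing = [key for key in REQUIRED_TOP if key not in payload]
    cite = payload.get("cite")
    if isinstance(cite, dict):
        missing += [f"cite.{key}" for key in REQUIRED_CITE if key not in cite]
    elif "cite" in payload:
        missing.append("cite")  # present but not a JSON object
    return missing
```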

1.3 Skill Side — Zero Changes

Existing skill/stock-research/scripts/_client.py already supports dual mode:

  • Direct — calls Tushare locally (current dev mode)
  • Service — calls FastAPI with bearer token (P1.0 mode)

Switch via env var TWILIGHT_SERVICE_URL=http://vultr-ip:8000. No code changes needed.
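
The switch could be as small as the sketch below. The real _client.py is not shown in this spec, so resolve_mode is a hypothetical helper that only illustrates the idea: an unset or empty TWILIGHT_SERVICE_URL means direct mode, anything else means service mode.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_mode(env: Optional[Mapping[str, str]] = None) -> Tuple[str, Optional[str]]:
    """Return ("service", base_url) when the env var is set, else ("direct", None)."""
    env = os.environ if env is None else env
    url = env.get("TWILIGHT_SERVICE_URL", "").strip()
    if url:
        return "service", url.rstrip("/")  # normalise a trailing slash
    return "direct", None
```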

1.4 Deployment

Operator-facing details live in docs/planning/09-deployment-pattern.md (post-2026-05-07 update). This spec keeps only the requirements; the how belongs in the deploy spec because it is reusable across the backend, the P1.1 provisioner, and the P2-A warehouse.

  • Host: Vultr Japan (co-located with PRD; Docker container as the isolation boundary)
  • Public ingress: cloudflared named tunnel twilight-backend → api.fsagent.cc
    • Replaces the original Caddy-based plan; cloudflared gives us TLS, edge auth (CF Access), and zero open ports on the VPS in one piece
  • Container runtime: Docker Compose (read-only fs, dropped capabilities, resource limits)
  • systemd-user units: twilight-backend.service + twilight-cloudflared.service (prefix avoids collision with PRD's poly-trade-*)
  • Secrets: ~/twilight/.env (mode 600); never embedded in image
  • Edge auth: CF Access policy on api.fsagent.cc (Email OTP for v0.1.x team access; service tokens once P1.1 provisioner ships)

1.5 Success Criteria

  • [ ] curl http://vultr-ip:8000/healthz → {"ok": true}
  • [ ] curl -H "Authorization: Bearer $TOKEN" "http://vultr-ip:8000/price?code=600519.SH" → cited price
  • [ ] curl "http://vultr-ip:8000/price?code=600519.SH" (no token) → 401
  • [ ] DuckDB cache: 100 sequential same-code requests → 1 Tushare call (journalctl confirms)
  • [ ] Response time < 50ms on cache hit (measured with time curl ...)
  • [ ] Verifier passes on all responses (cite envelope intact)

§2 — P1.1: Onboarding + Payment

2.1 Landing Page

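📤 Moved off this site

External-facing content in this section (pricing, the Lite tier, the paid landing page, WeChat Pay) is presented at dev.fsagent.cc when implemented. The public paid version will launch on a separate site as a separate project. This site carries the spec for internal reference only; implementation here covers only the technical parts, such as the provisioner service, profile cloning, and the Postgres schema.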

Vue.js SPA (lightweight, single-page):

┌──────────────────────────────────────────────┐
│          Twilight Drive                     │
│    A-share research agent, citations that    │
│    trace, reasoning you can audit.           │
│                                              │
│  ┌────────────────┐  ┌──────────────────┐   │
│  │     Pro        │  │      Lite        │   │
│  │  ¥198/month    │  │   $49/month      │   │
│  │  [Buy Pro]     │  │  Coming Soon     │   │
│  │                │  │  (disabled)      │   │
│  └────────────────┘  └──────────────────┘   │
│                                              │
│  Features:                                   │
│  ✓ Full A-share fundamentals                 │
│  ✓ Real-time price via CLOB                  │
│  ✓ Web search for news & announcements       │
│  ✓ WeChat bot — no config needed             │
│  ✓ Mandatory citations on every number       │
└──────────────────────────────────────────────┘

2.2 WeChat Pay Integration

  • WeChat Pay Native API (JSAPI + Native QR)
  • Webhook endpoint: POST /api/payments/webhook
  • Webhook signature verified with WECHAT_PAY_WEBHOOK_SECRET
  • On payment confirmed:
    1. Insert payments row: {user_id, amount_cents, currency, processor, processor_ref, status='paid'}
    2. Set users.paid_until = now() + 30 days
    3. Trigger provisioner (async)
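
The three steps above, plus protection against a webhook being delivered twice, can be sketched as follows. This is an illustration under stated assumptions: sqlite3 stands in for Postgres, the tables are trimmed to the columns used here, processor_ref is treated as unique (the schema in this spec does not declare that), and on_payment_confirmed is a hypothetical name. Signature verification is out of scope for the sketch.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

SUBSCRIPTION_PERIOD = timedelta(days=30)


def init_payment_db(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE users (user_id TEXT PRIMARY KEY, paid_until TEXT)")
    conn.execute(
        "CREATE TABLE payments ("
        "processor_ref TEXT PRIMARY KEY, "  # assumption: one row per processor ref
        "user_id TEXT, amount_cents INTEGER, currency TEXT, "
        "processor TEXT, status TEXT)"
    )


def on_payment_confirmed(
    conn: sqlite3.Connection,
    user_id: str,
    amount_cents: int,
    processor_ref: str,
    now: datetime,
) -> bool:
    """Record the payment and extend paid_until.

    Returns True when the payment is new (the caller should then trigger the
    provisioner) and False when this processor_ref was already handled.
    """
    try:
        with conn:  # both writes commit together or roll back together
            conn.execute(
                "INSERT INTO payments (processor_ref, user_id, amount_cents, "
                "currency, processor, status) "
                "VALUES (?, ?, ?, 'CNY', 'wechat_pay', 'paid')",
                (processor_ref, user_id, amount_cents),
            )
            conn.execute(
                "UPDATE users SET paid_until = ? WHERE user_id = ?",
                ((now + SUBSCRIPTION_PERIOD).isoformat(), user_id),
            )
    except sqlite3.IntegrityError:
        return False  # duplicate webhook delivery
    return True
```

Returning a flag instead of calling the provisioner directly keeps the database transaction short and lets the caller enqueue the asynchronous provisioning step only once.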

2.3 Provisioner Service

Separate Python process (different from FastAPI — different ops profile):

python
import hashlib
import secrets
from dataclasses import dataclass
from uuid import UUID

# db, DOMAIN, hermes_profile_create, docker_run_hermes, and generate_ilink_qr
# are assumed to be defined elsewhere in the provisioner package.


@dataclass
class ProvisionResult:
    qr_url: str
    profile_name: str


async def provision_user(user_id: UUID, plan: str = "pro") -> ProvisionResult:
    """Called by payment webhook or admin CLI."""

    # 1. Generate API key (in memory only)
    raw_key = secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(raw_key.encode()).hexdigest()

    # 2. Store only the hash in Postgres
    await db.execute(
        "INSERT INTO api_keys (key_hash, user_id) VALUES ($1, $2)",
        key_hash, user_id
    )

    # 3. Clone profile from template and plant the raw key as a profile secret
    profile_name = f"user-{user_id.hex[:8]}"
    await hermes_profile_create(
        profile_name=profile_name,
        clone_from="template-stock-research-pro",
        secrets={
            "TWILIGHT_API_TOKEN": raw_key,
            "TWILIGHT_SERVICE_URL": f"https://{DOMAIN}/api",
        },
    )

    # 4. Start gateway container (Docker)
    await docker_run_hermes(profile_name)

    # 5. Generate WeChat QR
    qr_url = await generate_ilink_qr(profile_name)

    # 6. Drop the local reference to the raw key; note that Python does not
    #    guarantee the underlying bytes are wiped from memory
    del raw_key

    return ProvisionResult(qr_url=qr_url, profile_name=profile_name)

2.4 Postgres Schema

sql
CREATE TABLE users (
  user_id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  weixin_openid  VARCHAR UNIQUE,
  display_name   VARCHAR,
  plan           VARCHAR NOT NULL DEFAULT 'pro',
  status         VARCHAR NOT NULL DEFAULT 'active',
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  paid_until     TIMESTAMP NOT NULL
);

CREATE TABLE api_keys (
  key_hash       VARCHAR PRIMARY KEY,
  user_id        UUID NOT NULL REFERENCES users(user_id),
  scope          VARCHAR DEFAULT 'data:read',
  rate_limit_per_min INT DEFAULT 60,
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  last_used_at   TIMESTAMP,
  revoked_at     TIMESTAMP
);

CREATE TABLE payments (
  payment_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id        UUID REFERENCES users(user_id),
  amount_cents   INT NOT NULL,
  currency       VARCHAR(3) NOT NULL,
  processor      VARCHAR NOT NULL,
  processor_ref  VARCHAR NOT NULL,
  status         VARCHAR NOT NULL,
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  paid_at        TIMESTAMP
);

CREATE TABLE hermes_profiles (
  profile_name   VARCHAR PRIMARY KEY,
  user_id        UUID NOT NULL REFERENCES users(user_id),
  host           VARCHAR NOT NULL,
  weixin_bot_id  VARCHAR,
  status         VARCHAR NOT NULL,
  created_at     TIMESTAMP NOT NULL DEFAULT now()
);

CREATE TABLE usage_log (
  id             BIGSERIAL PRIMARY KEY,
  user_id        UUID NOT NULL,
  endpoint       VARCHAR NOT NULL,
  occurred_at    TIMESTAMP NOT NULL DEFAULT now(),
  upstream_cost_pts INT DEFAULT 0
);
CREATE INDEX usage_log_user_time ON usage_log (user_id, occurred_at DESC);

2.5 Secrets on Linux (Vultr)

| Secret | Storage |
|---|---|
| TUSHARE_TOKEN | systemd EnvironmentFile, mode 0600 |
| WECHAT_PAY_MCH_ID, WECHAT_PAY_API_KEY, WECHAT_PAY_WEBHOOK_SECRET | systemd EnvironmentFile |
| DATABASE_URL (Postgres) | systemd EnvironmentFile |
| Per-user TWILIGHT_API_TOKEN | pass (gpg-backed), planted by provisioner |

2.6 Success Criteria

  • [ ] Landing page at https://twilight-drive.com (or Vultr IP + self-signed for alpha)
  • [ ] Click "Buy Pro" → WeChat Pay QR appears
  • [ ] Scan + pay ¥198 → webhook received → user record created
  • [ ] Provisioner spawns Hermes profile → QR URL returned to landing page
  • [ ] User scans QR, names bot, sends "查 600519 的 P/E" → cited answer in WeChat
  • [ ] User never copy-pasted any token, key, or URL

§3 — P1.2: Multi-source Data Layer

3.1 Sources and Roles

| Source | What it's good for | When we use it |
|---|---|---|
| Tushare Pro (paid) | Daily basic, fundamentals, estimates, fund/HK | Primary for everything |
| pytdx3 (free, TDX socket) | Full historical OHLC, no rate cap | Bulk historical backfill |
| akshare | Macro, industry classification, alternative ratios | Cross-source enrichment |
| Yahoo Finance | US tickers, ADRs, exchange-rate baseline | Global coverage, ADR cross-reference |

3.2 Implementation

All sources wired behind the same FastAPI interface. Only cite.served_by reveals which source(s) backed a value:

json
{
  "cite": {
    "source": "tushare",
    "served_by": ["tushare", "pytdx3"],
    "cross_check": {
      "tushare_close": 1850.25,
      "pytdx3_close": 1850.20,
      "diff_pct": 0.003,
      "agreement": true
    }
  }
}

Cross-source agreement check:

  • If Tushare close and pytdx3 close for same (code, trade_date) differ > 0.01% → log warning
  • Continue serving Tushare as primary (paid, authoritative)
  • Log stored in /var/lib/twilight/cross-check.log
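
The agreement check above reduces to a small percentage comparison. The sketch below is one way to compute it, assuming the Tushare close is the denominator (the spec does not say which side is the reference) and that prices are positive; cross_check and the logger name are hypothetical.

```python
import logging
from typing import Dict, Union

logger = logging.getLogger("twilight.cross_check")
THRESHOLD_PCT = 0.01  # warn when the two closes differ by more than 0.01%


def cross_check(tushare_close: float, pytdx3_close: float) -> Dict[str, Union[float, bool]]:
    """Compare two closes for the same (code, trade_date) and build the cite block."""
    # Relative difference in percent, measured against the Tushare value.
    diff_pct = abs(tushare_close - pytdx3_close) / tushare_close * 100
    agreement = diff_pct <= THRESHOLD_PCT
    if not agreement:
        logger.warning(
            "close mismatch: tushare=%s pytdx3=%s diff_pct=%.4f",
            tushare_close, pytdx3_close, diff_pct,
        )
    return {
        "tushare_close": tushare_close,
        "pytdx3_close": pytdx3_close,
        "diff_pct": round(diff_pct, 3),
        "agreement": agreement,
    }
```

With the values from the example envelope (1850.25 and 1850.20) the difference is about 0.0027%, which rounds to the 0.003 shown there and stays under the threshold. Tushare remains the served value either way; the function only reports.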

3.3 pytdx3 Historical Backfill

python
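# scripts/backfill_pytdx3.py
# Fetches 5 years of daily data for top 300 stocks (by market cap)
# via pytdx3 direct TDX socket connection (free, no auth)
# Sketch: pytdx3.connect(), conn.get_daily_history(), and duckdb.insert_many()
# are treated here as placeholder wrappers, and insert_many() is assumed to
# return the number of rows it actually inserted.

async def backfill(code_list: list[str], years: int = 5) -> int:
    """Return rows inserted into DuckDB."""
    total_inserted = 0
    conn = await pytdx3.connect()
    for code in code_list:
        rows = await conn.get_daily_history(code, years=years)
        total_inserted += await duckdb.insert_many("daily", rows, on_conflict="ignore")
    return total_inserted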

3.4 Success Criteria

  • [ ] GET /price?code=600519.SH returns served_by: ["tushare"]
  • [ ] GET /price?code=AAPL returns served_by: ["yahoo"]
  • [ ] pytdx3 backfilled ≥ 5 years for top 300 stocks in DuckDB
  • [ ] Cross-source check logs warning when Tushare vs pytdx3 differ > 0.01%
  • [ ] No breaking changes to /price or /fundamentals response schema

§4 — P1.3: Web Search Proxy

Note: P1.3 uses DashScope + Google as a temporary solution. Phase 2 Initiative C replaces this with SearXNG.

4.1 Provider Stack

| Provider | Cost | When |
|---|---|---|
| DashScope built-in web_search | Free for Qwen OAuth users, 1000/day | Default |
| Google Custom Search | $5 / 1000 queries above free tier | Fallback |

4.2 Endpoint

GET /search?query=600519+业绩公告+2026&since=2026-01-01&max_results=10
   → {
       "results": [
         {
           "title": "...",
           "url": "https://...",
           "snippet": "...",
           "fetched_at": "...",
           "cite": {
             "kind": "tool",
             "source": "dashscope_web_search",
             "served_by": "twilight-drive-backend",
             "original_url": "https://..."
           }
         }
       ]
     }

4.3 Fallback Logic

python
async def search(query: str, max_results: int = 10) -> list[SearchResult]:
    try:
        return await dashscope_search(query, max_results)
    except (RateLimitError, EmptyResultsError):
        return await google_custom_search(query, max_results)

4.4 Success Criteria

  • [ ] /search?query=600519+业绩公告+2026 → DashScope results with cite envelope
  • [ ] DashScope quota exhausted → transparent fallback to Google Custom Search
  • [ ] Cite envelope on each result has served_by: "twilight-drive-backend"

§5 — Capacity & Hosting Strategy

Per-User Resource Profile

| Resource | Idle | Active Conversation |
|---|---|---|
| Hermes gateway disk | ~50 MB | ~50 MB |
| Hermes gateway RAM | ~80 MB | ~230 MB (+150 MB transient) |
| Hermes container limit | --memory 150m --cpus 0.5 | Docker cgroup enforced |
| LLM inference | 0 bytes (DashScope cloud) | 0 bytes |

MVP: Single-Host Vultr (≤ 30 users)

Vultr ($12/mo, 4GB RAM)
├── FastAPI Docker (~200 MB)
├── Postgres Docker (~300 MB)
├── Provisioner Docker (~50 MB)
├── cloudflared Docker (~64 MB)
├── DuckDB cache replica (~200 MB)
└── ALL hermes containers (30 × 80 MB idle = 2.4 GB)
    + peak conversation overhead ~0.5 GB
    ─────────────────────────────────────
    Total: ~3.7 GB / 4 GB (add 2 GB swap)
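
The budget can be re-derived with a few lines of arithmetic. The sketch below simply sums the line items listed in the tree, treating 4 GB as 4000 MB and using the idle figure per Hermes container; total_mb and max_users are hypothetical helper names.

```python
# Fixed services, in MB, as listed in the tree above.
FIXED_MB = {
    "fastapi": 200,
    "postgres": 300,
    "provisioner": 50,
    "cloudflared": 64,
    "duckdb_cache": 200,
}
HERMES_IDLE_MB = 80      # per-user idle footprint
PEAK_OVERHEAD_MB = 500   # shared allowance for active conversations
HOST_RAM_MB = 4000       # assumption: 4 GB counted as 4000 MB


def total_mb(users: int) -> int:
    """Estimated resident memory for a given number of provisioned users."""
    return sum(FIXED_MB.values()) + users * HERMES_IDLE_MB + PEAK_OVERHEAD_MB


def max_users(budget_mb: int = HOST_RAM_MB) -> int:
    """Largest user count whose estimate still fits in the budget."""
    return (budget_mb - sum(FIXED_MB.values()) - PEAK_OVERHEAD_MB) // HERMES_IDLE_MB


print(total_mb(30))  # 3714
print(max_users())   # 33
```

Summed this way, 30 users come to about 3.7 GB, leaving headroom for only three more idle containers before the host relies on swap, which is consistent with capping the single-host MVP at 30 users.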

Phase 2: Mac Mini Overflow

Trigger: Vultr peak RAM > 70% for a week, or paid users > 30.

  • Provisioner picks host based on current headroom
  • Docker containers are inherently portable; a new user only needs a docker run on the Mac Mini
  • Cloudflare Tunnel for public URL → home Mac

§6 — Operational Concerns

Rate Limiting

  • P1.0: per-token bucket, 60 rpm default, 429 with Retry-After
  • P1.1: soft cap logged but not enforced (observe first)
  • Hard cap enforced in Phase 2
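
A per-token bucket with a 60 rpm default and a Retry-After hint could be sketched as below. This is an in-memory illustration only; TokenBucket and its allow method are hypothetical names, and the real service would keep one bucket per API token and translate a refusal into a 429 response.

```python
import math
from typing import Tuple


class TokenBucket:
    """Token bucket holding up to rate_per_min tokens, refilled continuously."""

    def __init__(self, rate_per_min: int = 60, now: float = 0.0) -> None:
        self.capacity = float(rate_per_min)
        self.refill_per_sec = rate_per_min / 60.0
        self.tokens = self.capacity  # start full so bursts up to capacity pass
        self.updated_at = now

    def allow(self, now: float) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for a request arriving at now."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one whole token is available, rounded up for the
        # Retry-After header.
        wait = (1.0 - self.tokens) / self.refill_per_sec
        return False, math.ceil(wait)
```

A caller would do something like allowed, retry_after = bucket.allow(time.monotonic()) and, when allowed is false, respond with status 429 and a Retry-After header set to retry_after.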

Observability

  • structlog JSON to stdout / journalctl
  • /healthz checks: DB connectivity, Tushare last-success, DashScope last-success
  • Sentry for unhandled exceptions (deferred to P1.5)

Deployment Pipeline

GitHub tag (v0.2.0)
    ↓
GitHub Actions: build Docker image → ghcr.io
    ↓
ssh vultr → docker compose pull → docker compose up -d

§7 — Test Plan

Unit Tests

| Test | What it verifies |
|---|---|
| test_bearer_auth_valid | Valid token → 200 |
| test_bearer_auth_missing | No token → 401 |
| test_bearer_auth_invalid | Wrong token → 401 |
| test_bearer_auth_revoked | Revoked token → 403 |
| test_duckdb_cache_hit | Second same-code request → DuckDB hit, 0 Tushare calls |
| test_duckdb_cache_miss | First request → Tushare call, inserted into DuckDB |
| test_cite_envelope_on_price | Response contains valid cite with all required fields |
| test_cite_envelope_on_fundamentals | Same for fundamentals |
| test_healthz | {"ok": true, "version": "..."} |
| test_cross_source_agreement | Tushare vs pytdx3 diff < 0.01% → no warning logged |
| test_cross_source_disagreement | Diff > 0.01% → warning logged, Tushare still served |
| test_search_dashscope | /search returns DashScope results |
| test_search_fallback | DashScope rate limited → Google Custom Search used |

Integration Tests

| Test | What it verifies |
|---|---|
| test_full_onboarding_flow | Pay → provision → QR → bot responds |
| test_provisioner_idempotent | Double webhook → only one profile created |
| test_payment_webhook_signature | Invalid webhook signature → 403 |

Load Tests (manual)

| Test | What it verifies |
|---|---|
| test_100_sequential_same_code | 100 requests for 600519 → 1 Tushare call |
| test_30_concurrent_different_codes | 30 simultaneous requests → no DuckDB lock errors |

§8 — Files Created / Modified

| File | Action | Phase |
|---|---|---|
| twilight-drive/src/service/main.py | NEW | P1.0 |
| twilight-drive/src/service/auth.py | NEW | P1.0 |
| twilight-drive/src/service/cache.py | NEW | P1.0 |
| twilight-drive/src/service/routes/price.py | NEW | P1.0 |
| twilight-drive/src/service/routes/fundamentals.py | NEW | P1.0 |
| twilight-drive/src/service/routes/search.py | NEW | P1.3 |
| twilight-drive/src/service/routes/healthz.py | NEW | P1.0 |
| twilight-drive/src/service/config.py | NEW | P1.0 |
| twilight-drive/src/provisioner/main.py | NEW | P1.1 |
| twilight-drive/src/provisioner/hermes.py | NEW | P1.1 |
| twilight-drive/src/provisioner/payment.py | NEW | P1.1 |
| twilight-drive/src/landing/ | NEW (Vue SPA) | P1.1 |
| twilight-drive/scripts/deploy-vultr.sh | NEW | P1.0 |
| twilight-drive/scripts/admin.py | NEW | P1.0 |
| twilight-drive/scripts/backfill_pytdx3.py | NEW | P1.2 |
| twilight-drive/scripts/issue-token.py | NEW | P1.0 |
| twilight-drive/tests/test_service_auth.py | NEW | P1.0 |
| twilight-drive/tests/test_service_cache.py | NEW | P1.0 |
| twilight-drive/tests/test_service_routes.py | NEW | P1.0 |
| twilight-drive/tests/test_provisioner.py | NEW | P1.1 |
| twilight-drive/tests/test_integration_onboarding.py | NEW | P1.1 |
| deploy/Dockerfile | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/compose.yml | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/cloudflared-config.yml.template | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/twilight-backend.service | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/twilight-cloudflared.service | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/install.sh | NEW | P1.0 (landed 2026-05-07 in advance) |
| deploy/env.example | NEW | P1.0 (landed 2026-05-07 in advance) |
| twilight-drive/.env.example | MODIFY (add payment/DB vars) | P1.1 |
| twilight-drive/pyproject.toml | MODIFY (add fastapi, httpx, duckdb deps) | P1.0 |

§9 — Open Questions

| # | Question | Status |
|---|---|---|
| 1 | Domain name: .com, .dev, or Vultr IP for alpha? | Open. MVP can launch on IP + self-signed cert. |
| 2 | Open source vs closed source for the service code? | Open. Affects license, data source adapter commit strategy. |
| 3 | Linux secret storage: pass vs systemd-creds vs plain file? | Recommend pass for macOS Keychain symmetry. |
| 4 | Hermes internal API surface: Competence registry, observability hooks? | Needs discovery task against Hermes codebase. |
| 5 | Language for landing page: Chinese primary with English subtitle? | Chinese primary for MVP (target users are CN). |

§10 — Out of Scope (Explicit)

  • Per-user dashboards / billing portals (P2)
  • Lite tier scope enforcement (P2)
  • Real-time intraday stream (P3)
  • Research report PDF ingestion (P2)
  • Cross-source factor generation (P2)
  • Multi-host gateway migration (P3)
  • Stripe integration (P2)
  • Usage caps enforcement (P2 — soft cap logging only in P1)

Internal team document