P1 — Hosted Service Backend

📤 Superseded · internal reference · not public

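This document has been superseded by superpowers/specs/2026-05-07-twilight-drive-phase1.md and is kept only to trace how the design evolved. Its externally facing content (¥198 / Lite tier / WeChat Pay and similar) is now handled by a separate project and is not shown on the public paths of dev.fsagent.cc.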

Status: 🔄 Superseded by ../superpowers/specs/2026-05-07-twilight-drive-phase1.md

Note: Kept in place to trace how the product positioning evolved (the shift in thinking from "data API" to "end-to-end onboarding loop" happened here). The new spec turns this document's 4 product decisions into an executable task list covering P1.0–P1.3.

Goal: Define the hosted SaaS that turns the stock-research skill into a paid product. Users pay once, scan a WeChat QR code, claim a bot, and are done. The skill, hermes profile, API key, Tushare access, and web search are all provisioned and managed backend-side.

Why this is bigger than "a backend"

The first draft of this spec treated P1 as a data-serving API. User input clarified that the actual product is the end-to-end onboarding loop: payment → bot provisioning → WeChat-pairable hermes instance the user just talks to. The data API is one component inside that loop, not the product.

Three implications:

API keys are invisible. Users never see, set, or rotate TWILIGHT_API_TOKEN. The provisioning flow plants it inside the user's hermes profile during setup. (Inspired by sub2api, but stricter — even the key is hidden.)
The product is the bot, not the data. Tushare / akshare / TDX / web-search are all upstream pools we manage on the user's behalf. The user pays for "a research analyst on WeChat", not "5,000 Tushare requests/month".
Two pricing tiers on display, one selectable in MVP. Anchor the price; ship the high tier.

Product surface (MVP)

Pricing tiers (displayed on landing page)

Tier	Price	Selectable in MVP?	Notes
Pro	¥198 / month	✅ Yes	Full skill access, paid Tushare backend, Qwen web search, no usage cap (informally — see below)
Lite	$49 / month	❌ Disabled, for Phase 2	Reduced data scope, no fundamentals depth, slower model tier — pricing displayed for anchoring

The Lite tier exists in the UI to make Pro feel like the obvious choice. Phase 2 will introduce per-tier scope enforcement; for MVP both render but only Pro accepts payment.

Onboarding flow (4 steps, ≤ 5 minutes)

[1] User visits landing page
       ↓ clicks "Buy Pro"
[2] Pay ¥198 (WeChat Pay for MVP; Stripe deferred, see open question 6)
       ↓ payment confirmed via webhook
[3] Backend creates a hermes profile + issues an API key (invisible)
   Backend shows: "Scan this QR code with WeChat to pair your bot"
       ↓
[4] User scans iLink Bot QR → bot is now in their WeChat
   They name the bot ("张老师"), bot replies "可以开始问股票了"
   ↓
   User just chats. No tokens to configure, no keys to copy.

Hermes profile provisioning steps under the hood at [3]:

hermes profile create user-{user_id} --clone-from template-stock-research
Plant TUSHARE_TOKEN, TWILIGHT_API_TOKEN, WEIXIN_TOKEN, DASHSCOPE_API_KEY etc. into the new profile's secret store (macOS: Keychain; Linux: pass or systemd EnvironmentFile)
Create LaunchAgent / systemd unit for the new profile's gateway
Generate iLink Bot QR + return URL to landing page

Phased technical architecture

We build the interface in one shot but the implementation in phases.

P1.0 — MVP backend with closed alpha (target: 2 weeks)

┌───────────┐  HTTPS   ┌────────────┐         ┌─────────┐
│  skill    │ ───────► │  FastAPI   │  cache  │ DuckDB  │
│ _client.py│          │ +Pydantic  │ ──────► │ on disk │
└───────────┘          └─────┬──────┘         └─────────┘
                             │ miss
                             ▼
                       ┌──────────┐
                       │ Tushare  │ (paid Pro account, single token)
                       └──────────┘

/price, /fundamentals, /reports/search (last is 501 until P2) match existing ServiceClient schemas — skill side needs zero changes
DuckDB on disk: cache + future warehouse, single file at /var/lib/twilight/cache.duckdb
API tokens manually issued to alpha users; not yet wired to payment
One Tushare Pro account funded by us; token in systemd EnvironmentFile

Why this is enough for ≤ 50 alpha users: Real query patterns are narrow (most users repeat 3-5 popular tickers), so cache hit ratio after week 1 should be > 95%. Tushare cost stays roughly constant in user count.
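The cache argument above can be illustrated with a read-through cache. This is a minimal sketch, not the backend's actual code: `ReadThroughCache`, `fake_tushare`, and the in-memory dict standing in for the DuckDB file are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (code, trade_date), e.g. ("600519.SH", "20260430")


class ReadThroughCache:
    """Serve from cache; call the upstream only on a miss."""

    def __init__(self, fetch_upstream: Callable[[str, str], float]) -> None:
        self._fetch = fetch_upstream
        self._store: Dict[Key, float] = {}  # stand-in for the DuckDB cache table
        self.hits = 0
        self.misses = 0

    def get_close(self, code: str, trade_date: str) -> float:
        key = (code, trade_date)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._fetch(code, trade_date)  # the only place upstream quota is spent
        self._store[key] = value
        return value


upstream_calls: List[Key] = []


def fake_tushare(code: str, trade_date: str) -> float:
    # Hypothetical upstream: records each call and returns a made-up price.
    upstream_calls.append((code, trade_date))
    return 100.0


cache = ReadThroughCache(fake_tushare)
for _ in range(100):
    cache.get_close("600519.SH", "20260430")

print(len(upstream_calls), cache.hits, cache.misses)  # 1 99 1
```

With repeated requests for the same (code, trade_date), upstream calls depend on the number of distinct keys rather than the number of users, which is the property the cost estimate relies on.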

P1.1 — Onboarding + payment (target: +2 weeks)

[Landing page]   [Payment]   [Provisioner]    [Hermes pool]
  Vue/Next      WeChat Pay   Python service   per-user profile
       \            \              /                /
        \            \            /                /
         ────────►  Postgres (users, payments, api_keys)  ◄────

Adds:

Landing page with two pricing cards (one disabled), payment CTA
Payment integration (WeChat Pay for MVP; Stripe for non-CN cards is deferred to Phase 2, see open question 6)
Provisioner service: receives payment webhook, creates hermes profile, issues API key, returns QR code
Postgres: separate from DuckDB warehouse — stores user identity / billing state, not market data

API tokens still managed manually if needed; the provisioner handles auto-issuance for paid users.

P1.2 — Multi-source data layer (target: +1 week)

User clarified: akshare + yahoo + TDX are complementary, not redundant. They feed verification + factor generation later.

Sources, with role:

Source	What it's good for	When we use it
Tushare Pro (paid)	Daily basic, fundamentals, estimates, fund/HK	Primary for everything we serve
pytdx3 (free, direct TDX socket)	Full historical OHLC, free, no rate cap documented	Bulk historical backfill — saves Tushare points
akshare	Macro, industry classification, alternative ratios	Cross-source enrichment, factor generation
Yahoo Finance	US tickers, ADRs, exchange-rate baseline	Global coverage; ADR / HK cross-reference

P1.2 wires up the additional sources behind the same FastAPI but does not change the contract. Only the source field in the cite envelope reveals which upstream backed a value; served_by always names the backend.

Cross-source verification is a separate P3 concern (factor generation). P1.2 just gets the wires in.

P1.3 — Web search proxy (target: +3 days)

Same logic: user shouldn't manage a search-API key. Backend acquires + meters search quota.

Provider stack:

Provider	Cost	When
DashScope built-in `web_search`	Free for Qwen OAuth users, 1000/day	Default — already part of Qwen-Agent framework hermes uses
Google Custom Search	$5 / 1000 queries above free	Fallback when DashScope rate-limits or returns empty

No in-house search index — we don't have ranking / freshness budget for that.

Backend exposes:

GET /search?query=&since=&max_results=
   → {results: [{title, url, snippet, fetched_at, cite{...}}]}

The skill calls our endpoint instead of DashScope/Google directly; we route + meter on the backend.
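The routing described above amounts to an ordered fallback chain. The sketch below is one way to express it under stated assumptions: `search_with_fallback` and the two stub providers are hypothetical names, and real DashScope or Google Custom Search clients would be plugged in as callables with the same shape.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Result = Dict[str, str]
Provider = Callable[[str, int], List[Result]]


def search_with_fallback(
    providers: Sequence[Tuple[str, Provider]], query: str, max_results: int = 10
) -> Tuple[str, List[Result]]:
    """Try providers in order; move on when one raises or returns nothing."""
    for name, provider in providers:
        try:
            results = provider(query, max_results)
        except Exception:
            continue  # treated like a rate limit or outage
        if results:
            return name, results[:max_results]
    return "none", []


def rate_limited_dashscope(query: str, max_results: int) -> List[Result]:
    # Stub that simulates the primary provider being rate-limited.
    raise RuntimeError("429 from upstream")


def fake_google(query: str, max_results: int) -> List[Result]:
    # Stub fallback provider returning one placeholder result.
    return [{"title": "stub", "url": "https://example.com", "snippet": query}]


served_from, results = search_with_fallback(
    [("dashscope", rate_limited_dashscope), ("google_cse", fake_google)],
    "600519 业绩公告 2026",
)
print(served_from, len(results))  # google_cse 1
```

Returning the provider name alongside the results lets the caller fill in the cite envelope's source field and meter each provider separately.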

P1.5 — Scheduled warehouse (target: when load justifies)

Same as original draft: APScheduler in-process, daily batch pulls of daily(trade_date=...) (1 call → all 5,000 stocks), historical backfill from pytdx3. FastAPI never hits Tushare directly anymore.

P2 — Lite tier + usage caps + multi-tier 4-layer warehouse

Deferred. Spec separately.

Tech selection

Layer	Choice	Why this	Reject
Web framework	FastAPI	core already uses Pydantic v2; reuses `ServiceClient` schemas	Flask, aiohttp
User DB	Postgres	Need real ACID for payments; SQLAlchemy + Alembic well-trodden	DuckDB (no concurrency for writes), SQLite (won't scale)
Data warehouse	DuckDB on disk	100s of concurrent reads fine; single writer (scheduler); no ops	Postgres (overkill for OLAP-style queries)
Cache	Same DuckDB file, ephemeral tables	One less moving part	Redis (extra service for no clear win at this scale)
Scheduler	APScheduler in-process	No extra service	systemd timers, Celery
Auth	Bearer token (sha256-hashed in DB), per-user	sub2api-inspired; user never sees	OAuth (no need yet), JWT (no need yet)
Payment	WeChat Pay (CN only) for MVP	All target users are in China; one processor = one integration	Stripe — deferred to Phase 2 when we have non-CN demand
Process manager	systemd	Vultr Ubuntu / Mac mini both support	Docker Compose
TLS	Caddy in front	Auto-LE certs, 5-line config	nginx
Hosting (single host MVP)	Vultr Cloud Compute, $12/mo, 4GB RAM	Runs FastAPI + Postgres + Provisioner + DuckDB replica + ALL hermes gateways for first ~30 users	Hetzner (CN egress slow), Fly.io
Hosting (batch + future overflow)	Mac mini M4 Pro 48GB at home, Cloudflare Tunnel	Warehouse PRIMARY (writes + nightly fills via pytdx3); rsyncs read-only replica to Vultr. Becomes gateway-overflow target only when Vultr RAM tightens.	DigitalOcean droplet, GCE — same cost, no upside
Tunnel	Cloudflare Tunnel	Public URL → home Mac without exposing IP / opening ports	Tailscale Funnel (smaller free tier), ngrok (paid)
Observability	structlog JSON + `/healthz`	core already uses structlog	Prometheus until needed
Sources for data	Tushare Pro (paid) + pytdx3 (free) + akshare + yfinance	Complementary, see P1.2	Wind / Choice (overkill cost)
Web search	DashScope `web_search` → Google Custom Search fallback	Free first 1k/day, then $5/1k	Tavily (paid), Brave (less reliable for CN)
LLM inference	DashScope cloud (Qwen3)	hermes already uses; Mac mini does NOT need to host models	Local llama.cpp (Mac mini becomes 1-user-at-a-time)
Provisioner	Python service that wraps `hermes profile create`	Hermes already supports profiles	Bespoke script collection

Glossary: two things both called "gateway"

Worth pinning down because the spec touches both:

Term	What	Where it lives
FastAPI service	The HTTP server skills call: `/price`, `/fundamentals`, `/search`, `/admin/*`. One process serving all users.	Single instance on Vultr
Hermes gateway	The per-user agent process: holds a WeChat iLink connection, runs the agent loop on each message, calls our FastAPI for data. Each paying user runs one of these.	One process per user. ~80 MB idle RAM each.

Below, "gateway" alone means the hermes per-user process.

Provisioner — what it is, why it's separate from FastAPI

The provisioner is the control-plane service that runs once per user lifecycle event (signup, plan change, cancel). On signup, the payment webhook fires and the provisioner:

1. Insert user row in Postgres
2. Generate raw_api_key = secrets.token_urlsafe(32)   (in memory only)
3. Hash it, store hash in api_keys table
4. Pick host for this user's hermes gateway          (always Vultr in MVP)
5. Spawn the gateway:
     hermes profile create user-abc123 --clone-from template-stock-research-pro
     security/keychain: store raw_api_key under twilight-drive-api-token
     systemctl --user enable --now hermes@user-abc123
6. Discard raw_api_key from server memory
7. Generate iLink Bot QR via existing hermes weixin tooling
8. Return QR URL to landing page

	FastAPI (data plane)	Provisioner (control plane)
Frequency	Every user request, all day	Once per signup; sporadic
Operations	Read-mostly	Heavy writes (DB + process spawn)
Failure cost	Failed `/price` retries cheaply	Failed provision = paid user with no bot — much worse
Privilege	None special	Needs `hermes profile create` rights
Code style	Pure HTTP request handlers	State machine with retry / compensation

Different ops profile → separate process. Same Python repo / same systemd box, different entry points.
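The "state machine with retry / compensation" style in the table can be sketched as a list of steps, each paired with an undo action. This is an illustrative sketch only: `run_with_compensation` and the step functions are hypothetical, and whether a failed provision should roll back or be retried for a paid user is a decision this spec does not settle.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], None], Callable[[], None]]  # (name, do, undo)


def run_with_compensation(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; on failure, undo completed steps in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, do, _undo = step
        try:
            do()
        except Exception as exc:
            log.append(f"failed:{name}:{exc}")
            for prev_name, _do, prev_undo in reversed(done):
                prev_undo()
                log.append(f"undone:{prev_name}")
            return False, log
        done.append(step)
        log.append(f"done:{name}")
    return True, log


state: Dict[str, bool] = {"user_row": False, "profile": False}


def insert_user() -> None:
    state["user_row"] = True


def delete_user() -> None:
    state["user_row"] = False


def create_profile() -> None:
    state["profile"] = True


def delete_profile() -> None:
    state["profile"] = False


def spawn_gateway() -> None:
    raise RuntimeError("unit failed to start")  # simulated failure


ok, log = run_with_compensation([
    ("insert_user", insert_user, delete_user),
    ("create_profile", create_profile, delete_profile),
    ("spawn_gateway", spawn_gateway, lambda: None),
])
print(ok, state)
```

Here the third step fails, so the two completed steps are undone in reverse order and the log records each transition, which is the audit trail a "paid user with no bot" incident would need.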

Capacity & hosting strategy

Per-user resource profile (steady state)

Hermes gateway disk: ~50 MB (sessions + memories + skills)
Hermes gateway idle: ~80 MB RAM (waiting on WeChat events)
Active conversation: + ~150 MB RAM (transient, only one in-flight per user)
LLM inference: 0 bytes locally — DashScope cloud, gateway just relays

MVP: single-host on Vultr (≤ 30 paying users)

Vultr ($12/mo, 4 GB RAM)              Mac mini at home
├── FastAPI                            ├── Scheduler (APScheduler)
├── Postgres (users + payments)        ├── DuckDB warehouse PRIMARY (writes)
├── Provisioner                        │     └── pytdx3 backfills, daily fills
├── DuckDB warehouse REPLICA           │
└── ALL hermes gateways                │     └── nightly rsync ─→ Vultr replica
                                       └── (no user gateways yet)
   ▲ public via Caddy + Let's Encrypt

For ~30 idle gateways: 30 × 80 MB = 2.4 GB; +0.5 GB peak conversation overhead = ~3 GB. FastAPI / Postgres / Provisioner add ~0.5 GB, for ~3.5 GB in total. The 4 GB Vultr box fits this but is not over-provisioned, so add 2 GB of swap for headroom.
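The arithmetic above can be checked with a few lines of Python. The constants come from the figures in this section; the function names and the use of decimal megabytes (1 GB = 1000 MB, matching "30 × 80 MB = 2.4 GB") are assumptions of this sketch.

```python
IDLE_MB_PER_GATEWAY = 80    # idle gateway RAM from the per-user profile
PEAK_CONVERSATION_MB = 500  # the "+0.5 GB" peak conversation allowance
SERVICES_MB = 500           # FastAPI + Postgres + Provisioner, "~0.5 GB"
BOX_MB = 4000               # 4 GB box in decimal MB


def ram_budget_mb(users: int) -> int:
    """Estimated peak RAM for a given number of provisioned gateways."""
    return users * IDLE_MB_PER_GATEWAY + PEAK_CONVERSATION_MB + SERVICES_MB


def max_users_under(percent: int) -> int:
    """Largest user count whose estimated peak stays at or below percent of the box."""
    budget = BOX_MB * percent // 100 - PEAK_CONVERSATION_MB - SERVICES_MB
    return max(0, budget // IDLE_MB_PER_GATEWAY)


print(ram_budget_mb(30))     # 3400
print(max_users_under(70))   # 22
print(max_users_under(100))  # 37
```

On these figures the estimated peak at 30 users is about 85% of the box, and a 70% peak threshold would be reached near 22 users. Whether the transient conversation allowance should count toward such a threshold is a judgment call this sketch does not make.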

Why not on Mac mini for MVP: WeChat iLink connections from home IP are at the mercy of your ISP / power. A user paying ¥198/month does not accept that "the bot is down because Liang's apartment lost power". Vultr 24×7 commercial uptime is the right host until the cost of upgrading Vultr exceeds the operational cost of multi-host.

Phase 2: add Mac mini gateway overflow (when Vultr RAM tightens)

Trigger: Vultr peak RAM > 70% for a week, or paid users > 30. At that point:

Provisioner picks the host with the most current headroom, i.e. whichever of load(vultr) and load(mac) is lower
New users land on Mac mini; existing users stay where they were provisioned
Migration of an existing user is a P3 concern (means re-pairing the WeChat bot identity)
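The host-selection rule above reduces to picking the least-loaded host. A minimal sketch, assuming load is reported as a RAM-used fraction per host; `pick_host` and the sample numbers are hypothetical.

```python
from typing import Dict


def pick_host(ram_used_fraction: Dict[str, float]) -> str:
    """Return the host with the lowest RAM utilisation (most headroom)."""
    if not ram_used_fraction:
        raise ValueError("no hosts available")
    # On a tie, min() keeps the first host in insertion order.
    return min(ram_used_fraction, key=lambda host: ram_used_fraction[host])


print(pick_host({"vultr-1": 0.74, "mac-mini-1": 0.12}))  # mac-mini-1
```

Because existing users stay where they were provisioned, this function would only be consulted at signup time.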

100 paying users: still fits within (4 GB Vultr + 48 GB Mac mini); the constraint becomes WeChat iLink per-IP connection limits, not RAM. Confirm by load-testing at the 50-user mark.

Sources: Mac mini M4 Pro 48GB benchmarks — local LLM inference is the bottleneck; we sidestep by keeping inference on DashScope cloud.

API surface (P1.0+)

GET  /healthz                          → {"ok": true, "version": "0.2.0"}

# Data plane (skill calls these)
GET  /price?code=600519.SH&trade_date=20260430
                                       → {value, metric, code, as_of, cite{...}}
GET  /fundamentals?code=600519.SH&period=20251231
                                       → {code, as_of, claims:[...]}
GET  /reports/search?code=...&since=...
                                       → 501 Not Implemented (P2)
GET  /search?query=...&max_results=10  → {results:[{title, url, snippet, cite{...}}]}

# Control plane (provisioner / admin)
POST /admin/users                      → create user, returns invisible api_key
POST /admin/payments/webhook           → WeChat Pay / Stripe callbacks
POST /admin/provision/{user_id}        → spawn hermes profile, return QR url
GET  /admin/users/{user_id}/usage      → request counts by endpoint, by day

All /admin/* endpoints require admin scope (separate token class). All data-plane endpoints require a user-scoped bearer token.
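The bearer check on data-plane endpoints can be sketched as a pure function from the Authorization header to a status code. This is an assumption-laden illustration: `auth_status` and `ApiKeyTable` are hypothetical, the dict stands in for the api_keys table, and returning 401 for an unknown token is this sketch's choice (the spec only fixes 401 for a missing bearer and 403 for a revoked one).

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, Optional

# key_hash -> revoked_at (None while the key is live); stand-in for api_keys
ApiKeyTable = Dict[str, Optional[datetime]]


def auth_status(authorization: Optional[str], api_keys: ApiKeyTable) -> int:
    """Map an Authorization header value to 200, 401, or 403."""
    if not authorization or not authorization.startswith("Bearer "):
        return 401  # no bearer token presented
    raw = authorization[len("Bearer "):].strip()
    key_hash = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    if key_hash not in api_keys:
        return 401  # unknown token (assumption: treated like a missing one)
    if api_keys[key_hash] is not None:
        return 403  # token exists but was revoked
    return 200


live = hashlib.sha256(b"live-token").hexdigest()
revoked = hashlib.sha256(b"old-token").hexdigest()
table: ApiKeyTable = {live: None, revoked: datetime(2026, 4, 1, tzinfo=timezone.utc)}

print(auth_status(None, table))                 # 401
print(auth_status("Bearer live-token", table))  # 200
print(auth_status("Bearer old-token", table))   # 403
```

Only the hash is ever looked up, so the backend can verify and revoke keys without being able to display them.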

Cite envelope on hosted responses

json

{
  "kind": "tool",
  "source": "tushare",                  // upstream actually consulted
  "served_by": "twilight-drive-backend",
  "served_version": "0.2.0",
  "table": "daily",
  "fetched_at": "2026-04-30T11:00:00Z",
  "cache_age_seconds": 86400,           // 0 if fresh
  "tool_call_id": "tc_..."
}

Verifier passes through unchanged — kind: "tool" is still the only enum value. New fields are decorative + auditable.
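A small builder shows how the envelope fields relate, in particular how cache_age_seconds follows from fetched_at. This is a sketch under assumptions: `build_cite` is a hypothetical helper, timestamps are assumed to be timezone-aware UTC, and the version string is copied from the example above.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def build_cite(source: str, table: str, fetched_at: datetime,
               now: datetime, tool_call_id: str) -> Dict[str, Any]:
    """Assemble a cite envelope; cache_age_seconds is 0 for a fresh fetch."""
    age = max(0, int((now - fetched_at).total_seconds()))
    return {
        "kind": "tool",                         # still the only enum value
        "source": source,                       # upstream actually consulted
        "served_by": "twilight-drive-backend",
        "served_version": "0.2.0",
        "table": table,
        "fetched_at": fetched_at.strftime("%Y-%m-%dT%H:%M:%SZ"),  # assumes UTC input
        "cache_age_seconds": age,
        "tool_call_id": tool_call_id,
    }


fetched = datetime(2026, 4, 30, 11, 0, 0, tzinfo=timezone.utc)
now = datetime(2026, 5, 1, 11, 0, 0, tzinfo=timezone.utc)
cite = build_cite("tushare", "daily", fetched, now, "tc_example")
print(cite["cache_age_seconds"])  # 86400
```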

Database schema

Postgres (users + billing)

sql

CREATE TABLE users (
  user_id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email          VARCHAR UNIQUE,
  weixin_openid  VARCHAR UNIQUE,
  display_name   VARCHAR,
  plan           VARCHAR NOT NULL DEFAULT 'pro',  -- 'pro', 'lite' (P2)
  status         VARCHAR NOT NULL DEFAULT 'active', -- 'active', 'paused', 'expired'
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  paid_until     TIMESTAMP NOT NULL              -- enforced at gateway entry
);

CREATE TABLE api_keys (
  key_hash       VARCHAR PRIMARY KEY,             -- sha256, raw key shown to nobody after issuance
  user_id        UUID NOT NULL REFERENCES users(user_id),
  scope          VARCHAR DEFAULT 'data:read',
  rate_limit_per_min INT DEFAULT 60,              -- soft cap; later tiered by plan
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  last_used_at   TIMESTAMP,
  revoked_at     TIMESTAMP                         -- nullable
);

CREATE TABLE payments (
  payment_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id        UUID REFERENCES users(user_id),
  amount_cents   INT NOT NULL,
  currency       VARCHAR(3) NOT NULL,              -- 'CNY' or 'USD'
  processor      VARCHAR NOT NULL,                 -- 'wechat_pay' | 'stripe'
  processor_ref  VARCHAR NOT NULL,                 -- their txn id
  status         VARCHAR NOT NULL,                 -- 'pending' | 'paid' | 'refunded'
  created_at     TIMESTAMP NOT NULL DEFAULT now(),
  paid_at        TIMESTAMP
);

CREATE TABLE hermes_profiles (
  profile_name   VARCHAR PRIMARY KEY,              -- e.g. 'user-abc123'
  user_id        UUID NOT NULL REFERENCES users(user_id),
  host           VARCHAR NOT NULL,                 -- 'vultr-1' | 'mac-mini-1'
  weixin_bot_id  VARCHAR,                          -- iLink Bot identity
  status         VARCHAR NOT NULL,                 -- 'provisioning' | 'active' | 'error'
  created_at     TIMESTAMP NOT NULL DEFAULT now()
);

CREATE TABLE usage_log (
  id             BIGSERIAL PRIMARY KEY,
  user_id        UUID NOT NULL,
  endpoint       VARCHAR NOT NULL,                 -- '/price', '/search', etc.
  occurred_at    TIMESTAMP NOT NULL DEFAULT now(),
  upstream_cost_pts INT                              -- Tushare points or DashScope quota
);
CREATE INDEX usage_log_user_time ON usage_log (user_id, occurred_at DESC);

DuckDB (data warehouse) — same as previous spec

daily, daily_basic, fina_indicator, trade_cal — see schema in earlier draft section. The Postgres api_keys lookup is fast enough that we don't co-locate it with DuckDB.

Operational concerns

Secrets on Linux (Vultr — primary serving host) and macOS (Mac mini — batch host)

Secret	Vultr (Linux)	Mac mini
Tushare Pro token	systemd EnvironmentFile, mode 0600	macOS Keychain (`twilight-backend-tushare`) — only for nightly pytdx3/Tushare batch jobs
DashScope API key (backend pool)	systemd EnvironmentFile	not needed (Mac mini doesn't run gateways in MVP)
WeChat Pay webhook secret	systemd EnvironmentFile	not needed
Postgres password	systemd EnvironmentFile	not needed (Postgres only on Vultr)
Cloudflare Tunnel token	not used (Vultr has public IP)	managed by `cloudflared` daemon
Per-user `TWILIGHT_API_TOKEN` (in user's hermes profile)	provisioner plants via `pass`/keyring on the same Vultr box hosting the gateway	n/a in MVP — no user gateways here

User-facing secrets (i.e., TWILIGHT_API_TOKEN in user's hermes profile): provisioner generates, plants into the host's secret store, hands the QR back, and discards the raw value. User never sees it.

Note on Linux secret storage: macOS has Keychain. Linux equivalents we'd consider:

pass (gpg-backed, well-documented) — simple, fits per-user-profile model
systemd-creds — built-in, encrypted at rest with system key
Plain file under /var/lib/twilight/secrets/<user_id> mode 0600 — simplest, fine for single-host MVP

Recommend pass for symmetry with the macOS Keychain pattern; revisit if it gets in the way.
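For comparison, the plain-file option from the list above takes only a few lines. This is a sketch of that simplest option rather than the recommended `pass` route; it assumes a POSIX host, and `write_secret`, the directory layout, and the sample values are hypothetical.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_secret(secrets_dir: Path, user_id: str, raw_value: str) -> Path:
    """Create <secrets_dir>/<user_id> readable and writable by the owner only."""
    secrets_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    path = secrets_dir / user_id
    # O_EXCL refuses to overwrite; the 0o600 mode is applied at creation time.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(raw_value)
    return path


with tempfile.TemporaryDirectory() as tmp:
    secret_path = write_secret(Path(tmp) / "secrets", "user-abc123", "not-a-real-token")
    mode = stat.S_IMODE(secret_path.stat().st_mode)
    content = secret_path.read_text(encoding="utf-8")

print(oct(mode))
```

Passing the mode to `os.open` avoids the window in which a file created with default permissions would be readable before a later chmod.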

sub2api-style API key flow (with the "invisible" twist)

[user pays]
       ↓
[backend: payment webhook]
       ↓
[create user, generate raw_api_key = secrets.token_urlsafe(32)]
       ↓
[hash, store hash in postgres api_keys table]
       ↓
[provision hermes profile on chosen host]
       ↓
[on that host: store $raw_api_key under twilight-drive-api-token (pass on Linux; security add-generic-password on macOS)]
       ↓
[discard raw key from server memory]
       ↓
[return QR code URL to user]

After this flow, the raw token exists ONLY inside the secret store of the user's hermes profile (pass on Vultr, Keychain on macOS). Backend keeps the hash for verification, can revoke (set revoked_at), but cannot recover or display the raw value.

This matches the user's "invisible to him" requirement and exceeds sub2api's default (which exposes the key once).
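The issuance half of the flow above can be sketched as follows. The names `issue_api_key` and `provision` and the two dicts are hypothetical stand-ins for the Postgres table and the profile's secret store; only `secrets.token_urlsafe(32)` and the sha256 hashing come from the spec.

```python
import hashlib
import secrets
from typing import Callable, Dict, Tuple


def issue_api_key() -> Tuple[str, str]:
    """Return (raw_key, key_hash); only key_hash should be persisted."""
    raw_key = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    key_hash = hashlib.sha256(raw_key.encode("ascii")).hexdigest()
    return raw_key, key_hash


def provision(user_id: str, api_keys: Dict[str, str],
              plant_secret: Callable[[str, str], None]) -> None:
    """Issue a key, persist only its hash, hand the raw value to the secret store."""
    raw_key, key_hash = issue_api_key()
    api_keys[key_hash] = user_id
    plant_secret(user_id, raw_key)
    # raw_key goes out of scope here; the backend side keeps no copy of it


api_keys: Dict[str, str] = {}       # key_hash -> user_id, stand-in for Postgres
profile_store: Dict[str, str] = {}  # stand-in for the profile's secret store

provision("user-abc123", api_keys, profile_store.__setitem__)

planted = profile_store["user-abc123"]
print(hashlib.sha256(planted.encode("ascii")).hexdigest() in api_keys)  # True
print(planted in api_keys)                                              # False
```

Note that Python gives no guarantee that the raw string is wiped from process memory once it goes out of scope, so "discard" here means dropping every reference rather than zeroing the bytes.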

Rate limiting / caps

P1.0: per-token rate limit (token bucket), 60 rpm default. Over-limit requests get 429 with Retry-After. Skill side already retries.
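A token bucket with a Retry-After hint can be sketched in a few lines. This is a minimal, single-process illustration: the `TokenBucket` class, the burst capacity of 60, and the injected clock are assumptions, not the backend's actual limiter.

```python
import math
from typing import Tuple


class TokenBucket:
    """One bucket per API token: `capacity` requests, refilled at `rate_per_min`."""

    def __init__(self, rate_per_min: float = 60.0, capacity: float = 60.0,
                 now: float = 0.0) -> None:
        self.rate_per_sec = rate_per_min / 60.0
        self.capacity = capacity
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when allowed."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        wait = (1.0 - self.tokens) / self.rate_per_sec
        return False, math.ceil(wait)  # value for the Retry-After header


bucket = TokenBucket(rate_per_min=60, capacity=60, now=0.0)
outcomes = [bucket.allow(now=0.0)[0] for _ in range(61)]
print(outcomes.count(True), bucket.allow(now=0.0))  # 60 (False, 1)
```

Passing the clock in as an argument keeps the limiter deterministic under test; a request handler would supply `time.monotonic()` and translate a denied call into a 429 response.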

P1.x: optional caps wired into the rate limiter — daily caps per plan (deferred per user instruction). Soft caps (logged but not enforced) come first so we can observe usage before deciding cutoffs.

Deployment

Two boxes. CI builds + tags artifacts; deploy is ssh && pull the tagged artifact && systemctl restart.

GitHub tag (v0.2.0)
    ↓
GitHub Actions: build wheels + Docker image → ghcr.io (private)
    ↓
deploy-vultr.yml:    ssh vultr → systemctl restart twilight-service
deploy-mac-mini.yml: ssh mac-mini → launchctl restart of the twilight-batch LaunchAgent (macOS has no systemctl)
                       (only after P1.5 — until then mac-mini is dormant)

Per-user hermes profile lifecycle (MVP — single host):

Provisioner runs on Vultr, host=Vultr always
hermes profile create user-abc123 --clone-from template-stock-research-pro
Plant TWILIGHT_API_TOKEN (and any per-user secrets) via pass
Register systemctl --user unit for the new gateway
Generate iLink Bot QR via existing hermes weixin tooling
Return QR URL to landing page

Phase 2: when Vultr RAM tightens, provisioner gains a host-selection step (pick whichever of Vultr + Mac mini has more headroom). Until then it's hardcoded to Vultr.

Observability

structlog JSON to stdout/journalctl on Linux, Console on macOS
/healthz checks: DB connectivity, Tushare last-success timestamp, DashScope last-success timestamp
/metrics deferred to P1.5 — Prometheus client is one decorator when needed
Sentry for unhandled exceptions

Open questions

#	Question	Status
1	Domain	Still open, but not blocking. MVP can launch on Vultr IP + self-signed; rotate to a real domain (`twilight-drive.com` or `.dev`, ~$15/yr) before opening to non-alpha users. Cloudflare in front handles TLS regardless.
2	Tushare account	✅ Resolved — paid Tushare Pro account funded by the project. Tinyshare (piwheels) is a bytecode-protected wrapper of Tushare meant for individual research; proxying paying users through it would violate its scope. ¥200–2000/yr Tushare Pro is the right line.
3	Multi-source role	✅ Resolved — akshare + yfinance + TDX are complementary, not redundancy. Roles: enrichment, factor generation, cross-source verification at the data-quality layer. P1.2 wires them in behind the same API.
4	User onboarding	✅ Resolved — pay → confirm → WeChat QR → claim bot with name → use. API key invisible to user.
5	Pricing	✅ Resolved — ¥198/mo Pro (selectable), $49/mo Lite (display-only, Phase 2).
6	Payment processor	✅ Resolved — WeChat Pay only for MVP. Target users are CN. Stripe deferred to Phase 2.
7	Mac mini ↔ Vultr load split	✅ Resolved — single-host on Vultr for MVP (≤ 30 users): all serving processes (FastAPI, Postgres, Provisioner, gateways) live there. Mac mini is warehouse PRIMARY (writes + nightly batch via pytdx3) with rsync replica → Vultr. Mac mini gains gateway overflow only when Vultr RAM tightens.
8	Provisioner host choice	✅ Resolved — Vultr, same box as FastAPI. Same secrets, same operational surface.

Success criteria

P1.0 (closed alpha) ships when:

[ ] FastAPI on Vultr, 4 endpoints, bearer auth
[ ] DuckDB cache returns 600519's last close in < 50ms after warmup
[ ] One-command deploy from local
[ ] Verifier passes on responses (cite envelope intact)
[ ] Auth: requests without bearer → 401; revoked → 403
[ ] Survives 100 sequential same-code requests with 1 Tushare call total

P1.1 (paid product) ships when:

[ ] Landing page live with 2 pricing cards (1 disabled)
[ ] Pay ¥198 in WeChat Pay → user record created in Postgres
[ ] Provisioner spawns hermes profile, returns QR
[ ] User scans QR, names bot, sends "查 600519 的 P/E" → cited answer in WeChat
[ ] User never copy-pasted any token, key, or URL

P1.2 (multi-source) ships when:

[ ] pytdx3 backfilled daily for ≥ 5 years for top 300 stocks
[ ] akshare + yfinance plugged in behind /price for non-A-share queries
[ ] cross-source agreement check logs a warning if Tushare and pytdx3 differ on the same (code, trade_date, close) by > 0.01%
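The agreement check in the last item can be sketched as a relative-difference comparison. This is an illustration under assumptions: `closes_agree`, the logger name, and the choice of the larger close as the baseline are hypothetical; only the 0.01% threshold and the warning behaviour come from the criterion above.

```python
import logging

logger = logging.getLogger("twilight.crosscheck")  # hypothetical logger name

TOLERANCE = 0.0001  # 0.01% expressed as a fraction


def closes_agree(code: str, trade_date: str, tushare_close: float,
                 tdx_close: float, tolerance: float = TOLERANCE) -> bool:
    """Return True when the two closes differ by at most `tolerance` (relative)."""
    baseline = max(abs(tushare_close), abs(tdx_close))
    if baseline == 0:
        return True  # both values are zero: nothing to compare
    rel_diff = abs(tushare_close - tdx_close) / baseline
    if rel_diff > tolerance:
        logger.warning("close mismatch %s %s: tushare=%s tdx=%s rel_diff=%.6f",
                       code, trade_date, tushare_close, tdx_close, rel_diff)
        return False
    return True


print(closes_agree("600519.SH", "20260430", 1500.00, 1500.00))  # True
print(closes_agree("600519.SH", "20260430", 1500.00, 1500.50))  # False
```

The check only logs and reports; deciding which source wins on a mismatch is left to the later verification work.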

P1.3 (web search) ships when:

[ ] /search proxy returns DashScope results for query="600519 业绩公告 2026"
[ ] DashScope quota exhausted → fallback to Google Custom Search transparent
[ ] Cite envelope on each search result has served_by: "twilight-drive-backend" + the original source URL

Out of scope (explicit)

Per-user dashboards / billing portals (P2 / P3)
Lite tier scope enforcement
Real-time / intraday streaming
WebSocket bot transports (we're WeChat-only for MVP)
Multi-region failover
Local on-device LLM inference (DashScope owns it)
In-house factor generation (P3)
Research-report ingestion (P2)

Estimated effort

Phase	Scope	Effort	Calendar
P1.0	Backend data API, single host, manual API keys	~10h	2-3 sessions
P1.1	Landing + payment + provisioner + WeChat QR loop	~12-15h	3-4 sessions
P1.2	pytdx3 + akshare + yfinance behind same API	~5-6h	1-2 sessions
P1.3	`/search` proxy with DashScope → Google fallback	~3h	1 session
P1.5	APScheduler warehouse fill	~6h	1-2 sessions
P1.7	Mac mini deployment + Cloudflare Tunnel + load split	~4h	1 session
Total to first paying user	P1.0 + P1.1 minimum	~22-25h	~3 weeks part-time

Next step

All blocking decisions are resolved. Domain (Q1) is non-blocking — MVP can launch against the Vultr IP and rotate to a real domain before opening to non-alpha users.

Next concrete output: an executable task plan in docs/planning/plans/2026-MM-DD-p1-mvp-tasks.md, modelled on 2026-04-28-03-agent-framework-mvp.md (TDD-style, atomic commits, ordered task list). Plan that work when ready.

Sources consulted in this revision

sub2api — invisible API key inspiration
pytdx3 — pure-Python TDX socket client (no local install, full historical OHLC, free)
Qwen Code Web Search Tool — DashScope free tier, 1k/day, Tavily / Google fallback
Mac mini M4 Pro 48GB benchmarks — confirms 100-user feasible if inference is offloaded
Tushare official
Tinyshare (piwheels) — wrapper of Tushare with bytecode-protected Multi-Version
