
Deploy Pattern — Implementation Plan

Status: ⏳ Drafted 2026-05-07 · awaiting approval

Source spec: docs/planning/09-deployment-pattern.md

Scope: Land the deploy/ skeleton in the repo (Dockerfile, compose, systemd units, install.sh) so that Week 2's P1.0 backend can ship into a tested deployment lane.

Goal: Translate the spec's Risk Register into actual config files + scripts the operator can run on Vultr (and CI can build images for) without inventing topology at deploy time.

Tech stack: Docker Compose, systemd-user, cloudflared named tunnel, GitHub Actions OIDC → ghcr.io. No code logic in this plan — that's Week 2 P1.0.

Repo surface:

deploy/                                  # NEW directory
├── README.md
├── Dockerfile                           # backend image (builds against src/service/, lands W2)
├── compose.yml                          # docker-compose for backend (+ future postgres)
├── env.example                          # symlinked / kept-in-sync with profile template
├── twilight-backend.service             # systemd-user wraps `docker compose up` (foreground, see Task 4)
├── twilight-cloudflared.service         # systemd-user runs cloudflared tunnel
├── cloudflared-config.yml.template      # ingress rules with <UUID> placeholder
└── install.sh                           # one-shot installer (mirrors PRD's deploy/install.sh)

.github/workflows/release.yml            # MODIFY: add docker build + push job
docs/planning/superpowers/specs/2026-05-07-twilight-drive-phase1.md  # MODIFY: drop Caddy

Task 1: Dockerfile

Files:

  • Create: deploy/Dockerfile

  • [ ] Step 1.1: Base image: FROM python:3.11-slim@sha256:<digest> (pin a specific digest; Dependabot updates it)

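  • [ ] Step 1.2: Non-root user: RUN useradd --uid 10001 --no-create-home twilight, then USER twilight (place the USER directive after the dependency install in Step 1.4, since uv pip install --system writes to the system site-packages and needs root)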

  • [ ] Step 1.3: Working dir: /app owned by twilight; uv installed via pip install uv (~25 MB; gives us reproducible installs)

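  • [ ] Step 1.4: Dependencies: COPY core/pyproject.toml core/uv.lock ./core/, then install from the lock. uv.lock is not a requirements file, so one option is to export it first (uv export --frozen --no-emit-project -o requirements.txt, run inside core/) and then run uv pip install --system --require-hashes -r requirements.txt. Until uv.lock exists, fall back to pip install -e ./core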

  • [ ] Step 1.5: Source: COPY core/ ./core/ and COPY src/ ./src/ (src/ lands in W2)

  • [ ] Step 1.6: Entrypoint: CMD ["uvicorn", "service.main:app", "--host", "0.0.0.0", "--port", "8080"]

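  • [ ] Step 1.7: Healthcheck baked into image: HEALTHCHECK --interval=30s --timeout=5s CMD curl -fsS http://127.0.0.1:8080/healthz || exit 1 (python:3.11-slim does not ship curl, so either install it in the image or probe with Python's urllib instead)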

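  • [ ] Step 1.8: Build smoke: docker buildx build -f deploy/Dockerfile -t twilight-drive:dev . (run from the repo root, because the Dockerfile copies core/ and src/, which sit outside deploy/) should fail cleanly with "missing src/" before W2 and pass after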

Note: Until W2 ships src/service/main.py, the Dockerfile is documentation; CI will skip the build step (or run it as continue-on-error: true).
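
The guard described in the note above could be scripted roughly as follows. This is a minimal sketch, not the project's actual CI step: the function name `build_smoke` and the `DOCKER` override variable are hypothetical, and it assumes the build context is the repository root so that the `COPY core/` and `COPY src/` steps resolve. It takes the "skip" option from the note rather than letting the build fail.

```shell
# Hypothetical build-smoke helper: skip the image build until the W2
# entrypoint exists, otherwise build with the repo root as context.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
build_smoke() {
    local repo_root="${1:-.}"
    local docker_bin="${DOCKER:-docker}"

    if [ ! -f "$repo_root/src/service/main.py" ]; then
        echo "SKIP: src/service/main.py missing (lands in W2)" >&2
        return 0
    fi

    # The Dockerfile lives in deploy/, but core/ and src/ are at the root.
    "$docker_bin" buildx build \
        -f "$repo_root/deploy/Dockerfile" \
        -t twilight-drive:dev \
        "$repo_root"
}
```

Running `DOCKER=echo build_smoke .` prints the command that would run without invoking Docker, which is handy on a machine with no daemon.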

Task 2: docker-compose

Files:

  • Create: deploy/compose.yml

  • [ ] Step 2.1: Service backend:

    • image: ghcr.io/lacatfly/twilight-drive:${TWILIGHT_VERSION:-latest} (operator overrides via env)
    • container_name: twilight-backend
    • restart: unless-stopped
    • ports: ["127.0.0.1:8081:8080"]
    • env_file: ../.env (the operator's ~/twilight/.env)
    • volumes: ["../data:/data"] (DuckDB cache persists; container runs read-only)
    • read_only: true
    • tmpfs: ["/tmp"]
    • cap_drop: [ALL]
    • security_opt: ["no-new-privileges:true"]
    • deploy.resources.limits: {cpus: "1.5", memory: "1500m"}
    • logging.driver: "json-file", options: {max-size: "10m", max-file: "3"}
    • healthcheck: inherited from the Dockerfile
  • [ ] Step 2.2: Service postgres (commented for v0.2.0; uncomment for P1.1):

    • image: postgres:16-alpine@sha256:<digest>
    • volumes: ["../postgres-data:/var/lib/postgresql/data"]
    • env_file: ../.env-postgres (separate file; only POSTGRES_* keys)
    • restart: unless-stopped
    • Resource limits + log rotation
  • [ ] Step 2.3: Network — implicit twilight_default bridge; no network_mode: host; no docker.sock mounting in any service
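
The network rules in Step 2.3 lend themselves to a cheap text lint that can run before `docker compose config`. The sketch below is an illustration only: `lint_compose` is a hypothetical helper, and it relies on simple pattern matching rather than real YAML parsing, so it is a first-pass check and not a substitute for review.

```shell
# Hypothetical lint for Step 2.3: reject host networking and docker.sock
# mounts. Full-line YAML comments are ignored so the parked postgres
# service does not trip the check.
lint_compose() {
    local file="$1"
    local active
    active=$(grep -v '^[[:space:]]*#' "$file")

    if printf '%s\n' "$active" | grep -Eq 'network_mode:[[:space:]]*"?host'; then
        echo "FAIL: network_mode: host is not allowed" >&2
        return 1
    fi
    if printf '%s\n' "$active" | grep -q 'docker\.sock'; then
        echo "FAIL: docker.sock must not be mounted" >&2
        return 1
    fi
    echo "OK: $file"
}
```

A call such as `lint_compose deploy/compose.yml` would fit alongside the verification steps in Task 10.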

Task 3: cloudflared config template

Files:

  • Create: deploy/cloudflared-config.yml.template

  • [ ] Step 3.1: Tunnel block — UUID placeholder + creds path:

    yaml
    tunnel: {{TUNNEL_UUID}}
    credentials-file: /home/{{USER}}/.cloudflared/twilight-backend-{{TUNNEL_UUID}}.json
  • [ ] Step 3.2: Ingress rules — single hostname for v0.2.0:

    yaml
    ingress:
      - hostname: api.fsagent.cc
        service: http://127.0.0.1:8081
        originRequest:
          connectTimeout: 30s
          noTLSVerify: false
      - service: http_status:404
  • [ ] Step 3.3: Logging + metrics:

    yaml
    loglevel: info
    no-autoupdate: true   # we control updates via apt or manual
    metrics: 127.0.0.1:9090   # local-only Prometheus endpoint, optional
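
Rendering the template comes down to replacing the two placeholders. The following is a minimal sketch under stated assumptions: `render_cloudflared_config` is a hypothetical name, the UUID and user are passed in as arguments, and any placeholder left unresolved is treated as an error so a half-rendered file never reaches cloudflared.

```shell
# Hypothetical renderer for the {{TUNNEL_UUID}} and {{USER}} placeholders.
# Prints the rendered config on stdout; fails if any "{{" survives.
render_cloudflared_config() {
    local template="$1" uuid="$2" user="$3"
    local rendered

    # '|' as the sed delimiter avoids clashing with '/' in paths.
    rendered=$(sed -e "s|{{TUNNEL_UUID}}|$uuid|g" \
                   -e "s|{{USER}}|$user|g" "$template") || return 1

    if printf '%s\n' "$rendered" | grep -q '{{'; then
        echo "ERROR: unresolved placeholder left in $template" >&2
        return 1
    fi
    printf '%s\n' "$rendered"
}
```

In install.sh the output would be redirected to ~/.cloudflared/twilight-config.yml, with the UUID read from ~/.cloudflared/TWILIGHT_TUNNEL_UUID as described in Step 5.3.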

Task 4: systemd units

Files:

  • Create: deploy/twilight-backend.service

  • Create: deploy/twilight-cloudflared.service

  • [ ] Step 4.1: twilight-backend.service — wrap docker compose:

    ini
    [Unit]
    Description=Twilight Drive Backend (Docker)
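    # Assumes docker.service is visible to the user manager (rootless Docker);
    # a system-wide docker.service cannot be referenced from a --user unit.
    Requires=docker.service
    After=docker.service network-online.target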
    
    [Service]
    Type=simple
    WorkingDirectory=<home>/twilight
    ExecStartPre=-/usr/bin/docker compose down
    ExecStart=/usr/bin/docker compose up
    ExecStop=/usr/bin/docker compose down
    Restart=always
    RestartSec=10
    StandardOutput=append:<home>/twilight/logs/backend.log
    StandardError=append:<home>/twilight/logs/backend.log
    
    [Install]
    WantedBy=default.target
  • [ ] Step 4.2: twilight-cloudflared.service — independent tunnel:

    ini
    [Unit]
    Description=Cloudflare Tunnel for Twilight Drive Backend
    After=network.target twilight-backend.service
    
    [Service]
    Type=simple
    ExecStart=<home>/.local/bin/cloudflared --config <home>/.cloudflared/twilight-config.yml tunnel run twilight-backend
    StandardOutput=append:<home>/twilight/logs/cloudflared.log
    StandardError=append:<home>/twilight/logs/cloudflared.log
    Restart=always
    RestartSec=10
    
    [Install]
    WantedBy=default.target
  • [ ] Step 4.3: Naming check — install.sh refuses to install if poly-trade-*.service is owned by a different user (it's all openclaw so this is a sanity check, not a real risk)
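
The ownership check in Step 4.3 could be expressed with `find`. This is a sketch, assuming the poly-trade-* units live in a directory passed as the first argument (the plan does not say where they are installed); `check_unit_owner` is a hypothetical name.

```shell
# Hypothetical guard for Step 4.3: refuse when any poly-trade-*.service in
# the given directory is owned by someone other than the current user.
check_unit_owner() {
    local unit_dir="$1"
    local foreign

    [ -d "$unit_dir" ] || return 0   # nothing installed yet

    # Numeric UID comparison works even if the user name cannot be resolved.
    foreign=$(find "$unit_dir" -maxdepth 1 -name 'poly-trade-*.service' \
        ! -user "$(id -u)")

    if [ -n "$foreign" ]; then
        echo "REFUSE: units owned by another user:" >&2
        printf '%s\n' "$foreign" >&2
        return 1
    fi
}
```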

Task 5: install.sh

Files:

  • Create: deploy/install.sh

  • [ ] Step 5.1: Idempotent dirs: mkdir -p ~/twilight/{data,logs} and ~/.config/systemd/user

  • [ ] Step 5.2: Substitute <home> and <user> — sed replace in service templates → write to ~/.config/systemd/user/

  • [ ] Step 5.3: Substitute {{TUNNEL_UUID}} + {{USER}} — read from ~/.cloudflared/TWILIGHT_TUNNEL_UUID (created by tunnel-create step) → write ~/.cloudflared/twilight-config.yml

  • [ ] Step 5.4: Reload systemd: systemctl --user daemon-reload

  • [ ] Step 5.5: Enable + start: systemctl --user enable --now twilight-backend.service twilight-cloudflared.service

  • [ ] Step 5.6: Linger: loginctl enable-linger "$(whoami)" || warn

  • [ ] Step 5.7: Health check: curl -fsS http://127.0.0.1:8081/healthz (until the W2 backend ships, nothing listens on that port, so expect a connection failure rather than an HTTP status; install.sh prints a WARNING but doesn't fail)

  • [ ] Step 5.8: Print next steps — Cloudflare Access policy creation reminder + cloudflared tunnel info twilight-backend URL
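
The "warn but don't fail" behaviour of the health check in Step 5.7 might look like the sketch below. The function name `health_check`, the retry count, the delay, and the `CURL` override are all assumptions made for illustration; the plan only specifies the URL and the warning semantics.

```shell
# Hypothetical health probe for Step 5.7: retry a few times, then warn
# instead of failing so the installer still completes before W2.
# CURL can be overridden (e.g. CURL=true) when testing the script.
health_check() {
    local url="${1:-http://127.0.0.1:8081/healthz}"
    local attempts="${2:-5}" delay="${3:-2}"
    local i=1

    while [ "$i" -le "$attempts" ]; do
        if "${CURL:-curl}" -fsS --max-time 5 "$url" >/dev/null 2>&1; then
            echo "OK: backend healthy at $url"
            return 0
        fi
        i=$((i + 1))
        # Sleep only between attempts, not after the last one.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done

    echo "WARNING: backend not reachable at $url (expected until W2 ships)" >&2
    return 0
}
```

Returning 0 in both branches is deliberate here: the caller can grep stderr for the warning, but a missing backend never aborts the install.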

Task 6: env.example synced with profile template

Files:

  • Create: deploy/env.example

  • [ ] Step 6.1: Decide source: profile/template-stock-research-pro/.env.example is the user-profile env; backend deployment env has different keys (TWILIGHT_BEARER_DB_PATH, DUCKDB_PATH, LISTEN_HOST, LOG_LEVEL, plus shared TUSHARE_TOKEN, SILICONFLOW_API_KEY)

  • [ ] Step 6.2: Write standalone deploy env.example — backend-specific keys with comments; cross-link to profile template for skill-side keys

  • [ ] Step 6.3: install.sh checks ~/twilight/.env exists — refuses to start if not (prints "copy deploy/env.example and fill")
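
The existence check in Step 6.3 can be extended to confirm that the keys from Step 6.1 are actually filled in. The sketch below assumes all six keys are mandatory, which the plan does not state explicitly; `check_env_file` is a hypothetical helper name.

```shell
# Hypothetical check for Step 6.3: the env file must exist and define each
# backend key from Step 6.1 with a non-empty value.
check_env_file() {
    local env_file="$1" missing="" key

    if [ ! -f "$env_file" ]; then
        echo "ERROR: $env_file not found; copy deploy/env.example and fill it in" >&2
        return 1
    fi

    for key in TWILIGHT_BEARER_DB_PATH DUCKDB_PATH LISTEN_HOST LOG_LEVEL \
               TUSHARE_TOKEN SILICONFLOW_API_KEY; do
        # A key counts as set when a line reads KEY=<non-empty value>.
        grep -Eq "^${key}=.+" "$env_file" || missing="$missing $key"
    done

    if [ -n "$missing" ]; then
        echo "ERROR: missing or empty keys:$missing" >&2
        return 1
    fi
}
```

In install.sh this would run as `check_env_file ~/twilight/.env || exit 1` before any service is enabled.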

Task 7: GitHub Actions — Docker build + push

Files:

  • Modify: .github/workflows/release.yml

  • [ ] Step 7.1: New job docker — runs after wheel + skill tarball jobs succeed

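  • [ ] Step 7.2: OIDC auth to ghcr.io: permissions: { packages: write, id-token: write }; use docker/login-action with GITHUB_TOKEN (the login itself is token-based: packages: write authorizes the push, while id-token: write only matters if a later step requests an OIDC token, for example to sign the image)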

  • [ ] Step 7.3: Build + push two tags: :0.X.Y (semver) + :latest; pushed under ghcr.io/lacatfly/twilight-drive

  • [ ] Step 7.4: Output digest: docker/build-push-action → outputs digest; printed to job summary

  • [ ] Step 7.5: Skip on src/ missing — guard step if: hashFiles('src/service/main.py') != ''; until W2 lands, the docker job no-ops with a warning instead of failing the release
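
The two image references from Step 7.3 can be derived from the release ref in a few lines of shell. This is only a sketch: it assumes release tags look like vX.Y.Z, which the plan does not confirm, and `image_tags` is a hypothetical helper. A workflow could equally compute the tags with an existing action.

```shell
# Hypothetical helper: turn a release ref such as "v0.2.0" into the two
# image references from Step 7.3.
image_tags() {
    local ref="$1"
    local version="${ref#v}"   # strip an optional leading "v"
    local repo="ghcr.io/lacatfly/twilight-drive"

    # Accept only plain X.Y.Z versions; anything else is rejected.
    if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
        echo "ERROR: '$ref' is not a vX.Y.Z release tag" >&2
        return 1
    fi

    printf '%s:%s\n%s:latest\n' "$repo" "$version" "$repo"
}
```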

Task 8: Update phase1 spec — drop Caddy

Files:

  • Modify: docs/planning/superpowers/specs/2026-05-07-twilight-drive-phase1.md

  • [ ] Step 8.1: Replace "Caddy reverse proxy" mentions with "cloudflared named tunnel"

  • [ ] Step 8.2: Cross-link to 09-deployment-pattern.md for the operator-facing details

  • [ ] Step 8.3: Update success criteria — "401 without bearer / 403 with revoked" still applies; add "CF Access wraps the bearer with Email OTP / service token"

Task 9: Pages custom domain (operator action)

Not in this PR — operator does this manually in Cloudflare dashboard:

  • [ ] Add custom domain dev.fsagent.cc to twilight-drive Pages project
  • [ ] Update VitePress docs/.vitepress/config.ts site URL to the custom domain (cosmetic; CF Pages handles the redirect either way)

Task 10: Verification

  • [ ] Step 10.1: bash -n deploy/install.sh — syntax check
  • [ ] Step 10.2: shellcheck deploy/install.sh — shellcheck clean
  • [ ] Step 10.3: docker compose -f deploy/compose.yml config — yaml schema validates (without actually pulling)
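  • [ ] Step 10.4: cloudflared --config ~/.cloudflared/twilight-config.yml tunnel ingress validate: run this against the rendered config, because the raw template still contains the {{TUNNEL_UUID}} placeholder; once the UUID is filled, this should report "OK"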
  • [ ] Step 10.5: Dry-run install.sh on Mac — point HOME to /tmp/twilight-test, run install.sh, confirm files written to expected paths (won't start services because no docker daemon on dev Mac, but file generation should work)
  • [ ] Step 10.6: Real install on VPS — operator runs the bootstrap (one-shot section in 09-deployment-pattern.md); install.sh enables services; cloudflared tunnel comes up; curl https://api.fsagent.cc/healthz from outside Cloudflare Access (should 403) and from authorized browser (should 200 once W2 backend ships)
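
Steps 10.1 to 10.3 are static checks and could be bundled into one runner. The sketch below is an assumption about how that might be wired up: `run_static_checks` is a hypothetical name, and tools that are not installed are skipped rather than treated as failures, which suits a dev Mac without Docker.

```shell
# Hypothetical runner for Steps 10.1-10.3: run each static check whose tool
# is installed, skip the rest, and return non-zero if any check fails.
run_static_checks() {
    local deploy_dir="${1:-deploy}" failed=0

    # Step 10.1: syntax check only, nothing is executed.
    bash -n "$deploy_dir/install.sh" || failed=1

    # Step 10.2: lint, if shellcheck is available.
    if command -v shellcheck >/dev/null 2>&1; then
        shellcheck "$deploy_dir/install.sh" || failed=1
    else
        echo "SKIP: shellcheck not installed" >&2
    fi

    # Step 10.3: validate the compose file without printing or pulling.
    if command -v docker >/dev/null 2>&1; then
        docker compose -f "$deploy_dir/compose.yml" config --quiet || failed=1
    else
        echo "SKIP: docker not installed" >&2
    fi

    return "$failed"
}
```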

Risks / Mitigations

| Risk | Mitigation |
| --- | --- |
| Dockerfile builds against src/ that doesn't exist yet | Land Dockerfile + compose anyway; CI guard hashFiles('src/service/main.py') != ''; install.sh prints WARN if backend unreachable |
| install.sh edits live systemd state on operator's machine | Provide a dry-run mode (DRY_RUN=1) that emits files to a temp dir for review |
| Operator forgets to create CF Access policy → backend exposed | install.sh prints policy-creation reminder + Zero Trust URL; W2 backend bearer auth is the second line of defense (defense in depth) |
| Dependabot churns the base image SHA every week | Group base-image PRs in 7-day cooldowns (already in our Actions defaults per CLAUDE.md); review monthly |

Out of scope for this plan

  • Actually writing src/service/main.py (Week 2 P1.0)
  • Postgres bootstrap (P1.1)
  • New-API token gateway (P2-C)
  • Network egress allowlist (to be evaluated in P2)
  • Migrating PRD to rootless Docker (to be evaluated in P2)

Further reading

Internal team documentation