Bug Fix Sweep Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
Goal: Fix 12 bugs across Python backend, core data providers, MCP proxy, and NestJS app-backend — all identified by code review on 2026-05-16.
Architecture: Independent subsystems (Python service, core, and mcp; TypeScript backend). Within each subsystem, fixes are ordered by dependency (fix the callee before the caller). Python and TypeScript changes can go on separate feature branches or share a single branch.
Tech Stack: Python 3.11 (FastAPI, APScheduler, DuckDB, httpx), TypeScript (NestJS, Prisma, Axios), pytest
Files Modified
| File | Bug |
|---|---|
| `core/core/data/_market.py` | L71: `<=` → `<` for CLOSE_TIME |
| `core/core/data/tushare.py` | L41: `data["data"]` can be None, chained `.get()` raises AttributeError |
| `core/core/data/audit_log.py` | L207: `if value_b` relies on 0.0 being falsy; use an explicit `!= 0` |
| `core/core/data/providers/tushare_daily.py` | L83: no explicit sort on `adj_items[0]`; L161-168: None factor silently stored |
| `mcp/tushare_mcp/proxy.py` | L131: catches all Exception incl. programming errors; narrow to `(duckdb.Error, RuntimeError)` |
| `src/service/cache.py` | L103, L166: concurrent `put()` → DuckDB file-lock conflict; add `threading.Lock` |
| `src/service/auth.py` | L75: `paid_until` string vs date comparison; use `date.fromisoformat()` |
| `backend/src/modules/newapi/newapi.service.ts` | L51: missing `await` on retry path |
| `backend/src/modules/billing/billing.service.ts` | L213: ignores `provisionPaidOrder` return value |
| `backend/src/modules/provisioning/provisioning.service.ts` | L72-73: NewAPI failure logged but instance stays PENDING; mark DEGRADED |
| `backend/src/modules/auth/auth.service.ts` | L171-173, L299-301: OTP race, two concurrent verifies both succeed |
| `backend/src/modules/provisioning/provision-worker.service.ts` | L70: attempts read from stale pre-claim task snapshot |
Phase 1: Python core/core/data fixes (update the existing tests)
Task 1: Fix _market.py - 15:00 boundary
Files:
- Modify: `core/core/data/_market.py:71`
- Test: `core/tests/unit/test_data_market.py` (existing)

- [ ] Step 1: Read the existing test to understand coverage
```bash
cat core/tests/unit/test_data_market.py
```
- [ ] Step 2: Add a failing test for the 15:00 boundary
In core/tests/unit/test_data_market.py, add:
```python
from datetime import datetime
from zoneinfo import ZoneInfo

from core.data._market import is_market_open


def test_is_market_open_at_close_time_returns_false():
    """15:00:00 exactly is CLOSED, not open."""
    now = datetime(2024, 1, 2, 15, 0, 0, tzinfo=ZoneInfo("Asia/Shanghai"))  # Tuesday
    assert is_market_open(now=now) is False


def test_is_market_open_just_before_close():
    """14:59:59 is still open."""
    now = datetime(2024, 1, 2, 14, 59, 59, tzinfo=ZoneInfo("Asia/Shanghai"))
    assert is_market_open(now=now) is True
```
- [ ] Step 3: Run to confirm fail
```bash
cd core && python -m pytest tests/unit/test_data_market.py::test_is_market_open_at_close_time_returns_false -v
```
Expected: FAIL (currently returns True at 15:00).
- [ ] Step 4: Fix `_market.py`
In core/core/data/_market.py, line 71, change:
```python
if not (_OPEN_TIME <= now.time() <= _CLOSE_TIME):
```
to:
```python
if not (_OPEN_TIME <= now.time() < _CLOSE_TIME):
```
- [ ] Step 5: Run tests
```bash
cd core && python -m pytest tests/unit/test_data_market.py -v
```
Expected: all pass.
- [ ] Step 6: Commit
```bash
git add core/core/data/_market.py core/tests/unit/test_data_market.py
git commit -m "fix(market): exclude 15:00 from open window (< not <=)"
```

Task 2: Fix tushare.py - data["data"] None AttributeError
Files:
- Modify: `core/core/data/tushare.py:41`

Current code:
```python
items = data.get("data", {}).get("items") or []
```
Problem: if data["data"] is None (the key exists but its value is null), data.get("data", {}) returns None rather than the default, and the chained .get("items") raises AttributeError.
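The difference between a missing key and a key that holds None can be checked in isolation. The sketch below is illustrative only: `extract_items_unsafe` and `extract_items_safe` are hypothetical helper names, not functions in `tushare.py`, and the payloads are made-up examples of the response shape described above.

```python
from typing import Any


def extract_items_unsafe(data: dict[str, Any]) -> list[Any]:
    # The {} default applies only when the "data" key is absent.
    # If the key exists with value None, .get("items") is called on None.
    return data.get("data", {}).get("items") or []


def extract_items_safe(data: dict[str, Any]) -> list[Any]:
    # "or {}" swaps any falsy value (None included) for an empty dict
    # before the second lookup runs.
    return (data.get("data") or {}).get("items") or []


payload_missing = {"code": 0}
payload_null = {"code": 0, "data": None}
payload_ok = {"code": 0, "data": {"items": [["20240102"]]}}

print(extract_items_safe(payload_missing))  # []
print(extract_items_safe(payload_null))     # []
print(extract_items_safe(payload_ok))       # [['20240102']]
```

Calling `extract_items_unsafe(payload_null)` raises `AttributeError`, which is the failure the new test is expected to reproduce.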
- [ ] Step 1: Add failing test
In core/tests/unit/, create a tushare test file or add to the existing one:
```python
# core/tests/unit/test_data_tushare.py (add to existing file)
from unittest.mock import patch

import httpx

from core.data import tushare


def test_call_handles_null_data_field(respx_mock):
    """Tushare sometimes returns {"code":0,"data":null}; must not raise AttributeError."""
    respx_mock.post(tushare.TUSHARE_URL).mock(
        return_value=httpx.Response(200, json={"code": 0, "data": None})
    )
    result = tushare.call(
        api_name="trade_cal", params={}, fields="cal_date", token="tok"
    )
    assert result == []
```
- [ ] Step 2: Run to confirm fail
```bash
cd core && python -m pytest tests/unit/test_data_tushare.py::test_call_handles_null_data_field -v
```
Expected: FAIL with AttributeError: 'NoneType' object has no attribute 'get'.
- [ ] Step 3: Fix `tushare.py`
In core/core/data/tushare.py, change line 41:
```python
items = data.get("data", {}).get("items") or []
```
to:
```python
items = (data.get("data") or {}).get("items") or []
```
- [ ] Step 4: Run tests
```bash
cd core && python -m pytest tests/unit/test_data_tushare.py -v
```
- [ ] Step 5: Commit
```bash
git add core/core/data/tushare.py core/tests/unit/test_data_tushare.py
git commit -m "fix(tushare): handle null data field without AttributeError"
```

Task 3: Fix audit_log.py - division by zero semantic
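Files:
- Modify: `core/core/data/audit_log.py:207`

Current code (line 207):
```python
diff_pct = (
    (value_a - value_b) / abs(value_b) * 100.0 if value_b else 0.0
)
```
Problem: `if value_b` is False for value_b = 0.0, which is correct in Python but implicit and easy to misread. For floats the explicit `!= 0` check behaves identically, including -0.0 (falsy, and equal to 0) and NaN (truthy, and not equal to 0), so this is a readability fix rather than a behavior change. The two forms differ only for a non-numeric value such as None: the truthiness check maps it to 0.0, while the explicit check lets it reach the arithmetic and raise TypeError.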
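How the two conditions treat edge-case denominators can be confirmed with a few lines. This is a minimal sketch: `diff_pct_truthy` and `diff_pct_explicit` are hypothetical stand-ins for the expression on line 207, not functions that exist in `audit_log.py`.

```python
import math


def diff_pct_truthy(value_a, value_b):
    # Current form: relies on 0.0 (and None) being falsy.
    return (value_a - value_b) / abs(value_b) * 100.0 if value_b else 0.0


def diff_pct_explicit(value_a, value_b):
    # Proposed form: only a value equal to zero skips the division.
    return (value_a - value_b) / abs(value_b) * 100.0 if value_b != 0 else 0.0


# Every ordinary float, plus both zeros, gives the same answer in both forms.
for b in (0.0, -0.0, 50.0, -50.0):
    assert diff_pct_truthy(100.0, b) == diff_pct_explicit(100.0, b)

# NaN is truthy and also != 0, so both forms divide and return NaN.
assert math.isnan(diff_pct_truthy(100.0, math.nan))
assert math.isnan(diff_pct_explicit(100.0, math.nan))

print(diff_pct_explicit(100.0, 50.0))  # 100.0
print(diff_pct_explicit(100.0, 0.0))   # 0.0
```

If `record_diff` can ever receive `value_b=None`, it is worth deciding whether the new TypeError is the desired outcome before merging.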
- [ ] Step 1: Add test
In core/tests/unit/test_data_audit_log.py (add to existing):
```python
def test_record_diff_zero_value_b_does_not_divide(tmp_path):
    """diff_pct is 0 when value_b is zero, not a ZeroDivisionError."""
    log = AuditLog(str(tmp_path / "audit.duckdb"))
    # Should not raise
    log.record_diff(
        code="600519.SH",
        trade_date="2024-01-02",
        provider_a="akshare",
        value_a=100.0,
        provider_b="tushare",
        value_b=0.0,
        severity="warn",
    )
```
- [ ] Step 2: Run to confirm it passes (it should already; the test pins the current behavior)
```bash
cd core && python -m pytest tests/unit/test_data_audit_log.py::test_record_diff_zero_value_b_does_not_divide -v
```
- [ ] Step 3: Fix `audit_log.py`
Change line 207:
```python
diff_pct = (
    (value_a - value_b) / abs(value_b) * 100.0 if value_b else 0.0
)
```
to:
```python
diff_pct = (
    (value_a - value_b) / abs(value_b) * 100.0 if value_b != 0 else 0.0
)
```
- [ ] Step 4: Run tests
```bash
cd core && python -m pytest tests/unit/test_data_audit_log.py -v
```
- [ ] Step 5: Commit
```bash
git add core/core/data/audit_log.py core/tests/unit/test_data_audit_log.py
git commit -m "fix(audit_log): use explicit != 0 check for diff_pct denominator"
```

Task 4: Fix tushare_daily.py - missing sort + silent None factor
Files:
- Modify: `core/core/data/providers/tushare_daily.py`
- Test: `core/tests/unit/test_data_providers_tushare_daily.py`

Two bugs:
- `adj_items[0]` assumes the rows are sorted DESC by trade_date, but nothing sorts them explicitly
- `_extract_factors` returns `(None, latest)` silently when the target date has no factor
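The first bug is easy to reproduce with rows that arrive out of order. The sketch below assumes the row layout used in the tests for this task (`[code, YYYYMMDD, factor]`); `latest_factor_unsorted` and `latest_factor_sorted` are hypothetical helpers, not part of `TushareDailyProvider`.

```python
from typing import Any


def latest_factor_unsorted(rows: list[list[Any]]) -> float:
    # Trusts the caller's ordering: whatever row comes first is "latest".
    return float(rows[0][2])


def latest_factor_sorted(rows: list[list[Any]]) -> float:
    # Fixed-width YYYYMMDD strings sort lexicographically in date order,
    # so a reverse string sort puts the newest trade_date first.
    ordered = sorted(rows, key=lambda r: str(r[1]), reverse=True)
    return float(ordered[0][2])


rows = [
    ["600519.SH", "20240101", "1.0"],
    ["600519.SH", "20240103", "1.2"],
    ["600519.SH", "20240102", "1.1"],
]

print(latest_factor_unsorted(rows))  # 1.0
print(latest_factor_sorted(rows))    # 1.2
```

Sorting a copy with `sorted()` leaves the caller's list untouched, which keeps the helper free of side effects.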
- [ ] Step 1: Add tests
In core/tests/unit/test_data_providers_tushare_daily.py, add:
```python
def test_extract_factors_sorts_by_date_desc():
    """Latest factor is the highest date regardless of API return order."""
    # adj_items deliberately out of order: the newest row is not first.
    adj_items = [
        ["600519.SH", "20240101", "1.0"],
        ["600519.SH", "20240103", "1.2"],
        ["600519.SH", "20240102", "1.1"],
    ]
    target_factor, latest = TushareDailyProvider._extract_factors(adj_items, "20240102")
    assert latest == 1.2  # factor of the highest date (20240103)
    assert target_factor == 1.1


def test_extract_factors_raises_when_target_date_missing():
    """ProviderUnavailable raised if adj_factor for target date absent."""
    adj_items = [
        ["600519.SH", "20240103", "1.2"],
    ]
    with pytest.raises(ProviderUnavailable, match="adj_factor"):
        TushareDailyProvider._extract_factors(adj_items, "20240101")
```
- [ ] Step 2: Run to confirm fail
```bash
cd core && python -m pytest tests/unit/test_data_providers_tushare_daily.py::test_extract_factors_sorts_by_date_desc tests/unit/test_data_providers_tushare_daily.py::test_extract_factors_raises_when_target_date_missing -v
```
- [ ] Step 3: Fix `_extract_factors`
In core/core/data/providers/tushare_daily.py, replace the _extract_factors staticmethod (lines 151-169):
```python
@staticmethod
def _extract_factors(
    adj_items: list[list[Any]], target_yyyymmdd: str
) -> tuple[float | None, float | None]:
    if not adj_items:
        return None, None
    # Tushare returns DESC by default but we sort explicitly to be safe.
    sorted_items = sorted(adj_items, key=lambda r: str(r[1]), reverse=True)
    try:
        latest = float(sorted_items[0][2])
    except (TypeError, ValueError) as exc:
        raise CanonicalUnitViolation(
            f"adj_factor not numeric: {sorted_items[0]!r}"
        ) from exc
    target_factor: float | None = None
    for row in sorted_items:
        if str(row[1]) == target_yyyymmdd:
            try:
                target_factor = float(row[2])
            except (TypeError, ValueError) as exc:
                raise CanonicalUnitViolation(
                    f"adj_factor not numeric: {row!r}"
                ) from exc
            break
    if target_factor is None:
        raise ProviderUnavailable(
            f"no adj_factor for {target_yyyymmdd} in {len(sorted_items)} rows"
        )
    return target_factor, latest
```
- [ ] Step 4: Run tests
```bash
cd core && python -m pytest tests/unit/test_data_providers_tushare_daily.py -v
```
- [ ] Step 5: Commit
```bash
git add core/core/data/providers/tushare_daily.py core/tests/unit/test_data_providers_tushare_daily.py
git commit -m "fix(tushare_daily): explicit sort on adj_items; raise ProviderUnavailable when target factor missing"
```

Phase 2: Python MCP proxy fix
Task 5: Fix proxy.py - overly broad exception catch
Files:
- Modify: `mcp/tushare_mcp/proxy.py:131`

Current code (lines 129-134):
```python
if self._conn is not None:
    try:
        return self._duckdb_query(table, columns, date_cols, **kwargs)
    except Exception as exc:
        logger.warning("DuckDB query failed for %s, falling back: %s", method, exc)
if self._fallback is None:
    raise RuntimeError(f"No Tushare token and DuckDB unavailable for '{method}'")
```
Problem: this catches every Exception, including programming bugs (KeyError, AttributeError). It should catch only DuckDB errors and the "no data" RuntimeError. As written, bugs silently fall back to the API and hide the real issue.
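The effect of narrowing the `except` clause can be shown without DuckDB installed. In this sketch `StoreError` is a stand-in for `duckdb.Error`, and `query_with_fallback`, `storage_down`, and `buggy` are hypothetical names invented for illustration; none of them come from `proxy.py`.

```python
import logging

logger = logging.getLogger("proxy_sketch")


class StoreError(Exception):
    """Stand-in for duckdb.Error in this sketch."""


def query_with_fallback(primary, fallback):
    """Run primary(); fall back only on expected storage failures."""
    try:
        return primary()
    except (StoreError, RuntimeError) as exc:
        logger.warning("primary failed, falling back: %s", exc)
        return fallback()


def storage_down():
    # An expected, recoverable failure of the local store.
    raise StoreError("file locked")


def buggy():
    # A programming error: it should surface, not be masked by the fallback.
    return {}["missing_column"]


print(query_with_fallback(storage_down, lambda: "from API"))  # from API
```

Calling `query_with_fallback(buggy, ...)` lets the `KeyError` propagate to the caller, so the bug shows up in logs and tests instead of being papered over by an API call.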
- [ ] Step 1: Fix `proxy.py`
Change lines 129-132:
```python
if self._conn is not None:
    try:
        return self._duckdb_query(table, columns, date_cols, **kwargs)
    except (duckdb.Error, RuntimeError) as exc:
        logger.warning(
            "DuckDB query failed for %s, falling back to API: %s",
            method, exc,
        )
```
Verify import duckdb is already at the top of proxy.py. If not, add it.
- [ ] Step 2: Run MCP tests if they exist
```bash
# Test failures must stay visible, so only the "no tests directory" case is silenced.
cd mcp && if [ -d tests ]; then python -m pytest tests/ -v; else echo "no mcp tests"; fi
```
- [ ] Step 3: Commit
```bash
git add mcp/tushare_mcp/proxy.py
git commit -m "fix(mcp/proxy): narrow exception catch to duckdb.Error|RuntimeError"
```

Phase 3: Python service fixes
Task 6: Fix cache.py - concurrent put() file lock conflict
Files:
- Modify: `src/service/cache.py`

DailyCache.put() and FundamentalsCache.put() each open a fresh duckdb.connect(). Concurrent FastAPI requests become simultaneous writers, which fails with IO Error: Conflicting lock. Fix: add one threading.Lock per cache instance and hold it for the duration of each write. The lock serializes writers within a single process only; it does not coordinate separate worker processes.
get() reads stay unlocked because DuckDB allows concurrent readers.
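The locking pattern itself can be exercised without DuckDB. The sketch below is a simplified model, not the cache implementation: `SerializedWriter` is a hypothetical class, and the short `time.sleep` stands in for the connect, INSERT, and close sequence inside `put()`.

```python
import threading
import time


class SerializedWriter:
    """Lets one writer at a time into put() within a single process."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._in_flight = 0
        self.max_in_flight = 0
        self.values: list[int] = []

    def put(self, value: int) -> None:
        with self._lock:
            # Everything in this block runs with exclusive access.
            self._in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self._in_flight)
            time.sleep(0.001)  # stand-in for the database write
            self.values.append(value)
            self._in_flight -= 1


writer = SerializedWriter()
threads = [threading.Thread(target=writer.put, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(writer.values), writer.max_in_flight)  # 20 1
```

Holding the lock around the whole connect-and-write sequence, rather than around `conn.execute` alone, is what keeps two connections from being open for writing at once.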
- [ ] Step 1: Add test
In core/tests/integration/test_service_cache.py, add:
```python
import threading
from datetime import UTC, datetime


def test_daily_cache_concurrent_puts_no_lock_error(tmp_path):
    """Multiple threads calling put() simultaneously must not raise."""
    cache = DailyCache(str(tmp_path / "cache.duckdb"))
    errors = []

    def write(i):
        try:
            cache.put(
                code="600519.SH",
                trade_date=f"202401{i:02d}",
                close=float(i),
                fetched_at=datetime.now(UTC),
            )
        except Exception as e:
            errors.append(e)

    threads = [threading.Thread(target=write, args=(i,)) for i in range(1, 20)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert errors == [], f"concurrent put errors: {errors}"
```
- [ ] Step 2: Run to confirm fail
```bash
cd core && python -m pytest tests/integration/test_service_cache.py::test_daily_cache_concurrent_puts_no_lock_error -v
```
Expected: FAIL with IO Error: Conflicting lock or similar. The failure may be intermittent; if the test passes, raise the thread count and rerun.
- [ ] Step 3: Fix `DailyCache`
In src/service/cache.py, update DailyCache:
```python
import threading  # add to imports if not present


class DailyCache:
    def __init__(self, path: str) -> None:
        self.path = path
        self._lock = threading.Lock()
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        with duckdb.connect(path) as conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS daily_cache (
                    code VARCHAR NOT NULL,
                    trade_date VARCHAR NOT NULL,
                    close DOUBLE NOT NULL,
                    fetched_at TIMESTAMP NOT NULL,
                    PRIMARY KEY (code, trade_date)
                )
                """
            )

    # ... (keep _ttl_floor, get, get_latest unchanged) ...

    def put(
        self,
        *,
        code: str,
        trade_date: str,
        close: float,
        fetched_at: datetime,
    ) -> None:
        with self._lock:
            with duckdb.connect(self.path) as conn:
                conn.execute(
                    "INSERT INTO daily_cache(code, trade_date, close, fetched_at)"
                    " VALUES (?, ?, ?, ?)"
                    " ON CONFLICT (code, trade_date) DO UPDATE SET"
                    " close = excluded.close,"
                    " fetched_at = excluded.fetched_at",
                    [code, trade_date, close, _to_naive_utc(fetched_at)],
                )
```
- [ ] Step 4: Fix `FundamentalsCache` (same pattern)
```python
class FundamentalsCache:
    def __init__(self, path: str) -> None:
        self.path = path
        self._lock = threading.Lock()
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        with duckdb.connect(path) as conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS fundamentals_cache (
                    code VARCHAR NOT NULL,
                    period VARCHAR NOT NULL,
                    payload VARCHAR NOT NULL,
                    fetched_at TIMESTAMP NOT NULL,
                    PRIMARY KEY (code, period)
                )
                """
            )

    # ... (keep _ttl_floor, get unchanged) ...

    def put(
        self,
        *,
        code: str,
        period: str,
        payload: dict[str, Any],
        fetched_at: datetime,
    ) -> None:
        with self._lock:
            with duckdb.connect(self.path) as conn:
                conn.execute(
                    "INSERT INTO fundamentals_cache(code, period, payload, fetched_at)"
                    " VALUES (?, ?, ?, ?)"
                    " ON CONFLICT (code, period) DO UPDATE SET"
                    " payload = excluded.payload,"
                    " fetched_at = excluded.fetched_at",
                    [code, period, json.dumps(payload), _to_naive_utc(fetched_at)],
                )
```
- [ ] Step 5: Run tests
```bash
cd core && python -m pytest tests/integration/test_service_cache.py -v
```
- [ ] Step 6: Commit
```bash
git add src/service/cache.py core/tests/integration/test_service_cache.py
git commit -m "fix(cache): add threading.Lock to serialize concurrent put() calls"
```

Task 7: Fix auth.py - paid_until date comparison clarity
Files:
- Modify: `src/service/auth.py:75`

Current code:
```python
if paid_until is not None and paid_until < date.today().isoformat():
```
This works because zero-padded ISO date strings sort lexicographically in date order, but it silently gives wrong answers if the stored format ever changes. date.fromisoformat() makes the intent explicit and raises ValueError on a malformed value instead of comparing it as text.
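The fragility of the string comparison shows up as soon as a value is not zero-padded. The sketch below is illustrative: `is_expired` is a hypothetical helper with an injectable `today`, not the function in `auth.py`, and the dates are made up.

```python
from datetime import date
from typing import Optional


def is_expired(paid_until: Optional[str], today: date) -> bool:
    """True when paid_until (ISO YYYY-MM-DD) is strictly before today."""
    if paid_until is None:
        return False
    return date.fromisoformat(paid_until) < today


today = date(2024, 10, 1)

print(is_expired("2024-09-30", today))  # True
print(is_expired("2024-10-01", today))  # False

# An unpadded month compares as text: "9" sorts after "1", so the string
# check treats 30 September as later than 1 October.
print("2024-9-30" < today.isoformat())  # False

# date.fromisoformat rejects the malformed value instead of guessing.
try:
    is_expired("2024-9-30", today)
except ValueError as exc:
    print("rejected:", exc)
```

Because a malformed `paid_until` now raises instead of comparing, the caller may want to decide explicitly whether that case maps to a 403 or to a server error.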
- [ ] Step 1: Check existing auth tests
```bash
grep -n "paid_until\|expired" core/tests/integration/test_service_auth.py | head -20
```
- [ ] Step 2: Add test for expiry
In core/tests/integration/test_service_auth.py, add:
```python
def test_expired_token_returns_403(client, tmp_path):
    """Token with paid_until in the past is rejected."""
    import sqlite3
    from datetime import date, timedelta

    from service.auth import hash_token

    token = "expired-token-test"
    past = (date.today() - timedelta(days=1)).isoformat()
    with sqlite3.connect(client.app.state.settings.bearer_db_path) as conn:
        conn.execute(
            "INSERT OR REPLACE INTO bearer_tokens (user_id, token_hash, plan, paid_until, revoked, created_at)"
            " VALUES (?, ?, 'paid', ?, 0, '2026-01-01')",
            ("expired-user", hash_token(token), past),
        )
    resp = client.get("/whoami", headers={"Authorization": f"Bearer {token}"})
    assert resp.status_code == 403
    assert "expired" in resp.json()["detail"]
```
- [ ] Step 3: Fix `auth.py`
In src/service/auth.py, change line 75:
```python
if paid_until is not None and paid_until < date.today().isoformat():
```
to:
```python
if paid_until is not None and date.fromisoformat(paid_until) < date.today():
```
- [ ] Step 4: Run tests
```bash
cd core && python -m pytest tests/integration/test_service_auth.py -v
```
- [ ] Step 5: Commit
```bash
git add src/service/auth.py core/tests/integration/test_service_auth.py
git commit -m "fix(auth): use date.fromisoformat() for paid_until comparison"
```

Phase 4: TypeScript - critical business logic
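Task 8: Fix newapi.service.ts - missing await on retry
Files:
- Modify: `backend/src/modules/newapi/newapi.service.ts:51`

Current code (line 51):
```typescript
return fn(this.sessionToken!);
```
Missing await. The rejection still reaches the caller, because an async function adopts the promise it returns, but withSession drops out of the async stack trace, and any try/catch or finally later wrapped around the retry would not observe the failure. Using return await also keeps the retry consistent with the first call inside the try block, where await is what lets the catch see a 401.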
- [ ] Step 1: Fix `newapi.service.ts`
In backend/src/modules/newapi/newapi.service.ts, change line 51:
```typescript
return fn(this.sessionToken!);
```
to:
```typescript
return await fn(this.sessionToken!);
```
The full withSession method after the fix:
```typescript
private async withSession<T>(fn: (token: string) => Promise<T>): Promise<T> {
  if (!this.sessionToken) await this.login();
  try {
    return await fn(this.sessionToken!);
  } catch (err: unknown) {
    if (axios.isAxiosError(err) && err.response?.status === 401) {
      this.sessionToken = null;
      await this.login();
      return await fn(this.sessionToken!);
    }
    throw err;
  }
}
```
- [ ] Step 2: Build to verify no TS errors
```bash
cd backend && npx tsc --noEmit
```
Expected: no errors.
- [ ] Step 3: Commit
```bash
git add backend/src/modules/newapi/newapi.service.ts
git commit -m "fix(newapi): add await on retry call in withSession"
```

Task 9: Fix billing.service.ts - unchecked provisionPaidOrder result
Files:
- Modify: `backend/src/modules/billing/billing.service.ts`

Current code (line 213):
```typescript
await this.provisioning.provisionPaidOrder(orderId);
```
provisionPaidOrder never throws; it returns { ok: false, reason: string } on failure. The return value is silently ignored. If provisioning fails, WeChat gets a 200 OK and stops retrying, leaving a user who has paid but was never provisioned.
- [ ] Step 1: Fix `billing.service.ts`
Replace line 213:
```typescript
await this.provisioning.provisionPaidOrder(orderId);
```
with:
```typescript
const provResult = await this.provisioning.provisionPaidOrder(orderId);
if (!provResult.ok) {
  this.logger.error(
    `provisionPaidOrder failed for order ${orderId}: ${provResult.reason}`,
  );
  // Still return SUCCESS to WeChat to stop retry: the provisioning failure
  // is recorded on the instance. Ops must manually recover via admin tools.
}
```
- [ ] Step 2: Build
```bash
cd backend && npx tsc --noEmit
```
- [ ] Step 3: Commit
```bash
git add backend/src/modules/billing/billing.service.ts
git commit -m "fix(billing): log provisioning failure instead of silently ignoring result"
```

Task 10: Fix provisioning.service.ts - DEGRADED state on NewAPI failure
Files:
- Modify: `backend/src/modules/provisioning/provisioning.service.ts`

Current code (lines 72-73):
```typescript
} catch (e: unknown) {
  this.logger.error("NewAPI token issuance failed (non-fatal)", (e as Error)?.message);
}
```
llmApiKey stays null on the HermesInstance. The worker in runSpawn falls back to process.env.HERMES_DEFAULT_LLM_KEY (line 85 of provision-worker.service.ts), so Hermes still starts. The failure is only logged as "non-fatal", though, so nothing on the instance shows that it is running on the default key. Log it clearly and mark the instance status so ops can see it.
Check the Prisma schema for valid HermesInstance.status values before editing:
- [ ] Step 1: Check Prisma schema for valid statuses
```bash
grep -A 20 "enum HermesInstanceStatus\|status.*String\|PENDING_BIND\|DEGRADED" backend/prisma/schema.prisma | head -30
```
- [ ] Step 2: If `DEGRADED` is a valid status, update provisioning.service.ts
If the enum includes DEGRADED:
```typescript
} catch (e: unknown) {
  this.logger.error(
    `NewAPI token issuance failed for instance ${result.instanceId}; Hermes will use default LLM key`,
    (e as Error)?.message,
  );
  await this.prisma.hermesInstance.update({
    where: { id: result.instanceId },
    data: { status: "DEGRADED" },
  });
}
```
If DEGRADED is NOT a valid status, add it to the Prisma schema enum and create a migration:
```bash
# Add DEGRADED to HermesInstanceStatus enum in backend/prisma/schema.prisma
# Then:
cd backend && npx prisma migrate dev --name add-degraded-instance-status
```
- [ ] Step 3: Build
```bash
cd backend && npx tsc --noEmit
```
- [ ] Step 4: Commit
```bash
git add backend/src/modules/provisioning/provisioning.service.ts backend/prisma/
git commit -m "fix(provisioning): mark instance DEGRADED when NewAPI token issuance fails"
```

Phase 5: TypeScript - race conditions
Task 11: Fix auth.service.ts - OTP double-use race
Files:
- Modify: `backend/src/modules/auth/auth.service.ts`

Two methods share the same pattern: findFirst → bcrypt.compare → update usedAt. Two concurrent requests can both pass findFirst (returning the same vc), both pass the compare, and both set usedAt, so one code is accepted twice. Fix: claim the code with a conditional Prisma updateMany filtered on usedAt: null, so that only the first update matches a row. Check count to detect the loser.
Affects both registerWithEmail (lines 168-173) and verifyOtp (lines 296-301).
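The conditional-update idea is independent of Prisma. The sketch below is a Python and sqlite3 analog, not the NestJS code: the `codes` table, its columns, and the `claim` helper are hypothetical and exist only to show that the second caller matches zero rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (id INTEGER PRIMARY KEY, used_at TEXT)")
conn.execute("INSERT INTO codes (id, used_at) VALUES (1, NULL)")
conn.commit()


def claim(db: sqlite3.Connection, code_id: int, now: str) -> bool:
    """Mark the code used only if it is still unused; True for the winner."""
    cur = db.execute(
        # The "used_at IS NULL" guard makes check-and-set a single statement.
        "UPDATE codes SET used_at = ? WHERE id = ? AND used_at IS NULL",
        (now, code_id),
    )
    db.commit()
    return cur.rowcount == 1


first = claim(conn, 1, "2026-05-16T10:00:00")
second = claim(conn, 1, "2026-05-16T10:00:01")
print(first, second)  # True False
```

The row count plays the same role here as `claimed.count` in the `updateMany` result: zero means another request already consumed the code.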
- [ ] Step 1: Fix `registerWithEmail`
Replace lines 168-174:
```typescript
const vc = await this.prisma.emailVerificationCode.findFirst({
  where: { email, purpose: 'REGISTER', usedAt: null, expiresAt: { gt: new Date() } },
  orderBy: { createdAt: 'desc' },
});
if (!vc) throw new UnauthorizedException('验证码无效或已过期,请重新发送');
const ok = await bcryptjs.compare(codeRaw, vc.codeHash);
await this.prisma.emailVerificationCode.update({
  where: { id: vc.id },
  data: { attempts: { increment: 1 }, ...(ok ? { usedAt: new Date() } : {}) },
});
if (!ok) throw new UnauthorizedException('验证码无效或已过期,请重新发送');
```
with:
```typescript
const vc = await this.prisma.emailVerificationCode.findFirst({
  where: { email, purpose: 'REGISTER', usedAt: null, expiresAt: { gt: new Date() } },
  orderBy: { createdAt: 'desc' },
});
if (!vc) throw new UnauthorizedException('验证码无效或已过期,请重新发送');
const ok = await bcryptjs.compare(codeRaw, vc.codeHash);
if (!ok) {
  await this.prisma.emailVerificationCode.update({
    where: { id: vc.id },
    data: { attempts: { increment: 1 } },
  });
  throw new UnauthorizedException('验证码无效或已过期,请重新发送');
}
// Atomic claim: only succeeds if usedAt is still null (first caller wins)
const claimed = await this.prisma.emailVerificationCode.updateMany({
  where: { id: vc.id, usedAt: null },
  data: { usedAt: new Date(), attempts: { increment: 1 } },
});
// Message text: "verification code has already been used"
if (claimed.count === 0) throw new UnauthorizedException('验证码已被使用');
```
- [ ] Step 2: Fix `verifyOtp`
Replace lines 296-302 with the same pattern:
```typescript
const vc = await this.prisma.emailVerificationCode.findFirst({
  where: { email, purpose: 'OTP_LOGIN', usedAt: null, expiresAt: { gt: new Date() } },
  orderBy: { createdAt: 'desc' },
});
if (!vc) throw new UnauthorizedException('验证码无效或已过期');
const ok = await bcryptjs.compare(codeRaw, vc.codeHash);
if (!ok) {
  await this.prisma.emailVerificationCode.update({
    where: { id: vc.id },
    data: { attempts: { increment: 1 } },
  });
  throw new UnauthorizedException('验证码无效或已过期');
}
const claimed = await this.prisma.emailVerificationCode.updateMany({
  where: { id: vc.id, usedAt: null },
  data: { usedAt: new Date(), attempts: { increment: 1 } },
});
if (claimed.count === 0) throw new UnauthorizedException('验证码已被使用');
```
- [ ] Step 3: Build
```bash
cd backend && npx tsc --noEmit
```
- [ ] Step 4: Commit
```bash
git add backend/src/modules/auth/auth.service.ts
git commit -m "fix(auth): atomic OTP claim via updateMany with usedAt=null guard"
```

Task 12: Fix provision-worker.service.ts - stale attempts count
Files:
- Modify: `backend/src/modules/provisioning/provision-worker.service.ts`

Current code (lines 62-64, 70):
```typescript
const claimed = await this.prisma.provisionTask.updateMany({
  where: { id: task.id, status: "PENDING" },
  data: { status: "RUNNING", attempts: { increment: 1 } },
});
if (claimed.count === 0) return;
// ...
} catch (e: unknown) {
  const msg = (e as Error)?.message ?? String(e);
  const attempts = task.attempts + 1; // ← uses PRE-CLAIM snapshot
```
task.attempts was read before the claim, so task.attempts + 1 matches the database only if nothing else touched the row between the read and the claim. If another worker claimed, failed, and requeued the task in that window, the snapshot undercounts and the MAX_ATTEMPTS check fires late. Fix: re-fetch the task after a successful claim.
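The stale-snapshot scenario can be replayed step by step. The sketch below is a Python and sqlite3 analog rather than the Prisma code: the `tasks` table and the `claim_and_count` helper are hypothetical, and the second worker is simulated by direct calls on the same connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, attempts INTEGER)"
)
conn.execute("INSERT INTO tasks VALUES (1, 'PENDING', 0)")


def claim_and_count(db: sqlite3.Connection, task_id: int):
    """Claim a PENDING task; return the post-increment attempts, or None."""
    cur = db.execute(
        "UPDATE tasks SET status = 'RUNNING', attempts = attempts + 1"
        " WHERE id = ? AND status = 'PENDING'",
        (task_id,),
    )
    if cur.rowcount == 0:
        return None  # another worker holds the task
    # Re-read after the claim so the count reflects the database, not a snapshot.
    row = db.execute("SELECT attempts FROM tasks WHERE id = ?", (task_id,)).fetchone()
    return row[0]


# Worker A takes its snapshot while attempts is still 0.
snapshot_attempts = conn.execute(
    "SELECT attempts FROM tasks WHERE id = 1"
).fetchone()[0]

# Before A claims, another worker claims, fails, and requeues the task.
claim_and_count(conn, 1)
conn.execute("UPDATE tasks SET status = 'PENDING' WHERE id = 1")

# Worker A now claims successfully.
fresh_attempts = claim_and_count(conn, 1)
print(snapshot_attempts + 1, fresh_attempts)  # 1 2
```

The snapshot arithmetic reports the first attempt while the database records the second, which is the gap the re-fetch closes.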
- [ ] Step 1: Fix `provision-worker.service.ts`
After line 64 (if (claimed.count === 0) return;), add a re-fetch:
```typescript
if (claimed.count === 0) return;
// Re-fetch to get post-increment attempts count.
const freshTask = await this.prisma.provisionTask.findUnique({
  where: { id: task.id },
  include: {
    order: { include: { plan: true } },
    subscription: { include: { instance: true } },
  },
});
if (!freshTask) return;
```
Then replace all subsequent references to task with freshTask in processOne:
- `task.subscription?.instance` → `freshTask.subscription?.instance`
- `task.id` → `freshTask.id`
- `task.order.plan.priceCnyFen` → `freshTask.order.plan.priceCnyFen`
- `task.attempts + 1` → `freshTask.attempts` (it's already incremented)

Full updated processOne after the re-fetch block:
```typescript
const instance = freshTask.subscription?.instance;
if (!instance) {
  await this.failTask(freshTask.id, null, "no_instance_on_subscription");
  return;
}
try {
  await this.runSpawn(freshTask.id, instance.id, freshTask.order.plan.priceCnyFen);
} catch (e: unknown) {
  const msg = (e as Error)?.message ?? String(e);
  const attempts = freshTask.attempts; // already incremented by claim
  if (attempts >= MAX_ATTEMPTS) {
    await this.failTask(freshTask.id, instance.id, msg);
  } else {
    await this.prisma.provisionTask.update({
      where: { id: freshTask.id },
      data: { status: "PENDING", lastError: msg },
    });
    this.logger.warn(`task=${freshTask.id} attempt=${attempts} failed; requeued: ${msg}`);
  }
}
```
- [ ] Step 2: Build
```bash
cd backend && npx tsc --noEmit
```
- [ ] Step 3: Commit
```bash
git add backend/src/modules/provisioning/provision-worker.service.ts
git commit -m "fix(provision-worker): re-fetch task after claim to get accurate attempts count"
```

Phase 6: Final verification
- [ ] Run all Python tests
```bash
cd core && python -m pytest tests/ -v --tb=short 2>&1 | tail -30
```
Expected: all pass.
- [ ] Run TypeScript build
```bash
cd backend && npx tsc --noEmit
```
Expected: no errors.
- [ ] Push branch and open PR
```bash
git push -u origin <branch-name>
gh pr create --title "fix: 12-bug sweep (Python core + NestJS backend)" --body "$(cat <<'EOF'
## Summary
- fix(_market): exclude 15:00 from open window
- fix(tushare): handle null data field
- fix(audit_log): explicit != 0 for diff_pct denominator
- fix(tushare_daily): explicit sort on adj_items; raise on missing factor
- fix(mcp/proxy): narrow exception catch to duckdb.Error|RuntimeError
- fix(cache): threading.Lock for concurrent put() calls
- fix(auth): date.fromisoformat() for paid_until comparison
- fix(newapi): await on retry in withSession
- fix(billing): log provisionPaidOrder failure instead of silently ignoring
- fix(provisioning): mark instance DEGRADED on NewAPI failure
- fix(auth): atomic OTP claim via updateMany
- fix(provision-worker): re-fetch task after claim
## Test Plan
- [ ] pytest core/tests/ passes
- [ ] npx tsc --noEmit passes in backend/
- [ ] Manual: pay flow end-to-end on staging
- [ ] Manual: OTP login with same code from two tabs; second should get 验证码已被使用 ("verification code already used")
- [ ] Deploy to ECS, verify /quality/status shows healthy (fromisoformat fix)
EOF
)"
```