spec · auth refactor · 2026-Q2

Migrate session storage from cookies to JWT

Step-by-step plan to swap the auth backbone from server-side session cookies to stateless signed tokens, with a rollback path and a 2-week shadow period. This is an example pergam — fictional content.

01 Goal

Replace the current Rails-style session cookie with short-lived JWTs (15-min access + 7-day refresh) so we can scale the auth service horizontally without sticky sessions.

Non-goal: introducing OAuth providers. Out of scope for this iteration.

02 Architecture

flowchart LR
  classDef client fill:#161b22,stroke:#58a6ff,color:#e6edf3
  classDef edge   fill:#1f1a0a,stroke:#d29922,color:#ffa657
  classDef app    fill:#0a1f12,stroke:#3fb950,color:#7ee787
  classDef store  fill:#1c2128,stroke:#8b949e,color:#c9d1d9

  user["Browser"]:::client
  cdn["Edge (CDN)"]:::edge
  api["API gateway<br/>verifies JWT"]:::app
  auth["auth-service<br/>issues + refreshes"]:::app
  cache["KV cache<br/>refresh tokens"]:::store

  user --> cdn --> api
  api -. on 401 .-> auth
  auth <--> cache
  auth -- "access + refresh" --> user
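
The auth-service to KV cache edge in the diagram can be sketched as a small refresh-token store. This is a minimal illustration, not the planned implementation: `RefreshStore`, its method names, and the in-memory dict standing in for the KV cache are all hypothetical. Only the 7-day lifetime comes from this spec.

```python
import secrets
from typing import Dict, Optional, Tuple

REFRESH_TTL_S = 7 * 24 * 3600  # 7-day refresh lifetime (from section 01)


class RefreshStore:
    """Hypothetical stand-in for the KV cache that holds refresh tokens."""

    def __init__(self) -> None:
        # token -> (subject, expires_at as Unix seconds)
        self._kv: Dict[str, Tuple[str, int]] = {}

    def issue(self, sub: str, now: int) -> str:
        """Create an opaque refresh token for `sub` and store it with a TTL."""
        token = secrets.token_urlsafe(32)
        self._kv[token] = (sub, now + REFRESH_TTL_S)
        return token

    def redeem(self, token: str, now: int) -> Optional[str]:
        """Return the subject if the token is known and unexpired, else None."""
        entry = self._kv.get(token)
        if entry is None:
            return None
        sub, expires_at = entry
        if now >= expires_at:
            del self._kv[token]  # drop expired entries lazily
            return None
        return sub

    def revoke(self, token: str) -> None:
        """Invalidate a refresh token, e.g. on logout."""
        self._kv.pop(token, None)
```

On a 401 from the gateway, the client would present its refresh token, and auth-service would call `redeem` before issuing a new access token.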

03 Phases

Phase                             Owner   Effort    Outcome
1. Issue JWT alongside cookies    alice   3 days    Dual-write. No user-visible change.
2. API gateway accepts both       bob     2 days    Either auth source works. Metrics on use.
3. Frontend switches to JWT       alice   4 days    JWT preferred; cookies still emitted for fallback.
4. Shadow 14 days                 -       14 days   Watch for token-refresh edge cases at scale.
5. Stop emitting cookies          bob     1 day     Cleanup. PR removes the dual-write code.
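
The dual-write in phase 1 could look roughly like the sketch below: the login response keeps setting the session cookie and also returns the token pair. The names `LoginResponse`, `login_response`, the `emit_cookie` flag, and the `_session_id` cookie name are assumptions for illustration. Only the 15-minute access lifetime is taken from this spec.

```python
from dataclasses import dataclass, field
from typing import Dict

ACCESS_TTL_S = 15 * 60  # 15-min access token (from section 01)


@dataclass
class LoginResponse:
    cookies: Dict[str, str] = field(default_factory=dict)
    body: Dict[str, object] = field(default_factory=dict)


def login_response(session_id: str, access_token: str, refresh_token: str,
                   emit_cookie: bool = True) -> LoginResponse:
    """Build a login response that carries both auth sources.

    emit_cookie stays True through phases 1-4 and is switched off in phase 5.
    """
    resp = LoginResponse()
    if emit_cookie:
        # Legacy path: server-side session referenced by a cookie.
        resp.cookies["_session_id"] = session_id
    # New path: stateless access token plus refresh token.
    resp.body = {
        "access_token": access_token,
        "refresh_token": refresh_token,
        "expires_in": ACCESS_TTL_S,
    }
    return resp
```

Existing clients ignore the extra body fields and keep using the cookie, which is what keeps phase 1 free of user-visible change.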

04 Token format

{
  "iss": "auth.example.com",
  "sub": "u_a1b2c3d4",
  "iat": 1715900000,
  "exp": 1715900900,
  "roles": ["user", "billing"],
  "tenant": "acme-corp"
}

Signed with ES256 (ECDSA over P-256 with SHA-256). The signing key pair is rotated quarterly; during rotation, verifiers accept both the current and the previous public key.
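
As a sketch of how the claims and the rotation rule fit together: `build_claims` reproduces the payload shape above, and `verify_with_rotation` tries the current key, then the previous one. Both function names are hypothetical. The signature check itself is passed in as a callable (`verify_one`), since ES256 verification would come from a JWT or crypto library that this spec does not name.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional

ACCESS_TTL_S = 15 * 60  # 15-min access token


def build_claims(sub: str, roles: List[str], tenant: str,
                 now: Optional[int] = None) -> Dict[str, object]:
    """Build the access-token payload; exp is iat plus the access lifetime."""
    iat = int(time.time()) if now is None else now
    return {
        "iss": "auth.example.com",
        "sub": sub,
        "iat": iat,
        "exp": iat + ACCESS_TTL_S,
        "roles": list(roles),
        "tenant": tenant,
    }


def verify_with_rotation(token: str, keys: Iterable[str],
                         verify_one: Callable[[str, str], bool]) -> Optional[str]:
    """Return the first key that verifies the token, or None.

    `keys` is ordered current first, previous second. `verify_one` is a
    placeholder for a real ES256 signature check.
    """
    for key in keys:
        if verify_one(token, key):
            return key
    return None
```

With `now=1715900000`, `build_claims` yields `exp` 1715900900, matching the example payload.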

05 Rollback

Dual-write reduces rollback to a single config flag: at any point through phase 4, the API gateway can be switched back to preferring cookies, because cookies are still emitted and accepted. Phase 5 (cleanup) removes that path, so it runs only after the shadow period passes without incident.
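
One way the gateway flag could work, as a sketch under assumptions: `authenticate`, `prefer_cookies`, the `_session_id` cookie name, and the injected `verify_jwt` and `lookup_session` callables are all hypothetical. The counter illustrates the per-source usage metrics mentioned for phase 2.

```python
from collections import Counter
from typing import Callable, Dict, Optional

# Counts how often each auth source resolved a request.
auth_source_hits: Counter = Counter()


def authenticate(headers: Dict[str, str], cookies: Dict[str, str],
                 prefer_cookies: bool,
                 verify_jwt: Callable[[str], Optional[str]],
                 lookup_session: Callable[[str], Optional[str]]) -> Optional[str]:
    """Resolve a user id from either source; the flag only sets the order."""

    def from_jwt() -> Optional[str]:
        auth = headers.get("Authorization", "")
        if auth.startswith("Bearer "):
            return verify_jwt(auth[len("Bearer "):])
        return None

    def from_cookie() -> Optional[str]:
        sid = cookies.get("_session_id")
        return lookup_session(sid) if sid else None

    sources = [("jwt", from_jwt), ("cookie", from_cookie)]
    if prefer_cookies:  # rollback: one flag flips the order
        sources.reverse()

    for name, source in sources:
        user = source()
        if user is not None:
            auth_source_hits[name] += 1
            return user
    return None  # caller responds with 401
```

Because both sources stay wired in, flipping `prefer_cookies` needs no deploy of new code paths, only a config change.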

06 Risks

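  • Refresh-token theft: mitigated by IP+UA fingerprint mismatch detection (on mismatch, rotate the token and invalidate the old one).
  • Clock skew: accept tokens whose nbf (or iat, since the format in section 04 carries no nbf claim) is up to 30s in the future, to tolerate small drift between hosts.
  • Logout sync: cookie sessions are invalidated server-side immediately; a JWT stays valid until exp, so explicit logout needs a deny-list entry in the cache for 15 min (the access-token lifetime).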
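
The clock-skew and logout mitigations can be sketched as two small checks. This is an illustration under assumptions: `time_claims_ok` and `DenyList` are hypothetical names, and keying the deny-list on `sub` plus logout time is one possible design that works with the token format in section 04. The 30s leeway and 15-min retention come from the list above.

```python
import time
from typing import Dict, Optional

LEEWAY_S = 30           # tolerated clock skew
ACCESS_TTL_S = 15 * 60  # access-token lifetime, also the deny-list retention


def time_claims_ok(claims: Dict[str, int], now: Optional[int] = None) -> bool:
    """Accept tokens that start at most LEEWAY_S in the future and are unexpired."""
    now = int(time.time()) if now is None else now
    # Fall back to iat when the token carries no nbf claim.
    not_before = claims.get("nbf", claims.get("iat", 0))
    if not_before > now + LEEWAY_S:
        return False
    return now < claims["exp"]  # no leeway applied to expiry in this sketch


class DenyList:
    """Hypothetical in-memory deny-list: sub -> time of explicit logout."""

    def __init__(self) -> None:
        self._logout_at: Dict[str, int] = {}

    def logout(self, sub: str, now: int) -> None:
        self._logout_at[sub] = now

    def is_denied(self, claims: Dict, now: int) -> bool:
        logout_at = self._logout_at.get(claims["sub"])
        if logout_at is None:
            return False
        if now >= logout_at + ACCESS_TTL_S:
            # Every token issued before logout has expired by now.
            del self._logout_at[claims["sub"]]
            return False
        # Deny tokens issued at or before the logout.
        return claims["iat"] <= logout_at
```

A token issued after the logout (a fresh login) passes the deny-list check, so the user is not locked out for the 15-minute retention window.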