281
specs/reviews/001-adversarial-review-phase2.md
Normal file
281
specs/reviews/001-adversarial-review-phase2.md
Normal file
@@ -0,0 +1,281 @@
|
||||
# Adversarial Review: Forage Client (Post Phase 2)
|
||||
|
||||
## Date: 2026-03-07
|
||||
## Scope: Full project review - architecture, workflow, business, security
|
||||
|
||||
---
|
||||
|
||||
## 1. Forage Needs a Clear Ownership Boundary
|
||||
|
||||
Forage is entirely dependent on forest-server for its core functionality. Every
|
||||
route either renders static marketing content or proxies to forest-server.
|
||||
|
||||
What does forage own today?
|
||||
- Auth? No, forest-server owns it.
|
||||
- User data? No, forest-server owns it.
|
||||
- Component registry? Future, and forest-server will own that too.
|
||||
- Deployment logic? Future, likely another backend service.
|
||||
|
||||
This isn't wrong - forage is the web product layer on top of forest's API layer.
|
||||
But this intent isn't crystallized anywhere. The PITCH.md lists a huge roadmap
|
||||
(deployments, managed services, billing) without clarifying what lives in forage
|
||||
vs. what lives in forest-server or new backend services.
|
||||
|
||||
**Why this matters**: Every architectural decision (session management, crate
|
||||
boundaries, database usage) depends on what forage will own. Without this
|
||||
boundary, we risk either building too much (duplicating forest-server) or too
|
||||
little (being a dumb proxy forever).
|
||||
|
||||
**Action**: Write a clear architecture doc or update PITCH.md with an explicit
|
||||
"forage owns X, forest-server owns Y, future services own Z" section. At
|
||||
minimum, forage will own: web sessions, billing/subscription state, org-level
|
||||
configuration, and the web UI itself. Forest-server owns: users, auth tokens,
|
||||
components, deployments.
|
||||
|
||||
Comment:
|
||||
Forage in the future is going to have many services that forest is going to be relying on, hence the brand, and site. Also forest-client might be fine as a UI for forest itself, but a flutter app isn't that great at web apps, and we need something native for SEO and the likes.
|
||||
We could adopt the forest-client as the dashboard tbd. Forage is the business entity of forest.
|
||||
|
||||
---
|
||||
|
||||
## 2. The Crate Structure Is Premature
|
||||
|
||||
Five crates today:
|
||||
- **forage-db**: 3 lines. Re-exports `PgPool`. No queries, no migrations.
|
||||
- **forage-core**: ~110 lines. A trait, types, 3 validation functions.
|
||||
- **forage-grpc**: Generated code wrapper. Justified.
|
||||
- **forage-server**: The actual application. All real logic lives here.
|
||||
- **ci**: Separate build tool. Justified.
|
||||
|
||||
The "pure core / effectful shell" split sounds principled, but `forage-core`
|
||||
is mostly type definitions. The `ForestAuth` trait is defined in core but
|
||||
implemented in server. The "pure" validation is 3 functions totaling ~50 lines.
|
||||
|
||||
**forage-db is dead weight.** There are no database queries, no migrations, no
|
||||
schema. It exists because CLAUDE.md says there should be a db crate. Either
|
||||
remove it or explicitly acknowledge it as a placeholder for future forage-owned
|
||||
state (sessions, billing, org config).
|
||||
|
||||
**Action**: Either consolidate to 3 crates (server, grpc, ci) until there's
|
||||
a real consumer for the core/db split, or commit to what forage-core and
|
||||
forage-db will contain (tied to decision #1). Premature crate boundaries add
|
||||
compile time and cognitive overhead without benefit.
|
||||
|
||||
Comment
|
||||
Lets keep the split for now, we're gonna fill it out shortly
|
||||
|
||||
---
|
||||
|
||||
## 3. Token Refresh Is Specified But Not Implemented
|
||||
|
||||
The spec says:
|
||||
> If access_token expired but refresh_token valid: auto-refresh, set new cookies
|
||||
|
||||
Reality: `RequireAuth` checks if a cookie exists. It doesn't validate the
|
||||
token, check expiry, or attempt refresh. When the access_token expires,
|
||||
`get_user()` fails and the user gets redirected to login - losing their
|
||||
session even though the refresh_token is valid.
|
||||
|
||||
Depending on forest-server's token lifetime configuration (could be 15 min to
|
||||
1 hour), users will get randomly logged out. This is the single most impactful
|
||||
missing feature.
|
||||
|
||||
**Action**: Implement BFF sessions (spec 003) which solves this by moving
|
||||
tokens server-side and handling refresh transparently.
|
||||
|
||||
---
|
||||
|
||||
## 4. The get_user Double-Call Pattern
|
||||
|
||||
Every authenticated page does:
|
||||
1. `get_user(access_token)` which internally calls `token_info` then `get_user`
|
||||
(2 gRPC calls in `forest_client.rs:161-192`)
|
||||
2. Then page-specific calls (e.g., `list_tokens` - another gRPC call)
|
||||
|
||||
That's 3 gRPC round-trips per page load. For server-rendered pages where
|
||||
latency = perceived performance, this matters.
|
||||
|
||||
The `get_user` implementation calls `token_info` to get the `user_id`, then
|
||||
`get_user` with that ID. This should be a single call.
|
||||
|
||||
**Action**: Short-term, BFF sessions with user caching (spec 003) eliminates
|
||||
repeated get_user calls. Long-term, consider pushing for a "get current user"
|
||||
endpoint in forest-server that combines token_info + get_user.
|
||||
|
||||
We should be able to store most of this in the session, with a jwt etc. That should be fine for now
|
||||
|
||||
---
|
||||
|
||||
## 5. Cookie Security Gap
|
||||
|
||||
`auth_cookies()` sets `HttpOnly` and `SameSite=Lax` but does NOT set `Secure`.
|
||||
|
||||
The spec explicitly requires:
|
||||
> forage_access cookie: access_token, HttpOnly, **Secure**, SameSite=Lax
|
||||
|
||||
Without `Secure`, cookies are sent over plain HTTP. Access tokens can be
|
||||
intercepted on any non-HTTPS connection.
|
||||
|
||||
**Action**: Fix immediately regardless of whether BFF sessions are implemented.
|
||||
If BFF sessions come first, ensure the session cookie sets `Secure`.
|
||||
|
||||
---
|
||||
|
||||
## 6. The Mock Is Too Friendly
|
||||
|
||||
`MockForestClient` always succeeds (except one login check). Tests prove:
|
||||
- Templates render without errors
|
||||
- Redirects go to the right places
|
||||
- Cookies get set
|
||||
|
||||
Tests do NOT prove:
|
||||
- Error handling for real error scenarios (only one bad-credentials test)
|
||||
- What happens when `get_user` fails mid-flow (token expired between pages)
|
||||
- What happens when `create_token` or `delete_token` fails
|
||||
- What happens when forest-server returns unexpected/partial data
|
||||
- Behavior under concurrent requests
|
||||
|
||||
**Action**: Make the mock configurable per-test. A builder pattern or
|
||||
`Arc<Mutex<MockBehavior>>` would let tests control which calls succeed/fail.
|
||||
Add error-path tests for every route, not just login.
|
||||
|
||||
---
|
||||
|
||||
## 7. Navigation Links to Nowhere
|
||||
|
||||
`base.html.jinja` links to: `/docs`, `/components`, `/about`, `/blog`,
|
||||
`/privacy`, `/terms`, `/docs/deployments`, `/docs/registry`, `/docs/services`.
|
||||
|
||||
None exist. They all 404.
|
||||
|
||||
This isn't a code quality issue - it's a user experience issue for anyone
|
||||
visiting the site. Every page has a nav and footer full of dead links.
|
||||
|
||||
**Action**: Either remove links to unbuilt pages, add placeholder pages with
|
||||
"coming soon" content, or use a `disabled` / `cursor-not-allowed` style that
|
||||
makes it clear they're not yet available.
|
||||
|
||||
Comment
|
||||
add a place holder and a todo, also remove the docs, we don't need that yet. also remove the blog and other stuff. Lets just stick with the main things. components and the login etc.
|
||||
|
||||
---
|
||||
|
||||
## 8. VSDD Methodology vs. Reality
|
||||
|
||||
VSDD.md describes 6 phases: spec crystallization, test-first implementation,
|
||||
adversarial refinement, feedback integration, formal hardening (fuzz testing,
|
||||
mutation testing, static analysis), and convergence.
|
||||
|
||||
In practice:
|
||||
- Phase 1 (specs): Done well
|
||||
- Phase 2 (TDD-ish): Tests written, but not strictly red-green-refactor
|
||||
- Phase 3 (adversarial): This review
|
||||
- Phases 4-6: Not done
|
||||
|
||||
The full pipeline includes fuzz testing, mutation testing, and property-based
|
||||
tests. None of these exist. The convergence criterion ("adversary must
|
||||
hallucinate flaws") is unrealistic - real code always has real improvements.
|
||||
|
||||
This isn't a problem if VSDD is treated as aspirational guidance rather than
|
||||
a strict process. But if the methodology doc says one thing and practice does
|
||||
another, the doc loses authority.
|
||||
|
||||
**Action**: Either trim VSDD.md to match what's actually practiced (spec ->
|
||||
test -> implement -> review -> iterate), or commit to running the full pipeline
|
||||
on at least one feature to validate whether the overhead is worth it.
|
||||
|
||||
Comment: Write in claude.md that we need to follow the process religiously
|
||||
|
||||
---
|
||||
|
||||
## 9. The Pricing Page Sells Vapor
|
||||
|
||||
The pricing page lists managed deployments, container runtimes, PostgreSQL
|
||||
provisioning, private registries. None of this exists. The roadmap has 4
|
||||
phases before any of it works.
|
||||
|
||||
The landing page has a "Get started for free" CTA leading to `/signup`, which
|
||||
creates an account on forest-server. After signup, the dashboard is empty -
|
||||
there's nothing to do. No components to browse, no deployments to create.
|
||||
|
||||
If this site goes live as-is, you're either:
|
||||
- Collecting signups for a waitlist (fine, but say so explicitly)
|
||||
- Implying a product exists that doesn't (bad)
|
||||
|
||||
**Action**: Add "early access" / "waitlist" framing. The dashboard should
|
||||
explain what's coming and what the user can do today (manage tokens, explore
|
||||
the registry when it exists). The pricing page should indicate which features
|
||||
are available vs. planned.
|
||||
|
||||
Comment: Only add container deployments for now, add the other things as tbd, forget postgresql for now
|
||||
|
||||
---
|
||||
|
||||
## 10. Tailwind CSS Not Wired Up
|
||||
|
||||
Templates use Tailwind classes (`bg-white`, `text-gray-900`, `max-w-6xl`, etc.)
|
||||
throughout, but the CSS is loaded from `/static/css/style.css`. If this file
|
||||
doesn't contain compiled Tailwind output, none of the styling works and the
|
||||
site is unstyled HTML.
|
||||
|
||||
`mise.toml` has `tailwind:build` and `tailwind:watch` tasks, but it's unclear
|
||||
if these have been run or if the output is committed.
|
||||
|
||||
**Action**: Verify the Tailwind pipeline works end-to-end. Either commit the
|
||||
compiled CSS or ensure CI builds it. An unstyled site is worse than no site.
|
||||
|
||||
---
|
||||
|
||||
## 11. forage-server Isn't Horizontally Scalable
|
||||
|
||||
With in-memory session state (post BFF sessions), raw token cookies (today),
|
||||
and no shared state layer, forage-server is a single-instance application.
|
||||
That's fine for now, but it constrains deployment options.
|
||||
|
||||
This isn't urgent - single-instance Rust serving SSR pages can handle
|
||||
significant traffic. But it should be a conscious decision, not an accident.
|
||||
|
||||
**Action**: Document this constraint. When horizontal scaling becomes needed,
|
||||
the session store trait makes it straightforward to swap to Redis/Postgres.
|
||||
|
||||
Comment: Set up postgresql like we do in forest and so forth
|
||||
|
||||
---
|
||||
|
||||
## Summary: Prioritized Actions
|
||||
|
||||
### Must Do (before any deployment)
|
||||
1. **Fix cookie Secure flag** - real security gap
|
||||
2. **Implement BFF sessions** (spec 003) - fixes token refresh, caching, security
|
||||
3. **Remove dead nav links** or add placeholders - broken UX
|
||||
|
||||
### Should Do (before public launch)
|
||||
4. **Add "early access" framing** to pricing/dashboard - honesty about product state
|
||||
5. **Verify Tailwind pipeline** - unstyled site is unusable
|
||||
6. **Improve test mock** - configurable per-test, error path coverage
|
||||
|
||||
### Do When Relevant
|
||||
7. **Define ownership boundary** (forage vs. forest-server) - shapes all future work
|
||||
8. **Simplify crate structure** or justify it with concrete plans
|
||||
9. **Align VSDD doc with practice** - keep methodology honest
|
||||
10. **Plan for horizontal scaling** - document the constraint, prepare the escape hatch
|
||||
|
||||
---
|
||||
|
||||
## What's Good
|
||||
|
||||
To be fair, the project has strong foundations:
|
||||
|
||||
- **Architecture is sound.** Thin frontend proxying to forest-server is the
|
||||
right call. Trait-based abstraction for testability is clean.
|
||||
- **Spec-first approach works.** Specs are clear, implementation matches them,
|
||||
tests verify the contract.
|
||||
- **Tech choices are appropriate.** Axum + MiniJinja for SSR is fast, simple,
|
||||
and right-sized. No over-engineering with SPAs or heavy frameworks.
|
||||
- **Cookie-based auth proxy is correct** for this kind of frontend (once moved
|
||||
to BFF sessions).
|
||||
- **CI mirrors forest's patterns** - good for consistency across the ecosystem.
|
||||
- **ForestAuth trait** makes testing painless and the gRPC boundary clean.
|
||||
- **The gRPC client** is well-structured with proper error mapping.
|
||||
|
||||
The issues are about what's missing, not what's wrong with what exists.
|
||||
176
specs/reviews/002-adversarial-review-phase3.md
Normal file
176
specs/reviews/002-adversarial-review-phase3.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Adversarial Review 002 - Post Spec 004 (Projects & Usage)
|
||||
|
||||
**Date**: 2026-03-07
|
||||
**Scope**: Full codebase review after specs 001-004
|
||||
**Tests**: 53 total (17 core + 36 server), clippy clean
|
||||
**Verified**: Against real forest-server on localhost:4040
|
||||
|
||||
---
|
||||
|
||||
## 1. Architecture: Repeated gRPC Calls Per Page Load
|
||||
|
||||
**Severity: High**
|
||||
|
||||
Every authenticated platform page (`projects_list`, `project_detail`, `usage`) calls `list_my_organisations` to verify membership. This means:
|
||||
|
||||
- `/orgs/testorg/projects` -> 1 call to list orgs + 1 call to list projects = **2 gRPC calls**
|
||||
- `/orgs/testorg/projects/my-api` -> 1 call to list orgs + 1 call to list artifacts = **2 gRPC calls**
|
||||
- Dashboard -> 1 call to list orgs (redirect) then the target page makes its own calls
|
||||
|
||||
This is the same pattern we fixed for `get_user()` in spec 003 (caching user in session). The org list should be cached in the session too, or at minimum passed through from the `Session` extractor.
|
||||
|
||||
**Recommendation**: Cache the user's org memberships in `SessionData` / `CachedUser`. Refresh on session refresh or after a configurable TTL. This eliminates the most expensive repeated call.
|
||||
|
||||
---
|
||||
|
||||
## 2. Architecture: Two Traits, One Struct, Inconsistent Error Handling
|
||||
|
||||
**Severity: Medium**
|
||||
|
||||
`GrpcForestClient` now implements both `ForestAuth` and `ForestPlatform`. The `authed_request` helper is duplicated:
|
||||
- `GrpcForestClient::authed_request()` returns `AuthError`
|
||||
- `platform_authed_request()` is a free function returning `PlatformError`
|
||||
|
||||
Same logic, two copies, two error types. `AppState` holds `Arc<dyn ForestAuth>` + `Arc<dyn ForestPlatform>` which in production point to the same struct. This is fine for testability but means the constructors are getting wide (4 args now).
|
||||
|
||||
**Recommendation**: Consider a single `ForestClient` trait that combines both, or unify the auth helper into a generic form. Not urgent but will become pain as more services are added.
|
||||
|
||||
---
|
||||
|
||||
## 3. Security: Org Name in URL Path is User-Controlled
|
||||
|
||||
**Severity: Medium**
|
||||
|
||||
Routes use `{org}` from the URL path and pass it directly to gRPC calls and template rendering:
|
||||
- `format!("{org} - Projects - Forage")` in HTML title
|
||||
- `format!("Projects in {org}")` in meta description
|
||||
|
||||
MiniJinja auto-escapes by default in HTML context, so XSS via `<script>` in org name is mitigated. However:
|
||||
- The `title` tag is outside normal HTML body escaping in some edge cases
|
||||
- The `description` meta tag uses attribute context escaping
|
||||
|
||||
**Recommendation**: Validate or sanitize `{org}` and `{project}` path params at the route level. The org membership check already prevents arbitrary names from rendering (403 if not a member), but defense in depth matters.
|
||||
|
||||
---
|
||||
|
||||
## 4. Session: `last_seen_at` Updated on Every Request
|
||||
|
||||
**Severity: Low**
|
||||
|
||||
The `Session` extractor calls `state.sessions.update()` on **every single request** to update `last_seen_at`. For the PostgreSQL store, this means a write query per page load. For the in-memory store, it's a write lock on the HashMap.
|
||||
|
||||
**Recommendation**: Only update `last_seen_at` if the previous value is older than some threshold (e.g., 5 minutes). This is a simple check that eliminates 95%+ of session writes.
|
||||
|
||||
---
|
||||
|
||||
## 5. Testing: No Test for the `ForestPlatform` gRPC Implementation
|
||||
|
||||
**Severity: Medium**
|
||||
|
||||
The `GrpcForestClient` `ForestPlatform` impl (lines 294-393 of `forest_client.rs`) has zero test coverage. It's only tested indirectly via integration tests that use `MockPlatformClient`. The mapping from proto types to domain types (`Organisation`, `Artifact`) is untested.
|
||||
|
||||
Specifically:
|
||||
- The `zip(resp.roles)` could silently truncate if lengths don't match
|
||||
- The `unwrap_or_default()` on `a.context` hides missing data
|
||||
- The empty-string-to-None conversion for `description` is a subtle behavior
|
||||
|
||||
**Recommendation**: Add unit tests for the proto-to-domain conversion functions. Extract them into named functions (like `convert_user` and `convert_token` for auth) to make them testable.
|
||||
|
||||
---
|
||||
|
||||
## 6. Testing: Dashboard Test Changed Behavior Without Full Coverage
|
||||
|
||||
**Severity: Medium**
|
||||
|
||||
`dashboard_with_session_returns_200` was renamed to `dashboard_with_session_redirects_to_org` and now only checks for `StatusCode::SEE_OTHER`. The old test verified the dashboard rendered with `testuser` content. The new behavior (redirect) is tested, but the onboarding page content is only tested in `dashboard_no_orgs_shows_onboarding` which checks for `"forest orgs create"`.
|
||||
|
||||
Nobody tests:
|
||||
- What happens if `list_my_organisations` returns an error (not empty, an actual error)
|
||||
- The dashboard template rendering is correct (title, user info)
|
||||
|
||||
**Recommendation**: Add test for platform unavailable during dashboard load.
|
||||
|
||||
---
|
||||
|
||||
## 7. VSDD Process: Spec 004 Skipped the Red Gate
|
||||
|
||||
**Severity: Medium (Process)**
|
||||
|
||||
The VSDD spec says "All tests must fail before implementation begins." In spec 004, we wrote templates, routes, AND tests in the same step. Tests never had a Red phase - they were green on first run. This is pragmatic but violates VSDD.
|
||||
|
||||
The earlier specs (001-003) had proper Red->Green cycles. Spec 004 was implemented as "write everything at once."
|
||||
|
||||
**Recommendation**: For future specs, write the test assertions first with stub routes that return 501/500, verify they fail, then implement. Even if the cycle is fast, the discipline catches assumption errors.
|
||||
|
||||
---
|
||||
|
||||
## 8. Template: No Authenticated Navigation
|
||||
|
||||
**Severity: Medium (UX)**
|
||||
|
||||
The spec called for "Authenticated navigation with org switcher" but it wasn't implemented. All pages (projects, usage, onboarding) use the same marketing `base.html.jinja` which shows "Pricing / Components / Sign in" in the nav, even when the user is authenticated and browsing their org's projects.
|
||||
|
||||
This means:
|
||||
- No way to switch orgs from the nav
|
||||
- No visual indication you're logged in (except the page content)
|
||||
- No link back to projects/usage from the nav on authenticated pages
|
||||
|
||||
**Recommendation**: Either pass `user` and `orgs` to the base template and conditionally render an app nav, or create a separate `app_base.html.jinja` that authenticated pages extend.
|
||||
|
||||
---
|
||||
|
||||
## 9. Error UX: Raw Status Codes as Responses
|
||||
|
||||
**Severity: Medium**
|
||||
|
||||
403 and 500 errors return bare Axum status codes with no HTML body:
|
||||
- Non-member accessing `/orgs/someorg/projects` -> blank 403 page
|
||||
- Template error -> blank 500 page
|
||||
|
||||
**Recommendation**: Add simple error templates (`403.html.jinja`, `500.html.jinja`) and render them instead of bare status codes. Even a one-line "You don't have access to this organisation" is better than a browser default error page.
|
||||
|
||||
---
|
||||
|
||||
## 10. Code: `expires_in_seconds` is Suspiciously Large
|
||||
|
||||
**Severity: Low (Upstream)**
|
||||
|
||||
During integration testing, forest-server returned `expiresInSeconds: 1775498883` which is ~56 years. This is likely a bug in forest-server (perhaps it's returning an absolute timestamp instead of a duration). Our code treats it as a duration: `now + Duration::seconds(tokens.expires_in_seconds)`. If forest-server is actually returning a Unix timestamp, we'd set expiry to year 2082.
|
||||
|
||||
The session refresh logic would never trigger, which means tokens are effectively permanent. The BFF session protects the browser from this (sessions expire by `last_seen_at` reaper), but the underlying token is never refreshed.
|
||||
|
||||
**Recommendation**: Verify with forest-server what `expires_in_seconds` actually means. If it's a bug, cap it to a sane maximum (e.g., 24h) client-side.
|
||||
|
||||
---
|
||||
|
||||
## 11. Missing: CSRF Protection on State-Mutating Endpoints
|
||||
|
||||
**Severity: Medium (Security)**
|
||||
|
||||
`POST /logout`, `POST /login`, `POST /signup`, `POST /settings/tokens`, `POST /settings/tokens/{id}/delete` all accept form submissions with no CSRF token. The `SameSite=Lax` cookie provides baseline protection against cross-origin POST from foreign sites, but:
|
||||
|
||||
- `SameSite=Lax` allows top-level navigations (e.g., form auto-submit from a link)
|
||||
- A CSRF token is the standard defense-in-depth
|
||||
|
||||
**Recommendation**: Add CSRF tokens to all forms. MiniJinja can render a hidden `<input>` and the server validates it against a session-bound value.
|
||||
|
||||
---
|
||||
|
||||
## Prioritized Actions
|
||||
|
||||
### Must Do (before next feature)
|
||||
1. **Error pages**: Add 403/500 error templates (bare status codes are bad UX)
|
||||
2. **Authenticated nav**: Implement app navigation for logged-in users
|
||||
3. **Platform-unavailable test**: Add test for dashboard when `list_my_organisations` errors
|
||||
|
||||
### Should Do (this iteration)
|
||||
4. **Cache org memberships in session**: Eliminate repeated `list_my_organisations` gRPC call
|
||||
5. **Throttle session writes**: Only update `last_seen_at` if stale (>5min)
|
||||
6. **Extract proto conversion functions**: Make them testable, add unit tests
|
||||
7. **CSRF tokens**: Add to all POST forms
|
||||
|
||||
### Do When Relevant
|
||||
8. **Unify auth helper**: Deduplicate `authed_request` / `platform_authed_request`
|
||||
9. **Validate path params**: Sanitize `{org}` and `{project}` at route level
|
||||
10. **Investigate `expires_in_seconds`**: Confirm forest-server semantics, cap if needed
|
||||
11. **VSDD discipline**: Enforce Red Gate for future specs
|
||||
Reference in New Issue
Block a user