01 Issue
Feature · Shipped · Swamp CLI
Assignees: stack72

#525 feat: giga-swamp phase 6 — Namespace-scoped sync

Opened by stack72 · 6/1/2026 · Shipped 6/2/2026

Problem

Phases 1-4 established namespace infrastructure, catalog schema, filesystem layout, and CEL cross-namespace queries. A single repo with a namespace works correctly against a filesystem datastore. But the multi-repo sharing scenario — the core value of giga-swamp — requires namespace-scoped sync. Today, push/pull syncs the entire datastore, and each repo's catalog only sees its own writes. Two repos sharing an S3 datastore can't see each other's data.

Context: Previous Scoped Sync Failures

Two previous scoped sync attempts were reverted:

  • PR #1386: Pulled files landed in the cache, but the catalog wasn't re-indexed, so data list returned 0 rows after a cross-repo pull. The post-pull catalog rebuild is load-bearing.
  • PR #134: Treated directory dirty paths as file paths. Scoped push called Deno.readFile on a directory and failed silently.
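
The PR #134 failure can be illustrated with a small sketch that expands a dirty path into the files beneath it before anything tries to read it. The FsLike interface, expandDirtyPath, and the in-memory tree are hypothetical names invented for this example; they are not the real swamp sync code.

```typescript
// Hypothetical minimal filesystem surface, injected so the sketch is testable.
interface FsLike {
  isDirectory(path: string): boolean;
  readDir(path: string): string[]; // child names, not full paths
}

// Expand a dirty path (which may be a directory) into the files beneath it,
// so that callers never try to read a directory as if it were a file.
function expandDirtyPath(fs: FsLike, relPath: string): string[] {
  if (!fs.isDirectory(relPath)) return [relPath];
  const files: string[] = [];
  for (const name of fs.readDir(relPath)) {
    files.push(...expandDirtyPath(fs, `${relPath}/${name}`));
  }
  return files.sort();
}

// In-memory tree for illustration: each directory maps to its child names.
const tree: Record<string, string[]> = {
  "data/model-a": ["v1", "latest.json"],
  "data/model-a/v1": ["content.json"],
};
const memFs: FsLike = {
  isDirectory: (p) => p in tree,
  readDir: (p) => tree[p] ?? [],
};

const expanded = expandDirtyPath(memFs, "data/model-a");
console.log(expanded);
```

A real implementation would also need to surface read errors instead of swallowing them, since the original failure was silent.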

The failure mode is always the same: push works, pull works in isolation, but the writer-pushes/reader-pulls/reader-queries flow breaks silently. Unit tests don't catch it. Only swamp-uat catches the cross-repo scenario.

Solution: Four Sub-Phases (Each Independently Shippable)

Each sub-phase ships as a separate PR, gated independently on swamp-uat. If one breaks, revert it without touching the others. The system works in a degraded-but-correct state after each sub-phase.

Sub-phase 6a: Plumbing — namespace on DatastoreSyncOptions

Add optional namespace field to DatastoreSyncOptions. Pass it through the sync coordinator to pullChanged/pushChanged. Extensions that ignore it keep syncing everything — identical to today. Zero behavior change.
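
A minimal sketch of the pass-through, assuming simplified shapes: only the optional namespace field on DatastoreSyncOptions comes from this issue. The SyncExtension interface, the string[] return type, and coordinatePush are hypothetical stand-ins for the real coordinator and extension contract.

```typescript
// Only `namespace` is taken from the issue; the real options type has other fields.
interface DatastoreSyncOptions {
  namespace?: string;
}

// Hypothetical extension surface; pullChanged would follow the same pattern.
interface SyncExtension {
  pushChanged(options: DatastoreSyncOptions): string[];
}

const allPaths = ["team-a/data/x.json", "team-b/data/y.json"];

// An extension that ignores `namespace` keeps syncing everything, as today.
const legacyExtension: SyncExtension = {
  pushChanged: (_options) => [...allPaths],
};

// A namespace-aware extension may narrow the work once later sub-phases land.
const scopedExtension: SyncExtension = {
  pushChanged: (options) =>
    options.namespace === undefined
      ? [...allPaths]
      : allPaths.filter((p) => p.startsWith(`${options.namespace}/`)),
};

// The coordinator only forwards the options; it does not interpret them.
function coordinatePush(ext: SyncExtension, options: DatastoreSyncOptions): string[] {
  return ext.pushChanged(options);
}

console.log(coordinatePush(legacyExtension, { namespace: "team-a" }));
console.log(coordinatePush(scopedExtension, { namespace: "team-a" }));
```

Because the field is optional, callers that never set it and extensions that never read it behave exactly as before, which is what the 6a ship gate checks.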

Ship gate: all tests pass, swamp-uat passes, behavior is byte-identical to before.

Sub-phase 6b: Per-namespace index partitioning

Replace the monolithic .datastore-index.json with per-namespace indexes at {namespace}/.datastore-index.json. Solo mode keeps one index. The sync coordinator passes namespace to push/pull so only the repo's own namespace subtree is synced.

This is the highest-risk sub-phase. The markDirty contract has 8 load-bearing rules. Namespace support is additive (it adds a field to DatastoreSyncOptions) and does NOT change how dirty tracking works. The dirty path is a directory, not a file.
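
The index layout described above could be sketched as follows. The helper names indexPathFor and inNamespaceSubtree are hypothetical, and the sketch assumes the solo-mode index stays at the datastore root under the existing file name and that a namespace maps to a top-level directory.

```typescript
const INDEX_FILE = ".datastore-index.json";

// Solo mode (no namespace) keeps one index; namespaced repos get their own.
function indexPathFor(namespace?: string): string {
  return namespace === undefined ? INDEX_FILE : `${namespace}/${INDEX_FILE}`;
}

// True when a path belongs to the subtree that should be synced.
// The trailing slash prevents "team-a" from matching "team-ab/...".
function inNamespaceSubtree(relPath: string, namespace?: string): boolean {
  return namespace === undefined || relPath.startsWith(`${namespace}/`);
}

console.log(indexPathFor());
console.log(indexPathFor("team-a"));
console.log(inNamespaceSubtree("team-ab/data/x", "team-a"));
```

Neither helper touches dirty tracking: they only decide which index file and which subtree a sync operates on.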

Ship gate: swamp-uat datastore suite passes. Critical test path: writer repo pushes data, reader repo pulls, reader runs data list, data is visible with correct provenance. Post-pull catalog rebuild must fire.
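
The critical test path can be modelled in memory to show why the post-pull rebuild matters. RepoSim and its methods are hypothetical stand-ins invented for this sketch; the real check is the swamp-uat datastore suite, not this simulation.

```typescript
// Hypothetical in-memory stand-ins; real repos, caches, and catalogs are richer.
type Remote = Map<string, string>; // path -> content

class RepoSim {
  private readonly cache = new Map<string, string>();
  private catalog: string[] = [];

  write(path: string, content: string): void {
    this.cache.set(path, content);
    this.rebuildCatalog();
  }

  push(remote: Remote): void {
    this.cache.forEach((content, path) => remote.set(path, content));
  }

  // Pulled files land in the cache; the catalog only sees them after a rebuild.
  pull(remote: Remote, rebuildCatalog: boolean): void {
    remote.forEach((content, path) => this.cache.set(path, content));
    if (rebuildCatalog) this.rebuildCatalog();
  }

  rebuildCatalog(): void {
    this.catalog = Array.from(this.cache.keys()).sort();
  }

  // Queries read the catalog, never the cache directly.
  dataList(): string[] {
    return [...this.catalog];
  }
}

const remote: Remote = new Map();
const writer = new RepoSim();
writer.write("team-a/data/x.json", "{}");
writer.push(remote);

const brokenReader = new RepoSim();
brokenReader.pull(remote, false); // the PR #1386 failure mode
const goodReader = new RepoSim();
goodReader.pull(remote, true);

console.log(brokenReader.dataList().length, goodReader.dataList().length);
```

Push and pull both succeed in the broken case; only the query afterwards reveals the missing rebuild, which is why the gate has to exercise the full writer-to-reader flow.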

Sub-phase 6c: Foreign catalog export/pull

Each namespace publishes a lightweight .catalog-export.json containing its catalog rows as a flat JSON array. Foreign catalog pull fetches these exports and upserts into the local catalog. This is net-new sync infrastructure — additive, nothing existing depends on it.
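
A sketch of the export and upsert, under assumed shapes. The CatalogRow fields, rowKey, exportCatalog, and upsertForeign are hypothetical; the issue only specifies a flat JSON array of catalog rows and an upsert into the local catalog. The lastSynced stamp anticipates the timestamp this sub-phase adds to foreign catalog data.

```typescript
// Hypothetical row shape; the real catalog schema comes from earlier phases.
interface CatalogRow {
  namespace: string;
  model: string;
  name: string;
  version: number;
  lastSynced?: string;
}

const rowKey = (r: CatalogRow): string => `${r.namespace}:${r.model}:${r.name}`;

// Serialise one namespace's rows as a flat JSON array.
function exportCatalog(rows: CatalogRow[], namespace: string): string {
  return JSON.stringify(rows.filter((r) => r.namespace === namespace));
}

// Upsert exported rows into the local catalog, stamping when they were synced.
function upsertForeign(local: Map<string, CatalogRow>, exportJson: string, syncedAt: string): void {
  const rows = JSON.parse(exportJson) as CatalogRow[];
  for (const row of rows) {
    local.set(rowKey(row), { ...row, lastSynced: syncedAt });
  }
}

const writerRows: CatalogRow[] = [
  { namespace: "team-a", model: "server", name: "web-1", version: 1 },
];
const localCatalog = new Map<string, CatalogRow>();
upsertForeign(localCatalog, exportCatalog(writerRows, "team-a"), "2026-06-02T00:00:00Z");

// A later export with a newer version replaces the row instead of duplicating it.
const newerRows: CatalogRow[] = [
  { namespace: "team-a", model: "server", name: "web-1", version: 2 },
];
upsertForeign(localCatalog, exportCatalog(newerRows, "team-a"), "2026-06-03T00:00:00Z");
console.log(localCatalog.size);
```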

The local catalog backfill (full-replace) needs to become namespace-scoped: delete own-namespace rows, insert own-namespace rows, preserve foreign rows. This was deferred from Phase 3 and is now required.
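
The namespace-scoped backfill could look like the sketch below, which replaces only the repo's own rows. The Row shape and scopedBackfill are hypothetical, and the catalog is modelled as a plain array rather than the real catalog store.

```typescript
// Hypothetical row shape for illustration only.
interface Row {
  namespace: string;
  name: string;
}

// Replace own-namespace rows with freshly scanned ones; keep foreign rows untouched.
function scopedBackfill(existing: Row[], ownNamespace: string, fresh: Row[]): Row[] {
  const foreign = existing.filter((r) => r.namespace !== ownNamespace);
  const own = fresh.filter((r) => r.namespace === ownNamespace);
  return [...foreign, ...own];
}

const before: Row[] = [
  { namespace: "team-a", name: "stale" },
  { namespace: "team-b", name: "foreign" },
];
const after = scopedBackfill(before, "team-a", [{ namespace: "team-a", name: "current" }]);
console.log(after.map((r) => `${r.namespace}/${r.name}`));
```

A full-replace backfill would have dropped the team-b row here, which is why the Phase 3 deferral becomes a requirement once foreign rows exist.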

Also adds a swamp datastore catalog pull command and a lastSynced timestamp on foreign catalog data (Design Decision 7).

Ship gate: writer pushes data and exports its catalog. Reader pulls the foreign catalog. Reader queries cross-namespace data via CEL (ns:model syntax from Phase 4). Metadata is visible; content is not yet (that arrives in 6d).

Sub-phase 6d: On-demand cross-namespace content fetch

When a cross-namespace CEL expression accesses another namespace's data content (e.g. attributes), the content is fetched on demand from the remote datastore. The fetch is a single-file GET, not a full sync. The content is cached locally for the duration of the command but not persisted: foreign content is ephemeral.

Adds optional fetchForeignContent method to DatastoreSyncService. Extensions that don't implement it return null (content unavailable).
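
A sketch of how the optional method and the per-command cache might fit together. The method name fetchForeignContent and the null result come from this issue; its parameters, the string content type, and the ForeignContentCache class are assumptions made for the example.

```typescript
// Assumed signature: the issue names the method but not its parameters.
interface DatastoreSyncService {
  fetchForeignContent?(namespace: string, relPath: string): Promise<string | null>;
}

// Lives only as long as one command; nothing is written to disk.
class ForeignContentCache {
  private readonly service: DatastoreSyncService;
  private readonly cache = new Map<string, string | null>();

  constructor(service: DatastoreSyncService) {
    this.service = service;
  }

  async get(namespace: string, relPath: string): Promise<string | null> {
    const key = `${namespace}:${relPath}`;
    if (this.cache.has(key)) return this.cache.get(key) ?? null;
    // Extensions without the method yield null (content unavailable).
    const content = (await this.service.fetchForeignContent?.(namespace, relPath)) ?? null;
    this.cache.set(key, content);
    return content;
  }
}

let remoteGets = 0;
const withFetch: DatastoreSyncService = {
  fetchForeignContent: async (_ns, _path) => {
    remoteGets += 1;
    return '{"attributes":{"size":"large"}}';
  },
};
const withoutFetch: DatastoreSyncService = {};

const demo = (async () => {
  const cache = new ForeignContentCache(withFetch);
  const first = await cache.get("team-a", "server/web-1");
  const second = await cache.get("team-a", "server/web-1");
  const missing = await new ForeignContentCache(withoutFetch).get("team-a", "server/web-1");
  return { first, second, missing };
})();
```

Creating a new ForeignContentCache per command invocation is one way to keep foreign content ephemeral, since nothing outlives the object.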

Ship gate: data.latest('ns:model', 'name').attributes returns the actual content from the foreign namespace, fetched on demand.

Critical Constraints

  • Do NOT change the markDirty contract — 8 load-bearing rules extension authors depend on
  • Do NOT change DEFAULT_DATASTORE_SUBDIRS — destructive migration side effects
  • Do NOT restructure acquireModelLocks without verifying catalog rebuild is preserved
  • Do NOT assume markDirty relPath is a file path — it's a directory path from the data repo
  • Do NOT trust unit tests alone for sync changes — swamp-uat is the only reliable validation
  • Do NOT benchmark as validation — Phase 2 of the previous scoped sync showed 5x improvement by accidentally dropping most of the work

Verification Requirements (Non-Negotiable)

Before EVERY sub-phase PR:

  1. deno check, deno lint, deno fmt, deno run test
  2. deno run compile — build candidate binary
  3. Run swamp-uat datastore suite against candidate binary
  4. Critical test path: writer repo pushes → reader repo pulls → reader runs data list → data is visible with correct provenance
  5. Verify catalog rebuild fires after pull
  6. Solo mode regression gate: byte-identical behavior

Design Context

Full design doc: resources/giga-swamp.md (see the Sync, Namespace-Scoped Sync, and Relationship to Scoped Sync Work sections)

This is Phase 6 of 7. Depends on Phases 1-4 (all shipped). Phase 5 (CLI output) and Phase 7 (migration commands) will follow after Phase 6.

02 Bog Flow
OPEN · TRIAGED · IN PROGRESS · SHIPPED (+1 more) · ASSIGNED (+8 more) · REVIEW (+3 more) · PR_MERGED (+1 more) · NOTIFICATION_SKIPPED

Shipped

6/2/2026, 9:27:31 PM

03 Sludge Pulse
stack72 assigned stack72 · 6/2/2026, 9:45:53 AM