An operating system
for AI agents.
Your agents die every conversation. Mazemaker keeps them alive.
The persistent layer your LLMs run on top of. Memory formation, not retrieval. Background consolidation while they sleep. Conflict supersession when your mind changes. A knowledge-graph filesystem your agent walks instead of searches. Federation across machines so one brain spans your whole fleet. Every claimed lift drops on demand the moment its mechanism is removed: evidence, not vibes.
In one minute… a lot can happen.
Including the worst nightmare: an entire conversation lost to a context-window reset. Every fix, every preference, every name, every path you mentioned three weeks ago: gone in the time it takes you to refill a coffee.
Mazemaker fixes that. Plug it in, and your agent gets a brain that…
- Remembers what you tell it. Preferences, decisions, fixes, the path of that file you mentioned three weeks ago.
- Connects related ideas. Like a real notebook with cross-references, not a search box.
- Reflects while you sleep. Overnight, it strengthens what matters and notices new connections.
- Updates itself when you change your mind. Old facts get superseded, not duplicated.
That’s it — the rest of this page goes deeper the further you scroll.
Don’t want to install anything? A managed hosted endpoint runs at api.mazemaker.dev: sign up, point your agent at the MCP endpoint, done.
Not memory. The kernel.
Vector search retrieves nearby text. Mazemaker manages the cognition itself — processes (your agents), memory management (consolidation + supersession), filesystem (the knowledge graph), scheduler (dream cycles), IPC (federation). The difference is not a percentage. It is a phase change — questions vector databases cannot answer by construction become routine.
Vector databases treat memory as a flat space of disconnected documents. Mazemaker builds a labyrinth. Every memory is a node. Every relationship is a weighted edge auto-discovered at insert time. Your agent does not search the cloud; it walks the labyrinth. Spreading activation propagates outward from a starting node with attenuation, much the way human associative recall is thought to work. Hop-2 reasoning (the questions a cosine search cannot answer by construction) goes from R@10 0.00 to 1.00. Not a thirty-percent improvement. A phase change.
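To make the walk concrete, here is a minimal sketch of spreading activation with per-hop attenuation. It is an illustration under assumptions, not Mazemaker's implementation: the function name, the `decay` factor, the `floor` cutoff, and the dictionary graph layout are all hypothetical.

```python
def spread_activation(graph, start, decay=0.5, max_hops=2, floor=0.01):
    """Propagate activation outward from `start`, attenuating at every hop.

    graph: {node: {neighbor: edge_weight}} with weights in (0, 1].
    Returns {node: activation}, keeping the strongest signal seen per node.
    All parameter names and defaults are illustrative assumptions.
    """
    activation = {start: 1.0}
    frontier = {start: 1.0}
    for _ in range(max_hops):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                signal = energy * weight * decay  # attenuation per hop
                if signal < floor:
                    continue  # too weak to keep propagating
                if signal > activation.get(neighbor, 0.0):
                    activation[neighbor] = signal
                    next_frontier[neighbor] = signal
        frontier = next_frontier
    return activation


# C is only reachable through A -> B -> C.
graph = {
    "A": {"B": 0.9},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"B": 0.8},
}
scores = spread_activation(graph, "A")
one_hop = spread_activation(graph, "A", max_hops=1)
```

With two hops allowed, `C` receives activation through `B`; with `max_hops=1` it never appears, which is the shape of question a single similarity lookup cannot reach.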
The biological-sleep-inspired dream engine runs three phases overnight. NREM replays recent memories and strengthens the edges that fired together. REM bridges isolated nodes that never met but probably should. Insight detects communities and crystallises summary memories from clusters. Post-dream synthesis on facts unreachable from any single memory: structurally 0.00 → 0.43 R@10. Memory gets denser, not noisier, every night. No competing product runs autonomous consolidation.
Every recall returns the activation trace — the path the search walked, the edge weights it followed, the confidence at each hop. Your agent can debug its own retrieval. You can see why a memory surfaced instead of trusting a black box. The graph is queryable, not just searchable. No other memory product offers this surface; the rest stop at "here are the top-k documents".
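A rough sketch of what an activation trace could look like: the strongest path to a memory, with the edge weight followed and the running confidence at each hop. The field names (`from`, `to`, `weight`, `confidence`) and the choice of "product of edge weights" as confidence are assumptions made for illustration, not the documented response format.

```python
import heapq
from itertools import count


def recall_with_trace(graph, start, target):
    """Return the strongest path from start to target as a list of hops.

    graph: {node: {neighbor: edge_weight}} with weights in (0, 1].
    Confidence is modelled (hypothetically) as the product of edge weights.
    Returns [] if target is unreachable or is the start node itself.
    """
    tie = count()  # tiebreaker so the heap never compares traces
    heap = [(-1.0, next(tie), start, [])]
    settled = set()
    while heap:
        neg_conf, _, node, trace = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node == target:
            return trace
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                conf = -neg_conf * weight
                hop = {"from": node, "to": neighbor,
                       "weight": weight, "confidence": conf}
                heapq.heappush(heap, (-conf, next(tie), neighbor, trace + [hop]))
    return []


graph = {
    "A": {"B": 0.9, "C": 0.3},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.3, "B": 0.8},
}
trace = recall_with_trace(graph, "A", "C")
for hop in trace:
    print(f'{hop["from"]} -> {hop["to"]}  weight={hop["weight"]}  '
          f'confidence={hop["confidence"]:.2f}')
```

Here the two-hop route through `B` wins over the weak direct edge, and the trace shows exactly why.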
We submitted the entire benchmark suite — including the negative controls that must fail when the relevant mechanism is removed — to GPT-5.5 via the codex CLI. Eight rounds. The first two rejected the suite outright. By round eight, every concrete objection was closed by code change, not by argument. Round eight verdict: unconditional yes — no residual caveat. Every prompt and every verdict is committed verbatim in the repository. This is not how you build a wrapper. This is how you build a category.
Negative controls. Not benchmarks.
Every row below is a knob we turn off that must collapse the result. If the number doesn’t drop on demand when the mechanism is removed, the lift was a coincidence. Most AI infra ships positive demos and cherry-picks; we ship the controls that have to fail.
“If you can’t make the number drop on demand, you don’t have evidence — you have a coincidence.”
— Mazemaker testing protocol
Answer reachable only through A -> B -> C edges. Vanilla cosine cannot solve it by construction.
Collapse proves traversal is load-bearing, not the embedding model accidentally helping.
Facts inferable only after consolidation become reachable after dream cycles.
Newer contradictory facts supersede stale ones instead of duplicating noise.
Concept-mode distractors pile up; the graph still holds continuity.
Real prose n=200: lean beats skynet by +0.18 R@5 and drops dead-weight channels.
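The idea behind a negative control can be sketched in a few lines: score the same queries with the mechanism on and off, and require the collapse. This toy harness is an assumption-laden illustration (the retriever, the query, and the two-hop limit are all made up here), not the benchmark suite itself.

```python
def recall_at_k(retrieve, queries, k=10):
    """Fraction of queries whose gold answer appears in the top-k results."""
    hits = sum(1 for query, gold in queries if gold in retrieve(query)[:k])
    return hits / len(queries)


def make_retriever(seeds, edges, use_graph):
    """Build a toy retriever.

    seeds: query -> directly matched memories (a stand-in for cosine search).
    edges: memory -> linked memories. With use_graph=False, edges are ignored.
    """
    def retrieve(query):
        results = list(seeds.get(query, []))
        if use_graph:
            frontier = list(results)
            for _ in range(2):  # follow up to two hops
                frontier = [n for m in frontier for n in edges.get(m, [])]
                for node in frontier:
                    if node not in results:
                        results.append(node)
        return results
    return retrieve


seeds = {"who approved the migration?": ["A"]}
edges = {"A": ["B"], "B": ["C"]}  # the answer C sits two hops from A
queries = [("who approved the migration?", "C")]

with_graph = recall_at_k(make_retriever(seeds, edges, True), queries)
without_graph = recall_at_k(make_retriever(seeds, edges, False), queries)

# The control: removing traversal must collapse the score.
assert with_graph == 1.0 and without_graph == 0.0
```

If the second number did not fall, the lift could not be attributed to traversal.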
Dream Engine Three-Phase Consolidation
Triggered after 600s idle, after 50 new memories, manually through tooling, or as a standalone daemon.
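The trigger rule above reduces to a small predicate. The thresholds come from the text; the function name and arguments are hypothetical, and the daemon mode is not modelled.

```python
import time

IDLE_SECONDS = 600     # from the text: 600s idle
NEW_MEMORY_LIMIT = 50  # from the text: 50 new memories


def should_dream(last_activity, new_memories, now=None, manual=False):
    """Return True when any trigger condition holds.

    last_activity and now are UNIX timestamps in seconds.
    """
    now = time.time() if now is None else now
    idle = (now - last_activity) >= IDLE_SECONDS
    return manual or idle or new_memories >= NEW_MEMORY_LIMIT
```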
Replay 100 recent memories
Run spreading activation, strengthen active edges by +0.05, weaken inactive edges by 0.01, and prune dead edges below 0.05.
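A minimal sketch of the edge update in this phase, assuming edges live in a plain dictionary. The +0.05 and 0.05 thresholds come from the text and the 0.01 weakening step from the diagram further down; the cap at 1.0 and the function name are assumptions.

```python
def nrem_pass(weights, active, strengthen=0.05, weaken=0.01, prune_below=0.05):
    """Apply one strengthen / weaken / prune pass.

    weights: {(a, b): weight}; active: set of edges that fired during replay.
    Returns a new dict without the pruned edges.
    """
    updated = {}
    for edge, weight in weights.items():
        if edge in active:
            weight = min(1.0, weight + strengthen)  # cap is an assumption
        else:
            weight = weight - weaken
        if weight >= prune_below:
            updated[edge] = weight  # edges below the threshold are dropped
    return updated


weights = {("A", "B"): 0.50, ("B", "C"): 0.055, ("C", "D"): 0.30}
after = nrem_pass(weights, active={("A", "B")})
```

In this example the active edge rises to 0.55, the idle edge `("C", "D")` decays to 0.29, and `("B", "C")` falls under 0.05 and is pruned.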
Bridge isolated memories
Find 50 isolated memories, search similar unconnected nodes, create bridge connections at similarity x 0.3.
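The bridging step might look roughly like this. The `similarity x 0.3` weight and the limit of 50 come from the text; the cosine helper, the `min_similarity` cutoff, and the "pick the single best match" policy are assumptions for the sake of a runnable sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rem_bridges(embeddings, edges, limit=50, min_similarity=0.8, scale=0.3):
    """Bridge each isolated node to its most similar other node.

    embeddings: {node: vector}; edges: {(a, b): weight}, treated as undirected.
    Returns {(isolated, partner): bridge_weight}.
    """
    connected = {node for edge in edges for node in edge}
    isolated = [n for n in embeddings if n not in connected][:limit]
    bridges = {}
    for node in isolated:
        best, best_sim = None, min_similarity
        for other, vector in embeddings.items():
            if other == node:
                continue
            sim = cosine(embeddings[node], vector)
            if sim >= best_sim:
                best, best_sim = other, sim
        if best is not None:
            bridges[(node, best)] = best_sim * scale  # weight = similarity x 0.3
    return bridges


embeddings = {"x": [1, 0], "y": [1, 0.1], "z": [0, 1], "w": [0.9, 0.2]}
edges = {("y", "w"): 0.5}
bridges = rem_bridges(embeddings, edges)
```

Here `x` gets a bridge to `y`, while `z` has no sufficiently similar partner and stays isolated.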
Store communities
Detect connected components, identify bridge nodes, materialize dream insights and derived cluster memory.
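As an illustration of this phase, the sketch below finds communities with BFS and flags bridge nodes with a brute-force check (a node counts as a bridge if removing it splits a community). The insight record layout and the brute-force approach are assumptions; a production engine would likely use something faster.

```python
from collections import deque


def components(adjacency, skip=None):
    """BFS connected components of an undirected graph, optionally ignoring one node."""
    seen, found = set(), []
    for root in adjacency:
        if root in seen or root == skip:
            continue
        seen.add(root)
        queue, members = deque([root]), []
        while queue:
            node = queue.popleft()
            members.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen and nxt != skip:
                    seen.add(nxt)
                    queue.append(nxt)
        found.append(sorted(members))
    return found


def bridge_nodes(adjacency):
    """Nodes whose removal increases the number of components."""
    base = len(components(adjacency))
    return sorted(n for n in adjacency
                  if len(components(adjacency, skip=n)) > base)


adjacency = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
}
# Hypothetical shape for a stored insight: one record per community.
insights = [
    {"members": group,
     "bridges": [n for n in bridge_nodes(adjacency) if n in group]}
    for group in components(adjacency)
]
```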
Dream Engine — deep dive
Triggers fan into the cycle; the cycle splits into NREM, REM, and Insight phases.
%%{init:{'flowchart':{'htmlLabels':true,'curve':'basis','padding':8}}}%%
flowchart LR
subgraph Trigger["TRIGGER"]
direction TB
T1["Idle 600s"]
T2["50 new memories"]
T3["Manual / Cron"]
end
D{{"Dream Cycle"}}
T1 --> D
T2 --> D
T3 --> D
D --> NREM
D --> REM
D --> INSIGHT
subgraph NREM["PHASE 1 · NREM"]
direction TB
N1["Replay 100 recent memories"] --> N2["Spreading activation"]
N2 --> N3{"Connection
active?"}
N3 -->|Yes| N4["Strengthen +0.05"]
N3 -->|No| N5["Weaken −0.01"]
N3 -->|Dead < 0.05| N6["Prune"]
end
subgraph REM["PHASE 2 · REM"]
direction TB
R1["Find 50 isolated memories"] --> R2["Search similar
unconnected nodes"]
R2 --> R3["Create bridge connections"]
R3 --> R4["weight = similarity × 0.3"]
end
subgraph INSIGHT["PHASE 3 · INSIGHT"]
direction TB
I1["BFS connected components"] --> I2["Identify communities"]
I2 --> I3["Find bridge nodes"]
I3 --> I4["Store dream_insights"]
end
classDef trigger fill:#1a140a,stroke:#fbbf24,stroke-width:1.5px,color:#fde68a;
classDef cycle fill:#1a0e2a,stroke:#a78bfa,stroke-width:2.5px,color:#f0a8ff,font-weight:bold;
classDef nrem fill:#0e1428,stroke:#60a5fa,stroke-width:1.5px,color:#dbeafe;
classDef rem fill:#1a0a18,stroke:#f472b6,stroke-width:1.5px,color:#fbcfe8;
classDef insight fill:#0a1a14,stroke:#34d399,stroke-width:1.5px,color:#a7f3d0;
class T1,T2,T3 trigger;
class D cycle;
class N1,N2,N3,N4,N5,N6 nrem;
class R1,R2,R3,R4 rem;
class I1,I2,I3,I4 insight;
Walk the Maze
Five pages go deeper. The first two document the numbers and the engine. The last three
— the cockpit, the install flow, the four-domain topology — explain what
actually lives behind architect.mazemaker.dev, mazemaker.dev,
and the pod on your machine. Each one reproducible. Each one auditable.
LongMemEval-S 500q retrieval, Comparison Bench 188/200 (94.0%, 0 errors), v2 NO → v8 UNCONDITIONAL YES. Methodology, raw numbers, repro by curl.
Architecture & pod → 6 layers · 1 pod
The six-layer cognition stack: sponge, AFE, ColBERT+DAE embedding, three-phase dream, Stage S synthesis, targeted re-formation. Rootless Podman, HKDF vault, MCP on loopback.
★ NEW · The Architect → 12 monitors · loopback only
The cockpit at architect.mazemaker.dev. Twelve panels, the dream replay, the chrono-scrub timeline, the Hermes skill-indexing pipeline. Hosted UI, local data.
From curl … | bash to a healthy pod. Pre-flight, fingerprint, browser handoff, license JWT, embedding choice, Quadlet, pod boot. Every guarantee, every failure mode.
How mazemaker.online, .dev, api., architect. combine without ever crossing memory data. Selective AES, public-prefix gate, request-flow map.
Pod-to-pod memory propagation over HTTP(S). Per-pair Bearer keys, public-prefix gate, five-minute tick. Tailscale pair, hub-and-spoke team, WWW-scale mesh — same model.
Comparison matrix → 4 projects · 1 harness
Hindsight, Letta, A-MEM, Cognee: same retrieval harness. Verified numbers where we have them (Hindsight 188/200 = 94.0%), QUEUED where we don’t. No fabricated numbers.
Lab notes (blog) → 5 stations · the maze, walked
Five stations from entrance to summit: memory benchmarks should measure memory, bench corpus on Postgres, formation beats retrieval-tuning, inception benchmarking, inside the 100-iteration loop.
Install One Command
One curl-bash. Browser opens itself for email verification & captcha. Comes back, builds
your local pod, registers itself with every AI tool you have. Done in under three minutes.
No sudo, no Docker, no API keys to copy-paste.
curl -fsSL https://api.mazemaker.dev/install.sh | bash
That’s it. The script handles fingerprint init, opens your browser for the onboard wizard, polls for the handoff, signs the install proof, requests your license JWT, builds and starts the four containers, runs the health check, and offers to wire mazemaker into every AI tool it detects on your machine.
install.sh detects your hardware, generates a device fingerprint and an Ed25519 install keypair, then opens your default browser at the onboarding wizard with everything pre-filled.
Verify your email (we send a 6-digit code), pass a Cloudflare Turnstile captcha, pick your tier. The wizard parks the install handoff and tells you to return to the terminal. No JWT to copy. No keys to paste.
install.sh polls for the handoff, signs the install proof locally, fetches the license
JWT, builds four rootless Podman containers, starts the pod, health-checks
http://127.0.0.1:8765/sse, and offers to register mazemaker with every AI
tool it detects (Claude Code, Cursor, VSCode, Cline, Roo, Continue, Goose, Codex, …).
Don’t use one of the auto-detected tools?
Point any MCP-speaking client at http://127.0.0.1:8765/sse. The integration spec at api.mazemaker.dev/integration.md documents native SSE, the mcp-remote stdio bridge, and the streamable-http transport. It is readable by humans and LLM agents alike, so your agent can self-wire.
curl -fsSL https://api.mazemaker.dev/wire.sh | bash
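Before wiring a client by hand, it can help to confirm the local endpoint is answering. The probe below is a sketch under assumptions: it only checks for an HTTP 200 with an event-stream content type, which may differ from what install.sh's own health check verifies.

```python
from urllib.error import URLError
from urllib.request import urlopen


def sse_endpoint_healthy(url="http://127.0.0.1:8765/sse", timeout=3.0):
    """Return True if the endpoint answers 200 with an SSE content type.

    Only the response headers are inspected; the stream itself is not read.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return (response.status == 200
                    and content_type.startswith("text/event-stream"))
    except (URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as unhealthy.
        return False


if __name__ == "__main__":
    print("pod reachable:", sse_endpoint_healthy())
```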
Something broken?
debug.sh runs 36 systematic checks across DNS, license, runtime staging,
container images, Quadlet units, systemd state, and the host-facing endpoint.
Pattern-matches every known failure mode collected during real installs. With --fix
it auto-repairs everything safe to repair.
curl -fsSL https://api.mazemaker.dev/debug.sh | bash
curl -fsSL https://api.mazemaker.dev/debug.sh | bash -s -- --fix
Operator-grade pricing.
Tiered the way you actually deploy: one machine (Builder), your fleet (Pro), your org (Team), your perimeter (Enterprise). Community stays free forever for personal single-agent use. Founder rates lock in for life — the price you sign up at today is the price you pay forever, even when we raise list.
SQLite · CPU · 3-phase dream · CLI + MCP
A real pod, free forever — the same one-line installer, no payment, no build step. Or take the engine itself, AGPLv3 + PolyForm-NC source-available, and run it your way. No ColBERT, no DAE, no Stage S synthesis, no Architect UI.
- Single agent · personal use
- Hybrid recall (R@5 = 0.96)
- 3-phase dream (lightweight)
- SQLite + FastEmbed CPU
- CLI + MCP server
- One-line install · free pod (no build)
- Or build from source · community support
curl … | bash — stays Free until you upgrade. Or build from source.
SQLite or Postgres · 3-phase dream · managed install
The community engine, professionally installed and license-managed. One-line installer, auto-update, email support. The right tier when you’re shipping your own agent and want the brain just-working.
- 1 agent · 100k memories
- Hybrid recall · 3-phase dream
- SQLite or Postgres backend
- One-line install · auto-update
- BYOK or local MLX embeddings
- Email support
- Stripe billing · cancel any time
Founder rate: $9/mo, locked forever. Sign up before launch closes.
Postgres + pgvector · ColBERT @ 1.5 · full-fat dream · Architect · federation
The full Mazemaker. ColBERT @ 1.5 reranking lifts recall to R@5 = 0.98. Dream runs full-fat — DAE-augmented adjacency, Stage S synthesis on top of NREM/REM/Insight. The Architect cockpit lets you walk the labyrinth in 3D. Peer federation stitches your machines into one brain. Unlimited agents, unlimited memories.
- Unlimited agents · unlimited memories
- ColBERT @ 1.5 late-interaction (Pro+)
- Full-fat dream + DAE + Stage S synthesis
- Architect UI — 3D graph cockpit
- Peer federation — one brain across machines
- Postgres + pgvector backend
- Dashboard · usage · telemetry · backups
- Email + chat support
Founder rate: $29/mo, locked forever. Sign up before launch closes.
Everything in Pro + multi-seat + audit log + SSO
For orgs running multiple agents that need to share memory. The mesh becomes a team brain — one agent learns, all of them know.
- Everything in Pro
- 5 seats included (add-ons available)
- Shared memory mesh across team
- RBAC + audit log
- SSO (Google Workspace / Okta)
- Priority email support
Defense · robotics · regulated AI · research
For environments where data cannot leave the perimeter. Self-hosted license server, BYOK-HSM, airgap deploy, cross-site federation, audit log export, custom dream cadence, explainable recall on every call, SLA, dedicated support.
- Everything in Team
- Airgap deploy · offline license
- BYOK + BYOK-HSM
- Self-hosted license server
- Cross-site federation
- Audit log export (SIEM-friendly)
- SLA + dedicated support
All paid tiers ship the full engine; tier-gated features (ColBERT, DAE, Architect, federation, audit) flip on/off via license claims at runtime — same binary, same code path. BYOK embeddings stay on your machine — we never see your provider keys. Memories you wrote stay accessible forever, even if you cancel.
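Runtime gating by license claim can be pictured as a lookup over the decoded token. The claim layout below (a `features` list) is purely hypothetical, and signature verification is deliberately left out of the sketch.

```python
# Feature names taken from the text; the claim structure is an assumption.
TIER_GATED = ("colbert", "dae", "architect", "federation", "audit")


def enabled_features(claims):
    """Map each tier-gated feature to on/off from already-verified license claims."""
    granted = set(claims.get("features", []))
    return {name: name in granted for name in TIER_GATED}


flags = enabled_features({"tier": "pro", "features": ["colbert", "federation"]})
```

The same code path runs for every tier; only the flags differ.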
Common questions.
The objections we hear most. Short answers; pointers to the long ones.
Is curl … | bash safe?
You don’t have to pipe. curl -fsSL https://api.mazemaker.dev/install.sh -o install.sh downloads the script; read it, then run it. The script is short, signed, and reproducible — every release ships with a SHA-256 in the changelog. The same script also ships a debug.sh twin that runs 36 systematic checks; both are linked from the onboarding page.
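Checking a downloaded script against a published SHA-256 takes only the standard library. This is a generic sketch: the sample bytes stand in for install.sh, and the "published" digest is computed locally here because the real changelog value is not reproduced on this page.

```python
import hashlib
import hmac


def matches_published_digest(script_bytes, published_hex):
    """Compare the script's SHA-256 with a published hex digest."""
    actual = hashlib.sha256(script_bytes).hexdigest()
    return hmac.compare_digest(actual, published_hex.strip().lower())


sample = b"#!/bin/sh\necho hello\n"               # stand-in for the downloaded script
published = hashlib.sha256(sample).hexdigest()    # stand-in for the changelog value

assert matches_published_digest(sample, published)
assert not matches_published_digest(sample + b"tampered", published)
```

In practice you would read the saved install.sh from disk in binary mode and paste the digest from the changelog.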
Seven-day grace period. The pod keeps running, keeps reading, keeps writing — the license-client just stops being able to phone home. After the grace window, the pod refuses new writes until the next successful check-in. Memories you already wrote stay accessible forever; nobody gets locked out of their own data. Full detail in the architecture page.
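The grace rule reduces to two small predicates. The seven-day window and the "reads always work" guarantee come from the text; the function names and the timestamp handling are assumptions.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=7)  # from the text: seven-day grace period


def can_write(last_check_in, now):
    """New writes are allowed only while the last check-in is within the grace window."""
    return now - last_check_in <= GRACE


def can_read(last_check_in, now):
    """Existing memories stay readable regardless of license state."""
    return True


last = datetime(2025, 1, 1, tzinfo=timezone.utc)  # arbitrary example timestamp
print(can_write(last, last + timedelta(days=6)))  # inside the window
print(can_write(last, last + timedelta(days=8)))  # outside the window
```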
Yes — the engine is the same; only the rerank / synthesis layers differ. Community uses SQLite WAL; Pro uses Postgres + pgvector. mazemaker dump exports your store; mazemaker restore imports it on the other side. No re-embedding, no re-graph-build. Vendor lock-in is the failure mode we’re explicitly designing against.
No — structurally, not by promise. The license backend records that a tool was called, never what was stored. Memory content stays inside the pod, encrypted at rest by wonderland with a vault key derived from HKDF(JWT, hardware-fingerprint) at runtime — the key never touches disk. See manifesto for the philosophy and privacy for the operational policy.
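For readers unfamiliar with HKDF, here is a standard-library sketch of HKDF-SHA256 (the extract-then-expand construction from RFC 5869). Which input plays the role of key material and which the salt is an assumption here, as are the `info` label and the 32-byte output length; the page only states that the key is derived from the JWT and the hardware fingerprint.

```python
import hashlib
import hmac


def hkdf_sha256(ikm, salt, info=b"", length=32):
    """HKDF with SHA-256: extract a pseudorandom key, then expand to `length` bytes."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()       # expand step
        okm += block
        counter += 1
    return okm[:length]


def derive_vault_key(jwt, fingerprint):
    """Hypothetical mapping: JWT as key material, fingerprint as salt."""
    return hkdf_sha256(jwt.encode(), fingerprint.encode(), info=b"vault-key")


key = derive_vault_key("example.jwt.token", "example-fingerprint")
```

Because the key is recomputed from its inputs at runtime, nothing in this scheme needs to be written to disk.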
The engine is AGPLv3 + PolyForm-NC source-available on GitHub. If the SaaS goes dark, Community + Lite users keep running on the engine they already have; Pro users lose the managed-install + Architect UI, but the underlying pod keeps working off the last-issued license until grace runs out. Anyone can fork. Architecture is the policy — including the exit policy.
8 GB RAM, x86_64 or Apple Silicon, any modern CPU. GPU optional (CUDA or MLX accelerate recall ~3×, never required). Disk: ~500 MB for the pod images + your memory store. gemma3:270m hit 18/20 on the Comparison Bench — that’s a Raspberry Pi-class model. The engine itself is heavier than the model.
Because memory is the most intimate data class a coding agent will ever touch — every preference, every fix, every name, every file path you mentioned three weeks ago. Centralising it is the obvious play for a surveillance business model. We’re explicitly not that. The manifesto is the long version; the topology is the four-domain split that enforces the boundary.
Community is free forever — a one-line pod (or build the open-source engine yourself), SQLite + CPU, single agent; Pro is the same engine plus ColBERT @ 1.5 reranking (R@5 0.96 → 0.98), DAE-augmented dream consolidation, Stage S synthesis, the Architect cockpit, unlimited agents, and the Postgres + pgvector backend that the 100-iteration benchmark loop ran on.
Build the maze.
Your agent finds the way.
Persistent semantic memory, graph reasoning, dream consolidation, and audited benchmark lift for agents that need continuity.