03 · INTERNAL INFRASTRUCTURE · SHIPPED 2024 — present · 6 min read

A private memory layer for one operator running many systems

Running a portfolio of live systems exceeds what a single person can hold in working memory. The memory layer was built to close that gap: a private semantic index that turns years of project files, decisions, and operational notes into context any AI coding session can retrieve in under two seconds.

14 projects indexed
38k embeddings searchable
<2s context recall
3 devices synced
01 · CONTEXT

One operator now runs what used to take a small team. Trading systems, publishing pipelines, outreach bots, dashboards, client work. Each carries its own architecture, credentials, failure modes, and history of decisions.

The limiting factor is not execution. It is recall. Context switching between systems costs minutes per session and compounds into hours per week. The memory layer was built to make that cost approach zero.

02 · WHY BUILD IT

Commercial memory products optimize for teams, not for one person with deep context across unrelated domains. They live in someone else's cloud, see everything the operator writes, and charge per seat for features that do not apply.

A private layer was the correct answer. Full control over what gets indexed, what leaves the machine, and which assistant is allowed to query it. Operational data stays on hardware the operator owns. Nothing crosses a public boundary without a reason.

03 · HOW IT WORKS

A local vector database holds embeddings of every meaningful file across the portfolio — specs, postmortems, configs, session notes, lessons. A small API service handles ingest and query. A standard transport exposes the index to AI coding assistants as a native tool, so retrieval happens inside the editor rather than through copy-paste.
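The production path runs through ChromaDB and a real embedding model, but the shape of the two verbs, ingest and query, can be sketched with a toy in-memory index. Everything below is illustrative: `toy_embed` is a trigram-hashing stand-in for an embedding model, and `MemoryIndex` mimics the small API service's surface, not its implementation.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for a real embedding model: hash character trigrams
    into a fixed-size unit vector. Illustration only."""
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        bucket = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class MemoryIndex:
    """Minimal ingest/query surface mirroring the layer's two verbs."""

    def __init__(self) -> None:
        self.docs: dict[str, list[float]] = {}

    def ingest(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = toy_embed(text)

    def query(self, question: str, n_results: int = 3) -> list[str]:
        q = toy_embed(question)
        scored = sorted(
            self.docs.items(),
            key=lambda kv: -sum(a * b for a, b in zip(q, kv[1])),
        )
        return [doc_id for doc_id, _ in scored[:n_results]]

index = MemoryIndex()
index.ingest("trading/postmortem-01", "exchange rate limiter caused order rejects")
index.ingest("publishing/spec", "pipeline renders markdown to three output targets")
results = index.query("why were orders rejected by the exchange")
```

The doc IDs and file contents above are hypothetical. The real system swaps `toy_embed` for a proper embedding model and `MemoryIndex` for a persistent ChromaDB collection; the retrieval contract stays the same.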

The index is reached over a private mesh network. No public endpoint, no reverse proxy, no authentication theater. If a device is not on the mesh, the service does not exist for it. File sync mirrors the source-of-truth project tree across three machines, so the same working set is present whether the operator is at the desk, on the laptop, or querying from the always-on node.

MEMORY LAYER · STATS PANEL
PROJECTS INDEXED ....... 14
EMBEDDINGS ............. 38,412
DEVICES SYNCED ......... 3
RECALL P95 ............. 1.8s
LAST SYNC .............. 00:02:14
04 · SYSTEM DESIGN

Embeddings refresh nightly. Canary queries run after each rebuild — a small set of questions with known-good answers that must still resolve correctly. If recall drifts, the job fails loudly rather than silently serving stale context. Silent degradation is the failure mode that matters most for memory infrastructure; a stale index is worse than no index.
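A minimal sketch of that gate, under the assumption that the rebuild job can call the index's query function directly. The canary questions, doc IDs, and the `search` callable are all illustrative, not the real set.

```python
# Hypothetical canary gate run after each nightly embedding rebuild.
# `search` stands in for the index's query call; names are illustrative.

CANARIES = {
    # question -> doc id that must still appear in the top results
    "where are the exchange API keys rotated": "trading/runbook",
    "what renders the weekly digest": "publishing/spec",
}

def run_canaries(search, top_k: int = 5) -> None:
    failures = []
    for question, expected_doc in CANARIES.items():
        hits = search(question, top_k)
        if expected_doc not in hits:
            failures.append((question, expected_doc, hits))
    if failures:
        # Fail the whole rebuild job loudly rather than serve stale context.
        raise RuntimeError(f"canary drift detected: {failures}")

# A stub index that still answers correctly passes the gate:
def fake_search(question, top_k):
    return ["trading/runbook", "publishing/spec"][:top_k]

run_canaries(fake_search)  # no exception: recall intact
```

The useful property is that the check is binary and post-rebuild: either every known-good answer still resolves, or the job aborts before the stale index is ever served.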

Ingest is deliberate. Not every file is embedded. Generated artifacts, dependency trees, and transient logs are excluded by policy. What remains is the material a senior engineer would actually consult: decisions, interfaces, constraints, and the reasoning behind them.
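That policy reduces to a predicate evaluated per file before embedding. The globs and suffix list below are hypothetical examples of the stated categories, not the actual rules.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical exclusion policy illustrating the stated categories;
# the real rules live in the ingest service.
EXCLUDE_GLOBS = [
    "*/node_modules/*", "*/.git/*", "*/dist/*",  # generated artifacts, deps
    "*.log", "*.tmp",                            # transient files
]
INCLUDE_SUFFIXES = {".md", ".py", ".toml", ".yaml"}  # material worth embedding

def should_embed(path: str) -> bool:
    p = PurePosixPath(path)
    if any(fnmatch(str(p), g) for g in EXCLUDE_GLOBS):
        return False
    return p.suffix in INCLUDE_SUFFIXES

assert should_embed("trading/docs/postmortem-2024-03.md")
assert not should_embed("trading/node_modules/lib/index.js")
assert not should_embed("dashboards/app.log")
```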

The retrieval path is boring by design. Query in, ranked passages out, under two seconds at the 95th percentile. No agent loop, no re-ranking service, no orchestration layer. Boring is the point. The index is consulted dozens of times a day and has to behave like a filesystem, not a product.

"Memory infrastructure changes what one operator can hold in their head — the ceiling is no longer recall, it is judgment."
— on why the layer exists
05 · RESULTS

The ceiling moved

Fourteen projects are indexed across roughly thirty-eight thousand embeddings, synchronized across three devices, with sub-two-second recall. Sessions start with the relevant history already present. Decisions made six months ago are available in the same keystroke as decisions made this morning.

The practical effect is compounding. Work that previously required re-reading a project to remember how it was wired now begins at the edit. The operator holds more systems in parallel without the quality of any one of them degrading — which is the only honest measure of whether the layer is doing its job.

06 · STACK
ChromaDB · FastAPI · MCP · Tailscale · Syncthing · Python

A local vector index, a small API, a native assistant transport, and a private mesh — assembled so the operator's history is always one query away.


If this maps to a system you need built or fixed — tell me about it.
