Persistent memory for AI agents
Store and retrieve embedding-backed context so your agents remember what matters—without stuffing the whole world into every request.
Retrieval
→ embed query
→ topK: 8 · cache: primary
→ merged context: 4.2k tokens
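The retrieval steps above can be sketched end to end. This is a toy, self-contained illustration, not the ContextCache API: the bag-of-words embedding, word-count token estimate, and all function names are stand-ins for a real embedding model and tokenizer.

```python
from math import sqrt

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy bag-of-words embedding; a real deployment calls an embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = sum(ord(ch) for ch in token) % dim  # stable toy hash
        vec[bucket] += 1.0
    norm = sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query: str, chunks: list[str], k: int = 8) -> list[str]:
    """Rank chunks by cosine similarity to the query; keep the best k."""
    q = embed(query)
    return sorted(
        chunks,
        key=lambda c: -sum(a * b for a, b in zip(q, embed(c))),
    )[:k]

def merge_context(chunks: list[str], token_budget: int = 4200) -> str:
    """Concatenate chunks in rank order until a rough token budget is reached."""
    merged, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())  # crude token estimate
        if used + cost > token_budget:
            break
        merged.append(chunk)
        used += cost
    return "\n\n".join(merged)

chunks = [
    "database migration notes for the billing schema",
    "deploy steps for the api service",
    "api service rollback procedure",
]
context = merge_context(top_k("how do I deploy the api service", chunks, k=2))
```

The budget cap in `merge_context` is what keeps each request small: retrieval can rank everything, but only what fits the budget reaches the model.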
→ ready for model
Built for agents that remember
Everything you need to keep prompts sharp and retrieval predictable.
Structured caches
Organize chunks and links so context stays scoped to the task.
Vector search
Embedding-backed retrieval tuned for your workflows.
Hooks & jobs
Automate ingest and updates without manual copy-paste.
Model-ready
Ship context to chat and tools in a consistent shape.
Privacy-friendly, predictable context
You control what lands in a cache. Search narrows to what matters, so each request stays small and auditable—ideal for assistants that need memory without leaking the whole repo.
- Namespaced chunks and metadata you can reason about
- Server-side retrieval with your Neon + pgvector stack
- Dashboard for caches, chat, and automation in one place
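With pgvector, server-side retrieval can be a single SQL query. A minimal sketch, assuming a hypothetical `chunks` table with `cache_id`, `embedding`, and `body` columns (this is not the ContextCache schema):

```python
# pgvector's `<=>` operator is cosine distance; `<->` (L2) and `<#>`
# (negative inner product) are the alternatives.
TOP_K_SQL = """
SELECT body,
       embedding <=> %(query_vec)s AS distance
FROM chunks
WHERE cache_id = %(cache_id)s      -- namespacing keeps retrieval scoped to one cache
ORDER BY embedding <=> %(query_vec)s
LIMIT %(k)s;
"""

def search(conn, cache_id: str, query_vec, k: int = 8):
    """Run the top-k query; `conn` is a psycopg connection to your Neon database,
    with the pgvector adapter registered so `query_vec` binds as a vector."""
    with conn.cursor() as cur:
        cur.execute(TOP_K_SQL, {"query_vec": query_vec, "cache_id": cache_id, "k": k})
        return cur.fetchall()
```

Because the `WHERE cache_id` filter runs before ranking, a query can never surface chunks from another cache, which is what makes each request auditable.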
Illustrative metrics — your data stays in your database.
Memory in the tools you already use
Connect a cache through MCP and your APIs so the same context shows up whether you are in Claude Desktop, Claude Code, Visual Studio Code, Cursor, ChatGPT, or another assistant that supports custom tools.
- Claude Desktop
- Claude Code
- Visual Studio Code
- Cursor
- ChatGPT
Connect tools via Model Context Protocol. When enabled, Claude can call your ContextCache server during this conversation.
ContextCache server: https://mcp.contextcache…/sse
Tools: search · retrieve · list_caches
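How a server is registered varies by client. As one illustration, stdio-based clients often reach remote SSE servers through the `mcp-remote` bridge; the server name and placeholder URL below are examples, so substitute your own endpoint and check your client's documentation for native SSE support:

```json
{
  "mcpServers": {
    "contextcache": {
      "command": "npx",
      "args": ["mcp-remote", "https://YOUR-CONTEXTCACHE-ENDPOINT/sse"]
    }
  }
}
```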
Ship context that survives the session.
Sign in, create a cache, and wire it into your assistant flow.