Persistent memory for AI agents
Store and retrieve embedding-backed context so your agents remember what matters—without stuffing the whole world into every request.
Retrieval
→ embed query
→ topK: 8 · cache: primary
→ merged context: 4.2k tokens
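The retrieval steps above can be sketched end to end. This is a toy, self-contained illustration, not the ContextCache API: the bag-of-words embedding, word-count token estimate, and all function names are stand-ins for a real embedding model and tokenizer.

```python
from math import sqrt

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy bag-of-words embedding; a real deployment calls an embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = sum(ord(ch) for ch in token) % dim  # stable toy hash
        vec[bucket] += 1.0
    norm = sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query: str, chunks: list[str], k: int = 8) -> list[str]:
    """Rank chunks by cosine similarity to the query; keep the best k."""
    q = embed(query)
    return sorted(
        chunks,
        key=lambda c: -sum(a * b for a, b in zip(q, embed(c))),
    )[:k]

def merge_context(chunks: list[str], token_budget: int = 4200) -> str:
    """Concatenate chunks in rank order until a rough token budget is reached."""
    merged, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())  # crude token estimate
        if used + cost > token_budget:
            break
        merged.append(chunk)
        used += cost
    return "\n\n".join(merged)

chunks = [
    "database migration notes for the billing schema",
    "deploy steps for the api service",
    "api service rollback procedure",
]
context = merge_context(top_k("how do I deploy the api service", chunks, k=2))
```

The budget cap in `merge_context` is what keeps each request small: retrieval can rank everything, but only what fits the budget reaches the model.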
→ ready for model
Built for agents that remember
Everything you need to keep prompts sharp and retrieval predictable.
Structured caches
Organize chunks and links so context stays scoped to the task.
Vector search
Embedding-backed retrieval tuned for your workflows.
Hooks & jobs
Automate ingest and updates without manual copy-paste.
Model-ready
Ship context to chat and tools in a consistent shape.
Privacy-friendly, predictable context
You control what lands in a cache. Search narrows to what matters, so each request stays small and auditable—ideal for assistants that need memory without leaking the whole repo.
- Namespaced chunks and metadata you can reason about
- Server-side retrieval with your Neon + pgvector stack
- Dashboard for caches, chat, and automation in one place
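With pgvector, server-side retrieval can be a single SQL query. A minimal sketch, assuming a hypothetical `chunks` table with `cache_id`, `embedding`, and `body` columns (this is not the ContextCache schema):

```python
# pgvector's `<=>` operator is cosine distance; `<->` (L2) and `<#>`
# (negative inner product) are the alternatives.
TOP_K_SQL = """
SELECT body,
       embedding <=> %(query_vec)s AS distance
FROM chunks
WHERE cache_id = %(cache_id)s      -- namespacing keeps retrieval scoped to one cache
ORDER BY embedding <=> %(query_vec)s
LIMIT %(k)s;
"""

def search(conn, cache_id: str, query_vec, k: int = 8):
    """Run the top-k query; `conn` is a psycopg connection to your Neon database,
    with the pgvector adapter registered so `query_vec` binds as a vector."""
    with conn.cursor() as cur:
        cur.execute(TOP_K_SQL, {"query_vec": query_vec, "cache_id": cache_id, "k": k})
        return cur.fetchall()
```

Because the `WHERE cache_id` filter runs before ranking, a query can never surface chunks from another cache, which is what makes each request auditable.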
Illustrative metrics — your data stays in your database.
Memory in the tools you already use
Connect a cache through MCP and your APIs so the same context shows up whether you are in Claude Desktop, Claude Code, Visual Studio Code, Cursor, ChatGPT, or another assistant that supports custom tools.
- Claude Desktop
- Claude Code
- Visual Studio Code
- Cursor
- ChatGPT
Connect tools via Model Context Protocol. When enabled, Claude can call your ContextCache server during this conversation.
ContextCache server: https://mcp.contextcache…/sse
Tools: search · retrieve · list_caches
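How a server is registered varies by client. As one illustration, stdio-based clients often reach remote SSE servers through the `mcp-remote` bridge; the server name and placeholder URL below are examples, so substitute your own endpoint and check your client's documentation for native SSE support:

```json
{
  "mcpServers": {
    "contextcache": {
      "command": "npx",
      "args": ["mcp-remote", "https://YOUR-CONTEXTCACHE-ENDPOINT/sse"]
    }
  }
}
```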
Ship context that survives the session.
Sign in, create a cache, and wire it into your assistant flow.