Persistent memory for AI agents, auditable by default.
Kumiho gives your AI assistants graph-native memory that survives context window resets, model swaps, and long-running projects.
Sound familiar?
The amnesia problem
Your agent forgets everything when the context window resets. Every session starts from scratch.
The hallucinated memory problem
You stuff old conversations into the prompt. The model confuses its own summaries with facts.
The black box problem
The agent "remembers" something, but you can't trace where that memory came from or verify it.
Kumiho solves all three. With a graph.
What Kumiho does
Persistent recall
Memories stored as immutable revisions in a graph. Survive context resets, model swaps, and provider changes.
Typed reasoning edges
DERIVED_FROM, DEPENDS_ON, REFERENCED — every memory knows what it came from and what depends on it.
Dream State consolidation
Async background process that enriches, links, deduplicates, and prunes memories. Like sleep for your agent.
LLM-decoupled
Memory lives in Neo4j + Redis, not inside any model. Switch providers without losing history.
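To make the revision and edge model concrete, here is a minimal Python sketch. It is not Kumiho's API: `MemoryGraph`, `Revision`, `store`, `link`, and `provenance` are hypothetical names, and plain in-memory lists stand in for the Neo4j graph. Only the three edge types come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Edge types named on this page; every other name in this sketch is hypothetical.
EDGE_TYPES = {"DERIVED_FROM", "DEPENDS_ON", "REFERENCED"}


@dataclass(frozen=True)
class Revision:
    """One immutable memory revision (frozen: fields cannot be reassigned)."""
    memory_id: str
    number: int
    content: str
    created_at: datetime


@dataclass
class MemoryGraph:
    """In-memory stand-in for a graph store (assumption, not Kumiho's schema)."""
    revisions: Dict[str, List[Revision]] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)  # (source, type, target)

    def store(self, memory_id: str, content: str) -> Revision:
        """Append a new revision; earlier revisions are never modified."""
        history = self.revisions.setdefault(memory_id, [])
        rev = Revision(memory_id, len(history) + 1, content, datetime.now(timezone.utc))
        history.append(rev)
        return rev

    def link(self, source: str, edge_type: str, target: str) -> None:
        """Record a typed edge from one memory to another."""
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.edges.append((source, edge_type, target))

    def provenance(self, memory_id: str) -> List[str]:
        """Follow DERIVED_FROM edges breadth-first back to the original sources."""
        trail, frontier, seen = [], [memory_id], {memory_id}
        while frontier:
            current = frontier.pop(0)
            for source, edge_type, target in self.edges:
                if source == current and edge_type == "DERIVED_FROM" and target not in seen:
                    seen.add(target)
                    trail.append(target)
                    frontier.append(target)
        return trail


graph = MemoryGraph()
graph.store("design-review", "Raw notes from the design review")
graph.store("hero-decision", "Hero layout: full-width image")
graph.store("hero-decision", "Hero layout: split image and copy")  # new revision, old one kept
graph.link("hero-decision", "DERIVED_FROM", "design-review")
graph.link("hero-decision", "REFERENCED", "project-brief")

print([r.number for r in graph.revisions["hero-decision"]])
print(graph.provenance("hero-decision"))
```

Updating a memory appends a revision instead of overwriting it, so the audit trail is a side effect of normal writes. Provenance is a plain edge traversal and needs no model call.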
A day in the life
Your AI assistant handles a design review. Kumiho stores the conversation summary, decisions, and references as a memory revision.
Three days later, the user asks "What did we decide about the hero layout?" The agent recalls the exact revision with full provenance.
Dream State runs overnight: it links the design decision to the project brief, flags stale references, and enriches metadata.
Two months later, a new team member asks the agent for project history. Every decision is traceable, every source is cited.
No prompt stuffing. No hallucination. Just recall.
Integration paths
Playground
Web chat powered by your own LLM API key. Talk to your agent and test memory store & recall in real time.
Install guide
1. Open Playground from the dashboard sidebar.
2. Add your OpenAI, Anthropic, or Gemini API key.
3. Start chatting and watch Kumiho store and recall memory automatically.
# Launch Kumiho Playground
kumiho playground
# Chat with your AI agent
# Bring your own LLM API key
# Test memory store & recall in real time
Setup note
Fastest way to test Kumiho memory before installing a plugin.
Claude Code Plugin
Install the Kumiho marketplace plugin, run /kumiho-auth, and Claude Code gets persistent memory across every coding session.
Install guide
1. Clone the plugins repo locally, add the claude directory as the marketplace source, and install the kumiho-memory plugin once.
2. Mint a Kumiho API token from the dashboard, then run /kumiho-auth inside Claude Code.
3. Start a new session and confirm Kumiho memory tools appear automatically.
# 1. Clone the plugins repo
git clone https://github.com/KumihoIO/kumiho-plugins.git
# 2. Add the local Claude marketplace
claude plugin marketplace add ./kumiho-plugins/claude
# 3. Install the plugin
claude plugin install kumiho-memory@kumiho-claude
# 4. Authenticate inside Claude
/kumiho-auth
Setup note
Claude Code clones GitHub repos at the repo root, so GitHub subfolder URLs do not work here. On first launch the plugin bootstraps its own isolated Python runtime and can auto-load nearby .claude/settings*.json KUMIHO_* values.
Claude Cowork Plugin
Use the same plugin with Cowork, authenticate once, and shared memory follows across desktop sessions with author attribution.
Install guide
1. Clone the plugins repo locally, add the claude directory as the marketplace source, and install the same Kumiho Claude plugin used by Claude Code.
2. Run /kumiho-auth once inside Cowork and paste your Kumiho dashboard token.
3. If Cowork does not reconnect immediately, relaunch the app so the local MCP server picks up the new token.
# 1. Clone the plugins repo
git clone https://github.com/KumihoIO/kumiho-plugins.git
# 2. Add the local Claude marketplace
claude plugin marketplace add ./kumiho-plugins/claude
# 3. Install the same Kumiho Claude plugin
claude plugin install kumiho-memory@kumiho-claude
# 4. Authenticate once inside Cowork
/kumiho-auth
Setup note
GitHub subfolder URLs are not supported for marketplace add, so use the local claude path after cloning the repo. Cowork can read auth from the local Kumiho credential cache or the plugin .env.local, and /kumiho-auth updates both when possible.
OpenClaw Plugin
Install the npm plugin, run kumiho-setup, and let the wizard wire Python, auth, and openclaw.json for you.
Install guide
1. Install the npm plugin, then run kumiho-setup to find Python 3.9+, create ~/.kumiho/venv, and install kumiho[mcp] + kumiho-memory[all].
2. Let the setup wizard authenticate with Kumiho, write ~/.kumiho/preferences.json, and merge the plugin block into ~/.openclaw/openclaw.json.
3. Restart OpenClaw if needed and verify with openclaw kumiho stats or /memory stats inside chat.
# 1. Install the Kumiho plugin for OpenClaw
openclaw plugins install @kumiho/openclaw-kumiho
# 2. Run the guided setup
npx --package=@kumiho/openclaw-kumiho kumiho-setup
# 3. Verify memory is connected
openclaw kumiho stats
Setup note
Local mode is the recommended default. Cloud mode is also supported if you prefer direct HTTPS with an API key.
How it works
Memory flows through a lifecycle — each stage adds structure and durability.
Kumiho vs. DIY memory stacks
An honest comparison
| What you need | RAG + Vector DB | Kumiho AI Cognitive Memory |
|---|---|---|
| Store conversations | You build chunking + embedding | memory_store() — one call |
| Recall by meaning | Vector similarity (no structure) | Hybrid: fulltext + graph + vector |
| Trace provenance | Not available | Every memory has typed edges |
| Handle contradictions | Hope the model figures it out | Dream State detects + resolves |
| Survive model swaps | Re-embed everything | LLM-decoupled by design |
| Audit trail | Build your own | Immutable revisions + timestamps |
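
The "Hybrid: fulltext + graph + vector" row can be illustrated with a small ranking sketch. This page does not describe how Kumiho actually combines the three signals, so everything below is an assumption for illustration: the function names, the weights, the toy two-dimensional vectors, and the use of a simple "linked to the current context" flag as the graph signal.

```python
import math
from typing import Dict, List, Sequence, Set


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms found in the text (a stand-in for fulltext search)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(
    query: str,
    query_vec: Sequence[float],
    memories: Dict[str, dict],
    linked_ids: Set[str],
    weights: Sequence[float] = (0.4, 0.3, 0.3),  # hypothetical weights
) -> List[str]:
    """Rank memory ids by a weighted sum of text, graph, and vector signals."""
    w_text, w_graph, w_vec = weights
    scored = []
    for memory_id, mem in memories.items():
        score = (
            w_text * keyword_score(query, mem["text"])
            + w_graph * (1.0 if memory_id in linked_ids else 0.0)
            + w_vec * cosine(query_vec, mem["vector"])
        )
        scored.append((score, memory_id))
    return [memory_id for _, memory_id in sorted(scored, reverse=True)]


memories = {
    "hero-decision": {"text": "hero layout decision split image and copy", "vector": [0.9, 0.1]},
    "color-notes": {"text": "palette notes for the hero banner", "vector": [0.8, 0.2]},
    "standup": {"text": "daily standup summary", "vector": [0.1, 0.9]},
}

# linked_ids: memories connected by an edge to the current project context (assumed input).
ranked = hybrid_rank("hero layout", [1.0, 0.0], memories, linked_ids={"hero-decision"})
print(ranked)
```

A pure vector search would score "hero-decision" and "color-notes" almost the same here. The keyword and graph signals are what separate the actual decision from a loosely related note.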