Persistent memory for AI agents
auditable by default.

Kumiho gives your AI assistants graph-native memory that survives context window resets, model swaps, and long-running projects.

Get Started Free Read AI Memory Docs

Sound familiar?

The amnesia problem

Your agent forgets everything when the context window resets. Every session starts from scratch.

The hallucinated memory problem

You stuff old conversations into the prompt. The model confuses its own summaries with facts.

The black box problem

The agent "remembers" something, but you can't trace where that memory came from or verify it.

Kumiho solves all three. With a graph.

What Kumiho does

Persistent recall

Memories stored as immutable revisions in a graph. Survive context resets, model swaps, and provider changes.

Typed reasoning edges

DERIVED_FROM, DEPENDS_ON, REFERENCED — every memory knows what it came from and what depends on it.

Dream State consolidation

Async background process that enriches, links, deduplicates, and prunes memories. Like sleep for your agent.

LLM-decoupled

Memory lives in Neo4j + Redis, not inside any model. Switch providers without losing history.

A day in the life

Your AI assistant handles a design review. Kumiho stores the conversation summary, decisions, and references as a memory revision.

Three days later, the user asks "What did we decide about the hero layout?" The agent recalls the exact revision with full provenance.

Dream State runs overnight — links the design decision to the project brief, flags stale references, enriches metadata.

Two months later, a new team member asks the agent for project history. Every decision is traceable, every source is cited.

No prompt stuffing. No hallucination. Just recall.

Integration paths

Playground

Web chat powered by your own LLM API key. Talk to your agent and test memory store & recall in real time.

Install guide

1
Open Playground from the dashboard sidebar.
2
Add your OpenAI, Anthropic, or Gemini API key.
3
Start chatting and watch Kumiho store and recall memory automatically.

# Launch Kumiho Playground
kumiho playground

# Chat with your AI agent
# Bring your own LLM API key
# Test memory store & recall in real time

Setup note

Fastest way to test Kumiho memory before installing a plugin.

Claude Code Plugin

Install the Kumiho marketplace plugin, run /kumiho-auth, and Claude Code gets persistent memory across every coding session.

Install guide

1
Clone the plugins repo locally, add the claude directory as the marketplace source, and install the kumiho-memory plugin once.
2
Mint a Kumiho API token from the dashboard, then run /kumiho-auth inside Claude Code.
3
Start a new session and confirm Kumiho memory tools appear automatically.

# 1. Clone the plugins repo
git clone https://github.com/KumihoIO/kumiho-plugins.git

# 2. Add the local Claude marketplace
claude plugin marketplace add ./kumiho-plugins/claude

# 3. Install the plugin
claude plugin install kumiho-memory@kumiho-claude

# 4. Authenticate inside Claude
/kumiho-auth

Setup note

Claude Code clones GitHub repos at the repo root, so GitHub subfolder URLs do not work here. On first launch the plugin bootstraps its own isolated Python runtime and can auto-load nearby .claude/settings*.json KUMIHO_* values.

Claude Cowork Plugin

Use the same plugin with Cowork, authenticate once, and shared memory follows across desktop sessions with author attribution.

Install guide

1
Clone the plugins repo locally, add the claude directory as the marketplace source, and install the same Kumiho Claude plugin used by Claude Code.
2
Run /kumiho-auth once inside Cowork and paste your Kumiho dashboard token.
3
If Cowork does not reconnect immediately, relaunch the app so the local MCP server picks up the new token.

# 1. Clone the plugins repo
git clone https://github.com/KumihoIO/kumiho-plugins.git

# 2. Add the local Claude marketplace
claude plugin marketplace add ./kumiho-plugins/claude

# 3. Install the same Kumiho Claude plugin
claude plugin install kumiho-memory@kumiho-claude

# 4. Authenticate once inside Cowork
/kumiho-auth

Setup note

GitHub subfolder URLs are not supported for marketplace add, so use the local claude path after cloning the repo. Cowork can read auth from the local Kumiho credential cache or the plugin .env.local, and /kumiho-auth updates both when possible.

OpenClaw Plugin

Install the npm plugin, run kumiho-setup, and let the wizard wire Python, auth, and openclaw.json for you.

Install guide

1
Install the npm plugin, then run kumiho-setup to find Python 3.9+, create ~/.kumiho/venv, and install kumiho[mcp] + kumiho-memory[all].
2
Let the setup wizard authenticate with Kumiho, write ~/.kumiho/preferences.json, and merge the plugin block into ~/.openclaw/openclaw.json.
3
Restart OpenClaw if needed and verify with openclaw kumiho stats or /memory stats inside chat.

# 1. Install the Kumiho plugin for OpenClaw
openclaw plugins install @kumiho/openclaw-kumiho

# 2. Run the guided setup
npx --package=@kumiho/openclaw-kumiho kumiho-setup

# 3. Verify memory is connected
openclaw kumiho stats

Setup note

Local mode is the recommended default. Cloud mode is also supported if you prefer direct HTTPS with an API key.

How it works

Memory flows through a lifecycle — each stage adds structure and durability.

Ingest

Store

Recall

Revise

Consolidate

Kumiho vs. DIY memory stacks

An honest comparison

What you need	RAG + Vector DB	Kumiho AI Cognitive Memory
Store conversations	You build chunking + embedding	memory_store() — one call
Recall by meaning	Vector similarity (no structure)	Hybrid: fulltext + graph + vector
Trace provenance	Not available	Every memory has typed edges
Handle contradictions	Hope the model figures it out	Dream State detects + resolves
Survive model swaps	Re-embed everything	LLM-decoupled by design
Audit trail	Build your own	Immutable revisions + timestamps

Common questions

What is Dream State?

An async consolidation process that reviews, enriches, links, and prunes memories. Think of it as offline maintenance for your agent's long-term memory.

Does this replace my vector database?

Yes. Kumiho includes its own vector infrastructure as part of the hybrid retrieval system (fulltext + graph + vector). You don't need to run a separate vector database — memory search, similarity matching, and semantic recall all run on Kumiho's built-in infrastructure.

Which AI platforms are supported?

Any platform that supports MCP tools, REST APIs, or Python. First-class plugins for Claude Code, Cowork, and OpenClaw. Plus Kumiho Playground for local testing.

How much memory can I store?

Depends on your tier. Memory is included at every tier. Free gets 1k nodes, Creator 30k, Studio 500k. Add Memory Standard ($29/mo) to boost Free limits.

Stop rebuilding memory stacks. Ship agent recall with provenance.

Get Started Free See Pricing

Persistent memory for AI agentsauditable by default.

Sound familiar?

The amnesia problem

The hallucinated memory problem

The black box problem

What Kumiho does

Persistent recall

Typed reasoning edges

Dream State consolidation

LLM-decoupled

A day in the life

Integration paths

Playground

Claude Code Plugin

Claude Cowork Plugin

OpenClaw Plugin

How it works

Kumiho vs. DIY memory stacks

Common questions

Stop rebuilding memory stacks. Ship agent recall with provenance.

Persistent memory for AI agents
auditable by default.