
Memory System

N.E.K.O.'s memory system provides persistent context across sessions, enabling characters to remember past conversations, user preferences, and evolving relationships.

Storage layers

| Layer | Storage | Retention | Access pattern |
| --- | --- | --- | --- |
| Recent memory | JSON files (recent_*.json) | Sliding window | Direct read, per-character |
| Time-indexed original | SQLite (time_indexed_original) | Permanent | Time range queries |
| Time-indexed compressed | SQLite (time_indexed_compressed) | Permanent | Time range queries |
| Semantic memory | Vector embeddings (text-embedding-v4) | Permanent | Similarity search |
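
The layers above map to ordinary on-disk structures. As a rough illustration, the JSON and SQLite layers can be read as shown below; the directory layout, table columns, and file naming here are assumptions for the sketch, not the project's actual schema.

```python
import json
import sqlite3
from pathlib import Path

# Hypothetical location of the memory store; adjust to your deployment.
MEMORY_DIR = Path("memory")


def load_recent_memory(character: str) -> list[dict]:
    """Read the sliding-window JSON layer for one character."""
    path = MEMORY_DIR / f"recent_{character}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def query_time_indexed(db_path: str, start_ts: float, end_ts: float) -> list[tuple]:
    """Time-range query against one of the SQLite layers (original or compressed)."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT timestamp, content FROM time_indexed_original "
            "WHERE timestamp BETWEEN ? AND ? ORDER BY timestamp",
            (start_ts, end_ts),
        ).fetchall()
```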

How memory flows into conversations

  1. When a new session starts, the system loads recent memory (last N messages) as immediate context.
  2. A semantic search retrieves relevant past conversations based on the current topic.
  3. A time-indexed query provides chronological context for temporal references ("yesterday", "last week").
  4. All retrieved memory is injected into the LLM system prompt as context (see the sketch after this list).
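
A minimal sketch of step 4, assuming the three retrieval results have already been fetched by steps 1-3; the section labels and prompt layout are illustrative only, not the project's actual prompt template.

```python
def build_memory_context(
    recent: list[dict],         # sliding-window messages: {"role": ..., "content": ...}
    related: list[str],         # summaries returned by the similarity search
    chronological: list[str],   # text rows returned by the time-range query
) -> str:
    """Join the three retrieval results into one context block for the system prompt."""
    lines = ["## Recent conversation"]
    lines += [f"{m['role']}: {m['content']}" for m in recent]
    lines += ["", "## Related past conversations", *related]
    lines += ["", "## Chronological context", *chronological]
    return "\n".join(lines)


# The result is appended to the character's persona prompt, roughly:
# system_prompt = persona_text + "\n\n" + build_memory_context(recent, related, rows)
```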

Compression pipeline

Old conversations are periodically compressed to save context window space:

Raw conversation ──> Summary model (qwen-plus) ──> Compressed summary
                                                           │
                                                           ▼
                                           Stored in time_indexed_compressed

The ROUTER_MODEL (default: qwen-plus) decides which memories are relevant enough to keep in full and which to compress.
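
A hedged sketch of the compression step, assuming qwen-plus is reached through an OpenAI-compatible endpoint; the client setup, prompt wording, and storage comment are placeholders rather than the project's actual pipeline code.

```python
from openai import OpenAI

# Assumes an OpenAI-compatible endpoint for qwen-plus (e.g. DashScope's
# compatible mode); the real client wiring in N.E.K.O. may differ.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)


def compress_conversation(raw_messages: list[dict]) -> str:
    """Summarize an old conversation into a compact memory entry."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in raw_messages)
    resp = client.chat.completions.create(
        model="qwen-plus",
        messages=[
            {
                "role": "system",
                "content": "Summarize this conversation, keeping facts about "
                           "the user and any commitments the character made.",
            },
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content


# The compressed summary would then be written to the
# time_indexed_compressed table alongside its original time range.
```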

Memory review

Users can browse and correct stored memories at http://localhost:48911/memory_browser. This helps address:

  • Model hallucinations stored as "memories"
  • Incorrect facts the character has internalized
  • Repetitive patterns in conversation summaries

API endpoints

See the Memory REST API for the full endpoint reference.

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/memory/recent_files | GET | List all memory files |
| /api/memory/recent_file | GET | Get specific memory file content |
| /api/memory/recent_file/save | POST | Save updated memory |
| /api/memory/update_catgirl_name | POST | Rename character across memories |
| /api/memory/review_config | GET/POST | Memory review settings |
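
For example, a small review script might list the memory files, fetch one, and save a corrected version back. The parameter names (filename, content) and response shapes below are assumptions; consult the Memory REST API reference for the exact request formats.

```python
import requests

BASE = "http://localhost:48911"

# List available memory files (response shape assumed to be a JSON list).
files = requests.get(f"{BASE}/api/memory/recent_files").json()

# Fetch the contents of the first file.
first = files[0]
memory = requests.get(
    f"{BASE}/api/memory/recent_file", params={"filename": first}
).json()

# ... correct a hallucinated fact or repetitive summary in `memory` here ...

# Save the edited memory back.
requests.post(
    f"{BASE}/api/memory/recent_file/save",
    json={"filename": first, "content": memory},
).raise_for_status()
```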
