Core Concepts
Memory Lifecycle
Neuroloom, the lifecycle memory engine for coding agents, tracks knowledge from its first capture through every session that uses it — extracting structured memories, evolving their confidence as the codebase changes, and surfacing the right context when it matters.
A memory moves from raw agent observation through extraction, embedding, relationship discovery, and importance scoring before it becomes searchable. After creation, every search retrieval and explicit rating adjusts the importance score. Memories whose importance falls below a threshold and that have not been accessed in 90 days become pruning candidates.
Understanding the lifecycle tells you when to expect a memory to be searchable after a session ends, how importance scores accumulate, and what controls whether a memory survives long-term.
Full Lifecycle
Stage 1: Observation and Context Injection
When you run a coding agent with Neuroloom connected, the MCP server captures observations during the session — tool calls, code read/write events, commands executed, and agent reasoning about what it just did. These are raw events, not memories.
Observations accumulate during the session under its source_session_id. They are not individually stored as memories; that happens at extraction time.
At the same time, before the agent reads a file, Neuroloom injects relevant memories from past sessions as context. Each injected memory is wrapped in a provenance marker:
```
[Neuroloom — retrieved for api/routers/memories.py]
...memory content...
[/Neuroloom]
```

The marker format identifies which file triggered retrieval. This gives downstream LLMs a lineage signal: they can attribute the injected knowledge to a specific file context rather than treating it as undifferentiated background information.
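If you post-process injected context yourself, the markers are straightforward to recover with a pattern match. Below is a minimal sketch; the parsing helper is illustrative and not part of Neuroloom:

```python
import re

# Matches "[Neuroloom — retrieved for <path>] ... [/Neuroloom]" blocks in injected context.
MARKER_RE = re.compile(
    r"\[Neuroloom — retrieved for (?P<path>[^\]]+)\]\n(?P<body>.*?)\n\[/Neuroloom\]",
    re.DOTALL,
)

def extract_injected_memories(context: str) -> list[tuple[str, str]]:
    """Return (source_path, memory_content) pairs from a provenance-marked context string."""
    return [(m["path"], m["body"].strip()) for m in MARKER_RE.finditer(context)]
```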
If you are using the Claude Code plugin, observation capture and context injection are automatic. If you are connecting directly via MCP, your agent calls session_record_observation for each event it wants to surface for extraction.
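If you are wiring up your own client, recording an observation is a single MCP tool call. The sketch below uses the official MCP Python SDK; the server launch command (neuroloom-mcp) and the argument fields (kind, detail, files) are illustrative assumptions, not a documented schema:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the Neuroloom MCP server runs as a local stdio process;
# the command name and the observation payload fields are illustrative.
server = StdioServerParameters(command="neuroloom-mcp", args=[])

async def record_observation() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            await session.call_tool(
                "session_record_observation",
                arguments={
                    "kind": "command_executed",           # e.g. tool call, file write, command
                    "detail": "pytest api/tests -q",      # what the agent just did
                    "files": ["api/routers/memories.py"], # files touched, if any
                },
            )

asyncio.run(record_observation())
```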
Stage 2: Session End and Batch Extraction
When the session ends — the plugin closes it, or you call session_end via MCP — the extraction pipeline runs over the session's observations.
An LLM pass reads the full observation log and extracts meaningful units of knowledge: decisions made, patterns applied, bugs fixed, conventions followed. Each extracted unit becomes a candidate memory with:
- A draft title and narrative
- An inferred memory_type from the 9-value StrEnum
- Extracted concepts from the narrative
- source_files derived from file-access events in the observation log
- An initial confidence_score
The extraction step is why Neuroloom memories are richer than the raw observations — the LLM synthesizes the "so what" from a sequence of tool calls into a statement of knowledge.
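As a rough picture of what a candidate memory carries at this point, here is an illustrative dataclass. The field names follow the list above, but the exact schema and the nine memory_type values are defined by the API, not by this snippet:

```python
from dataclasses import dataclass
from enum import StrEnum

class MemoryType(StrEnum):
    # Illustrative subset — the real 9-value StrEnum is defined by the API.
    DECISION = "decision"
    PATTERN = "pattern"
    BUG_FIX = "bug_fix"

@dataclass
class CandidateMemory:
    title: str               # draft title from the LLM extraction pass
    narrative: str           # draft narrative synthesizing the observations
    memory_type: MemoryType  # inferred type
    concepts: list[str]      # extracted from the narrative
    source_files: list[str]  # derived from file-access events in the log
    confidence_score: float  # initial confidence assigned at extraction
```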
Stage 3: Embedding Generation
After a memory record is created, Neuroloom computes a 1024-dimension vector embedding from the concatenated title and narrative. This embedding is stored in the embedding field and drives all semantic search operations.
Embedding generation is synchronous with memory creation — by the time the write returns a memory_id, the embedding exists and the memory is semantically searchable.
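Because the vector is built from the concatenated title and narrative, a query embedding can be compared against it with standard cosine similarity. A minimal sketch; the embed() helper is a placeholder for whatever model produces the 1024-dimension vectors, not a Neuroloom API:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1024-dimension embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage — embed() stands in for the model that fills the `embedding` field:
# memory_vec = embed(memory.title + "\n" + memory.narrative)
# query_vec  = embed("how do we paginate the memories endpoint?")
# score      = cosine_similarity(memory_vec, query_vec)
```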
Stage 4: Relationship Discovery
After embedding, the relationship discovery pipeline runs eight heuristics against the existing memory store to find candidate edges. This step is asynchronous and typically completes within seconds.
The eight heuristics — file overlap, symbol overlap, concept overlap, semantic similarity, temporal proximity, session context, LLM extraction, and manual — are described in full in Relationships and Graph. Each discovered edge is assigned a relationship_type and recorded with its discovery_method.
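As an intuition for one of these heuristics, the sketch below checks file overlap between two memories using Jaccard similarity on their source_files. The threshold and the edge payload are assumptions rather than Neuroloom's actual rules:

```python
def file_overlap_edge(a_files: set[str], b_files: set[str], threshold: float = 0.5) -> dict | None:
    """Propose a candidate edge when two memories share enough source files."""
    if not a_files or not b_files:
        return None
    overlap = len(a_files & b_files) / len(a_files | b_files)  # Jaccard similarity
    if overlap < threshold:
        return None
    return {
        "relationship_type": "related_to",   # illustrative value
        "discovery_method": "file_overlap",
        "score": overlap,
    }
```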
Stage 5: PageRank Scoring
PageRank runs as a daily cron job at 01:00 UTC across the entire workspace graph. It computes a structural centrality score for each memory based on how many other memories link to it and the weights of those links.
Exponential smoothing formula:
```
pagerank_score = (0.7 × new_score) + (0.3 × previous_score)
```

The 70/30 split dampens volatility: a memory that gains links overnight moves toward the new score without a sudden jump. The smoothed pagerank_score is then fed into the importance scoring formula as the seventh weighted factor.
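In code, the smoothing step is a one-liner that restates the formula above:

```python
def smooth_pagerank(new_score: float, previous_score: float) -> float:
    """Exponential smoothing used for the nightly PageRank update (70% new, 30% previous)."""
    return 0.7 * new_score + 0.3 * previous_score
```

For example, smooth_pagerank(0.8, 0.2) returns 0.62 rather than jumping straight to 0.8.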
Importance Scoring Formula
The importance_score is a weighted combination of seven factors:
| Factor | Signal |
|---|---|
| 1. Recency | How recently the memory was created |
| 2. Access frequency | access_count across all retrievals |
| 3. Retrieval frequency | retrieval_count in search results |
| 4. Confidence score | Reliability of the memory content |
| 5. Last accessed | Recency of the most recent retrieval |
| 6. Explicit ratings | Thumbs up/down signals from the agent or developer |
| 7. PageRank score | Structural centrality in the relationship graph |
A memory with high PageRank is connected to many other memories — it represents knowledge that the rest of the workspace refers to. This structural signal lifts its importance score even if it has not been accessed recently.
PageRank updates happen once per day. A memory created today gets its initial PageRank score the following morning at 01:00 UTC. Until then, its pagerank_score is 0.0 and importance scoring uses the other six factors.
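To make the structure of the calculation concrete, here is a rough sketch of a weighted combination over the seven factors. The specific weights and the assumption that each factor is normalized to [0, 1] are illustrative; Neuroloom's actual weights are not documented here:

```python
# Illustrative weights only — the real values are internal to Neuroloom.
WEIGHTS = {
    "recency": 0.20,
    "access_frequency": 0.15,
    "retrieval_frequency": 0.15,
    "confidence_score": 0.15,
    "last_accessed": 0.10,
    "explicit_ratings": 0.10,
    "pagerank_score": 0.15,
}

def importance_score(factors: dict[str, float]) -> float:
    """Weighted combination of the seven factors, each assumed normalized to [0, 1]."""
    return sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)
```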
Stage 6: Searchable Memory
After embedding generation, the memory is immediately available in semantic search. After relationship discovery and PageRank scoring, it also appears in graph retrieval results with its full edge context.
The importance_score affects result ranking — higher-importance memories rank above lower-importance memories when embedding similarity scores are close.
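One way to picture that tie-breaking behavior is to bucket results by similarity and sort within a bucket by importance. A sketch, not the actual ranking implementation:

```python
def rank_results(results: list[dict]) -> list[dict]:
    """Sort search results by similarity, using importance_score to break near-ties."""
    # Rounding similarity to two decimals groups "close" scores into the same bucket.
    return sorted(
        results,
        key=lambda r: (round(r["similarity"], 2), r["importance_score"]),
        reverse=True,
    )
```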
Stage 7: Retrieval and Rating
Every time a memory is returned in a search result, its retrieval_count increments and last_accessed_at updates. Every time a memory is explicitly retrieved or marked helpful by the agent, access_count increments and contributes to the importance score.
Conversely, memories that are never retrieved accumulate no access signals. Combined with confidence decay over time, their importance score trends downward.
Stage 8: Pruning
Memories become pruning candidates when all three conditions are true:
- importance_score < 0.3
- Created before the workspace's retention cutoff
- last_accessed_at is more than 90 days ago
Pruning candidates are surfaced in the dashboard and via the API — they are not automatically deleted. You review and confirm deletion, or override by explicitly accessing the memory (which resets last_accessed_at and lifts its importance score).
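The three conditions translate directly into a predicate. A minimal sketch, with the retention cutoff supplied by the workspace plan as described below; timestamps are assumed to be timezone-aware:

```python
from datetime import datetime, timedelta, timezone

def is_pruning_candidate(
    importance_score: float,
    created_at: datetime,
    last_accessed_at: datetime,
    retention_cutoff: datetime,  # plan-dependent: 30 days (Free), 1 year (Pro), configurable (Team)
    now: datetime | None = None,
) -> bool:
    """All three conditions from the list above must hold."""
    now = now or datetime.now(timezone.utc)
    return (
        importance_score < 0.3
        and created_at < retention_cutoff
        and now - last_accessed_at > timedelta(days=90)
    )
```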
Pruning is workspace-plan-dependent. Free tier workspaces have a 30-day retention period. Pro workspaces have a 1-year retention period. Team workspaces have configurable retention. A memory can reach the pruning candidate state before 90 days of inactivity if it is also past the plan's retention cutoff.
Consolidation
When the discovery pipeline identifies two or more memories as near-duplicates — high semantic similarity, same memory type, overlapping source files — it surfaces them as consolidation candidates. You confirm the consolidation via the dashboard or API.
Consolidation creates a new memory that merges the narratives of the source memories, inherits all their relationships, and records the source IDs in consolidated_from. The source memories are archived (not deleted) so the provenance is preserved.
The consolidated memory starts with an importance score derived from the highest-scoring source and carries the union of all tags, concepts, and source files from the originals.
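As a rough sketch of that merge, with field names mirroring the description above (this is not the actual consolidation routine):

```python
def consolidate(sources: list[dict]) -> dict:
    """Merge near-duplicate memories into one consolidated record (illustrative sketch)."""
    return {
        "narrative": "\n\n".join(m["narrative"] for m in sources),
        "consolidated_from": [m["id"] for m in sources],                 # provenance of the merge
        "importance_score": max(m["importance_score"] for m in sources), # highest-scoring source
        "tags": sorted({t for m in sources for t in m["tags"]}),         # union of tags
        "concepts": sorted({c for m in sources for c in m["concepts"]}),
        "source_files": sorted({f for m in sources for f in m["source_files"]}),
    }
```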
Lifecycle Summary
| Stage | Trigger | Synchronous? | Timing |
|---|---|---|---|
| Observation capture | Agent tool use | Yes | During session |
| Batch extraction | Session end | No | Seconds after session close |
| Embedding generation | Memory creation | Yes | Immediate |
| Relationship discovery | Embedding complete | No | Seconds after creation |
| PageRank scoring | Daily cron | No | 01:00 UTC |
| Community detection | Daily cron | No | 01:30 UTC |
| Pruning evaluation | Continuous | No | Background |
Related Pages
- What is a Memory — memory fields, types, and the data model
- Relationships and Graph — how relationship discovery and community detection work
- Sessions API Reference — full endpoint documentation for session lifecycle management