# Coding Agent Memory
Every session, you re-explain the same architectural decisions. The async session factory configuration. Why you're using selectinload instead of joinedload. The fact that pgvector's HNSW index is non-transactional. Store those decisions once — surface them automatically when they matter again.
## Problem
Claude Code sessions are stateless. When a new session opens on your monorepo, it has no memory of the three-hour debugging session where you diagnosed the DetachedInstanceError pattern, no record of the team's decision to use StrEnum for choice fields, and no awareness of the fifteen architectural trade-offs you've made over the past month. You spend the first 20 minutes of every session re-establishing context that should already be there.
## Persona
Maya works as a backend engineer on a FastAPI monorepo that's been in development for 18 months. She uses Claude Code daily for code review, debugging, and feature implementation. Her codebase has well-established patterns, but each new Claude Code session is a blank slate that doesn't know any of them.
## Prerequisites
- Neuroloom plugin installed: `/plugin install neuroloom@endless-galaxy-studios`
- API key configured: `/plugins configure neuroloom`
- Or: API key and workspace ID set as environment variables for REST access
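For the REST examples below, the token and workspace ID come from the `MEMORIES_API_TOKEN` and `MEMORIES_WORKSPACE_ID` environment variables. If you script against the API, a small fail-fast helper saves confusing 401s later; this helper is illustrative, not part of the plugin:

```python
import os

def load_credentials(env=os.environ):
    """Read Neuroloom REST credentials, failing fast with a clear message.

    `env` is any mapping, so tests and scripts can pass a plain dict.
    """
    required = ("MEMORIES_API_TOKEN", "MEMORIES_WORKSPACE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["MEMORIES_API_TOKEN"], env["MEMORIES_WORKSPACE_ID"]
```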
## Step 1: Store your first architecture decision
The most valuable memories are the "why we did this" decisions — the ones that aren't obvious from reading the code.
Use memory_store to store a memory:
- title: "SQLAlchemy lazy loading strategy"
- memory_type: "decision"
- content: "All SQLAlchemy relationships use lazy='raise' as the default. This forces explicit loading via selectinload() or joinedload() and prevents N+1 query bugs. Any code that accesses a relationship without an explicit loader will raise an error immediately rather than silently issuing a query. See api/neuroloom_api/models/memory.py for the base model configuration."
- concepts: ["SQLAlchemy", "lazy loading", "N+1 queries", "performance"]
- files: ["api/neuroloom_api/models/memory.py", "api/neuroloom_api/database.py"]
- importance: 0.95

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "SQLAlchemy lazy loading strategy",
    "memory_type": "decision",
    "narrative": "All SQLAlchemy relationships use lazy='\''raise'\'' as the default. This forces explicit loading via selectinload() or joinedload() and prevents N+1 query bugs. Any code that accesses a relationship without an explicit loader will raise an error immediately rather than silently issuing a query. See api/neuroloom_api/models/memory.py for the base model configuration.",
    "concepts": ["SQLAlchemy", "lazy loading", "N+1 queries", "performance"],
    "source_files": [
      "api/neuroloom_api/models/memory.py",
      "api/neuroloom_api/database.py"
    ],
    "importance_score": 0.95
  }'
```

Response:

```json
{
  "id": "mem-4a9f1c2e",
  "title": "SQLAlchemy lazy loading strategy",
  "memory_type": "decision",
  "importance_score": 0.95,
  "created_at": "2026-04-01T09:15:00Z"
}
```

Tag source files. When Claude Code opens api/neuroloom_api/models/memory.py in a future session, the memory_by_file tool can surface this decision automatically.
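If you script memory creation rather than calling the tool by hand, a small builder can sanity-check the payload before it goes over the wire. This is a sketch against the request shape shown above (the field names `narrative`, `source_files`, and `importance_score` come from the curl example; the builder itself is not part of the Neuroloom API, and the accepted `memory_type` values are only those seen in this cookbook):

```python
VALID_TYPES = {"decision", "pattern", "convention"}  # types used in this cookbook

def build_store_payload(workspace_id, title, memory_type, narrative,
                        concepts=None, source_files=None, importance_score=0.5):
    """Assemble and validate a memories/store request body."""
    if not 0.0 <= importance_score <= 1.0:
        raise ValueError("importance_score must be between 0.0 and 1.0")
    if memory_type not in VALID_TYPES:
        raise ValueError(f"unexpected memory_type: {memory_type}")
    return {
        "workspace_id": workspace_id,
        "title": title,
        "memory_type": memory_type,
        "narrative": narrative,
        "concepts": concepts or [],
        "source_files": source_files or [],
        "importance_score": importance_score,
    }
```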
## Step 2: Search decisions by concept
In a new session — or when Claude Code hits a related code path — search for context:
Use memory_search with query "database relationship loading strategy" and memory_types ["decision", "pattern"]:

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "query": "database relationship loading strategy",
    "memory_types": ["decision", "pattern"],
    "limit": 10
  }'
```

Response:

```json
{
  "results": [
    {
      "id": "mem-4a9f1c2e",
      "title": "SQLAlchemy lazy loading strategy",
      "memory_type": "decision",
      "score": 0.91,
      "summary": "lazy=raise default forces explicit selectinload/joinedload, preventing N+1 bugs"
    },
    {
      "id": "mem-8b2d5e7f",
      "title": "Workspace isolation in all DB queries",
      "memory_type": "convention",
      "score": 0.73,
      "summary": "Every query must filter by workspace_id — no exceptions. pgvector indexes do not enforce tenant isolation."
    }
  ]
}
```

The second result surfaced a related convention even though the query didn't mention workspaces. Semantic search finds conceptually adjacent memories — not just exact matches.
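When you consume search results programmatically, it often helps to drop weak matches and keep only the strongest hit per memory type. A sketch over the response shape above (the 0.6 cutoff is an arbitrary illustration, not a documented threshold):

```python
def strongest_per_type(results, min_score=0.6):
    """Keep the highest-scoring search result for each memory_type above min_score."""
    best = {}
    for r in results:
        if r["score"] < min_score:
            continue  # discard weak semantic matches
        current = best.get(r["memory_type"])
        if current is None or r["score"] > current["score"]:
            best[r["memory_type"]] = r
    return best
```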
## Step 3: Use session context injection
Sessions give Claude Code automatic context at the start of each work period. When you start a session, Neuroloom injects the most relevant recent memories into context.
Each injected memory appears wrapped in a provenance marker — `[Neuroloom — retrieved for {file_path}]...[/Neuroloom]`. The marker tells the agent which file triggered the retrieval, so it can attribute the context to a specific code location rather than treating it as undifferentiated background.
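If your own tooling post-processes injected context, the provenance markers are straightforward to parse. A sketch, assuming the marker format is exactly as described above:

```python
import re

# Matches "[Neuroloom — retrieved for <path>] ... [/Neuroloom]" blocks.
MARKER = re.compile(
    r"\[Neuroloom — retrieved for (?P<path>[^\]]+)\](?P<body>.*?)\[/Neuroloom\]",
    re.DOTALL,
)

def parse_injected(context_text):
    """Return (file_path, memory_text) pairs from injected session context."""
    return [(m["path"], m["body"].strip()) for m in MARKER.finditer(context_text)]
```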
Use session_start:

```bash
curl -X POST https://api.neuroloom.dev/api/v1/sessions/start \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "project_path": "/Users/maya/Projects/neuroloom"
  }'
```

Response:

```json
{
  "session_id": "ses-3c8e1f4a",
  "injected_memories": [
    {
      "id": "mem-4a9f1c2e",
      "title": "SQLAlchemy lazy loading strategy",
      "memory_type": "decision",
      "importance_score": 0.95
    },
    {
      "id": "mem-8b2d5e7f",
      "title": "Workspace isolation in all DB queries",
      "memory_type": "convention",
      "importance_score": 0.92
    }
  ],
  "context_summary": "2 high-importance memories injected. Last session: 2026-03-31."
}
```

The session ID is what you use to end the session and trigger memory extraction. Keep it for Step 6.
To refresh context mid-session when you switch tasks:
Use session_get_context with session_id "ses-3c8e1f4a":

```bash
curl "https://api.neuroloom.dev/api/v1/sessions/ses-3c8e1f4a/context?max_memories=10" \
  -H "Authorization: Token $MEMORIES_API_TOKEN"
```

## Step 4: Store a pattern
Patterns describe recurring approaches — reusable solutions you want surfaced whenever a related problem appears.
Use memory_store:

- title: "ARQ worker argument serialization"
- memory_type: "pattern"
- content: "ARQ job functions must receive only JSON-serializable arguments. Never pass ORM objects into jobs — pass IDs as strings instead. The worker runs in a separate process with its own connection pool and cannot deserialize SQLAlchemy model instances. Pattern: accept entity_id: str, then load from DB inside the job function."
- concepts: ["ARQ", "background jobs", "serialization", "worker patterns"]
- files: ["api/neuroloom_api/workers/"]
- importance: 0.88

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "ARQ worker argument serialization",
    "memory_type": "pattern",
    "narrative": "ARQ job functions must receive only JSON-serializable arguments. Never pass ORM objects into jobs — pass IDs as strings instead. The worker runs in a separate process with its own connection pool and cannot deserialize SQLAlchemy model instances. Pattern: accept entity_id: str, then load from DB inside the job function.",
    "concepts": ["ARQ", "background jobs", "serialization", "worker patterns"],
    "source_files": ["api/neuroloom_api/workers/"],
    "importance_score": 0.88
  }'
```

## Step 5: Rate a memory
After using a retrieved memory, rate it. Ratings adjust importance scores over time — useful memories surface more often, stale ones fade.
Use memory_rate with memory_id "mem-4a9f1c2e", useful true, and context "Prevented N+1 bug when adding eager loading to the workspace query":

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/mem-4a9f1c2e/feedback \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "useful": true,
    "context": "Prevented N+1 bug when adding eager loading to the workspace query"
  }'
```

Response:

```json
{
  "memory_id": "mem-4a9f1c2e",
  "previous_importance": 0.95,
  "updated_importance": 0.97,
  "rating_recorded": true
}
```

Rate memories when they save you time, when they're outdated, or when they point you in the wrong direction. The feedback loop makes your workspace smarter over sessions.
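This cookbook doesn't document the exact update rule, but the effect is directional: useful ratings push importance up, negative ones push it down, and scores stay within [0, 1]. A toy model of that behavior, purely for intuition (the 0.2 learning rate is arbitrary and is not Neuroloom's actual formula):

```python
def adjusted_importance(current, useful, rate=0.2):
    """Toy feedback rule: move importance toward 1.0 if useful, toward 0.0 if not."""
    target = 1.0 if useful else 0.0
    return round(current + rate * (target - current), 4)  # stays within [0, 1]
```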
## Step 6: End the session and extract memories
When you finish a work period, end the session. Neuroloom enqueues a batch extraction job that converts your session observations into structured memories.
Use session_end with session_id "ses-3c8e1f4a" and summary "Added async graph endpoint for topic exploration. Resolved DetachedInstanceError in session factory by verifying expire_on_commit=False. Discovered and documented pgvector ef_search tuning approach.":

```bash
curl -X POST https://api.neuroloom.dev/api/v1/sessions/ses-3c8e1f4a/end \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "Added async graph endpoint for topic exploration. Resolved DetachedInstanceError in session factory by verifying expire_on_commit=False. Discovered and documented pgvector ef_search tuning approach."
  }'
```

Response:

```json
{
  "session_id": "ses-3c8e1f4a",
  "status": "ended",
  "extraction_job_id": "job-9d2f4a1b",
  "estimated_memories": 4
}
```

The extraction job runs asynchronously. Within a few minutes, new memories appear in your workspace — derived from what happened during the session. You can search them immediately after they're created.
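The cookbook doesn't show an endpoint for checking extraction progress, so any polling loop has to be written against whatever status call your setup exposes. A generic sketch where `fetch_status` is a callable you supply (hypothetical, as are the "completed"/"failed" status strings):

```python
import time

def wait_for_job(fetch_status, job_id, timeout=300, interval=5, sleep=time.sleep):
    """Poll fetch_status(job_id) until it reports completion or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status == "completed":
            return True
        if status == "failed":
            raise RuntimeError(f"extraction job {job_id} failed")
        sleep(interval)  # injectable so tests can skip real waiting
    return False
```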
## Production Patterns
### Deduplicate before storing
Before storing a memory about a well-known topic, search first. Duplicate memories create noise and dilute search quality.
```bash
# Search before storing a new decision
curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "query": "SQLAlchemy session configuration",
    "memory_types": ["decision"],
    "limit": 3
  }'
```

If a match with a score above 0.85 comes back, update the existing memory rather than storing a new one.
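The same rule in scripted form: search first, then decide whether to update or store. The 0.85 threshold comes from the guidance above; the helper itself is illustrative, not an API call:

```python
def dedupe_decision(search_results, threshold=0.85):
    """Return the id of an existing memory to update, or None to store a new one."""
    for r in sorted(search_results, key=lambda r: r["score"], reverse=True):
        if r["score"] >= threshold:
            return r["id"]  # strong match: update this memory instead
    return None
```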
### Pin critical memories with high importance
Important conventions that must never be missed — workspace isolation, security invariants, breaking API contracts — should use importance_score: 1.0. These surface in every session context injection regardless of recency.
```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "Workspace isolation invariant",
    "memory_type": "convention",
    "narrative": "Every database query must filter by workspace_id. This is a hard security invariant, not a performance optimization. pgvector similarity queries do not enforce tenant isolation — the WHERE clause is mandatory.",
    "importance_score": 1.0
  }'
```

### One workspace per active project
Use separate workspaces for separate projects. Mixing memories from unrelated codebases degrades search quality — a query about "async patterns" in a Node.js project shouldn't surface memories about Python asyncio.
Create additional workspaces at app.neuroloom.dev and switch `MEMORIES_WORKSPACE_ID` per project directory.
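One way to automate the switch is a lookup from project directory to workspace ID. A minimal sketch; in practice the mapping might live in a per-project dotfile, and the paths and workspace IDs here are illustrative (the neuroloom path matches the session_start example above):

```python
from pathlib import Path

# Illustrative mapping of project roots to workspace IDs.
WORKSPACES = {
    "/Users/maya/Projects/neuroloom": "ws-neuroloom",
}

def workspace_for(path):
    """Resolve the workspace ID for a project directory or any path inside it."""
    p = Path(path).resolve()
    for root, ws_id in WORKSPACES.items():
        if p == Path(root) or Path(root) in p.parents:
            return ws_id
    raise KeyError(f"no workspace configured for {path}")
```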
## Before You Ship
- Search your workspace for the 10 most important architectural decisions — confirm they're captured
- Test session context injection: start a new session and verify the injected memories are the right ones
- Rate at least 5 memories after use to seed the feedback loop
- Confirm `source_files` are set on all file-specific decisions so `memory_by_file` works
- Check that security and isolation invariants have `importance_score: 1.0`
- Run `session_end` with a summary at the end of each work period to trigger extraction
## Related
- Decision Tracking — formalize ADRs as typed memories with supersession handling
- CLAUDE.md Migration — migrate your existing static context into searchable memory
- MCP Tools Reference — full tool reference for session and memory tools
- Concepts: Memories — memory types, importance scoring, and lifecycle