# Coding Agent Memory
Every session, you re-explain the same architectural decisions. The async session factory configuration. Why you're using selectinload instead of joinedload. The fact that pgvector's HNSW index is non-transactional. Store those decisions once — surface them automatically when they matter again.
## Problem
Claude Code sessions are stateless. When a new session opens on your monorepo, it has no memory of the three-hour debugging session where you diagnosed the DetachedInstanceError pattern, no record of the team's decision to use StrEnum for choice fields, and no awareness of the fifteen architectural trade-offs you've made over the past month. You spend the first 20 minutes of every session re-establishing context that should already be there.
## Persona
Maya works as a backend engineer on a FastAPI monorepo that's been in development for 18 months. She uses Claude Code daily for code review, debugging, and feature implementation. Her codebase has well-established patterns, but each new Claude Code session is a blank slate that doesn't know any of them.
## Prerequisites
- Neuroloom plugin installed: `/plugin install neuroloom@endless-galaxy-studios`
- API key configured: `/plugins configure neuroloom`
- Or: API key and workspace ID set as environment variables for REST access
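For the REST examples below, the token and workspace ID come from the `MEMORIES_API_TOKEN` and `MEMORIES_WORKSPACE_ID` environment variables. If you script against the API, a small fail-fast helper saves confusing 401s later; this helper is illustrative, not part of the plugin:

```python
import os

def load_credentials(env=os.environ):
    """Read Neuroloom REST credentials, failing fast with a clear message.

    `env` is any mapping, so tests and scripts can pass a plain dict.
    """
    required = ("MEMORIES_API_TOKEN", "MEMORIES_WORKSPACE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["MEMORIES_API_TOKEN"], env["MEMORIES_WORKSPACE_ID"]
```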
## Step 1: Store your first architecture decision
The most valuable memories are the "why we did this" decisions — the ones that aren't obvious from reading the code.
Use memory_store to store a memory:
- title: "SQLAlchemy lazy loading strategy"
- memory_type: "decision"
- content: "All SQLAlchemy relationships use lazy='raise' as the default. This forces explicit loading via selectinload() or joinedload() and prevents N+1 query bugs. Any code that accesses a relationship without an explicit loader will raise an error immediately rather than silently issuing a query. See api/neuroloom_api/models/memory.py for the base model configuration."
- concepts: ["SQLAlchemy", "lazy loading", "N+1 queries", "performance"]
- files: ["api/neuroloom_api/models/memory.py", "api/neuroloom_api/database.py"]
- importance: 0.95

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "SQLAlchemy lazy loading strategy",
    "memory_type": "decision",
    "narrative": "All SQLAlchemy relationships use lazy='\''raise'\'' as the default. This forces explicit loading via selectinload() or joinedload() and prevents N+1 query bugs. Any code that accesses a relationship without an explicit loader will raise an error immediately rather than silently issuing a query. See api/neuroloom_api/models/memory.py for the base model configuration.",
    "concepts": ["SQLAlchemy", "lazy loading", "N+1 queries", "performance"],
    "source_files": [
      "api/neuroloom_api/models/memory.py",
      "api/neuroloom_api/database.py"
    ],
    "importance_score": 0.95
  }'
```

Response:

```json
{
  "id": "mem-4a9f1c2e",
  "title": "SQLAlchemy lazy loading strategy",
  "memory_type": "decision",
  "importance_score": 0.95,
  "created_at": "2026-04-01T09:15:00Z"
}
```

Tag source files. When Claude Code opens api/neuroloom_api/models/memory.py in a future session, the memory_by_file tool can surface this decision automatically.
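If you script memory creation rather than calling the tool by hand, a small builder can sanity-check the payload before it goes over the wire. This is a sketch against the request shape shown above (the field names `narrative`, `source_files`, and `importance_score` come from the curl example; the builder itself is not part of the Neuroloom API, and the accepted `memory_type` values are only those seen in this cookbook):

```python
VALID_TYPES = {"decision", "pattern", "convention"}  # types used in this cookbook

def build_store_payload(workspace_id, title, memory_type, narrative,
                        concepts=None, source_files=None, importance_score=0.5):
    """Assemble and validate a memories/store request body."""
    if not 0.0 <= importance_score <= 1.0:
        raise ValueError("importance_score must be between 0.0 and 1.0")
    if memory_type not in VALID_TYPES:
        raise ValueError(f"unexpected memory_type: {memory_type}")
    return {
        "workspace_id": workspace_id,
        "title": title,
        "memory_type": memory_type,
        "narrative": narrative,
        "concepts": concepts or [],
        "source_files": source_files or [],
        "importance_score": importance_score,
    }
```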
## Step 2: Search decisions by concept
In a new session — or when Claude Code hits a related code path — search for context:
Use memory_search with query "database relationship loading strategy" and memory_types ["decision", "pattern"]:

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "query": "database relationship loading strategy",
    "memory_types": ["decision", "pattern"],
    "limit": 10
  }'
```

Response:

```json
{
  "results": [
    {
      "id": "mem-4a9f1c2e",
      "title": "SQLAlchemy lazy loading strategy",
      "memory_type": "decision",
      "score": 0.91,
      "summary": "lazy=raise default forces explicit selectinload/joinedload, preventing N+1 bugs"
    },
    {
      "id": "mem-8b2d5e7f",
      "title": "Workspace isolation in all DB queries",
      "memory_type": "convention",
      "score": 0.73,
      "summary": "Every query must filter by workspace_id — no exceptions. pgvector indexes do not enforce tenant isolation."
    }
  ]
}
```

The second result surfaced a related convention even though the query didn't mention workspaces. Semantic search finds conceptually adjacent memories — not just exact matches.
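When you consume search results programmatically, it often helps to drop weak matches and keep only the strongest hit per memory type. A sketch over the response shape above (the 0.6 cutoff is an arbitrary illustration, not a documented threshold):

```python
def strongest_per_type(results, min_score=0.6):
    """Keep the highest-scoring search result for each memory_type above min_score."""
    best = {}
    for r in results:
        if r["score"] < min_score:
            continue  # discard weak semantic matches
        current = best.get(r["memory_type"])
        if current is None or r["score"] > current["score"]:
            best[r["memory_type"]] = r
    return best
```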
## Step 3: Use session context injection
Sessions give Claude Code automatic context at the start of each work period. When you start a session, Neuroloom injects the most relevant recent memories into context.
Each injected memory appears wrapped in a provenance marker — `[Neuroloom — retrieved for {file_path}]...[/Neuroloom]`. The marker tells the agent which file triggered the retrieval, so it can attribute the context to a specific code location rather than treating it as undifferentiated background.
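If your own tooling post-processes injected context, the provenance markers are straightforward to parse. A sketch, assuming the marker format is exactly as described above:

```python
import re

# Matches "[Neuroloom — retrieved for <path>] ... [/Neuroloom]" blocks.
MARKER = re.compile(
    r"\[Neuroloom — retrieved for (?P<path>[^\]]+)\](?P<body>.*?)\[/Neuroloom\]",
    re.DOTALL,
)

def parse_injected(context_text):
    """Return (file_path, memory_text) pairs from injected session context."""
    return [(m["path"], m["body"].strip()) for m in MARKER.finditer(context_text)]
```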
Use session_start:

```bash
curl -X POST https://api.neuroloom.dev/api/v1/sessions/start \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "project_path": "/Users/maya/Projects/neuroloom"
  }'
```

Response:

```json
{
  "session_id": "ses-3c8e1f4a",
  "injected_memories": [
    {
      "id": "mem-4a9f1c2e",
      "title": "SQLAlchemy lazy loading strategy",
      "memory_type": "decision",
      "importance_score": 0.95
    },
    {
      "id": "mem-8b2d5e7f",
      "title": "Workspace isolation in all DB queries",
      "memory_type": "convention",
      "importance_score": 0.92
    }
  ],
  "context_summary": "2 high-importance memories injected. Last session: 2026-03-31."
}
```

The session ID is what you use to end the session and trigger memory extraction. Keep it for Step 6.
To refresh context mid-session when you switch tasks:
Use session_get_context with session_id "ses-3c8e1f4a":

```bash
curl "https://api.neuroloom.dev/api/v1/sessions/ses-3c8e1f4a/context?max_memories=10" \
  -H "Authorization: Token $MEMORIES_API_TOKEN"
```

## Step 4: Store a pattern
Patterns describe recurring approaches — reusable solutions you want surfaced whenever a related problem appears.
Use memory_store:

- title: "ARQ worker argument serialization"
- memory_type: "pattern"
- content: "ARQ job functions must receive only JSON-serializable arguments. Never pass ORM objects into jobs — pass IDs as strings instead. The worker runs in a separate process with its own connection pool and cannot deserialize SQLAlchemy model instances. Pattern: accept entity_id: str, then load from DB inside the job function."
- concepts: ["ARQ", "background jobs", "serialization", "worker patterns"]
- files: ["api/neuroloom_api/workers/"]
- importance: 0.88

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "ARQ worker argument serialization",
    "memory_type": "pattern",
    "narrative": "ARQ job functions must receive only JSON-serializable arguments. Never pass ORM objects into jobs — pass IDs as strings instead. The worker runs in a separate process with its own connection pool and cannot deserialize SQLAlchemy model instances. Pattern: accept entity_id: str, then load from DB inside the job function.",
    "concepts": ["ARQ", "background jobs", "serialization", "worker patterns"],
    "source_files": ["api/neuroloom_api/workers/"],
    "importance_score": 0.88
  }'
```

## Step 5: Rate a memory
After using a retrieved memory, rate it. Ratings adjust importance scores over time — useful memories surface more often, stale ones fade.
Use memory_rate with memory_id "mem-4a9f1c2e", useful true, and context "Prevented N+1 bug when adding eager loading to the workspace query":

```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/mem-4a9f1c2e/feedback \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "useful": true,
    "context": "Prevented N+1 bug when adding eager loading to the workspace query"
  }'
```

Response:

```json
{
  "memory_id": "mem-4a9f1c2e",
  "previous_importance": 0.95,
  "updated_importance": 0.97,
  "rating_recorded": true
}
```

Rate memories when they save you time, when they're outdated, or when they point you in the wrong direction. The feedback loop makes your workspace smarter over sessions.
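This cookbook doesn't document the exact update rule, but the effect is directional: useful ratings push importance up, negative ones push it down, and scores stay within [0, 1]. A toy model of that behavior, purely for intuition (the 0.2 learning rate is arbitrary and is not Neuroloom's actual formula):

```python
def adjusted_importance(current, useful, rate=0.2):
    """Toy feedback rule: move importance toward 1.0 if useful, toward 0.0 if not."""
    target = 1.0 if useful else 0.0
    return round(current + rate * (target - current), 4)  # stays within [0, 1]
```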
## Step 6: End the session and extract memories
When you finish a work period, end the session. Neuroloom enqueues a batch extraction job that converts your session observations into structured memories.
Use session_end with session_id "ses-3c8e1f4a" and summary "Added async graph endpoint for topic exploration. Resolved DetachedInstanceError in session factory by verifying expire_on_commit=False. Discovered and documented pgvector ef_search tuning approach.":

```bash
curl -X POST https://api.neuroloom.dev/api/v1/sessions/ses-3c8e1f4a/end \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "Added async graph endpoint for topic exploration. Resolved DetachedInstanceError in session factory by verifying expire_on_commit=False. Discovered and documented pgvector ef_search tuning approach."
  }'
```

Response:

```json
{
  "session_id": "ses-3c8e1f4a",
  "status": "ended",
  "extraction_job_id": "job-9d2f4a1b",
  "estimated_memories": 4
}
```

The extraction job runs asynchronously. Within a few minutes, new memories appear in your workspace — derived from what happened during the session. You can search them immediately after they're created.
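The cookbook doesn't show an endpoint for checking extraction progress, so any polling loop has to be written against whatever status call your setup exposes. A generic sketch where `fetch_status` is a callable you supply (hypothetical, as are the "completed"/"failed" status strings):

```python
import time

def wait_for_job(fetch_status, job_id, timeout=300, interval=5, sleep=time.sleep):
    """Poll fetch_status(job_id) until it reports completion or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status == "completed":
            return True
        if status == "failed":
            raise RuntimeError(f"extraction job {job_id} failed")
        sleep(interval)  # injectable so tests can skip real waiting
    return False
```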
## Production Patterns
### Deduplicate before storing
Before storing a memory about a well-known topic, search first. Duplicate memories create noise and dilute search quality.
```bash
# Search before storing a new decision
curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "query": "SQLAlchemy session configuration",
    "memory_types": ["decision"],
    "limit": 3
  }'
```

If a match with a score above 0.85 comes back, update the existing memory rather than storing a new one.
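The same rule in scripted form: search first, then decide whether to update or store. The 0.85 threshold comes from the guidance above; the helper itself is illustrative, not an API call:

```python
def dedupe_decision(search_results, threshold=0.85):
    """Return the id of an existing memory to update, or None to store a new one."""
    for r in sorted(search_results, key=lambda r: r["score"], reverse=True):
        if r["score"] >= threshold:
            return r["id"]  # strong match: update this memory instead
    return None
```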
### Pin critical memories with high importance
Important conventions that must never be missed — workspace isolation, security invariants, breaking API contracts — should use importance_score: 1.0. These surface in every session context injection regardless of recency.
```bash
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "Workspace isolation invariant",
    "memory_type": "convention",
    "narrative": "Every database query must filter by workspace_id. This is a hard security invariant, not a performance optimization. pgvector similarity queries do not enforce tenant isolation — the WHERE clause is mandatory.",
    "importance_score": 1.0
  }'
```

### One workspace per active project
Use separate workspaces for separate projects. Mixing memories from unrelated codebases degrades search quality — a query about "async patterns" in a Node.js project shouldn't surface memories about Python asyncio.
Create additional workspaces at app.neuroloom.dev and switch `MEMORIES_WORKSPACE_ID` per project directory.
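One way to automate the switch is a lookup from project directory to workspace ID. A minimal sketch; in practice the mapping might live in a per-project dotfile, and the paths and workspace IDs here are illustrative (the neuroloom path matches the session_start example above):

```python
from pathlib import Path

# Illustrative mapping of project roots to workspace IDs.
WORKSPACES = {
    "/Users/maya/Projects/neuroloom": "ws-neuroloom",
}

def workspace_for(path):
    """Resolve the workspace ID for a project directory or any path inside it."""
    p = Path(path).resolve()
    for root, ws_id in WORKSPACES.items():
        if p == Path(root) or Path(root) in p.parents:
            return ws_id
    raise KeyError(f"no workspace configured for {path}")
```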
## Before You Ship
- Search your workspace for the 10 most important architectural decisions — confirm they're captured
- Test session context injection: start a new session and verify the injected memories are the right ones
- Rate at least 5 memories after use to seed the feedback loop
- Confirm `source_files` are set on all file-specific decisions so `memory_by_file` works
- Check that security and isolation invariants have `importance_score: 1.0`
- Run `session_end` with a summary at the end of each work period to trigger extraction
## Related
- Decision Tracking — formalize ADRs as typed memories with supersession handling
- CLAUDE.md Migration — migrate your existing static context into searchable memory
- MCP Tools Reference — full tool reference for session and memory tools
- Concepts: Memories — memory types, importance scoring, and lifecycle