Knowledge Base Ingestion
Your project's documentation — architecture guides, runbooks, onboarding docs, API specs — exists as files. Storing them in Neuroloom turns static documents into a searchable knowledge base that agents query by meaning, not filename.
Problem
New engineers spend their first week reading through scattered documentation to understand how the system works. Senior engineers re-read the same runbook every time they're on-call because there's no faster way to find "what do I do when the ARQ worker stops processing jobs." Documentation that lives in files gets searched by filename — or not at all.
Persona
Marcus maintains a FastAPI monorepo with a 200-line CLAUDE.md and a growing folder of architecture guides, runbooks, and ADRs. The files are accurate, but they're scattered — his coding agent re-reads the wrong runbook or misses the relevant ADR because it can't search by meaning. He wants to store these documents as searchable memories so the agent can ask "how do we handle job retries in ARQ?" and get the right answer, not "docs/runbooks/arq-worker-troubleshooting.md".
Prerequisites
- Neuroloom API key from app.neuroloom.dev/settings/api-keys
- Environment variables:

export MEMORIES_API_TOKEN="nl_your_api_key_here"
export MEMORIES_WORKSPACE_ID="ws_your_workspace_id_here"

- Python 3.11+ and httpx installed: pip install httpx
Step 1: Batch-store documents
Store each document as a separate memory. Use memory_type to categorize by document type — this enables filtered search later ("show me only runbooks about authentication").
import os
import glob
import httpx
from pathlib import Path
token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]
# Map file path prefixes to memory types
def memory_type_for_path(path: str) -> str:
    if "adr" in path or "decisions" in path:
        return "decision"
    if "runbook" in path or "ops" in path:
        return "wiki"
    if "architecture" in path or "design" in path:
        return "architecture"
    return "wiki"

def ingest_document(filepath: str) -> dict:
    content = Path(filepath).read_text()
    # Use first heading as title, fall back to filename
    lines = content.strip().split("\n")
    title = lines[0].lstrip("# ").strip() if lines[0].startswith("#") else Path(filepath).stem
    return httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/store",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "title": title,
            "memory_type": memory_type_for_path(filepath),
            "narrative": content,
            "source_files": [filepath],
            "tags": ["ingested", "docs"],
            "importance_score": 0.75,
        },
    ).json()

# Ingest all markdown files in docs/
docs = glob.glob("docs/**/*.md", recursive=True)
for doc in docs:
    result = ingest_document(doc)
    print(f"Stored: {result['id']} {result['title']}")

# Single document example — use a script loop for batch ingestion
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
-H "Authorization: Token $MEMORIES_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
"title": "ARQ worker troubleshooting",
"memory_type": "wiki",
"narrative": "# ARQ Worker Troubleshooting\n\nIf the ARQ worker stops processing jobs, check these steps in order:\n\n1. Verify Redis connection: `redis-cli ping` from the worker host\n2. Check worker process status: `systemctl status neuroloom-worker`\n3. Review worker logs for serialization errors — common cause is ORM objects passed as job arguments\n4. Verify the job queue depth: `arq.info` shows pending job count\n5. Restart procedure: drain the queue first, then restart\n\nWorker settings are in api/neuroloom_api/workers/settings.py",
"source_files": ["docs/runbooks/arq-worker-troubleshooting.md"],
"tags": ["ingested", "docs", "runbook"],
"importance_score": 0.75
}'

Expected output for the Python script:
Stored: mem-1a2b3c4d ARQ worker troubleshooting
Stored: mem-5e6f7g8h Database migration strategy
Stored: mem-9i0j1k2l Authentication service architecture
Stored: mem-3m4n5o6p StrEnum over Postgres ENUM
...
Stored 47 documents in 12.3 seconds

Embedding generation runs asynchronously after each store call. All 47 documents will have embeddings within 60–90 seconds of the batch completing. Keyword search works immediately; semantic search activates as each embedding completes.
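If a downstream step depends on semantic results, a minimal readiness check can poll search until a known document ranks for a paraphrased query. This is a sketch, not an API feature: it reuses token, workspace_id, and httpx from the ingestion script, and wait_for_semantic_search is a hypothetical helper.

import time

def wait_for_semantic_search(probe_query: str, expected_title: str, timeout: float = 120.0) -> bool:
    """Poll semantic search until the expected document ranks, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = httpx.post(
            "https://api.neuroloom.dev/api/v1/memories/search",
            headers={"Authorization": f"Token {token}"},
            json={"workspace_id": workspace_id, "query": probe_query, "limit": 5},
        )
        if expected_title in [r["title"] for r in response.json().get("results", [])]:
            return True
        time.sleep(5)  # embeddings typically land within 60-90 seconds
    return False

# Probe with a paraphrase, not the literal title, so a hit proves semantic matching
if wait_for_semantic_search("worker stopped processing jobs", "ARQ worker troubleshooting"):
    print("Semantic search is live")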
Step 2: Tag by source type
Tags make filtered search possible. If you didn't add granular tags during ingestion, add them now with targeted searches and batch updates.
Identify all runbooks:
response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {token}"},
    json={
        "workspace_id": workspace_id,
        "query": "troubleshooting operational procedure",
        "memory_types": ["wiki"],
        "limit": 50,
    },
)
runbook_ids = [r["id"] for r in response.json()["results"] if r["score"] > 0.7]
print(f"Found {len(runbook_ids)} runbook candidates")

Then add a "runbook" tag to each. Currently this requires an update call per memory — use the ID from each search result:
for memory_id in runbook_ids:
    httpx.patch(
        f"https://api.neuroloom.dev/api/v1/memories/{memory_id}",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "tags": ["ingested", "docs", "runbook"],
        },
    )

Step 3: Semantic search the knowledge base
Search by what you need to do, not by document name:
Use memory_search with query "what to do when the background job worker stops processing" and memory_types ["wiki"]

curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
-H "Authorization: Token $MEMORIES_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
"query": "what to do when the background job worker stops processing",
"memory_types": ["wiki"],
"tags": ["runbook"],
"limit": 5
}'

response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {token}"},
    json={
        "workspace_id": workspace_id,
        "query": "what to do when the background job worker stops processing",
        "memory_types": ["wiki"],
        "tags": ["runbook"],
        "limit": 5,
    },
)
for r in response.json()["results"]:
    print(f"{r['score']:.2f} {r['title']}")

Response:
{
  "results": [
    {
      "id": "mem-1a2b3c4d",
      "title": "ARQ worker troubleshooting",
      "memory_type": "wiki",
      "score": 0.93,
      "summary": "Troubleshooting steps for ARQ worker job processing failures — Redis, process status, serialization errors"
    },
    {
      "id": "mem-7f8g9h0i",
      "title": "ARQ worker configuration",
      "memory_type": "architecture",
      "score": 0.78,
      "summary": "WorkerSettings configuration, queue depth limits, and connection pool separation"
    }
  ]
}

The query described the situation in plain language. The right runbook surfaced with a 0.93 score. The architecture doc surfaced as a secondary result because it's conceptually adjacent.
Step 4: File-based retrieval
When an engineer opens a specific file and asks "what should I know about this?", retrieve memories by file path:
Use memory_by_file with file_path "api/neuroloom_api/workers/settings.py"

curl -X POST https://api.neuroloom.dev/api/v1/memories/by-file \
-H "Authorization: Token $MEMORIES_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
"file_path": "api/neuroloom_api/workers/settings.py",
"limit": 10
}'

Response:

{
  "results": [
    {
      "id": "mem-1a2b3c4d",
      "title": "ARQ worker troubleshooting",
      "memory_type": "wiki",
      "matched_files": ["docs/runbooks/arq-worker-troubleshooting.md"],
      "importance_score": 0.75
    },
    {
      "id": "mem-2b3c4d5e",
      "title": "ARQ job argument serialization pattern",
      "memory_type": "pattern",
      "matched_files": ["api/neuroloom_api/workers/"],
      "importance_score": 0.88
    }
  ]
}

The runbook references api/neuroloom_api/workers/settings.py in its narrative — which was captured in source_files during ingestion. The pattern memory was stored independently and uses a directory-level path that matches.
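Directory-level source_files entries are how you get that second behavior: a memory stored against a directory surfaces for any file under it. A sketch of storing such a memory, reusing token and workspace_id from Step 1 (the narrative text here is illustrative, not the real memory's content):

# Illustrative: a directory-level path in source_files makes the memory
# surface for any file under api/neuroloom_api/workers/
httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/store",
    headers={"Authorization": f"Token {token}"},
    json={
        "workspace_id": workspace_id,
        "title": "ARQ job argument serialization pattern",
        "memory_type": "pattern",
        "narrative": "Pass primitive IDs as job arguments, never ORM objects; workers re-fetch by ID.",
        "source_files": ["api/neuroloom_api/workers/"],  # directory, not a single file
        "tags": ["pattern"],
        "importance_score": 0.88,
    },
)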
Step 5: Build a search workflow
Combine semantic and file-based search into a single callable function that agents can use for knowledge base queries:
import os
import httpx
from typing import Optional

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]

def knowledge_search(
    query: str,
    doc_type: Optional[str] = None,
    file_context: Optional[str] = None,
    limit: int = 5,
) -> list[dict]:
    """
    Search the project knowledge base by natural language query.

    Args:
        query: Natural language question or task description
        doc_type: Filter to 'runbook', 'adr', 'architecture', or None for all
        file_context: File path to surface memories related to that file
        limit: Maximum results to return

    Returns:
        List of memory summaries with title, type, score, and ID
    """
    # File-based retrieval if a file context is provided
    if file_context:
        response = httpx.post(
            "https://api.neuroloom.dev/api/v1/memories/by-file",
            headers={"Authorization": f"Token {token}"},
            json={"workspace_id": workspace_id, "file_path": file_context, "limit": limit},
        )
        return response.json().get("results", [])

    # Semantic search
    payload: dict = {
        "workspace_id": workspace_id,
        "query": query,
        "limit": limit,
    }
    if doc_type:
        payload["tags"] = [doc_type]
    response = httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/search",
        headers={"Authorization": f"Token {token}"},
        json=payload,
    )
    return response.json().get("results", [])

# Usage
results = knowledge_search("how to handle Redis connection failures in production")
for r in results:
    print(f"{r['score']:.2f} {r['title']}")

Production Patterns
Chunk large documents
Documents longer than ~2,000 words should be chunked before ingestion. A 10,000-word architecture guide stored as a single memory produces embeddings that represent the whole document — not specific sections. A query about "connection pool sizing" won't rank well against a document that covers connection pools in one of 15 sections.
Chunking strategy (a minimal sketch follows the list):

- Split on major headings (##)
- Aim for 400–800 words per chunk
- Prefix each chunk's title with the parent document title: "Architecture Guide: Connection Pool Sizing"
- Store all chunks with the same source_files reference so they're retrievable by file
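A minimal chunker under these rules, assuming markdown input and reusing token and workspace_id from Step 1. It splits on ## headings only; merging short sections to hit the 400–800 word target is left out for brevity, and chunk_document is an illustrative helper, not part of the API:

import re
from pathlib import Path

def chunk_document(filepath: str) -> list[tuple[str, str]]:
    """Split a markdown file on ## headings; returns (title, body) pairs."""
    content = Path(filepath).read_text()
    doc_title = Path(filepath).stem
    chunks = []
    for section in re.split(r"\n(?=## )", content):
        heading = section.split("\n", 1)[0].lstrip("# ").strip()
        # Prefix with the parent document title so chunks stay attributable
        chunks.append((f"{doc_title}: {heading}", section))
    return chunks

for title, body in chunk_document("docs/architecture-guide.md"):
    httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/store",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "title": title,
            "memory_type": "architecture",
            "narrative": body,
            # Same source_files on every chunk keeps file-based retrieval working
            "source_files": ["docs/architecture-guide.md"],
            "tags": ["ingested", "docs", "chunked"],
            "importance_score": 0.75,
        },
    )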
Track freshness with ingestion dates
Tag memories with the ingestion date and document version. When documents change, you can find stale memories:
from datetime import date

tags = ["ingested", f"ingested:{date.today().isoformat()}", "docs"]

Run a periodic search for old ingestion tags and re-ingest changed documents.
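One way that periodic sweep could look, assuming search results include each memory's tags (the response examples above omit them, so verify against your API version; the 90-day cutoff is illustrative):

from datetime import date, timedelta

cutoff = date.today() - timedelta(days=90)  # illustrative staleness threshold

response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {token}"},
    json={"workspace_id": workspace_id, "query": "ingested documentation", "tags": ["ingested"], "limit": 100},
)
stale = [
    r for r in response.json().get("results", [])
    if any(
        t.startswith("ingested:") and date.fromisoformat(t.split(":", 1)[1]) < cutoff
        for t in r.get("tags", [])
    )
]
print(f"{len(stale)} memories predate {cutoff.isoformat()}; consider re-ingesting their sources")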
Incremental ingestion
Don't re-ingest the entire knowledge base every time a document changes. Track ingested file hashes in a local manifest:
import hashlib
import json
from pathlib import Path

manifest_path = Path(".neuroloom-manifest.json")
manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

for doc in docs:
    content = Path(doc).read_text()
    content_hash = hashlib.sha256(content.encode()).hexdigest()
    if manifest.get(doc) == content_hash:
        print(f"Skipped (unchanged): {doc}")
        continue
    result = ingest_document(doc)
    manifest[doc] = content_hash
    print(f"Ingested: {result['title']}")

manifest_path.write_text(json.dumps(manifest, indent=2))

Before You Ship
- Confirm all documents are ingested with appropriate memory_type values
- Verify source_files is set on every ingested memory so file-based retrieval works
- Run 5 natural-language queries representing common engineer questions — confirm the right documents surface (a smoke-test sketch follows this list)
- Check that large documents were chunked — search for specific sub-topics within long docs and verify they rank well
- Add ingested:{date} tags to support freshness tracking
- Commit the ingestion manifest to version control so incremental re-ingestion is repeatable
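A quick smoke test for that third check, reusing knowledge_search from Step 5. The queries are examples drawn from this cookbook; substitute the questions your team actually asks:

SMOKE_QUERIES = [
    "what to do when the background job worker stops processing",
    "how do we handle job retries in ARQ",
    "where are database migrations defined",
    "how does authentication work between services",
    "why did we choose StrEnum over Postgres ENUM",
]

for q in SMOKE_QUERIES:
    results = knowledge_search(q, limit=3)
    top = results[0] if results else None
    status = f"{top['score']:.2f} {top['title']}" if top else "NO RESULTS"
    print(f"{q!r} -> {status}")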
Related
- Decision Tracking — formalize architectural decisions as typed memories
- CLAUDE.md Migration — migrate static project context into searchable memory
- Graph Exploration — discover how ingested documents connect to each other
- REST API Reference — full endpoint documentation for store, search, and by-file