Knowledge Base Ingestion

Your project's documentation — architecture guides, runbooks, onboarding docs, API specs — exists as files. Storing them in Neuroloom turns static documents into a searchable knowledge base that agents query by meaning, not filename.

Problem

New engineers spend their first week reading through scattered documentation to understand how the system works. Senior engineers re-read the same runbook every time they're on-call because there's no faster way to find "what do I do when the ARQ worker stops processing jobs." Documentation that lives in files gets searched by filename — or not at all.

Persona

Marcus maintains a FastAPI monorepo with a 200-line CLAUDE.md and a growing folder of architecture guides, runbooks, and ADRs. The files are accurate, but they're scattered — his coding agent re-reads the wrong runbook or misses the relevant ADR because it can't search by meaning. He wants to store these documents as searchable memories so the agent can ask "how do we handle job retries in ARQ?" and get the right answer, not "docs/runbooks/arq-worker-troubleshooting.md".

Prerequisites

  • Neuroloom API key from app.neuroloom.dev/settings/api-keys
  • Environment variables:
    export MEMORIES_API_TOKEN="nl_your_api_key_here"
    export MEMORIES_WORKSPACE_ID="ws_your_workspace_id_here"
  • Python 3.11+ and httpx installed: pip install httpx
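
Before the batch run, a one-off request confirms the token and workspace ID are valid. Any authenticated endpoint works; this sketch reuses the search endpoint that appears in later steps:

import os
import httpx

response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {os.environ['MEMORIES_API_TOKEN']}"},
    json={
        "workspace_id": os.environ["MEMORIES_WORKSPACE_ID"],
        "query": "smoke test",
        "limit": 1,
    },
)
response.raise_for_status()  # a 401/403 here means the token or workspace ID is wrong
print("Credentials OK")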

Step 1: Batch-store documents

Store each document as a separate memory. Use memory_type to categorize by document type — this enables filtered search later ("show me only runbooks about authentication").

import os
import glob
import time
import httpx
from pathlib import Path

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]

# Map file path prefixes to memory types
def memory_type_for_path(path: str) -> str:
    if "adr" in path or "decisions" in path:
        return "decision"
    if "runbook" in path or "ops" in path:
        return "wiki"
    if "architecture" in path or "design" in path:
        return "architecture"
    return "wiki"

def ingest_document(filepath: str) -> dict:
    content = Path(filepath).read_text()
    # Use first heading as title, fall back to filename
    lines = content.strip().split("\n")
    title = lines[0].lstrip("# ").strip() if lines[0].startswith("#") else Path(filepath).stem

    response = httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/store",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "title": title,
            "memory_type": memory_type_for_path(filepath),
            "narrative": content,
            "source_files": [filepath],
            "tags": ["ingested", "docs"],
            "importance_score": 0.75,
        },
    )
    response.raise_for_status()  # fail fast on auth or validation errors
    return response.json()

# Ingest all markdown files in docs/
docs = glob.glob("docs/**/*.md", recursive=True)
start = time.perf_counter()
for doc in docs:
    result = ingest_document(doc)
    print(f"Stored: {result['id']}  {result['title']}")
print(f"Stored {len(docs)} documents in {time.perf_counter() - start:.1f} seconds")

# Single document example — use a script loop for batch ingestion
curl -X POST https://api.neuroloom.dev/api/v1/memories/store \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "title": "ARQ worker troubleshooting",
    "memory_type": "wiki",
    "narrative": "# ARQ Worker Troubleshooting\n\nIf the ARQ worker stops processing jobs, check these steps in order:\n\n1. Verify Redis connection: `redis-cli ping` from the worker host\n2. Check worker process status: `systemctl status neuroloom-worker`\n3. Review worker logs for serialization errors — common cause is ORM objects passed as job arguments\n4. Verify the job queue depth: `arq.info` shows pending job count\n5. Restart procedure: drain the queue first, then restart\n\nWorker settings are in api/neuroloom_api/workers/settings.py",
    "source_files": ["docs/runbooks/arq-worker-troubleshooting.md"],
    "tags": ["ingested", "docs", "runbook"],
    "importance_score": 0.75
  }'

Expected output for the Python script:

Stored: mem-1a2b3c4d  ARQ worker troubleshooting
Stored: mem-5e6f7g8h  Database migration strategy
Stored: mem-9i0j1k2l  Authentication service architecture
Stored: mem-3m4n5o6p  StrEnum over Postgres ENUM
...
Stored 47 documents in 12.3 seconds

Note

Embedding generation runs asynchronously after each store call. All 47 documents will have embeddings within 60–90 seconds of the batch completing. Keyword search works immediately; semantic search activates as each embedding completes.
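
If a pipeline needs to block until semantic search is live before running verification queries, one option is to poll with a probe phrased differently from the document's own wording, so a keyword match alone will not satisfy it. A sketch; the probe text, expected title, and polling interval are illustrative:

import os
import time
import httpx

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]

def wait_for_embeddings(probe_query: str, expected_title: str, timeout: float = 120.0) -> bool:
    """Poll semantic search until a known document surfaces for a paraphrased query."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = httpx.post(
            "https://api.neuroloom.dev/api/v1/memories/search",
            headers={"Authorization": f"Token {token}"},
            json={"workspace_id": workspace_id, "query": probe_query, "limit": 5},
        )
        titles = [r["title"] for r in response.json().get("results", [])]
        if expected_title in titles:
            return True
        time.sleep(5)  # embeddings typically land within 60-90 seconds
    return False

# Probe deliberately avoids the document's own words "ARQ" and "troubleshooting"
wait_for_embeddings("background job runner is stuck", "ARQ worker troubleshooting")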


Step 2: Tag by source type

Tags make filtered search possible. If you didn't add granular tags during ingestion, add them now with targeted searches and batch updates.

Identify all runbooks:

response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {token}"},
    json={
        "workspace_id": workspace_id,
        "query": "troubleshooting operational procedure",
        "memory_types": ["wiki"],
        "limit": 50,
    },
)

runbook_ids = [r["id"] for r in response.json()["results"] if r["score"] > 0.7]
print(f"Found {len(runbook_ids)} runbook candidates")

Then add a "runbook" tag to each, one update call per memory. Send the complete tag list, not just the new tag, so the existing tags are preserved:

for memory_id in runbook_ids:
    httpx.patch(
        f"https://api.neuroloom.dev/api/v1/memories/{memory_id}",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "tags": ["ingested", "docs", "runbook"],
        },
    )
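
With dozens of memories to update, reusing a single HTTP connection avoids per-request TLS setup. The same loop with an httpx client session:

with httpx.Client(
    base_url="https://api.neuroloom.dev",
    headers={"Authorization": f"Token {token}"},
    timeout=30.0,
) as client:
    for memory_id in runbook_ids:
        client.patch(
            f"/api/v1/memories/{memory_id}",
            json={"workspace_id": workspace_id, "tags": ["ingested", "docs", "runbook"]},
        )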

Step 3: Semantic search the knowledge base

Search by what you need to do, not by document name:

Use memory_search with query "what to do when the background job worker stops processing", memory_types ["wiki"], and tags ["runbook"]

curl -X POST https://api.neuroloom.dev/api/v1/memories/search \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "query": "what to do when the background job worker stops processing",
    "memory_types": ["wiki"],
    "tags": ["runbook"],
    "limit": 5
  }'

response = httpx.post(
    "https://api.neuroloom.dev/api/v1/memories/search",
    headers={"Authorization": f"Token {token}"},
    json={
        "workspace_id": workspace_id,
        "query": "what to do when the background job worker stops processing",
        "memory_types": ["wiki"],
        "tags": ["runbook"],
        "limit": 5,
    },
)
for r in response.json()["results"]:
    print(f"{r['score']:.2f}  {r['title']}")

Response:

{
  "results": [
    {
      "id": "mem-1a2b3c4d",
      "title": "ARQ worker troubleshooting",
      "memory_type": "wiki",
      "score": 0.93,
      "summary": "Troubleshooting steps for ARQ worker job processing failures — Redis, process status, serialization errors"
    },
    {
      "id": "mem-7f8g9h0i",
      "title": "ARQ worker configuration",
      "memory_type": "architecture",
      "score": 0.78,
      "summary": "WorkerSettings configuration, queue depth limits, and connection pool separation"
    }
  ]
}

The query described the situation in plain language. The right runbook surfaced with a 0.93 score. The architecture doc surfaced as a secondary result because it's conceptually adjacent.


Step 4: File-based retrieval

When an engineer opens a specific file and asks "what should I know about this?", retrieve memories by file path:

Use memory_by_file with file_path "api/neuroloom_api/workers/settings.py"

curl -X POST https://api.neuroloom.dev/api/v1/memories/by-file \
  -H "Authorization: Token $MEMORIES_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "'"$MEMORIES_WORKSPACE_ID"'",
    "file_path": "api/neuroloom_api/workers/settings.py",
    "limit": 10
  }'

Response:

{
  "results": [
    {
      "id": "mem-1a2b3c4d",
      "title": "ARQ worker troubleshooting",
      "memory_type": "wiki",
      "matched_files": ["docs/runbooks/arq-worker-troubleshooting.md"],
      "importance_score": 0.75
    },
    {
      "id": "mem-2b3c4d5e",
      "title": "ARQ job argument serialization pattern",
      "memory_type": "pattern",
      "matched_files": ["api/neuroloom_api/workers/"],
      "importance_score": 0.88
    }
  ]
}

The runbook's narrative references api/neuroloom_api/workers/settings.py, which is why it surfaces for this path even though its source_files entry points at the markdown file. The pattern memory was stored independently and uses a directory-level path that matches.
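
To make narrative-referenced paths first-class matches at ingestion time, one option is to extract path-like strings from each document and append them to source_files. A rough sketch; the regex is a local heuristic, not part of the API:

import re

# Heuristic: strings that look like repo-relative paths with common extensions
PATH_RE = re.compile(r"\b[\w./-]+\.(?:py|md|toml|yaml|yml|json)\b")

def referenced_paths(narrative: str) -> list[str]:
    return sorted(set(PATH_RE.findall(narrative)))

# In Step 1's ingest_document, the source_files line would become:
#     "source_files": [filepath, *referenced_paths(content)],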


Step 5: Build a search workflow

Combine search into a single callable function that agents can use for knowledge base queries:

import os
import httpx
from typing import Optional

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]


def knowledge_search(
    query: str,
    doc_type: Optional[str] = None,
    file_context: Optional[str] = None,
    limit: int = 5,
) -> list[dict]:
    """
    Search the project knowledge base by natural language query.

    Args:
        query: Natural language question or task description
        doc_type: Tag to filter by, e.g. 'runbook' from Step 2, or None for all
        file_context: File path to surface memories related to that file
        limit: Maximum results to return

    Returns:
        List of memory summaries with title, type, score, and ID
    """
    # File-based retrieval if a file context is provided
    if file_context:
        response = httpx.post(
            "https://api.neuroloom.dev/api/v1/memories/by-file",
            headers={"Authorization": f"Token {token}"},
            json={"workspace_id": workspace_id, "file_path": file_context, "limit": limit},
        )
        return response.json().get("results", [])

    # Semantic search
    payload: dict = {
        "workspace_id": workspace_id,
        "query": query,
        "limit": limit,
    }
    if doc_type:
        payload["tags"] = [doc_type]

    response = httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/search",
        headers={"Authorization": f"Token {token}"},
        json=payload,
    )
    return response.json().get("results", [])


# Usage
results = knowledge_search("how to handle Redis connection failures in production")
for r in results:
    print(f"{r['score']:.2f}  {r['title']}")

Production Patterns

Chunk large documents

Documents longer than ~2,000 words should be chunked before ingestion. A 10,000-word architecture guide stored as a single memory produces embeddings that represent the whole document — not specific sections. A query about "connection pool sizing" won't rank well against a document that covers connection pools in one of 15 sections.

Chunking strategy (a sketch follows the list):

  • Split on major headings (## )
  • Aim for 400–800 words per chunk
  • Prefix each chunk's title with the parent document title: "Architecture Guide: Connection Pool Sizing"
  • Store all chunks with the same source_files reference so they're retrievable by file
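
A minimal sketch of that strategy for markdown inputs. The 400–800 word merge pass is omitted, and the filename and the "Overview" label for pre-heading text are illustrative:

import os
import httpx
from pathlib import Path

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]

def chunk_by_heading(filepath: str) -> list[dict]:
    """Split a markdown document on '## ' headings into title/narrative chunks."""
    content = Path(filepath).read_text()
    lines = content.strip().split("\n")
    doc_title = lines[0].lstrip("# ").strip() if lines[0].startswith("#") else Path(filepath).stem

    chunks: list[dict] = []
    heading, body = "Overview", []

    def flush() -> None:
        narrative = "\n".join(body).strip()
        if narrative:
            chunks.append({"title": f"{doc_title}: {heading}", "narrative": narrative})

    for line in lines[1:]:
        if line.startswith("## "):
            flush()  # close out the previous section before starting a new one
            heading, body = line[3:].strip(), []
        else:
            body.append(line)
    flush()
    return chunks

# Every chunk shares the same source_files so file-based retrieval still works
for chunk in chunk_by_heading("docs/architecture-guide.md"):
    httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/store",
        headers={"Authorization": f"Token {token}"},
        json={
            "workspace_id": workspace_id,
            "title": chunk["title"],
            "memory_type": "architecture",
            "narrative": chunk["narrative"],
            "source_files": ["docs/architecture-guide.md"],
            "tags": ["ingested", "docs"],
            "importance_score": 0.75,
        },
    )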

Track freshness with ingestion dates

Tag memories with the ingestion date and document version. When documents change, you can find stale memories:

from datetime import date

tags = ["ingested", f"ingested:{date.today().isoformat()}", "docs"]

Run a periodic search for old ingestion tags and re-ingest changed documents.

Incremental ingestion

Don't re-ingest the entire knowledge base every time a document changes. Track ingested file hashes in a local manifest:

import hashlib
import json
from pathlib import Path

# docs and ingest_document come from the Step 1 script
manifest_path = Path(".neuroloom-manifest.json")
manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

for doc in docs:
    content = Path(doc).read_text()
    content_hash = hashlib.sha256(content.encode()).hexdigest()
    if manifest.get(doc) == content_hash:
        print(f"Skipped (unchanged): {doc}")
        continue
    result = ingest_document(doc)
    manifest[doc] = content_hash
    print(f"Ingested: {result['title']}")

manifest_path.write_text(json.dumps(manifest, indent=2))

Before You Ship

  • Confirm all documents are ingested with appropriate memory_type values
  • Verify source_files is set on every ingested memory so file-based retrieval works
  • Run 5 natural-language queries representing common engineer questions — confirm the right documents surface (a harness sketch follows this list)
  • Check that large documents were chunked — search for specific sub-topics within long docs and verify they rank well
  • Add ingested:{date} tags to support freshness tracking
  • Commit the ingestion manifest to version control so incremental re-ingestion is repeatable
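
A tiny harness for that verification step. The query/title pairs are illustrative; in practice they should come from real on-call and onboarding questions:

import os
import httpx

token = os.environ["MEMORIES_API_TOKEN"]
workspace_id = os.environ["MEMORIES_WORKSPACE_ID"]

# Illustrative pairs: natural-language question -> title expected in the top results
CHECKS = {
    "what to do when the background job worker stops processing": "ARQ worker troubleshooting",
    "how do we version and apply database schema changes": "Database migration strategy",
}

for query, expected_title in CHECKS.items():
    response = httpx.post(
        "https://api.neuroloom.dev/api/v1/memories/search",
        headers={"Authorization": f"Token {token}"},
        json={"workspace_id": workspace_id, "query": query, "limit": 5},
    )
    titles = [r["title"] for r in response.json().get("results", [])]
    status = "PASS" if expected_title in titles else "FAIL"
    print(f"{status}  {query!r} -> top hit: {titles[0] if titles else 'none'}")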
