Karpathy's LLM Wiki with Claude Code: Setup Tutorial

Apr 14, 2026

This is the complete Claude Code setup for Karpathy's LLM Wiki pattern — the version we use ourselves, end to end, written for developers who already live in Claude Code or Cursor and just want a working pipeline by tonight. By the end of this guide you will have a real CLAUDE.md, a real schema.md, a working ingest.py, and a wiki that compiles itself every time you drop a new source. If the LLM Wiki concept itself is still fuzzy, start with the plain-English primer; this guide assumes you know what you are building.

The whole setup takes about 30 minutes if you already have Claude Code installed, an hour if you are starting from scratch. We will work through five steps, then list the five mistakes that ruin most pipelines.

Why Claude Code is the right home for an LLM Wiki

Claude Code is the most ergonomic LLM Wiki host because of one feature: it reads CLAUDE.md automatically on every session. That means the schema, the rules, and the project conventions live in one file that is always loaded into context — no manual prompting, no copy-paste. Compare that to running the LLM Wiki pattern through the web UI, where you re-paste the schema every session.

Cursor works almost identically with .cursorrules. The whole guide below applies to Cursor with two small changes: rename the file, and point Cursor at the right rules location.

You could run an LLM Wiki without Claude Code (many people do, with Gemini or the Claude web app). But if you are reading this, you probably already have Claude Code open in a terminal and an LLM Wiki on your to-do list, so that is the path this guide optimizes for.

[Figure: Claude Code as the host for a Karpathy LLM Wiki pipeline]

Step 1 — Create the project structure

Open a new directory and lay out the four folders an LLM Wiki needs:

mkdir my-wiki && cd my-wiki
mkdir -p raw wiki meta scripts
touch CLAUDE.md meta/schema.md scripts/ingest.py scripts/lint.py

That gives you:

  • raw/ — the input pile that feeds the LLM Wiki. PDFs, articles, transcripts, screenshots, anything you want the wiki to absorb. Never edited by hand.
  • wiki/ — the output. Plain markdown files, written by Claude, organized by entity type. Read it like a document.
  • meta/ — schema, project rules, anything Claude needs to know but should not appear in the wiki itself.
  • scripts/ — the small Python files that run the compile and lint loops.
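Before moving on, it can help to confirm the layout is complete. The sketch below is a minimal check, assuming the folder and file names from the commands above; the helper name missing_paths is hypothetical and not part of the original setup.

```python
from pathlib import Path

# Names taken from the mkdir/touch commands in Step 1.
EXPECTED_DIRS = ["raw", "wiki", "meta", "scripts"]
EXPECTED_FILES = ["CLAUDE.md", "meta/schema.md", "scripts/ingest.py", "scripts/lint.py"]


def missing_paths(root: Path) -> list[str]:
    """Return the expected folders and files that do not exist under root."""
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling missing_paths(Path(".")) from inside my-wiki should return an empty list once Step 1 is done.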

Initialize git in this folder. Every compile is a commit, which is your safety net when an LLM run goes wrong:

git init && git add . && git commit -m "Initial LLM Wiki structure"
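If you would rather script the "every compile is a commit" habit than remember it, something like the following could run after each reviewed compile. This is a sketch, not part of the original pipeline: the function name commit_compile and the injectable run parameter are assumptions, added so the logic can be exercised without a real repository.

```python
import subprocess


def commit_compile(message: str, run=subprocess.run) -> bool:
    """Stage all changes and commit them; return False when nothing changed.

    `run` defaults to subprocess.run and is injectable for testing.
    """
    run(["git", "add", "-A"], check=True)
    # --porcelain prints one line per changed path and nothing for a clean tree.
    status = run(
        ["git", "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    if not status.stdout.strip():
        return False  # avoid a failing `git commit` on a clean tree
    run(["git", "commit", "-m", message], check=True)
    return True
```

A call such as commit_compile("Compile: new sources") after copying approved blocks into wiki/ gives you one commit per compile to roll back to.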

Step 2 — Write a tight CLAUDE.md

This is the file Claude Code reads first on every session, and the single most important file in your LLM Wiki project. A good CLAUDE.md is one page long. A bad one is five. Here is the shape we use:

# CLAUDE.md — LLM Wiki Project Rules

## What this project is
A personal knowledge base that compiles raw sources into structured wiki pages.
Read meta/schema.md first — it defines the shape of every wiki page.

## Folder rules
- raw/ holds inputs. Never edit files in raw/.
- wiki/ holds compiled output. Edit only via the compile script (scripts/ingest.py).
- meta/ holds schema, glossary, and project state.
- scripts/ holds the compile and lint loops.

## When I say "compile"
1. Read meta/schema.md
2. Read every new file in raw/ (anything modified after meta/last-compile.txt)
3. For each new source, decide which wiki/ pages to create or update following the schema
4. Output each page as a separate markdown block with full filename
5. Update meta/last-compile.txt to current timestamp
6. Never overwrite Open Questions sections silently; flag contradictions instead

## When I say "lint"
1. Run scripts/lint.py
2. Report any schema violations (missing frontmatter, broken wikilinks, pages over 800 words)
3. Suggest fixes; do not auto-apply

## Style
- Aim for 200-500 words per page; 800 words is the hard limit
- Use [[wikilinks]] for any reference to another page
- Mark uncertain claims with "Hypothesis:" prefix
- Cite sources at the bottom of each page

The trick is to make CLAUDE.md a rules file, not a tutorial. Save explanations and examples for meta/schema.md. CLAUDE.md should be something Claude can scan in five seconds and apply consistently.

If you want to skip the LLM Wiki iteration phase, our LLM Wiki Starter Kit ships with five tuned CLAUDE.md files (general, research, engineering, product, SEO) — each one is the result of dozens of real compilations, ready to drop into your project.

Step 3 — Define your schema

Where CLAUDE.md is rules for the LLM Wiki, schema.md is the shape of knowledge. It defines what entity types exist and how they relate. Keep it concrete — abstract schemas produce abstract wikis.

Here is a minimal engineering-project schema:

# Schema: Engineering LLM Wiki

## Entity types
- module: a code subsystem with a clear boundary
- decision: an architectural or design decision (ADR-style)
- issue: a known bug, gotcha, or open problem
- dependency: an external library or service
- runbook: an operational procedure

## Page format
Every page starts with frontmatter:
  type: module | decision | issue | dependency | runbook
  status: active | deprecated | resolved
  owner: [name or team]
  updated: YYYY-MM-DD

Body for module pages:
1. Purpose (one sentence)
2. Public API (the surface other modules see)
3. Internal architecture (3-6 bullets)
4. Dependencies (wikilinks)
5. Decisions (wikilinks to ADRs)
6. Known issues (wikilinks)

Body for decision pages:
1. Context (what we needed to decide)
2. Options considered
3. Decision and rationale
4. Tradeoffs accepted
5. Date and status

## Rules
- Every module page must list its public API
- Every decision page must include a date
- Issues stay until status: resolved
- Deprecations are never deleted, only marked
- Keep each page under 800 words

This is one of the schemas in the free starter pack — grab the academic and general-purpose versions there too if you want to compare shapes. The thing to internalize: schemas are tunable. Your first three compiles will tell you what your schema is missing. Edit the schema, not the wiki output.
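The frontmatter rules in this schema are mechanical enough to check in code. The sketch below assumes the frontmatter is a block of simple key: value lines between two --- delimiter lines (the schema does not spell out the delimiters, so that part is an assumption), and the function names are hypothetical.

```python
import re
from datetime import date

# Allowed values copied from the "Page format" section of the schema.
ALLOWED_TYPES = {"module", "decision", "issue", "dependency", "runbook"}
ALLOWED_STATUS = {"active", "deprecated", "resolved"}


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' block of flat 'key: value' lines (no nested YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the frontmatter as missing


def frontmatter_errors(text: str) -> list[str]:
    """Return a list of schema violations found in a page's frontmatter."""
    fields = parse_frontmatter(text)
    if not fields:
        return ["missing frontmatter"]
    errors = []
    if fields.get("type") not in ALLOWED_TYPES:
        errors.append(f"invalid type: {fields.get('type')!r}")
    if fields.get("status") not in ALLOWED_STATUS:
        errors.append(f"invalid status: {fields.get('status')!r}")
    if not fields.get("owner"):
        errors.append("missing owner")
    updated = fields.get("updated", "")
    try:
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", updated):
            raise ValueError(updated)
        date.fromisoformat(updated)  # rejects impossible dates such as 2026-02-30
    except ValueError:
        errors.append(f"updated is not YYYY-MM-DD: {updated!r}")
    return errors
```

When you tune the schema, the two sets at the top are the only part that should need to change.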

Step 4 — Write the ingest script

Here is the smallest LLM Wiki ingest script that actually works. It finds every file in raw/ modified since the last compile, asks Claude to compile each one through the claude CLI, and stages the output in meta/compile-*.md for review.

#!/usr/bin/env python3
"""scripts/ingest.py — compile new sources in raw/ into wiki/ pages."""
import datetime
import subprocess
from pathlib import Path

ROOT = Path(__file__).parent.parent
RAW = ROOT / "raw"
WIKI = ROOT / "wiki"
SCHEMA = (ROOT / "meta" / "schema.md").read_text()
LAST_COMPILE_FILE = ROOT / "meta" / "last-compile.txt"

def newer_than_last_compile(path):
    if not LAST_COMPILE_FILE.exists():
        return True
    last = float(LAST_COMPILE_FILE.read_text().strip())
    return path.stat().st_mtime > last

def compile_source(source_path):
    source = source_path.read_text(errors="ignore")
    prompt = f"""You are compiling raw input into an LLM Wiki.

SCHEMA:
{SCHEMA}

NEW SOURCE ({source_path.name}):
{source}

INSTRUCTIONS:
1. Decide which wiki/ pages should be created or updated from this source.
2. Output each page as a separate markdown code block.
3. Each block must start with a comment: # filename: wiki/<type>/<slug>.md
4. Follow the schema exactly — frontmatter, body sections, wikilinks.
5. Never overwrite existing Open Questions sections.
"""
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True, text=True, check=True
    )
    return result.stdout

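def main():
    if not RAW.exists():
        print("raw/ folder missing")
        return
    # Record the start time up front: a file dropped into raw/ while a
    # compile is running will then still count as new on the next run.
    started = datetime.datetime.now().timestamp()
    for path in sorted(RAW.rglob("*")):
        if not path.is_file() or not newer_than_last_compile(path):
            continue
        print(f"Compiling {path.name}…")
        output = compile_source(path)
        # Stage the LLM output for human review instead of writing to wiki/.
        (ROOT / "meta" / f"compile-{path.stem}.md").write_text(output)
    LAST_COMPILE_FILE.write_text(str(started))
    print("Done. Review meta/compile-*.md and copy approved blocks into wiki/.")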

if __name__ == "__main__":
    main()

Notice it does not write to the LLM Wiki directly — it stages the LLM output in meta/compile-*.md files for human review first. This is the single most important pattern: never let an LLM commit straight to your LLM Wiki without a quick human pass. Once you trust the schema, you can graduate to direct writes, but start staged.
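Once you have reviewed a staged file, copying approved blocks by hand gets tedious. The sketch below shows one way to pull pages out of a staged file and write them under wiki/. It assumes Claude followed the prompt exactly: each page in its own fenced block whose first line is the "# filename:" comment. The function names are hypothetical, and pages that themselves contain fenced code would need a more careful parser.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

# One fenced block whose first line is "# filename: wiki/<type>/<slug>.md".
BLOCK_RE = re.compile(
    rf"^{FENCE}[^\n]*\n# filename: (?P<name>[^\n]+)\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_pages(staged: str) -> dict[str, str]:
    """Map each declared filename to the page body that follows it."""
    return {m.group("name").strip(): m.group("body") for m in BLOCK_RE.finditer(staged)}


def apply_pages(pages: dict[str, str], root: Path) -> list[Path]:
    """Write approved pages under root/wiki/, refusing any path outside it."""
    wiki = (root / "wiki").resolve()
    written = []
    for name, body in pages.items():
        target = (root / name).resolve()
        if wiki not in target.parents or target.suffix != ".md":
            raise ValueError(f"refusing to write outside wiki/: {name}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body)
        written.append(target)
    return written
```

Delete any block you do not approve from the staged file first, then run extract_pages on what remains; the path check is there so a malformed filename cannot land outside wiki/.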

The full version of this script (with retries, parallel compile, schema-aware diffing, and incremental updates) ships in the Starter Kit.

Step 5 — Run the lint loop

The LLM Wiki lint script catches schema violations before they pile up:

#!/usr/bin/env python3
"""scripts/lint.py — check wiki/ for schema violations."""
import re
import sys
from pathlib import Path

WIKI = Path(__file__).parent.parent / "wiki"
pages = sorted(WIKI.rglob("*.md"))
# Pages live in wiki/<type>/<slug>.md, so accept a link written either as
# the bare slug ([[auth]]) or as the path under wiki/ ([[module/auth]]).
known = {p.stem for p in pages} | {
    p.relative_to(WIKI).with_suffix("").as_posix() for p in pages
}
errors = []

for page in pages:
    text = page.read_text()
    # rule 1: must have frontmatter
    if not text.startswith("---"):
        errors.append(f"{page}: missing frontmatter")
    # rule 2: must be under 800 words
    word_count = len(re.findall(r"\w+", text))
    if word_count > 800:
        errors.append(f"{page}: {word_count} words (limit 800)")
    # rule 3: every wikilink target should exist
    for link in re.findall(r"\[\[([^\]]+)\]\]", text):
        # drop an optional |alias or #heading before looking up the target
        target = re.split(r"[|#]", link)[0].strip()
        if target not in known:
            errors.append(f"{page}: broken wikilink [[{link}]]")

if errors:
    print(f"❌ {len(errors)} issues found:")
    for e in errors:
        print(f"  {e}")
    sys.exit(1)
print(f"✅ Wiki clean ({len(pages)} pages)")

Run python scripts/lint.py after every LLM Wiki compile. Make it a git pre-commit hook if you want, so you cannot ship a wiki with broken wikilinks.
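For the pre-commit route, a hook can be as small as the sketch below. It is an illustration under assumptions: the file would be saved as .git/hooks/pre-commit and made executable, and the helper name run_lint is hypothetical. Git runs hooks from the top of the working tree, so the relative path to scripts/lint.py resolves there.

```python
#!/usr/bin/env python3
"""Sketch of .git/hooks/pre-commit: abort the commit when scripts/lint.py fails."""
import subprocess
import sys
from pathlib import Path

LINT = Path("scripts/lint.py")  # relative to the repository root


def run_lint(cmd=None) -> int:
    """Run the lint command and return its exit code (non-zero blocks the commit)."""
    return subprocess.run(cmd or [sys.executable, str(LINT)]).returncode


# Skip quietly when the lint script is absent, so the hook never blocks
# commits in a checkout that does not have it.
if __name__ == "__main__" and LINT.exists():
    sys.exit(run_lint())
```

Because lint.py already exits with status 1 on any violation, the hook only has to pass that exit code through.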

[Figure: Running the LLM Wiki ingest and lint loop in Claude Code]

The five mistakes that kill LLM Wiki pipelines

After helping a few dozen developers set this up, these are the patterns we see fail most often. Avoid these and you will land your first useful LLM Wiki in a weekend.

1. Schema bloat

Symptom: your LLM Wiki schema is over two pages long and the output feels mechanical and uninspired. Fix: throw half of the schema out. Simpler schemas produce richer wikis because they leave room for the LLM to make connections.

2. Letting the LLM write directly to wiki/

Symptom: you ran the compile script, came back, and discovered Claude rewrote three pages you had hand-edited. Fix: use the staging pattern from Step 4. Always review LLM output before it lands in your real wiki, until you trust the schema.

3. Compiling everything at once

Symptom: you dumped 200 PDFs into raw/, ran the script, and got 400 mediocre wiki pages. Fix: compile one source at a time during the first month. You are tuning the schema, not processing a backlog.
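One way to enforce the one-at-a-time habit is to have the pipeline hand you a single pending source per run. The helper below is a hypothetical sketch, not part of the ingest script above: it picks the oldest file in raw/ modified after a given timestamp.

```python
from pathlib import Path
from typing import Optional


def next_source(raw: Path, last_compile: float) -> Optional[Path]:
    """Return the oldest file under raw/ modified after last_compile, or None."""
    pending = [
        p for p in raw.rglob("*")
        if p.is_file() and p.stat().st_mtime > last_compile
    ]
    return min(pending, key=lambda p: p.stat().st_mtime, default=None)
```

If you adopt this, record the compiled file's own modification time as the new last-compile value rather than the current time, so the remaining sources stay pending. Files that share an identical modification time would need a tie-break that this sketch does not handle.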

4. Skipping the lint loop

Symptom: six months in, you have 80 wiki pages, half of them have broken wikilinks, and you do not trust anything anymore. Fix: lint after every compile. Cheap insurance.

5. Not reading the wiki

Symptom: the LLM is writing pages and you never re-read them. The LLM Wiki rots quietly. Fix: read at least three random pages every week. The moment something feels wrong, edit the schema (not the page) and recompile.
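If picking pages "at random" by hand tends to mean rereading your favorites, a few lines of Python can choose for you. This is a minimal sketch; the function name and the seed parameter are assumptions added for illustration.

```python
import random
from pathlib import Path


def pages_to_review(wiki: Path, k: int = 3, seed=None) -> list[Path]:
    """Pick up to k distinct wiki pages at random for a weekly read-through."""
    pages = sorted(wiki.rglob("*.md"))
    rng = random.Random(seed)  # pass a seed only if you want a repeatable pick
    return rng.sample(pages, min(k, len(pages)))
```

Running pages_to_review(Path("wiki")) once a week gives you three pages to open, or fewer if the wiki is still small.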

Where to go next

If you want the polished version of everything in this guide — five tuned CLAUDE.md files, a full ingest pipeline with diffing, the lint script with auto-fix, and a 15-minute video walkthrough — that is exactly what the LLM Wiki Starter Kit is. $19 at launch for waitlist subscribers.

Either way, our occasional newsletter sends an email when we have something worth your time — new schema patterns, fresh implementations, the best community discussions. No fixed schedule, no upsell. Subscribe below.
