This is a small, populated example library created by Funes. Several PDFs of papers were added to starter-library/raw/. Claude Code was then used to "ingest" them. You can use any LLM for this task, such as Codex or Deepseek, but using the best model you have access to with the largest possible context window is recommended. Browse the folders in starter-library to see the type of output Funes generates.
When you add information to your raw directory, you are not limited to PDFs. You can add any type of information, such as text, pictures, screenshots, code, etc.
Funes is a Git-based framework for turning raw sources into durable, cited knowledge work with an AI Librarian.
You can find Funes here.
The Librarian ingests raw sources, preserves them as an immutable record, compiles them into an interlinked Markdown wiki, and uses that wiki to produce cited answers, reports, analyses, routines, and other reusable outputs. You provide the sources and questions; the Librarian handles the writing, linking, indexing, health checks, and upkeep.
Everything lives in plain Markdown inside a Git repo, so your knowledge base is versioned, diffable, portable, searchable, and usable from GitHub or any editor.
The workflow adapts Andrej Karpathy's "LLM Knowledge Bases" idea to a plain Git repo instead of Obsidian.
Funes is named after Borges's “Funes the Memorious,” a character who remembers everything but cannot abstract.
This project preserves the raw record, then abstracts it into concepts, topics, and usable outputs.
raw source ─ingest→ raw/ (verbatim, immutable)
─compile→ wiki/sources/ (one summary note per source)
→ wiki/concepts/ (atomic articles, one idea each)
→ wiki/topics/ (maps of related concepts)
question ─answer→ read wiki, cite articles
─output→ outputs/ (reports, analyses, routines, answers)
→ wiki/ (durable findings filed back in)
The wiki is not the end product. It is the working memory the Librarian uses to answer questions, generate reports, produce routines, notice gaps, and keep the knowledge base coherent over time.
You rarely edit the wiki by hand. You supply sources and questions; the Librarian maintains the structure, links, indexes, and outputs.
-
Use this repository as a template with GitHub's “Use this template” button, or clone it.
-
Open it with an agentic coding tool such as Claude Code, Codex, or any LLM agent that can read and edit files in a repo. The agent reads
AGENTS.mdto learn how to behave as the Librarian. -
Add sources. Drop PDFs, web clips, notes, or other materials into
starter-library/raw/and say:ingest the new sources in
raw/Or paste text or a link directly in chat and say:
ingest this
-
Ask questions. The Librarian answers with citations into the wiki, writes substantial outputs to
outputs/, and offers to file durable findings back into the knowledge base. -
Keep it healthy. Periodically ask for a “health check” to audit broken links, duplicate concepts, stale indexes, contradictions, gaps, and possible new articles.
Rename or copy starter-library/ to suit your topic, such as physics/, history/, research/, or personal-kb/. To run several separate knowledge bases in one repo, add more top-level library folders. See library.md.
starter-library/— a ready-to-use empty knowledge base with the standardraw / wiki / outputs / metascaffold and seed index files.AGENTS.md— the entry point for agents: what each folder is and how to work in the repo.protocol.md— the shared Librarian Protocol: the full ingest → compile → Q&A → output → health-check workflow, plus conventions and article templates.library.md— the recipe for creating additional libraries in the same repo.
You do not write these by hand. They show the shape of the compiled wiki. The full templates live in protocol.md.
A source note summarizes a raw source and links to the concepts it feeds:
---
title: Attention Is All You Need
type: source
tags: [transformers, attention]
created: 2026-01-10
updated: 2026-01-10
---
# Attention Is All You Need
- **Raw file:** [2026-01-10-attention-is-all-you-need.pdf](../../raw/2026-01-10-attention-is-all-you-need.pdf)
- **Original:** https://arxiv.org/abs/1706.03762
## Summary
Introduces the Transformer, a sequence model based entirely on attention, dropping recurrence and convolution.
## Key takeaways
- Self-attention relates all positions in a sequence in O(1) sequential steps.
- Multi-head attention lets the model attend to different subspaces at once.
## Concepts extracted
- [Self-attention](../concepts/self-attention.md)
- [Multi-head attention](../concepts/multi-head-attention.md)An atomic concept explains one idea and links it back to sources, related concepts, and topic maps:
---
title: Self-attention
type: concept
tags: [transformers]
created: 2026-01-10
updated: 2026-01-10
---
# Self-attention
A mechanism that computes a representation of a sequence by relating each position to every other position, weighting them by learned compatibility.
## Related
- [Multi-head attention](./multi-head-attention.md)
## Sources
- [Attention Is All You Need](../sources/attention-is-all-you-need.md)
## Topics
- [Transformer architecture](../topics/transformer-architecture.md)An output is a substantial answer, report, routine, or analysis generated from the wiki:
# Reading plan for understanding Transformers
This plan draws on the compiled notes for [Attention Is All You Need](../wiki/sources/attention-is-all-you-need.md), [Self-attention](../wiki/concepts/self-attention.md), and [Multi-head attention](../wiki/concepts/multi-head-attention.md).
## Goal
Understand why the Transformer replaced recurrence for many sequence-modeling tasks.
## Sequence
1. Read the source note for *Attention Is All You Need*.
2. Review the concept article on self-attention.
3. Review multi-head attention.
4. Compare the topic map on Transformer architecture against the original paper.
## Durable findings to file back
- Add a concept article on positional encoding.
- Add a topic map for sequence modeling.Pattern adapted from Andrej Karpathy's "LLM Knowledge Bases".