AIP-10: KNOWLEDGE.md — agentknowledge/v1 (LLM-maintained wiki)
A filesystem-first knowledge-base format where an LLM curates, links, and lints a markdown wiki on top of immutable raw sources, turning agent knowledge into a compounding artifact instead of a per-query retrieval miss.
| Field | Value |
|---|---|
| AIP | 10 |
| Title | KNOWLEDGE.md — agentknowledge/v1 (LLM-maintained wiki) |
| Status | Draft |
| Type | Schema |
| Domain | knowledge.sh |
| Doctypes | knowledge.entry/v1 (curated), knowledge.source/v1 (immutable), knowledge.workspace/v1 (manifest + view) |
| Requires | AIP-1, AIP-2 |
| Composes with | AIP-3 (skills), AIP-6 (companies), AIP-7 (governance), AIP-9 (operators) |
| Reference Impl | TBD |
Abstract
agentknowledge/v1 defines a markdown-based wiki format that an LLM owns
end-to-end. Three doctypes cooperate: raw sources stay immutable
(knowledge.source/v1), curated entries are rewritten by the agent on every
ingest (knowledge.entry/v1), and a workspace manifest declares the wiki's
shape — entity types, lint rules, retention, curation policy — in a
machine-parseable file (knowledge.workspace/v1, written as KNOWLEDGE.md).
The same workspace doctype, used recursively via extends:, also expresses
per-context views: an operator (AIP-9), a company
(AIP-6), or a skill (AIP-3) can ship its OWN
KNOWLEDGE.md that adapts the base workspace for its lens — different entity
focus, tone, conflict-resolution rules — without forking the wiki itself. A
sibling free-form prose file (AGENTS.md) is RECOMMENDED for human readers,
maintained alongside the canonical machine config. Together these turn agent
knowledge into a compounding, composable artifact instead of a stateless RAG
retrieval, and make the resulting knowledge base portable across runtimes.
Motivation
Most agent knowledge today lives in one of two places: an opaque vector store rebuilt at query time (RAG), or a vendor-specific "memory" object that doesn't survive across runtimes. Both treat knowledge as a retrieval problem. Neither produces an artifact a human or a different agent can read, audit, or fork.
Andrej Karpathy's "LLM Wiki" pattern (April 2026) reframes the problem: treat raw sources as source code, treat the LLM as a compiler, and let it produce a structured wiki — a compiled knowledge artifact that compounds across ingests. AIP-10 codifies that pattern as a portable file format so that:
- A wiki built in one runtime can be opened, queried, and extended in another.
- The workspace manifest (KNOWLEDGE.md) becomes the unit of trade — domain experts ship a workspace shape, runtimes execute it; a sibling human-readable AGENTS.md documents intent for readers and reviewers.
- Cross-references, contradictions, and provenance live in the files themselves, not in a query-time prompt.
- Different consumers can read the same wiki through different lenses.
An operator focused on research wants Concept entries surfaced first;
the same wiki seen by a sales operator wants Customer and Deal entries.
Rather than fork the wiki per consumer, AIP-10 lets each consumer ship
a small KNOWLEDGE.md that extends the workspace and overrides the bits that matter for its context. The wiki is one; the views are many.
This last point is the structural reason KNOWLEDGE.md is one doctype used
in two modes: a workspace-root manifest at the wiki root, and a view
in any operator/company/skill folder that wants its own lens. Composition is
the same mechanism used by Tailwind presets, designkit overrides, and the
profile registry pattern that shows up across this AIP family — the wiki
ships a base shape, and consumers compose narrower shapes on top.
Prior art: Karpathy's
llm-wiki gist,
AGENTS.md, Anthropic's "Agent Skills" pattern
(AIP-3), the filesystem-first lineage of
AIP-6/AIP-7/AIP-8.
Specification
A conforming agentknowledge/v1 package is a directory tree of four layers:
the workspace manifest, the immutable sources, the curated entries, and the
optional human-readable schema file. Per-context views live wherever the
consumer that owns them lives.
my-wiki/
├── KNOWLEDGE.md # workspace manifest (REQUIRED, root, machine config)
├── AGENTS.md # human-readable schema (RECOMMENDED, prose companion)
├── _index.md # generated catalog (REQUIRED, root)
├── _log.md # append-only activity log (REQUIRED, root)
├── sources/ # raw sources (immutable; LLM reads, never writes)
│ ├── 2026-04-15-paper.pdf
│ └── 2026-04-20-meeting.md
├── entities/ # one page per real-world entity
│ └── andrej-karpathy.md
├── concepts/ # one page per abstract concept
│ └── compounding-knowledge.md
├── summaries/ # one page per ingested source
│ └── 2026-04-15-paper.md
└── timelines/ # optional ordered narratives
  └── 2026-q2-research.md
Per-context views live alongside their consumer, not under the wiki root. Conventional locations:
operators/research-analyst/KNOWLEDGE.md # extends ../../my-wiki/KNOWLEDGE.md
companies/acme/KNOWLEDGE.md # extends ../../my-wiki/KNOWLEDGE.md
skills/sales-assist/KNOWLEDGE.md # extends ../../my-wiki/KNOWLEDGE.md
A view's extends: field points to a parent KNOWLEDGE.md (workspace root
OR another view), and appliesTo: binds the view to one or more
operator/company/skill workspace refs. The runtime resolves the chain on
load and exposes the merged effective config to the consumer.
Layer 1 — Raw sources (sources/)
The runtime MUST treat any file under sources/ as immutable. The LLM
reads these files but MUST NOT modify, rename, or delete them. New
sources are added by humans (or upstream automation) and trigger ingest.
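One way a runtime might check the immutability rule is to snapshot content hashes of sources/ before an ingest and compare them afterwards. The sketch below is illustrative only: the function names `snapshot_sources` and `immutability_violations` are hypothetical, and SHA-256 is chosen simply because it is one of the hashAlgo values the manifest allows.

```python
import hashlib
from pathlib import Path


def snapshot_sources(wiki_root: Path) -> dict[str, str]:
    """Map each file under sources/ (relative POSIX path) to its SHA-256 digest."""
    sources_dir = wiki_root / "sources"
    snapshot: dict[str, str] = {}
    if not sources_dir.is_dir():
        return snapshot
    for path in sorted(sources_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(wiki_root).as_posix()
            snapshot[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return snapshot


def immutability_violations(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Report sources that were modified or removed between two snapshots.

    New files are allowed, since humans or upstream automation may add sources.
    A rename shows up as the removal of the old path.
    """
    violations = []
    for rel, digest in before.items():
        if rel not in after:
            violations.append(f"removed-or-renamed: {rel}")
        elif after[rel] != digest:
            violations.append(f"modified: {rel}")
    return violations
```

A host could take the first snapshot when ingest starts and reject the whole transaction if the second call returns anything.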
Layer 2 — Wiki pages (everywhere except sources/, KNOWLEDGE.md, and AGENTS.md)
Every wiki page is markdown with YAML frontmatter:
---
schema: knowledge/v1
slug: <kebab-case-page-id>
kind: entity | concept | summary | comparison | timeline
title: <human-readable title>
sources: # provenance — refs into sources/
- sources/2026-04-15-paper.pdf
- sources/2026-04-20-meeting.md
confidence: 0 .. 1 # OPTIONAL, default 1.0
updated_at: <ISO 8601>
supersedes: [<slug>] # OPTIONAL — earlier pages this replaces
contradicts: [<slug>] # OPTIONAL — pages whose claims conflict
metadata: # OPTIONAL — vendor extensions
<vendor>:
<field>: <value>
---
# <title>
<body — prose, tables, code, headings>
Cross-references between pages MUST use the wikilink syntax [[slug]] or
markdown links to relative .md paths. The runtime MUST be able to
resolve both forms.
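As a rough illustration of resolving both link forms, the sketch below extracts target slugs from a page body. It assumes, beyond what this AIP states, that a page's slug equals its file name without the .md extension; the regular expressions and the name `extract_link_slugs` are hypothetical.

```python
import re
from pathlib import PurePosixPath

# [[slug]] wikilinks; alias or anchor syntax is deliberately not handled here.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)\]\]")
# [label](relative/path.md) markdown links.
MD_LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def extract_link_slugs(body: str) -> list[str]:
    """Return target slugs: wikilinks first, then relative .md links."""
    slugs = [m.group(1).strip() for m in WIKILINK_RE.finditer(body)]
    for m in MD_LINK_RE.finditer(body):
        target = m.group(1)
        if "://" in target:
            continue  # absolute URL, not a relative wiki path
        # Assumption: slug == file stem, e.g. concepts/foo.md -> "foo".
        slugs.append(PurePosixPath(target).stem)
    return slugs
```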
Layer 3 — Human-readable schema (AGENTS.md)
A RECOMMENDED root file describing, in prose, how the LLM should curate
the wiki. It exists for human readers and review tooling — it is the
companion artifact to the canonical machine-readable
KNOWLEDGE.md (Layer 4).
Conforming wikis SHOULD ship both: KNOWLEDGE.md for runtimes and
linters to consume programmatically, AGENTS.md for humans to read
during onboarding, review, or governance approval.
Earlier drafts of this AIP made AGENTS.md REQUIRED and treated it as
the schema of record. That role now belongs to KNOWLEDGE.md; the prose
file is downgraded to RECOMMENDED so that automated pipelines can ship
without it, while community spec compatibility (notably
agents.md) is preserved for hosts that want to
co-publish.
When present, AGENTS.md SHOULD contain at least:
- Page conventions — required frontmatter, naming, allowed kinds, body structure for each kind. (Mirrors entityTypes in KNOWLEDGE.md.)
- Ingest workflow — what the LLM does when a new source appears in sources/: which pages to read, which pages to update, when to create new pages, how to update _index.md and append to _log.md.
- Contradiction policy — how to resolve conflicts (recency, source authority, observation count) and how to flag unresolved conflicts via contradicts. (Mirrors curation.conflictResolution in KNOWLEDGE.md.)
- Lint rules — which orphans/stale-claims/missing-concepts the LLM should surface during a maintenance pass. (Mirrors lints in KNOWLEDGE.md.)
A host MAY treat AGENTS.md as the source of truth for human-facing
display and KNOWLEDGE.md as the source of truth for programmatic
behaviour. When the two disagree, runtimes MUST prefer KNOWLEDGE.md;
linters SHOULD surface the divergence as a wiki_schema_drift finding so
human authors can re-sync the prose.
Layer 4 — Workspace manifest (KNOWLEDGE.md)
KNOWLEDGE.md is the canonical, machine-parseable workspace manifest.
It encodes everything AGENTS.md describes in prose — entity types,
lint rules, retention, curation policy — into a YAML frontmatter that
runtimes can validate, merge, and diff. The body of KNOWLEDGE.md
remains free-form markdown for any prose the manifest author wants to
ship inline.
The same doctype, knowledge.workspace/v1, is used in TWO modes:
- Workspace-root mode — <wiki>/KNOWLEDGE.md, no extends. Declares the base shape: what entity types exist, what lints run, what tone the curator agent uses, what retention applies to sources.
- View mode — <consumer>/KNOWLEDGE.md, extends: set to a parent KNOWLEDGE.md path. Adapts the base for a specific operator (AIP-9), company (AIP-6), or skill (AIP-3). View mode is the mechanism that lets one wiki serve many lenses without forking.
Frontmatter shape
---
schema: knowledge.workspace/v1
name: <kebab-case-id> # required
title: <human-readable> # required
description: <one-paragraph purpose> # required
version: <semver> # required, the WORKSPACE version
# (bump on shape changes)
# Composition (view mode only)
extends: ../path/to/parent/KNOWLEDGE.md # OPTIONAL — relative path to
# parent; recursive merge
appliesTo: # OPTIONAL — bind this view to
# specific consumers
- ws://operators/<slug> # AIP-9 operator
- ws://companies/<slug> # AIP-6 company
- ws://skills/<slug> # AIP-3 skill
# Cross-AIP refs
curator: ws://operators/<slug> # OPTIONAL — AIP-9 operator that
# curates this workspace
governance: <path-or-ref> # OPTIONAL — AIP-7 policy or
# audit binding
# Entity model — what TYPES of entries this workspace recognizes
entityTypes: # array; merge-by-name vs parent
- name: <PascalCase>
fields: [<field>, ...] # canonical fields
icon: <emoji> # OPTIONAL display hint
description: <prose> # OPTIONAL
parent: <PascalCase> # OPTIONAL — extends another
# local type
# Lint rules — what the curator agent checks every maintenance pass
lints: # array; merge-by-id vs parent
- id: <kebab-id> # required, stable
kind: require-source | max-age | min-confidence | broken-ref
| orphan | custom
appliesTo: <EntityType> | "*"
severity: error | warn | info
params: # kind-specific
<key>: <value>
# Source registry behavior
sources:
retention: forever | days:<n>
signing: required | optional | none # composes with AIP-7 signing
hashAlgo: sha256 | sha512 | blake3
authorityDefault: primary | secondary | rumour
# Curation behavior
curation:
tone: <free-form> # e.g. "academic", "sales"
depth: shallow | medium | deep
autoLink: byName | manual | off
conflictResolution: defer | recency | authority
| observation-count | keep-both
newEntryThreshold: <prose> # when to promote a mention to
# a full entry
# Query hints — how consumers should retrieve from this view
queryHints:
preferRecent: true | false
preferAuthoritative: true | false
scopeTo: [<EntityType>, ...] # OPTIONAL — narrow query default
# Display / UX hints (agnostic to runtime)
display:
homePage: <slug> # OPTIONAL — landing entry
defaultGrouping: kind | tag | source
metadata: # vendor extensions, namespaced
<vendor>:
<field>: <value>
---
# <body — markdown prose>
Conventional sections in the body include:
- ## Purpose — what this workspace is for, who uses it
- ## Conventions — naming, style, what to avoid
- ## When to extend vs replace — composition guidance
- ## Examples — short snippets of typical entries
Composition semantics
When a runtime loads a KNOWLEDGE.md whose frontmatter declares
extends:, it MUST:
- Walk the parent chain. Recursively load the parent referenced by extends:, then that parent's parent, until a manifest with no extends is reached (the workspace root). Maximum chain depth is eight. Hosts MUST detect cycles by tracking visited absolute paths.
- Treat both depth overflow and cycle detection as warnings, not errors. A view whose chain is malformed MUST still load — the runtime falls back to the local manifest only and surfaces a knowledge_extends_cycle (or knowledge_extends_depth_exceeded) warning to the consumer's debug surface.
- Tolerate a missing parent. If extends: points to a path that does not exist, the runtime emits knowledge_extends_missing as a warning and uses the local manifest only. View activation does not abort.
- Merge bottom-up. Walk the chain from the workspace root toward the leaf view, merging each manifest into the accumulator using the strategy below.
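The chain walk described in these steps could be sketched roughly as follows. This is not the reference implementation: `resolve_chain` and its `load` callback are hypothetical names, manifests are assumed to be already parsed into dicts, and "depth eight" is read here as at most eight manifests in the chain including the leaf, which is one possible interpretation.

```python
import posixpath
from typing import Callable, Optional

MAX_CHAIN_DEPTH = 8  # assumed to count every manifest in the chain, leaf included


def resolve_chain(
    leaf_path: str,
    load: Callable[[str], Optional[dict]],
) -> tuple[list[str], list[str]]:
    """Return (chain ordered root -> leaf, warnings) for a KNOWLEDGE.md path.

    `load` returns parsed frontmatter for an absolute path, or None when the
    file does not exist. On any chain problem the result falls back to the
    leaf manifest alone and carries one warning code.
    """
    chain = [leaf_path]
    visited = {leaf_path}
    current = leaf_path
    manifest = load(current) or {}
    while "extends" in manifest:
        # extends: is relative to the directory of the manifest declaring it.
        parent = posixpath.normpath(
            posixpath.join(posixpath.dirname(current), manifest["extends"])
        )
        if parent in visited:
            return [leaf_path], ["knowledge_extends_cycle"]
        if len(chain) >= MAX_CHAIN_DEPTH:
            return [leaf_path], ["knowledge_extends_depth_exceeded"]
        parent_manifest = load(parent)
        if parent_manifest is None:
            return [leaf_path], ["knowledge_extends_missing"]
        chain.append(parent)
        visited.add(parent)
        current, manifest = parent, parent_manifest
    return list(reversed(chain)), []
```

Returning warnings instead of raising mirrors the rule that view activation does not abort on a malformed chain.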
Merge strategy (child wins on conflicts):
| Field | Strategy | Notes |
|---|---|---|
| name, title, description, version | override | Child's identity wins; the runtime exposes both. |
| extends | not inherited | Local-only field. |
| appliesTo | not inherited | Local-only binding. |
| curator, governance | override | Child can rebind. |
| entityTypes | merge-by-name | A child entry with the same name replaces the parent's; new names are appended. Subtyping is explicit via parent:. |
| entityTypes[].fields | union | Child fields are appended to parent's; duplicates collapsed. |
| lints | merge-by-id | Child lint with same id replaces parent's; new ids are appended. |
| sources.* | override per leaf field | retention, signing, hashAlgo, authorityDefault each independently override. |
| curation.* | override per leaf field | Same shape as sources. |
| queryHints.* | override per leaf field | scopeTo is replaced wholesale by child if present. |
| display.* | override per leaf field | |
| metadata | deep-merge | Recursive merge; vendor namespaces accumulate. |
The runtime MUST expose both the merged effective config AND the resolution chain (ordered list of absolute paths consumed during merge). Consumers use the merged config; tooling uses the chain to explain why a field has the value it does.
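A minimal sketch of the merge table, operating on manifests already parsed into plain dicts. The helper names are hypothetical, and one point is an interpretation rather than something the table settles: a child entity type replaces the parent's entry of the same name, except that its fields list is the union of both, parent fields first.

```python
import copy

OVERRIDE_FIELDS = ("name", "title", "description", "version", "curator", "governance")
LEAF_OVERRIDE_SECTIONS = ("sources", "curation", "queryHints", "display")
LOCAL_ONLY_FIELDS = ("extends", "appliesTo")


def _merge_keyed(parent: list, child: list, key: str, union_fields: bool = False) -> list:
    """Merge lists of dicts by `key`: same key replaces in place, new keys append."""
    merged = [copy.deepcopy(item) for item in parent]
    index = {item[key]: i for i, item in enumerate(merged)}
    for item in child:
        item = copy.deepcopy(item)
        if item[key] in index:
            old = merged[index[item[key]]]
            if union_fields and (old.get("fields") or item.get("fields")):
                fields = list(old.get("fields", []))
                fields += [f for f in item.get("fields", []) if f not in fields]
                item["fields"] = fields
            merged[index[item[key]]] = item
        else:
            index[item[key]] = len(merged)
            merged.append(item)
    return merged


def _deep_merge(parent: dict, child: dict) -> dict:
    """Recursive dict merge; child values win at the leaves."""
    merged = copy.deepcopy(parent)
    for k, v in child.items():
        if isinstance(v, dict) and isinstance(merged.get(k), dict):
            merged[k] = _deep_merge(merged[k], v)
        else:
            merged[k] = copy.deepcopy(v)
    return merged


def merge_manifest(parent: dict, child: dict) -> dict:
    """Merge one child manifest over the accumulated parent; child wins."""
    # extends and appliesTo are never inherited from the parent.
    merged = {k: copy.deepcopy(v) for k, v in parent.items() if k not in LOCAL_ONLY_FIELDS}
    for field in OVERRIDE_FIELDS + LOCAL_ONLY_FIELDS:
        if field in child:
            merged[field] = copy.deepcopy(child[field])
    if "entityTypes" in child:
        merged["entityTypes"] = _merge_keyed(
            parent.get("entityTypes", []), child["entityTypes"], "name", union_fields=True
        )
    if "lints" in child:
        merged["lints"] = _merge_keyed(parent.get("lints", []), child["lints"], "id")
    for section in LEAF_OVERRIDE_SECTIONS:
        if section in child:
            merged[section] = {**parent.get(section, {}), **child[section]}
    if "metadata" in child:
        merged["metadata"] = _deep_merge(parent.get("metadata", {}), child["metadata"])
    return merged
```

Folding `merge_manifest` over the resolution chain from root to leaf would yield the effective config, and the chain itself can be kept alongside it for tooling.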
Cross-AIP refs
KNOWLEDGE.md is the binding surface where agentknowledge/v1 meets
the rest of the AIP family:
| Field | References | Purpose |
|---|---|---|
| curator | AIP-9 operator | Names the operator the host should activate when running ingest, curation, or lint passes against this workspace. |
| governance | AIP-7 policy / audit ref | Binds the workspace (or view) to a governance policy. Schema-poisoning mitigations and source-mutation audits flow through this ref. |
| appliesTo | AIP-3 skill, AIP-6 company, AIP-9 operator | A view declares which consumers it adapts the workspace for. Hosts MUST refuse a view whose appliesTo references a non-existent consumer (knowledge_appliesto_unresolvable). |
| extends | another KNOWLEDGE.md | Composition. |
appliesTo is not inherited. A view binds to its own consumers; a
parent's bindings do not leak into the child. This is the rule that
makes the registry-of-views pattern coherent — every view declares its
own scope.
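The refusal rule for appliesTo could be enforced with a check along these lines. The exception class `ViewRefused` and the idea of passing the host's known consumer refs as a set are assumptions made for the sketch; only the finding id comes from this AIP.

```python
class ViewRefused(Exception):
    """Raised when a view cannot be activated; `code` carries the finding id."""

    def __init__(self, code: str, refs: list[str]):
        super().__init__(f"{code}: {', '.join(refs)}")
        self.code = code
        self.refs = refs


def check_applies_to(view: dict, known_consumers: set[str]) -> None:
    """Refuse a view whose appliesTo names a consumer the host does not know.

    Only the view's own appliesTo is inspected, because the field is not
    inherited from any parent manifest.
    """
    missing = [ref for ref in view.get("appliesTo", []) if ref not in known_consumers]
    if missing:
        raise ViewRefused("knowledge_appliesto_unresolvable", missing)
```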
Workspace mode vs view mode — composability table
| Aspect | Workspace-root mode | View mode |
|---|---|---|
| File path | <wiki>/KNOWLEDGE.md | <consumer>/KNOWLEDGE.md |
| extends: | absent | required (otherwise it's a workspace, not a view) |
| appliesTo: | absent (a workspace has no single consumer) | OPTIONAL but conventional |
| Effective shape | the manifest as written | merge of the chain, child wins |
| Mutability | edits gated by governance | local edits adapt the lens, do not affect the workspace |
| Use cases | wiki authors, schema designers | operator/company/skill teams who want their own lens |
| Validation | full schema check | schema check + chain validation |
| Lifecycle | versioned with the wiki | versioned with the consumer |
The same knowledge.workspace/v1 doctype, the same file name, the same
schema. Only the location and the presence of extends: distinguish.
Required generated files
_index.md — content catalog. Lists every page (excluding sources/)
grouped by kind, with the slug, title, and a one-line summary
extracted from the page body. The runtime MUST regenerate _index.md
on every ingest.
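Regenerating the catalog might look like the sketch below. The AIP fixes what the index contains (slug, title, one-line summary, grouped by kind) but not its exact layout, so the heading and bullet format here are assumptions, as is the `render_index` name and the shape of the page records.

```python
from collections import defaultdict


def render_index(pages: list[dict]) -> str:
    """Render _index.md text from records with slug, kind, title, and summary."""
    by_kind: dict[str, list[dict]] = defaultdict(list)
    for page in pages:
        by_kind[page["kind"]].append(page)
    lines = ["# Index", ""]
    # Sorted output keeps regenerated indexes stable, so diffs stay small.
    for kind in sorted(by_kind):
        lines.append(f"## {kind}")
        for page in sorted(by_kind[kind], key=lambda p: p["slug"]):
            lines.append(f"- [[{page['slug']}]] {page['title']}: {page['summary']}")
        lines.append("")
    return "\n".join(lines)
```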
_log.md — append-only activity log. Every ingest, query that produced
a new page, and lint pass MUST append an entry of the form:
## [<ISO 8601>] <event-type> | <subject>
- <bullet 1>
- <bullet 2>
event-type is one of ingest, query, lint, or manual.
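A small sketch of producing an entry in this shape and appending it. The function names are hypothetical, and rendering the timestamp in UTC with second precision is a choice made for the example, not a requirement of the format.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_log_entry(event_type: str, subject: str, bullets: list[str], when: datetime) -> str:
    """Build one _log.md entry: a heading line followed by bullet lines."""
    stamp = when.astimezone(timezone.utc).isoformat(timespec="seconds")
    lines = [f"## [{stamp}] {event_type} | {subject}"]
    lines += [f"- {bullet}" for bullet in bullets]
    return "\n".join(lines) + "\n"


def append_log(log_path: Path, entry: str) -> None:
    """Append only; existing log content is never rewritten."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
```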
Ingest contract
When a runtime ingests source S:
- The runtime MUST read S and the current AGENTS.md.
- It MUST identify wiki pages affected by S (entities mentioned, concepts touched, conflicting claims) by reading _index.md plus the relevant page bodies.
- It MUST produce page diffs — minimal markdown patches per affected page — and apply them atomically (all or nothing). Monolithic rewrites of unaffected pages are non-conforming.
- It MUST create new pages for entities/concepts not yet covered.
- It MUST update _index.md and append to _log.md in the same transaction.
- It SHOULD set confidence and supersedes based on the schema's contradiction policy. Unresolved conflicts MUST be flagged via contradicts.
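The all-or-nothing requirement could be approximated on a plain filesystem with a stage-then-swap routine like the one below. It is only a sketch under stated assumptions: it takes the final content of each affected file rather than computing patches, the name `apply_atomically` is hypothetical, and rollback on failure is best-effort rather than crash-safe (a real runtime might swap a whole staged directory or use a journal instead).

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def apply_atomically(wiki_root: Path, updates: dict[str, str]) -> None:
    """Write every file in `updates` (relative path -> new content) or none.

    Refuses paths under sources/, which the LLM must never write.
    """
    for rel in updates:
        if Path(rel).parts[:1] == ("sources",):
            raise PermissionError(f"sources are immutable: {rel}")

    staged: list[tuple[Path, Path]] = []          # (temp file, final target)
    backups: dict[Path, Optional[bytes]] = {}     # original bytes, None if new
    try:
        # Phase 1: stage all content next to its target; nothing is visible yet.
        for rel, content in updates.items():
            target = wiki_root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            fd, tmp = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(content)
            staged.append((Path(tmp), target))
        # Phase 2: swap files in, remembering originals for rollback.
        for tmp, target in staged:
            backups[target] = target.read_bytes() if target.exists() else None
            os.replace(tmp, target)
    except Exception:
        # Restore anything already swapped, then drop leftover temp files.
        for target, original in backups.items():
            if original is None:
                target.unlink(missing_ok=True)
            else:
                target.write_bytes(original)
        for tmp, _ in staged:
            tmp.unlink(missing_ok=True)
        raise
```

Passing the page updates, the regenerated _index.md, and the extended _log.md in one `updates` mapping keeps them in the same transaction.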
Query contract
Querying a wiki MUST be possible with file reads alone — no runtime DB or vector index is required to be conforming. A runtime MAY add auxiliary indices (BM25, embeddings, graph) on top, but the wiki files remain the source of truth.
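To illustrate that file reads alone are enough, here is a deliberately naive substring search over wiki pages. The name `find_pages` and the choice to skip the manifest, prose schema, index, and log files are assumptions for the example; a real runtime would likely layer ranking on top.

```python
from pathlib import Path

# Files at the wiki root that are not wiki pages themselves.
SKIP_FILES = {"KNOWLEDGE.md", "AGENTS.md", "_index.md", "_log.md"}


def find_pages(wiki_root: Path, term: str) -> list[str]:
    """Return relative paths of wiki pages whose text contains `term`.

    Matching is case-insensitive and uses nothing but file reads.
    """
    needle = term.casefold()
    hits = []
    for path in sorted(wiki_root.rglob("*.md")):
        rel = path.relative_to(wiki_root)
        if rel.parts[0] == "sources" or rel.name in SKIP_FILES:
            continue
        if needle in path.read_text(encoding="utf-8").casefold():
            hits.append(rel.as_posix())
    return hits
```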
Lint contract
A maintenance pass MUST detect:
- Orphans — pages with no inbound links from _index.md or other pages.
- Stale claims — pages whose sources are all older than a schema-defined threshold and have not been re-confirmed.
- Unresolved contradictions — pages with contradicts populated.
- Broken refs — wikilinks/markdown links pointing to missing pages.
Lint output is appended to _log.md as an event of type lint.
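Two of these checks, orphans and broken refs, can be sketched over an in-memory map of slug to page body. The sketch covers [[slug]] wikilinks only and assumes a self-link does not count as an inbound link; both simplifications, and the name `lint_links`, are not taken from the AIP.

```python
import re

WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def lint_links(pages: dict[str, str], index_body: str) -> dict[str, list[str]]:
    """Find orphan pages and broken wikilinks. `pages` maps slug -> page body."""
    inbound: set[str] = set(WIKILINK.findall(index_body))
    broken: list[str] = []
    for slug, body in pages.items():
        for target in WIKILINK.findall(body):
            if target not in pages:
                broken.append(f"{slug} -> {target}")
            elif target != slug:  # assumption: a self-link does not rescue an orphan
                inbound.add(target)
    orphans = sorted(slug for slug in pages if slug not in inbound)
    return {"orphans": orphans, "broken_refs": sorted(broken)}
```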
Vendor extensions
Implementations MAY add fields under metadata.<vendor> (e.g.
metadata.guilde, metadata.simone). Vendor fields MUST NOT change the
meaning of any field defined in this AIP, and a runtime MUST ignore
vendor extensions it does not understand.
Rationale
Why filesystem-first. Following AIP-6/7/8: the wiki is the folder. Adapters project it into databases or vector stores; the canonical representation stays portable.
Why mandatory _index.md and _log.md. The ingest contract requires
the LLM to update both atomically. Making them part of the spec — rather
than implementation detail — lets a third party verify after the fact
that the wiki was maintained according to its schema, and lets a runtime
detect a poorly-applied ingest by inspecting log + index alone.
Why a workspace manifest (KNOWLEDGE.md). Prose describes intent;
machines need contracts. Earlier drafts of this AIP made AGENTS.md the
schema of record, which works for a single team but breaks the moment
multiple consumers want to share a wiki. A YAML manifest is what lets a
runtime know, without re-reading prose, that an entry of type
Investor is required to have a lead_partner field, that lint rule
require-source runs at severity error on Concepts, that this wiki
retains sources forever. The manifest is also the surface that
composition operates on — merging YAML is mechanical, merging prose is
interpretive. Splitting the canonical config (KNOWLEDGE.md) from the
human-readable companion (AGENTS.md) lets each artifact do what it's
good at, and avoids forcing tooling to parse natural language.
Why one doctype for both workspace root and view. The alternative is
two doctypes — knowledge.workspace/v1 for the root and
knowledge.workspace.view/v1 for the per-consumer adaptation. Two
doctypes means two schemas to maintain, two validators to ship, and an
asymmetric merge surface. One doctype with extends: collapses the two
into a single mental model: a workspace IS its own view of itself, and
every view IS a workspace bound to a different consumer. The same
schema validates both, the same merge algorithm applies recursively,
and the same authoring skill (./resources/aip-10/draft/skills/author-knowledge/SKILL.md)
walks an agent through both flows.
Why composition over inheritance hierarchies. A wiki could ship a
single workspace and then run a query-time prompt that reshapes results
per consumer. That couples consumer behaviour to runtime prompts —
which means swapping runtimes loses the lens. Composition via on-disk
extends: chains keeps the lens portable: the same operator,
re-instantiated in a different runtime, still gets the same merged
config because the merge runs against files, not prompts. This is the
registry-of-views pattern: the wiki is the registry, each KNOWLEDGE.md
is a registered view, and consumers pick a view by location.
Why AGENTS.md is RECOMMENDED, not REQUIRED. Conforming wikis
SHOULD ship both KNOWLEDGE.md and AGENTS.md. Automated pipelines —
and there will be many in the lifetime of this AIP — only need
KNOWLEDGE.md to operate. Forcing them to also write prose creates
drift between the two files; downgrading AGENTS.md to a recommendation
acknowledges that prose is for humans and machines should be free to
operate on machine config alone. Linters SHOULD flag drift between the
two when both are present.
Why contradicts and supersedes are first-class. Knowledge is not
monotonic. A spec that pretends contradictions don't exist forces the
LLM to silently overwrite — losing provenance and audit ability. Making
both relations explicit lets humans and other agents reason about why
a page reads the way it does.
Why no vector store in the spec. Retrieval is a runtime concern. Mandating a specific index would couple the spec to today's tooling; forbidding indices would punish runtimes that already have them. The spec defines what's on disk; runtimes choose how to reach it.
Why depth-cap and cycle-detection are warnings, not errors. A
malformed extends: chain is a configuration bug — but a consumer
whose view fails to load loses ALL of its lens, including the parts
that don't depend on the broken parent. The runtime degrades gracefully
to the local manifest and surfaces the issue, instead of refusing to
activate the consumer. This matches the broader AIP family's preference
for partial-availability over hard-fail.
Reference Implementation
packages/agent-framework/src/wiki —
parser, ingest pipeline, lint pass, and BM25 retrieval. Used by Guilde
(per-guild knowledge base, Librarian operator) and Simone (per-user
personal codex, written by the Council standing pass).
The implementation defines the working schema during the Draft phase; this AIP will absorb the normative text in full as part of moving Draft → Review.
Backwards Compatibility
Not applicable — this AIP introduces a new spec.
Security Considerations
The wiki is a write surface for an LLM. Threats:
- Prompt injection via sources — a malicious source instructs the LLM to make harmful edits across the wiki. Mitigation: ingest runs in a sandboxed prompt context; the LLM MUST NOT execute instructions embedded in source bodies, only summarize them.
- Schema poisoning — an attacker rewrites KNOWLEDGE.md (or AGENTS.md) to relax contradiction policy, expand the LLM's authority, or silently lower lint severities to info. Mitigation: both files SHOULD be subject to the same governance gate (AIP-7) as any contract — changes go through approval. The governance: field in KNOWLEDGE.md makes this binding explicit; views that override the parent's governance MUST be reviewable through the same gate (a child cannot escape a parent's policy by pointing governance: elsewhere unless the parent permits it).
- View shadowing — a malicious view extends: a benign workspace but overrides lints or curation.conflictResolution to silently weaken the curator agent. Mitigation: hosts MUST expose the resolution chain alongside the merged config so reviewers can audit what came from where; governance policies SHOULD restrict which lints a view is allowed to soften.
- Confidence laundering — an LLM marks low-quality syntheses as confidence: 1.0. Mitigation: confidence values written by the LLM during ingest are advisory; downstream consumers MAY weight them against source recency and observation count.
- Cross-tenant leakage — when multiple tenants share a runtime, a wiki ingest pulls a source from the wrong tenant. Mitigation: runtime MUST scope sources/ reads to the wiki's tenant; the spec is filesystem-first and inherits whatever isolation the host provides.
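For the cross-tenant case, one simple guard is to resolve every source ref and refuse anything that lands outside the wiki's own sources/ directory. This is a sketch with a hypothetical `resolve_source` helper; it assumes tenant isolation maps to directory boundaries, which depends on how the host lays out its storage.

```python
from pathlib import Path


def resolve_source(wiki_root: Path, ref: str) -> Path:
    """Resolve a `sources/...` ref, refusing paths outside this wiki's sources/."""
    sources_root = (wiki_root / "sources").resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (wiki_root / ref).resolve()
    if sources_root not in candidate.parents:
        raise PermissionError(f"source ref escapes wiki scope: {ref}")
    return candidate
```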