Per-Wiki Hooks Architecture
Status: Accepted design, phased implementation. Phase 1 already ships (classifier + processor plugins); later phases generalize the same machinery into a uniform hook system.
Problem
Each wiki covers a different domain, and domains disagree about what a
“paper” looks like, which taxonomy tags are legal, how a transcript should be
split into pages, and what counts as a lint violation. Hard-coding these
policies in hermes_wiki forces every wiki to share one behavior. Wikis need
their own code — versioned inside the wiki, attributable, and safe to sync
across machines without silently executing whatever arrives.
Existing foundation (Phase 1, shipped)
The trust-before-execute plugin system already implements the core pattern for two hook points:
| Piece | Location |
|---|---|
| Plugin code | <wiki>/plugins/classifiers/<name>.py, <wiki>/plugins/processors/<name>.py |
| Canonical trust record | SCHEMA.md — trusted_plugin YAML blocks (name, kind, path, sha256, trusted_at, author) |
| Queryable projection | trusted_plugins table in wiki.db, rebuilt from SCHEMA.md |
| Trust CLI | hermes wiki plugins trust\|untrust\|list |
| Enforcement | path must resolve inside the wiki root; sha256 must match; a mismatch silently disables the plugin until it is re-trusted |
| Invocation | classify_source() consults trusted classifiers after built-ins; _trusted_processor_for_label() swaps the processor per label |
Every property below is inherited from this foundation, not invented: code lives in the wiki (portable, git-versioned with content), trust is content-addressed (path + sha256 in authoritative Markdown), and execution is opt-in per machine state (a cloned wiki’s hooks are inert until the projection is rebuilt from the SCHEMA.md trust records the owner committed).
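The enforcement rule in the table (path inside the wiki root, sha256 must match) can be sketched roughly as follows. This is a minimal illustration, not the shipped hermes_wiki code: `TrustRecord` and `is_trusted` are hypothetical names, and the record carries only the fields needed for the check.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrustRecord:
    """Hypothetical in-memory form of a trusted_plugin block (subset of fields)."""
    name: str
    kind: str
    path: str    # plugin path relative to the wiki root
    sha256: str  # hex digest recorded at trust time


def is_trusted(wiki_root: Path, record: TrustRecord) -> bool:
    """Return True only if the plugin file sits inside the wiki and matches its hash."""
    root = wiki_root.resolve()
    candidate = (root / record.path).resolve()
    # Reject anything that escapes the wiki root, e.g. via ".." or a symlink.
    if not candidate.is_relative_to(root):
        return False
    if not candidate.is_file():
        return False
    digest = hashlib.sha256(candidate.read_bytes()).hexdigest()
    # A mismatch means the code changed after it was trusted: treat it as disabled.
    return digest == record.sha256
```

Resolving both paths before comparing is what makes the containment check hold for symlinks and `..` segments; `Path.is_relative_to` requires Python 3.9 or later.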
Design
Hook points
Generalize kind from {classifier, processor} to a hook-point registry.
Each hook point declares its contract (function name, signature, return type)
and its failure semantics:
| Hook point | Contract | Called | On error |
|---|---|---|---|
| classifier (shipped) | classify(source_path) -> ClassLabel \| str \| None | After built-in classifiers miss | Skip hook, continue chain |
| processor (shipped) | process(request: ProcessRequest) -> list[GeneratedPage] | Replaces DefaultProcessor for its label | Fail the ingest (rollback) |
| taxonomy | validate_tags(page_meta, schema_taxonomy) -> list[TagViolation] and/or suggest_tags(page_meta, body) -> list[str] | On page create/update, before propagation | Fail closed: reject the write with the violation |
| lint | lint(wiki_root, projection) -> list[Finding] | Appended to built-in checks in lint_wiki | Report as its own finding, never crash lint |
| pre_ingest | pre_ingest(snapshot_meta) -> IngestDecision (allow / skip / reroute label) | After snapshot, before classification | Fail open: log and continue (snapshot is already durable) |
| post_ingest | post_ingest(result: IngestResult) -> None | After commit, outside the rollback boundary | Log only |
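One way to picture the registry is as a mapping from hook point to its exported function name and failure policy, with a single dispatcher that applies the policy. The sketch below is an assumption about shape only: `OnError`, `HookPoint`, `HOOK_POINTS`, `HookFailure`, and `call_hook` are hypothetical names, the taxonomy entry models only `validate_tags`, and lint findings are plain strings rather than `Finding` objects.

```python
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable

log = logging.getLogger("hooks.sketch")


class OnError(Enum):
    SKIP = "skip"      # classifier: skip hook, continue chain
    FAIL = "fail"      # processor, taxonomy: abort the write
    REPORT = "report"  # lint: surface the crash as a finding
    LOG = "log"        # pre_ingest, post_ingest: log and continue


@dataclass(frozen=True)
class HookPoint:
    function: str      # name the plugin module must export
    on_error: OnError


HOOK_POINTS = {
    "classifier": HookPoint("classify", OnError.SKIP),
    "processor": HookPoint("process", OnError.FAIL),
    "taxonomy": HookPoint("validate_tags", OnError.FAIL),
    "lint": HookPoint("lint", OnError.REPORT),
    "pre_ingest": HookPoint("pre_ingest", OnError.LOG),
    "post_ingest": HookPoint("post_ingest", OnError.LOG),
}


class HookFailure(Exception):
    """Raised when a fail-closed hook errors, so the caller can reject or roll back."""


def call_hook(kind: str, fn: Callable[..., Any], *args: Any) -> Any:
    """Call a trusted hook function and apply its hook point's failure policy."""
    point = HOOK_POINTS[kind]
    try:
        return fn(*args)
    except Exception as exc:
        if point.on_error is OnError.FAIL:
            raise HookFailure(f"{kind} hook failed: {exc}") from exc
        if point.on_error is OnError.REPORT:
            return [f"{kind} hook crashed: {exc}"]
        # SKIP and LOG both record the error and let the caller carry on.
        log.warning("%s hook raised %r; continuing", kind, exc)
        return None
```

Keeping the policy in the registry rather than at each call site means a new hook point only needs a table entry, which matches the intent of generalizing kind into a registry.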
Notes:
- taxonomy is the highest-value new hook: SCHEMA.md already declares the taxonomy in YAML; this hook lets a wiki enforce or extend it (e.g. derive tags from frontmatter, forbid tag combinations).
- pre_ingest/post_ingest deliberately bracket the existing transactional boundary (_remember/_restore): pre runs before any page mutation, post runs after the git commit, so neither can corrupt a rollback.
- No hook ever runs on read paths (search, open, dashboard GETs). Hooks fire only on writes a grant already authorizes.
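The bracketing described in the notes can be illustrated with a toy ingest loop. Everything here is a stand-in under stated assumptions: `ingest` is a hypothetical function, a dict plays the role of the wiki's pages, a deep copy stands in for _remember/_restore, the decision is a plain string instead of `IngestDecision`, and rerouting is omitted.

```python
import copy
import logging
from typing import Callable, Optional

log = logging.getLogger("ingest.sketch")


def ingest(
    store: dict,
    label: str,
    process: Callable[[dict, str], None],
    pre_ingest: Optional[Callable[[dict], str]] = None,
    post_ingest: Optional[Callable[[dict], None]] = None,
) -> Optional[dict]:
    """Toy ingest: pre hook, transactional page mutation, then post hook."""
    # pre_ingest runs before any page mutation and fails open.
    if pre_ingest is not None:
        try:
            decision = pre_ingest({"label": label})
        except Exception:
            log.exception("pre_ingest failed; continuing")
            decision = "allow"
        if decision == "skip":
            return None

    saved = copy.deepcopy(store)  # stand-in for _remember
    try:
        process(store, label)     # page mutations happen only here
    except Exception:
        store.clear()
        store.update(saved)       # stand-in for _restore
        raise

    # The git commit would happen at this point in the real flow.
    result = {"label": label, "pages": sorted(store)}

    # post_ingest runs after the commit, so its errors are only logged.
    if post_ingest is not None:
        try:
            post_ingest(result)
        except Exception:
            log.exception("post_ingest failed; ingest already committed")
    return result
```

Because both hooks sit outside the try block that guards the mutation, a failing hook can neither trigger a restore nor interrupt one.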
Layout
    <wiki-root>/plugins/
      classifiers/   <name>.py    # shipped
      processors/    <name>.py    # shipped
      hooks/
        taxonomy/    <name>.py
        lint/        <name>.py
        pre_ingest/  <name>.py
        post_ingest/ <name>.py
Trust records keep the same shape with new kind values, so SCHEMA.md
blocks, the trusted_plugins projection, `hermes wiki plugins trust