ADR-0003: RFC 9162 Merkle tree construction with domain separation
Status: Accepted
Date: 2026-05-10
Spec section: Section 4.1.1, Section 3.2.3 (tool catalog hash), Section 3.2.5 (RAG corpus)
Context
The tool manifest catalog hash and the RAG corpus hash both require a Merkle tree over a set of items (tool schemas and corpus documents respectively). The Merkle construction must be specified precisely to ensure cross-implementation reproducibility and to prevent second-preimage attacks.
Decision
Use the RFC 9162 (Certificate Transparency v2) Merkle tree construction with explicit domain separation:
- Leaf nodes: SHA-256(0x00 || leaf_data)
- Internal nodes: SHA-256(0x01 || left_hash || right_hash)
Leaf data for tool entries: RFC 8785 canonical JSON of the tool descriptor (schema + description, sorted by tool name).
Leaf data for corpus documents: RFC 8785 canonical JSON of the document descriptor (hash + identifier + ingested_at).
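The construction above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names (leaf_hash, node_hash, merkle_root) are hypothetical, and each leaf is assumed to arrive as bytes that have already been canonicalized per RFC 8785 and ordered. The split point follows the RFC 9162 rule of taking the largest power of two strictly smaller than the number of leaves.

```python
import hashlib


def leaf_hash(leaf_data: bytes) -> bytes:
    # Leaf node: SHA-256(0x00 || leaf_data)
    return hashlib.sha256(b"\x00" + leaf_data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Internal node: SHA-256(0x01 || left_hash || right_hash)
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root over an ordered list of leaf byte strings (hypothetical helper)."""
    n = len(leaves)
    if n == 0:
        # Empty tree: SHA-256 of the empty string
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # k is the largest power of two strictly less than n
    k = 1 << ((n - 1).bit_length() - 1)
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


# Illustrative leaf bytes only; real leaves would be RFC 8785 canonical JSON.
example_leaves = [b'{"id":"a"}', b'{"id":"b"}', b'{"id":"c"}']
print("sha256:" + merkle_root(example_leaves).hex())
```

With three leaves the split point is 2, so the root is the internal node over the two-leaf subtree on the left and the single leaf on the right.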
Rationale
- RFC 9162 is a published IETF RFC whose Merkle tree construction is used in Certificate Transparency, a deployed, audited system
- The 0x00/0x01 domain separation prefix prevents second-preimage attacks in which an attacker presents the concatenation of two child hashes as leaf data, so that a leaf hash equals an internal node hash
- Without domain separation, a tree with N leaves has the same root as a tree with N/2 "leaves" whose data are concatenated pairs of child hashes; the domain prefix makes the two trees structurally distinct
- RFC 9162 construction is deterministic given a fixed leaf ordering (lexicographic by tool name / document identifier)
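The collision described in the rationale can be reproduced with a short sketch. The helper names (h, root, forge) and the sample leaf values are hypothetical, and root only handles leaf counts that are a power of two, which is enough for the demonstration but is not the full RFC 9162 algorithm.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root(leaves, leaf_prefix=b"", node_prefix=b""):
    """Root of a tree whose leaf count is a power of two (demo only)."""
    level = [h(leaf_prefix + d) for d in leaves]
    while len(level) > 1:
        level = [
            h(node_prefix + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def forge(leaves, leaf_prefix=b""):
    """Half as many 'leaves', each the concatenation of two real leaf hashes."""
    hashes = [h(leaf_prefix + d) for d in leaves]
    return [hashes[i] + hashes[i + 1] for i in range(0, len(hashes), 2)]


honest = [b"tool-a", b"tool-b", b"tool-c", b"tool-d"]

# No prefixes: the 2-leaf forged tree has the same root as the 4-leaf tree.
print(root(honest) == root(forge(honest)))  # True

# With 0x00/0x01 prefixes the forged tree no longer matches.
print(root(honest, b"\x00", b"\x01")
      == root(forge(honest, b"\x00"), b"\x00", b"\x01"))  # False
```

In the prefixed case the forged leaf is hashed as SHA-256(0x00 || left_hash || right_hash), while the real internal node is SHA-256(0x01 || left_hash || right_hash), so the two values differ unless SHA-256 itself collides.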
Alternatives considered
Simple concatenation Merkle (no domain separation): Vulnerable to second-preimage attacks as described above. Rejected.
BLAKE3 Merkle: BLAKE3 has built-in domain separation for its tree construction. Rejected because BLAKE3 is not yet in the standard library of all target languages, and SHA-256 is sufficient for this use case.
Flat hash (hash of concatenated hashes): Not a Merkle tree - does not support efficient membership proofs. Rejected because the spec's design supports future membership proof extensions.
Consequences
- The 0x00 prefix on leaf nodes and the 0x01 prefix on internal nodes are mandatory. Implementations that omit them will fail conformance test AM-BIND-015.
- Leaf ordering must be deterministic: tool entries are sorted by tool_id (lexicographic by Unicode code point; this coincides with RFC 8785 key ordering, which compares UTF-16 code units, for identifiers restricted to the Basic Multilingual Plane). Corpus documents are sorted by document_id.
- Empty trees (no tools, no corpus documents) are represented by the SHA-256 of the empty string: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.
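The ordering rule and the empty-tree value can be checked with a few lines of Python. This is a sketch under assumptions: the entry shape (a dict carrying a tool_id field) and the helper name order_entries are illustrative and not taken from the spec. Python compares str values code point by code point, so the built-in sort gives the required order.

```python
import hashlib

# Root of an empty tree, formatted with the "sha256:" prefix used above.
EMPTY_ROOT = "sha256:" + hashlib.sha256(b"").hexdigest()


def order_entries(entries: list[dict], id_field: str) -> list[dict]:
    # str comparison in Python is lexicographic by Unicode code point
    return sorted(entries, key=lambda entry: entry[id_field])


# Hypothetical tool entries; only the sort key matters here.
tools = [{"tool_id": "search"}, {"tool_id": "Zip"}, {"tool_id": "calc"}]

print([t["tool_id"] for t in order_entries(tools, "tool_id")])
# ['Zip', 'calc', 'search']  (uppercase code points sort before lowercase)
print(EMPTY_ROOT)
```

Sorting is case-sensitive under this rule, so implementations should not lowercase or otherwise normalize identifiers before ordering unless the spec says so.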