Governance Explainability
Governance explainability is the ability to prove what an organization did about an AI interaction: which policy ran, which entities were redacted, which model was permitted. It addresses Layer 2, distinct from model explainability.
What governance explainability is
Governance explainability is the property of an AI system that, for any given interaction, the organization can produce a complete and verifiable account of what it did about that interaction. Not what the model thought. What the organization, through its policy and infrastructure, controlled and recorded.
The output of governance explainability is concrete: a signed record showing the policy version, the redactions applied, the connectors permitted, the model invoked, the user identity, and the outcome. The record is tamper evident, queryable, and presentable to a regulator.
Producing that record requires no insight into the model’s internal computation. That is the whole point.
The two layers
There are two distinct layers at which an AI system can be made explainable. The industry has spent the last decade conflating them.
Layer 1: Model explainability. This is the question of why the model produced a particular output. Methods include LIME, SHAP, integrated gradients, attention map visualization, and, increasingly, mechanistic interpretability research. For classical machine learning models with hundreds or thousands of parameters, Layer 1 is partially tractable. For modern large language models with hundreds of billions of parameters and stochastic decoding, Layer 1 remains an open research problem. Post hoc rationalizations from an LLM about its own reasoning are not faithful descriptions of its computation.
Layer 2: Governance explainability. This is the question of what the organization did about the AI interaction. Which policy version was in effect. Which input entities were redacted before the prompt left the perimeter. Which model the request was routed to. Which connectors the model was allowed to invoke. Which output the user actually received. Which record was written and signed.
Layer 2 is fully solvable. Every action it describes is deterministic software behavior in code that the organization controls. There is no stochasticity to model and no unsolved interpretability problem to wait on.
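Because every Layer 2 action is deterministic code the organization controls, it can be sketched directly. A minimal illustration in Python of a policy gate whose decision is always reproducible from its inputs; the rule ids and patterns are invented for this example, not any real policy engine:

```python
import re

# Illustrative rules only. A real policy would be versioned and content hashed.
RULES = [
    {"id": "r-block-secret", "pattern": r"AKIA[0-9A-Z]{16}", "action": "block"},
    {"id": "r-redact-email", "pattern": r"[\w.+-]+@[\w-]+\.[\w.]+", "action": "redact"},
]

def evaluate(prompt: str) -> dict:
    """Deterministic policy gate: the same prompt and the same rule set
    always produce the same decision and rule id, which is what makes
    Layer 2 fully auditable after the fact."""
    for rule in RULES:
        if re.search(rule["pattern"], prompt):
            return {"action": rule["action"], "rule_id": rule["id"]}
    return {"action": "allow", "rule_id": None}
```

The decision and the rule id that fired are exactly the fields a governance record captures.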
When a regulator, auditor, customer, or internal incident responder asks the question that matters, the question is a Layer 2 question: what did the organization do about this interaction?
Why Layer 2 is sufficient for compliance
Compliance frameworks were not written assuming model explainability is solved. They were written assuming organizations are accountable for the AI systems they deploy. Accountability requires evidence of what was controlled, not evidence of what the model thought.
The relevant text is consistent across frameworks:
- NIST AI RMF treats explainability as one of seven trustworthiness properties, alongside accountability, fairness, security, privacy, reliability, and validity. The Govern function explicitly requires accountability mechanisms and documentation (GV-1.6, GV-4.2). Layer 2 satisfies both directly.
- EU AI Act Article 13 requires that high risk AI systems be designed and developed so deployers can interpret system output and use it appropriately. The article focuses on operational transparency: instructions for use, capabilities, limitations, and the data the system was trained and tested on. None of that requires Layer 1 explainability of a specific inference.
- ISO/IEC 42001 clause 9.2 internal audit is a Layer 2 audit of the AI Management System, not a Layer 1 audit of model internals.
Frameworks that do mention model explainability treat it as a property to evaluate where feasible, not a precondition for compliance. Layer 2 is the load bearing requirement.
What governance explainability looks like in production
A signed record per interaction. The minimum viable schema:
- Identity. User id, application id, tenant id.
- Input. Hash of the original prompt. Optional plaintext, encrypted at rest.
- Policy. Version of the policy that ran, with a content hash.
- Redactions. Each redacted entity by type (PII category, secret type, proprietary code marker) with offsets.
- Routing. The model the request was routed to, the route reason, and the connectors permitted.
- Output. Hash of the model response. Optional plaintext, encrypted at rest.
- Decision. Allow, redact, block, or rewrite, with the rule id that fired.
- Timestamp. RFC 3339 with sub second precision.
- Signature. Per record (RSA 4096 or equivalent) and chain hash to the previous record (SHA 256).
- Storage. Append only, WORM (write once read many) media, retention configured to the longest applicable regulatory minimum.
The record is the evidence. It is generated automatically on every interaction. Sampling is not required. Reconstruction is not required. The audit trail is the system of record.
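A minimal sketch of how such a record might be assembled, using only Python’s standard library. Field names follow the schema above but are illustrative, not a published format, and the per record RSA signature is noted rather than implemented to keep the example self contained:

```python
import hashlib
import json
from datetime import datetime, timezone

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def build_record(prev_chain_hash, user_id, app_id, tenant_id,
                 prompt, policy_version, policy_content,
                 redactions, model, connectors, response,
                 decision, rule_id) -> dict:
    """Assemble one governance record per the schema above.
    Illustrative field names, not a standardized format."""
    record = {
        "identity": {"user_id": user_id, "app_id": app_id, "tenant_id": tenant_id},
        "input_hash": sha256_hex(prompt.encode()),
        "policy": {"version": policy_version,
                   "content_hash": sha256_hex(policy_content.encode())},
        "redactions": redactions,  # e.g. [{"type": "EMAIL", "start": 12, "end": 29}]
        "routing": {"model": model, "connectors": connectors},
        "output_hash": sha256_hex(response.encode()),
        "decision": {"action": decision, "rule_id": rule_id},
        # RFC 3339 timestamp with sub second precision.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
    }
    # Chain hash binds this record to its predecessor, so any later edit
    # to either record breaks the chain detectably.
    canonical = json.dumps(record, sort_keys=True).encode()
    record["chain_hash"] = sha256_hex(prev_chain_hash.encode() + canonical)
    # In production, an RSA 4096 (or equivalent) signature over `canonical`
    # would also be attached; omitted here to stay stdlib only.
    return record
```

Note that only hashes of the prompt and response are mandatory in the record; plaintext, if retained, is stored encrypted.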
Common misconceptions
- “Governance explainability is just logging.” Logs record what happened. Governance explainability records why (the policy that ran), proves it (cryptographic signature), and chains it (tamper evidence). A log can be edited or lost. A signed chained record cannot be altered without detection.
- “It is a substitute for model explainability.” It is not a substitute. It is a different layer. Model explainability remains a useful research and validation tool, particularly for safety teams. For compliance, governance explainability is the load bearing layer.
- “Only highly regulated industries need it.” Any organization that runs AI in production and has any audit obligation (SOC 2, ISO 27001, HITRUST, contractual) ends up needing it. The shift from “AI as experiment” to “AI as production system” makes governance explainability a baseline requirement, not a regulated industry specialty.
How to evaluate a governance explainability claim
A vendor or internal team claims governance explainability. Verify with these questions:
- Show the record. Ask to see the actual signed record for a specific interaction. If it is a logfile entry, that is logging, not governance explainability.
- Verify the signature. A regulator will. The signature should be independently verifiable using a published public key.
- Reconstruct the chain. Pull two consecutive records and check the chain hash. If the chain breaks silently, the system fails the tamper evidence requirement.
- Replay the policy. Ask which policy version was in effect at the timestamp on the record. The vendor should be able to fetch the policy content by its hash and prove it matches.
- Show coverage. What percentage of production AI traffic produces a record? Anything less than 100 percent is sampling, which is not evidence.
If all five answers are clean, the claim is credible. If any is hand waved, the system is logging, not proving.
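The chain and policy checks above can be sketched as follows, assuming each record carries a `chain_hash` over its body plus the previous record’s chain hash, and a policy content hash, as in the schema earlier. This illustrates the verification logic, not any particular vendor’s verifier:

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_link(prev_record: dict, record: dict) -> bool:
    """Recompute the chain hash over the record body and compare.
    A mismatch means the record, or its predecessor, was altered."""
    body = {k: v for k, v in record.items() if k != "chain_hash"}
    canonical = json.dumps(body, sort_keys=True).encode()
    expected = sha256_hex(prev_record["chain_hash"].encode() + canonical)
    return expected == record["chain_hash"]

def verify_policy(record: dict, policy_content: str) -> bool:
    """Check that the fetched policy text matches the content hash
    recorded at the time of the interaction."""
    return sha256_hex(policy_content.encode()) == record["policy"]["content_hash"]
```

The same recomputation, pulled over two consecutive records, is the "reconstruct the chain" check: if it fails silently anywhere, the system does not provide tamper evidence.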
Raidu is the AI Accountability Layer. Intercept. Explain. Prove.
See the runtime, the cryptographic record, and what a regulator-ready trail looks like for your AI stack.