On Judgement, Coherence, and Measurement
The underlying question
Any system that evaluates AI behaviour must resolve a prior question:
By what criteria is behaviour judged?
This question is not peripheral.
It defines the meaning of every metric, score, and conclusion produced downstream.
There is no neutral evaluation
All evaluation systems depend on selected criteria:
- what is measured
- how it is measured
- how outcomes are interpreted
These selections encode assumptions about:
- usefulness
- safety
- truthfulness
- acceptability
In most systems, these assumptions remain implicit.
Position
The RI Safety Layer does not attempt to eliminate judgement.
It establishes judgement as a first-class, explicit component of the system.
Structural approach
Judgement within the system is:
- declared — criteria are defined in advance
- structured — evaluation procedures are formally specified
- versioned — changes to criteria are tracked over time
- auditable — all decisions can be traced to their originating rules
This allows any result to be inspected in terms of:
- the behaviour observed
- the measurement process applied
- the criteria used to interpret that behaviour
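As a minimal sketch of this structure (all names and fields are hypothetical, not the system's actual schema), declared, versioned, and auditable judgement might look like:

```python
from dataclasses import dataclass

# Hypothetical sketch: a declared, versioned evaluation criterion.
# Field names are illustrative only.
@dataclass(frozen=True)
class Criterion:
    criterion_id: str   # stable identifier a decision can be traced to
    version: int        # incremented whenever the definition changes
    description: str    # the rule, declared in advance
    threshold: float    # the value against which behaviour is judged

@dataclass(frozen=True)
class Decision:
    behaviour_ref: str  # reference to the observed behaviour record
    criterion: Criterion  # the originating rule, pinned at a fixed version
    passed: bool        # the outcome under that rule

# Any decision can then be inspected in terms of the behaviour observed,
# the criterion applied, and the exact version of that criterion.
c_v1 = Criterion("truthfulness", 1, "claims must match cited sources", 0.9)
d = Decision("run-042", c_v1, passed=True)
assert d.criterion.version == 1
```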
Separation of measurement and judgement
A core architectural principle is strict separation between:
- measurement — what occurred
- judgement — how it is interpreted
Measurement produces:
- sealed, reproducible records of behaviour
Judgement produces:
- governed decisions about how those records are used
This separation ensures:
- measurement remains stable and replayable
- judgement can evolve without corrupting evidence
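The separation can be illustrated with a small sketch (the sealing scheme and latency criterion are assumptions for illustration): measurement produces a sealed record, and judgement is a pure function that reads the record without mutating it.

```python
import hashlib
import json

# Hypothetical sketch of the measurement/judgement split.
# Measurement: produce a sealed, reproducible record of what occurred.
def seal_measurement(behaviour: dict) -> dict:
    payload = json.dumps(behaviour, sort_keys=True)
    return {
        "payload": payload,
        # The hash seals the record: any later change is detectable.
        "digest": hashlib.sha256(payload.encode()).hexdigest(),
    }

# Judgement: interpret the sealed record against a declared criterion.
# It reads the record but never mutates it, so criteria can evolve
# without corrupting the underlying evidence.
def judge(record: dict, max_latency_ms: float) -> bool:
    behaviour = json.loads(record["payload"])
    return behaviour["latency_ms"] <= max_latency_ms

record = seal_measurement({"latency_ms": 120.0})
verdict_strict = judge(record, max_latency_ms=100.0)   # False
verdict_lenient = judge(record, max_latency_ms=200.0)  # True
# The evidence is untouched by either judgement.
assert record == seal_measurement({"latency_ms": 120.0})
```

The same sealed record supports different verdicts as criteria change, which is exactly why the two stages are kept apart.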
Definition of coherence
Within this system, coherence is defined operationally.
Coherence is the degree of alignment between observed behaviour and a declared set of evaluation criteria.
These criteria are:
- explicitly defined
- empirically testable
- context-dependent
The system does not assume a single universal definition of coherence.
Different domains may define coherence differently, provided those definitions are made explicit and applied consistently.
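One way to make the operational definition concrete is a simple alignment fraction: the share of declared criteria that the observed behaviour satisfies. This is a sketch under assumed criteria and an assumed 0-to-1 scoring; real domains would declare their own.

```python
from typing import Callable

# A criterion is any explicitly defined, empirically testable predicate
# over observed behaviour.
Criterion = Callable[[dict], bool]

def coherence(behaviour: dict, criteria: list[Criterion]) -> float:
    """Degree of alignment between behaviour and declared criteria."""
    if not criteria:
        raise ValueError("coherence is undefined without declared criteria")
    met = sum(1 for c in criteria if c(behaviour))
    return met / len(criteria)

# Two illustrative, domain-specific criteria, declared in advance:
declared = [
    lambda b: b["cites_source"],   # claims are sourced
    lambda b: b["refusals"] <= 1,  # stable across repeated trials
]
score = coherence({"cites_source": True, "refusals": 3}, declared)
# One of two declared criteria is met.
assert score == 0.5
```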
Foundational basis
The evaluation criteria used within the system are not arbitrary.
They are informed by an underlying research programme focused on:
- behavioural consistency
- response stability under variation
- alignment between stated intent and produced output
- structural properties of interaction patterns
This work combines:
- empirical evaluation design
- behavioural analysis across repeated trials
- internally developed theoretical models of coherence and system alignment
These models are not treated as fixed doctrine.
They are:
- continuously tested against observed behaviour
- refined through iterative evaluation
- subject to revision as evidence accumulates
Where appropriate, elements of this framework are formalised and published in technical form.
What the system guarantees
The system does not claim that:
- any given definition of coherence is universally correct
It guarantees that:
- the criteria used are explicit and inspectable
- the evaluation process is deterministic and reproducible
- all results are traceable to their originating assumptions
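These guarantees can be sketched as a result record that carries its own provenance (the field names and scoring rule are hypothetical): replaying the same inputs yields an identical decision, and the record names the exact criteria it rests on.

```python
import hashlib
import json

# Hypothetical sketch: a result record with enough provenance to be
# traced to its originating assumptions and deterministically replayed.
def evaluate(behaviour: dict, criteria: dict) -> dict:
    passed = behaviour["score"] >= criteria["threshold"]
    return {
        "passed": passed,
        # Traceability: a digest of the exact behaviour judged...
        "behaviour_digest": hashlib.sha256(
            json.dumps(behaviour, sort_keys=True).encode()
        ).hexdigest(),
        # ...and the exact criteria (with version) that produced the decision.
        "criteria_used": criteria,
    }

behaviour = {"score": 0.8}
criteria = {"name": "usefulness", "version": 2, "threshold": 0.7}
first = evaluate(behaviour, criteria)
replay = evaluate(behaviour, criteria)
# Deterministic and reproducible: replay yields an identical record.
assert first == replay
assert first["passed"] is True
```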
Implication
This shifts evaluation from assertion to inspection.
Instead of:
- “This system is safe”
- “This output is correct”
the system provides:
- the recorded behaviour
- the measurement process
- the criteria applied
- the resulting decision
This allows independent observers to:
- reproduce the result
- examine the assumptions
- agree or disagree with the judgement
Result
Evaluation becomes:
- transparent rather than implicit
- reproducible rather than anecdotal
- contestable without ambiguity
Trust is not required in advance.
It emerges from the ability to inspect, reproduce, and challenge the system’s outputs.