Foundations: Judgement & Coherence

On Judgement, Coherence, and Measurement

The underlying question

Any system that evaluates AI behaviour must resolve a prior question:

By what criteria is behaviour judged?

This question is not peripheral.

It defines the meaning of every metric, score, and conclusion produced downstream.

There is no neutral evaluation

All evaluation systems depend on selected criteria:

  • what is measured
  • how it is measured
  • how outcomes are interpreted

These selections encode assumptions about:

  • usefulness
  • safety
  • truthfulness
  • acceptability

In most systems, these assumptions remain implicit.

Position

The RI Safety Layer does not attempt to eliminate judgement.

It establishes judgement as a first-class, explicit component of the system.

Structural approach

Judgement within the system is:

  • declared — criteria are defined in advance
  • structured — evaluation procedures are formally specified
  • versioned — changes to criteria are tracked over time
  • auditable — all decisions can be traced to their originating rules

This allows any result to be inspected in terms of:

  • the behaviour observed
  • the measurement process applied
  • the criteria used to interpret that behaviour
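The structure above can be sketched in code. This is a minimal illustrative sketch, not the system's actual data model; every name here (`Criterion`, `EvaluationResult`, `rule_id`, and so on) is an assumption introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical sketch: none of these names come from the RI Safety Layer itself.

@dataclass(frozen=True)
class Criterion:
    """A declared evaluation rule, fixed before any behaviour is observed."""
    rule_id: str      # stable identifier, so decisions can be traced back
    version: int      # bumped whenever the rule's definition changes
    description: str

@dataclass(frozen=True)
class EvaluationResult:
    """Links one observation to the exact rules used to interpret it."""
    behaviour: str                   # the behaviour observed
    measurement: str                 # the measurement process applied
    criteria: tuple[Criterion, ...]  # the versioned rules used to interpret it

    def audit_trail(self) -> list[str]:
        """Every decision traces to its originating, versioned rules."""
        return [f"{c.rule_id}@v{c.version}" for c in self.criteria]

result = EvaluationResult(
    behaviour="model refused request",
    measurement="transcript capture, run 17",
    criteria=(Criterion("refusal-policy", 3, "refuse unsafe requests"),),
)
print(result.audit_trail())  # → ['refusal-policy@v3']
```

Because both records are frozen, neither the observation nor the criteria attached to a result can drift after the fact, which is what makes later inspection meaningful.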

Separation of measurement and judgement

A core architectural principle is strict separation between:

  • measurement — what occurred
  • judgement — how it is interpreted

Measurement produces:

  • sealed, reproducible records of behaviour

Judgement produces:

  • governed decisions about how those records are used

This separation ensures:

  • measurement remains stable and replayable
  • judgement can evolve without corrupting evidence
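One way to picture this separation, under the assumption that judgement is a pure function over sealed records (the names below are illustrative, not the system's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the record cannot be mutated after sealing
class MeasurementRecord:
    """What occurred: a sealed, replayable account of behaviour."""
    run_id: str
    observed_output: str

def judge(record: MeasurementRecord, criteria_version: str, rule) -> dict:
    """How it is interpreted: a pure function over the sealed record.

    Swapping `rule` or `criteria_version` re-judges the same evidence
    without touching it, so judgement can evolve while measurement
    remains stable and replayable.
    """
    return {
        "run_id": record.run_id,
        "criteria_version": criteria_version,
        "decision": "pass" if rule(record.observed_output) else "fail",
    }

record = MeasurementRecord("run-42", "I can't help with that request.")
v1 = judge(record, "v1", lambda out: "can't" in out)  # original rule
v2 = judge(record, "v2", lambda out: len(out) > 0)    # revised rule
# Same sealed record, two judgements; the evidence itself is unchanged.
print(v1["decision"], v2["decision"])  # → pass pass
```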

Definition of coherence

Within this system, coherence is defined operationally.

Coherence is the degree of alignment between observed behaviour and a declared set of evaluation criteria.

These criteria are:

  • explicitly defined
  • empirically testable
  • context-dependent

The system does not assume a single universal definition of coherence.

Different domains may define coherence differently, provided those definitions are made explicit and applied consistently.
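As a toy operationalisation, coherence could be scored as the fraction of a domain's declared criteria that the observed behaviour satisfies. The criteria sets below are invented for illustration; the point is only that each domain's definition is explicit, testable, and applied consistently.

```python
def coherence(behaviour: str, criteria: dict) -> float:
    """Degree of alignment between behaviour and declared, testable criteria."""
    checks = [test(behaviour) for test in criteria.values()]
    return sum(checks) / len(checks)

# Two domains, two explicit (and deliberately different) definitions.
support_criteria = {
    "no_open_question": lambda b: "?" not in b,
    "polite":           lambda b: "please" in b or "thanks" in b,
}
code_criteria = {
    "contains_code": lambda b: "```" in b,
    "non_empty":     lambda b: len(b) > 0,
}

reply = "thanks for waiting, here is the fix"
print(coherence(reply, support_criteria))  # → 1.0
print(coherence(reply, code_criteria))     # → 0.5
```

The same behaviour scores differently under different declared criteria, which is exactly why the system requires the criteria to be explicit rather than assuming one universal definition.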

Foundational basis

The evaluation criteria used within the system are not arbitrary.

They are informed by an underlying research programme focused on:

  • behavioural consistency
  • response stability under variation
  • alignment between stated intent and produced output
  • structural properties of interaction patterns

This work combines:

  • empirical evaluation design
  • behavioural analysis across repeated trials
  • internally developed theoretical models of coherence and system alignment

These models are not treated as fixed doctrine.

They are:

  • continuously tested against observed behaviour
  • refined through iterative evaluation
  • subject to revision as evidence accumulates

Where appropriate, elements of this framework are formalised and published in technical form.

What the system guarantees

The system does not claim that:

  • any given definition of coherence is universally correct

It guarantees that:

  • the criteria used are explicit and inspectable
  • the evaluation process is deterministic and reproducible
  • all results are traceable to their originating assumptions
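The determinism and traceability guarantees can be sketched together: each result carries a fingerprint of the exact criteria that produced it, and re-running the evaluation on the same inputs yields an identical result. All identifiers here are illustrative assumptions.

```python
import hashlib
import json

def criteria_digest(criteria: dict) -> str:
    """Stable fingerprint of the criteria definition (sorted keys → deterministic)."""
    canonical = json.dumps(criteria, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

def evaluate(behaviour: str, criteria: dict) -> dict:
    """Deterministic evaluation: no randomness, no hidden state."""
    passed = all(term in behaviour for term in criteria["required_terms"])
    return {
        "behaviour": behaviour,
        "criteria_digest": criteria_digest(criteria),  # result → originating assumptions
        "decision": "pass" if passed else "fail",
    }

criteria = {"name": "disclosure-check", "required_terms": ["source", "date"]}
a = evaluate("source: survey, date: 2024", criteria)
b = evaluate("source: survey, date: 2024", criteria)
assert a == b  # same inputs, same result, byte for byte
print(a["decision"])  # → pass
```

An inspector who disagrees with a decision can recompute the digest, confirm which criteria were in force, and re-run the evaluation, without trusting the original operator.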

Implication

This shifts evaluation from assertion to inspection.

Instead of:

  • “This system is safe”
  • “This output is correct”

the system provides:

  • the recorded behaviour
  • the measurement process
  • the criteria applied
  • the resulting decision

This allows independent observers to:

  • reproduce the result
  • examine the assumptions
  • agree or disagree with the judgement

Result

Evaluation becomes:

  • transparent rather than implicit
  • reproducible rather than anecdotal
  • contestable without ambiguity

Trust is not required in advance.

It emerges from the ability to inspect, reproduce, and challenge the system’s outputs.