On Judgement, Coherence, and Measurement
The underlying question
Any system that evaluates AI behaviour must resolve a prior question:
By what criteria is behaviour judged?
This question is not peripheral.
It defines the meaning of every metric, score, and conclusion produced downstream.
There is no neutral evaluation
All evaluation systems depend on selected criteria:
- what is measured
- how it is measured
- how outcomes are interpreted
These selections encode assumptions about:
- usefulness
- safety
- truthfulness
- acceptability
In most systems, these assumptions remain implicit.
Position
The RI Safety Layer does not attempt to eliminate judgement.
It establishes judgement as a first-class, explicit component of the system.
Structural approach
Judgement within the system is:
- declared — criteria are defined in advance
- structured — evaluation procedures are formally specified
- versioned — changes to criteria are tracked over time
- auditable — all decisions can be traced to their originating rules
This allows any result to be inspected in terms of:
- the behaviour observed
- the measurement process applied
- the criteria used to interpret that behaviour
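As a minimal sketch of this structure (all names and fields are hypothetical, not the system's actual schema), declared, versioned, and auditable judgement might look like:

```python
from dataclasses import dataclass

# Hypothetical sketch: a declared, versioned evaluation criterion.
# Field names are illustrative only.
@dataclass(frozen=True)
class Criterion:
    criterion_id: str   # stable identifier a decision can be traced to
    version: int        # incremented whenever the definition changes
    description: str    # the rule, declared in advance
    threshold: float    # the value against which behaviour is judged

@dataclass(frozen=True)
class Decision:
    behaviour_ref: str  # reference to the observed behaviour record
    criterion: Criterion  # the originating rule, pinned at a fixed version
    passed: bool        # the outcome under that rule

# Any decision can then be inspected in terms of the behaviour observed,
# the criterion applied, and the exact version of that criterion.
c_v1 = Criterion("truthfulness", 1, "claims must match cited sources", 0.9)
d = Decision("run-042", c_v1, passed=True)
assert d.criterion.version == 1
```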
Separation of measurement and judgement
A core architectural principle is strict separation between:
- measurement — what occurred
- judgement — how it is interpreted
Measurement produces:
- sealed, reproducible records of behaviour
Judgement produces:
- governed decisions about how those records are used
This separation ensures:
- measurement remains stable and replayable
- judgement can evolve without corrupting evidence
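The separation can be illustrated with a small sketch (the sealing scheme and latency criterion are assumptions for illustration): measurement produces a sealed record, and judgement is a pure function that reads the record without mutating it.

```python
import hashlib
import json

# Hypothetical sketch of the measurement/judgement split.
# Measurement: produce a sealed, reproducible record of what occurred.
def seal_measurement(behaviour: dict) -> dict:
    payload = json.dumps(behaviour, sort_keys=True)
    return {
        "payload": payload,
        # The hash seals the record: any later change is detectable.
        "digest": hashlib.sha256(payload.encode()).hexdigest(),
    }

# Judgement: interpret the sealed record against a declared criterion.
# It reads the record but never mutates it, so criteria can evolve
# without corrupting the underlying evidence.
def judge(record: dict, max_latency_ms: float) -> bool:
    behaviour = json.loads(record["payload"])
    return behaviour["latency_ms"] <= max_latency_ms

record = seal_measurement({"latency_ms": 120.0})
verdict_strict = judge(record, max_latency_ms=100.0)   # False
verdict_lenient = judge(record, max_latency_ms=200.0)  # True
# The evidence is untouched by either judgement.
assert record == seal_measurement({"latency_ms": 120.0})
```

The same sealed record supports different verdicts as criteria change, which is exactly why the two stages are kept apart.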
Definition of coherence
Within this system, coherence is defined operationally.
Coherence is the degree of alignment between observed behaviour and a declared set of evaluation criteria.
These criteria are:
- explicitly defined
- empirically testable
- context-dependent
The system does not assume a single universal definition of coherence.
Different domains may define coherence differently, provided those definitions are made explicit and applied consistently.
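One way to make the operational definition concrete is a simple alignment fraction: the share of declared criteria that the observed behaviour satisfies. This is a sketch under assumed criteria and an assumed 0-to-1 scoring; real domains would declare their own.

```python
from typing import Callable

# A criterion is any explicitly defined, empirically testable predicate
# over observed behaviour.
Criterion = Callable[[dict], bool]

def coherence(behaviour: dict, criteria: list[Criterion]) -> float:
    """Degree of alignment between behaviour and declared criteria."""
    if not criteria:
        raise ValueError("coherence is undefined without declared criteria")
    met = sum(1 for c in criteria if c(behaviour))
    return met / len(criteria)

# Two illustrative, domain-specific criteria, declared in advance:
declared = [
    lambda b: b["cites_source"],   # claims are sourced
    lambda b: b["refusals"] <= 1,  # stable across repeated trials
]
score = coherence({"cites_source": True, "refusals": 3}, declared)
# One of two declared criteria is met.
assert score == 0.5
```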
Foundational basis
The evaluation criteria used within the system are not arbitrary.
They are informed by an underlying research programme focused on:
- behavioural consistency
- response stability under variation
- alignment between stated intent and produced output
- structural properties of interaction patterns
This work combines:
- empirical evaluation design
- behavioural analysis across repeated trials
- internally developed theoretical models of coherence and system alignment
These models are not treated as fixed doctrine.
They are:
- continuously tested against observed behaviour
- refined through iterative evaluation
- subject to revision as evidence accumulates
Where appropriate, elements of this framework are formalised and published in technical form.
What the system guarantees
The system does not claim that:
- any given definition of coherence is universally correct
It guarantees that:
- the criteria used are explicit and inspectable
- the evaluation process is deterministic and reproducible
- all results are traceable to their originating assumptions
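These guarantees can be sketched as a result record that carries its own provenance (the field names and scoring rule are hypothetical): replaying the same inputs yields an identical decision, and the record names the exact criteria it rests on.

```python
import hashlib
import json

# Hypothetical sketch: a result record with enough provenance to be
# traced to its originating assumptions and deterministically replayed.
def evaluate(behaviour: dict, criteria: dict) -> dict:
    passed = behaviour["score"] >= criteria["threshold"]
    return {
        "passed": passed,
        # Traceability: a digest of the exact behaviour judged...
        "behaviour_digest": hashlib.sha256(
            json.dumps(behaviour, sort_keys=True).encode()
        ).hexdigest(),
        # ...and the exact criteria (with version) that produced the decision.
        "criteria_used": criteria,
    }

behaviour = {"score": 0.8}
criteria = {"name": "usefulness", "version": 2, "threshold": 0.7}
first = evaluate(behaviour, criteria)
replay = evaluate(behaviour, criteria)
# Deterministic and reproducible: replay yields an identical record.
assert first == replay
assert first["passed"] is True
```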
Implication
This shifts evaluation from assertion to inspection.
Instead of:
- “This system is safe”
- “This output is correct”
the system provides:
- the recorded behaviour
- the measurement process
- the criteria applied
- the resulting decision
This allows independent observers to:
- reproduce the result
- examine the assumptions
- agree or disagree with the judgement
Result
Evaluation becomes:
- transparent rather than implicit
- reproducible rather than anecdotal
- contestable without ambiguity
Trust is not required in advance.
It emerges from the ability to inspect, reproduce, and challenge the system’s outputs.