# Role definitions

> The canonical roles — ingest data plane (passive/active), inside historian, outside copy, query plane — the one-way seam between them, and how they distribute across single-box and two-box-plus-diode deployments.

*Canonical HTML: https://www.conversationalfactory.com/docs/role-definitions*
*Markdown source: https://www.conversationalfactory.com/docs/role-definitions.md*

---

This document defines the architectural roles in Conversational Factory, the one-way seam between them, and how they distribute across single-box and two-box-plus-diode deployments. It is the canonical reference for *what each part is responsible for* and *what it must not do*.

Roles are **conceptual**. Multiple roles can live in one process; one role can have multiple physical instances (the historian role does, in two-box deployments). The role boundary still matters even when the process boundary doesn't, because failure-domain reasoning, ops cadence, and the security argument all flow from it.

This document is the canonical artifact. Reference software exists for several of these roles, but it is **illustrative, not definitive** — the roles and the read-only/one-way constraints are what must hold; how a given implementation satisfies them is its own business. The **witness** is an example of a *passive-ingest data plane*, **discovery** of an *active-ingest* one, and **modelpond** of a *query plane* (deliberately source-agnostic — it does not know or care where the data sits; the outside copy is just one source it can be pointed at). None defines its role.

## The roles

1. **Ingest data plane** — produces the standardized record upstream of the inside historian. **Passive-ingest**: observe only, never transmit or probe (*example: the witness*). **Active-ingest**: poll/query/discover to read values devices don't volunteer — read-only, never write or control (*example: discovery*). A factory may run both.
2. **Inside historian** — the authoritative system of record on the trusted network. Self-sufficient. Exports a copy outward; nothing reaches it from outside.
3. **Outside historian** — the expendable standardized copy on the outside. Same data, designed to be lost. The inside historian's outward-facing peer.
4. **Query plane** (the MCP gateway) — the off-copy conversational front door. Source-agnostic; turns natural language into bounded read-only queries and composes grounded, audited answers. *Example: modelpond.*

The **one-way sync** between the inside historian and the outside historian is not a role — it is the structural constraint every role is shaped around. Its properties are specified in [Diode: what it constrains](#diode-what-it-constrains) below.

The **ingest data plane** is what feeds the inside historian — passive (witness) or active (discovery), and a factory may run both. It is a role, not an afterthought. What is site-dependent is the *specific* feed (which traffic, which collection or polling method); the role and its binding invariant — **read-only, no process side effects, lands the standardized schema** (passive adds "no emission at all"; active relaxes to "reads only, never writes") — are canonical.

PI, Aveva, Canary, Grafana, custom apps, and any other downstream integration are **consumers** of the outside historian's published surface. They are not roles. Serving them is part of the outside historian's job; their existence does not change the role count.

---

## Ingest data plane

**Role.** Produce the standardized record upstream of the inside historian, with zero process side effects. The raw material is the plant's existing **pools of data** — Modbus registers, controller context, alarm and event logs, batch and recipe history, OPC tags — most of which sit untapped until something collects them. Two forms; a factory may run either or both.

### Passive-ingest (example: the witness)

**Must do**

- Observe and ingest plant data into the standardized schema.
- Stay strictly passive — receive only.

**Must not**

- Transmit, probe, scan, or emit anything on the observed network.
- Write to, or control, any device.
- Interpret data beyond what the standardized schema requires (downstream roles do that).

### Active-ingest (example: discovery)

**Must do**

- Poll, query, or enumerate to read values devices do not volunteer, into the standardized schema.
- Confine itself to read/query verbs of whatever protocol it speaks.

**Must not**

- Write, command, or change any device state — it may *ask*, never *act*.
- Persist data beyond handing the record to the inside historian; storage is the historian's job.
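
As one way to make "ask, never act" concrete, the sketch below gates Modbus requests by function code before anything is sent. The function codes are the standard Modbus public codes; `guard_request` and `WriteAttempt` are hypothetical names, and a real active-ingest component would need an equivalent allowlist for each protocol it speaks.

```python
# Sketch: allowlist of read-only Modbus function codes (standard public codes).
# guard_request and WriteAttempt are illustrative names, not part of any library.
READ_ONLY_FUNCTION_CODES = {
    0x01: "Read Coils",
    0x02: "Read Discrete Inputs",
    0x03: "Read Holding Registers",
    0x04: "Read Input Registers",
}


class WriteAttempt(Exception):
    """Raised when a request is not on the read-only allowlist."""


def guard_request(function_code: int) -> str:
    """Return the verb name if the code is read-only, otherwise refuse."""
    try:
        return READ_ONLY_FUNCTION_CODES[function_code]
    except KeyError:
        raise WriteAttempt(
            f"function code 0x{function_code:02X} is not read-only"
        ) from None
```

An allowlist fails closed: codes such as 0x06 (Write Single Register) or 0x10 (Write Multiple Registers), and any code not listed, are rejected by default.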

### Compromise notes

- The binding invariant is **read-only, no process side effects, lands the standardized schema.** Passive adds "no emission at all"; active relaxes only to "reads/queries, never writes."
- The *specific* feed (which traffic, which polling method, which protocols) is site-dependent. The role and its invariant are canonical; the implementation is reference, not definitive.
- The data plane is upstream of the inside historian and never reaches across the one-way seam. A compromised data plane can, at worst, feed bad records inward — it has no path outward and no control path to the plant.

---

## Inside historian

**Role.** Maintain the authoritative, time-aligned, queryable record on the trusted network, and export a copy outward over the one-way sync. Never be reachable from outside.

### Must do

- **Own the authoritative record** in a standardized schema — the operational history consumers ask questions against.
- **Be self-sufficient.** Continue to function with zero outbound connectivity and zero dependence on anything outside the boundary.
- **Export a copy over the one-way sync.** Datagrams go out signed, FEC-protected, and locally buffered. No acknowledgment is expected or required.
- **Stay standardized.** The schema the copy carries is the same standard schema every downstream reader expects.
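
A minimal sketch of a send-only exporter with a bounded local backlog. This document does not specify how the inside notices an outage without a return channel, so the sketch assumes only what is visible locally: a failed `sendto` on the inside's own interface. `OneWayExporter`, its `depth` parameter, and the flush policy are assumptions for illustration; signing and FEC are left out.

```python
import socket
from collections import deque


class OneWayExporter:
    """Send-only exporter: never reads from the socket, keeps a bounded backlog."""

    def __init__(self, sock: socket.socket, dest, depth: int):
        self.sock = sock
        self.dest = dest
        # When the backlog is full, the oldest batch falls off (finite depth).
        self.backlog = deque(maxlen=depth)

    def export(self, batch: bytes) -> int:
        """Queue the batch, flush oldest-first, return the number of datagrams sent."""
        self.backlog.append(batch)
        sent = 0
        while self.backlog:
            try:
                self.sock.sendto(self.backlog[0], self.dest)
            except OSError:
                break  # local send failed: keep the backlog, retry on the next export
            self.backlog.popleft()
            sent += 1
        return sent
```

Nothing in the class calls `recv`, so no acknowledgment can influence it. A successful `sendto` says only that the datagram left the inside box, not that the outside received it.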

### Must not

- Expose any inbound listener the untrusted side can address. The only boundary crossing is the outward sync.
- Depend on the outside for correctness or availability. The outside falling over must not affect the inside.
- Speak HTTP or any bidirectional protocol across the boundary.
- Hold conversational / LLM logic.

### Compromise notes

- The inside historian is the source of truth. If the link or the outside peer is lost, it keeps ingesting and recording; the outside falls behind but the inside is unaffected.
- Inside-side timestamps are authoritative — the receiver has no back channel to dispute them.

---

## Outside historian

**Role.** Hold the standardized copy on the outside, expose it through a read-only surface to consumers and the gateway, and be safely losable.

The historian role has **peer instances** in two-box deployments — one inside, one outside. Both hold the same data and answer the same queries. The outside peer is **expendable**: it lives outside the trust boundary and can be lost completely without affecting the inside historian or the plant.

In single-box deployments there is one historian process; the "copy" is an internal handoff and the one-way constraint is enforced in software rather than by a diode.

### Must do

- **Receive the copy** from the inside historian over the one-way sync. Reassemble, verify provenance against pre-shared / out-of-band trust anchors.
- **Expose a standardized read API.** A small, opinionated read-only surface — list objects, current state, history, related, subscribe — that any consumer or the gateway can use.
- **Multi-protocol publish** to non-API consumers where useful: OPC UA pub/sub, Sparkplug B, Iceberg/Delta batch, custom integrations.
- **Optionally forward outward.** Push the copy to a cloud or off-site server over MQTT in realtime when the deployment opts in.
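
The read surface named in the list above could be sketched as a small interface. The method names mirror the five operations listed; the signatures, the `Sample` type, and the in-memory `CopyStore` are assumptions for illustration, not the product's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass(frozen=True)
class Sample:
    object_id: str
    timestamp: float  # inside-side timestamp, seconds since epoch
    value: float


class ReadApi(Protocol):
    """Read-only surface: no method accepts data to store or forward to a device."""

    def list_objects(self) -> list[str]: ...
    def current_state(self, object_id: str) -> Optional[Sample]: ...
    def history(self, object_id: str, start: float, end: float) -> list[Sample]: ...
    def related(self, object_id: str) -> list[str]: ...
    def subscribe(self, object_id: str, callback: Callable[[Sample], None]) -> None: ...


class CopyStore:
    """Hypothetical in-memory copy; in practice samples arrive via the one-way sync."""

    def __init__(self, samples: list[Sample], relations: dict[str, list[str]]):
        self._samples = sorted(samples, key=lambda s: s.timestamp)
        self._relations = relations
        self._subscribers: dict[str, list[Callable[[Sample], None]]] = {}

    def list_objects(self) -> list[str]:
        return sorted({s.object_id for s in self._samples})

    def current_state(self, object_id: str) -> Optional[Sample]:
        matches = [s for s in self._samples if s.object_id == object_id]
        return matches[-1] if matches else None

    def history(self, object_id: str, start: float, end: float) -> list[Sample]:
        return [s for s in self._samples
                if s.object_id == object_id and start <= s.timestamp <= end]

    def related(self, object_id: str) -> list[str]:
        return list(self._relations.get(object_id, []))

    def subscribe(self, object_id: str, callback: Callable[[Sample], None]) -> None:
        # Registers interest only; delivery would happen as new samples arrive.
        self._subscribers.setdefault(object_id, []).append(callback)
```

Keeping the surface this small is what lets both consumers and the gateway share it without either gaining a write path.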

### Must not

- Reach back across the seam. There is no path inward; in two-box this is the diode's job, in single-box it is enforced by construction.
- Pretend to be authoritative. On any discrepancy, the inside historian wins.
- Mutate device state or serve as a control system. It is a record, not an actuator.
- Assume it is trusted. It is designed on the premise that an attacker may own it entirely.

### Compromise notes

- **This is the part you are allowed to lose.** The entire security argument is: own the outside historian completely, and you have a copy of historical data and no route to anything. No socket, no interface, no path home.
- **Transient divergence during outages.** If the link or the outside peer is down, the outside falls behind by the outage duration. The local buffer, together with FEC, lets the outside catch up only to a finite depth; beyond that depth, the gap is permanent on the outside (the inside still has the data and can re-export by hand if absolutely necessary).
- **Consumer security is the consumer's problem.** The copy is published through documented profiles; what a downstream consumer does with it, and how secure that consumer is, is downstream.
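
Because nothing can be requested back across the seam, the outside can only record what it missed. A minimal sketch of that bookkeeping, assuming each batch carries a monotonically increasing sequence number (an assumption; this document does not specify the framing). `find_gaps` is a hypothetical helper.

```python
def find_gaps(received: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges between received sequence numbers."""
    gaps = []
    ordered = sorted(set(received))  # tolerate duplicates and out-of-order arrival
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps
```

Recorded gaps can be surfaced to consumers so that a missing window is reported as missing rather than read as "nothing happened".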

---

## Query plane (the MCP gateway)

**Role.** Translate natural-language queries into bounded, read-only calls, compose grounded answers with citations, and serve them through MCP to AI clients. **Source-agnostic** — it does not know or care where the data sits; the outside copy is one source it can be pointed at, alongside MCP servers, time-series databases, SQL historians, or indexes. *Example: modelpond.*

**Lives on the outside, with the copy.** Never on the trusted side.

### Must do

- **Expose read-only verbs as MCP tools** to AI clients (Claude Desktop, browser-based clients, local LLMs, MCP-aware tooling).
- **Translate natural language to queries.** A query like *"why did line 3 lose throughput last shift?"* becomes a sequence: identify line 3 → fetch components → range-query history → fetch findings in window → compose.
- **Compose answers with citations** that point back into the audit chain. Every claim traceable to the read that produced it.
- **Maintain an operator-side audit chain**, correlated with the inside record by `request_id`.
- **Enforce read-only at the surface.** No MCP tool maps to a write — and the only thing reachable is a copy fed by a one-way transport.
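
The line 3 example can be sketched as a fixed plan of read-only tool calls with a citation per call. Everything here is illustrative: the tool names, the `READ_TOOLS` registry, the stub data, and the `run_plan` helper are assumptions, and a real gateway would have a model choose the sequence rather than hard-code it.

```python
import uuid

# Hypothetical read-only tools backed by stub data; a real gateway would call
# the copy's read surface instead.
READ_TOOLS = {
    "identify": lambda name: {"object_id": "line3"},
    "components": lambda object_id: ["filler3", "capper3"],
    "history": lambda object_id, start, end: [
        {"t": start, "rate": 120}, {"t": end, "rate": 95}],
    "findings": lambda object_id, start, end: [{"t": end, "text": "capper3 jam"}],
}


def call_tool(audit: list, request_id: str, tool: str, **args):
    """Run one registered read-only tool and record the call; refuse anything else."""
    if tool not in READ_TOOLS:
        raise PermissionError(f"{tool!r} is not a registered read-only tool")
    result = READ_TOOLS[tool](**args)
    audit.append({"request_id": request_id, "step": len(audit),
                  "tool": tool, "args": args})
    return result


def run_plan(start: float, end: float):
    """identify -> components -> history -> findings -> compose, citing each read."""
    request_id, audit = str(uuid.uuid4()), []
    line = call_tool(audit, request_id, "identify", name="line 3")["object_id"]
    parts = call_tool(audit, request_id, "components", object_id=line)
    rates = call_tool(audit, request_id, "history", object_id=line, start=start, end=end)
    notes = call_tool(audit, request_id, "findings", object_id=line, start=start, end=end)
    answer = (
        f"{line} rate went from {rates[0]['rate']} to {rates[-1]['rate']} [step 2]; "
        f"finding: {notes[0]['text']} [step 3]; "
        f"components checked: {', '.join(parts)} [step 1]"
    )
    return answer, audit
```

Each bracketed step number in the answer points at one audit entry, which is the sense in which every claim is traceable to the read that produced it.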

### Must not

- Speak any OT protocol, or reach the trusted side at all. Of the roles defined here, the only one it talks to is the outside historian, through the copy's read surface.
- Run on the trusted side. Everything LLM-adjacent lives on the outside.
- Mutate any state anywhere.
- Make outbound calls beyond the copy's read surface and the chosen model/provider.

### Compromise notes

- **Compromising the gateway compromises the copy's read surface — and nothing else.** It has no path inward by construction; owning it yields read access to expendable data.
- **Model non-determinism.** The same NL query can produce different call sequences across runs. The audit chain captures the actual calls made — the sequence is auditable even if not deterministic.
- **Air-gapped sites.** Where no outbound connectivity is allowed, inference runs on-prem (a specialized small model next to the copy). The architecture does not change; only the model placement does.
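
One way to make the recorded call sequence tamper-evident is a hash chain, where each entry commits to the one before it. This is an assumption about how an audit chain could be built, not a description of the reference implementation; `append_entry` and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def append_entry(chain: list, request_id: str, tool: str, args: dict) -> dict:
    """Append an audit entry whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"request_id": request_id, "tool": tool, "args": args, "prev": prev_hash}
    entry = {**record, "hash": _digest(record)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check that each entry links to its predecessor."""
    prev_hash = GENESIS
    for entry in chain:
        record = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(record):
            return False
        prev_hash = entry["hash"]
    return True
```

The chain records what was actually called, in order, so two runs of the same question can differ and both remain auditable.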

---

## Consumers (not roles)

PI, Aveva, Canary, OSIsoft, GE Proficy, Grafana, Power BI, Tableau, Streamlit dashboards, custom integrations — all of these are *consumers* of the outside historian's publishing surface. They are not roles.

- Adding a consumer is **publishing-surface work on the outside historian**, not a new role. The role count is stable.
- Consumers can be added or removed without changing the topology. The contract surface is stable.
- A consumer's threat model is the consumer's responsibility. The copy is published through documented profiles; downstream is downstream.

---

## Deployment topologies

### Single-box

One Conversational Factory appliance hosts the inside historian and the outside copy in one process; the copy is an internal handoff. The one-way constraint is enforced in software — the copy-facing surface has no path back to the authoritative store.

The MCP gateway still lives on the outside (a workstation or adjunct host), querying the copy's read surface. This is the standard deployment for sites without a regulatory or operational mandate for a hardware diode.
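
In single-box, "no path back" can be approximated in software by handing the copy side detached, read-only snapshots instead of a reference to the store. A minimal sketch; `AuthoritativeStore` and `publish_snapshot` are hypothetical names, and this shows only the shape of the handoff, not how the appliance implements it.

```python
from types import MappingProxyType


class AuthoritativeStore:
    """Inside side: owns the mutable record."""

    def __init__(self):
        self._records = {}  # object_id -> list of values

    def ingest(self, object_id: str, value: float) -> None:
        self._records.setdefault(object_id, []).append(value)

    def publish_snapshot(self) -> MappingProxyType:
        """One-way handoff: values copied into tuples behind a read-only mapping."""
        detached = {k: tuple(v) for k, v in self._records.items()}
        return MappingProxyType(detached)
```

The snapshot shares no mutable object with the store, so later ingests do not alter it and writes through it raise `TypeError`.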

### Two-box + diode

Two appliances. The inside box hosts the inside historian. The outside box hosts the outside historian. A **hardware data diode** sits between them, carrying the one-way sync from inside to outside and nothing back.

The MCP gateway lives on the outside network with the copy. Used in deployments with regulatory diode mandates (utilities, defense, nuclear, pharma, water) or customer-driven air-gap requirements.

---

## Data flow

End-to-end: inside historian (authoritative record) → one-way sync (datagrams out, optionally across a diode) → outside historian (standardized copy) → read API / multi-protocol publish (optionally MQTT to cloud/off-site) → consumer or MCP gateway (which composes the conversational answer; other consumers ingest directly).

The copy is the unit that crosses the seam. Nothing crosses the other way.

---

## Diode: what it constrains

The diode is a property of the inside-to-outside link in two-box deployments. It is hardware-enforced one-way flow — software bugs cannot violate it. This shapes several decisions, and the same constraints shape the single-box design even though software enforces them there.

### Profile selection

Only profiles compatible with one-way transport work across the diode:

| Profile | One-way compatible | Why |
|---|---|---|
| Sparkplug B | No | MQTT runs over TCP and requires a bidirectional session (connect, subscribe, and acknowledgment exchanges) |
| OPC UA pub/sub | Yes | UDP multicast / one-way friendly |
| mTLS structured query | No | TLS handshake is bidirectional |
| Iceberg/Delta batch | Yes (with a diode that supports file transfer) | File-shaped, no return channel |
| Custom UDP + FEC | Yes | Designed for unidirectional gateways |

Across the diode, the practical sync profiles are custom UDP+FEC, OPC UA pub/sub, or batch file transfer. The outside copy then republishes to *its* consumers using whatever profile suits each (the outside network is bidirectional internally; the seam is not).
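
The table can be encoded as data so that a deployment check refuses an incompatible cross-seam profile. The profile keys and `check_sync_profile` are hypothetical; the compatibility values are transcribed from the table above.

```python
# One-way compatibility per profile, transcribed from the table above.
ONE_WAY_COMPATIBLE = {
    "sparkplug_b": False,         # MQTT needs a bidirectional session
    "opcua_pubsub": True,         # UDP multicast
    "mtls_query": False,          # TLS handshake is bidirectional
    "iceberg_delta_batch": True,  # only if the diode supports file transfer
    "custom_udp_fec": True,       # designed for unidirectional gateways
}


def check_sync_profile(profile: str, diode_supports_files: bool = False) -> None:
    """Raise ValueError if the profile cannot carry the cross-seam sync."""
    if not ONE_WAY_COMPATIBLE.get(profile, False):
        raise ValueError(f"{profile} cannot cross a one-way link")
    if profile == "iceberg_delta_batch" and not diode_supports_files:
        raise ValueError("batch transfer needs a diode with file-transfer support")
```

Unknown profile names are rejected as well, so a typo in configuration fails closed. The check applies only to the seam; republishing on the outside network is unconstrained.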

### No acknowledgment, no retry on receiver request

The inside historian cannot know whether the outside received a given batch. Implications:

- Use FEC or redundant transmission for resilience to packet loss.
- The local buffer is for *operational continuity* (riding out short outages of the outside), not guaranteed delivery.
- If the outside is down longer than the buffer depth, the gap is permanent on the outside. The inside retains the data; manual export is possible but not automatic.
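
As a concrete illustration of FEC without a return channel, the sketch below uses the simplest scheme: one XOR parity packet per group, which lets the receiver rebuild any single lost packet in that group. This is an illustrative choice, not the code the product uses; stronger codes (Reed-Solomon, for example) tolerate more loss. Packets in a group are assumed to have equal length.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: list[bytes]) -> bytes:
    """Sender side: XOR of all packets in the group."""
    return reduce(xor_bytes, packets)


def recover(received: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Receiver side: rebuild at most one missing packet (marked None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single parity cannot repair more than one loss per group")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # parity XOR all surviving packets leaves exactly the lost packet
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired
```

The receiver repairs the loss on its own, which is the point: resilience comes from what the sender adds up front, not from anything the receiver asks for.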

### No clock sync, no schema feedback

The outside cannot tell the inside "your timestamps look wrong" or "I don't recognize this schema version." So:

- Inside-side timestamps are authoritative.
- Schema/dictionary version travels with each batch; outside consumers handle drift.
- Provenance signing uses pre-shared keys or out-of-band trust anchors.
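
A receiver-side sketch of these three points, assuming HMAC-SHA256 with a pre-shared key and a hypothetical batch layout (2-byte schema version, 8-byte inside timestamp in milliseconds, payload, 32-byte tag). The outside verifies provenance, keeps the inside timestamp as is, and flags an unknown schema version instead of rejecting it, since it has no way to ask for anything different.

```python
import hashlib
import hmac
import struct

PREFIX = struct.Struct("!HQ")  # hypothetical layout: schema version, timestamp (ms)
TAG_LEN = 32                   # HMAC-SHA256 digest size
KNOWN_SCHEMAS = {1, 2}         # versions this outside build understands (assumed)


def sign_batch(key: bytes, schema: int, ts_ms: int, payload: bytes) -> bytes:
    """Inside-side helper, included so the example round-trips."""
    body = PREFIX.pack(schema, ts_ms) + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


def accept_batch(key: bytes, batch: bytes):
    """Return a record dict, or None if the provenance check fails."""
    if len(batch) < PREFIX.size + TAG_LEN:
        return None
    body, tag = batch[:-TAG_LEN], batch[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # drop silently; there is no channel to complain on
    schema, ts_ms = PREFIX.unpack(body[:PREFIX.size])
    return {
        "schema": schema,
        "timestamp_ms": ts_ms,  # inside timestamp kept verbatim
        "payload": body[PREFIX.size:],
        "schema_known": schema in KNOWN_SCHEMAS,
    }
```

Storing the batch with `schema_known` set to false leaves the drift decision to outside consumers, as described above.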

### Single-box compatibility

The historian is built to work the same in both topologies. The cross-seam step degenerates to an in-process handoff in single-box. There is no separate "diode build" — the constraints shape the design in both cases, and the hardware enforces them in two-box.

---

## Cross-cutting tradeoffs

| Tradeoff | Choice | Cost | Why we accept it |
|---|---|---|---|
| Return channel | None, ever | No receiver acks, no retry-on-request, no clock/schema feedback | The absence of the channel *is* the product; FEC + buffering covers the operational cost |
| Inside/outside split | Always split (peer roles) | Extra transport, slightly more ops surface | The security argument requires it; failure domains stay clean |
| Expendable outside | Outside copy is designed to be lost | Outside can fall behind; gaps longer than the buffer depth are permanent there | A lost copy is the designed worst case, not an incident |
| Standard schema at the seam | Standardized read API on the copy | Less freedom to special-case the boundary | Clients/models/clouds become interchangeable; no lock-in |
| Gateway on the outside | Always with the copy, never inside | Outside host needs reach to the copy | LLM-adjacent code can't sit in the trusted envelope; nothing inbound at the boundary |
| Consumers, not roles | Downstream is publishing-surface work | Each integration is a publish-profile addition | Keeps the role count small; integrations don't change the architecture |

---

## Where this fits in the docs

This document is the canonical reference for role boundaries. The [system architecture](/docs/architecture/) gives the inside / one-way / outside model and the seam; this document is more precise about what each role is responsible for and where its limits are. When the two disagree on role boundaries, this document is canonical.
