Type System

Radhflow has four data types. Every port on every node uses one of them. Schemas are validated at construction time — before any code runs.

| Type | What it is | Example |
| --- | --- | --- |
| Value | A single scalar: string, number, boolean, null. | An API key. A threshold. A file path. |
| Record | A single JSON object with named fields. | One user profile. One config block. |
| Table | An ordered collection of records (rows). | A CSV import. A query result. A report. |
| Stream | An unbounded sequence of records. | A webhook feed. A log tail. A message queue. |

Value and Record are singular. Table and Stream are plural. Table is bounded (all rows in memory or on disk). Stream is unbounded (processed incrementally).

Tables are the most common type. Most pipelines read a Table, transform it through SQL or data ops, and write a Table.

Every Table and Record port has a schema. Schemas define fields and their types. They live as .schema.json files alongside the data.

{
  "fields": {
    "name": { "type": "string" },
    "email": { "type": "string" },
    "score": { "type": "number" },
    "active": { "type": "boolean" }
  }
}

Supported field types: string, number, integer, boolean, null, array, object. Nested objects and arrays use standard JSON Schema structure.
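To make the field-type rules concrete, here is a minimal Python sketch of record validation against the schema format above. `validate_record` and `JSON_TYPES` are illustrative names, not part of Radhflow's API.

```python
# Map JSON Schema type names to the Python types they accept.
JSON_TYPES = {
    "string": str,
    "number": (int, float),   # in JSON Schema, an integer is a valid number
    "integer": int,
    "boolean": bool,
    "null": type(None),
    "array": list,
    "object": dict,
}

def validate_record(record: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the record is valid."""
    errors = []
    for name, spec in schema["fields"].items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python; don't let True pass as integer/number.
        if isinstance(value, bool) and spec["type"] != "boolean":
            errors.append(f"{name}: expected {spec['type']}, got boolean")
        elif not isinstance(value, JSON_TYPES[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return errors

schema = {"fields": {"name": {"type": "string"}, "score": {"type": "number"}}}
print(validate_record({"name": "Alice", "score": 0.9}, schema))  # []
print(validate_record({"name": "Alice"}, schema))                # ['missing field: score']
```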

When an edge connects an output port to an input port, the runtime checks schema compatibility. The rules are lenient by design — pipelines should not break because a source added a column.

| Scenario | Result |
| --- | --- |
| Extra fields in source | Allowed. Downstream sees a superset. |
| Missing required field | Error at validation time. |
| integer output to number input | Allowed. Integer is a subset of number. |
| number output to integer input | Error. Potential data loss. |
| Enum subset (source has fewer values) | Allowed. |
| Enum superset (source has extra values) | Error. Downstream cannot handle unknown values. |
| Field type mismatch (string to number) | Error at validation time. |

The principle: a producer can give more than the consumer expects, but never less.
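The rules in the table above can be sketched in a few lines of Python. `compatible`, `field_compatible`, and `WIDENS` are hypothetical names for illustration; Radhflow's actual checker may differ.

```python
# The only type widening the rules allow: an integer output may feed a number input.
WIDENS = {("integer", "number")}

def field_compatible(out_spec: dict, in_spec: dict) -> bool:
    if out_spec["type"] != in_spec["type"] and \
            (out_spec["type"], in_spec["type"]) not in WIDENS:
        return False
    # Enum rule: the producer's values must be a subset of what the consumer accepts.
    if "enum" in in_spec:
        return "enum" in out_spec and set(out_spec["enum"]) <= set(in_spec["enum"])
    return True

def compatible(out_schema: dict, in_schema: dict) -> list:
    """A producer can give more than the consumer expects, but never less."""
    errors = []
    for name, in_spec in in_schema["fields"].items():
        out_spec = out_schema["fields"].get(name)
        if out_spec is None:
            errors.append(f"missing required field: {name}")
        elif not field_compatible(out_spec, in_spec):
            errors.append(f"incompatible types for {name}: "
                          f"{out_spec['type']} -> {in_spec['type']}")
    return errors  # extra producer fields are simply ignored

out_s = {"fields": {"name": {"type": "string"},
                    "clicks": {"type": "integer"},
                    "extra": {"type": "string"}}}
in_s = {"fields": {"name": {"type": "string"}, "clicks": {"type": "number"}}}
print(compatible(out_s, in_s))  # [] — extra field and integer -> number both allowed
```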

Nodes declare schemas on their ports in node-spec.yaml:

id: score-leads
type: deterministic
inputs:
  leads:
    type: Table
    schema:
      name: { type: string }
      email: { type: string }
      clicks: { type: integer }
      opens: { type: integer }
outputs:
  scored:
    type: Table
    schema:
      name: { type: string }
      email: { type: string }
      score: { type: number }

When edges connect ports, the graph parser checks compatibility between the output schema of the source and the input schema of the target. This happens at construction time — before execution.

Tables flow between nodes as NDJSON — one JSON object per line. Each Table has a companion .schema.json file.

nodes/read-leads/artifacts/
  leads.ndjson         # data
  leads.schema.json    # schema

The NDJSON file:

{"name":"Alice","email":"alice@example.com","clicks":42,"opens":18}
{"name":"Bob","email":"bob@example.com","clicks":7,"opens":3}
{"name":"Carol","email":"carol@example.com","clicks":91,"opens":55}

NDJSON is human-readable, diffable in Git, streamable, and supported by every language. No proprietary format. No binary serialization.
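Because each record sits on its own line, any language can stream NDJSON with its standard JSON library. A Python round-trip sketch using only the standard library:

```python
import io
import json

rows = [
    {"name": "Alice", "email": "alice@example.com", "clicks": 42, "opens": 18},
    {"name": "Bob", "email": "bob@example.com", "clicks": 7, "opens": 3},
]

# Write: one compact JSON object per line (an in-memory buffer stands in for a file).
buf = io.StringIO()
for row in rows:
    buf.write(json.dumps(row, separators=(",", ":")) + "\n")

# Read: parse line by line — no need to hold the whole Table in memory.
buf.seek(0)
parsed = [json.loads(line) for line in buf if line.strip()]
assert parsed == rows
```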

SQL nodes execute against DuckDB. The runtime loads input Tables as DuckDB tables, runs the SQL query, and produces an output Table.

score:
  type: data.sql
  query: |
    SELECT *,
           (clicks * 0.3 + opens * 0.5) AS score
    FROM input
    ORDER BY score DESC

DuckDB reads NDJSON natively. No import step. The query runs in-process with columnar execution — fast even on large Tables.
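The scoring query above is plain SQL, so it can be tried outside the runtime. This sketch uses Python's built-in sqlite3 as a stand-in engine (DuckDB is Radhflow's actual engine and is not assumed to be installed here); the table name `input` matches the query in the node spec.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE input (name TEXT, email TEXT, clicks INTEGER, opens INTEGER)")
con.executemany("INSERT INTO input VALUES (?, ?, ?, ?)", [
    ("Alice", "alice@example.com", 42, 18),
    ("Bob", "bob@example.com", 7, 3),
    ("Carol", "carol@example.com", 91, 55),
])

# Same query as the score node: weight clicks and opens, highest score first.
rows = con.execute("""
    SELECT *, (clicks * 0.3 + opens * 0.5) AS score
    FROM input
    ORDER BY score DESC
""").fetchall()

for name, _email, _clicks, _opens, score in rows:
    print(name, round(score, 1))
```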

SQL nodes run inside the DuckDB sandbox. They have no filesystem access, no network access, no ability to execute external commands. The only thing a SQL node can do is query data.

node-spec.yaml              node-spec.yaml
┌──────────┐                ┌──────────┐
│ output:  │                │ input:   │
│ Table    │──────edge─────▶│ Table    │
│ schema A │                │ schema B │
└──────────┘                └──────────┘
      │                           │
      └───────┐           ┌───────┘
              ▼           ▼
           compatibility check
           (A must satisfy B)

If schema A is not compatible with schema B, the graph parser rejects the pipeline before any node executes. You see the error immediately — not after processing half your data.
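A minimal sketch of that construction-time check, walking every edge before any node executes. The node and edge shapes (`nodes`, `edges`, `check_edge`) are illustrative, not Radhflow's internal model; the missing `opens` field makes this pipeline invalid.

```python
nodes = {
    "read-leads": {"outputs": {"leads": {"fields": {
        "name": {"type": "string"}, "clicks": {"type": "integer"}}}}},
    "score-leads": {"inputs": {"leads": {"fields": {
        "name": {"type": "string"}, "clicks": {"type": "number"},
        "opens": {"type": "integer"}}}}},
}
edges = [("read-leads", "leads", "score-leads", "leads")]

WIDENS = {("integer", "number")}  # the only allowed type widening

def check_edge(out_schema: dict, in_schema: dict) -> list:
    """Every field the consumer requires must be produced with a compatible type."""
    errors = []
    for name, in_spec in in_schema["fields"].items():
        out_spec = out_schema["fields"].get(name)
        if out_spec is None:
            errors.append(f"missing required field: {name}")
        elif out_spec["type"] != in_spec["type"] and \
                (out_spec["type"], in_spec["type"]) not in WIDENS:
            errors.append(f"type mismatch on {name}")
    return errors

# Reject the whole pipeline at construction time if any edge fails.
for src, out_port, dst, in_port in edges:
    errs = check_edge(nodes[src]["outputs"][out_port], nodes[dst]["inputs"][in_port])
    if errs:
        print(f"rejected edge {src}.{out_port} -> {dst}.{in_port}: {errs}")
```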