Schemas
Schemas define the shape of data at each port. Every port declares a schema. The runtime validates data against schemas at execution boundaries. Mismatches are caught before data flows.
Where schemas live
Custom nodes declare schemas in their node-spec.yaml:

```text
nodes/
  score-calculator/
    node-spec.yaml          # declares input/output schemas
    schemas/
      input.schema.json     # generated from spec
      output.schema.json    # generated from spec
    main.js                 # implementation
```

At execution time, data files and their companion schemas are written to the node’s output directory:

```text
nodes/
  score-calculator/
    output/
      scored.ndjson         # output data
      scored.schema.json    # output schema
```

Minimal schema example
A schema is a JSON object where keys are field names and values describe the field type and constraints:

```json
{
  "email": { "type": "string", "required": true },
  "score": { "type": "number" },
  "tier": { "type": "string", "enum": ["high", "medium", "low"] }
}
```

In node-spec.yaml, the same schema is declared in YAML:

```yaml
inputs:
  records:
    type: table
    schema:
      email:
        type: string
        required: true
      score:
        type: number
      tier:
        type: string
        enum: [high, medium, low]
```

Schema validation
Validation happens at two points:
Before execution (static). The type checker compares schemas across edges. It verifies that source port schemas satisfy destination port requirements — required fields exist, types match, enum constraints hold. This happens during rf validate and at the start of rf run.
At edge boundaries (runtime). When a node finishes executing, the runtime validates its output data against the declared output schema. When a node starts, its input data is validated against the declared input schema.
Validation errors block execution. Warnings (like enum supersets or extra fields) are logged but do not prevent a run.
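The runtime check described above can be pictured with a small Python sketch. This is not Radhflow's validator: the function name `validate_record`, the `PY_TYPES` mapping, and the choice to report undeclared fields as warnings at runtime are all assumptions made for illustration. Only the string and number types are modelled.

```python
# Illustrative sketch only, not Radhflow's implementation.
# Assumed mapping from schema type names to Python types.
PY_TYPES = {"string": str, "number": (int, float)}


def validate_record(schema, record):
    """Return (errors, warnings) for one record checked against a schema."""
    errors, warnings = [], []
    for name, spec in schema.items():
        if name not in record:
            # The feature table lists true as the default for "required".
            if spec.get("required", True):
                errors.append(f"{name}: required field is missing")
            continue
        value = record[name]
        expected = PY_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            errors.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: {value!r} is not one of {spec['enum']}")
    # Assumption: undeclared fields are reported as warnings, not errors.
    for name in record:
        if name not in schema:
            warnings.append(f"{name}: field is not declared in the schema")
    return errors, warnings
```

In this sketch a non-empty errors list would correspond to a blocked run, while warnings would only be logged.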
Schema inference
Not all schemas need to be declared manually. Radhflow infers schemas in several cases:
file.source nodes. For CSV files, the schema is inferred from the header row and first N data rows. For NDJSON files, the companion .schema.json is used directly.
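As a rough picture of CSV inference from a header row and a sample of data rows, the Python sketch below marks a column as number when every non-empty sampled cell parses as a float, and as string otherwise. The function name `infer_csv_schema`, the `sample_rows` parameter, and this two-type rule are assumptions for illustration; the actual inference rules and the value of N are not specified here.

```python
import csv
import io


def infer_csv_schema(text, sample_rows=100):
    """Infer {field: {"type": ...}} from the header and the first data rows."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    numeric = {name: True for name in columns}
    seen = {name: False for name in columns}
    for index, row in enumerate(reader):
        if index >= sample_rows:
            break
        for name in columns:
            value = row.get(name)
            if value is None or value == "":
                continue  # empty cells carry no type information
            seen[name] = True
            try:
                float(value)
            except ValueError:
                numeric[name] = False
    # A column with no sampled values falls back to string.
    return {
        name: {"type": "number" if seen[name] and numeric[name] else "string"}
        for name in columns
    }
```

A small sample size makes inference cheap but can misclassify a column whose later rows contain non-numeric values.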
Data operations. Output schemas are computed from the operation config and the upstream input schema:
| Operation | Schema rule |
|---|---|
| data.filter | Output = input (rows removed, fields unchanged) |
| data.sort | Output = input (rows reordered) |
| data.limit | Output = input (rows capped) |
| data.dedup | Output = input (duplicates removed) |
| data.map | Output = input fields + mapped fields (or mapped only) |
| data.sql | Output inferred from SQL query columns |
| data.join | Output = merged fields from left and right inputs |
| data.partition | Both matching and not_matching = input schema |
| data.group | Output = group-by fields + aggregation result columns |
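Several of the rules in the table can be written as plain functions over schema dictionaries. The Python sketch below is illustrative only: the function names are hypothetical, the right-side-wins behaviour on a join name clash is an assumption, and aggregation results are typed as number purely for the example.

```python
def passthrough(schema):
    """data.filter, data.sort, data.limit, data.dedup: fields are unchanged."""
    return dict(schema)


def map_schema(schema, mapped, keep_input=True):
    """data.map: input fields plus mapped fields, or mapped fields only."""
    return {**schema, **mapped} if keep_input else dict(mapped)


def join_schema(left, right):
    """data.join: merged fields. Assumption: the right side wins on a clash."""
    return {**left, **right}


def group_schema(schema, group_by, aggregations):
    """data.group: group-by fields followed by aggregation result columns."""
    out = {name: schema[name] for name in group_by}
    # Assumption: every aggregation result is a number.
    out.update({name: {"type": "number"} for name in aggregations})
    return out
```

Because each rule depends only on the operation config and the upstream schema, output schemas can be computed for a whole graph before any data is read.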
Custom nodes. Schemas must be declared explicitly in node-spec.yaml. No inference happens.
Supported JSON Schema features
| Feature | Syntax | Description |
|---|---|---|
| Type | "type": "string" | Field type (see Data Types) |
| Required | "required": true | Field must be present (default: true) |
| Nullable | "nullable": true | Allows null values |
| Default | "default": 0 | Value used when field is absent |
| Enum | "enum": ["a", "b"] | Restricts to listed values |
| Description | "description": "User email" | Human-readable documentation |
| List items | "items": {"type": "string"} | Type of elements in a list field |
| Nested record | "schema": {"city": {...}} | Fields within a record field |
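The items and schema features in the table nest, so a conformance check is naturally recursive. The Python sketch below shows one way to do this under stated assumptions: `matches` and `SCALARS` are hypothetical names, only the string, number, list, and record types are modelled, and any other type is accepted without inspection.

```python
# Assumed mapping from scalar schema types to Python types.
SCALARS = {"string": str, "number": (int, float)}


def matches(spec, value):
    """Return True if value conforms to spec, recursing into lists and records."""
    if value is None:
        return spec.get("nullable", False)
    kind = spec.get("type")
    if kind == "list":
        items = spec.get("items", {})
        return isinstance(value, list) and all(matches(items, v) for v in value)
    if kind == "record":
        if not isinstance(value, dict):
            return False
        for name, field in spec.get("schema", {}).items():
            if name not in value:
                # The feature table lists true as the default for "required".
                if field.get("required", True):
                    return False
            elif not matches(field, value[name]):
                return False
        return True
    if kind in SCALARS:
        ok = isinstance(value, SCALARS[kind]) and not isinstance(value, bool)
        return ok and ("enum" not in spec or value in spec["enum"])
    return True  # types not modelled in this sketch are accepted
```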
Common patterns
Optional fields
Mark fields as not required with required: false. They may be absent from records.

```yaml
schema:
  email:
    type: string
    required: true
  phone:
    type: string
    required: false
  nickname:
    type: string
    required: false
    default: "Anonymous"
```

When phone is absent, it is simply missing from the output. When nickname is absent, the default value "Anonymous" is used.
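The difference between phone and nickname can be shown with a few lines of Python. This is a sketch under the assumption that defaults are filled in per record; `apply_defaults` is a hypothetical helper, not a Radhflow function.

```python
def apply_defaults(schema, record):
    """Return a copy of record with schema defaults filled in for absent fields."""
    result = dict(record)
    for name, spec in schema.items():
        # Only absent fields that declare a default are filled in.
        if name not in result and "default" in spec:
            result[name] = spec["default"]
    return result
```

A record containing only email would gain nickname set to "Anonymous" and would still have no phone key.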
Arrays
Use the list type with an items declaration:

```yaml
schema:
  tags:
    type: list
    items:
      type: string
  scores:
    type: list
    items:
      type: number
```

Nested objects
Use the record type with a nested schema:

```yaml
schema:
  address:
    type: record
    schema:
      street:
        type: string
      city:
        type: string
      zip:
        type: string
        required: true
      country:
        type: string
        enum: [US, DE, GB, FR]
```

Nullable fields
Allow null values with nullable: true:

```yaml
schema:
  score:
    type: number
    nullable: true
  last_login:
    type: timestamp
    nullable: true
```

This is distinct from required: false. A required-but-nullable field must be present, but its value can be null.
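The four combinations of required and nullable can be made concrete with a small Python sketch. The helper `check_presence` and its return labels are hypothetical, and treating a null in a non-nullable field as a failure is an assumption inferred from the description of nullable.

```python
def check_presence(spec, record, name):
    """Classify one field as 'ok', 'missing', or 'null-not-allowed'."""
    if name not in record:
        # Absence is governed by "required" only.
        return "missing" if spec.get("required", True) else "ok"
    if record[name] is None and not spec.get("nullable", False):
        # Presence with a null value is governed by "nullable" only.
        return "null-not-allowed"
    return "ok"
```

With required: true and nullable: true, a record carrying score as null passes while a record with no score key does not. With required: false and no nullable, the outcome is reversed.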
Edge validation
When the type checker validates an edge, it compares the source port’s schema against the destination port’s expected schema.
Valid connection
Source has all fields the destination requires:

```text
# Source outputs: email, name, score, region
# Destination expects: email (required), score

# Result: compatible. name and region are passed through but ignored.
```

Missing required field
```text
# Source outputs: name
# Destination expects: email (required), name

# Result: error MISSING_REQUIRED. Source is missing required field email.
```

Type mismatch
```text
# Source outputs: score (string)
# Destination expects: score (number)

# Result: error TYPE_MISMATCH. Field score is string in source but number in destination.
```

Error codes
| Code | Severity | Meaning |
|---|---|---|
| MISSING_REQUIRED | error | Destination requires a field source lacks |
| TYPE_MISMATCH | error | Port type or field type incompatible |
| MISSING_PORT | error | Edge references a port that does not exist |
| UNKNOWN_NODE | error | Edge references a node not in the graph |
| ENUM_SUPERSET | warning | Source enum has values destination lacks |
| EXTRA_FIELDS | warning | Source has fields destination ignores |
| NULLABLE_MISMATCH | warning | Source nullable, destination not |
Errors block execution. Warnings are reported but do not prevent a run.
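The field-level codes above can be tied together in a short Python sketch of an edge check. It is an illustration under assumptions, not the type checker itself: `check_edge` is a hypothetical name, one EXTRA_FIELDS finding is emitted per extra field, and the port-level codes MISSING_PORT and UNKNOWN_NODE are left out because they concern the graph rather than the schemas.

```python
def check_edge(source, dest):
    """Compare a source schema with a destination schema.

    Returns a list of (code, severity, field) tuples.
    """
    findings = []
    for name, want in dest.items():
        have = source.get(name)
        if have is None:
            if want.get("required", True):
                findings.append(("MISSING_REQUIRED", "error", name))
            continue
        if have.get("type") != want.get("type"):
            findings.append(("TYPE_MISMATCH", "error", name))
            continue
        # Source enum values the destination does not list.
        if "enum" in have and "enum" in want and set(have["enum"]) - set(want["enum"]):
            findings.append(("ENUM_SUPERSET", "warning", name))
        if have.get("nullable", False) and not want.get("nullable", False):
            findings.append(("NULLABLE_MISMATCH", "warning", name))
    for name in source:
        if name not in dest:
            findings.append(("EXTRA_FIELDS", "warning", name))
    return findings


def blocks_run(findings):
    """Only findings with severity 'error' prevent a run."""
    return any(severity == "error" for _, severity, _ in findings)
```

Applied to the three examples in this section, the sketch yields only EXTRA_FIELDS warnings for the valid connection, a MISSING_REQUIRED error for the missing email field, and a TYPE_MISMATCH error for the string score.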