Key Concepts
Pipelines
A pipeline is a directed acyclic graph (DAG) of nodes. Data flows from sources through transforms to outputs. The entire pipeline is defined in gain.yaml and versioned in Git.
```yaml
# gain.yaml — a pipeline with three nodes
nodes:
  fetch-data:
    type: source
    # ...
  transform:
    type: deterministic
    # ...
  export:
    type: deterministic
    # ...
```

Pipelines execute deterministically. No LLM runs at execution time.
A node is a unit of work. It has typed input ports, typed output ports, and an operation that transforms inputs into outputs. Every node has a human-readable slug as its ID.
```yaml
score-leads:
  type: deterministic
  op: sql.query
  params:
    query: "SELECT *, score * weight AS final FROM leads"
  inputs:
    leads: { type: Table, from: ref(read-csv.leads) }
  outputs:
    scored: { type: Table }
```

Node types include source (data ingestion), deterministic (transforms), llm (AI-generated at creation time), conditional (branching), and service (external APIs).
Ports are the typed connection points on a node. Each input port declares the data type it accepts. Each output port declares the data type it produces. Types are checked before execution.
```yaml
inputs:
  orders: { type: Table, from: ref(fetch-orders.orders) }
  threshold: { type: Value, from: ref(config.min-amount) }
outputs:
  filtered: { type: Table }
```

A port mismatch — connecting a Value output to a Table input — is caught before any code runs.
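That pre-execution check amounts to comparing declared port types before any node runs. A minimal sketch in Python, assuming types are plain strings as in the YAML above (hypothetical helper, not Radhflow's actual checker):

```python
# Sketch of pre-execution port type checking. The four primitive
# types come from the Radhflow docs; everything else is assumed.
PRIMITIVES = {"Value", "Record", "Table", "Stream"}

def check_edge(source_output_type: str, target_input_type: str) -> None:
    """Fail fast if an output port's type does not match the input it feeds."""
    assert source_output_type in PRIMITIVES and target_input_type in PRIMITIVES
    if source_output_type != target_input_type:
        raise TypeError(
            f"port mismatch: cannot connect {source_output_type} output "
            f"to {target_input_type} input"
        )

check_edge("Table", "Table")      # ok: types agree
try:
    check_edge("Value", "Table")  # mismatch, rejected before execution
except TypeError as err:
    print(err)
```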
Edges are the connections between ports. They are declared implicitly through from references on input ports. An edge carries data from one node’s output to another node’s input.
```yaml
# This input declaration creates an edge:
# clean-data.customers → enrich.customers
enrich:
  inputs:
    customers: { type: Table, from: ref(clean-data.customers) }
```

Edges enforce type contracts. The output type of the source port must match the input type of the target port.
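Because every edge is implied by a `from` reference, the executor can recover the whole graph and an execution order from the node declarations alone. A sketch of that derivation in Python, using a hypothetical in-memory form of the YAML (the node names here are illustrative):

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory form of node declarations: each input's
# ref(...) string names the upstream node that the edge comes from.
nodes = {
    "clean-data": {"inputs": {}},
    "enrich": {"inputs": {"customers": "ref(clean-data.customers)"}},
    "export": {"inputs": {"rows": "ref(enrich.enriched)"}},
}

# Derive dependencies: from "ref(node.port)" keep only the node id.
deps = {
    name: {ref.split("(")[1].split(".")[0] for ref in spec["inputs"].values()}
    for name, spec in nodes.items()
}

# Topological order: every node runs after the nodes it reads from.
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['clean-data', 'enrich', 'export']
```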
Data types
Radhflow has four primitive data types. Every port uses one of them.
| Type | Description | Example |
|---|---|---|
| Value | A single scalar — string, number, boolean. | An API key, a threshold, a file path. |
| Record | A single JSON object with named fields. | One user profile, one config block. |
| Table | An ordered collection of records (rows). | A CSV import, a query result, a report. |
| Stream | An unbounded sequence of records. | A webhook feed, a log tail, a queue. |
Tables are the most common type. They flow between nodes as NDJSON.
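Since a Table is just an ordered list of records, serializing one to NDJSON and back needs nothing beyond the standard library. A sketch (illustrative only, not Radhflow's internal code):

```python
import io
import json

# A small Table: an ordered collection of records (rows).
rows = [
    {"name": "Alice", "score": 92},
    {"name": "Bob", "score": 45},
]

# Write: one JSON object per line.
buf = io.StringIO()
for row in rows:
    buf.write(json.dumps(row) + "\n")

# Read: parse each non-empty line back into a record.
buf.seek(0)
back = [json.loads(line) for line in buf if line.strip()]
assert back == rows
```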
Schemas
Every Table and Record port has a schema. Schemas define the fields and their types. They are stored as .schema.json files alongside NDJSON data.
```
nodes/read-leads/
  schemas/
    leads.schema.json   # field definitions
  artifacts/
    leads.ndjson        # the data
```

A schema file:
```json
{
  "fields": {
    "name": { "type": "string" },
    "email": { "type": "string" },
    "score": { "type": "number" }
  }
}
```

NDJSON is one JSON object per line. Human-readable. Diffable in Git. Universal across languages.
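A schema file like the one above can be enforced with a few lines of Python. This is a sketch under an assumed mapping from schema type names to Python types; how Radhflow actually validates is not shown here:

```python
import json

# The example schema from above.
schema = {
    "fields": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "score": {"type": "number"},
    }
}

# Assumed mapping from schema type names to Python types (this sketch only).
PY_TYPES = {"string": str, "number": (int, float), "boolean": bool}

def validate_row(row: dict, schema: dict) -> None:
    """Raise if a record is missing a field or has a wrongly typed value."""
    for field, spec in schema["fields"].items():
        if field not in row:
            raise ValueError(f"missing field: {field}")
        if not isinstance(row[field], PY_TYPES[spec["type"]]):
            raise TypeError(f"{field}: expected {spec['type']}")

# One NDJSON line parsed and checked against the schema.
validate_row(json.loads('{"name":"Alice","email":"alice@example.com","score":92}'), schema)
```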
```
{"name":"Alice","email":"alice@example.com","score":92}
{"name":"Bob","email":"bob@example.com","score":45}
```

ref() is how you wire nodes together. It references an output port on another node using the pattern ref(node-id.port-name).
```yaml
transform:
  inputs:
    raw: { type: Table, from: ref(fetch-data.rows) }
    config: { type: Record, from: ref(load-config.settings) }
```

ref(fetch-data.rows) means: take the rows output from the fetch-data node and feed it into this input. The graph executor resolves these references, validates types, and determines execution order automatically.
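Resolving a reference starts with splitting the ref() string into its node and port parts. A sketch in Python, assuming slugs are lowercase words separated by hyphens (that slug grammar is an assumption of this sketch, not something the docs specify):

```python
import re

# Assumed slug grammar: lowercase letters and digits, hyphen-separated.
REF = re.compile(r"^ref\(([a-z][a-z0-9-]*)\.([a-z][a-z0-9-]*)\)$")

def parse_ref(value: str) -> tuple[str, str]:
    """Split a ref(node-id.port-name) string into (node-id, port-name)."""
    m = REF.match(value)
    if not m:
        raise ValueError(f"not a valid ref: {value}")
    return m.group(1), m.group(2)

print(parse_ref("ref(fetch-data.rows)"))  # ('fetch-data', 'rows')
```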