Skip to content

data.join

data.join combines two tables by matching rows on key fields. It takes two Table inputs (left and right ports) and produces one Table output containing columns from both sides.

Input: Table (left port), Table (right port) Output: Table

nodes:
orders-with-customers:
type: data.join
config:
join_type: left
left_key: customer_id
right_key: id
FieldTypeRequiredDescription
join_typestringnoJoin type: inner, left, right, full (default: inner)
left_keystringyesColumn name on the left table to join on
right_keystringyesColumn name on the right table to join on
right_prefixstringnoPrefix for right-side columns to avoid name collisions

The output contains all columns from the left table plus all columns from the right table (excluding the right join key, which would duplicate the left key). When both tables share a column name, use right_prefix to disambiguate.

nodes:
merged:
type: data.join
config:
join_type: inner
left_key: id
right_key: order_id
right_prefix: "order_"

With right_prefix: "order_", right-side columns status and total become order_status and order_total in the output.

Returns only rows with matching keys on both sides.

nodes:
matched-orders:
type: data.join
config:
join_type: inner
left_key: id
right_key: customer_id
edges:
- customers.output -> matched-orders.left
- orders.output -> matched-orders.right

Returns all rows from the left table. Right-side columns are null when no match exists.

nodes:
all-customers:
type: data.join
config:
join_type: left
left_key: id
right_key: customer_id

Returns all rows from the right table. Left-side columns are null when no match exists.

nodes:
all-orders:
type: data.join
config:
join_type: right
left_key: id
right_key: customer_id

Returns all rows from both tables. Columns are null on the side where no match exists.

nodes:
full-merge:
type: data.join
config:
join_type: full
left_key: email
right_key: email
nodes:
load-orders:
type: file.csv
config:
path: orders.csv
load-products:
type: file.csv
config:
path: products.csv
orders-enriched:
type: data.join
config:
join_type: left
left_key: product_id
right_key: id
right_prefix: "product_"
summary:
type: data.map
config:
include_original: true
fields:
line_total: "{{ quantity * product_price }}"
edges:
- load-orders.output -> orders-enriched.left
- load-products.output -> orders-enriched.right
- orders-enriched.output -> summary.input

This pipeline joins orders with product data, then computes a line total using the product price from the right table.