data.group
data.group partitions rows by one or more fields, then computes aggregation functions over each group. The output contains one row per unique combination of group-by values, plus the aggregation result columns.
Input: Table

Output: Table (group columns + aggregation columns)
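To make the partitioning step concrete, here is a minimal Python sketch of what the description above implies. It is not the node's implementation: the helper names `group_rows` and `summarize`, the sample rows, and the choice to skip null values when summing are all assumptions made for illustration.

```python
from collections import defaultdict

def group_rows(rows, by):
    """Partition rows (dicts) by the tuple of their values in the `by` fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(field) for field in by)
        groups[key].append(row)
    return groups

def summarize(rows, by):
    """Emit one output row per unique key: group columns plus aggregation columns."""
    out = []
    for key, members in group_rows(rows, by).items():
        result = dict(zip(by, key))  # the group-by columns come first
        # Assumption: null ids are not counted and null amounts are not summed.
        result["order_count"] = sum(1 for r in members if r.get("id") is not None)
        result["total_revenue"] = sum(
            r["amount"] for r in members if r.get("amount") is not None
        )
        out.append(result)
    return out

# Hypothetical sample data, not taken from the documentation.
orders = [
    {"id": 1, "status": "paid", "amount": 40.0},
    {"id": 2, "status": "paid", "amount": 60.0},
    {"id": 3, "status": "open", "amount": 15.0},
]
summary = summarize(orders, ["status"])
```

With this sample input, `summary` holds two rows, one for `paid` and one for `open`, each carrying the `status` column followed by the two aggregation columns.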
Basic config
```yaml
nodes:
  orders-by-status:
    type: data.group
    config:
      by: [status]
      aggregations:
        order_count:
          op: count
          field: id
        total_revenue:
          op: sum
          field: amount
```

Config reference

| Field | Type | Required | Description |
|---|---|---|---|
| `by` | array | yes | Fields to group on |
| `aggregations` | map | yes | Output column name to aggregation definition |

Each aggregation definition has:
| Field | Type | Required | Description |
|---|---|---|---|
| `op` | string | yes | Aggregation function |
| `field` | string | varies | Input field to aggregate (required for all except `count` with `*`) |
| `limit` | number | no | Max items for `collect` |
| `separator` | string | no | Delimiter for `join` (default: `", "`) |

Aggregation functions

| Function | Description | Output type |
|---|---|---|
| `count` | Count of non-null values (or all rows with `*`) | number |
| `sum` | Sum of numeric values | number |
| `avg` | Arithmetic mean | number |
| `min` | Minimum value | same as input |
| `max` | Maximum value | same as input |
| `first` | First value in group | same as input |
| `last` | Last value in group | same as input |
| `collect` | Collect values into a list | list |
| `count_unique` | Count of distinct values | number |
| `join` | Concatenate string values with separator | string |
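
The difference between counting a field and counting with `*` is easiest to see on a group that contains a null. The following Python sketch models the scalar functions from the table on a single group; the `aggregate` helper and the sample rows are hypothetical. Skipping nulls for `sum`, `avg`, `min`, `max`, and `count_unique`, and returning the raw first or last value, are assumptions, since the table only states null handling for `count`.

```python
def aggregate(rows, op, field):
    """Apply one aggregation to the rows of a single group."""
    if op == "count" and field == "*":
        return len(rows)  # every row counts, nulls included
    values = [row.get(field) for row in rows]
    present = [v for v in values if v is not None]  # assumption: nulls are skipped
    if op == "count":
        return len(present)
    if op == "sum":
        return sum(present)
    if op == "avg":
        return sum(present) / len(present) if present else None
    if op == "min":
        return min(present, default=None)
    if op == "max":
        return max(present, default=None)
    if op == "first":
        return values[0] if values else None
    if op == "last":
        return values[-1] if values else None
    if op == "count_unique":
        return len(set(present))
    raise ValueError(f"unsupported op: {op}")

# Hypothetical group with one null amount.
group = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 30},
]

non_null_count = aggregate(group, "count", "amount")  # 2: the null is skipped
row_count = aggregate(group, "count", "*")            # 3: all rows
```

`collect` and `join` are left out here because they take the extra `limit` and `separator` options.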
Examples
Revenue by category
```yaml
nodes:
  revenue-summary:
    type: data.group
    config:
      by: [category]
      aggregations:
        total_revenue:
          op: sum
          field: amount
        avg_order:
          op: avg
          field: amount
        order_count:
          op: count
          field: id
```

Count by status
```yaml
nodes:
  status-breakdown:
    type: data.group
    config:
      by: [status]
      aggregations:
        count:
          op: count
          field: "*"
```

Multi-field grouping
```yaml
nodes:
  region-product:
    type: data.group
    config:
      by: [region, product_type]
      aggregations:
        units_sold:
          op: sum
          field: quantity
        unique_customers:
          op: count_unique
          field: customer_id
```

Collect and join
```yaml
nodes:
  tags-per-author:
    type: data.group
    config:
      by: [author]
      aggregations:
        all_tags:
          op: collect
          field: tag
          limit: 10
        tag_list:
          op: join
          field: tag
          separator: " | "
```

`collect` gathers values into a JSON array. `limit` caps the array size. `join` concatenates values into a single string with the specified separator.
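
As a rough illustration of the two list-oriented functions, the Python sketch below mirrors the `limit` and `separator` options. The function bodies and sample tags are hypothetical. In particular, keeping the first `limit` items (rather than the last or a sample) is an assumption, as the text only says that `limit` caps the array size.

```python
def collect(values, limit=None):
    """Gather values into a list, optionally capped at `limit` items."""
    items = list(values)
    # Assumption: the cap keeps the first `limit` values in group order.
    return items if limit is None else items[:limit]

def join(values, separator=", "):
    """Concatenate values into one string; ", " matches the documented default."""
    return separator.join(str(v) for v in values)

# Hypothetical tags for a single author.
tags = ["yaml", "etl", "sql"]

all_tags = collect(tags, limit=2)        # ["yaml", "etl"]
tag_list = join(tags, separator=" | ")   # "yaml | etl | sql"
```

Both aggregations read the same input field, so a single group can expose the values as a list and as a display string at the same time.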
Pipeline example
```yaml
nodes:
  load-sales:
    type: file.csv
    config:
      path: sales.csv

  monthly-summary:
    type: data.group
    config:
      by: [month, region]
      aggregations:
        revenue:
          op: sum
          field: amount
        deals:
          op: count
          field: id
        top_deal:
          op: max
          field: amount

  ranked:
    type: data.sort
    config:
      by:
        - field: revenue
          direction: desc

edges:
  - load-sales.output -> monthly-summary.input
  - monthly-summary.output -> ranked.input
```
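
For readers who want to check what this pipeline should produce, the following Python sketch walks through the same three stages: load, group, sort. It is only an approximation under stated assumptions: the CSV content is made-up sample data embedded as a string, `amount` is parsed as a float, and the function names simply echo the node names.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for sales.csv.
SALES_CSV = """id,month,region,amount
1,2024-01,EU,100
2,2024-01,EU,250
3,2024-01,US,400
4,2024-02,EU,50
"""

def load_sales(text):
    """Stage 1: parse CSV rows and convert amount to a number."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows

def monthly_summary(rows):
    """Stage 2: group by (month, region), then compute sum, count, and max."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["month"], row["region"])].append(row)
    return [
        {
            "month": month,
            "region": region,
            "revenue": sum(r["amount"] for r in members),
            "deals": len(members),  # every sample row has an id, so count(id) == row count
            "top_deal": max(r["amount"] for r in members),
        }
        for (month, region), members in groups.items()
    ]

def ranked(rows):
    """Stage 3: sort by revenue, highest first."""
    return sorted(rows, key=lambda r: r["revenue"], reverse=True)

result = ranked(monthly_summary(load_sales(SALES_CSV)))
```

On this sample, the three output rows are ordered US/2024-01, EU/2024-01, EU/2024-02, which is the descending `revenue` order the `data.sort` node is configured for.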