data.filter
Keep rows that match a condition. Drop the rest.
Input: Table | Output: Table (same schema, fewer rows)
Minimal example
Filter customers by country:

    nodes:
      german-customers:
        type: data.filter
        config:
          conditions:
            all:
              - field: country
                op: equals
                value: DE

Config reference
| Field | Type | Required | Description |
|---|---|---|---|
| `conditions` | object | yes | Condition group (`all` or `any`) containing filter rules |
Condition groups
Conditions are wrapped in `all` (AND) or `any` (OR). Groups can be nested to any depth.

    # AND: all conditions must match
    conditions:
      all:
        - field: age
          op: gte
          value: 18
        - field: country
          op: equals
          value: DE

    # OR: at least one condition must match
    conditions:
      any:
        - field: role
          op: equals
          value: admin
        - field: role
          op: equals
          value: editor

    # Nested: (status = active) AND (role = admin OR role = editor)
    conditions:
      all:
        - field: status
          op: equals
          value: active
        - any:
            - field: role
              op: equals
              value: admin
            - field: role
              op: equals
              value: editor

Operators
| Operator | Description | Value type |
|---|---|---|
| `equals` | Exact match | any |
| `not_equals` | Not equal | any |
| `gt` | Greater than | number, string |
| `gte` | Greater than or equal | number, string |
| `lt` | Less than | number, string |
| `lte` | Less than or equal | number, string |
| `contains` | Substring match | string |
| `not_contains` | No substring match | string |
| `starts_with` | Prefix match | string |
| `ends_with` | Suffix match | string |
| `in` | Value in list | array |
| `not_in` | Value not in list | array |
| `is_null` | Field is null | (none) |
| `is_not_null` | Field is not null | (none) |
| `matches` | Regex match | string (regex pattern) |
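The prefix, suffix, and negated substring operators are not demonstrated elsewhere on this page, so here is a minimal sketch that combines three of them. The field names (`sku`, `filename`, `notes`) and the values are hypothetical. Case sensitivity is not specified in the table, so the sketch assumes the values match the stored casing.

```yaml
# Illustrative only: field names and values are hypothetical.
# Keep rows whose sku starts with "EU-", whose filename ends in ".csv",
# and whose notes do not contain the word "draft".
conditions:
  all:
    - field: sku
      op: starts_with
      value: "EU-"
    - field: filename
      op: ends_with
      value: ".csv"
    - field: notes
      op: not_contains
      value: draft
```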
Progressive examples
Simple equality

    conditions:
      all:
        - field: status
          op: equals
          value: active

Range comparison

    conditions:
      all:
        - field: score
          op: gte
          value: 50
        - field: score
          op: lt
          value: 100

Multiple conditions

    conditions:
      all:
        - field: email
          op: is_not_null
        - field: score
          op: gte
          value: 50
        - field: source
          op: not_in
          value: [spam, test]

NULL handling

    # Keep rows where phone is not null
    conditions:
      all:
        - field: phone
          op: is_not_null

    # Keep rows where phone IS null
    conditions:
      all:
        - field: phone
          op: is_null

Regex match

    conditions:
      all:
        - field: zip_code
          op: matches
          value: "^[0-9]{5}$"

List membership

    conditions:
      all:
        - field: country
          op: in
          value: [DE, AT, CH]

Substring search

    conditions:
      all:
        - field: email
          op: contains
          value: "@company.com"

Promoted ports
Filter values can come from upstream nodes instead of static config, which makes thresholds dynamic.

    nodes:
      threshold:
        type: value.literal
        config:
          value: 0.8
          type: number

      high-scores:
        type: data.filter
        config:
          conditions:
            all:
              - field: score
                op: gte
                value: "{{threshold}}"

    edges:
      - "threshold.value -> high-scores.threshold"
      - "scores-table.output -> high-scores.input"

Change the upstream value and the filter adapts.
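As a sketch of how this might extend to more than one dynamic value, the config below bounds a score range with two upstream literals. It mirrors the naming pattern of the example above, where the node name, the placeholder, and the target port all share one name. Whether a single filter accepts several promoted values is an assumption, and the node names `min_score`, `max_score`, and `score-band` are hypothetical.

```yaml
# Assumption: one data.filter node can take several promoted values.
# All node names below are hypothetical.
nodes:
  min_score:
    type: value.literal
    config:
      value: 0.5
      type: number

  max_score:
    type: value.literal
    config:
      value: 0.8
      type: number

  score-band:
    type: data.filter
    config:
      conditions:
        all:
          - field: score
            op: gte
            value: "{{min_score}}"
          - field: score
            op: lt
            value: "{{max_score}}"

edges:
  - "min_score.value -> score-band.min_score"
  - "max_score.value -> score-band.max_score"
  - "scores-table.output -> score-band.input"
```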
Edge cases
Empty result. If no rows match, the output is an empty NDJSON file (zero lines) with the same schema as the input.
All rows filtered. Same as empty result — valid output, zero rows. Downstream nodes receive an empty table.
NULL comparisons. Comparisons against NULL follow SQL semantics: `NULL = anything` is false, so a comparison on a null field does not match. Use the `is_null` and `is_not_null` operators for null checks.
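One consequence is that a rule such as `not_equals` also skips rows where the field is null. If those rows should be kept, one way to express that with the operators listed above is to pair the comparison with `is_null` in an `any` group. This is a sketch: the `country` field is only an example, and it assumes `not_equals` follows the same NULL rule as `equals`.

```yaml
# Keep rows where country is anything other than DE, including rows
# where country is null.
# Assumption: not_equals treats NULL the same way equals does.
conditions:
  any:
    - field: country
      op: not_equals
      value: DE
    - field: country
      op: is_null
```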
Pipeline example
    name: qualified-leads
    version: 1

    nodes:
      raw-leads:
        type: file.source
        path: leads.csv
        format: csv

      qualified:
        type: data.filter
        config:
          conditions:
            all:
              - field: email
                op: is_not_null
              - field: score
                op: gte
                value: 50
              - field: source
                op: not_in
                value: [spam, test]

    edges:
      - "raw-leads.data -> qualified.input"