data.filter
data.filter keeps rows that match a set of conditions. Rows that fail the conditions are dropped. The output schema is identical to the input — same fields, fewer rows.
Input: Table
Output: Table (same schema)
Basic config
```yaml
nodes:
  active-users:
    type: data.filter
    config:
      conditions:
        all:
          - field: status
            op: equals
            value: active
```
Condition groups
Conditions are wrapped in all (AND) or any (OR). Groups nest arbitrarily.
```yaml
# AND: all conditions must match
conditions:
  all:
    - field: age
      op: gte
      value: 18
    - field: country
      op: equals
      value: DE

# OR: at least one condition must match
conditions:
  any:
    - field: role
      op: equals
      value: admin
    - field: role
      op: equals
      value: editor

# Nested: (status = active) AND (role = admin OR role = editor)
conditions:
  all:
    - field: status
      op: equals
      value: active
    - any:
        - field: role
          op: equals
          value: admin
        - field: role
          op: equals
          value: editor
```
Operators
| Operator | Description | Value type |
|---|---|---|
| equals | Exact match | any |
| not_equals | Not equal | any |
| gt | Greater than | number, string |
| gte | Greater than or equal | number, string |
| lt | Less than | number, string |
| lte | Less than or equal | number, string |
| contains | Substring match | string |
| not_contains | No substring match | string |
| starts_with | Prefix match | string |
| ends_with | Suffix match | string |
| in | Value in list | array |
| not_in | Value not in list | array |
| is_null | Field is null | (none) |
| is_not_null | Field is not null | (none) |
| matches | Regex match | string (regex pattern) |
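As a rough illustration of some operators from the table that the other examples on this page do not use, the fragment below combines a few of them in one all group. The field names (plan, age, email, deleted_at) are hypothetical and only serve the sketch.

```yaml
conditions:
  all:
    # Keep rows whose plan is anything other than "free"
    - field: plan
      op: not_equals
      value: free
    # Keep rows with age strictly below 65
    - field: age
      op: lt
      value: 65
    # Keep addresses ending in ".de" ...
    - field: email
      op: ends_with
      value: ".de"
    # ... that do not contain the "+test" marker
    - field: email
      op: not_contains
      value: "+test"
    # Keep rows where deleted_at is null (no value needed)
    - field: deleted_at
      op: is_null
```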
Operator examples
```yaml
# Numeric comparison
- field: score
  op: gte
  value: 0.75

# List membership
- field: country
  op: in
  value: [DE, AT, CH]

# Substring search
- field: email
  op: contains
  value: "@company.com"

# Prefix match
- field: sku
  op: starts_with
  value: "PROD-"

# Null check (no value needed)
- field: phone
  op: is_not_null

# Regex match
- field: zip_code
  op: matches
  value: "^[0-9]{5}$"
```
Promoted ports
Filter values can come from upstream nodes instead of static config. When a field is marked as a promoted port, its value is read from an incoming edge at runtime.
```yaml
nodes:
  threshold:
    type: value.literal
    config:
      value: 0.8
      type: number

  high-scores:
    type: data.filter
    config:
      conditions:
        all:
          - field: score
            op: gte
            value: "{{threshold}}"

edges:
  - threshold.value -> high-scores.threshold
  - scores-table.output -> high-scores.input
```
This makes the filter threshold dynamic: change the upstream value and the filter adapts.
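The same pattern might extend to more than one dynamic value. The sketch below assumes that a single filter can expose several promoted ports and that each port is named after its placeholder, as `{{threshold}}` and `high-scores.threshold` suggest. Neither assumption is confirmed on this page, and the node names min-score, max-score, and mid-scores are hypothetical.

```yaml
nodes:
  min-score:
    type: value.literal
    config:
      value: 0.5
      type: number

  max-score:
    type: value.literal
    config:
      value: 0.9
      type: number

  mid-scores:
    type: data.filter
    config:
      conditions:
        all:
          # Lower bound, read from the assumed "min" port
          - field: score
            op: gte
            value: "{{min}}"
          # Upper bound, read from the assumed "max" port
          - field: score
            op: lt
            value: "{{max}}"

edges:
  - min-score.value -> mid-scores.min
  - max-score.value -> mid-scores.max
  - scores-table.output -> mid-scores.input
```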
Pipeline example
```yaml
nodes:
  raw-leads:
    type: file.csv
    config:
      path: leads.csv

  qualified:
    type: data.filter
    config:
      conditions:
        all:
          - field: email
            op: is_not_null
          - field: score
            op: gte
            value: 50
          - field: source
            op: not_in
            value: [spam, test]

edges:
  - raw-leads.output -> qualified.input
```
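Because data.filter returns a table with the same schema as its input, one filter can feed another. A minimal sketch, assuming data.filter exposes its result on a port named output (this page only shows that port name on other node types) and using a hypothetical country field and dach-leads node:

```yaml
nodes:
  raw-leads:
    type: file.csv
    config:
      path: leads.csv

  qualified:
    type: data.filter
    config:
      conditions:
        all:
          - field: score
            op: gte
            value: 50

  dach-leads:
    type: data.filter
    config:
      conditions:
        all:
          - field: country
            op: in
            value: [DE, AT, CH]

edges:
  - raw-leads.output -> qualified.input
  # Assumed port name: qualified.output
  - qualified.output -> dach-leads.input
```

The rows that reach dach-leads are the ones a single filter would keep with both conditions in one all group. Splitting them into two nodes is mainly useful when the intermediate table is also consumed elsewhere.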