
Security Model

Radhflow runs untrusted code in sandboxed containers. Here’s how.

SQL from users. CLI tools from package managers. Code generated by AI. The security model treats every node as potentially hostile and isolates it accordingly.

Radhflow protects against three classes of threats:

  • Malicious nodes. A node that tries to read files outside its workspace, access other nodes’ data, or execute arbitrary system commands.
  • Data exfiltration. A node that tries to send data to an external server, either through network requests or DNS tunneling.
  • Resource exhaustion. A node that consumes unbounded CPU, memory, or disk — intentionally or through a bug — and starves other nodes or the host.

Each node runs inside multiple isolation boundaries. The layers stack:

┌────────────────────────────────────────────┐
│ Docker Container                           │
│                                            │
│ ┌────────────────────────────────────────┐ │
│ │ DuckDB sandbox (Tier 1)                │ │
│ │ SQL nodes — no FS, no net, no exec     │ │
│ └────────────────────────────────────────┘ │
│                                            │
│ ┌────────────────────────────────────────┐ │
│ │ bubblewrap + nix-shell (Tier 2)        │ │
│ │ CLI nodes — isolated FS, no net (def.) │ │
│ └────────────────────────────────────────┘ │
│                                            │
└────────────────────────────────────────────┘

Docker is the outer boundary. The entire Radhflow runtime runs inside a container. This provides:

  • Filesystem isolation from the host
  • Network namespace separation
  • Resource limits (CPU, memory) via cgroup constraints
  • A reproducible environment regardless of the host OS
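As an illustration of how those cgroup constraints are typically expressed (a hypothetical Compose fragment — the image name, limits, and mount point are assumptions, not Radhflow's actual deployment config):

```yaml
services:
  radhflow:
    image: radhflow/runtime   # hypothetical image name
    mem_limit: 2g             # cgroup memory ceiling for the whole runtime
    cpus: "2.0"               # cgroup CPU quota
    network_mode: bridge      # own network namespace, NAT'd from the host
    volumes:
      - ./pipeline:/work      # only the pipeline directory is visible inside
```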

Node dependencies are declared in Nix expressions. No global PATH. No ambient system packages. Each node gets exactly the tools it declares and nothing else.

{ pkgs }: pkgs.mkShell {
  buildInputs = [ pkgs.jq pkgs.curl pkgs.imagemagick ];
}

This is auditable — you can read the Nix expression to see exactly what a node has access to. No hidden transitive dependencies, no version drift.

CLI nodes run inside bubblewrap, which provides Linux namespace isolation:

Capability   Default
Filesystem   Read-only root, read-write workspace only
Network      None (blocked by default)
Processes    Isolated PID namespace
Users        Unprivileged, mapped UID
IPC          Isolated
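A minimal sketch of how those defaults might translate into a bubblewrap command line. The flags are real `bwrap` options, but this function is illustrative — it is not Radhflow's executor code, and the UID mapping is an assumption:

```python
def bwrap_args(workspace, network=False):
    """Map the sandbox defaults above to bubblewrap flags (illustrative only)."""
    args = [
        "bwrap",
        "--ro-bind", "/", "/",           # read-only root filesystem
        "--bind", workspace, workspace,  # read-write workspace only
        "--unshare-pid",                 # isolated PID namespace
        "--unshare-ipc",                 # isolated IPC
        "--unshare-user",                # unprivileged user namespace...
        "--uid", "1000", "--gid", "1000",  # ...with a mapped UID (assumed value)
    ]
    if not network:
        args.append("--unshare-net")     # no network unless explicitly granted
    return args

print(bwrap_args("/work/nodes/fetch")[-1])  # → --unshare-net
```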

To grant network access, the node spec must explicitly declare it:

sandbox:
  network: true  # default: false

Nodes have no network access unless explicitly granted. When network access is enabled, it applies only to that specific node. Other nodes in the same pipeline remain isolated.

SQL nodes run inside DuckDB’s in-process query engine. No server, no daemon, no open ports.

Capability                     Allowed
Read input tables              Yes
Write output tables            Yes
Filesystem access              No
Network access                 No
Execute commands               No
Access environment variables   No

This is the tightest sandbox — and covers the majority of data transform operations.
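DuckDB exposes settings for exactly this kind of lockdown. The options below are real DuckDB configuration flags; whether Radhflow's executor sets them this way is an assumption:

```sql
SET enable_external_access = false; -- blocks filesystem and network access from SQL
SET lock_configuration = true;      -- prevents queries from re-enabling it
```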

Secrets are stored encrypted in the runtime database (state.db inside .rf/). Nodes access secrets through environment variables that the executor injects at runtime. Secrets never appear in:

  • flow.yaml or node.yaml files
  • Git history
  • Execution logs
  • NDJSON output files

Mark parameters as secrets in node.yaml:

params:
  api_key:
    type: string
    required: true
    secret: true  # stored encrypted, injected as env var
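The injection pattern itself is standard. A sketch of the idea (hypothetical helper, not Radhflow's executor; decryption is elided): the secret exists only in the child process's environment, never on the command line or in any file:

```python
import os
import subprocess
import sys

def run_with_secret(cmd, secrets):
    """Spawn a node process with decrypted secrets injected as env vars
    (illustrative sketch; secrets never touch argv or config files)."""
    env = dict(os.environ)
    env.update({k.upper(): v for k, v in secrets.items()})
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

# The child reads the secret from its environment, nowhere else.
result = run_with_secret(
    [sys.executable, "-c", "import os; print(os.environ['API_KEY'])"],
    {"api_key": "s3cret"},
)
print(result.stdout.strip())  # → s3cret
```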

Nodes see a restricted filesystem:

Path                             Access                  Contents
Node workspace (nodes/<slug>/)   Read-write              Implementation files, artifacts
Pipeline root                    Read-only               flow.yaml, other node directories
.rf/                             Write (executor only)   State database, run logs
System paths                     None                    Blocked by bubblewrap

Nodes cannot read other nodes’ artifacts or write outside their workspace. The executor moves data between nodes — nodes never access each other directly.
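A containment check like this is straightforward to express. The sketch below (hypothetical, not Radhflow's actual validation code, and ignoring symlink subtleties) rejects any path that resolves outside the node workspace:

```python
from pathlib import Path

def in_workspace(path, workspace):
    """Return True only if `path` resolves inside `workspace`
    (illustrative check; real enforcement is done by bubblewrap mounts)."""
    try:
        Path(path).resolve().relative_to(Path(workspace).resolve())
        return True
    except ValueError:
        return False

print(in_workspace("/work/nodes/fetch/out.ndjson", "/work/nodes/fetch"))  # → True
print(in_workspace("/work/nodes/other/data.db", "/work/nodes/fetch"))     # → False
```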

Each node execution has hard limits:

Resource               Default               Configurable
CPU time               30 seconds            Yes, via sandbox.timeout in node.yaml
Memory                 512 MB                Yes, via Docker container limits
Disk writes            Node workspace only   No
Concurrent processes   1 PID namespace       No

If a node exceeds its timeout, the executor kills it and marks the node as failed. The pipeline halts at that node — downstream nodes do not execute.
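Timeout enforcement of this shape can be sketched in a few lines (hypothetical, not the actual executor — which also applies the memory limits via Docker):

```python
import subprocess
import sys

def run_node(cmd, timeout_s=30):
    """Run one node with a hard wall-clock limit (illustrative sketch).
    On timeout the child is killed and the node is marked failed."""
    try:
        subprocess.run(cmd, timeout=timeout_s, check=True)
        return "succeeded"
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return "failed"  # pipeline halts here; downstream nodes do not run

print(run_node([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=1))  # → failed
```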

Every execution is logged:

  • Execution logs. Each rf run records which nodes ran, how long they took, and whether they succeeded or failed. Logs are stored in .rf/runs/.
  • Git history. Every change to flow.yaml, node.yaml, and implementation files is tracked. You can diff what changed between pipeline versions.
  • Schema validation results. The executor logs which schemas were checked and whether they passed.
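Append-only NDJSON is a natural fit for such run records. A sketch of the pattern (the field names and path here are hypothetical, not Radhflow's actual log schema):

```python
import json
import time

def log_node_run(log_path, node, status, duration_s):
    """Append one NDJSON record per node execution (hypothetical schema)."""
    record = {
        "node": node,
        "status": status,            # "succeeded" or "failed"
        "duration_s": duration_s,
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    with open(log_path, "a") as f:    # append-only: one JSON object per line
        f.write(json.dumps(record) + "\n")

log_node_run("/tmp/run.ndjson", "fetch", "succeeded", 1.2)
```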