Sandbox: Subprocess-Isolated Code Execution

The agent’s run_python_analysis() tool runs LLM-generated Python in a subprocess isolated at the OS level. This page documents what the sandbox protects against, what it deliberately does not protect against, and how to tune it.

Threat Model

The sandbox is built for one specific risk: a hijacked agent prompt emitting code that exfiltrates secrets or escapes the agent’s narrow output zone. Concretely the threat actor is the LLM itself (or a user prompt-injecting it), not a remote network attacker.

In-scope threats

  • Reading API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) from os.environ.

  • Reading raw study data (data/raw/) or staged data (tmp/) outside the PHI-scrubbed trio bundle.

  • Writing files anywhere outside output/{STUDY}/agent/analysis/.

  • Exhausting host resources via infinite loops, fork bombs, or huge memory allocations.

  • Opening network sockets to exfiltrate data to a remote host.

  • Bypassing the in-process AST guard via novel CPython gadgets.

Out-of-scope (explicit non-goals)

  • Defending against an attacker with shell access on the host.

  • Defending against malicious dependencies (pandas, numpy, etc.) — those are trusted and pinned in pyproject.toml.

  • Side-channel attacks on the same machine.

  • Protecting the user’s own DataFrame contents from the user — the sandbox is a guard between the LLM and the host, not between the user and their own data.

Architecture

The package scripts/ai_assistant/sandbox/ has three components:

__init__.py — the orchestrator

Public API run_in_subprocess(code, ...). Builds a clean child environment (no *_API_KEY, no PYTHONPATH from the parent), spawns the child via subprocess.run() with a wall-clock timeout, applies a preexec_fn from limits.py, parses the child’s JSON manifest, and validates that every figure / code path the manifest claims is actually inside output_dir (defense against a malicious child fabricating manifest paths).
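That manifest-path validation can be sketched roughly as follows (the helper name and manifest keys here are assumptions, not the actual API; Path.is_relative_to requires Python 3.9+):

```python
from pathlib import Path

def validate_manifest_paths(manifest: dict, output_dir: str) -> None:
    """Reject any path the child's manifest claims that escapes output_dir."""
    root = Path(output_dir).resolve()
    for key in ("figures", "code_files"):  # illustrative manifest keys
        for raw in manifest.get(key, []):
            # Resolve symlinks and ".." segments before comparing.
            p = Path(raw).resolve()
            if not p.is_relative_to(root):
                raise ValueError(f"manifest path escapes output_dir: {raw}")
```

Resolving before comparison is the important part: a naive string-prefix check would accept `output/study/../../etc/passwd`.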

runner.py — the child entry

Invoked as python -m scripts.ai_assistant.sandbox.runner <spec>. Carries the in-process AST/runtime guards (import allow-list, blocked-builtin call check, dunder filter on getattr / vars, zone-guarded open) — these run inside the subprocess, so even if process-level isolation has a flaw, the AST guards still bound what the code can do. Loads pre-approved trio DataFrames from explicit paths in the spec (the runner does not import the project config module — its read/write zones come solely from the spec).
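The import allow-list portion of the AST guard can be sketched like this (the module set and function name are illustrative, not the actual _ALLOWED_IMPORTS contents):

```python
import ast

# Illustrative allow-list; the real _ALLOWED_IMPORTS lives in runner.py.
ALLOWED = {"pandas", "numpy", "math", "json", "matplotlib", "plotly"}

def check_imports(code: str) -> None:
    """Walk the AST and reject any import whose top-level module is not allowed."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None; treat them as disallowed.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] not in ALLOWED:
                raise PermissionError(f"import of {name!r} is not allowed")
```

Checking only the top-level package (`name.split(".")[0]`) means allowing `matplotlib` also allows `matplotlib.pyplot`, which is the usual allow-list granularity.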

limits.py — cross-platform rlimits

make_preexec_fn sets RLIMIT_CPU (CPU time), RLIMIT_NOFILE (file descriptors), RLIMIT_AS (address space, Linux), and RLIMIT_NPROC (process count, Linux) before the child program is exec’d.
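A minimal sketch of that pattern, with illustrative parameter names (the real signature lives in limits.py):

```python
import resource
import sys

def make_preexec_fn(cpu_s: int, max_files: int, mem_mb: int, max_procs: int):
    """Return a callable that subprocess runs in the child between fork() and exec()."""
    def preexec() -> None:
        # Hard and soft limits set to the same value: the child cannot raise them.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_s, cpu_s))
        resource.setrlimit(resource.RLIMIT_NOFILE, (max_files, max_files))
        if sys.platform == "linux":
            # Address-space and process-count caps only enforce reliably on Linux.
            mem = mem_mb * 1024 * 1024
            resource.setrlimit(resource.RLIMIT_AS, (mem, mem))
            resource.setrlimit(resource.RLIMIT_NPROC, (max_procs, max_procs))
    return preexec
```

The returned closure is passed as `preexec_fn=` to subprocess, so the limits apply to the child only, never to the parent agent process.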

Defense in Depth

Two layers, deliberately redundant:

  1. Subprocess isolation. Clean env (no API keys), clean cwd, clean PYTHONPATH except for what the runner needs to import, OS-enforced resource limits, separate process boundary.

  2. AST + runtime guards inside the child. Import allow-list, blocked-builtin AST check (eval, compile, __import__, etc.), dunder filter on getattr/vars, zone-guarded builtins.open confined to output_dir for writes and the pre-loaded JSONL set + output_dir for reads.

Either layer alone would block the headline threats. Together they mean an attacker needs a CVE-class CPython escape to do real damage.
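The zone-guarded open in layer 2 can be sketched as follows (the function name and exact zone semantics are assumptions; the real guard is installed over builtins.open inside the child):

```python
import builtins
from pathlib import Path

_real_open = builtins.open

def make_guarded_open(output_dir: str, readable: set[str]):
    """Return an open() replacement: writes only under output_dir, reads only
    from output_dir or the pre-approved read set (e.g. the trio JSONL files)."""
    out = Path(output_dir).resolve()
    allowed_reads = {Path(p).resolve() for p in readable}

    def guarded_open(file, mode="r", *args, **kwargs):
        p = Path(file).resolve()
        writing = any(c in mode for c in "wax+")
        if writing and not p.is_relative_to(out):
            raise PermissionError(f"write outside output_dir: {file}")
        if not writing and not (p.is_relative_to(out) or p in allowed_reads):
            raise PermissionError(f"read outside allowed zones: {file}")
        return _real_open(file, mode, *args, **kwargs)

    return guarded_open
```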

The macOS Asymmetry

This is the load-bearing caveat:

  • Linux is the production deployment target. RLIMIT_AS, RLIMIT_NPROC, RLIMIT_CPU, and RLIMIT_NOFILE all enforce reliably. A 2 GB memory allocation hits RLIMIT_AS and the child dies. A fork bomb hits RLIMIT_NPROC. A CPU spin loop hits RLIMIT_CPU and the child receives SIGXCPU.

  • macOS is the developer environment. RLIMIT_CPU and RLIMIT_NOFILE work. RLIMIT_DATA is set on a best-effort basis but not strictly honored. RLIMIT_AS and RLIMIT_NPROC are effectively no-ops on Darwin, and we do not pretend otherwise.

The CI pipeline runs on Ubuntu and exercises every test in tests/security/test_sandbox_isolation.py, including the three @pytest.mark.skipif(sys.platform != "linux", ...) cases. On macOS the same tests skip with a clear marker. If you change the sandbox, run the suite locally, then verify the Linux-only tests pass on CI before merging.

Configurable Knobs

Operational tunables live in config.py and are env-overridable. They are safe in the sense that lowering any of them only tightens the security envelope — none of these can weaken the trust boundary.

Setting                 Default    Effect
ANALYSIS_TIMEOUT        300 s      Wall-clock kill at this many seconds.
ANALYSIS_MAX_OUTPUT     200_000    Cap on captured stdout (bytes).
ANALYSIS_MAX_FIGURES    20         Cap on collected figures per run.
SANDBOX_MAX_MEMORY_MB   512        RLIMIT_AS cap on Linux.
SANDBOX_MAX_PROCS       64         RLIMIT_NPROC cap on Linux.
SANDBOX_MAX_FILES       64         RLIMIT_NOFILE cap.
SANDBOX_PERSIST_CODE    true       Save executed code as .py.

What is not configurable from config.py (intentional):

  • The import allow-list (_ALLOWED_IMPORTS in runner.py).

  • The blocked builtins list (_BLOCKED_BUILTINS).

  • The dunder filter list (_BLOCKED_DUNDERS).

  • The env-var blocklist prefixes (_BLOCKED_PREFIXES in __init__.py).

  • The env-var allow-list (_SAFE_ENV_KEYS).

Adding to any of those is a security-relevant change and must be a code change reviewed in a PR — not a config flip.

Code Persistence and Replication

When SANDBOX_PERSIST_CODE is true (default), every successful sandbox run also saves the executed code as a .py file under output/{STUDY}/agent/analysis/code/run_<ISO_TIMESTAMP>_<UUID>.py. The file leads with a docstring header listing the pre-loaded DataFrames and pointing the user at the replication helper:

python -m scripts.ai_assistant.sandbox.replicate \
    output/{STUDY}/agent/analysis/code/run_2026-04-27T01-23-45Z_a1b2c3d4.py
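The persistence step can be sketched roughly as follows (the helper name and header wording are assumptions; the filename scheme follows the description above):

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path

def persist_code(code: str, code_dir: str, dataframes: list[str]) -> Path:
    """Save executed code with a docstring header naming the pre-loaded DataFrames."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    path = Path(code_dir) / f"run_{stamp}_{uuid.uuid4().hex[:8]}.py"
    header = (
        '"""Sandbox run (auto-saved).\n\n'
        f"Pre-loaded DataFrames: {', '.join(dataframes)}\n"
        "Replicate with:\n"
        f"    python -m scripts.ai_assistant.sandbox.replicate {path}\n"
        '"""\n\n'
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(header + code)
    return path
```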

The replication helper applies the same AST allow-list (defense in depth on locally re-run code) and then executes the code in the caller’s current Python process — so the user can see output unfiltered, write files to their working directory, and interact with figures normally.

The Streamlit UI surfaces saved code through a new <RPLN_CODE:...> marker rendered as a collapsible code block plus a download button — the user can copy the source from the rendered block or download the .py file directly.

Where the Code is Not Saved

The agent’s pre-execution rejections (AST guard, blocked import, syntax error) do not produce a saved file — there’s no useful code to replicate. Same for runtime errors: the file is only written after a successful run.

Tests

tests/security/test_sandbox_isolation.py covers the three contracts:

  • Confidentiality — env-var leak, blocked-prefix sweep, read-zone enforcement.

  • Integrity — write-zone strictness, manifest-path traversal rejection, AST guard preservation.

  • Availability — wall-clock timeout, RLIMIT_CPU / RLIMIT_AS / RLIMIT_NPROC (Linux), network-import blocked.

Plus legitimate-use tests proving the sandbox still does its day job (pandas group-by, plotly JSON write, matplotlib PNG save) and the new code-persistence tests (file written, marker emitted, header includes the DataFrame names, persistence togglable).

Run them with:

uv run pytest tests/security/test_sandbox_isolation.py -v

Total runtime is ~75 s on macOS (subprocess startup is the bottleneck).

Future Work

Out of scope for the current runtime; tracked for later:

  • Convert trio JSONL → parquet so the child loads DataFrames with mmap instead of re-parsing JSON on every call (~80 % of the per-call overhead).

  • Add seccomp-bpf syscall filtering on Linux for stronger network-egress denial than the import allow-list alone.

  • Add an opt-in nsjail/Docker profile for high-assurance deployments where even a CVE-class CPython escape is in scope.

  • Add code-retention auto-cleanup based on SANDBOX_CODE_RETENTION_DAYS (currently kept indefinitely).

When You Touch This Code

  • Adding a new allowed import is a security change. Open a PR, document why the new module is safe to expose to LLM-generated code, and add a regression test that exercises it through the sandbox.

  • Loosening any of the rlimits is a security change. Document the rationale in the PR description and the IRB conformance matrix if the change affects the agent boundary’s posture.

  • Changing the env allow-list is a security change. The default list (PATH, LANG, LC_ALL, TZ, PYTHONPATH) is the minimum the child needs to import its dependencies; adding anything risks leaking parent state into the child.

  • Tweaking timeouts or memory caps is operational, not security: config.py is the right place. New env-var knobs go through _get_env_int so the env layer behaves identically to the YAML overlay.