Sandbox: Subprocess-Isolated Code Execution

The agent’s run_python_analysis() tool runs LLM-generated Python in a subprocess isolated at the OS level. This page documents what the sandbox protects against, what it deliberately does not protect against, and how to tune it.

Threat Model

The sandbox is built for one specific risk: a hijacked agent prompt emitting code that exfiltrates secrets or escapes the agent’s narrow output zone. Concretely the threat actor is the LLM itself (or a user prompt-injecting it), not a remote network attacker.

In-scope threats

  • Reading API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) from os.environ.

  • Reading raw study data (data/raw/) or staged data (tmp/) outside the PHI-scrubbed trio bundle.

  • Writing files anywhere outside output/{STUDY}/agent/analysis/.

  • Exhausting host resources via infinite loops, fork bombs, or huge memory allocations.

  • Opening network sockets to exfiltrate data to a remote host.

  • Bypassing the in-process AST guard via novel CPython gadgets.

Out-of-scope (explicit non-goals)

  • Defending against an attacker with shell access on the host.

  • Defending against malicious dependencies (pandas, numpy, etc.) — those are trusted and pinned in pyproject.toml.

  • Side-channel attacks on the same machine.

  • Protecting the user’s own DataFrame contents from the user — the sandbox is a guard between the LLM and the host, not between the user and their own data.

Architecture

The package scripts/ai_assistant/sandbox/ has three components:

__init__.py — the orchestrator

Public API run_in_subprocess(code, ...). Builds a clean child environment (no *_API_KEY, no PYTHONPATH from the parent), spawns the child via subprocess.run() with a wall-clock timeout, applies a preexec_fn from limits.py, parses the child’s JSON manifest, and validates that every figure / code path the manifest claims is actually inside output_dir (defense against a malicious child fabricating manifest paths).
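That manifest-path validation can be sketched roughly as follows (the helper name and manifest keys here are assumptions, not the actual API; Path.is_relative_to requires Python 3.9+):

```python
from pathlib import Path

def validate_manifest_paths(manifest: dict, output_dir: str) -> None:
    """Reject any path the child's manifest claims that escapes output_dir."""
    root = Path(output_dir).resolve()
    for key in ("figures", "code_files"):  # illustrative manifest keys
        for raw in manifest.get(key, []):
            # Resolve symlinks and ".." segments before comparing.
            p = Path(raw).resolve()
            if not p.is_relative_to(root):
                raise ValueError(f"manifest path escapes output_dir: {raw}")
```

Resolving before comparison is the important part: a naive string-prefix check would accept `output/study/../../etc/passwd`.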

runner.py — the child entry

Invoked as python -m scripts.ai_assistant.sandbox.runner <spec>. Carries the in-process AST/runtime guards (import allow-list, blocked-builtin call check, dunder filter on getattr / vars, zone-guarded open) — these run inside the subprocess, so even if process-level isolation has a flaw, the AST guards still bound what the code can do. Loads pre-approved trio DataFrames from explicit paths in the spec (the runner does not import the project config module — its read/write zones come solely from the spec).
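The import allow-list portion of the AST guard can be sketched like this (the module set and function name are illustrative, not the actual _ALLOWED_IMPORTS contents):

```python
import ast

# Illustrative allow-list; the real _ALLOWED_IMPORTS lives in runner.py.
ALLOWED = {"pandas", "numpy", "math", "json", "matplotlib", "plotly"}

def check_imports(code: str) -> None:
    """Walk the AST and reject any import whose top-level module is not allowed."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None; treat them as disallowed.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] not in ALLOWED:
                raise PermissionError(f"import of {name!r} is not allowed")
```

Checking only the top-level package (`name.split(".")[0]`) means allowing `matplotlib` also allows `matplotlib.pyplot`, which is the usual allow-list granularity.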

limits.py — cross-platform rlimits

make_preexec_fn sets RLIMIT_CPU (CPU time), RLIMIT_NOFILE (file descriptors), RLIMIT_AS (address space, Linux), and RLIMIT_NPROC (process count, Linux) before the child program is exec’d.
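A minimal sketch of that pattern, with illustrative parameter names (the real signature lives in limits.py):

```python
import resource
import sys

def make_preexec_fn(cpu_s: int, max_files: int, mem_mb: int, max_procs: int):
    """Return a callable that subprocess runs in the child between fork() and exec()."""
    def preexec() -> None:
        # Hard and soft limits set to the same value: the child cannot raise them.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_s, cpu_s))
        resource.setrlimit(resource.RLIMIT_NOFILE, (max_files, max_files))
        if sys.platform == "linux":
            # Address-space and process-count caps only enforce reliably on Linux.
            mem = mem_mb * 1024 * 1024
            resource.setrlimit(resource.RLIMIT_AS, (mem, mem))
            resource.setrlimit(resource.RLIMIT_NPROC, (max_procs, max_procs))
    return preexec
```

The returned closure is passed as `preexec_fn=` to subprocess, so the limits apply to the child only, never to the parent agent process.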

Defense in Depth

Two layers, deliberately redundant:

  1. Subprocess isolation. Clean env (no API keys), clean cwd, clean PYTHONPATH except for what the runner needs to import, OS-enforced resource limits, separate process boundary.

  2. AST + runtime guards inside the child. Import allow-list, blocked-builtin AST check (eval, compile, __import__, etc.), dunder filter on getattr/vars, zone-guarded builtins.open confined to output_dir for writes and the pre-loaded JSONL set + output_dir for reads.

Either layer alone would block the headline threats. Together they mean an attacker needs a CVE-class CPython escape to do real damage.
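The zone-guarded open in layer 2 can be sketched as follows (the function name and exact zone semantics are assumptions; the real guard is installed over builtins.open inside the child):

```python
import builtins
from pathlib import Path

_real_open = builtins.open

def make_guarded_open(output_dir: str, readable: set[str]):
    """Return an open() replacement: writes only under output_dir, reads only
    from output_dir or the pre-approved read set (e.g. the trio JSONL files)."""
    out = Path(output_dir).resolve()
    allowed_reads = {Path(p).resolve() for p in readable}

    def guarded_open(file, mode="r", *args, **kwargs):
        p = Path(file).resolve()
        writing = any(c in mode for c in "wax+")
        if writing and not p.is_relative_to(out):
            raise PermissionError(f"write outside output_dir: {file}")
        if not writing and not (p.is_relative_to(out) or p in allowed_reads):
            raise PermissionError(f"read outside allowed zones: {file}")
        return _real_open(file, mode, *args, **kwargs)

    return guarded_open
```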

The macOS Asymmetry

This is the load-bearing caveat:

  • Linux is the production deployment target. RLIMIT_AS, RLIMIT_NPROC, RLIMIT_CPU, and RLIMIT_NOFILE all enforce reliably. A 2 GB memory allocation hits RLIMIT_AS and the child dies. A fork bomb hits RLIMIT_NPROC. A CPU spin loop hits RLIMIT_CPU and the child receives SIGXCPU.

  • macOS is the developer environment. RLIMIT_CPU and RLIMIT_NOFILE work. RLIMIT_DATA is set on a best-effort basis but not strictly honored. RLIMIT_AS and RLIMIT_NPROC are effectively no-ops on Darwin, and we do not pretend otherwise.

The CI pipeline runs on Ubuntu and exercises every test in tests/security/test_sandbox_isolation.py, including the three @pytest.mark.skipif(sys.platform != "linux", ...) cases. On macOS the same tests skip with a clear marker. If you change the sandbox, run the suite locally, then verify the Linux-only tests pass on CI before merging.

Configurable Knobs

Operational tunables live in config.py and are env-overridable. They are safe in the sense that lowering any of them only tightens the security envelope — none of these can weaken the trust boundary.

Setting                 Default    Effect
ANALYSIS_TIMEOUT        300 s      Wall-clock kill at this many seconds.
ANALYSIS_MAX_OUTPUT     200_000    Cap on captured stdout (bytes).
ANALYSIS_MAX_FIGURES    20         Cap on collected figures per run.
SANDBOX_MAX_MEMORY_MB   512        RLIMIT_AS cap on Linux.
SANDBOX_MAX_PROCS       64         RLIMIT_NPROC cap on Linux.
SANDBOX_MAX_FILES       64         RLIMIT_NOFILE cap.
SANDBOX_PERSIST_CODE    true       Save executed code as .py.

What is not configurable from config.py (intentional):

  • The import allow-list (_ALLOWED_IMPORTS in runner.py).

  • The blocked builtins list (_BLOCKED_BUILTINS).

  • The dunder filter list (_BLOCKED_DUNDERS).

  • The env-var blocklist prefixes (_BLOCKED_PREFIXES in __init__.py).

  • The env-var allow-list (_SAFE_ENV_KEYS).

Adding to any of those is a security-relevant change and must be a code change reviewed in a PR — not a config flip.

Code Persistence and Replication

When SANDBOX_PERSIST_CODE is true (default), every successful sandbox run also saves the executed code as a .py file under output/{STUDY}/agent/analysis/code/run_<ISO_TIMESTAMP>_<UUID>.py. The file leads with a docstring header listing the pre-loaded DataFrames and pointing the user at the replication helper:

python -m scripts.ai_assistant.sandbox.replicate \
    output/{STUDY}/agent/analysis/code/run_2026-04-27T01-23-45Z_a1b2c3d4.py
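The persistence step can be sketched roughly as follows (the helper name and header wording are assumptions; the filename scheme follows the description above):

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path

def persist_code(code: str, code_dir: str, dataframes: list[str]) -> Path:
    """Save executed code with a docstring header naming the pre-loaded DataFrames."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    path = Path(code_dir) / f"run_{stamp}_{uuid.uuid4().hex[:8]}.py"
    header = (
        '"""Sandbox run (auto-saved).\n\n'
        f"Pre-loaded DataFrames: {', '.join(dataframes)}\n"
        "Replicate with:\n"
        f"    python -m scripts.ai_assistant.sandbox.replicate {path}\n"
        '"""\n\n'
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(header + code)
    return path
```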

The replication helper applies the same AST allow-list (defense in depth on locally re-run code) and then executes the code in the caller’s current Python process — so the user can see output unfiltered, write files to their working directory, and interact with figures normally.

The Streamlit UI surfaces saved code through a new <RPLN_CODE:...> marker rendered as a collapsible code block plus a download button — the user can copy the source from the rendered block or download the .py file directly.

Where the Code is Not Saved

The agent’s pre-execution rejections (AST guard, blocked import, syntax error) do not produce a saved file — there’s no useful code to replicate. Same for runtime errors: the file is only written after a successful run.

Tests

tests/security/test_sandbox_isolation.py covers the three contracts:

  • Confidentiality — env-var leak, blocked-prefix sweep, read-zone enforcement.

  • Integrity — write-zone strictness, manifest-path traversal rejection, AST guard preservation.

  • Availability — wall-clock timeout, RLIMIT_CPU / RLIMIT_AS / RLIMIT_NPROC (Linux), network-import blocked.

Plus legitimate-use tests proving the sandbox still does its day job (pandas group-by, plotly JSON write, matplotlib PNG save) and the new code-persistence tests (file written, marker emitted, header includes the DataFrame names, persistence togglable).

Run them with:

uv run pytest tests/security/test_sandbox_isolation.py -v

Total runtime is ~75 s on macOS (subprocess startup is the bottleneck).

Future Work

Out of scope for the current runtime; tracked for later:

  • Convert trio JSONL → parquet so the child loads DataFrames with mmap instead of re-parsing JSON on every call (~80 % of the per-call overhead).

  • Add seccomp-bpf syscall filtering on Linux for stronger network-egress denial than the import allow-list alone.

  • Add an opt-in nsjail/Docker profile for high-assurance deployments where even a CVE-class CPython escape is in scope.

  • Add code-retention auto-cleanup based on SANDBOX_CODE_RETENTION_DAYS (currently kept indefinitely).

When You Touch This Code

  • Adding a new allowed import is a security change. Open a PR, document why the new module is safe to expose to LLM-generated code, and add a regression test that exercises it through the sandbox.

  • Loosening any of the rlimits is a security change. Document the rationale in the PR description and the IRB conformance matrix if the change affects the agent boundary’s posture.

  • Changing the env allow-list is a security change. The default list (PATH, LANG, LC_ALL, TZ, PYTHONPATH) is the minimum the child needs to import its dependencies; adding anything risks leaking parent state into the child.

  • Tweaking timeouts or memory caps is operational, not security: config.py is the right place. New env-var knobs go through _get_env_int so the env layer behaves identically to the YAML overlay.