06 · OpenShell — The Sandbox Runtime (deep dive)

What lives in OpenShell/: NVIDIA's general‑purpose "safe, private runtime for autonomous AI agents." It knows nothing about OpenClaw. It isolates any agent binary (Claude Code, Codex, OpenCode, Copilot, Ollama, and — via NemoClaw — OpenClaw) behind a declarative YAML policy.

What it actually is

OpenShell is a CLI + control plane (a Rust/Python project). On first use it spins up a gateway (literally: a K3s Kubernetes cluster wrapped inside a single Docker container) and then lets you create sandboxes — isolated containers inside that cluster where an agent runs. Every byte of egress from a sandbox passes through a policy engine the gateway owns.

┌─────────────────────── host ───────────────────────┐
│                                                    │
│   openshell CLI  ────►  OpenShell gateway          │
│                         (K3s-in-Docker cluster)    │
│                         ├── policy engine          │
│                         ├── credential store       │
│                         ├── inference proxy        │
│                         │                          │
│                         └── sandbox: agent + L7    │
│                             proxy + namespace +    │
│                             Landlock + seccomp     │
│                                                    │
└────────────────────────────────────────────────────┘

The four building blocks

1. Gateway

The control plane. Provisions the cluster on first use, talks to sandboxes, stores credentials, proxies inference, enforces policy. Created once; shared by all sandboxes on that host. A remote gateway can run on a different host (--remote user@host).

2. Sandbox

An isolated container with its own process, filesystem (mount), and network namespaces. You create one per agent:

openshell sandbox create -- claude           # base image + Claude Code
openshell sandbox create --from openclaw     # community image + OpenClaw
openshell sandbox create --from ./my-dir     # local Dockerfile
openshell sandbox create --gpu --from ...    # GPU passthrough

Each sandbox is deny‑by‑default for egress. A fresh sandbox cannot reach api.github.com until a policy opens it.

3. Policy Engine — four layers, two lifecycles

| Layer      | Controls                                  | Lifecycle                                  |
|------------|-------------------------------------------|--------------------------------------------|
| Filesystem | Reads/writes outside allowed paths        | Locked at sandbox creation (Landlock)      |
| Process    | Privilege escalation, dangerous syscalls  | Locked at sandbox creation (seccomp)       |
| Network    | Outbound connections, HTTP methods, paths | Hot‑reloadable via `openshell policy set`  |
| Inference  | Model API calls → controlled backends     | Hot‑reloadable                             |

Static layers (filesystem, process) need a new sandbox to change. Dynamic layers (network, inference) you can swap while the sandbox is running.
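The two lifecycles can be modelled in a few lines. This is a conceptual sketch only, not OpenShell code: the class names `StaticLayers` and `Sandbox`, the field names, and the `recreate_with` helper are all hypothetical. It shows the one property that matters: dynamic layers are replaced in place, while static layers can only change by building a new sandbox.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticLayers:
    """Filesystem and process rules, fixed when the sandbox is created."""
    allowed_paths: tuple
    blocked_syscalls: tuple


class Sandbox:
    """Hypothetical model of a sandbox with static and dynamic policy layers."""

    def __init__(self, name, static, network_policy=None):
        self.name = name
        self._static = static
        self.network_policy = dict(network_policy or {})

    @property
    def static(self):
        # Read-only on purpose: there is no setter, so assignment raises.
        return self._static

    def set_network_policy(self, policy):
        # Dynamic layer: swapped while the sandbox keeps running.
        self.network_policy = dict(policy)

    def recreate_with(self, static):
        # Static layer: changing it means creating a new sandbox object.
        return Sandbox(self.name, static, self.network_policy)
```

In this model, `sb.set_network_policy(...)` succeeds on a live sandbox, while `sb.static = ...` raises `AttributeError`; the only route to different filesystem or process rules is `sb.recreate_with(...)`.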

4. Provider

A named credential bundle. ANTHROPIC_API_KEY, NVIDIA_API_KEY, TELEGRAM_BOT_TOKEN — none of these ever land on the sandbox filesystem. You create a provider on the host:

openshell provider create --type anthropic --from-existing

…and OpenShell injects the credential into sandbox env vars at runtime only. For network egress that needs the real token in a header or URL path, the L7 proxy replaces a placeholder with the real value as the request leaves the sandbox, so on that path the sandbox never sees the secret.
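A minimal sketch of what placeholder substitution at the proxy could look like. The `{{provider:NAME}}` placeholder syntax and the function names are assumptions made for illustration; the source does not say what format OpenShell actually uses. The point is that the secret map lives on the gateway side and is only applied to the outbound request.

```python
import re

# Hypothetical placeholder syntax; the real format is not given in the source.
PLACEHOLDER = re.compile(r"\{\{provider:([A-Z0-9_]+)\}\}")


def substitute(text, secrets):
    """Replace every placeholder in text with the matching real credential."""
    def repl(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no provider credential named {name}")
        return secrets[name]
    return PLACEHOLDER.sub(repl, text)


def rewrite_outbound(url, headers, secrets):
    """Apply substitution to the URL and to every header value."""
    new_headers = {key: substitute(value, secrets) for key, value in headers.items()}
    return substitute(url, secrets), new_headers
```

For example, a sandboxed process could send a request to `https://api.telegram.org/bot{{provider:TELEGRAM_BOT_TOKEN}}/sendMessage`, and only the rewritten URL that leaves the proxy would contain the real token.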

Inference routing — the most important trick

An agent inside a sandbox thinks it is calling a normal OpenAI/Anthropic/NVIDIA endpoint. Actually:

1. The agent hits https://inference.local (or an equivalent interposed DNS name).
2. The sandbox's L7 proxy catches it.
3. The OpenShell gateway strips the caller's "credential" (which is a placeholder), injects the real backend credential, and forwards upstream to the actual provider you configured.
4. The response streams back through the proxy.

This means: to switch the agent's backing model, you don't change the agent — you run openshell inference set --provider <p> --model <m>.
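The credential swap can be sketched as a pure function over a request. Everything named here is hypothetical: `InferenceRoute`, `route_inference`, and the `upstream.example` host are illustrative, and rewriting the `model` field in the request body is an assumption about how the gateway pins the model, not something the source states.

```python
from dataclasses import dataclass, field

# Header names that may carry the agent's placeholder credential.
CREDENTIAL_HEADERS = {"authorization", "x-api-key"}


@dataclass
class InferenceRoute:
    """Gateway-side routing state, as set by the operator (hypothetical shape)."""
    upstream_base: str
    model: str
    auth_headers: dict = field(default_factory=dict)


def route_inference(path, headers, body, route):
    """Return the (url, headers, body) that would be forwarded upstream."""
    # Drop whatever credential the agent sent; it is only a placeholder.
    clean = {k: v for k, v in headers.items() if k.lower() not in CREDENTIAL_HEADERS}
    # Inject the real backend credential held by the gateway.
    clean.update(route.auth_headers)
    # Assumption: the gateway overrides the requested model with the routed one.
    new_body = {**body, "model": route.model}
    return route.upstream_base.rstrip("/") + path, clean, new_body
```

Under this model, switching the backing model means replacing the `InferenceRoute` on the gateway; the agent keeps sending the same request to the same local name.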

Network policy, concretely

A policy is a YAML file. Core shape:

network_policies:
  claude_code:                     # policy name
    name: claude_code
    endpoints:
      - host: api.anthropic.com
        port: 443
        protocol: rest
        enforcement: enforce
        tls: terminate              # OpenShell MITMs TLS to see methods/paths
        rules:
          - allow: { method: POST, path: "/v1/messages" }
          - allow: { method: GET,  path: "/v1/messages/batches/**" }
    binaries:
      - { path: /usr/local/bin/claude }   # only this binary may use it

Three things make this powerful:

tls: terminate — OpenShell terminates TLS inside the sandbox and re‑encrypts outbound so it can inspect HTTP method and path, not just host.
Binary pinning — a policy block only applies to listed binaries. A compromised tool that isn't on the list cannot ride a neighbor's allowlist.
Hot reload — openshell policy set <name> --policy file.yaml --wait swaps the running policy without restarting the sandbox.

Every request ends in one of three verdicts:

Allow: host, port, method, path, and calling binary all match a policy entry.
Route for inference: credential swap + forward.
Deny: blocked and logged (the operator can approve it in openshell term).
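The decision logic can be sketched as a small evaluator over the policy shown earlier, here written as a Python dict that mirrors the YAML. This is an illustration, not OpenShell's implementation: the `evaluate` function and `Verdict` enum are hypothetical, the `**` glob is approximated with `fnmatch`, and treating every request to `inference.local` as an inference route regardless of binary is an assumption.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Verdict(Enum):
    ALLOW = "allow"
    ROUTE = "route_for_inference"
    DENY = "deny"


POLICY = {
    "network_policies": {
        "claude_code": {
            "name": "claude_code",
            "endpoints": [{
                "host": "api.anthropic.com",
                "port": 443,
                "rules": [
                    {"allow": {"method": "POST", "path": "/v1/messages"}},
                    {"allow": {"method": "GET", "path": "/v1/messages/batches/**"}},
                ],
            }],
            "binaries": [{"path": "/usr/local/bin/claude"}],
        }
    }
}

INFERENCE_HOST = "inference.local"


def evaluate(policy, binary, host, port, method, path):
    """Classify one outbound request; anything unmatched is denied."""
    if host == INFERENCE_HOST:
        return Verdict.ROUTE
    for block in policy["network_policies"].values():
        # Binary pinning: skip blocks that do not list the calling binary.
        if binary not in {b["path"] for b in block["binaries"]}:
            continue
        for endpoint in block["endpoints"]:
            if endpoint["host"] != host or endpoint["port"] != port:
                continue
            for rule in endpoint["rules"]:
                allow = rule["allow"]
                if allow["method"] == method and fnmatchcase(path, allow["path"]):
                    return Verdict.ALLOW
    return Verdict.DENY  # deny-by-default
```

With this policy, a `POST /v1/messages` from `/usr/local/bin/claude` is allowed, while the identical request from `/usr/bin/curl` is denied because that binary is not pinned to the block.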

The commands that matter

openshell sandbox create --from openclaw         # create
openshell sandbox connect [name]                 # SSH into it
openshell sandbox list
openshell policy set <name> --policy file.yaml   # hot-reload network policy
openshell policy get <name>
openshell inference set --provider <p> --model <m>
openshell provider create --type <t> --from-existing
openshell logs [name] --tail
openshell term                                   # k9s-like TUI — the approval UI

openshell term is where you watch live traffic and approve or deny blocked egress in real time. Approvals persist for the session only; they are not written to the baseline policy file.
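One way to picture session-scoped approvals is as an overlay on top of an immutable baseline. The sketch below is a conceptual model under that assumption; the `EgressPolicy` class and its methods are hypothetical and say nothing about how OpenShell stores approvals internally.

```python
class EgressPolicy:
    """Baseline allowlist plus approvals that live only for one session."""

    def __init__(self, baseline):
        self.baseline = frozenset(baseline)   # from the policy file, never mutated
        self.session_approvals = set()        # granted interactively by the operator
        self.pending = []                     # denied requests awaiting a decision

    def check(self, host, port):
        key = (host, port)
        if key in self.baseline or key in self.session_approvals:
            return True
        if key not in self.pending:
            self.pending.append(key)          # surfaced to the operator for review
        return False

    def approve(self, host, port):
        key = (host, port)
        self.session_approvals.add(key)
        if key in self.pending:
            self.pending.remove(key)

    def end_session(self):
        # Approvals vanish; only the baseline survives.
        self.session_approvals.clear()
        self.pending.clear()
```

To make an approval permanent in this model, the operator would have to add the endpoint to the baseline policy file and reload it.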

What OpenShell does not do

It does not know which agent is "good" or "bad" — it just runs whichever binary you tell it to in a box.
It does not write your policy for you (though there is a generate-sandbox-policy agent skill).
It does not manage OpenClaw specifics: channel tokens, onboarding wizards, workspace migration, model pinning. That is NemoClaw's job.

Next: 07-openclaw.md — the assistant that actually runs inside the sandbox.