Performance Results

Reference results from local benchmark artifacts. Guest measurements come from capsem-bench 0.3.0; lifecycle, fork, host-native, Criterion, and VM-originated Security Engine measurements are host-side benchmark artifacts. The current Linux artifact set was refreshed on 2026-05-29 with just benchmark. Numbers vary with host load, network path, and cache state. Performance runs should be recorded with just benchmark so artifacts include architecture, host metadata, git commit, and an optional stable run id.

Total time from VM start to shell ready: ~580ms.

| Stage | Duration | Description |
|---|---|---|
| squashfs | 10ms | Mount compressed rootfs from virtio block device |
| virtiofs | <1ms | Mount VirtioFS shared directory |
| overlayfs | 80ms | Create ext4 loopback overlay (format + mount) |
| workspace | <1ms | Bind-mount /root from VirtioFS |
| network | 210ms | Configure dummy0 and iptables DNS/HTTPS redirect rules |
| dns_proxy | tracked separately | Start UDP/TCP DNS bridge to host vsock:5007 |
| net_proxy | 100ms | Start TCP-to-vsock HTTPS proxy |
| deploy | 10ms | Copy tools from initrd to rootfs |
| venv | 170ms | Create Python virtualenv (via uv) |
| agent_start | <1ms | Launch PTY agent, connect vsock |
| Total | ~580ms | |

The diagnostic suite enforces that boot time stays under 1 second. The two heaviest stages are network setup (iptables rule installation, 210ms) and venv creation (170ms), which together account for about two thirds of the total.
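As a rough check of the numbers above, the stage durations can be summed and compared against the 1 second limit. This is a minimal sketch, not the diagnostic suite's actual code: the names `STAGES_MS`, `BOOT_GATE_MS`, and `boot_summary` are hypothetical, stages reported as "<1ms" are counted as 0, and dns_proxy is left out because it is tracked separately.

```python
# Stage durations in milliseconds, copied from the boot table above.
# Stages reported as "<1ms" are counted as 0; dns_proxy is tracked separately.
STAGES_MS = {
    "squashfs": 10,
    "virtiofs": 0,
    "overlayfs": 80,
    "workspace": 0,
    "network": 210,
    "net_proxy": 100,
    "deploy": 10,
    "venv": 170,
    "agent_start": 0,
}

BOOT_GATE_MS = 1000  # the "under 1 second" limit stated above


def boot_summary(stages, gate_ms):
    """Return total boot time, the two slowest stages, and gate status."""
    total = sum(stages.values())
    slowest = sorted(stages, key=stages.get, reverse=True)[:2]
    return {"total_ms": total, "slowest": slowest, "within_gate": total < gate_ms}


summary = boot_summary(STAGES_MS, BOOT_GATE_MS)
print(summary)
```

The sum reproduces the ~580ms total, which leaves roughly 420ms of headroom under the gate.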

Scratch disk performance on the VirtioFS-backed workspace (/root). Test size: 256MB.

| Test | Throughput | IOPS | Duration |
|---|---|---|---|
| Sequential write (1MB blocks) | 156.9 MB/s | - | 1,631.6ms |
| Sequential read (1MB blocks) | 352.8 MB/s | - | 725.5ms |
| Random 4K write (fdatasync) | 10.8 MB/s | 2,777 | 3,601.1ms |
| Random 4K read | 29.1 MB/s | 7,440 | 1,344.2ms |

Sequential I/O reflects the active host filesystem and hypervisor backend. Random write IOPS are limited by the fdatasync issued after every write, so this row is the worst case for database-style workloads.
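To illustrate why a sync after every write dominates the random-write row, the following sketch times random 4K writes with and without a per-write sync. It is an illustration under stated assumptions, not capsem-bench's implementation: `random_write_iops` and its parameters are hypothetical, the file is only 8MB, and results on a developer machine will not match the table.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4K writes, matching the random-write row above


def random_write_iops(path, file_size, writes, sync_each=True):
    """Time `writes` random 4K writes, optionally syncing after each one."""
    # fdatasync is not available on every platform; fall back to fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    payload = os.urandom(BLOCK)
    slots = file_size // BLOCK
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_size)
        start = time.perf_counter()
        for _ in range(writes):
            os.pwrite(fd, payload, random.randrange(slots) * BLOCK)
            if sync_each:
                sync(fd)
        if not sync_each:
            sync(fd)  # one flush at the end for the batched case
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / elapsed


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "scratch.bin")
    synced = random_write_iops(target, 8 * 1024 * 1024, 200, sync_each=True)
    batched = random_write_iops(target, 8 * 1024 * 1024, 200, sync_each=False)
    print(f"per-write sync: {synced:,.0f} IOPS")
    print(f"single sync:    {batched:,.0f} IOPS")
```

On storage with real flush cost, the per-write variant is usually much slower; on memory-backed filesystems the two may be close.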

Read performance on the read-only squashfs rootfs, where binaries and libraries live.

| Test | Detail | Throughput | IOPS | Duration |
|---|---|---|---|---|
| Sequential read (1MB) | Claude binary (228.5MB) | 189.1 MB/s | - | 1,208.6ms |
| Random 4K read | 2,612 files sampled | 6.3 MB/s | 1,620 | 3,086.0ms |
| Large binary cold reads | 3 binaries, 668.8MB total | 188.1 MB/s | - | 3,556.6ms |
| Small JS/package reads | 113 files sampled | 671.0 MB/s | 79,606 ops/s | 62.8ms |
| Metadata stat walk | 6,573 entries | - | 42,384 stats/s | 155.1ms |

Squashfs decompression adds overhead compared to the scratch disk. Random reads across many small files show the cost of decompression + inode lookup on a compressed filesystem.

Wall-clock time to run <cli> --version with page cache dropped (3 runs, best/mean/worst).

| CLI | Min | Mean | Max |
|---|---|---|---|
| python3 | 31.1ms | 36.6ms | 47.1ms |
| node | 295.7ms | 298.1ms | 299.6ms |
| claude | 1,287.4ms | 1,388.7ms | 1,439.6ms |
| gemini | 2,976.6ms | 3,092.2ms | 3,279.6ms |
| codex | 817.1ms | 835.6ms | 872.5ms |

Python starts in under 50ms and node in about 300ms. The agent CLIs are noticeably slower on a cold page cache: codex takes about 0.8s, claude about 1.4s, and gemini about 3.1s.

50 GET requests to https://www.google.com/ with concurrency 5, routed through the MITM proxy.

| Metric | Value |
|---|---|
| Requests | 50/50 |
| Requests/sec | 61.4 |
| Transfer | 3.8MB |
| Total duration | 814.2ms |

| Latency percentile | Value |
|---|---|
| min | 47.4ms |
| p50 | 54.3ms |
| p95 | 281.5ms |
| p99 | 287.0ms |
| max | 290.0ms |

Latency includes the full path: guest -> net-proxy -> vsock -> host MITM proxy -> TLS termination -> internet -> re-encryption -> response. The tail mostly reflects upstream internet latency and TLS/session setup.

Reference file download through the MITM proxy.

| Metric | Value |
|---|---|
| Downloaded | 9.98MB |
| Duration | 0.532s |
| Throughput | 17.89 MB/s |

This is the sustained bandwidth ceiling for the proxy pipeline (TLS termination + body inspection + re-encryption). Actual throughput varies with internet connection speed.

End-to-end latency for snapshot operations via the guest MCP endpoint at three workspace sizes (10, 100, and 500 files). Each operation is a full round-trip: guest CLI -> framed vsock -> host endpoint -> host filesystem -> response.

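10 files:

| Operation | Latency |
|---|---|
| create | 2,945.6ms |
| list | 935.2ms |
| changes | 934.1ms |
| revert | 933.5ms |
| delete | 945.3ms |

100 files:

| Operation | Latency |
|---|---|
| create | 1,052.9ms |
| list | 946.4ms |
| changes | 946.7ms |
| revert | 943.5ms |
| delete | 974.2ms |

500 files:

| Operation | Latency |
|---|---|
| create | 1,030.6ms |
| list | 957.8ms |
| changes | 995.8ms |
| revert | 956.4ms |
| delete | 980.3ms |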

The 10-file create is slower than the 100-file and 500-file creates because it includes the first MCP handshake (JSON-RPC initialize). Subsequent operations reuse the connection. List and changes scale modestly with file count. The host gateway-side latency is typically 3-20ms; the rest is vsock and MCP protocol overhead.
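The first-call cost can be estimated from the create rows alone. This is a back-of-the-envelope sketch: it assumes the three tables are ordered 10, 100, then 500 files, and that the 100-file and 500-file creates represent the warm path. The names `CREATE_MS`, `warm_baseline`, and `first_call_extra` are illustrative.

```python
# "create" latency in ms per workspace size, copied from the tables above.
CREATE_MS = {10: 2945.6, 100: 1052.9, 500: 1030.6}

# Treat the 100- and 500-file runs as the warm baseline (connection already open).
warm_baseline = (CREATE_MS[100] + CREATE_MS[500]) / 2
first_call_extra = CREATE_MS[10] - warm_baseline

print(f"warm create baseline: {warm_baseline:.1f} ms")
print(f"extra cost on first call: {first_call_extra:.1f} ms")
```

Under these assumptions the first call carries roughly 1.9s of one-time cost, which is consistent with attributing the difference to connection setup rather than to workspace size.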

Host-side latency for individual VM operations. Measured over 3 provision/exec/delete cycles on the same service instance.

| Operation | Min | Mean | Max | Description |
|---|---|---|---|---|
| provision | 2,238.2ms | 2,240.3ms | 2,243.4ms | Create and boot a temporary VM |
| exec_ready | 23.3ms | 25.0ms | 28.3ms | First ready check after provisioning |
| exec | 23.0ms | 23.7ms | 24.2ms | Simple echo ok on running VM |
| delete | 166.8ms | 167.2ms | 167.5ms | VM teardown request |
| total | 2,454.2ms | 2,456.2ms | 2,457.3ms | |

Provision includes the boot path, so it carries the bulk of lifecycle latency. Exec and ready checks are low-latency once the VM is running.

Run: uv run pytest tests/capsem-serial/test_lifecycle_benchmark.py::test_lifecycle_benchmark -xvs

Host-side latency for fork (image creation) and boot-from-image. Measured over 3 cycles: create VM, install jq, write workspace files, fork, boot from image, verify data survived.

| Metric | Min | Mean | Max | Gate | Description |
|---|---|---|---|---|---|
| fork | 114.6ms | 115.1ms | 115.4ms | 500ms | Reflink/sparse-preserving copy of rootfs overlay + workspace |
| image_size | 91.8MB | 101.1MB | 105.8MB | 128MB | Actual disk (blocks), not logical sparse size |
| boot_provision | 1,485.6ms | 1,514.1ms | 1,529.4ms | 1,200ms | Clone image into new session + boot |
| boot_ready | 26.1ms | 29.8ms | 35.3ms | 1,200ms | First ready check after provisioning |

Fork is fast because the backend uses copy-on-write or sparse-preserving copy paths where available. Image size reports actual allocated blocks, not the logical sparse file size. Both rootfs overlay changes (installed packages) and workspace files (/root/) survive fork.
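The difference between allocated blocks and logical size can be seen with a small sparse file. This sketch is illustrative only: `disk_usage` is a hypothetical helper, the file is a stand-in for a forked image, and it assumes a Linux filesystem that supports sparse files.

```python
import os
import tempfile


def disk_usage(path):
    """Return (logical_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux, regardless of filesystem block size.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.bin")  # hypothetical stand-in for a forked image
    with open(image, "wb") as f:
        f.truncate(64 * 1024 * 1024)   # 64 MiB logical size, no data blocks yet
        f.write(b"x" * 4096)           # one real 4K block at offset 0
    logical, allocated = disk_usage(image)
    print(f"logical:   {logical / 1e6:.1f} MB")
    print(f"allocated: {allocated / 1e6:.3f} MB")
```

A gate on allocated size measures what the image actually costs on disk, so a large but mostly empty overlay does not trip it.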

Regression gates: fork < 500ms, image < 128MB, packages + workspace must survive every run.

Run: uv run pytest tests/capsem-serial/test_lifecycle_benchmark.py::test_fork_benchmark -xvs

Security Engine CEL microbench (host-side)

Current host-side microbenchmark artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_cel_microbench.json. Detection IR parse/lowering artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_security_packs_microbench.json.

These are Rust Criterion microbenchmarks for canonical policy-context CEL paths and Detection IR pack parsing/lowering. They are not VM-originated benchmarks and should not be used as end-to-end latency claims.

| Benchmark | Slope |
|---|---|
| Compile http.request.host.contains("google") | 18.1us |
| Compile full HTTP policy | 109.0us |
| Evaluate http.request.host.contains("google") | 39.8us |
| Evaluate http.request.header("authorization").exists() | 46.8us |
| Evaluate full HTTP policy | 66.1us |
| Evaluate full HTTP policy as last match across 100 rules | 3.47ms |
| Detection finding for full HTTP policy | 66.5us |
| Detection finding as last match across 100 rules | 3.46ms |
| Dedupe 100 backtest rows / 100 unique signatures | 67.1us |
| Dedupe 1,000 backtest rows / 100 unique signatures | 584.4us |
| Runtime registry install/update of one rule | 202.6ns |
| Runtime registry projection of 100 enabled rules | 23.6us |
| Runtime projection and compile of 100 enforcement rules | 512.3us |
| Runtime projection and compile of 100 detection rules | 534.4us |
| Rebuild engine from 100 enforcement and 100 detection rules | 1.05ms |
| Update one existing rule and rebuild 100-rule plan | 688.8us |
| Project SecurityEvent to PolicyContext | 903.1ns |
| Project and serialize PolicyContext | 6.8us |
| Native Rust lookup for equivalent HTTP policy | 40.4ns |
| Parse and validate Detection IR Google-secret fixture | 409.9us |
| Lower Detection IR Google-secret fixture to CEL rules | 1.5us |
| Lower 100 Detection IR HTTP rules to CEL rules | 190.2us |
| Lower and compile 100 Detection IR HTTP rules | 7.2ms |

Run:

just benchmark

Security Engine process enforcement (VM-originated)

Current VM-originated benchmark artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_process_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks shell process exec, sends eight blocked exec requests, and verifies the response, runtime match counters, canonical session.db security events, and logs exposure.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 750ms mean |
| Min blocked exec latency | 13.758ms |
| Mean blocked exec latency | 14.308ms |
| Median blocked exec latency | 14.329ms |
| p95 blocked exec latency | 14.759ms |
| p99 blocked exec latency | 14.759ms |
| Max blocked exec latency | 14.759ms |
| Runtime matches | 8 |
| Session DB security events | 8 |
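In the table above, p95, p99, and max are identical. That is what a nearest-rank percentile produces with only 8 samples, as the sketch below shows. Whether the benchmark harness actually uses nearest-rank is an assumption; `percentile_nearest_rank` is a hypothetical helper and the sample values are made up.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n), 1-indexed."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Eight made-up latencies in ms, used only to show the small-sample effect.
runs = [10, 11, 12, 13, 14, 15, 16, 17]

p50 = percentile_nearest_rank(runs, 50)
p95 = percentile_nearest_rank(runs, 95)
p99 = percentile_nearest_rank(runs, 99)
print(p50, p95, p99, max(runs))
```

With 8 runs, tail percentiles carry no more information than the maximum, so these rows are better read as a sanity check against the gate than as a latency distribution.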

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py -xvs

Security Engine HTTP request enforcement (VM-originated)

Current network-transport benchmark artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_http_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks a specific HTTPS request before upstream dispatch, warms the path once, then runs a guest curl loop and verifies the block responses, runtime match counters, canonical session.db security events, and logs exposure. It also runs a persistent TLS keep-alive client over the same connection to prove repeated block decisions stay logged and avoid per-request TLS setup in the hot path.

The wall-clock metric includes spawning curl in the guest. The time_starttransfer metric is curl’s first-byte timing for the blocked response and is the better proxy for transport plus Security Engine response latency. The phase deltas show most first-byte time is TLS/MITM appconnect; the post-pretransfer server-first-byte slice, which includes request dispatch, Security Engine evaluation, synthetic 403 generation, and first-byte delivery, is below 1ms on this run.

| Metric | Value |
|---|---|
| Runs | 8 |
| Warmup runs | 1 |
| Gate | 1,000ms mean |
| Mean wall-clock blocked request | 19.220ms |
| Median wall-clock blocked request | 18.751ms |
| p95 wall-clock blocked request | 22.104ms |
| Mean time_starttransfer | 9.523ms |
| Median time_starttransfer | 9.217ms |
| p95 time_starttransfer | 11.818ms |
| Mean DNS | 2.615ms |
| Mean TCP connect | 2.718ms |
| Mean TLS appconnect | 7.675ms |
| Runtime matches | 17 |
| Session DB security events | 17 |
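The phase deltas discussed above can be derived from the mean timings, because curl's `-w` timing variables are cumulative from the start of the transfer. The sketch below assumes the DNS, TCP connect, and TLS appconnect rows correspond to curl's time_namelookup, time_connect, and time_appconnect; `CUMULATIVE_MS` and `phase_deltas` are illustrative names. time_pretransfer is not in the table, so the last delta also includes the short gap between appconnect and pretransfer.

```python
# Mean cumulative curl timings in ms, copied from the table above.
# curl reports each -w time_* value as time elapsed since the start of the transfer.
CUMULATIVE_MS = [
    ("dns", 2.615),             # assumed: time_namelookup
    ("tcp_connect", 2.718),     # assumed: time_connect
    ("tls_appconnect", 7.675),  # assumed: time_appconnect
    ("first_byte", 9.523),      # time_starttransfer
]


def phase_deltas(cumulative):
    """Convert cumulative timings into per-phase durations."""
    deltas = {}
    previous = 0.0
    for name, value in cumulative:
        deltas[name] = round(value - previous, 3)
        previous = value
    return deltas


deltas = phase_deltas(CUMULATIVE_MS)
for name, ms in deltas.items():
    print(f"{name:15s} {ms:6.3f} ms")
```

On these means the TLS handshake is the largest single phase at about 5ms, while everything after appconnect takes under 2ms.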

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_http_request_enforcement_benchmark_records_vm_originated_path -xvs

Security Engine DNS request enforcement (VM-originated)

Current DNS-transport benchmark artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_dns_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks one DNS qname, triggers repeated guest resolver lookups, and verifies NXDOMAIN-style failure, runtime match counters, canonical session.db security events, dns_events policy fields, and logs qname attribution.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 1,000ms mean |
| Min blocked DNS lookup | 1.221ms |
| Mean blocked DNS lookup | 2.305ms |
| Median blocked DNS lookup | 1.566ms |
| p95 blocked DNS lookup | 7.655ms |
| p99 blocked DNS lookup | 7.655ms |
| Max blocked DNS lookup | 7.655ms |
| Runtime matches | 16 |
| Session DB security events | 16 |
| Session DB DNS events | 16 |

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_dns_request_enforcement_benchmark_records_vm_originated_path -xvs

Security Engine MCP request enforcement (VM-originated)

Current framed-MCP benchmark artifact: benchmarks/security-engine/data_1.2.1779673506_x86_64_mcp_request_enforcement.json.

This host-side serial benchmark runs a live service and VM, installs a runtime CEL rule that blocks the guest local__echo MCP tool, sends repeated tools/call requests through /run/capsem-mcp-server, and verifies JSON-RPC denial, runtime match counters, canonical session.db security events, mcp_calls policy fields, and logs server/tool attribution.

| Metric | Value |
|---|---|
| Runs | 8 |
| Gate | 1,000ms mean |
| Min blocked MCP request | 0.846ms |
| Mean blocked MCP request | 1.173ms |
| Median blocked MCP request | 1.026ms |
| p95 blocked MCP request | 2.270ms |
| p99 blocked MCP request | 2.270ms |
| Max blocked MCP request | 2.270ms |
| Runtime matches | 8 |
| Session DB security events | 8 |
| Session DB MCP calls | 8 |

Run:

uv run pytest tests/capsem-serial/test_security_engine_benchmark.py::test_mcp_request_enforcement_benchmark_records_vm_originated_path -xvs

| Component | Version |
|---|---|
| Host | Linux x86_64, Intel Xeon @ 2.80GHz, 16 logical CPUs, 62.79GB RAM |
| Capsem | 1.2.1779673506 benchmark artifact |
| Guest kernel | Linux 6.x (custom allnoconfig) |
| Storage | KVM/VirtioFS workspace, ext4 host backing |
| Python | 3.x (rootfs) |
| Node | v22.x (rootfs) |

Run:

just benchmark
# Optional named artifact run
CAPSEM_BENCHMARK_RUN_ID=rc1 just benchmark

Results are displayed as rich tables in the terminal. JSON output is saved to /tmp/capsem-benchmark.json inside the VM and archived under benchmarks/. Set CAPSEM_BENCHMARK_OUTPUT_DIR to write artifacts somewhere else during exploratory runs.