Turn any agent run into one file you can review.

filepacks seals an agent, eval, or CI run into a single .fpk. Verify it hasn't changed, diff it against the last run, hand it off. One command. No dashboard, no lock-in.

Try the CLI · Get early access


Agent work is scattered.

Agent work is scattered across logs, screenshots, generated files, and trace dashboards. Reviewing it means reconstructing what happened from five places. filepacks puts the whole run in one file you can actually inspect.

How it works. Pack, verify, compare.

Three steps turn loose run output into one reviewable artifact.

agent-run-42/
  │
  ├─ report.md
  ├─ results.json
  ├─ trace.log
  └─ eval.csv

↓ pack

run-42.fpk · 4 files · sha256

Pack.

Point filepacks at a run directory. It seals the files into one .fpk with a canonical manifest and a SHA-256 for every file.
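As a rough illustration of the idea behind a canonical manifest, the sketch below hashes every file in a run directory and serializes the result deterministically. It is not filepacks' implementation or its on-disk format: the function name `build_manifest` and the `{"files": ...}` layout are assumptions made up for this example.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(run_dir: str) -> str:
    """Return a deterministic JSON manifest mapping each file to its SHA-256.

    Hypothetical helper for illustration; the layout is not the .fpk format.
    """
    root = Path(run_dir)
    digests = {}
    # Visit files in sorted order so the result never depends on listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()  # forward slashes on every platform
        digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    # Sorted keys plus fixed separators give identical bytes for identical input.
    return json.dumps({"files": digests}, sort_keys=True, separators=(",", ":"))
```

Calling `build_manifest("./run")` twice on an unchanged directory returns the same string, which is the property that makes later verification and comparison possible.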

manifest.json
  │
  ├─ report.md     a3f2…c1d8
  ├─ results.json  9b1e…44fa
  ├─ trace.log     d07c…8821
  └─ eval.csv      fc40…2b6e

✓ verified

Verify.

Anyone can check the artifact is byte-for-byte what was produced. Same input, same bytes, every time.
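One way to picture a byte-for-byte check is to recompute each file's SHA-256 and compare it with the recorded digest. The sketch below does that for an unpacked directory. It is an assumption-laden illustration, not how filepacks verifies a .fpk: `verify_files`, `is_verified`, and the `expected` mapping of relative path to hex digest are hypothetical names.

```python
import hashlib
from pathlib import Path


def verify_files(root_dir: str, expected: dict) -> dict:
    """Compare files on disk against expected {relative_path: sha256_hex}.

    Hypothetical helper for illustration only.
    """
    root = Path(root_dir)
    # Recompute a digest for every file actually present.
    actual = {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "mismatched": sorted(k for k in shared if expected[k] != actual[k]),
    }


def is_verified(report: dict) -> bool:
    """True only when nothing is missing, unexpected, or mismatched."""
    return not any(report.values())
```

Reporting the three failure kinds separately, rather than a single boolean, tells a reviewer whether a file was dropped, added, or altered.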

baseline.fpk → latest.fpk

  report.md
+ eval.csv
~ results.json
- trace.log

Compare.

Diff one run against the last to see exactly what changed when you tweaked a prompt, model, or tool.
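Once each run is reduced to a mapping of file path to digest, a comparison comes down to set arithmetic. The sketch below is a minimal illustration under that assumption, not filepacks' compare logic; `compare_manifests` and the placeholder digests are hypothetical.

```python
def compare_manifests(baseline: dict, latest: dict) -> dict:
    """Classify paths as added, removed, or changed between two runs.

    Both arguments map relative path -> SHA-256 hex digest (hypothetical shape).
    """
    shared = baseline.keys() & latest.keys()
    return {
        "added": sorted(latest.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - latest.keys()),
        "changed": sorted(p for p in shared if baseline[p] != latest[p]),
    }


# Placeholder digests, chosen only to mirror the diagram above.
baseline = {"report.md": "aaa", "results.json": "bbb", "trace.log": "ccc"}
latest = {"report.md": "aaa", "results.json": "ddd", "eval.csv": "eee"}

print(compare_manifests(baseline, latest))
# {'added': ['eval.csv'], 'removed': ['trace.log'], 'changed': ['results.json']}
```

Unchanged files such as report.md appear in none of the three lists, so the output stays focused on what a prompt, model, or tool change actually affected.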

The one command.

npx filepacks pack ./run --output run.fpk


That is the whole thing. From there, inspect, verify, and compare do the rest. Open source, Apache-2.0.

Built something with an agent. Hand off the actual run.

Pack the run, send the .fpk. Whoever receives it can verify it themselves and review the actual output instead of trusting a summary of what the agent claims it did.

baseline.fpk
  │
  ├─ prompt
  ├─ output
  └─ scores

latest.fpk
  │
  └─ Δ output


Start with one run. Keep the evidence.
