Byte-for-byte validation of the run → log → replay pipeline
The full experiment suite was executed twice from a cold start (fresh process, no cached artifacts) using an identical experiment registry and identical seed, writing to two independent output directories:
run_A/: first execution
run_B/: second execution
Each run produces the following per experiment:
frames.jsonl: tick-by-tick frame log (JSONL, one JSON object per line)
summary.csv: tabular summary of key metrics
summary.parquet: binary columnar trace (Apache Parquet)
results.md: human-readable experiment report with embedded results table
Outputs were compared byte-for-byte using cryptographic hashing (SHA-256).
Determinism was validated across both engines: DSE stabilization experiments
and the thermal-token experiment (thermal_token_smoke).
No drift was observed from randomization, ordering, or nondeterministic I/O in any logged artifact — including Parquet binary traces.
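To illustrate what byte-for-byte determinism asks of a logger, here is a minimal sketch. It is not the twin's actual code: `run_once` is a hypothetical toy simulation that draws from a privately seeded generator and writes one JSON object per tick with a fixed key order and fixed separators. Under those assumptions, two runs with the same seed should produce the same SHA-256 digest, and a run with a different seed should not.

```python
import hashlib
import json
import random


def run_once(seed: int, ticks: int = 100) -> bytes:
    """Hypothetical toy run: return the bytes a frames.jsonl-style log would contain."""
    rng = random.Random(seed)  # private seeded generator, no shared global state
    state = 0.0
    lines = []
    for tick in range(ticks):
        state += rng.uniform(-1.0, 1.0)
        frame = {"tick": tick, "state": state}
        # sort_keys and fixed separators pin the serialized byte layout
        lines.append(json.dumps(frame, sort_keys=True, separators=(",", ":")))
    return ("\n".join(lines) + "\n").encode("utf-8")


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


digest_a = sha256_hex(run_once(seed=42))
digest_b = sha256_hex(run_once(seed=42))
digest_c = sha256_hex(run_once(seed=43))

print("same seed, identical digest:", digest_a == digest_b)
print("different seed, identical digest:", digest_a == digest_c)
```

The sketch keeps every source of variation (seed, key order, separators, line endings) explicit, which is the property the hash comparison in this section checks for the real artifacts.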
You can verify determinism independently using only the published artifacts and standard command-line tools. No special software is required beyond Python 3.11+ or any SHA-256 hashing utility.
Option A: SHA-256 hash comparison (recommended)
# 1. Download and extract the artifacts archive
unzip twin_artifacts.zip -d artifacts/

# 2. Hash every file in both run directories
find artifacts/run_A -type f | sort | xargs sha256sum > hashes_A.txt
find artifacts/run_B -type f | sort | xargs sha256sum > hashes_B.txt

# 3. Compare (strip the directory prefix for fair comparison)
sed 's|artifacts/run_A/|OUT/|' hashes_A.txt > norm_A.txt
sed 's|artifacts/run_B/|OUT/|' hashes_B.txt > norm_B.txt
diff norm_A.txt norm_B.txt
# Expected output: (empty, no differences)
Option B: Direct byte comparison
# Compare every file pair directly.
# Requires GNU find for -printf; paths are read line by line, so spaces in names are safe.
find artifacts/run_A -type f -printf '%P\n' | sort | while IFS= read -r f; do
    if ! cmp -s "artifacts/run_A/$f" "artifacts/run_B/$f"; then
        echo "DIFFERS: $f"
    fi
done
# Expected output: (nothing printed, all files match)
Option C: Python verification script
import hashlib
import pathlib


def hash_tree(root):
    """Map each file's path relative to root to its SHA-256 hex digest."""
    hashes = {}
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.is_file():
            h = hashlib.sha256(p.read_bytes()).hexdigest()
            hashes[str(p.relative_to(root))] = h
    return hashes


a, b = hash_tree("artifacts/run_A"), hash_tree("artifacts/run_B")
# Any file missing from one run or differing in content shows up in the symmetric difference.
assert a == b, f"Mismatch: {set(a.items()) ^ set(b.items())}"
print(f"PASS: {len(a)} files compared, byte-identical.")
All artifacts are hosted alongside this viewer and can be accessed directly.
The experiments covered by this validation:
dse_single_payload_boundary_sweep: ring recruitment and boundary sweep
dse_two_agent_zonepack_tatonnement: multi-agent tatonnement pricing
dse_tow_mode_vector_authority: tow mode vector authority + pitch-moment guard
dse_mpm_assist_envelope_expansion: MPM assist envelope expansion
dse_human_embodiment_viscosity_trace: viscous feel jerk-limited trace
thermal_token_smoke: thermal-token determinism smoke test
Human-facing report metadata is excluded from strict determinism checks by design.
Specifically, index.md (if present) may contain timestamps, environment
context, or run-time metadata that varies between executions. This is expected and
intentional: it does not affect the determinism of the simulation itself.
The simulation outputs (frames.jsonl, summary.csv,
summary.parquet, results.md) remain fully deterministic.
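If a run directory does contain index.md, a comparison over every file would report it as a difference even though the simulation outputs match. The following is a minimal sketch of a comparison that applies the exclusion described above. It assumes that report metadata can be identified by file name alone; `EXCLUDED_NAMES` and `compare_runs` are illustrative names, not part of the published tooling.

```python
import hashlib
import pathlib

# Assumption: report metadata is identified purely by its file name.
EXCLUDED_NAMES = {"index.md"}


def hash_tree(root, excluded=EXCLUDED_NAMES):
    """Map each relative file path under root to its SHA-256 hex digest, skipping excluded names."""
    root = pathlib.Path(root)
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name not in excluded:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[path.relative_to(root).as_posix()] = digest
    return hashes


def compare_runs(root_a, root_b):
    """Return the sorted relative paths that are missing from one run or differ in content."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `compare_runs("artifacts/run_A", "artifacts/run_B")` should return an empty list when every non-excluded file is byte-identical. Returning the mismatching paths, instead of asserting, makes it easier to see which artifact to inspect.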
Because the logged artifacts are byte-identical across runs, any figure, metric, or trace produced from the twin is exactly reproducible. Logged outputs can be replayed and audited without ambiguity. The digital twin provides a stable foundation for debugging, validation, and future hardware binding: the same inputs yield identical recorded state evolution.
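Should a future comparison ever fail, an audit would usually start by finding where two frame logs first diverge. The sketch below is one way to do that. It assumes only what is stated above about frames.jsonl (one JSON object per line); the helper names `first_divergence` and `differing_keys` are hypothetical.

```python
import itertools
import json

_MISSING = object()  # distinguishes an absent key from a key whose value is null


def first_divergence(path_a, path_b):
    """Return (line_number, raw_a, raw_b) for the first differing line, or None if identical."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        pairs = itertools.zip_longest(fa, fb)  # None marks a log that ended early
        for line_number, (raw_a, raw_b) in enumerate(pairs, start=1):
            if raw_a != raw_b:
                return line_number, raw_a, raw_b
    return None


def differing_keys(raw_a, raw_b):
    """List the top-level JSON keys whose values differ between two frame lines."""
    frame_a = json.loads(raw_a) if raw_a else {}
    frame_b = json.loads(raw_b) if raw_b else {}
    return sorted(
        key
        for key in frame_a.keys() | frame_b.keys()
        if frame_a.get(key, _MISSING) != frame_b.get(key, _MISSING)
    )
```

Comparing raw bytes first keeps the check as strict as the hash comparison, and parsing happens only for the single line that differs.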