
Determinism Proof

Byte-for-byte validation of the run → log → replay pipeline

PASS — byte-for-byte identical (23 files compared)
Two independent cold-start runs · SHA-256 hash comparison · DSE + Thermal engines

1 What Was Compared

The full experiment suite was executed twice from a cold start (fresh process, no cached artifacts) using an identical experiment registry and identical seed, writing to two independent output directories:

  • run_A/ — first execution
  • run_B/ — second execution

Each run produces the following per experiment:

  • frames.jsonl — tick-by-tick frame log (JSONL, one JSON object per line)
  • summary.csv — tabular summary of key metrics
  • summary.parquet — binary columnar trace (Apache Parquet)
  • results.md — human-readable experiment report with embedded results table
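Before hashing, a quick sanity check is to confirm that both run directories contain the same set of relative paths. A minimal sketch (the directory names assume the extracted artifacts layout; the helper name is illustrative):

```python
import pathlib

def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    base = pathlib.Path(root)
    return {str(p.relative_to(base)) for p in base.rglob("*") if p.is_file()}

# Example usage (paths assume the extracted artifacts layout):
# missing = relative_files("artifacts/run_A") ^ relative_files("artifacts/run_B")
# assert not missing, f"file sets differ: {missing}"
```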

Outputs were compared byte-for-byte using cryptographic hashing (SHA-256). Determinism was validated across both engines: DSE stabilization experiments and the thermal-token experiment (thermal_token_smoke).

No drift was observed from randomization, ordering, or nondeterministic I/O in any logged artifact — including Parquet binary traces.
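Because frames.jsonl is line-oriented, any future hash mismatch could be localized to the first divergent tick by comparing the two logs line by line. A minimal sketch (the function name is illustrative, not part of the published tooling):

```python
import itertools

def first_divergence(path_a, path_b):
    """Return the 1-based number of the first differing line, or None if identical."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # zip_longest pads the shorter file with None, so a length
        # mismatch is also reported as a divergence.
        for i, (la, lb) in enumerate(itertools.zip_longest(fa, fb), start=1):
            if la != lb:
                return i
    return None
```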

2 How to Reproduce

You can verify determinism independently using only the published artifacts and standard command-line tools. No special software is required: Options A and B need only a SHA-256 hashing utility and a POSIX shell, and Option C needs Python 3.11+.

Option A: SHA-256 hash comparison (recommended)

# 1. Download and extract the artifacts archive
unzip twin_artifacts.zip -d artifacts/

# 2. Hash every file in both run directories
find artifacts/run_A -type f -print0 | sort -z | xargs -0 sha256sum > hashes_A.txt
find artifacts/run_B -type f -print0 | sort -z | xargs -0 sha256sum > hashes_B.txt

# 3. Compare (strip the directory prefix for fair comparison)
sed 's|artifacts/run_A/|OUT/|' hashes_A.txt > norm_A.txt
sed 's|artifacts/run_B/|OUT/|' hashes_B.txt > norm_B.txt
diff norm_A.txt norm_B.txt

# Expected output: (empty — no differences)

Option B: Direct byte comparison

# Compare every file pair directly
find artifacts/run_A -type f -printf '%P\n' | sort | while read -r f; do
  if ! cmp -s "artifacts/run_A/$f" "artifacts/run_B/$f"; then
    echo "DIFFERS: $f"
  fi
done

# Expected output: (nothing printed — all files match)

Option C: Python verification script

import hashlib, pathlib

def hash_tree(root):
    hashes = {}
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.is_file():
            h = hashlib.sha256(p.read_bytes()).hexdigest()
            hashes[str(p.relative_to(root))] = h
    return hashes

a, b = hash_tree("artifacts/run_A"), hash_tree("artifacts/run_B")
assert a == b, f"Mismatch: {set(a.items()) ^ set(b.items())}"
print(f"PASS: {len(a)} files compared, byte-identical.")

3 Published Artifacts

All artifacts are hosted alongside this viewer and can be accessed directly.

The experiments covered by this validation:

  • dse_single_payload_boundary_sweep — ring recruitment and boundary sweep
  • dse_two_agent_zonepack_tatonnement — multi-agent tatonnement pricing
  • dse_tow_mode_vector_authority — tow mode vector authority + pitch-moment guard
  • dse_mpm_assist_envelope_expansion — MPM assist envelope expansion
  • dse_human_embodiment_viscosity_trace — viscous feel jerk-limited trace
  • thermal_token_smoke — thermal-token determinism smoke test

4 Exclusions & Scope

Human-facing report metadata is excluded from strict determinism checks by design. Specifically, index.md (if present) may contain timestamps, environment context, or other run-time metadata that varies between executions. This is expected and intentional: it does not affect determinism of the simulation itself. The simulation outputs (frames.jsonl, summary.csv, summary.parquet, results.md) remain fully deterministic.
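The exclusion can be applied mechanically when re-running the comparison. A sketch of Option C's hash_tree extended with an exclusion list (the exclude set is the only addition; the default is an assumption matching the scope described above):

```python
import hashlib, pathlib

EXCLUDE = {"index.md"}  # report metadata, excluded from strict checks by design

def hash_tree(root, exclude=EXCLUDE):
    """SHA-256 every file under root except excluded names, keyed by relative path."""
    hashes = {}
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.is_file() and p.name not in exclude:
            hashes[str(p.relative_to(root))] = hashlib.sha256(p.read_bytes()).hexdigest()
    return hashes
```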

This guarantees that any figure, metric, or trace produced from the twin is exactly reproducible. Logged outputs can be replayed and audited without ambiguity. The digital twin provides a stable foundation for debugging, validation, and future hardware binding — same inputs yield identical recorded state evolution.
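A lightweight audit of a replayed log can start by confirming that every frame line parses as JSON before downstream metrics are recomputed. A minimal sketch making no assumptions about the frame schema:

```python
import json

def count_valid_frames(path):
    """Count frames in a JSONL file, raising on the first malformed line."""
    n = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            json.loads(line)  # raises ValueError if the line is not valid JSON
            n += 1
    return n
```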