Blueberry Lab
Idle · Last run M-008 (failed) · 4 done · 4 failed · 0 queued · 0 active · last activity just now

M-008

failed

M-007 eval recovery — measure ATE-RMSE on existing DPV-SLAM++ KITTI trajectories (no re-run, no Docker, evo + Leaflet only)

vision-robotics · Jaehyun Lee · from B-004.card-4

Completed in 1h 45m

Code plan: 6/6 steps (100%)
Iterations: 8/8 (100%)

Goal

Metric: ate_rmse_kitti_avg_m, target ≤ 27 (baseline: none listed)

Eval fixture: M-007's pre-computed DPV-SLAM++ trajectories on KITTI Odometry seq 00 + 09 (the inputs are at algorithm/vision-robotics/missions/M-007/runs/MEASURED-001/{00,09}/trajectory.tum; GT poses at algorithm/vision-robotics/missions/M-007/runs/data/kitti/dataset/poses/{00,09}.txt). The trajectories were produced cleanly (NaN-free, on-time) — this mission ONLY measures them.
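The two input formats named above can be read with a few lines of standard-library Python. This is a minimal sketch, not the mission's eval_kitti.py: the function names are hypothetical, and it only assumes the public format definitions (a KITTI odometry pose line is 12 floats forming a row-major 3×4 [R|t] matrix; a TUM line is `timestamp tx ty tz qx qy qz qw`).

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]


def parse_kitti_pose_line(line: str) -> List[List[float]]:
    """Parse one KITTI GT pose line into a 3x4 row-major [R|t] matrix."""
    vals = [float(v) for v in line.split()]
    if len(vals) != 12:
        raise ValueError(f"expected 12 values, got {len(vals)}")
    return [vals[0:4], vals[4:8], vals[8:12]]


def kitti_translation(pose: List[List[float]]) -> Vec3:
    """Return the translation column (last column) of a 3x4 pose."""
    return (pose[0][3], pose[1][3], pose[2][3])


def parse_tum_line(line: str) -> Tuple[float, Vec3, Quat]:
    """Parse one TUM line: timestamp, position (x, y, z), quaternion (qx, qy, qz, qw)."""
    vals = [float(v) for v in line.split()]
    if len(vals) != 8:
        raise ValueError(f"expected 8 values, got {len(vals)}")
    return vals[0], (vals[1], vals[2], vals[3]), (vals[4], vals[5], vals[6], vals[7])
```

Rejecting lines with the wrong field count is a deliberate choice here: a silently truncated line would otherwise shift every later pose and corrupt the measurement.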

Baseline artifact:

Achieved: 40.3796 (MEASURED-001)

Approach

evo_ape on M-007's KITTI trajectories + Leaflet aerial overlay — no Docker, no SLAM re-run

Direct recovery of M-007's missing measurement. M-007 ran DPV-SLAM++ to completion (real NaN-free trajectories on disk) but ran out of budget one evo_ape call short. M-008 is the 8-iter pickup: pip install evo, compute Sim3-aligned ATE-RMSE on each sequence, render an HTML overlay with the polyline on Karlsruhe satellite imagery. Treats the budget-exhaustion class of failure (the M-007 + M-006 family) as a first-class lab problem: when the algorithm worked but the meta-pipeline didn't, ship a narrow follow-up that completes the measurement instead of re-running the whole stack. Honest acknowledgement: M-007 found seq 09 fired zero loop closures, so the visual oracle may show drift; that's the honest M-007/M-008 finding, not a bug to hide.

References (5)
  • lab/missions/M-007.json + lab/missions/M-007-postmortem.md — the source mission whose trajectories M-008 measures. M-007's CHANGELOG iters 19-22 record the actual SLAM run and confirm NaN-free outputs.
  • evo: https://github.com/MichaelGrupp/evo — the canonical KITTI/TUM trajectory evaluator (evo_ape, evo_traj). Documented to handle KITTI-format GT poses + TUM-format estimates with Sim3 alignment for monocular scale.
  • Lipson, Teed, Deng, 'Deep Patch Visual SLAM' (DPV-SLAM++), ECCV 2024 — Table 3 paper-published 25.76 m avg ATE on KITTI 00-10 (Sim3 aligned).
  • Leaflet 1.9: https://leafletjs.com — minimal JS map library, ~40KB; Esri WorldImagery tile URL https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x} is free for personal use and shows Karlsruhe road network clearly under the KITTI trajectories.
  • algorithm/infra/missions/M-003/proposals.md P7 — 'split setup/research budgets' proposal that would have surfaced M-007's failure mode upfront; M-008 demonstrates the manual workaround until P7 lands.

Code plan

  1. [DONE ✓] Step 0 [sandbox bootstrap + host pre-flight]: first transition (queued → planning). Per the M-003 P1 framework patch: run the spec's preflight_command first; if M-007's trajectory.tum files don't exist, transition straight to failed with 'preflight-failed' and exit. If they exist, materialise the sandbox: CLAUDE.md (charter + reuses-M-007 note + known caveats), CHANGELOG.md, requirements.txt (evo and numpy only), and src/, tests/, runs/ skeletons. Files: CLAUDE.md, CHANGELOG.md, requirements.txt, src/.gitkeep, tests/.gitkeep, runs/.gitkeep.
  2. [DONE ✓] Step 1 [author eval_kitti.py + pip install]: src/eval_kitti.py is a small CLI script that takes --m007-sandbox (defaults to ../M-007) + --seqs (default 00,09) + --json + --save-overlay <dir>. Loads each sequence's KITTI-format GT pose file (3×4 transforms per line) and TUM-format estimated trajectory. Calls `evo_ape kitti --align --correct_scale` (Sim3 alignment for monocular scale). Parses ATE-RMSE + RPE translation + Sim3 scale factor. Also peeks at M-007's tracking.log for loop-closure events (heuristic: 'loop close' substring count). Emits Shape A oracle JSON. Run `pip install -r requirements.txt` in the host venv (no Docker per spec). Files: src/eval_kitti.py.
  3. [DONE ✓] Step 2 [measure]: run `python src/eval_kitti.py --m007-sandbox ../M-007 --seqs 00,09 --json --save-overlay runs/MEASURED-001 > runs/MEASURED-001.json`. Validate that summary.pass_overall makes sense (avg ATE within band OR honest 'exceeded due to zero-LC seq 09' note). Files: runs/MEASURED-001.json. RESULT: seq00 8.2438 m, seq09 72.5154 m, avg 40.3796 m, pass_overall=false. This is the honest 'exceeded due to zero-LC seq 09' outcome (anticipated by hard_constraint #3).
  4. [DONE ✓] Step 3 [author render_overlay.py]: src/render_overlay.py reads each estimated trajectory + KITTI GPS origin (algorithm/vision-robotics/missions/M-007/runs/data/kitti/dataset/sequences/<seq>/oxts/data/0000000000.txt if present, else hardcode Karlsruhe origin 49.0069°N, 8.4037°E with the published UTM zone 32U transform; M-007 download_kitti.py may not have fetched oxts, so fall back gracefully). Converts TUM xyz → lat/lon. Writes a self-contained runs/MEASURED-001/overlay.html with Leaflet 1.9 inlined (JS + CSS as <script>/<style> blocks) + Esri WorldImagery tile layer + two polylines (seq 00 blue, seq 09 orange) + red start/end dots. Files: src/render_overlay.py.
  5. [DONE ✓] Step 4 [render overlay]: run `python src/render_overlay.py --m007-sandbox ../M-007 --seqs 00,09 --out runs/MEASURED-001/overlay.html`. Sanity-check the HTML opens in a browser (file size > 50KB, contains 'leaflet' string, polyline coords present). Files: runs/MEASURED-001/overlay.html. RESULT: overlay.html 1,197,669 bytes, self_contained=true, 36 Esri tiles inlined @ z17; gates pass (>50KB, 176 leaflet hits, seq00 2271+2271 / seq09 796+796 coords). Visual reading: aligned-est end-gaps are seq00 103.4 m and seq09 243.6 m, so even the low-ATE seq00 loop is not fully closed (numeric-vs-visual divergence for postmortem).
  6. [DONE ✓] Step 5 [wrap Shape B + postmortem]: wrap runs/MEASURED-001.json as Shape B (add run_id, mission, timestamp, iteration, container_invocation='host venv — no docker', primary_metric, secondary_metrics). Write POSTMORTEM.md (sandbox) + lab/missions/M-008-postmortem.md (lab mirror) covering: (a) the recovery pattern itself (when a precursor mission's algorithm worked but the meta-pipeline starved, a narrow follow-up beats a full re-run), (b) the actual ATE numbers vs paper, (c) honest handling of seq 09's zero-LC observation, (d) the visual artifact verdict. Files: runs/MEASURED-001.json (re-written as Shape B), POSTMORTEM.md, lab/missions/M-008-postmortem.md.
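Step 3's "TUM xyz → lat/lon" conversion can be illustrated with an equirectangular approximation anchored at the hardcoded Karlsruhe origin, which is the fallback the iteration log reports using when no oxts files are on disk. The sketch below is not the mission's render_overlay.py. It assumes the trajectory is already Sim3-aligned to metres, that the KITTI camera frame's x axis maps to east and z axis to north (the "heading assumed North" simplification), and it uses the WGS-84 equatorial radius; the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0          # WGS-84 equatorial radius (assumption for this sketch)
ORIGIN_LAT, ORIGIN_LON = 49.0069, 8.4037  # hardcoded Karlsruhe origin from the plan


def local_to_latlon(east_m, north_m, lat0=ORIGIN_LAT, lon0=ORIGIN_LON):
    """Equirectangular approximation: metre offsets from (lat0, lon0) to degrees."""
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat0))))
    return lat0 + dlat, lon0 + dlon


def camera_xyz_to_latlon(x, y, z):
    """Map a camera-frame position to lat/lon, assuming x = east, z = north.

    The y axis (down in the KITTI camera frame) is ignored for a 2-D map.
    """
    return local_to_latlon(east_m=x, north_m=z)
```

Because the heading is assumed rather than measured, the result preserves trajectory shape and loop closure but not absolute placement on the map, which matches the caveat the overlay itself states.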

v3 metadata

Oracle

evo (evo_ape) 1.28+ with --align --correct_scale (Sim3 alignment for monocular SLAM scale recovery) · deterministic

$ python src/eval_kitti.py --m007-sandbox ../M-007 --seqs 00,09 --json --save-overlay runs/MEASURED-001
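The "deterministic" label depends on the oracle emitting byte-identical JSON across runs; the iteration log attributes this to rounding floats to 4 decimal places and keeping timestamps out of the oracle output. A minimal sketch of that idea follows. The helper names are hypothetical, and sorting keys is an extra assumption of this sketch rather than something the log states.

```python
import json


def round_floats(obj, ndigits=4):
    """Recursively round every float in a JSON-like structure."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {k: round_floats(v, ndigits) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(v, ndigits) for v in obj]
    return obj  # ints, bools, strings, None pass through unchanged


def dump_oracle_json(payload):
    """Serialise with rounded floats and sorted keys so reruns are byte-identical."""
    return json.dumps(round_floats(payload), sort_keys=True, indent=2) + "\n"
```

Anything that varies between runs (wall-clock time, run timestamps) belongs in a wrapper around this output, not inside it.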

Sandbox

algorithm/vision-robotics/missions/M-008

Memory files (living)

  • algorithm/vision-robotics/missions/M-008/CLAUDE.md
  • algorithm/vision-robotics/missions/M-008/CHANGELOG.md

Pass tolerance

absolute ≤ 1.35 · relative ≤ 5%

Hard constraints (6)

  • DO NOT re-run DPV-SLAM++. Inputs are M-007's existing trajectory.tum files; this mission ONLY measures them. If the input files are missing or corrupted, transition straight to failed with a 'preflight-failed' attempt (preflight_command checks this).
  • DO NOT build a Docker image. evo + numpy + a tiny HTML template are pure Python — system pip is sufficient. This is the v3 'use the simplest tool that satisfies the oracle' lesson distilled from M-006/M-007.
  • Average ATE-RMSE across KITTI seq 00 + 09 ≤ 27.0 m (DPV-SLAM++ paper Table 3 reports 25.76 m on 00-10 average; 5% reproduction tolerance). Note: M-007 found seq 09 fired zero loop closures, so this number MAY exceed the target — that's an honest outcome of THIS dataset/run, not a mission failure (postmortem should explain).
  • Trajectory polyline rendered as a self-contained HTML page (runs/MEASURED-001/overlay.html) using Leaflet + Esri WorldImagery tiles + KITTI's GPS origin georeferencing. No external CDN at view time — Leaflet JS/CSS bundled inline, satellite tiles either pre-cached or fetched once and inlined as base64.
  • Total wall-clock end-to-end ≤ 5 minutes (no docker build, no slam run — pure numpy/evo).
  • No simulated measurements: if any step blocks, fail honestly per M-006/M-007 precedent.
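The arithmetic behind the third constraint is small enough to check by hand. The sketch below uses only numbers stated on this page (paper anchor 25.76 m, 5% relative tolerance, 27.0 m target, and the two measured per-sequence values); the function name and return layout are hypothetical.

```python
PAPER_ANCHOR_M = 25.76   # DPV-SLAM++ paper Table 3, KITTI 00-10 average
TOL_RELATIVE = 0.05      # 5% reproduction tolerance
TARGET_M = 27.0          # mission target for the seq 00 + 09 average


def evaluate_average(ate_by_seq, target_m=TARGET_M):
    """Average per-sequence ATE-RMSE values and compare against the target."""
    avg = sum(ate_by_seq.values()) / len(ate_by_seq)
    return {"avg_m": round(avg, 4), "pass": avg <= target_m}


# Upper bound implied by the paper anchor plus 5%; the mission rounds this to 27.0.
relative_bound_m = PAPER_ANCHOR_M * (1 + TOL_RELATIVE)

measured = {"00": 8.2438, "09": 72.5154}  # values reported in Step 2
verdict = evaluate_average(measured)

# Largest seq09 ATE that would still have met the target, given seq00 as measured.
seq09_budget_m = 2 * TARGET_M - measured["00"]
```

With seq00 fixed at 8.2438 m, seq09 would have needed to stay under roughly 45.76 m for the two-sequence average to pass, so the 72.5154 m result alone decides the verdict.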

Execution

budget 0/8
File change matrix: +7 ~12 · 11 files · 8 attempts
1 2 3 4 5 6 7 8  File
~ ~ ~ ~ ~ ~ ~ ~  algorithm/vision-robotics/missions/M-008/CHANGELOG.md
~ · · · · · · ·  algorithm/vision-robotics/missions/M-008/requirements.txt
· ~ · · · · · ·  algorithm/vision-robotics/missions/M-008/src/eval_kitti.py
· · + ~ · · · ·  algorithm/vision-robotics/missions/M-008/runs/MEASURED-001.json
· · + · · · · ·  algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/metrics.json
· · · + · · · ·  MEASURED-001
· · · · · ~ · ·  algorithm/vision-robotics/missions/M-008/src/render_overlay.py
· · · · · · + ·  algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/overlay.html
· · · · · · + ·  algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/tiles/
· · · · · · · +  algorithm/vision-robotics/missions/M-008/POSTMORTEM.md
· · · · · · · +  lab/missions/M-008-postmortem.md

+ created ~ modified

  1. iter 8 · executing → failed · postmortem-and-close · just now

    Executed code_plan Step 5 (final iter, budget 1 -> 0): wrote lab/missions/M-008-postmortem.md (lab mirror) + algorithm/vision-robotics/missions/M-008/POSTMORTEM.md (sandbox copy, v3-F) and closed the mission to terminal 'failed'. Verdict: achieved ate_rmse_kitti_avg_m=40.3796 m vs target le 27.0 m -> target_met=false, measured_artifact=MEASURED-001. This is an HONEST failure, not a re-tryable miss: hard_constraint #1 forbids re-running DPV-SLAM++ so the ATE is frozen on disk; 40.38 > 27.0 is fixed. The recovery PATTERN succeeded (8 cheap host-only iters, no GPU/Docker, turned M-007's stranded NaN-free trajectories into a real Sim3-aligned ATE measurement + a 1.2 MB self-contained Leaflet overlay), but the METRIC does not pass because seq09 fired ZERO loop closures (M-007 Finding #12), drifting to a 243.68 m end-gap and pulling the average to 40.38 m. seq00, where closure fires, is sound at 8.24 m. Deliverable set is COMPLETE: numeric (runs/MEASURED-001.json Shape B + byte-stable metrics.json) + visual (runs/MEASURED-001/overlay.html) + both postmortems. Subtle carry-forward finding for the pillar: even seq00's excellent 8.24 m ATE hides a 103 m residual loop end-gap on the aerial — Sim3-aligned scalar ATE flatters an un-closed loop, so the visual oracle tells a story the numeric one doesn't. Recommend a dual-gate oracle convention (ATE <= target AND loop visibly closes).

    Modified: lab/missions/M-008-postmortem.md, algorithm/vision-robotics/missions/M-008/POSTMORTEM.md, algorithm/vision-robotics/missions/M-008/CHANGELOG.md

    Created: lab/missions/M-008-postmortem.md, algorithm/vision-robotics/missions/M-008/POSTMORTEM.md

  2. iter 7 · executing → executing · execute-render-overlay · just now

    Executed code_plan Step 4: ran `python src/render_overlay.py --m007-sandbox ../M-007 --seqs 00,09 --out runs/MEASURED-001/overlay.html` (exit 0, <5 s, network up). Produced a FULLY self-contained runs/MEASURED-001/overlay.html (1,197,669 bytes) + cached runs/MEASURED-001/tiles/ (36 Esri WorldImagery tiles @ zoom 17, base64-inlined) — Trap #3 resolved in the good case (self_contained=true, no CDN at view time). All sanity gates PASS: 1.2 MB >> 50 KB; 176 'leaflet' hits; embedded SERIES JSON carries both sequences' coords (seq00 2271 est + 2271 GT, seq09 796 est + 796 GT); legend ATE matches the numeric oracle (seq00 8.24 m, seq09 72.52 m). HONEST visual-oracle reading carried to the postmortem: aligned-estimate start->end gaps are seq00 103.4 m and seq09 243.6 m. seq09's gap is the expected zero-loop-closure drift (M-007 Finding #12); the subtler finding is seq00 — its ATE-RMSE is excellent (8.24 m, Sim3 alignment spreads error over the whole path) yet the loop still does NOT visibly snap shut (103 m residual end-gap on a multi-km loop), so the visual oracle tells a slightly different story than the scalar ATE. Stayed in `executing` (not `measuring`) because Step 4 is a non-metric run step (metrics were built iter 2-4) and Step 5 (postmortem + evaluating-close) is still unfinished. Next/last iter (budget 1) writes POSTMORTEM.md + lab/missions/M-008-postmortem.md and takes the final close with target_met=false but a COMPLETE deliverable set (numeric + visual + postmortem).

    Modified: algorithm/vision-robotics/missions/M-008/CHANGELOG.md

    Created: algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/overlay.html, algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/tiles/

  3. iter 6 · planning → executing · author-render-overlay · just now

    Executed code_plan Step 3: authored src/render_overlay.py (the visual oracle) — coding only, not run (planning -> executing is coding-only; the render is Step 4, next iter). 2 files touched (render_overlay.py + CHANGELOG), within the v3-D <=3 cap. Three design choices keep the visual oracle honest and consistent with the numeric one: (1) it reuses eval_kitti._kitti_poses_to_se3 via an absolute import (src/ is sys.path[0] when run as a script, satisfying the entrypoint absolute-imports rule) so both oracles read KITTI poses identically; (2) it applies the SAME align(correct_scale=True) Sim3 transform, so the DRAWN drift equals the MEASURED drift — the raw monocular estimate is ~1/11 metric scale (seq00 x11.04) and would render as an unreadable squiggle unaligned; (3) it draws GT (muted dashed) alongside each estimate (seq00 blue, seq09 orange) because the visual oracle's question — does the loop snap shut? — is only answerable against the GT loop, making seq09's zero-loop-closure drift legible. Trap #2 (confirmed: no oxts on disk) handled by anchoring both sequences at the hardcoded Karlsruhe origin (49.0069N, 8.4037E), heading assumed North, equirectangular projection; the HTML header + legend both state this is shape/loop-closure-meaningful, NOT survey-grade placement. Trap #3 handled by fetching Leaflet 1.9.4 + Esri WorldImagery tiles ONCE at render time and base64-inlining them (custom L.GridLayer -> data-URIs, adaptive zoom keeping tiles <=36, vector CircleMarker dots to avoid icon PNGs) for a no-CDN-at-view-time artifact, with a graceful live-layer fallback that stamps self_contained=false when the render host has no network. Verified py_compile OK. 
    Next iter runs Step 4: `python src/render_overlay.py --m007-sandbox ../M-007 --seqs 00,09 --out runs/MEASURED-001/overlay.html` (gates: >50 KB, contains 'leaflet', coords present); then only Step 5 (postmortem + evaluating -> failed close) remains, fitting the budget exactly (this iter 3->2, Step 4 2->1, close 1->0).

    Modified: algorithm/vision-robotics/missions/M-008/src/render_overlay.py, algorithm/vision-robotics/missions/M-008/CHANGELOG.md

  4. iter 5 · evaluating → planning · evaluating-retry · just now

    Computed the evaluating verdict: achieved ate_rmse_avg_m=40.3796 m vs target le 27.0 m -> target_met=FALSE. Took the evaluating -> planning retry edge, but NOT to chase a lower ATE: hard_constraint #1 forbids re-running DPV-SLAM++, so the trajectories (and therefore the ATE) are frozen on disk — a numeric retry would be both futile and dishonest. The honest 'adjustment Z' is deliverable completion: the required code_plan Steps 3-5 (src/render_overlay.py + overlay.html visual artifact + the honest postmortem) are still pending, and evaluating->planning is the only legal edge that reaches those coding steps. Anti-stall guard checked: this is the FIRST evaluating->planning retry (not the 3-identical-value stall that forces an immediate fail) and budget remains (4->3). No re-measurement will occur. seq09's 72.5 m miss stays framed by hard_constraint #3 as an honest zero-loop-closure dataset outcome (M-007 Finding #12), not a recovery-pattern failure. Next iter (planning -> executing) authors Step 3 src/render_overlay.py using the hardcoded Karlsruhe origin (Trap #2 confirmed: M-007 fetched no oxts GPS).

    Modified: algorithm/vision-robotics/missions/M-008/CHANGELOG.md

  5. iter 4 · measuring → evaluating · build-measured-artifact · just now

    Built the MEASURED-001 Shape B artifact for the evaluating step. Re-wrote runs/MEASURED-001.json from raw Shape A into Shape B: added provenance (run_id=MEASURED-001, mission=M-008, iteration=4, container_invocation='host venv — no docker', oracle='eval_kitti.py', oracle_exit_code=0, oracle_tool='evo 1.36.5 Sim3 align(correct_scale=True)') plus flat primary_metric{} + secondary_metrics{} blocks that the frontend reader (frontend/lib/artifacts.ts L108) lifts into metric tables, and echoed the goal block. The byte-stable Shape A oracle output is preserved UNTOUCHED at runs/MEASURED-001/metrics.json and referenced via raw_oracle_stdout, so v3-C oracle determinism is intact (the wrapper-only timestamp never enters the oracle output). SCHEMA requires fixtures[]; this pillar's native key is sequences[], so the artifact carries BOTH (fixtures[] aliases sequences[]). Verdict locked in: pass_overall=false, primary ate_rmse_avg_m=40.3796 vs target le 27.0; secondaries seq00=8.2438, seq09=72.5154, rpe_seq00=0.1403, loop_closed_seq09=false, scale_drift_seq00=10.044. Carry-forward: the numeric target CANNOT improve (re-running DPV-SLAM++ is forbidden by hard_constraint #1, so ATE is fixed), therefore evaluating must NOT loop to planning to chase a lower ATE — it should spend the remaining 4 iters on the still-pending required deliverables (code_plan Step 3-4 overlay.html visual artifact + Step 5 honest postmortem), then close evaluating → failed with target_met=false but a complete deliverable set. hard_constraint #3 frames seq09's miss as an honest dataset outcome (zero loop closures, M-007 Finding #12), not a recovery-pattern failure.

    Modified: algorithm/vision-robotics/missions/M-008/runs/MEASURED-001.json, algorithm/vision-robotics/missions/M-008/CHANGELOG.md

    Created: MEASURED-001

  6. iter 3 · executing → measuring · execute-measure · just now

    Executed code_plan Step 2: ran the oracle `python src/eval_kitti.py --m007-sandbox ../M-007 --seqs 00,09 --json --save-overlay runs/MEASURED-001`, producing runs/MEASURED-001.json + runs/MEASURED-001/metrics.json (byte-identical, deterministic per v3-C). evo emitted no warnings (stderr empty), wall-clock < 5 s. MEASURED RESULT: seq00 ATE-RMSE = 8.2438 m (excellent, RPE 0.1403 m/m), seq09 = 72.5154 m (heavy drift, RPE 0.6002 m/m), avg = 40.3796 m -> pass_overall=false (> 27.0 target). This is an HONEST outcome anticipated by hard_constraint #3, not a bug: seq09's GT is a loop (start/end gap 9.27 m) but the aligned estimate drifts to a 243.68 m gap and never closes — the direct consequence of M-007 Finding #12 (seq09 fired ZERO loop closures). seq00 (where closure fires) is sound at 8.24 m. Sim3 scale factors (seq00 x11.04, seq09 x5.67) confirm --correct_scale was mandatory: DPV-SLAM++ monocular output is ~1/11 metric scale, and an unaligned ATE would be meaningless. Next (measuring -> evaluating) builds the Shape B MEASURED-001 artifact; evaluating will compare 40.38 vs 27.0 (target NOT met) and ship an honest failure postmortem.

    Modified: algorithm/vision-robotics/missions/M-008/CHANGELOG.md

    Created: algorithm/vision-robotics/missions/M-008/runs/MEASURED-001.json, algorithm/vision-robotics/missions/M-008/runs/MEASURED-001/metrics.json

  7. iter 2 · planning → executing · author-eval-script · just now

    Executed code_plan Step 1: authored src/eval_kitti.py (the ATE-RMSE oracle) and pip-installed evo 1.36.5 + numpy/scipy into the host venv (no Docker, per hard constraint). The script uses evo's Python API (same engine as the evo_ape CLI) for deterministic parseable output, with Sim3 align(correct_scale=True) for monocular scale recovery. Three findings hardened the design before any measurement: (1) trajectories are STRIDED (stride=2) so TUM timestamps are frame indices 0,2,4,… — confirmed seq00=2271 est vs 4541 GT lines, seq09 ts 0→1590 vs 1591 GT; the script assigns GT synthetic integer-frame timestamps and lets evo associate frame-2k↔frame-2k, avoiding silent mis-pairing. (2) M-007 fetched NO oxts GPS files (sequences/ dir absent), confirming Trap #2 — Step 3 render must use the hardcoded Karlsruhe origin. (3) tracking.log has no per-event loop-closure lines, only a SUMMARY JSON, so loop_closed is a documented geometric proxy with M-007 Finding #12 as the authoritative seq09 zero-LC source. Oracle JSON carries no timestamp/runtime and rounds floats to 4dp for byte-stable determinism (v3-C). Verified: evo API imports resolve and the script py_compiles. Next iter (executing → measuring) RUNS the oracle to produce runs/MEASURED-001.json.

    Modified: algorithm/vision-robotics/missions/M-008/src/eval_kitti.py, algorithm/vision-robotics/missions/M-008/CHANGELOG.md

  8. iter 1 · queued → planning · elaborate-code-plan · just now

    Host pre-flight passed: M-007's DPV-SLAM++ trajectory.tum files are present on disk (preflight_command returned 'M-007 trajectories present'), so no preflight-failed transition. Completed Step 0 sandbox bootstrap: the skeleton (CLAUDE.md, CHANGELOG.md, src/tests/runs .gitkeeps) already existed from mission promotion; added the one missing file, requirements.txt (evo>=1.28 + numpy, no Docker per spec). code_plan was already concrete enough (6 well-scoped steps, each <=3 files) so no re-elaboration needed. Next: planning -> executing authors src/eval_kitti.py and pip-installs evo into the host venv.

    Modified: algorithm/vision-robotics/missions/M-008/requirements.txt, algorithm/vision-robotics/missions/M-008/CHANGELOG.md
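Iteration 2's first finding, that the estimates are strided so TUM timestamps are frame indices 0, 2, 4, …, is the kind of detail that silently mis-pairs poses if ignored. The sketch below shows the pairing logic in isolation. It is an illustration only: the log says the real script hands synthetic integer-frame timestamps to evo's own association, and the function name here is hypothetical.

```python
def associate_by_frame(est_timestamps, n_gt_frames):
    """Pair each estimate with the GT pose whose frame index equals its timestamp.

    Returns a list of (estimate_index, gt_frame_index) tuples. Raises if a
    timestamp is not an integer frame index or falls outside the GT range,
    instead of pairing it with the wrong pose.
    """
    pairs = []
    for i, ts in enumerate(est_timestamps):
        frame = int(round(ts))
        if abs(ts - frame) > 1e-6:
            raise ValueError(f"timestamp {ts} is not an integer frame index")
        if not 0 <= frame < n_gt_frames:
            raise ValueError(f"frame {frame} outside GT range 0..{n_gt_frames - 1}")
        pairs.append((i, frame))
    return pairs


# Stride-2 estimates against the GT line counts quoted in the log.
seq00_pairs = associate_by_frame([float(t) for t in range(0, 4541, 2)], 4541)
seq09_pairs = associate_by_frame([float(t) for t in range(0, 1591, 2)], 1591)
```

The resulting pair counts (2271 for seq00, 796 for seq09) match the coordinate counts reported for the overlay, which is a cheap cross-check that no pose was dropped or doubled.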


Results

MEASURED-001

measured
just now

Metrics

Metric                           Value
metric                           ate_rmse_kitti_avg_m
ate_rmse_avg_m                   40.380
target_op                        le
target_value                     27
pass                             no
paper_anchor_m                   25.760
tolerance_absolute               1.350
tolerance_relative               0.0500
ate_rmse_seq00_m                 8.244
ate_rmse_seq09_m                 72.515
rpe_translation_seq00_m_per_m    0.1403
loop_closed_seq09                no
scale_drift_seq00                10.044
SLAM-estimated trajectory overlaid on satellite imagery. Self-contained interactive map — drag to pan, scroll to zoom.

Lessons recorded

Post-mortem notes carried forward for future missions.

  • Eval-only mission: it validates the meta-pipeline RECOVERY PATTERN (measure a precursor mission's stranded outputs), NOT DPV-SLAM++ algorithm quality — that verdict lives in M-007's postmortem.
  • seq09's 72.5154 m miss (which drives the 40.3796 m average over target) is the direct, spec-anticipated (hard_constraint #3) consequence of seq09 firing ZERO loop closures (M-007 Finding #12); it is an honest property of THIS 2-sequence subset/run, not a bug. The paper's 25.76 m is an average over KITTI 00-10, not this 00+09 pair.
  • Numeric-vs-visual oracle divergence: even seq00's excellent 8.2438 m ATE-RMSE does NOT visibly close its loop (103.4 m residual end-gap on a multi-km loop) — Sim3 align(correct_scale) spreads error over the whole path and flatters the scalar ATE relative to what the aerial overlay shows.
  • Sim3 --correct_scale was mandatory (monocular output is ~1/11 metric scale, seq00 scale factor x11.04) but silently absorbs scale drift; secondary metric scale_drift_seq00=10.044 surfaces this explicitly.
  • loop_closed is a deterministic geometric proxy (M-007's tracking.log emits only a SUMMARY JSON, no per-event LC lines); the authoritative seq09 zero-LC source is M-007 Finding #12.
  • No oxts GPS on disk (Trap #2 confirmed), so overlay.html anchors at the hardcoded Karlsruhe origin (49.0069N, 8.4037E, heading-North, equirectangular) — it is shape/loop-closure-meaningful, NOT survey-grade absolute placement.
  • No GPU/CUDA/WSL2 touched: this mission proves nothing about the host CUDA stack.
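The numeric-vs-visual divergence noted above motivates the dual-gate convention recommended in iteration 8 (ATE within target AND the loop visibly closes). One way such a gate might look is sketched below. It assumes the estimate is already Sim3-aligned and paired point-for-point with GT, measures "loop closes" as the estimate's start-to-end gap exceeding GT's by no more than a slack, and uses hypothetical names and thresholds throughout.

```python
import math


def ate_rmse(est, gt):
    """Root-mean-square of per-pose position errors between paired trajectories."""
    sq_errors = [math.dist(pe, pg) ** 2 for pe, pg in zip(est, gt)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def end_gap(traj):
    """Distance between the first and last position of a trajectory."""
    return math.dist(traj[0], traj[-1])


def dual_gate(est, gt, ate_target_m, gap_slack_m):
    """Pass only if ATE-RMSE meets the target AND the loop end-gap stays near GT's."""
    ate = ate_rmse(est, gt)
    gap_excess = end_gap(est) - end_gap(gt)
    ate_ok = ate <= ate_target_m
    loop_ok = gap_excess <= gap_slack_m
    return {"ate_rmse_m": ate, "end_gap_m": end_gap(est),
            "ate_ok": ate_ok, "loop_ok": loop_ok, "pass": ate_ok and loop_ok}


# Toy data (not mission data): GT is a closed 100 m square; the estimate
# tracks it exactly until the final pose, which lands 30 m short of the start.
gt_loop = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0), (0.0, 0.0)]
est_loop = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0), (0.0, 30.0)]
result = dual_gate(est_loop, gt_loop, ate_target_m=27.0, gap_slack_m=10.0)
```

In the toy case the scalar ATE passes comfortably while the loop gate fails, which is the same pattern seq00 shows at much larger scale: averaging error over the whole path hides a concentrated end-of-loop miss.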