ActiveInferenceInstitute · docxology · Jun 12, 2026
diff --git a/.agent_rules/README.md b/.agent_rules/README.md
@@ -90,4 +90,4 @@ uv pip install -e .                   # Install deps
 
 ---
 
-**Pipeline Version**: 1.9.0 | **Steps**: 25 | **Tests**: latest recorded full suite with Ollama integration excludes: 2,381 passed, 17 skipped, 1 xfailed; collect-only inventory is 2,399 tests | **MCP Tools**: verify with `src/tests/mcp/test_mcp_audit.py`
+**Pipeline Version**: 2.0.0 | **Steps**: 25 | **Tests**: latest recorded full suite with Ollama integration excludes: 2,393 passed, 17 skipped, 1 xfailed; collect-only inventory is 2,411 tests | **MCP Tools**: verify with `src/tests/mcp/test_mcp_audit.py`
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -3,7 +3,7 @@
 This guide details the architecture of the Generalized Notation Notation (GNN) system. It complements `DOCS.md` and `doc/pipeline/README.md` with an implementation-oriented perspective for developers.
 
 **Last Updated**: 2026-06-12
-**Version**: 1.9.0
+**Version**: 2.0.0
 **Status**: Maintained
 **Pipeline Steps**: 25 (0-24)
 
@@ -323,7 +323,7 @@ Each agent implements comprehensive performance monitoring:
 
 ---
 
-**Architecture Version**: 1.9.0
+**Architecture Version**: 2.0.0
 **Last Updated**: 2026-06-12
 **Status**: ✅ Production Ready
 **Compliance**: Thin orchestrator pattern

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -12,6 +12,24 @@ No unreleased changes yet.
 
 ---
 
+## [2.0.0] — 2026-06-12
+
+### Added
+- **Semantic fidelity release gate**: `scripts/run_semantic_fidelity_gate.py` writes `gnn_semantic_fidelity_ledger_v1` artifacts for maintained model families.
+- **Strict semantic contracts**: representative fixtures now preserve model identity, variables, edges, dimensions, parameter shapes, equations, time, and ontology mappings across JSON parse/serialize/parse checks.
+- **Cross-framework reliability release gate**: `scripts/run_cross_framework_reliability.py` writes `gnn_cross_framework_reliability_ledger_v1` artifacts with compatible, required, and unsupported backend statuses.
+- **GridWorld three-backend comparison**: GridWorld is profiled for PyMDP, RxInfer, and ActiveInference.jl, including seed, trace length, matrix-shape, and matrix-provenance parity.
+
+### Changed
+- GridWorld model-family acceptance now requests PyMDP, RxInfer, and ActiveInference.jl for the v2 comparison fixture instead of a PyMDP-only profile.
+- Roadmap next target moves to v3.0.0 for durable streams, long-running sessions, and auditable container plans.
+
+### Fixed
+- JSON serialization now emits equation objects instead of lossy stringified dataclasses, preventing silent semantic round-trip drift.
+- Cross-framework reliability no longer certifies aggregate Step 12 success without successful non-skipped execution-detail rows and current simulation payloads for required backends.
+
+---
+
 ## [1.9.0] — 2026-06-12
 
 ### Added
@@ -149,7 +167,8 @@ No unreleased changes yet.
 - pytest test suite with comprehensive coverage
 - MCP tool registration framework
 
-[Unreleased]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v1.9.0...HEAD
+[Unreleased]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v2.0.0...HEAD
+[2.0.0]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v1.9.0...v2.0.0
 [1.9.0]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v1.8.0...v1.9.0
 [1.8.0]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v1.6.0...v1.8.0
 [1.6.0]: https://github.com/ActiveInferenceInstitute/GeneralizedNotationNotation/compare/v1.3.0...v1.6.0

diff --git a/CITATION.cff b/CITATION.cff
@@ -9,7 +9,7 @@ authors:
     # This entry acknowledges all contributors. Individual contributors can be listed above if desired.
 
 title: "GeneralizedNotationNotation (GNN)"
-version: 1.9.0 # Current stable release
+version: 2.0.0 # Current stable release
 date-released: 2026-06-12
 
 abstract: |

diff --git a/README.md b/README.md
@@ -49,11 +49,11 @@
 
 **Smékal, J., & Friedman, D. A. (2023)**. *Generalized Notation Notation for Active Inference Models*. Active Inference Journal.  
 **Last Updated**: 2026-06-12
-**Version**: 1.9.0
+**Version**: 2.0.0
 **Status**: ✅ Production Ready (Active Inference Institute)  
-**Test Suite Inventory (measured 2026-06-12)**: 184 `test_*.py` files under `src/tests/`; `uv run --extra dev python -m pytest --collect-only src/tests/ -q --tb=no --ignore=src/tests/llm/test_llm_ollama.py --ignore=src/tests/llm/test_llm_ollama_integration.py` collected 2,399 tests. Latest recorded full suite evidence with the same Ollama integration excludes is 2,381 passed, 17 skipped, 1 xfailed.
-**Features (v1.9.0)**: model-family acceptance and interpretability ledgers for basics, discrete, continuous, hierarchical, multi-agent, precision, structured, gridworld, and scaling-study fixtures; explicit profiled unsupported Step 11/12 skips for continuous/hierarchical families; maintained template CLI (`gnn templates list`, `gnn templates show`, `gnn pull`); packaged template assets with checksum/collision handling; authenticated local MCP HTTP orchestration; pre-commit/devcontainer tooling; structured PyMDP 1.0 POMDP execution; PyMDP Scaling Study; and MCP Full Module Exposure.
-**Next Target**: v2.0.0 semantic fidelity and cross-framework reliability hardening.
+**Test Suite Inventory (measured 2026-06-12)**: 186 `test_*.py` files under `src/tests/`; `uv run --extra dev python -m pytest --collect-only src/tests/ -q --tb=no --ignore=src/tests/llm/test_llm_ollama.py --ignore=src/tests/llm/test_llm_ollama_integration.py` collected 2,411 tests. Latest recorded full suite evidence with the same Ollama integration excludes is 2,393 passed, 17 skipped, 1 xfailed.
+**Features (v2.0.0)**: semantic fidelity ledgers across all maintained model families, strict JSON parse/serialize/parse preservation for variables, edges, dimensions, parameter shapes, equations, time, and ontology mappings; cross-framework reliability ledgers with explicit compatible/unsupported backend statuses; GridWorld comparison across PyMDP, RxInfer, and ActiveInference.jl; model-family acceptance and interpretability ledgers; maintained template CLI (`gnn templates list`, `gnn templates show`, `gnn pull`); authenticated local MCP HTTP orchestration; structured PyMDP 1.0 POMDP execution; PyMDP Scaling Study; and MCP Full Module Exposure.
+**Next Target**: v3.0.0 long-running orchestration, durable observation streams, and auditable container plans.
 📖 **DOI:** [10.5281/zenodo.7803328](https://doi.org/10.5281/zenodo.7803328)  
 📁 **Archive:** [zenodo.org/records/7803328](https://zenodo.org/records/7803328)
 

diff --git a/TO-DO.md b/TO-DO.md
@@ -1,20 +1,11 @@
 # TO-DO — GNN Pipeline Roadmap
 
 **Last Updated**: 2026-06-12
-**Current Version**: 1.9.0
-**Next Target**: v2.0.0 (semantic fidelity and cross-framework reliability)
-
-**Current Evidence (2026-06-12)**: v1.9.0 focused family/report suite
-`17 passed`; command-of-record collect-only inventory is `2399` collected tests
-across 184 `test_*.py` files with the documented Ollama integration ignores.
-Latest full local suite evidence with the same Ollama ignores is
-`2381 passed, 17 skipped, 1 xfailed`. The all-family strict acceptance passed
-for 9 families; continuous/hierarchical Step 11/12 recorded as profiled
-unsupported skips with `0` raw failed Step 11/12 counts. v1.8.0 focused
-release smokes passed for `gnn templates list`, `gnn templates show
-pomdp-gridworld-3x3`, dry-run `gnn pull` to `/tmp/gnn-pull`, and authenticated
-MCP HTTP tests (`12 passed`; combined CLI/MCP/capability suite `32 passed`);
-`just lint` passes.
+**Current Version**: 2.0.0
+**Next Target**: v3.0.0 (long-running orchestration, durable streams, and auditable container plans)
+
+**Current Evidence (2026-06-12)**: v2.0.0 semantic fidelity gate passed for 9 families (`gnn_semantic_fidelity_ledger_v1`); cross-framework reliability gate passed for 9 families (`gnn_cross_framework_reliability_ledger_v1`) with GridWorld compared PyMDP, RxInfer, and ActiveInference.jl and all other unprofiled backends recorded with explicit unsupported statuses. Command-of-record collect-only inventory is `2411` collected tests across 186 `test_*.py` files with the documented Ollama integration ignores. Latest full local suite evidence with the same Ollama ignores is `2393 passed, 17 skipped, 1 xfailed`. v1.9 all-family strict acceptance remains green for 9 families; continuous and hierarchical Step 11/12 remain profiled unsupported skips with `0` raw failed Step 11/12 counts. v1.8.0 focused release smokes passed for `gnn templates list`, `gnn templates show pomdp-gridworld-3x3`, dry-run `gnn pull` to `/tmp/gnn-pull`, and authenticated
+MCP HTTP tests (`12 passed`; combined CLI/MCP/capability suite `32 passed`); `just lint` passes.
 
 ---
 
@@ -111,12 +102,21 @@ uv run --extra dev python src/main.py --target-dir input/gnn_files/discrete --ou
 
 ---
 
-## 🧪 v2.0.0 — Semantic Fidelity & Cross-Framework Reliability
+## ✅ v2.0.0 — Semantic Fidelity & Cross-Framework Reliability (Released)
 
 > **Scope**: Upgrade GNN from broad fixture acceptance to stronger semantic preservation, cross-format round trips, and cross-framework equivalence checks.
+> **Released**: 2026-06-12 (tag: `v2.0.0`)
+
+- [x] **Semantic Round-Trip Gates** — Require representative model families to preserve variables, edges, dimensions, parameter shapes, equations, time, and ontology mappings across the maintained strict JSON interchange path. `scripts/run_semantic_fidelity_gate.py` passed for all 9 manifest families and wrote `gnn_semantic_fidelity_ledger_v1` artifacts.
+- [x] **Cross-Framework Result Comparisons** — Compare compatible backend outputs through `scripts/run_cross_framework_reliability.py`; required backends need Step 11/12 evidence, successful non-skipped Step 12 execution detail rows, current simulation payloads, matching seeds when present, trace lengths, and matrix-shape/provenance parity. The all-family gate passed for 9 families; GridWorld compared PyMDP, RxInfer, and ActiveInference.jl, while JAX, NumPyro, PyTorch, and DisCoPy remain explicit unsupported statuses unless profiled for a compatible family.
 
-- [ ] **Semantic Round-Trip Gates** — Require representative model families to preserve variables, edges, dimensions, and key matrix contracts across maintained formats.
-- [ ] **Cross-Framework Result Comparisons** — Compare compatible PyMDP, RxInfer, JAX, NumPyro, PyTorch, ActiveInference.jl, and DisCoPy outputs with explicit skipped/failed states for unavailable frameworks.
+### Acceptance
+```bash
+uv run --extra dev python -m pytest src/tests/pipeline/test_semantic_fidelity_gate.py src/tests/pipeline/test_cross_framework_reliability.py -q
+uv run --extra dev python scripts/run_semantic_fidelity_gate.py --manifest input/model_family_manifest.json --output-dir /tmp/gnn-semantic-fidelity --strict
+uv run --extra dev python scripts/run_cross_framework_reliability.py --manifest input/model_family_manifest.json --output-dir /tmp/gnn-cross-framework --strict
+uv run --extra dev python scripts/run_model_family_acceptance.py --manifest input/model_family_manifest.json --output-dir /tmp/gnn-family-acceptance-all --strict
+```
 
 ---
 

diff --git a/input/model_family_manifest.json b/input/model_family_manifest.json
@@ -75,7 +75,7 @@
       "name": "gridworld",
       "description": "Gridworld POMDP fixture used for cross-framework acceptance checks.",
       "target_dir": "input/gnn_files/pomdp_gridworld",
-      "frameworks": "pymdp",
+      "frameworks": "pymdp,rxinfer,activeinference_jl",
       "representative_files": ["pomdp_gridworld_3x3.md"]
     },
     {

diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "generalized-notation-notation"
-version = "1.9.0"
+version = "2.0.0"
 description = "A text-based language for standardizing Active Inference generative models"
 readme = "README.md"
 requires-python = ">=3.11,<3.14"

diff --git a/scripts/check_capability_contracts.py b/scripts/check_capability_contracts.py
@@ -114,6 +114,11 @@ def run_audit() -> List[str]:
         and "**Next Target**: v2.0.0" not in todo_text
     ):
         failures.append("TO-DO.md: v1.9.0 release must set v2.0.0 as next target")
+    if (
+        "**Current Version**: 2.0.0" in todo_text
+        and "**Next Target**: v3.0.0" not in todo_text
+    ):
+        failures.append("TO-DO.md: v2.0.0 release must set v3.0.0 as next target")
 
     readme_tests = _read("src/tests/README.md")
     maintained_dirs, direct_test_dirs = _maintained_test_directory_counts()
@@ -190,6 +195,19 @@ def run_audit() -> List[str]:
             "collect-only inventory",
             "full suite evidence",
         ),
+        "Semantic Round-Trip Gates": (
+            "semantic fidelity gate passed for 9 families",
+            "gnn_semantic_fidelity_ledger_v1",
+            "variables, edges, dimensions, parameter shapes, equations, time, and ontology mappings",
+            "scripts/run_semantic_fidelity_gate.py",
+        ),
+        "Cross-Framework Result Comparisons": (
+            "cross-framework reliability gate passed for 9 families",
+            "gnn_cross_framework_reliability_ledger_v1",
+            "GridWorld compared PyMDP, RxInfer, and ActiveInference.jl",
+            "explicit unsupported statuses",
+            "scripts/run_cross_framework_reliability.py",
+        ),
     }
     for item in guarded_pending_items:
         if f"- [x] **{item}**" in todo_text:
@@ -245,6 +263,26 @@ def run_audit() -> List[str]:
         if not _exists(required):
             failures.append(f"v1.9 model-family contract missing: {required}")
 
+    for required in (
+        "scripts/run_semantic_fidelity_gate.py",
+        "scripts/run_cross_framework_reliability.py",
+        "src/pipeline/semantic_fidelity.py",
+        "src/pipeline/cross_framework_reliability.py",
+        "src/report/semantic_fidelity.py",
+        "src/report/cross_framework_reliability.py",
+        "src/tests/pipeline/test_semantic_fidelity_gate.py",
+        "src/tests/pipeline/test_cross_framework_reliability.py",
+    ):
+        if not _exists(required):
+            failures.append(f"v2.0 reliability contract missing: {required}")
+
+    if "pymdp,rxinfer,activeinference_jl" not in _read(
+        "input/model_family_manifest.json"
+    ):
+        failures.append(
+            "input/model_family_manifest.json: GridWorld must profile a real multi-backend comparison"
+        )
+
     if "WebSocket" in todo_text:
         if not _contains(
             "src/gui/websocket_bridge.py",

diff --git a/scripts/run_cross_framework_reliability.py b/scripts/run_cross_framework_reliability.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python3
+"""Run profiled cross-framework reliability checks for maintained families."""
+
+from __future__ import annotations
+
+import argparse
+import sys
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parents[1]
+SRC_DIR = REPO_ROOT / "src"
+if str(SRC_DIR) not in sys.path:
+    sys.path.insert(0, str(SRC_DIR))
+
+from pipeline.cross_framework_reliability import (
+    MAINTAINED_FRAMEWORKS,
+    run_cross_framework_reliability,
+)
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "--manifest",
+        type=Path,
+        default=Path("input/model_family_manifest.json"),
+        help="Path to the model-family manifest",
+    )
+    parser.add_argument(
+        "--families",
+        default="",
+        help="Comma-separated family names to run; defaults to all families",
+    )
+    parser.add_argument(
+        "--frameworks",
+        default=",".join(MAINTAINED_FRAMEWORKS),
+        help="Comma-separated maintained frameworks to profile",
+    )
+    parser.add_argument(
+        "--output-dir",
+        type=Path,
+        required=True,
+        help="Directory for reliability artifacts",
+    )
+    parser.add_argument("--strict", action="store_true", help="Fail on mismatch")
+    args = parser.parse_args(argv)
+
+    families = [item.strip() for item in args.families.split(",") if item.strip()]
+    frameworks = [item.strip() for item in args.frameworks.split(",") if item.strip()]
+    try:
+        ledger = run_cross_framework_reliability(
+            args.manifest,
+            args.output_dir,
+            family_names=families,
+            frameworks=frameworks,
+            strict=args.strict,
+        )
+    except (FileNotFoundError, KeyError, RuntimeError, ValueError) as exc:
+        print(f"FAIL: {exc}", file=sys.stderr)
+        return 1
+    print(f"Cross-framework reliability {ledger['status']}: {args.output_dir}")
+    return 0 if ledger["status"] == "passed" else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
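
The changelog and manifest hunks above describe required backends and explicit unsupported statuses, but the diff does not show how `pipeline.cross_framework_reliability` assigns them. The following is a minimal sketch of one way that classification could work, not the project's implementation. `classify_backends` is a hypothetical helper, and the `MAINTAINED_FRAMEWORKS` tuple is an assumed stand-in: only `pymdp`, `rxinfer`, and `activeinference_jl` appear as identifiers in the manifest, and the other four identifiers are guesses based on the framework names in TO-DO.md.

```python
from __future__ import annotations

# Assumed stand-in: the real MAINTAINED_FRAMEWORKS is defined in
# pipeline.cross_framework_reliability and is not shown in this diff.
MAINTAINED_FRAMEWORKS = (
    "pymdp",
    "rxinfer",
    "activeinference_jl",
    "jax",
    "numpyro",
    "pytorch",
    "discopy",
)


def classify_backends(
    profiled: str, maintained: tuple[str, ...] = MAINTAINED_FRAMEWORKS
) -> dict[str, str]:
    """Hypothetical helper: label each maintained framework for one family.

    `profiled` is the comma-separated "frameworks" string from the manifest.
    Frameworks named there are "required"; every other maintained framework
    is recorded as "unsupported" rather than silently omitted.
    """
    requested = {name.strip() for name in profiled.split(",") if name.strip()}
    unknown = requested - set(maintained)
    if unknown:
        raise ValueError(f"Unknown frameworks: {sorted(unknown)}")
    return {
        name: "required" if name in requested else "unsupported"
        for name in maintained
    }


# The GridWorld manifest entry after this change.
statuses = classify_backends("pymdp,rxinfer,activeinference_jl")
```

Recording every maintained framework, including the ones that are not profiled, is what lets a ledger distinguish a deliberate skip from a missing result.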
diff --git a/scripts/run_semantic_fidelity_gate.py b/scripts/run_semantic_fidelity_gate.py
@@ -0,0 +1,63 @@
+#!/usr/bin/env python3
+"""Run strict semantic fidelity checks for maintained model families."""
+
+from __future__ import annotations
+
+import argparse
+import sys
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parents[1]
+SRC_DIR = REPO_ROOT / "src"
+if str(SRC_DIR) not in sys.path:
+    sys.path.insert(0, str(SRC_DIR))
+
+from pipeline.semantic_fidelity import run_semantic_fidelity_gate
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "--manifest",
+        type=Path,
+        default=Path("input/model_family_manifest.json"),
+        help="Path to the model-family manifest",
+    )
+    parser.add_argument(
+        "--families",
+        default="",
+        help="Comma-separated family names to run; defaults to all families",
+    )
+    parser.add_argument(
+        "--formats",
+        default="json",
+        help="Comma-separated serializer/parser formats to check",
+    )
+    parser.add_argument(
+        "--output-dir",
+        type=Path,
+        required=True,
+        help="Directory for semantic fidelity artifacts",
+    )
+    parser.add_argument("--strict", action="store_true", help="Fail on mismatch")
+    args = parser.parse_args(argv)
+
+    families = [item.strip() for item in args.families.split(",") if item.strip()]
+    formats = [item.strip() for item in args.formats.split(",") if item.strip()]
+    try:
+        ledger = run_semantic_fidelity_gate(
+            args.manifest,
+            args.output_dir,
+            family_names=families,
+            formats=formats,
+            strict=args.strict,
+        )
+    except (FileNotFoundError, KeyError, RuntimeError, ValueError) as exc:
+        print(f"FAIL: {exc}", file=sys.stderr)
+        return 1
+    print(f"Semantic fidelity {ledger['status']}: {args.output_dir}")
+    return 0 if ledger["status"] == "passed" else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
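
Both new scripts share two small conventions: comma-separated flag values are split with blanks discarded, and the process exit code is 0 only when the returned ledger reports `passed`. The sketch below isolates those two pieces so they can be checked without the pipeline installed. `split_csv` and `exit_code` are hypothetical names introduced here for illustration; the scripts inline the same expressions.

```python
from __future__ import annotations


def split_csv(value: str) -> list[str]:
    """Split a comma-separated flag value, dropping blanks and padding."""
    return [item.strip() for item in value.split(",") if item.strip()]


def exit_code(ledger: dict) -> int:
    """Return 0 only when the ledger status is exactly "passed"."""
    return 0 if ledger["status"] == "passed" else 1


# An empty --families value yields an empty list, which the scripts pass
# through; per the flag's help text this means "all families".
all_families = split_csv("")
some_families = split_csv(" gridworld, ,discrete,")
```

Any status other than `passed` maps to a non-zero exit, so a CI step wrapping either script fails closed.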
diff --git a/src/AGENTS.md b/src/AGENTS.md
@@ -190,9 +190,9 @@ graph TD
   --ignore=src/tests/llm/test_llm_ollama_integration.py`. Re-include the two Ollama files
   when `ollama` is installed and reachable.
 - **Current test inventory (2026-06-12)**: 184 `test_*.py` files under `src/tests/`;
-  the command-of-record collect pass with Ollama integration tests ignored collected 2,399 tests.
+  the command-of-record collect pass with Ollama integration tests ignored collected 2,411 tests.
   Latest recorded full suite evidence with the same Ollama integration excludes is
-  2,381 passed, 17 skipped, 1 xfailed.
+  2,393 passed, 17 skipped, 1 xfailed.
 - All 25 orchestrator scripts comply with the <150 line thin orchestrator pattern.
 - Maintained source/test documentation coverage is enforced by `doc/development/docs_audit.py --strict`.
 
@@ -342,6 +342,6 @@ pytest --cov=src --cov-report=term-missing
 ---
 
 **Last Updated**: 2026-06-12
-**Pipeline Version**: 1.9.0
+**Pipeline Version**: 2.0.0
 **Total Steps**: 25 (0-24)
 **Status**: Maintained
diff --git a/src/gnn/parsers/json_serializer.py b/src/gnn/parsers/json_serializer.py
@@ -63,7 +63,12 @@ def serialize(self, model: GNNInternalRepresentation) -> str:
                 for param in model.parameters
             ],
             "equations": [
-                str(eq)
+                {
+                    "label": getattr(eq, "label", None),
+                    "content": getattr(eq, "content", ""),
+                    "format": getattr(eq, "format", "latex"),
+                    "description": getattr(eq, "description", ""),
+                }
                 for eq in (model.equations if hasattr(model, "equations") else [])
             ],
             "time_specification": self._serialize_time_spec(model.time_specification)
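
The `json_serializer.py` hunk replaces `str(eq)` with a dictionary of four fields. The sketch below shows why the stringified form cannot survive a parse/serialize/parse check while the object form can. `Equation` is a hypothetical dataclass standing in for the project's equation type, which is not shown in this diff; only the four field names and their defaults are taken from the hunk.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class Equation:
    """Hypothetical stand-in for the project's equation type."""

    content: str
    label: str | None = None
    format: str = "latex"
    description: str = ""


def equation_to_dict(eq: object) -> dict:
    """Mirror the field extraction used in the serializer hunk."""
    return {
        "label": getattr(eq, "label", None),
        "content": getattr(eq, "content", ""),
        "format": getattr(eq, "format", "latex"),
        "description": getattr(eq, "description", ""),
    }


original = Equation(content="s_{t+1} = B s_t", label="transition")

# Old behavior: the dataclass repr becomes an opaque JSON string.
lossy = json.loads(json.dumps(str(original)))

# New behavior: structured fields survive the JSON round trip.
restored = Equation(**json.loads(json.dumps(equation_to_dict(original))))
```

After the round trip, `lossy` is a plain string that a parser would have to re-interpret, whereas `restored` compares equal to `original` field by field.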