Provenance Signing

CalcBridge provides cryptographic provenance signing for XLSX exports, ensuring data integrity and enabling downstream verification.


Overview

Provenance signing creates a tamper-evident record of exported data by:

  1. Computing a canonical hash of every cell value
  2. Signing the hash with HMAC-SHA256
  3. Embedding a manifest in the XLSX file as a Custom XML Part

flowchart TD
    subgraph Hashing["Canonical Hashing"]
        CELLS["Cell Values"] --> SERIALIZE["Deterministic\nSerialization"]
        SERIALIZE --> HASH["SHA-256\nHash"]
    end

    subgraph Signing["HMAC Signing"]
        HASH --> HMAC["HMAC-SHA256"]
        SECRET["Server Secret"] --> HMAC
    end

    subgraph Embedding["Manifest Embedding"]
        HMAC --> MANIFEST["Provenance\nManifest"]
        MANIFEST --> XML["Custom XML\nPart"]
        XML --> XLSX["XLSX File"]
    end

    style HMAC fill:#DCFCE7,stroke:#22C55E
    style SECRET fill:#FEF3C7,stroke:#F59E0B

Canonical Hashing

Cell Serialization

Each cell is serialized deterministically to ensure consistent hashing:

{sheet_name}:{cell_reference}:{type}:{value}

Type Normalization

| Cell Type | Normalization |
|---|---|
| Boolean | Lowercase: true / false |
| Integer | String representation: 42 |
| Float | Fixed precision: 3.14159 |
| Decimal | String representation |
| DateTime | ISO 8601: 2026-01-25T10:30:00 |
| Date | ISO 8601: 2026-01-25 |
| String | Trimmed, as-is |
| None/Empty | Empty string |
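
The serialization format and normalization rules can be sketched in Python roughly as follows. This is an illustration only, not CalcBridge's implementation: the type labels (`boolean`, `integer`, and so on), the float precision of 5 decimal places, and the function names `normalize` and `serialize_cell` are all assumptions, since the documentation does not specify them.

```python
from datetime import date, datetime
from decimal import Decimal

# Assumption: the docs say "fixed precision" without giving the number of digits.
FLOAT_PRECISION = 5


def normalize(value):
    """Return a (type_label, normalized_text) pair for one cell value.

    The type labels are hypothetical; only the normalization rules come
    from the table above.
    """
    if value is None or value == "":
        return "empty", ""
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "boolean", "true" if value else "false"
    if isinstance(value, int):
        return "integer", str(value)
    if isinstance(value, float):
        return "float", f"{value:.{FLOAT_PRECISION}f}"
    if isinstance(value, Decimal):
        return "decimal", str(value)
    if isinstance(value, datetime):  # checked before date: datetime is a date subclass
        return "datetime", value.isoformat(timespec="seconds")
    if isinstance(value, date):
        return "date", value.isoformat()
    return "string", str(value).strip()


def serialize_cell(sheet_name, cell_reference, value):
    """Build the {sheet_name}:{cell_reference}:{type}:{value} string."""
    type_label, text = normalize(value)
    return f"{sheet_name}:{cell_reference}:{type_label}:{text}"
```

For example, `serialize_cell("Sheet1", "B2", True)` yields `Sheet1:B2:boolean:true` under these assumptions.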

Hash Computation

  1. Serialize each cell using the deterministic format
  2. Compute SHA-256 hash per cell
  3. Combine all cell hashes into a canonical data hash
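
A minimal sketch of these three steps is shown below. The documentation does not say how per-cell hashes are combined, so this example assumes one plausible scheme: sort the per-cell digests, join them with newlines, and hash the result. The function names and the combining rule are assumptions; the `sha256:` prefix follows the manifest example.

```python
import hashlib


def cell_hash(serialized_cell):
    """SHA-256 hex digest of one serialized cell string."""
    return hashlib.sha256(serialized_cell.encode("utf-8")).hexdigest()


def canonical_data_hash(serialized_cells):
    """Combine per-cell digests into a single hash.

    Assumption: digests are sorted so the result does not depend on the
    order in which cells are visited, then joined and hashed again.
    """
    digests = sorted(cell_hash(cell) for cell in serialized_cells)
    combined = hashlib.sha256("\n".join(digests).encode("ascii")).hexdigest()
    return f"sha256:{combined}"
```

Sorting makes the combined hash independent of iteration order, which is one way to keep the result deterministic across exports.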

HMAC Signing

The canonical data hash is signed using HMAC-SHA256:

  • Algorithm: HMAC-SHA256
  • Key: Server-side secret (configurable, supports rotation)
  • Input: Canonical data hash
  • Output: HMAC signature
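
The signing step maps directly onto Python's standard `hmac` module. In this sketch, the function name `sign_data_hash` and the choice to sign the UTF-8 bytes of the hash string are assumptions; the `hmac-sha256:` prefix follows the manifest example.

```python
import hashlib
import hmac


def sign_data_hash(data_hash, secret):
    """Return an HMAC-SHA256 signature over the canonical data hash.

    Assumption: the hash string is signed as UTF-8 bytes; the real
    implementation may sign the raw digest bytes instead.
    """
    digest = hmac.new(secret, data_hash.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"hmac-sha256:{digest}"
```

The secret is passed as `bytes` and should never be embedded in the exported file; only the resulting signature is.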

Key Rotation

HMAC verification supports key rotation:

  • Current secret is tried first
  • If verification fails, previous secrets are attempted
  • Constant-time comparison prevents timing attacks
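
The rotation behavior described above could look roughly like this. It is a sketch under assumptions: the function name, the argument layout, and the way secrets are stored are hypothetical. `hmac.compare_digest` is the standard-library constant-time comparison.

```python
import hashlib
import hmac


def verify_signature(data_hash, signature, current_secret, previous_secrets=()):
    """Try the current secret first, then each previous secret in turn."""
    for secret in (current_secret, *previous_secrets):
        digest = hmac.new(secret, data_hash.encode("utf-8"), hashlib.sha256).hexdigest()
        expected = f"hmac-sha256:{digest}".encode("ascii")
        # compare_digest runs in time independent of where the inputs differ.
        if hmac.compare_digest(expected, signature.encode("utf-8")):
            return True
    return False
```

Exports signed before a rotation stay verifiable as long as the old secret remains in the list of previous secrets.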

Manifest Structure

The provenance manifest contains:

| Field | Description |
|---|---|
| export_id | Unique export identifier |
| workbook_id | Source workbook ID |
| export_format | Always xlsx for provenance |
| data_hash | SHA-256 canonical hash |
| hmac_signature | HMAC-SHA256 signature |
| manifest_version | Hash algorithm version (cb_hash_v1) |
| hash_scope | Which sheets were hashed |

XML Format

<?xml version="1.0" encoding="UTF-8"?>
<calcbridge-provenance xmlns="urn:calcbridge:provenance:v1">
  <export_id>exp_abc123</export_id>
  <workbook_id>550e8400-e29b-41d4-a716-446655440000</workbook_id>
  <export_format>xlsx</export_format>
  <data_hash>sha256:e3b0c44298fc1c149afbf4c8996fb924...</data_hash>
  <hmac_signature>hmac-sha256:b0344c61d8db38535ca8afceaf0bf12b...</hmac_signature>
  <manifest_version>cb_hash_v1</manifest_version>
  <hash_scope>all_sheets</hash_scope>
</calcbridge-provenance>
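
A manifest in this format can be read with Python's standard XML parser. The sketch below assumes the flat structure shown above; `parse_manifest` is a hypothetical helper name, not part of any CalcBridge library.

```python
import xml.etree.ElementTree as ET

NAMESPACE = "urn:calcbridge:provenance:v1"


def parse_manifest(xml_bytes):
    """Parse a provenance manifest into a plain dict of field name to text."""
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{NAMESPACE}}}calcbridge-provenance":
        raise ValueError("not a CalcBridge provenance manifest")
    # Child tags carry the default namespace as a "{uri}" prefix; strip it.
    return {child.tag.split("}", 1)[-1]: (child.text or "").strip() for child in root}
```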

Verification

API Verification

Use the Exports API to verify an export:

GET /api/v1/exports/{export_id}/verify
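
As an illustration, the request could be issued from Python with the standard library. Only the path comes from this page; the base URL, the bearer-token authentication, and the assumption that the response body is JSON are all hypothetical and should be checked against the Exports API reference.

```python
import json
import urllib.parse
import urllib.request


def build_verify_request(base_url, export_id, token=None):
    """Build the GET request for the verify endpoint without sending it."""
    path = f"/api/v1/exports/{urllib.parse.quote(export_id, safe='')}/verify"
    headers = {"Accept": "application/json"}
    if token:
        # Assumption: bearer-token auth; the docs do not describe authentication.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(base_url.rstrip("/") + path, headers=headers, method="GET")


def verify_export(base_url, export_id, token=None):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_verify_request(base_url, export_id, token)) as response:
        return json.load(response)
```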

Manual Verification Flow

flowchart LR
    XLSX["XLSX File"] --> EXTRACT["Extract\nManifest"]
    EXTRACT --> RECOMPUTE["Recompute\nCanonical Hash"]
    RECOMPUTE --> COMPARE{"Hashes\nMatch?"}
    COMPARE -->|Yes| VERIFY_HMAC["Verify\nHMAC"]
    COMPARE -->|No| TAMPERED["Data\nTampered"]
    VERIFY_HMAC --> VALID["Integrity\nVerified"]

    style VALID fill:#DCFCE7,stroke:#22C55E
    style TAMPERED fill:#FEE2E2,stroke:#EF4444
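
The first step of this flow, extracting the manifest, can be sketched with Python's `zipfile` module, since an XLSX file is a ZIP archive. The sketch assumes the manifest is stored under the conventional `customXml/` folder; the exact part name CalcBridge uses is not documented here, so the code scans every XML part in that folder and matches on the root element. `extract_manifest` is a hypothetical helper name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

ROOT_TAG = "{urn:calcbridge:provenance:v1}calcbridge-provenance"


def extract_manifest(xlsx_file):
    """Return the manifest fields as a dict, or None if no manifest is found.

    xlsx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        for name in archive.namelist():
            # Assumption: Custom XML Parts live under customXml/ in the package.
            if not (name.startswith("customXml/") and name.endswith(".xml")):
                continue
            try:
                root = ET.fromstring(archive.read(name))
            except ET.ParseError:
                continue  # skip parts that are not well-formed XML
            if root.tag == ROOT_TAG:
                return {c.tag.split("}", 1)[-1]: (c.text or "").strip() for c in root}
    return None
```

The extracted `data_hash` can then be compared with a locally recomputed canonical hash. The HMAC check itself needs the server-side secret, so that step would normally go through the verification API.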

Cell-Level Diff

When verification fails, cell-level diff identifies exactly which cells changed:

| Status | Description |
|---|---|
| missing | Cell in manifest but not in file |
| modified | Cell value has changed |
| unexpected | Cell in file but not in manifest |

Diff results are truncated at 200 entries to prevent excessive output.
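
The three statuses and the 200-entry cap can be illustrated with a small comparison function. This is a sketch under assumptions: it presumes per-cell hashes are available for both sides as dictionaries keyed by cell reference, and the function name and result shape (`entries`, `truncated`) are hypothetical.

```python
MAX_DIFF_ENTRIES = 200  # limit stated in the documentation


def diff_cells(manifest_cells, file_cells, limit=MAX_DIFF_ENTRIES):
    """Compare two {cell_reference: hash} mappings and classify differences."""
    entries = []
    for ref in sorted(manifest_cells.keys() | file_cells.keys()):
        if ref not in file_cells:
            status = "missing"  # recorded at export time, absent from the file
        elif ref not in manifest_cells:
            status = "unexpected"  # present in the file, never recorded
        elif manifest_cells[ref] != file_cells[ref]:
            status = "modified"
        else:
            continue  # unchanged cells are not reported
        entries.append({"cell": ref, "status": status})
    return {"entries": entries[:limit], "truncated": len(entries) > limit}
```

Returning a `truncated` flag alongside the capped list lets a caller tell a short diff from one that was cut off at the limit.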