Methods

How we build forecasts

Every number on this platform comes from a defined pipeline: ingest authoritative data, run a pinned model, check governance gates, and publish with full provenance. Here is exactly how each step works.

The governing equation

P(event | data, model, gate=pass)   ±   confidence band

Every forecast is a conditional probability: the chance of a hazard event given the latest ingested data, a pinned model version, and confirmation that all governance gates passed. The confidence band quantifies how much the estimate could shift given data and model uncertainty. If any blocking gate fails, the forecast is not published.
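As a rough illustration of what one published number carries, the record below bundles the probability, its band, the pinned model version, and the gate outcome. The class name, field names, and validation rules are assumptions made for this sketch, not the platform's actual schema.

```python
from dataclasses import dataclass

VALID_STATUSES = ("pass", "degrade", "block")


@dataclass(frozen=True)
class Forecast:
    """Hypothetical record for one forecast (illustrative field names)."""
    probability: float   # P(event | data, model, gate=pass)
    band_low: float      # lower edge of the confidence band
    band_high: float     # upper edge of the confidence band
    model_version: str   # pinned model identifier, e.g. "hp-etas-v3.2.1"
    gate_status: str     # overall gate outcome: "pass", "degrade", or "block"

    def __post_init__(self):
        # The band must bracket the probability and stay inside [0, 1].
        if not 0.0 <= self.band_low <= self.probability <= self.band_high <= 1.0:
            raise ValueError("need 0 <= band_low <= probability <= band_high <= 1")
        if self.gate_status not in VALID_STATUSES:
            raise ValueError(f"unknown gate status: {self.gate_status!r}")

    @property
    def publishable(self) -> bool:
        # A blocked forecast is never published.
        return self.gate_status != "block"
```

For example, `Forecast(0.12, 0.08, 0.17, "hp-etas-v3.2.1", "pass").publishable` is `True`, while the same record with `gate_status="block"` is not publishable.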

Data sources

All input data comes from authoritative government and scientific agencies. We do not generate our own observations.

USGS ANSS Comprehensive Earthquake Catalog (ComCat)

Real-time and reviewed earthquake events. Updated continuously. Feed latency: typically 1–5 minutes after event detection. Used for: seismicity rates, b-value computation, event clustering, Coulomb stress updates.

NOAA GFS (Global Forecast System)

Synoptic-scale atmospheric analysis and forecasts. Updated every 6 hours (00Z, 06Z, 12Z, 18Z). Used for: deep-layer shear, SST, ocean heat content, upper-level divergence, moisture fields.
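The 6-hour cycle schedule can be sketched as a small helper that maps any UTC time to the most recent nominal cycle. The function name is hypothetical, and the sketch deliberately ignores the delay between a cycle's nominal time and the moment its output becomes available.

```python
from datetime import datetime, timezone


def latest_gfs_cycle(now: datetime) -> datetime:
    """Return the most recent 00Z/06Z/12Z/18Z cycle time at or before `now`.

    `now` must be timezone-aware. Only the nominal cycle time is computed;
    data availability lag is not modeled.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    cycle_hour = (now_utc.hour // 6) * 6  # floor to 0, 6, 12, or 18
    return now_utc.replace(hour=cycle_hour, minute=0, second=0, microsecond=0)
```

A time of 13:45 UTC maps to the 12Z cycle of the same day, and 05:59 UTC maps to 00Z.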

NHC Advisories and SHIPS Guidance

Official tropical cyclone intensity/track advisories and Statistical Hurricane Intensity Prediction Scheme outputs. Updated with each advisory cycle. Used for: current intensity, track forecast, environmental predictors.

SPC Mesoanalysis

Storm Prediction Center real-time mesoscale analysis fields. Updated hourly. Used for: CAPE, CIN, STP, SRH, composite parameters, LCL height. Freshness threshold: 45 minutes. Input older than this causes gate G1 to return degrade.
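The freshness rule for this source can be sketched as a single comparison. The 45-minute threshold comes from the text above; the function name, the constant name, and the choice to treat an age of exactly 45 minutes as still fresh are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Threshold stated above for SPC Mesoanalysis.
SPC_FRESHNESS_THRESHOLD = timedelta(minutes=45)


def freshness_gate(observed_at: datetime, now: datetime,
                   threshold: timedelta = SPC_FRESHNESS_THRESHOLD) -> str:
    """Return "pass" if the input is within the threshold, else "degrade"."""
    age = now - observed_at
    if age < timedelta(0):
        raise ValueError("observed_at is in the future relative to now")
    # Assumption: the boundary value itself still counts as fresh.
    return "pass" if age <= threshold else "degrade"
```

With `now` fixed, an analysis stamped 44 minutes earlier passes and one stamped 46 minutes earlier degrades.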

RAP / HRRR and MRMS

Rapid Refresh / High-Resolution Rapid Refresh model output and Multi-Radar Multi-Sensor precipitation/rotation products. Updated hourly (RAP) and every 15 min (HRRR/MRMS). Used for: convective initiation, mesocyclone detection, environmental verification.

Model architectures

Each hazard has its own model tailored to the physics and data characteristics of that domain.

Earthquake (hp-etas-v3.2.1)

Architecture: Gradient-boosted ensemble + Bayesian calibration
Features: 42
Key inputs: Seismicity rates, b-value, Coulomb stress, geodetic strain, catalog statistics
Training cutoff: 2026-02-15
Calibration method: Isotonic regression on holdout
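For readers unfamiliar with isotonic regression, here is a minimal pool-adjacent-violators sketch in plain Python. It illustrates the general technique only: the function names are hypothetical, and nothing here should be read as the calibration code behind hp-etas-v3.2.1.

```python
from bisect import bisect_right


def isotonic_fit(scores, outcomes):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    Returns (sorted_scores, fitted_values). Tied scores are not pooled
    specially in this sketch.
    """
    pairs = sorted(zip(scores, outcomes))
    blocks = []  # each block is [sum_of_outcomes, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [x for x, _ in pairs], fitted


def isotonic_predict(sorted_scores, fitted, score):
    """Look up the calibrated value for a new raw score (step interpolation)."""
    i = bisect_right(sorted_scores, score) - 1
    return fitted[max(i, 0)]  # scores below the range use the first value
```

On holdout scores `[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]` with outcomes `[0, 1, 0, 0, 1, 1]`, the three middle points are pooled to a common value of 1/3, so the calibrated curve never decreases as the raw score rises.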

Hurricane rapid intensification, RI (hp-hurricane-ri-v8.1)

Architecture: 50-bag logistic + histogram GBT (d3+d4) + Platt calibration
Features: 65
Key inputs: Shear, SST (real NOAA), OHC, outflow, SHIPS predictors, satellite estimates
Training cutoff: 2026-03-13
Calibration method: Platt scaling on holdout
AUC: 0.938 (advisory-time eval, 5 holdouts)
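Platt scaling fits a sigmoid that maps raw model scores to probabilities using held-out scores and labels. The sketch below does this with plain gradient descent on log loss. The function names, learning rate, and step count are assumptions, and the target smoothing from Platt's original formulation is omitted for brevity.

```python
import math


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def platt_fit(scores, labels, lr=0.5, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    """Map a raw score to a calibrated probability."""
    return _sigmoid(a * score + b)
```

After fitting on a holdout set, `platt_predict(raw_score, a, b)` returns a value strictly between 0 and 1 that increases with the raw score whenever the fitted slope `a` is positive.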

Tornado formation (hp-tornado-meso-v1.4.2)

Architecture: Random forest + Platt-scaled calibration
Features: 34
Key inputs: STP, MLCAPE, MLCIN, deep shear, SRH, LCL height, composite parameters
Training cutoff: 2026-02-25
Calibration method: Platt scaling on holdout
AUC: 0.644 (day-ahead formation)

Tornado storm-object (hp-tornado-coherence-v1)

Architecture: Coherence Field Theory scoring on ProbSevere storm objects
Features: 41
Key inputs: ProbSevere attributes, CAPE, SRH, MaxLLAz, coherence field diagnostics
Training cutoff: 2026-03-15
Calibration method: Platt scaling on 2024 strict temporal holdout
AUC: 0.894 (2024 test, strict temporal)
Status: Research / Accumulating

Evaluation protocol

Every forecast is evaluated against observed outcomes after its valid window closes.

Resolution process

  • Each forecast has a defined valid window (e.g., 30 days for earthquake, 24 hours for hurricane RI and tornado).
  • After the window closes, we check authoritative sources for the observed outcome.
  • Earthquake: USGS reviewed catalog. Hurricane: NHC best track. Tornado: SPC storm reports.
  • The outcome is recorded immutably in the verification ledger.
  • No manual overrides or retroactive adjustments.
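One common way to make a ledger tamper-evident is to chain entries by hash, so that editing any past outcome invalidates every later entry. The sketch below shows that idea under stated assumptions: the class name, field names, and the use of SHA-256 over canonical JSON are illustrative and do not describe the platform's actual ledger format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class VerificationLedger:
    """Hypothetical append-only ledger of resolved forecast outcomes."""

    def __init__(self):
        self._entries = []

    def append(self, forecast_id: str, outcome: bool, source: str) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = {"forecast_id": forecast_id, "outcome": outcome,
                "source": source, "prev_hash": prev_hash}
        entry = dict(body, hash=_digest(body))
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the previous one."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

If a recorded outcome is changed after the fact, `verify()` returns `False`, which is what makes a retroactive adjustment detectable.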

Scoring metrics

  • Brier score - Mean squared probability error. Lower is better. 0 = perfect.
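  • Brier skill score - Relative improvement over a climatological baseline: BSS = 1 - BS / BS_clim. Positive values beat climatology. 1 = perfect.
  • AUC - Discrimination ability (area under the ROC curve). 0.5 = no discrimination. 1 = perfect.
  • Log score - Proper scoring rule based on the logarithm of the probability assigned to the observed outcome. Penalizes confident misses heavily.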
  • Calibration - Agreement between predicted and observed rates by bin.
  • Sharpness - How decisive forecasts are (spread of probabilities).
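The metrics listed above can be computed in a few lines each. The sketch below uses plain Python and makes some labeled assumptions: the climatological baseline is taken as the sample base rate, the log score is reported as a mean negative log-likelihood (lower is better), and sharpness is measured as the standard deviation of the forecast probabilities. Function names are illustrative.

```python
import math
from statistics import pstdev


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """1 - BS / BS_clim, with climatology assumed to be the sample base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    bs_clim = brier_score([base_rate] * len(outcomes), outcomes)
    if bs_clim == 0:
        raise ValueError("baseline Brier score is zero (all outcomes identical)")
    return 1.0 - brier_score(probs, outcomes) / bs_clim


def log_score(probs, outcomes, eps=1e-12):
    """Mean negative log of the probability assigned to what actually happened."""
    total = 0.0
    for p, o in zip(probs, outcomes):
        p_observed = p if o == 1 else 1.0 - p
        total -= math.log(max(p_observed, eps))  # clip to avoid log(0)
    return total / len(probs)


def auc(probs, outcomes):
    """Probability that a random event outranks a random non-event (ties count 0.5)."""
    pos = [p for p, o in zip(probs, outcomes) if o == 1]
    neg = [p for p, o in zip(probs, outcomes) if o == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def sharpness(probs):
    """Spread of the forecast probabilities (population standard deviation)."""
    return pstdev(probs)
```

For forecasts `[0.9, 0.8, 0.3, 0.1]` against outcomes `[1, 0, 1, 0]`, the Brier score is 0.2875, the skill score against the 0.5 base rate is -0.15, and the AUC is 0.75.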

Caveats and boundaries

  • These are not official forecasts. HazardPulse provides experimental research outputs. Always follow guidance from USGS, NHC, NWS, and SPC.
  • Models have known limitations. Earthquake probability in low-seismicity regions has high uncertainty. Hurricane RI skill degrades for weak/disorganized systems. Tornado forecasting at long lead times (>12h) has low sharpness.
  • Probabilities are not certainties. A 40% probability means a 60% chance that the event does not happen. Interpret accordingly.
  • Calibration is approximate. Reliability diagrams show systematic tendencies but are computed on finite samples. Small bins have high sampling noise.
  • Data freshness matters. If an input source is stale (exceeds threshold), gate G1 triggers degrade, widening uncertainty bands and displaying a warning.
  • Models are retrained periodically. A version change can cause an abrupt shift in probability that reflects the model update, not a real change in hazard.
  • Geographic scope is limited. Earthquake coverage focuses on well-instrumented seismic networks. Hurricane RI applies to Atlantic and East Pacific named storms. Tornado covers CONUS only.
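The calibration caveat above, that small bins carry high sampling noise, can be made concrete by attaching a standard error to each reliability bin. The sketch below uses equal-width bins and the simple binomial standard error; both choices, and the function name, are assumptions for illustration.

```python
import math


def reliability_table(probs, outcomes, n_bins=10):
    """Bin forecasts by probability and report the observed rate with its standard error."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, o))
    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        n = len(members)
        mean_forecast = sum(p for p, _ in members) / n
        observed_rate = sum(o for _, o in members) / n
        # Binomial standard error; it shrinks as 1/sqrt(n), so small bins are noisy.
        # Note it is 0 when every outcome in the bin agrees, which understates
        # the uncertainty of very small bins.
        std_error = math.sqrt(observed_rate * (1.0 - observed_rate) / n)
        rows.append({"n": n, "mean_forecast": mean_forecast,
                     "observed_rate": observed_rate, "std_error": std_error})
    return rows
```

A bin with 4 forecasts and an observed rate of 0.75 has a standard error of about 0.22, so an apparent gap between forecast and observed rate in that bin may be sampling noise rather than miscalibration.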

Governance and gates

Every forecast runs through 13 gates (G0 to G12) before it reaches you, and none can be skipped. A forecast is published only if no gate blocks it.

#    Gate                      What it checks
G0   Schema Validity           All fields present and correctly typed
G1   Source Freshness          Input data within acceptable latency
G2   Model Lineage Pinned      Model version matches registry
G3   Provenance Complete       All hashes present and chain intact
G4   Calibration Floor         Brier score below threshold
G5   Spatiotemporal Sanity     Scope within valid bounds
G6   Alert Harm Guard          No alarmist or misleading language
G7   Explanation Minimum       Why-changed panel has sufficient factors
G8   Replayability             Replay artifact emitted and hash-verified
G9   Public Projection Policy  Disclaimer attached to output
G10  Security Policy           HTTP headers and CSP valid
G11  Performance Budgets       Payload within size limits
G12  Trust Surface Sync        UI chips match actual gate state

Gate outcomes: pass = publish in full. degrade = publish with a warning banner and widened uncertainty. block = do not publish.
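The three outcomes can be combined into a single publication decision by taking the most severe result across all gates. The sketch below assumes that "worst outcome wins" rule and a hypothetical widening factor of 1.5 for degraded forecasts; neither the aggregation rule nor the factor is specified by this page.

```python
SEVERITY = {"pass": 0, "degrade": 1, "block": 2}


def overall_status(gate_results: dict) -> str:
    """Return the most severe outcome among all gate results."""
    if not gate_results:
        raise ValueError("no gate results supplied")
    return max(gate_results.values(), key=SEVERITY.__getitem__)


def publish_decision(probability: float, half_width: float,
                     gate_results: dict, widen_factor: float = 1.5):
    """Return the payload to publish, or None if any gate blocks.

    widen_factor is a hypothetical multiplier applied to the band on degrade.
    """
    status = overall_status(gate_results)
    if status == "block":
        return None  # forecast is not published
    if status == "degrade":
        half_width *= widen_factor  # widened uncertainty
    low = max(0.0, probability - half_width)
    high = min(1.0, probability + half_width)
    return {"probability": probability, "band": (low, high),
            "warning": status == "degrade"}
```

With all gates passing, a probability of 0.20 with half-width 0.05 is published with a band of roughly 0.15 to 0.25. If G1 returns degrade, the band widens to roughly 0.125 to 0.275 and the warning flag is set. A single block result suppresses publication.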

What this is

  • Experimental probabilistic research output
  • A tool for understanding risk, not a replacement for official warnings
  • Transparent about uncertainty and limitations
  • Fully auditable through public evidence chain

What this is not

  • Not an official forecast or warning
  • Not a substitute for USGS, NHC, NWS, or SPC guidance
  • Not guaranteed to be accurate for any individual event
  • Not advice for evacuation or emergency decisions