Pulse · Methodology
How we measure trend lead time
EarlyForge collects signals from 90+ sources across news, social, scientific, financial and institutional channels. Every 15 minutes we score each emerging cluster on a composite of velocity, diversity, recency, cohesion, and SEO opportunity. The global_score exposed in Pulse is the same one powering our internal alerting.
Scoring stack
- Velocity (35%) — adaptive EWMA + multi-window z-score (15 min / 1 h / 6 h) with optional global GDELT baseline calibration.
- Diversity (20%) — Shannon entropy across source platforms, weighted by topic-relative credibility (institutional sources start at base 65 vs 50 for social).
- Recency (15%) — exponential decay with platform-specific half-life.
- Cohesion (15%) — semantic cluster quality from BGE-M3 multilingual embeddings.
- SEO (15%) — keyword opportunity scoring.
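The weighted blend above can be sketched as a simple linear composite. This is an illustrative sketch only: the component names, the 0–100 scale, and the final clamp are assumptions, not the production implementation of global_score.

```python
# Hypothetical sketch of the composite score. Weights come from the
# scoring-stack list above; everything else (names, scale, clamping)
# is assumed for illustration.
WEIGHTS = {
    "velocity": 0.35,
    "diversity": 0.20,
    "recency": 0.15,
    "cohesion": 0.15,
    "seo": 0.15,
}

def global_score(components: dict[str, float]) -> float:
    """Blend per-component scores (assumed 0-100 each) into one composite."""
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, score))  # clamp to the 0-100 range

# Example: a cluster strong on velocity but weak on SEO opportunity.
print(global_score({
    "velocity": 80, "diversity": 60, "recency": 70, "cohesion": 65, "seo": 30,
}))
```

Because the weights sum to 1.0, a cluster scoring 100 on every component yields a composite of exactly 100.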
Lead-time benchmark
We replay an annotated dataset of viral trends (Q1 2026 + April 2026) against our historical raw signal stream. For each trend we find the earliest 15-minute snapshot in which the EarlyForge score first crosses the detection threshold (default S2: pre_viral_score > 50) and compare it to the trend's first appearance in Google Trends Daily Trends.
The reported lead time is T_google_trends − T_earlyforge. Positive numbers mean we detected the trend before Google Trends.
Reproducibility
The benchmark runner, dataset schema and validator live in our repository under services/trend-detector/evaluation. We publish the aggregated metrics (mean, median, p25, p75, F1) on this page once a week.
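The weekly aggregation over per-trend lead times could look like the sketch below. The quantile convention and sample values are illustrative; F1 depends on the validator's true/false-positive counts, which are not reproduced here.

```python
import statistics

def aggregate(lead_times_min: list[float]) -> dict[str, float]:
    """Summarize per-trend lead times (minutes) into the published metrics.

    Uses the stdlib's default exclusive quantile method; the benchmark
    runner may use a different convention.
    """
    qs = statistics.quantiles(lead_times_min, n=4)  # [p25, p50, p75]
    return {
        "mean": statistics.fmean(lead_times_min),
        "median": statistics.median(lead_times_min),
        "p25": qs[0],
        "p75": qs[2],
    }

# Illustrative lead times for five replayed trends.
print(aggregate([120.0, 240.0, 30.0, 480.0, 90.0]))
```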