Historical Extreme Events & Clustering — Black Swans Rarely Come Alone

4-section structure: Concept / How We Compute / How to Read / Caveats.

1. Concept

VaR and CVaR summarize tail risk statistically, but what users really want to know is: "Which days exactly were the worst, and why?"

That is the goal of "historical extreme event flagging": explicitly list the 10 worst single-day returns over the past 252 trading days, each with:

Same-day benchmark return — systemic crash or idiosyncratic event?
Excess return = stock − benchmark — quantify "how much worse than peers"
Clustering analysis — events uniformly distributed or concentrated?

Why Clustering Matters

Textbook finance models typically assume i.i.d. (independent and identically distributed) returns. Reality contradicts this:

Volatility Clustering: big moves tend to follow big moves.

2020/3 COVID: five days of −5% or worse in one month
2022/10 rate panic: 3 crashes in two weeks
2008/9 Lehman: turbulent for weeks

This is no coincidence: volatility clustering is a well-documented statistical fact. This indicator quantifies it.

2. How We Compute

2.1 Event Selection

1. Take the simple daily returns for the past 252 trading days
2. Compute the 1st-percentile return and use it as the threshold
3. If fewer than 10 days fall below the threshold, take the 10 worst days anyway (guarantees a sample)
4. Sort the selected events by date, ascending

Default shows 10 events.
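
The four steps above can be sketched in plain Python. This is a minimal illustration, not the production code: the function name, its parameters, and the nearest-rank percentile rule are all assumptions, and other percentile definitions (for example linear interpolation) would give a slightly different cut-off.

```python
import math

def select_extreme_events(dates, returns, pct=0.01, min_events=10):
    """Return (threshold, events); events is a list of (date, return) sorted by date.

    `dates` and `returns` are parallel lists covering the lookback window.
    All names here are hypothetical and used only for illustration.
    """
    if len(dates) != len(returns):
        raise ValueError("dates and returns must have the same length")
    if not returns:
        raise ValueError("need at least one observation")
    ranked = sorted(zip(returns, dates))            # worst (most negative) first
    # Step 2: nearest-rank percentile (an assumption, see the note above).
    k = max(1, math.ceil(pct * len(ranked)))
    pct_threshold = ranked[k - 1][0]
    selected = [(d, r) for r, d in ranked if r <= pct_threshold]
    if len(selected) < min_events:
        # Step 3: fall back to the worst `min_events` days.
        selected = [(d, r) for r, d in ranked[:min_events]]
    threshold = max(r for _, r in selected)         # least severe return that qualified
    events = sorted(selected)                       # Step 4: date ascending
    return threshold, events
```

With 252 observations, 1% corresponds to only about 2.5 days, so under this sketch the fallback in step 3 is what normally fills the list up to 10 events.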

2.2 Benchmark Return

Taken from the market_ret column of the _calc_risk_series df:

TW stocks → ^TWII
US stocks → ^GSPC

2.3 Excess Return

excess = stock_return − benchmark_return

Excess < −2% → stock dropped much more than market → idiosyncratic event
Excess ≈ 0% → systemic event
Excess > 0% → outperformed on a bad day
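
As a rough sketch of the rule above, the snippet below computes the excess return and attaches a label. The function name and labels are hypothetical. The text does not say how to treat an excess between −2% and 0%, so this sketch assumes that whole band counts as "≈ 0%" (systemic).

```python
def classify_event(stock_ret, bench_ret, idio_cutoff=-0.02):
    """Return (excess, label) for one event day. Returns are decimals (-0.05 = -5%)."""
    excess = stock_ret - bench_ret
    if excess < idio_cutoff:
        label = "idiosyncratic"      # stock fell much more than the market
    elif excess > 0:
        label = "outperformed"       # stock held up better than the market
    else:
        label = "systemic"           # assumption: -2%..0% is treated as "about zero"
    return excess, label


# A -6.2% day against a -1.1% benchmark is labelled idiosyncratic.
excess, label = classify_event(-0.062, -0.011)
print(f"{excess:.2%} {label}")
```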

2.4 Clustering Analysis

mean_gap_days = average spacing between consecutive events
hottest_cluster = max events in any 30-day rolling window

Under i.i.d., 10 events would be spread roughly evenly across 252 days, about 25 days apart on average (252 / 10). An observed mean gap below 15 days indicates clear clustering.
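
Both metrics can be computed from the event dates alone. The sketch below is illustrative only: the function name is hypothetical, and it assumes gaps and the 30-day window are measured in calendar days, which the text does not specify (trading-day counts would need a trading calendar instead).

```python
from datetime import date

def clustering_stats(event_dates, window_days=30):
    """Return (mean_gap_days, hottest_cluster) for a list of datetime.date objects."""
    ds = sorted(event_dates)
    if len(ds) < 2:
        return None, len(ds)         # no gap can be measured with fewer than 2 events
    gaps = [(b - a).days for a, b in zip(ds, ds[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Sliding window over the sorted dates: count events whose dates
    # lie less than `window_days` apart from the window's first event.
    hottest, start = 0, 0
    for end in range(len(ds)):
        while (ds[end] - ds[start]).days >= window_days:
            start += 1
        hottest = max(hottest, end - start + 1)
    return mean_gap, hottest


events = [date(2024, 3, 11), date(2024, 3, 13), date(2024, 3, 18), date(2024, 10, 9)]
mean_gap, hottest = clustering_stats(events)
```

Note that the mean gap is dominated by long quiet stretches, which is why the hottest-cluster count is reported alongside it.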

3. How to Read

3.1 Event List Interpretation

Date          Stock%    Bench%    Excess%
2024-03-11   -6.20%   -1.10%   -5.10%   ← Idiosyncratic (large neg excess)
2024-03-13   -4.80%   -0.40%   -4.40%
2024-03-18   -3.90%   -0.20%   -3.70%
2024-10-09   -5.50%   -4.80%   -0.70%   ← Systemic (market fell similarly)

Insights:

3/11 – 3/18 all excess < −3% → stock-specific cascade (for example a lost customer, an inventory correction, or an earnings miss)
10/9 excess −0.7% → market-wide event, not stock-specific

This is far more actionable than a bare "−5% day".

3.2 Clustering Reading

Mean Gap	Interpretation
> 50 days	🟢 Sparse, near-i.i.d.
20–50 days	🟡 Typical stock
< 20 days	🔴 Significant clustering, severe vol regime

Hottest cluster: highlighted at the top. Four or more events within 30 days indicate a structural crisis period rather than isolated black swans.
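
The reading rules above can be expressed as a small helper. This is only a sketch with hypothetical names; it assumes that a mean gap of exactly 20 or 50 days falls into the middle band, since the table leaves the boundaries open.

```python
def read_clustering(mean_gap_days, hottest_cluster):
    """Map the two clustering metrics to (gap_label, crisis_period)."""
    if mean_gap_days > 50:
        gap_label = "sparse"         # green: near-i.i.d.
    elif mean_gap_days >= 20:
        gap_label = "typical"        # yellow: typical stock
    else:
        gap_label = "clustered"      # red: significant clustering
    crisis_period = hottest_cluster >= 4   # 4+ events inside one 30-day window
    return gap_label, crisis_period
```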

3.3 Pairs Well With

VaR/CVaR: they give statistical tail; this gives concrete days
Jarque-Bera: tells you fat-tail severity; this shows what fat-tail looks like in practice
Max DD: MDD shows the single deepest drawdown; this lists all the deep single-day drops

4. Caveats

⚠️ Dynamic Threshold

Threshold = max(1st-percentile return, 10th-worst return), i.e. the less negative of the two. During quiet periods (e.g., the 2021 slow bull), the worst 10 days may not be very extreme. During crash periods, the threshold is genuinely severe.

The UI shows the threshold (X%) at the top for clarity.

⚠️ Daily Resolution Misses Intraday

Returns are computed from daily closes. A stock that fell 8% intraday and recovered to close down 1% is logged as −1%.

Mitigation: cross-check with P1C.1 CVaR (captures borderline days) and news.

⚠️ Excess Return Not Beta-Adjusted

Excess is a direct subtraction (stock − benchmark) with no beta adjustment.

A stock with beta = 1.5 should "normally" drop 4.5% when the market drops 3%
The raw excess may therefore overstate idiosyncratic risk

The academic approach is to use the CAPM residual. We chose plain subtraction because it is intuitive; users can adjust mentally using the Beta card above.
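
For readers who want to see the difference, here is a rough sketch of a beta-adjusted excess. It is not what the indicator computes. The function names are hypothetical, and the residual is simplified to stock − beta × benchmark, ignoring the alpha and risk-free terms of a full CAPM regression.

```python
def estimate_beta(stock_rets, bench_rets):
    """Beta as cov(stock, bench) / var(bench) over paired daily returns."""
    n = len(stock_rets)
    if n != len(bench_rets) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_s = sum(stock_rets) / n
    mean_b = sum(bench_rets) / n
    cov = sum((s - mean_s) * (b - mean_b) for s, b in zip(stock_rets, bench_rets))
    var = sum((b - mean_b) ** 2 for b in bench_rets)
    if var == 0:
        raise ValueError("benchmark returns have zero variance")
    return cov / var

def beta_adjusted_excess(stock_ret, bench_ret, beta):
    """Simplified residual: what the stock did beyond beta times the market move."""
    return stock_ret - beta * bench_ret


# Beta 1.5, market -3%, stock -6%: raw excess is -3%, beta-adjusted is -1.5%.
raw = -0.06 - (-0.03)
adjusted = beta_adjusted_excess(-0.06, -0.03, 1.5)
```

In this example half of the raw excess is explained by the stock's higher beta, which is the overstatement the caveat describes.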

⚠️ Clustering ≠ Full Time-Series Diagnostic

"4 events in 30 days" shows clustering, but:

It doesn't tell you why
It doesn't predict when the next cluster will occur (that needs GARCH, planned for Phase 4)
It is purely retrospective

⚠️ News Integration Pending

The original plan included same-day news. This version delivers the core data; news integration is scheduled for Phase 2 (it requires news_d_tw_h integration plus real-time event extraction).

⚠️ Fixed 252-day Window

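Events outside the 252-day window are forgotten. Take the 2020/3 COVID crash as an example:

A 252-day window spans roughly one calendar year, so by 2022/3 the window had long since excluded COVID
For that kind of analysis, use long-horizon stress tests (P2.3, planned)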

Try It

Stock Analysis → Risk: scroll to "Historical Extreme Events"
Watch the excess-return column: red means stock-specific drops
Cross-reference the "hottest cluster" with market events you remember
Switch stocks: steady large-caps and speculative names show very different clustering
Click 📐 for the event selection strategy and clustering formulas