Historical Extreme Events & Clustering — Black Swans Rarely Come Alone

Lists the 10 worst single-day returns alongside the same-day benchmark return, the excess return, and a clustering analysis. A large negative excess points to stock-specific risk; tightly spaced events reflect volatility clustering.

4-section structure: Concept / How We Compute / How to Read / Caveats.

1. Concept

VaR/CVaR give you statistical tail risk, but users really want to know: "Which days exactly were the worst, and why?"

That's the goal of "historical extreme event flagging": explicitly list the 10 worst single days of the past 252 trading days, with:

  1. Same-day benchmark return: systemic crash or idiosyncratic event?
  2. Excess return = stock − benchmark: quantifies "how much worse than the market"
  3. Clustering analysis: are events spread out evenly or concentrated?

Why Clustering Matters

Many standard finance models assume i.i.d. (independent and identically distributed) returns. Reality contradicts this:

Volatility Clustering: big moves tend to follow big moves.

  • 2020/3 COVID: 5 × −5%+ days in one month
  • 2022/10 rate panic: 3 crashes in two weeks
  • 2008/9 Lehman: turbulent for weeks

This is no coincidence; volatility clustering is a well-documented statistical fact. This indicator quantifies it.


2. How We Compute

2.1 Event Selection

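1. Take the simple daily returns for the past 252 trading days
2. Compute the 1st-percentile return as the threshold
3. If fewer than 10 days fall below the threshold, take the 10 worst days anyway (guarantees a sample). With 252 observations the 1st percentile covers only about 2 to 3 days, so this fallback normally applies
4. Sort the selected events by date, ascending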

Default shows 10 events.
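
The steps above can be sketched in plain Python. This is a minimal illustration, not the site's implementation: the function name select_extreme_events, the (day, return) tuple layout, and the nearest-rank percentile convention are all assumptions made for the example.

```python
import math

def select_extreme_events(returns, min_events=10, pct=0.01):
    """Pick the worst days from (day, simple_return) pairs.

    Returns (events sorted by day ascending, effective threshold).
    """
    if not returns:
        return [], None
    ordered = sorted(returns, key=lambda r: r[1])  # most negative first
    # Nearest-rank percentile: an assumed convention for "1st percentile".
    k = max(1, math.ceil(pct * len(ordered)))
    pct_threshold = ordered[k - 1][1]
    events = [r for r in ordered if r[1] <= pct_threshold]
    if len(events) < min_events:
        events = ordered[:min_events]  # fallback: take the worst N anyway
    threshold = max(r[1] for r in events)  # loosest return still selected
    return sorted(events, key=lambda r: r[0]), threshold

# Synthetic window: 252 distinct returns between -6.3% and +6.25%.
window = [(day, ((day * 37) % 252 - 126) / 2000) for day in range(252)]
events, threshold = select_extreme_events(window)
```

In this synthetic window only 3 days sit at or below the 1st percentile, so the fallback kicks in and the effective threshold becomes the 10th-worst return.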

2.2 Benchmark Return

Read from market_ret in the _calc_risk_series df:

  • TW stocks → ^TWII
  • US stocks → ^GSPC

2.3 Excess Return

excess = stock_return − benchmark_return
  • Excess < −2% → stock dropped much more than market → idiosyncratic event
  • Excess ≈ 0% → systemic event
  • Excess > 0% → outperformed on a bad day
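
As a rough sketch, the three readings can be turned into a labeling function. The source does not say how close to zero counts as "≈ 0%", so the 1% tolerance below is an assumption, as are the function name classify_event and the "mixed" label for the unspecified band between the cutoffs.

```python
def classify_event(stock_ret, bench_ret, idio_cutoff=-0.02, zero_band=0.01):
    """Return (excess, label) for one event day.

    idio_cutoff mirrors the -2% rule; zero_band is an assumed tolerance
    for "excess is approximately zero".
    """
    excess = stock_ret - bench_ret
    if excess < idio_cutoff:
        label = "idiosyncratic"   # stock fell much more than the market
    elif abs(excess) <= zero_band:
        label = "systemic"        # stock moved roughly with the market
    elif excess > zero_band:
        label = "outperformed"    # stock held up better on a bad day
    else:
        label = "mixed"           # between the two cutoffs: unspecified
    return excess, label

excess, label = classify_event(-0.062, -0.011)  # stock -6.2%, benchmark -1.1%
```

Here the excess is about −5.1%, so the day is labeled idiosyncratic.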

2.4 Clustering Analysis

mean_gap_days = average spacing between consecutive events
hottest_cluster = max events in any 30-day rolling window

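Under i.i.d. returns, events have no reason to bunch together. As a rough baseline, 10 events spread evenly across 252 days would sit about 25 days apart (252 / 10). An observed mean gap below 15 days indicates clear clustering.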
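
Both statistics are simple to compute from the event dates. The sketch below assumes gaps and the window are measured in calendar days on datetime.date objects; the source does not say whether calendar or trading days are used, and the function name clustering_stats is hypothetical.

```python
from datetime import date

def clustering_stats(event_dates, window_days=30):
    """Return (mean_gap_days, hottest_cluster) for a list of event dates."""
    dates = sorted(event_dates)
    if len(dates) < 2:
        return None, len(dates)
    # Average spacing between consecutive events.
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Sliding window: most events whose dates span fewer than window_days.
    hottest, start = 0, 0
    for end in range(len(dates)):
        while (dates[end] - dates[start]).days >= window_days:
            start += 1
        hottest = max(hottest, end - start + 1)
    return mean_gap, hottest

sample = [date(2024, 3, 11), date(2024, 3, 13), date(2024, 3, 18), date(2024, 10, 9)]
mean_gap, hottest = clustering_stats(sample)
```

With these four dates the gaps are 2, 5, and 205 days, and the three March events form the hottest cluster.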


3. How to Read

3.1 Event List Interpretation

Date          Stock%    Bench%    Excess%
2024-03-11   -6.20%   -1.10%   -5.10%   ← Idiosyncratic (large neg excess)
2024-03-13   -4.80%   -0.40%   -4.40%
2024-03-18   -3.90%   -0.20%   -3.70%
2024-10-09   -5.50%   -4.80%   -0.70%   ← Systemic (market fell similarly)

Insights:

  • 3/11 – 3/18 all excess < −3% → stock-specific cascade (likely customer cut, inventory correction, earnings miss)
  • 10/9 excess −0.7% → market-wide event, not stock-specific

Far more actionable than just "−5% day".

3.2 Clustering Reading

Mean Gap       Interpretation
> 50 days      🟢 Sparse, near-i.i.d.
20–50 days     🟡 Typical stock
< 20 days      🔴 Significant clustering, severe vol regime

Hottest cluster: highlighted at the top. 4 or more events within 30 days indicate a structural crisis period rather than isolated black swans.
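
The reading rules in this section can be expressed as a small lookup. This is only an illustration: the function name read_clustering and the label strings are hypothetical, and treating exactly 20 and exactly 50 days as "typical" is an assumption about the boundaries.

```python
def read_clustering(mean_gap_days, hottest_cluster):
    """Map the two clustering statistics to (gap_label, crisis_flag)."""
    if mean_gap_days > 50:
        gap_label = "sparse"      # green: near-i.i.d.
    elif mean_gap_days >= 20:
        gap_label = "typical"     # yellow: typical stock
    else:
        gap_label = "clustered"   # red: significant clustering
    crisis = hottest_cluster >= 4  # 4+ events in 30 days: crisis period
    return gap_label, crisis

reading = read_clustering(mean_gap_days=8, hottest_cluster=5)
```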

3.3 Pairs Well With

  • VaR/CVaR: they give the statistical tail; this gives the concrete days
  • Jarque-Bera: tells you how far returns depart from normality (fat tails); this shows what those fat tails look like in practice
  • Max DD: MDD is the single deepest peak-to-trough decline; this lists the worst individual days

4. Caveats

⚠️ Dynamic Threshold

The effective threshold is the looser of the 1st-percentile return and the 10th-worst return. During quiet periods (e.g., the 2021 slow bull), the worst 10 days may not be that extreme; during crash periods, the threshold is genuinely severe.

The UI shows the threshold (X%) at the top for clarity.

⚠️ Daily Resolution Misses Intraday

Returns are computed from daily closes. A stock that fell 8% intraday and recovered to close at −1% is logged as only −1%.

Mitigation: cross-check with P1C.1 CVaR (captures borderline days) and news.

⚠️ Excess Return Not Beta-Adjusted

Excess return is a direct subtraction, stock − benchmark, with no Beta adjustment.

  • A Beta = 1.5 stock should "normally" drop 4.5% when the market drops 3%
  • Raw excess may therefore overstate idiosyncratic risk for high-Beta stocks

The academic approach is to use the CAPM residual. We chose intuitive subtraction; users can mentally adjust using the Beta card above.
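
For readers who want to do that adjustment numerically, here is a simplified sketch. It estimates Beta as the OLS slope of stock returns on benchmark returns and subtracts Beta × benchmark; it ignores alpha and the risk-free rate, so it is a simplification of a full CAPM residual. Both function names are hypothetical and not part of the product.

```python
def estimate_beta(stock_rets, bench_rets):
    """OLS slope of stock returns on benchmark returns (cov / var)."""
    n = len(bench_rets)
    mean_s = sum(stock_rets) / n
    mean_b = sum(bench_rets) / n
    cov = sum((s - mean_s) * (b - mean_b)
              for s, b in zip(stock_rets, bench_rets))
    var = sum((b - mean_b) ** 2 for b in bench_rets)
    return cov / var

def beta_adjusted_excess(stock_ret, bench_ret, beta):
    """Simplified residual: stock return minus beta times benchmark return."""
    return stock_ret - beta * bench_ret

# The example from the text: Beta 1.5, market -3%, stock -4.5%.
raw = -0.045 - (-0.03)                                    # raw excess
adjusted = beta_adjusted_excess(-0.045, -0.03, beta=1.5)  # Beta-adjusted
```

The raw excess is about −1.5%, while the Beta-adjusted figure is essentially zero: the drop is fully explained by the stock's market sensitivity.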

⚠️ Clustering ≠ Full Time-Series Diagnostic

"4 events in 30 days" shows clustering, but:

  • Doesn't tell you why
  • Doesn't predict when the next cluster will occur (needs GARCH; Phase 4)
  • Only retrospective

⚠️ News Integration Pending

The original plan included same-day news. This version delivers the core data; news integration is slated for Phase 2 (it requires news_d_tw_h integration and real-time event extraction).

⚠️ Fixed 252-day Window

Events outside the 252-day window are forgotten. For COVID 2020/3 impact analysis:

  • A 252-trading-day window spans roughly one calendar year, so by 2022/3 it had long since dropped the COVID crash
  • Use long-horizon stress tests (P2.3, planned) instead

Further Reading

  • VaR vs CVaR
  • Q-Q Plot & Jarque-Bera
  • Max Drawdown, Ulcer, Calmar
  • GARCH Volatility Model (Phase 4)

Try It

  • Stock Analysis → Risk: scroll to "Historical Extreme Events"
  • Watch excess-return column — red means stock-specific drops
  • Cross-reference "hottest cluster" with market events you remember
  • Switch stocks: steady large-caps vs speculative names show very different clustering
  • Click 📐 for event selection strategy and clustering formulas
