Hidden-Markov regime segmentation of crypto microstructure

A Gaussian HMM on (logret, vol, trend) features, with the number of states K selected by BIC — yielding persistent, interpretable regimes for risk and sizing.

The mathematics

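A hidden Markov model factors a length-T observation sequence x1:T through a latent Markov chain z1:T with K states. The joint density is

$$p(x_{1:T}, z_{1:T} \mid \theta) = \pi_{z_1} \prod_{t=2}^{T} A_{z_{t-1} z_t} \prod_{t=1}^{T} p(x_t \mid z_t)$$

For continuous features we let each emission be Gaussian: p(xt | zt = k) = N(xt; μk, Σk). The parameters θ = (π, A, {μk, Σk}) are estimated by Baum–Welch (EM applied to HMMs). The E-step is the forward-backward recursion

$$\alpha_t(k) = p(x_t \mid z_t = k) \sum_{j=1}^{K} \alpha_{t-1}(j)\, A_{jk}, \qquad \beta_t(k) = \sum_{j=1}^{K} A_{kj}\, p(x_{t+1} \mid z_{t+1} = j)\, \beta_{t+1}(j)$$

with boundary conditions α1(k) = πk p(x1 | z1 = k) and βT(k) = 1, where Ajk = P(zt = k | zt−1 = j).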

The smoothed posterior γt(k) ∝ αt(k) βt(k) is what we ship to downstream sizing rules. It is a probability — not a hard label — and its row-entropy is a natural regime-confidence score.
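
The recursion and the resulting posterior can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the reference implementation: it uses only the standard library, takes precomputed emission likelihoods B[t][k] = p(xt | zt = k) as input, and rescales every forward step to avoid underflow on long sequences. The names forward_backward and row_entropy are hypothetical.

```python
import math

def forward_backward(pi, A, B):
    """Scaled forward-backward pass.

    pi: length-K initial distribution.
    A:  K x K transition matrix, A[j][k] = P(z_t = k | z_{t-1} = j).
    B:  T x K emission likelihoods, B[t][k] = p(x_t | z_t = k).
    Returns (gamma, loglik), gamma[t][k] = P(z_t = k | x_{1:T}).
    """
    T, K = len(B), len(pi)
    alpha = [[0.0] * K for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for k in range(K):
            prior = pi[k] if t == 0 else sum(alpha[t - 1][j] * A[j][k] for j in range(K))
            alpha[t][k] = prior * B[t][k]
        scale[t] = sum(alpha[t])                      # p(x_t | x_{1:t-1})
        alpha[t] = [a / scale[t] for a in alpha[t]]   # now sums to 1
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for k in range(K):
            beta[t][k] = sum(A[k][j] * B[t + 1][j] * beta[t + 1][j]
                             for j in range(K)) / scale[t + 1]
    # With this scaling, alpha * beta is already normalised per row.
    gamma = [[alpha[t][k] * beta[t][k] for k in range(K)] for t in range(T)]
    loglik = sum(math.log(c) for c in scale)
    return gamma, loglik

def row_entropy(p):
    """Shannon entropy (nats) of one posterior row; 0 means fully confident."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

# Toy two-state example with made-up likelihoods.
gamma, loglik = forward_backward(
    [0.5, 0.5],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]],
)
for row in gamma:
    print([round(g, 3) for g in row], round(row_entropy(row), 3))
```

The entropy is bounded by log K, so dividing by log K gives a confidence score on a 0 to 1 scale that is comparable across different K.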

Choosing K with BIC

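With features in d dimensions, a K-state Gaussian HMM with full covariances has p = K − 1 + K(K−1) + Kd + Kd(d+1)/2 free parameters (initial distribution, transition rows, means, and covariances, in that order). The Bayesian Information Criterion

$$\mathrm{BIC}(K) = -2 \log \hat{L}_K + p \log n$$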

trades off in-sample fit against complexity at the rate log n. We sweep K = 2, 3, 4,…, pick K* = arg minK BIC(K). On 30-minute LTC/USDT (n = 270,171 bars; d = 3 features) the sweep is unambiguous: BIC drops from 1.633M at K = 2 to 1.366M at K = 3 to 1.251M at K = 4. K = 4 wins on every asset we’ve tried. The same cross-asset BIC chart appears below.
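
The parameter count and the criterion are easy to check by hand. The sketch below is illustrative only (the function names n_free_params and bic are hypothetical) and assumes the maximised log-likelihood is already available from a fit.

```python
import math

def n_free_params(K, d):
    """Free parameters of a K-state Gaussian HMM with full covariances."""
    initial = K - 1                       # pi sums to 1
    transitions = K * (K - 1)             # each row of A sums to 1
    means = K * d
    covariances = K * d * (d + 1) // 2    # symmetric d x d matrix per state
    return initial + transitions + means + covariances

def bic(loglik, K, d, n):
    """BIC = -2 log L + p log n; lower is better."""
    return -2.0 * loglik + n_free_params(K, d) * math.log(n)

n, d = 270_171, 3
for K in (2, 3, 4):
    p = n_free_params(K, d)
    print(f"K={K}: p={p}, penalty={p * math.log(n):.0f}")
```

For d = 3 this gives p = 21, 35, and 51 at K = 2, 3, and 4. At n = 270,171 the penalty term is only a few hundred units, which is small next to the BIC gaps of more than 100,000 quoted above, so the sweep is driven almost entirely by the likelihood.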

Persistence and dwell time

The transition matrix A has self-loop probabilities pkk. The number of bars the chain spends inside regime k before leaving is geometrically distributed, so its expected value is the mean of that geometric distribution:

$$\tau_k = \frac{1}{1 - p_{kk}}$$

Fitted on LTC 30m we measure stay-probabilities (0.987, 0.970, 0.969, 0.981) — i.e. dwell times τ ≈ (76, 34, 32, 52) bars, which on a 30-minute clock means the regimes last on the order of 16–38 hours. They are not artefacts of a flickering segmentation; they are macroscopically stable.
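
The conversion from stay-probability to dwell time is a one-liner. The sketch below applies it to the four rounded stay-probabilities quoted above; expected_dwell is a hypothetical helper name.

```python
def expected_dwell(stay_prob):
    """Mean of the geometric dwell-time distribution, in bars."""
    if not 0.0 <= stay_prob < 1.0:
        raise ValueError("stay probability must lie in [0, 1)")
    return 1.0 / (1.0 - stay_prob)

stay = [0.987, 0.970, 0.969, 0.981]   # LTC 30m, as quoted in the text
bar_hours = 0.5                       # 30-minute bars

for k, p in enumerate(stay, start=1):
    bars = expected_dwell(p)
    print(f"R{k}: {bars:.1f} bars = {bars * bar_hours:.1f} h")
```

Because the inputs are rounded to three decimals, the results land within about a bar of the quoted τ values. The sensitivity is worth noting: near p = 0.98, a change of 0.001 in the stay-probability moves the dwell time by two to three bars.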

Worked example

Per-bar feature vector xt = (logrett, volt, trendt), where logret is the bar log-return, vol is a rolling realised std, and trend is a smoothed slope. Fitting K = 4 on LTC 30m yields four regimes whose standardized means are crisply separated:

  • R1 — drift-down (32% of bars): negative vol z, near-zero trend. The default background regime.
  • R2 — trend-up (24%): mean trend ≈ +0.58 σ. Sustained positive drift, moderate vol.
  • R3 — trend-down (25%): mean trend ≈ −0.60 σ. Mirror of R2.
  • R4 — high-vol (20%): mean vol ≈ +1.48 σ, near-zero trend. The dispersion regime.
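
One way to build the per-bar feature vector is sketched below. The text does not state the window lengths or the exact smoother, so everything specific here is an assumption: a 48-bar window (one day of 30-minute bars), population standard deviation for vol, and an exponential moving average of log-returns as the "smoothed slope". The function names are hypothetical.

```python
import math
from statistics import pstdev

def build_features(close, vol_window=48, trend_span=48):
    """Return causal (logret, vol, trend) rows from a list of closing prices.

    vol   = rolling std of log-returns over the last vol_window bars.
    trend = EMA of log-returns, i.e. a smoothed slope of log-price.
    Rows start once a full volatility window is available.
    """
    logret = [math.log(b / a) for a, b in zip(close, close[1:])]
    alpha = 2.0 / (trend_span + 1.0)
    rows, ema = [], 0.0   # EMA seeded at 0, so early values are biased to 0
    for t, r in enumerate(logret):
        ema = alpha * r + (1.0 - alpha) * ema
        if t + 1 >= vol_window:
            vol = pstdev(logret[t + 1 - vol_window : t + 1])
            rows.append((r, vol, ema))
    return rows

def zscore_columns(rows):
    """Standardise each column; constant columns are left at zero."""
    out_cols = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        s = pstdev(col) or 1.0
        out_cols.append([(v - m) / s for v in col])
    return list(zip(*out_cols))

# Toy price path with a constant drift of 0.1% per bar.
close = [100.0 * math.exp(0.001 * i) for i in range(200)]
X = build_features(close)
print(len(X), X[-1])
```

Full-sample z-scoring as in zscore_columns uses future information, so it suits offline analysis such as reporting standardized regime means; a live pipeline would standardise with statistics frozen at fit time.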

The interactive demo below fits a Gaussian-mixture model live for K ∈ {2, …, 6} on a synthetic price series and plots BIC(K) — the minimum is highlighted. The strip underneath colours the price path by inferred state at the chosen K.

Demo — Gaussian-mixture BIC sweep & regime strip

Synthetic log-returns with three planted regimes. EM-fit a K-component mixture for K ∈ {2,…,6}; pick K by BIC; colour the price by MAP regime.

Controls: K (states) = 3; seed = 7

  • K* by BIC: 2
  • BIC at K*: -3602
  • log L̂ at K: 1820
  • BIC at K: -3590
  • n bars: 500
  • planted K: 3
BIC = −2·log L̂ + p(K)·log n, with p(K) = (K−1) + 2K. Lower is better. The argmin (highlighted amber) is the BIC-preferred mixture.
[Chart: BIC(K), lower is better. K = 2: -3602; K = 3: -3590; K = 4: -3572; K = 5: -3553; K = 6: -3535.]
[Strip: cumulative log-price coloured by MAP regime (K = 3), from t = 0 to t = 500.]

BIC selects K* = 2 on this synthetic series (planted K = 3). A mixture ignores the temporal persistence that an HMM models and separates states on distributional shape alone, so on short samples it can prefer K = 2 or K = 4 even when three regimes are planted. The production HMM uses the full transition matrix and is better calibrated. Reseed to see how much the selection varies.
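
The demo's procedure can be approximated offline with a small EM loop. The sketch below is not the demo's code: it fits a one-dimensional Gaussian mixture with the standard library only, uses p(K) = 3K − 1 as in the formula above, and generates its own synthetic returns from three hypothetical (mean, std) pairs, so its BIC values will differ from the readouts shown.

```python
import math
import random

def _log_norm(x, mu, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

def _e_step(xs, w, mu, var):
    """Return (log-likelihood, responsibilities) under current parameters."""
    loglik, resp = 0.0, []
    for x in xs:
        logs = [math.log(w[k]) + _log_norm(x, mu[k], var[k]) for k in range(len(w))]
        top = max(logs)
        lse = top + math.log(sum(math.exp(v - top) for v in logs))
        loglik += lse
        resp.append([math.exp(v - lse) for v in logs])
    return loglik, resp

def fit_gmm_1d(xs, K, n_iter=50):
    """EM for a 1-D K-component mixture; returns (w, mu, var, loglik)."""
    n = len(xs)
    srt = sorted(xs)
    mean = sum(xs) / n
    var_all = sum((x - mean) ** 2 for x in xs) / n
    floor = max(1e-3 * var_all, 1e-12)   # stops a component collapsing to a point
    mu = [srt[int((k + 0.5) * n / K)] for k in range(K)]   # quantile initialisation
    var = [max(var_all, floor)] * K
    w = [1.0 / K] * K
    for _ in range(n_iter):
        _, resp = _e_step(xs, w, mu, var)
        nk = [sum(r[k] for r in resp) + 1e-10 for k in range(K)]
        w = [v / sum(nk) for v in nk]
        mu = [sum(r[k] * x for r, x in zip(resp, xs)) / nk[k] for k in range(K)]
        var = [max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk[k], floor)
               for k in range(K)]
    loglik, _ = _e_step(xs, w, mu, var)
    return w, mu, var, loglik

def bic_mixture(loglik, K, n):
    """BIC with p(K) = (K - 1) weights + K means + K variances."""
    return -2.0 * loglik + (3 * K - 1) * math.log(n)

# Synthetic log-returns: ten 50-bar blocks cycling through three planted regimes.
rng = random.Random(7)
planted = [(0.000, 0.002), (0.003, 0.004), (-0.002, 0.010)]   # hypothetical (mean, std)
xs = []
for block in range(10):
    m, s = planted[block % 3]
    xs.extend(rng.gauss(m, s) for _ in range(50))

sweep = {K: bic_mixture(fit_gmm_1d(xs, K)[3], K, len(xs)) for K in range(2, 7)}
K_star = min(sweep, key=sweep.get)
print({K: round(v) for K, v in sweep.items()}, "K* =", K_star)
```

Plugging the demo's own readout into bic_mixture reproduces its number: log L̂ = 1820 at K = 3 with n = 500 gives a BIC of about −3590.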

Figures

BIC vs K across the 10 highest-density partitions of the 30-asset corpus
Fig. 1. BIC(K) across ten 30-minute crypto pairs. On every asset the curve is monotone-decreasing through K = 4 and the marginal gain shrinks sharply afterwards; BIC selects K = 4 universally.
LTC/USDT 30m HMM segmentation
Fig. 2. LTC/USDT 30m price coloured by inferred regime under the K = 4 HMM. Long monochromatic stretches reflect the high stay-probabilities (0.97–0.99) and dwell times of 32–76 bars.
BTC/USDT 30m HMM segmentation
Fig. 3. The same K = 4 fit on BTC/USDT 30m. The four-state structure transfers across assets: trend-up, trend-down, drift, and a distinct high-vol regime are recovered without any retuning.
LTC transition matrix
Fig. 4. Estimated transition matrix A on LTC 30m. The diagonal dominance (every p_kk at roughly 0.97 or above) is what makes the regimes useful: state estimates are stable enough to condition risk on.
Synthetic recovery
Fig. 5. Sanity check on synthetic data with three known regimes. EM recovers the planted state sequence with ≥98% per-bar accuracy; BIC correctly identifies K* = 3 here.

Why this matters for systematic strategies

Most strategy evaluations average performance over a single mixed sample and report a single Sharpe. Conditioning on zt partitions that sample by regime and exposes the structure that the average hides: a strategy that is +1.5 Sharpe in trend-up and −0.8 Sharpe in high-vol is not the same object as a strategy that is +0.3 in every regime, even if both have the same blended Sharpe. The HMM gives us the partition we need to make those statements quantitatively, and — because we use the smoothed posterior — to weight bars by regime-membership probability rather than thresholding on a hard label.
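
A posterior-weighted, per-regime Sharpe can be sketched as follows. This is an illustration under assumptions rather than the evaluation code used here: regime_sharpe is a hypothetical name, returns are per-bar strategy returns aligned with the rows of the posterior, and the default annualisation factor of 17,520 assumes 48 thirty-minute bars per day, 365 days a year.

```python
import math

def regime_sharpe(returns, gamma, bars_per_year=17_520):
    """Annualised Sharpe per regime, weighting bar t by gamma[t][k].

    returns: length-T per-bar strategy returns.
    gamma:   T x K regime-membership probabilities.
    """
    K = len(gamma[0])
    out = []
    for k in range(K):
        w = [g[k] for g in gamma]
        total = sum(w)
        if total <= 0.0:
            out.append(float("nan"))      # regime never visited
            continue
        mean = sum(wi * r for wi, r in zip(w, returns)) / total
        var = sum(wi * (r - mean) ** 2 for wi, r in zip(w, returns)) / total
        if var <= 0.0:
            out.append(float("nan"))      # no dispersion, Sharpe undefined
            continue
        out.append(mean / math.sqrt(var) * math.sqrt(bars_per_year))
    return out

# Toy example: two regimes, soft membership on the middle bars.
rets = [0.010, 0.030, 0.005, -0.010, -0.030]
post = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
print(regime_sharpe(rets, post, bars_per_year=1))
```

Annualising each regime as if it persisted all year is a convention that makes the numbers comparable with a blended Sharpe; it is not a forecast of realised performance in that regime.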

The position sizer uses the filtered counterpart of that posterior: scale exposure by P(zt ∈ favourable | x1:t), computed in the causal forward pass, because the smoothed posterior conditions on future bars and is unavailable live. When the chain is confident we lean in; when entropy spikes near a transition we de-risk. The persistence we measure (τ ≈ 16–38 h) is the timescale that makes this kind of conditional sizing economically viable on 30-minute data.
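
The causal version of the posterior needs only the forward half of the recursion. The sketch below shows that filter together with one possible sizing rule; the rule itself (favourable-state probability times an entropy-based confidence) is a hypothetical illustration of the idea described above, not the production sizer, and both function names are assumptions.

```python
import math

def filtered_posteriors(pi, A, B):
    """Causal filter: row t is P(z_t | x_{1:t}), using no future bars.

    pi: initial distribution; A[j][k] = P(z_t = k | z_{t-1} = j);
    B[t][k] = p(x_t | z_t = k).
    """
    K = len(pi)
    out, prev = [], None
    for lik in B:
        if prev is None:
            pred = pi
        else:
            pred = [sum(prev[j] * A[j][k] for j in range(K)) for k in range(K)]
        unnorm = [pred[k] * lik[k] for k in range(K)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        out.append(prev)
    return out

def exposure(filtered, favourable, max_exposure=1.0):
    """Hypothetical rule: P(favourable) scaled by 1 - H / log K."""
    p_fav = sum(filtered[k] for k in favourable)
    entropy = -sum(p * math.log(p) for p in filtered if p > 0.0)
    confidence = max(1.0 - entropy / math.log(len(filtered)), 0.0)
    return max_exposure * p_fav * confidence

A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]]   # made-up likelihoods
for row in filtered_posteriors([0.5, 0.5], A, B):
    print([round(p, 3) for p in row], round(exposure(row, favourable={0}), 3))
```

Under this rule a uniform posterior maps to zero exposure and a fully confident favourable state maps to max_exposure, with a smooth taper in between.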

Reproducibility

DaruFinance / strategy-regime

Python — open source reference implementation

Minimal invocation

import numpy as np
from strategy_regime import fit_hmm, bic_sweep, posterior

# X: T x d feature matrix (logret, vol, trend) per bar
sweep = bic_sweep(X, K_grid=[2, 3, 4, 5, 6])
K_star = min(sweep, key=lambda r: r["bic"])["K"]

model = fit_hmm(X, n_states=K_star, n_iter=200, seed=0)
gamma = posterior(model, X)         # T x K smoothed P(z_t | x_{1:T})
states = gamma.argmax(axis=1)        # MAP regime per bar
dwell  = 1.0 / (1.0 - np.diag(model.transmat_))   # expected dwell per regime

References

  1. Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2), 357–384.
  2. Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–286.
  3. Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6(2), 461–464.