Hidden-Markov regime segmentation of crypto microstructure

A Gaussian HMM on (logret, vol, trend) features, with the number of states K selected by BIC, yielding persistent, interpretable regimes for risk and sizing.

The mathematics

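A hidden Markov model factors a length-T observation sequence x1:T through a latent Markov chain z1:T with K states. The joint density is

$$p(x_{1:T}, z_{1:T}) = \pi_{z_1} \prod_{t=2}^{T} A_{z_{t-1} z_t} \prod_{t=1}^{T} p(x_t \mid z_t).$$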

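For continuous features we let each emission be Gaussian: $p(x_t \mid z_t = k) = \mathcal{N}(x_t;\, \mu_k, \Sigma_k)$. The parameters θ = (π, A, {μk, Σk}) are estimated by Baum–Welch (EM applied to HMMs). The E-step is the forward-backward recursion

$$\alpha_t(k) = p(x_t \mid z_t = k) \sum_{j} \alpha_{t-1}(j)\, A_{jk}, \qquad \beta_t(k) = \sum_{j} A_{kj}\, p(x_{t+1} \mid z_{t+1} = j)\, \beta_{t+1}(j),$$

initialised with α1(k) = πk p(x1 | z1 = k) and βT(k) = 1.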
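The forward-backward pass can be sketched in NumPy as follows. This is a minimal illustration rather than the reference implementation: the function name `forward_backward` and the per-step scaling scheme are assumptions, and the emission likelihoods `B[t, k] = p(x_t | z_t = k)` are taken as already evaluated.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Scaled forward-backward pass for an HMM (illustrative sketch).

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix, A[j, k] = P(z_t = k | z_{t-1} = j)
    B  : (T, K) emission likelihoods, B[t, k] = p(x_t | z_t = k)
    Returns the smoothed posteriors gamma (T, K) and the log-likelihood.
    """
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    c = np.zeros(T)                       # per-step normalisers

    # forward recursion, rescaled at each step to avoid underflow
    alpha[0] = pi * B[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = B[t] * (alpha[t - 1] @ A)
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]

    # backward recursion, using the same normalisers
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[t + 1] * beta[t + 1])) / c[t + 1]

    gamma = alpha * beta                  # proportional to alpha_t(k) * beta_t(k)
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, np.log(c).sum()

# toy usage: two states, three observations
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
gamma, log_lik = forward_backward(pi, A, B)
```

Rescaling each α step keeps the recursion numerically stable on long series, and the sum of the log normalisers gives the log-likelihood that BIC needs.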

The smoothed posterior γt(k) ∝ αt(k) βt(k) is what we ship to downstream sizing rules. It is a probability, not a hard label, and its row-entropy is a natural regime-confidence score.
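One way to turn the row-entropy into a confidence score is sketched below. Normalising by log K so that the score lies in [0, 1] is an assumption made here for illustration, as is the function name `regime_confidence`.

```python
import numpy as np

def regime_confidence(gamma, eps=1e-12):
    """Return 1 - H(gamma_t) / log K per bar: 1 means certain, 0 means uniform."""
    K = gamma.shape[1]
    entropy = -(gamma * np.log(gamma + eps)).sum(axis=1)
    return 1.0 - entropy / np.log(K)

gamma = np.array([
    [0.97, 0.01, 0.01, 0.01],   # chain is confident
    [0.25, 0.25, 0.25, 0.25],   # maximally uncertain
])
conf = regime_confidence(gamma)
```

The first row scores close to 1 and the second close to 0, so the score can be thresholded or used directly as a multiplier.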

Choosing K with BIC

With features in d dimensions, a K-state Gaussian HMM with full covariances has p = (K − 1) + K(K−1) + Kd + Kd(d+1)/2 free parameters (initial distribution, transition rows, means, and covariances respectively). The Bayesian Information Criterion

$$\mathrm{BIC}(K) = -2 \log \hat{L}_K + p(K) \log n$$

trades off in-sample fit against complexity at the rate log n. We sweep K = 2, 3, 4, … and pick K* = arg min_K BIC(K). On 30-minute LTC/USDT (n = 270,171 bars; d = 3 features) the sweep is unambiguous: BIC drops from 1.633M at K = 2 to 1.366M at K = 3 to 1.251M at K = 4. K = 4 wins on every asset we’ve tried. The cross-asset BIC chart appears below (Fig. 1).
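The parameter count and the selection rule are easy to check by hand. The sketch below uses hypothetical helper names (`hmm_n_params`, `bic`) and simply re-uses the BIC values quoted in the text; it does not refit anything.

```python
import math

def hmm_n_params(K, d):
    """Free parameters of a K-state Gaussian HMM with full covariances."""
    initial = K - 1                      # pi sums to one
    transitions = K * (K - 1)            # each row of A sums to one
    means = K * d
    covariances = K * d * (d + 1) // 2   # symmetric d x d matrix per state
    return initial + transitions + means + covariances

def bic(log_lik, n_params, n_obs):
    """BIC = -2 log L + p log n (natural logarithm)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

# parameter counts for d = 3 features
counts = {K: hmm_n_params(K, 3) for K in (2, 3, 4)}

# BIC values quoted in the text for LTC/USDT 30m
reported = {2: 1.633e6, 3: 1.366e6, 4: 1.251e6}
K_star = min(reported, key=reported.get)

# extra complexity penalty paid when moving from K = 3 to K = 4
penalty_step = (counts[4] - counts[3]) * math.log(270_171)
```

For d = 3 the counts are 21, 35 and 51 parameters. The extra penalty for moving from K = 3 to K = 4 is roughly 200 BIC units, which is tiny next to the quoted drop of about 115,000, so the fit term dominates the selection at this sample size.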

Persistence and dwell time

The transition matrix A has self-loop probabilities pkk. The number of bars spent inside regime k before the chain transitions is geometrically distributed, so its expected value is

$$\tau_k = \frac{1}{1 - p_{kk}}.$$

Fitted on LTC 30m we measure stay-probabilities (0.987, 0.970, 0.969, 0.981), i.e. dwell times τ ≈ (76, 34, 32, 52) bars, which on a 30-minute clock means the regimes last on the order of 16–38 hours. They are not artefacts of a flickering segmentation; they are macroscopically stable.
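The dwell-time arithmetic can be reproduced directly from the quoted stay-probabilities. The snippet below is a sketch that also checks the closed form against simulated geometric holding times; the seed and sample size are arbitrary choices.

```python
import numpy as np

# stay-probabilities quoted in the text (rounded to three decimals)
stay = np.array([0.987, 0.970, 0.969, 0.981])

dwell_bars = 1.0 / (1.0 - stay)    # approx. 76.9, 33.3, 32.3, 52.6 bars
dwell_hours = dwell_bars * 0.5     # 30-minute bars

# Monte Carlo check for one regime: holding times are geometric with
# exit probability 1 - p_kk, so their sample mean should approach 1 / (1 - p_kk)
rng = np.random.default_rng(0)
simulated = rng.geometric(1.0 - stay[1], size=200_000).mean()
```

Small differences from the dwell times quoted in the text (76, 34, 32, 52) are expected, since the stay-probabilities here are rounded to three decimals.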

Worked example

Per-bar feature vector xt = (logrett, volt, trendt), where logret is the bar log-return, vol is a rolling realised std, and trend is a smoothed slope. Fitting K = 4 on LTC 30m yields four regimes whose standardized means are crisply separated:

  • R1, drift-down (32% of bars): negative vol z, near-zero trend. The default background regime.
  • R2, trend-up (24%): mean trend ≈ +0.58 σ. Sustained positive drift, moderate vol.
  • R3, trend-down (25%): mean trend ≈ −0.60 σ. Mirror of R2.
  • R4, high-vol (20%): mean vol ≈ +1.48 σ, near-zero trend. The dispersion regime.
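For readers who want to reproduce the feature vector described above, one possible construction is sketched here. The window lengths, the use of an EMA of log-returns as the smoothed slope, and the function name `build_features` are all assumptions; the production definitions may differ.

```python
import numpy as np

def build_features(close, vol_window=48, trend_span=48):
    """Return an (n, 3) array of z-scored (logret, vol, trend) rows.

    Window lengths are illustrative, not the production values.
    """
    close = np.asarray(close, dtype=float)
    logret = np.diff(np.log(close))
    T = logret.size

    # rolling realised std of log-returns over a trailing (causal) window
    vol = np.full(T, np.nan)
    for t in range(vol_window - 1, T):
        vol[t] = logret[t - vol_window + 1 : t + 1].std(ddof=1)

    # EMA of log-returns, used here as a proxy for a smoothed slope
    a = 2.0 / (trend_span + 1.0)
    trend = np.empty(T)
    trend[0] = logret[0]
    for t in range(1, T):
        trend[t] = a * logret[t] + (1.0 - a) * trend[t - 1]

    X = np.column_stack([logret, vol, trend])
    X = X[~np.isnan(X).any(axis=1)]               # drop warm-up rows
    return (X - X.mean(axis=0)) / X.std(axis=0)   # column-wise z-score

# usage on a synthetic random-walk price series
rng = np.random.default_rng(1)
close = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=500)))
X = build_features(close, vol_window=20, trend_span=20)
```

Z-scoring over the full sample is convenient for reading regime means in σ units, but it uses future information, so a live pipeline would need a trailing or training-window standardisation instead.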

The interactive demo below plots the committed BIC(K) sweep for K ∈ {2, 3, 4} on the asset you select. The minimum is highlighted, and it lands on K = 4 on every one of the ten deepest-WFO assets. The panel underneath shows the fitted four-state structure: each regime’s standardized (logret, vol, trend) means, its share of bars, and its stay-probability.

Demo: Gaussian-mixture BIC sweep & regime strip

Synthetic log-returns with three planted regimes. EM-fit a K-component mixture for K ∈ {2,…,6}; pick K by BIC; colour the price by MAP regime.

Controls: K (states) = 3, seed = 7

  • K* by BIC: 2
  • BIC at K*: −3602
  • log L̂ at K: 1820
  • BIC at K: −3590
  • n bars: 500
  • planted K: 3
BIC = −2·log L̂ + p(K)·log n, with p(K) = (K−1) + 2K. Lower is better. The argmin (highlighted amber) is the BIC-preferred mixture.
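As a quick check of the demo readout, plugging the displayed K = 3 values (log L̂ = 1820, n = 500) into the formula, and assuming the natural logarithm, gives:

```latex
p(3) = (3 - 1) + 2 \cdot 3 = 8, \qquad
\mathrm{BIC}(3) = -2\,(1820) + 8 \ln 500 \approx -3640 + 49.7 = -3590.3
```

This agrees with the displayed BIC at K of −3590 after rounding.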
[Chart: BIC(K), lower is better. K = 2: −3602; K = 3: −3590; K = 4: −3572; K = 5: −3553; K = 6: −3535.]
[Chart: cumulative log-price coloured by MAP regime (K = 3), t = 0 to t = 500.]

BIC selects K* = 2 on this synthetic series (planted K = 3). The mixture ignores the persistence structure that an HMM exploits and relies on distributional separation alone, so on shorter samples it can prefer K = 2 or K = 4. The production HMM uses the full transition matrix and is better calibrated. Reseed to explore that variance.

Figures

Fig. 1: BIC(K) across ten 30-minute crypto pairs. On every asset the curve is monotone-decreasing through K = 4 and the marginal gain shrinks sharply afterwards; BIC selects K = 4 universally.
Fig. 2: LTC/USDT 30m price coloured by inferred regime under the K = 4 HMM. Long monochromatic stretches reflect the high stay-probabilities (0.97–0.99) and dwell times of 32–76 bars.
Fig. 3: The same K = 4 fit on BTC/USDT 30m. The four-state structure transfers across assets: trend-up, trend-down, drift, and a distinct high-vol regime are recovered without any retuning.
Fig. 4: Estimated transition matrix A on LTC 30m. The diagonal dominance, with every p_kk at roughly 0.97 or above, is what makes the regimes useful: state estimates are stable enough to condition risk on.
Fig. 5: Sanity check on synthetic data with three known regimes. EM recovers the planted state sequence with ≥98% per-bar accuracy; BIC correctly identifies K* = 3 here.

Why this matters for systematic strategies

Most strategy evaluations average performance over a single mixed sample and report a single Sharpe. Conditioning on zt partitions that sample by regime and exposes the structure that the average hides: a strategy that is +1.5 Sharpe in trend-up and −0.8 Sharpe in high-vol is not the same object as a strategy that is +0.3 in every regime, even if both have the same blended Sharpe. The HMM gives us the partition we need to make those statements quantitatively, and, because we use the smoothed posterior, to weight bars by regime-membership probability rather than thresholding on a hard label.
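A posterior-weighted, regime-conditional Sharpe ratio could be computed along these lines. The function name `regime_sharpe`, the population-variance convention, and the annualisation factor of 17,520 bars per year (48 bars a day, 365 days) are assumptions for illustration.

```python
import numpy as np

def regime_sharpe(returns, gamma, bars_per_year=17_520):
    """Posterior-weighted annualised Sharpe ratio per regime.

    returns : (T,)   per-bar strategy returns
    gamma   : (T, K) regime-membership probabilities
    """
    r = np.asarray(returns, dtype=float)[:, None]   # shape (T, 1)
    w = gamma / gamma.sum(axis=0, keepdims=True)    # weights sum to 1 per regime
    mean = (w * r).sum(axis=0)
    var = (w * (r - mean) ** 2).sum(axis=0)
    return mean / np.sqrt(var) * np.sqrt(bars_per_year)

# toy usage with hard (one-hot) memberships and no annualisation
returns = np.array([0.02, 0.00, -0.02, 0.00])
gamma = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
sharpe = regime_sharpe(returns, gamma, bars_per_year=1)
```

With one-hot memberships this reduces to the ordinary per-regime Sharpe; with soft posteriors, bars near a transition contribute fractionally to both neighbouring regimes instead of being forced into one.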

The same posterior feeds the position sizer: scale exposure by P(zt ∈ favourable | x1:t) computed in the causal forward pass (filtering, not smoothing, for live use). When the chain is confident we lean in; when entropy spikes near a transition we de-risk. The persistence we measure (τ ≈ 16–38 h) is the timescale that makes this kind of conditional sizing economically viable on 30-minute data.
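The causal forward pass and a simple sizing rule might look like the sketch below. The names `forward_filter` and `exposure`, the `max_leverage` parameter, and the linear scaling rule are hypothetical; emission likelihoods `B[t, k] = p(x_t | z_t = k)` are assumed to be precomputed.

```python
import numpy as np

def forward_filter(pi, A, B):
    """Filtered posteriors P(z_t = k | x_{1:t}); row t uses data up to t only."""
    T, K = B.shape
    filt = np.zeros((T, K))
    a = pi * B[0]
    filt[0] = a / a.sum()
    for t in range(1, T):
        a = B[t] * (filt[t - 1] @ A)    # predict one step, then update
        filt[t] = a / a.sum()
    return filt

def exposure(filt, favourable, max_leverage=1.0):
    """Scale exposure by the filtered probability of a favourable regime."""
    return max_leverage * filt[:, list(favourable)].sum(axis=1)

# toy usage: state 0 is treated as the favourable regime
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
filt = forward_filter(pi, A, B)
size = exposure(filt, favourable=[0])
```

Because each row depends only on observations up to that bar, the same function can run incrementally in production; an entropy-based multiplier could be layered on top to de-risk near transitions.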

Reproducibility

DaruFinance / strategy-regime

Python · open source reference implementation

Minimal invocation

import numpy as np
from strategy_regime import fit_hmm, bic_sweep, posterior

# X: T x d feature matrix (logret, vol, trend) per bar
sweep = bic_sweep(X, K_grid=[2, 3, 4, 5, 6])
K_star = min(sweep, key=lambda r: r["bic"])["K"]

model = fit_hmm(X, n_states=K_star, n_iter=200, seed=0)
gamma = posterior(model, X)         # T x K smoothed P(z_t | x_{1:T})
states = gamma.argmax(axis=1)        # MAP regime per bar
dwell  = 1.0 / (1.0 - np.diag(model.transmat_))   # expected dwell per regime

References

  [1] Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2), 357–384.
  [2] Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–286.
  [3] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6(2), 461–464.