Work in progress
This review is still in progress. Some claims have not yet been verified, and the results are not yet complete.
López de Prado, Reproduction & Review
A from-scratch reproduction and at-scale extension of Marcos López de Prado's methods across crypto, US equities and forex, with every empirical claim gated by the Deflated Sharpe Ratio
This is a reproduction-and-extension program built from scratch around the methods in Marcos López de Prado's Advances in Financial Machine Learning and the surrounding papers: information-driven bars, fractional differentiation, financial labeling and cross-validation, ensembles and feature importance, portfolio construction, causal factors, predictive features, and bet sizing. Each study reproduces the original claim on a controlled benchmark, then re-runs it at scale on real, fully-costed, no-look-ahead data spanning crypto, US equities and forex, with every empirical headline gated by the Deflated Sharpe Ratio.
The honest through-line is simple: almost nothing clears deflated significance anywhere, which is precisely López de Prado's thesis. Several of his methodological claims nonetheless reproduce cleanly: the activity clock really does thin the tails, fractional differencing really does keep the memory, bagging really does generalize where boosting does not, and a confounder really can manufacture a factor out of nothing. These are methodology results, not money results, and they are framed that way throughout.
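Since every headline here is gated by the Deflated Sharpe Ratio, a minimal sketch of that gate may help fix ideas. It follows the published formula from Bailey and López de Prado (expected maximum Sharpe under the multiple-testing null, then a probabilistic Sharpe test against that benchmark). It is not the repository's code: the function names, parameter names and the example inputs are assumptions made for illustration, and Sharpe ratios are taken per period, not annualized.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()        # standard normal, mean 0 and sigma 1


def expected_max_sharpe(n_trials, sr_variance):
    """Expected maximum Sharpe ratio across n_trials skill-less strategies.

    sr_variance is the variance of the Sharpe estimates across all trials.
    """
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    z1 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    z2 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * z1 + EULER_GAMMA * z2)


def deflated_sharpe_ratio(sr, n_obs, n_trials, sr_variance, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe ratio exceeds the multiple-testing null.

    sr     : per-period Sharpe ratio of the selected (best) strategy
    n_obs  : number of return observations behind sr
    skew   : skewness of the strategy returns
    kurt   : kurtosis of the strategy returns (3.0 for a Gaussian)
    """
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)


# Hypothetical inputs: the same Sharpe ratio judged against 10 and 1,000 trials.
dsr_few = deflated_sharpe_ratio(sr=0.1, n_obs=1000, n_trials=10, sr_variance=0.01)
dsr_many = deflated_sharpe_ratio(sr=0.1, n_obs=1000, n_trials=1000, sr_variance=0.01)
print(dsr_few, dsr_many)
```

The point of the sketch is the direction of the effect: holding the observed Sharpe ratio fixed, disclosing more trials raises the null benchmark and lowers the deflated probability.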
Code & data
github.com/DaruFinance/lopez-de-prado-work-review
12 studies · 3 asset classes
Every claim deflated
real, fully-costed, no-look-ahead data
The studies
Meta-Strategy Organization
The assembly-line model: specialized, separable research roles plus mandatory disclosure of every trial, as the structural antidote to the lone-quant backtest search.
Backtest Overfitting & the Deflated Sharpe Ratio
Across ~92,500 real strategies in crypto, equities and forex, the best strategy fails to beat its multiple-testing null in any of the three asset classes.
Information-Driven Bars
Dollar, volume and tick bars Gaussianize returns in all three markets; the effect depends on bar granularity, and equities need session handling.
Fractional Differentiation
Fixed-width fractional differencing keeps ~0.98 correlation with the price level, versus ~0.01 for plain returns.
Labeling & Cross-Validation
k-fold leakage scales with the ratio of label horizon to fold size; meta-labeling is a precision filter, not alpha.
Predictive Features
Structural-break, entropy and microstructure features carry a weak signal that does not survive costs and deflation; a cheap proxy matches expensive order-flow data.
Ensembles & Feature Importance
Bagging generalizes ~4.8× better than boosting, and does so on all 40 instruments; MDI is substitution-biased, MDA is not.
Trading Rules & Bet Sizing
Bet sizing cuts turnover 80–87% but adds no deflated edge; Triple-Penance AR(1) drawdown control is the validated win.
Portfolio Construction: HRP, NCO & Denoising
HRP and NCO beat raw Markowitz on out-of-sample variance; denoising's value is a function of q = T/N, the ratio of observations to assets.
Causal Factor Investing
A confounder makes a null factor look significant 100% of the time; backdoor adjustment fixes it, and few real factors survive.
Sample Uniqueness & Sequential Bootstrap
Overlapping triple-barrier labels make observations non-IID; uniqueness weighting and sequential bootstrap restore the effective sample size before training.
Cross-Sectional ML
Trees clear deflation in the cross-section where single-series ML fails: lgbm survives the Deflated Sharpe Ratio on 10 of 10 horizons. This is the program's first DSR-surviving ML edge, and it approaches but does not beat the best static archetype.
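As one worked illustration of the studies above, the memory-preservation claim in Fractional Differentiation can be sketched on simulated data. This is a minimal sketch under stated assumptions, not the study's code: the series is a synthetic random walk, the window is set by an explicit `width` argument instead of a weight-threshold rule, and `d = 0.4` is an arbitrary choice. It does not reproduce the ~0.98 and ~0.01 figures, which come from the study's own data; it only shows the mechanism.

```python
import random
from math import sqrt


def fracdiff_weights(d, width):
    """First `width` weights of (1 - L)^d: w0 = 1, wk = -w(k-1) * (d - k + 1) / k."""
    weights = [1.0]
    for k in range(1, width):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def fracdiff_fixed_width(series, d, width):
    """Fixed-width fractional differencing: y[t] = sum_k w[k] * x[t - k].

    Returns one value per index t >= width - 1, so every output uses
    the same number of lags.
    """
    w = fracdiff_weights(d, width)
    return [
        sum(w[k] * series[t - k] for k in range(width))
        for t in range(width - 1, len(series))
    ]


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Synthetic log-price: a Gaussian random walk (an assumption, not real data).
random.seed(0)
log_price, running = [], 0.0
for _ in range(5000):
    running += random.gauss(0.0, 0.01)
    log_price.append(running)

width = 100
fd = fracdiff_fixed_width(log_price, d=0.4, width=width)
returns = [b - a for a, b in zip(log_price, log_price[1:])]

corr_fd = pearson(fd, log_price[width - 1:])  # fractionally differenced vs level
corr_ret = pearson(returns, log_price[1:])    # plain returns vs level
print(corr_fd, corr_ret)
```

With `d = 1` and `width = 2` the weights reduce to `[1, -1]`, so plain returns are the special case that discards the level entirely; fractional orders below 1 keep a slowly decaying weight on the past, which is where the retained memory comes from.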
The selection-discipline theme that runs through this program is the same one behind "The edge is in the process" and the broader body of work at Research.

