Article — 14 min read — 2026-05-04

Edge is in the Process

Why 400,000+ strategies find alpha that individuals cannot.

  • 30: asset/timeframe combinations in corpus
  • 23M+: strategy-windows evaluated (full corpus)
  • 14.5%: profitable OOS individually (PF > 1.2)
  • 0.32: median IS→OOS rank correlation
  • Strong: portfolio profitability lifts under the filter in the strong regime
  • Proprietary: filter construction reserved to Daru Finance

I tested millions of strategies. Almost none of them carry durable edge.

That sentence is precise. The full empirical apparatus underlying this article spans 30 asset/timeframe combinations across crypto majors (BTC, ETH, BNB, SOL, XRP, ADA, DOGE, LINK, LTC, AVAX, etc.), mid-caps (APE, ATOM, ICP, NEAR, ALGO, HBAR, ARB, SUI, APT, ZEC, others), and three forex pairs (AUDUSD, USDCAD, USDCHF). Each combination is evaluated under sliding walk-forward optimisation, with indicator families spanning ATR, EMA, MACD, PPO, RSI, RSI-LEVEL, SMA, and STOCHK, and each parameter grid covers thousands of variants per asset.

Aggregated across the entire 30-asset substrate, the corpus weighs in at roughly 23 million strategy-windows once the IS and OOS samples and Daru Finance's proprietary perturbation suite are unrolled. Per-window aggregates published in detail for four assets (BTC, DOGE, BNB, SOL) illustrate the broader pattern; the qualitative findings reproduce on every other asset partition the firm has run.

Across the full corpus, fewer than one in seven strategies show an out-of-sample profit factor above 1.2 — a deliberately modest threshold. With ~38,000 parameter variants per asset and 30 partitions, multiple-comparisons inflation predicts a meaningful fraction of false positives at any unadjusted threshold. The relevant question is not "did any individual strategy print a profitable backtest" — many did. It is "do those individuals carry that edge into the next window in any predictable way." The answer, which we develop below, is essentially no — until you change the unit of analysis from the strategy to the population.
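
To see why, a back-of-the-envelope sketch is enough. The per-test false-positive rate below is a hypothetical placeholder, not a corpus estimate; only the pool size comes from the numbers above.

    # Illustrative only: multiple-comparisons inflation at an unadjusted threshold.
    # `p_false` is a hypothetical per-test false-positive rate, not a corpus estimate.
    n_variants_per_asset = 38_000   # rough grid size quoted above
    n_assets = 30
    p_false = 0.02                  # chance a no-edge strategy clears PF > 1.2 in one evaluation

    n_tests = n_variants_per_asset * n_assets
    print(f"tests run:                {n_tests:,}")
    print(f"expected false positives: {n_tests * p_false:,.0f}")
    # Even a 2% per-test rate produces ~23,000 spuriously 'profitable' strategies,
    # which is why an unadjusted profitable backtest says little about durable edge.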

If you came here expecting a strategy you can run, this is where most articles would lose you. Stay another 13 minutes.

[Interactive widget: strategy pool]

The setup

The empirical pipeline is deliberately exhaustive — chosen so that nothing is hiding in a clever parameter setting. Every asset is evaluated independently. Each strategy class is parameterised across a Cartesian product of indicator parameters, signal-line variants, stop-loss regimes, and risk-reward ratios. After deduplication, the result is roughly 38,000 parameterised strategies per asset.
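
Concretely, a parameter grid of this kind is just a Cartesian product over indicator and trade-management settings. The sketch below uses a toy grid; the parameter names and value ranges are placeholders, not the actual grids behind the ~38,000 figure.

    from itertools import product

    # Toy parameter grid for one indicator family; names and ranges are placeholders.
    lookbacks   = [10, 20, 50, 100, 200]
    signal_lags = [5, 9, 12]
    stop_losses = [0.01, 0.02, 0.05]        # fractional stop-loss distances
    risk_reward = [1.0, 1.5, 2.0, 3.0]

    grid = [
        {"lookback": lb, "signal_lag": sl, "stop": st, "rr": rr}
        for lb, sl, st, rr in product(lookbacks, signal_lags, stop_losses, risk_reward)
    ]
    print(len(grid), "variants for this toy family")
    # The article's grids are far denser: richer ranges crossed over all indicator
    # families, then deduplicated, reach roughly 38,000 variants per asset.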

Walk-forward optimisation slides a fixed-width in-sample window across the full price history. At each window, the best parameter set is selected by an in-sample objective (Sharpe, profit factor, or a robustness composite, depending on the experiment), and the chosen parameters are then applied untouched to the next out-of-sample window. The process repeats. A 12-week IS / 1-week OOS scheme over 7+ years of data yields hundreds of windows per strategy.
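
A minimal sketch of the window mechanics, assuming weekly bars so that the IS/OOS lengths can be expressed directly in bar counts:

    def walk_forward_windows(n_bars, is_len, oos_len, step=None):
        """Yield (is_slice, oos_slice) index pairs for a sliding walk-forward scheme."""
        step = step or oos_len
        start = 0
        while start + is_len + oos_len <= n_bars:
            yield (slice(start, start + is_len),
                   slice(start + is_len, start + is_len + oos_len))
            start += step

    # Example: ~7 years of weekly bars, 12-week IS, 1-week OOS.
    n_weeks = 7 * 52
    windows = list(walk_forward_windows(n_weeks, is_len=12, oos_len=1))
    print(len(windows), "IS/OOS windows")   # hundreds of windows, as described above
    # At each window: pick parameters on the IS slice by the IS objective,
    # then apply them untouched to the OOS slice.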

The point of this design is not to find the best strategy — it is to characterise the distribution of OOS outcomes given an honest IS selection process. That distribution is the protagonist.

Fig. 1: Window-level scatter of in-sample robustness rank against next-window OOS outcome. The cloud is wide; the trend is shallow. Source: Daru Finance empirical corpus, illustrative four-asset panel.

The negative result

The cleanest way to see strategy-level futility is to plot in-sample profit factor against out-of-sample profit factor. If IS predicted OOS, the joint distribution would concentrate along the diagonal. It does not. Instead, density piles up on a horizontal stripe near OOS PF ≈ 1, regardless of where the strategy sat on the IS axis.

[Interactive widget: decay map]

The median rank correlation between IS and OOS profitability across all (strategy, window) pairs is approximately 0.32. That number is non-trivial — it reflects that some in-sample structure does carry over. But 0.32 is well below what you would need to make individual strategy selection a positive-expectation operation in the presence of transaction costs, regime variation, and the multiple-comparisons problem. Crucially, it does not concentrate predictably enough to lift any single strategy above the noise floor of an honest implementation budget.
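
The statistic itself is easy to reproduce on any pool of backtests. The sketch below uses random placeholder profit factors purely to show the mechanics; on independent noise the median sits near zero, whereas the corpus reports roughly 0.32.

    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)

    # Placeholder data: per-window IS and OOS profit factors for a strategy pool.
    n_windows, n_strategies = 200, 5_000
    is_pf  = rng.lognormal(sigma=0.3, size=(n_windows, n_strategies))
    oos_pf = rng.lognormal(sigma=0.3, size=(n_windows, n_strategies))

    # Rank correlation between IS and OOS profitability, window by window.
    per_window_rho = np.array([spearmanr(is_pf[w], oos_pf[w])[0] for w in range(n_windows)])
    print("median IS->OOS rank correlation:", round(float(np.median(per_window_rho)), 3))
    # Independent placeholders give a median near 0; the real corpus sits near 0.32,
    # so some structure carries over, but far less than strategy selection needs.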

Across millions of backtests, no single strategy carried statistically credible out-of-sample edge. Read that twice.

This is not a finding about these particular strategies. The same shape appears across crypto assets and across forex pairs (the latter held out as a robustness check). It appears across indicator families. It appears across timeframes. The conclusion forced by the data is structural: in this empirical setting, individual strategy selection is a coin flip with extra steps.

The pivot — what if the question is wrong?

The natural reaction to that conclusion is to look harder for better strategies. That impulse is wrong, and the data above is the receipt.

If you accept that the strategy-population is irreducibly noisy at the individual level, the right reformulation is: can we find a property of the population, evaluable from in-sample data, that predicts out-of-sample portfolio behaviour? This is a different question, and it has a different answer.

That portfolios of weak strategies can outperform their individual components, and that robustness-aware selection rules outperform naïve Sharpe ranking, are findings that have appeared repeatedly in the academic literature (López de Prado, Bailey, Harvey, Liu, and others). The companion paper to this article, available on SSRN, situates our results within that lineage.

What is novel in our work — and what is the firm's edge — is a specific construction of a robustness filter that operationalises this insight at scale across the 30-asset corpus. The internal mechanics of that filter are proprietary to Daru Finance's consulting work and are not disclosed here. What follows is the empirical demonstration that such a filter exists, that it works, and that it telegraphs the regimes where it does not.

The robustness filter — what it is, and what it isn't

For the purposes of this article, the filter is a black box. It ingests the in-sample performance of every strategy in the pool and emits, for each strategy, a robustness signal that captures how reliably its in-sample edge sits above its own noise floor. Strategies are ranked by that signal; portfolios are constructed from the top of the ranking. The construction itself (the noise model, the aggregation rule, the threshold calibration) is part of our consulting work and remains confidential.
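
To be concrete about the interface, and only the interface, the sketch below shows how a ranking filter plugs into portfolio construction. The scoring function is a generic stand-in (an in-sample mean scaled by its own standard error); it illustrates the plumbing, not Daru Finance's robustness signal.

    import numpy as np

    def robustness_score(is_returns: np.ndarray) -> float:
        """Stand-in robustness signal: IS mean return scaled by its own noise floor.
        Purely illustrative; the firm's actual construction is proprietary."""
        noise = is_returns.std(ddof=1) / np.sqrt(len(is_returns)) + 1e-12
        return float(is_returns.mean() / noise)

    def select_portfolio(pool: dict[str, np.ndarray], top_k: int) -> list[str]:
        """Rank every strategy in the pool by the signal; keep the top of the ranking."""
        ranked = sorted(pool, key=lambda name: robustness_score(pool[name]), reverse=True)
        return ranked[:top_k]

    # Usage: `pool` maps strategy id -> in-sample return series; the selected names
    # are then traded, equally weighted, over the next out-of-sample window.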

The interesting empirical claim is not "strategies that score well by the filter perform well." That would be a tautology: selection bias dressed in Greek letters. The claim is stronger: the filter signal, computed entirely on in-sample windows, retains predictive power for the next-window OOS profitability of the resulting portfolio. That is testable. It is also what most published filter results fail to demonstrate.

[Interactive widget: robustness distribution, with a filter-strictness slider]

The slider above shows what happens to a population's mean OOS profit factor as you raise the filter's strictness. The curve is monotonic across all four illustrative assets over the operationally useful range. Survival rate falls (that is the cost of the filter) but conditional OOS profitability lifts. The same qualitative shape obtains across the broader 30-asset corpus.
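
The mechanics of such a strictness sweep are simple to sketch. The scores and OOS profit factors below are synthetic, with a weak dependence deliberately built in, so the output only demonstrates the shape of the computation, not the empirical result.

    import numpy as np

    rng = np.random.default_rng(1)

    # Synthetic pool: each strategy has a robustness score and a next-window OOS PF.
    n = 38_000
    score  = rng.normal(size=n)
    oos_pf = np.exp(0.05 * score + rng.normal(scale=0.3, size=n))   # weak built-in link

    for q in (0.0, 0.5, 0.8, 0.9, 0.95, 0.99):
        kept = score >= np.quantile(score, q)
        print(f"strictness {q:4.2f}: survival {kept.mean():6.1%}, "
              f"mean OOS PF of survivors {oos_pf[kept].mean():.3f}")
    # Survival falls as strictness rises while conditional OOS PF lifts: the same
    # qualitative shape, here produced by construction rather than measured.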

Fig. 2: Robustness-signal distributions per asset for the illustrative four. The right-tail concentration is what the filter selects on; the construction of the signal itself is proprietary.

This is the empirical hinge. If the filter signal had no predictive power, the curve would be flat. It is not flat.

The portfolio result

Now the operational question: build portfolios from the filter-selected pool, and measure profitability at the portfolio level across sliding windows. The answer is striking — across the strong-regime sliding runs of one of Daru Finance's deep-WFO experiments, portfolio profitability lifts dramatically under the proprietary filter, while the same pool with no filter underperforms and the same pool with random selection sits at noise level.

[Interactive widget: portfolio MC simulator]

Toggle between "None," "Random," and "Robust" in the simulator above. The portfolio outcome shifts not by a few percent but by orders of magnitude. The strategies are the same. The data is the same. What changed is the rule for selecting which strategies enter the portfolio.
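
A stripped-down version of the simulator's logic looks like the sketch below. The pool is synthetic, and the link between the stand-in score and the P&L is built in by construction, so the numbers it prints illustrate the mechanics rather than reproduce the corpus result.

    import numpy as np

    rng = np.random.default_rng(2)

    def portfolio_pf(trade_pnl: np.ndarray) -> float:
        """Profit factor of the pooled trades: gross profit over gross loss."""
        gains, losses = trade_pnl[trade_pnl > 0].sum(), -trade_pnl[trade_pnl < 0].sum()
        return float(gains / losses) if losses > 0 else float("inf")

    # Synthetic pool: a stand-in score per strategy, and OOS trade P&L that is
    # weakly linked to it by construction.
    n, trades = 5_000, 20
    score = rng.normal(size=n)
    pnl   = rng.normal(loc=0.05 * score[:, None], scale=1.0, size=(n, trades))

    k = 50
    rules = {
        "None":   np.arange(n),                          # take the whole pool
        "Random": rng.choice(n, size=k, replace=False),  # arbitrary subset
        "Robust": np.argsort(score)[-k:],                # top of the ranking
    }
    for name, idx in rules.items():
        print(f"{name:6s}: OOS portfolio PF = {portfolio_pf(pnl[idx].ravel()):.2f}")
    # Same strategies, same data; only the selection rule changes.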

The alpha is not in the strategy. It is in the question you ask of the population.

For an even stricter test, apply institutional hurdle-style pass/fail rules — fixed profit targets and drawdown caps, the kind a desk or evaluation programme would impose externally — to the same robust-filtered portfolios. The pass-rate framework, the per-asset thresholds, and the simulated outcomes are part of Daru Finance's proprietary evaluation protocol and are not disclosed here. The qualitative finding is what matters publicly: under the filter, simulated pass rates lift materially over unfiltered baselines, and the lift is asset- and regime-dependent in a way the filter's own internal signal predicts.
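
As an illustration of what "hurdle-style pass/fail rules" means operationally, the check below applies a profit target and a drawdown cap to an equity curve. The thresholds are hypothetical placeholders, not the per-asset values used in the firm's evaluation protocol.

    import numpy as np

    def passes_hurdles(equity: np.ndarray,
                       profit_target: float = 0.10,
                       max_drawdown: float = 0.06) -> bool:
        """Hypothetical desk-style rule: reach the profit target without ever
        breaching the drawdown cap. Thresholds are placeholders, not the
        per-asset values of any evaluation programme."""
        peak = np.maximum.accumulate(equity)
        drawdown = 1.0 - equity / peak
        hit_target = equity[-1] / equity[0] - 1.0 >= profit_target
        return bool(hit_target and drawdown.max() <= max_drawdown)

    # Usage: apply to each filtered portfolio's OOS equity curve, one per window,
    # and report the fraction of windows that pass.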

These are not curve-fit numbers. They are the result of running the same filter against the same strategy pool across hundreds of IS/OOS windows, then evaluating each resulting portfolio against fixed external rules.

Fig. 3: Portfolio-level profit factor distributions before and after the robustness filter, across sliding windows. The shift is consistent and large.

When the filter fails — and why that's the point

A filter that always works is suspicious. The interesting feature of this one is that it telegraphs its own failure when its internal robustness signal degrades.

Across one of the deep-WFO sliding-window experiments, the early strong regime yields high portfolio profitability under the filter. Then the market regime breaks. The filter's internal robustness signal drops sharply. Profitability collapses, and one of the runs deep in the adversarial period produces an outcome with a profitable-portfolio fraction near zero — a signal-confirmed failure mode rather than a surprise.

Sim F — Regime Forensics

When the filter's internal robustness signal degrades, portfolio profitability collapses. The filter telegraphs its own failure. Illustrative; specific experiment numbers withheld.

[Interactive plot: 12 sliding runs, illustrative]

What's important is that the filter's own signal warned about this run. The pool-wide robustness reading dropped to a level where the filter operationally says "the recent past is not informative." When that signal fires, you stand down.
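
Operationally, "stand down" can be as simple as mapping the pool-wide reading to an exposure scale. A minimal sketch, with hypothetical calibration points:

    def exposure_scale(pool_robustness: float, floor: float, full: float) -> float:
        """Map the pool-wide robustness reading to an exposure scale in [0, 1].
        `floor` and `full` are hypothetical calibration points."""
        if pool_robustness <= floor:
            return 0.0   # the recent past is not informative: stand down
        if pool_robustness >= full:
            return 1.0   # strong regime: run the filtered portfolio at full size
        return (pool_robustness - floor) / (full - floor)   # linear ramp in between

    # Example: scale = exposure_scale(pool_reading, floor=0.2, full=0.6)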

Fig. 4: Per-window portfolio profitability across one of Daru Finance's deep-WFO sliding-window experiments, showing the regime breakdown. The regime-collapse window is foreshadowed by degradation in the filter's own internal signal several runs earlier.

A second deep-WFO experiment on a different asset shows the same shape: portfolio profitability holds at high levels through the early strong-regime runs, collapses sharply mid-stream as the pool's robustness signal degrades, and recovers as the signal recovers. Across runs, the filter's internal signal and the realised portfolio profitability move together — the predictive power of the filter, measured directly. The exact correlation value is part of Daru Finance's internal evaluation record and is not published here.

The filter does not predict the future. It predicts whether the recent past is still informative.

Edge is in the process

The synthesis is uncomfortable for retail-style strategy hunting and reassuring for systematic firms.

Across this empirical corpus, individual strategies do not carry persistent OOS edge. A robustness filter applied to the same pool transforms noise-level individual outcomes into portfolio-level outcomes that pass institutional hurdles in strong regimes — and signals its own failure in weak ones. The locus of detectable, exploitable edge in this setting is not any particular strategy. It is the evaluation process the strategies pass through.

This has practical implications:

  • Strategy hoarding without an evaluator is asset-light theatre. A library of 38,000 backtested strategies adds zero edge if the selection rule is "the one with the highest IS Sharpe."
  • The right unit of analysis is the population, not the strategy. Population-level statistics such as robustness aggregates, eigenvalue concentration of the strategy correlation matrix, and tail-coupling carry predictive structure where individual metrics do not; a minimal sketch of one such aggregate follows this list.
  • Filter robustness is a regime-detection problem, not a strategy-fit problem. When the filter's own internal signal degrades, the right response is to reduce exposure, not to rotate strategies.
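
One of the population-level aggregates named above, eigenvalue concentration of the strategy correlation matrix, is straightforward to compute. The sketch below is a generic implementation of that statistic, not the proprietary filter signal.

    import numpy as np

    def eigenvalue_concentration(returns: np.ndarray, k: int = 1) -> float:
        """Share of total variance captured by the top-k eigenvalues of the
        strategy correlation matrix. `returns` has shape (n_periods, n_strategies)."""
        corr = np.corrcoef(returns, rowvar=False)
        eig = np.sort(np.linalg.eigvalsh(corr))[::-1]
        return float(eig[:k].sum() / eig.sum())

    # A rising concentration means the pool is increasingly driven by one common
    # factor, a population property that no single strategy's metrics can reveal.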

The construction of the specific filter that produced these portfolio outcomes is proprietary; what is published here is the empirical demonstration that such a process exists and works under regimes that can be detected in advance. The companion paper on SSRN develops the higher-level methodology and situates the result in the published literature; the firm's specific operational construction is reserved for consulting engagements.

Fig. 5: Strategy-level versus portfolio-level outcome distributions. The mass shift is not a curve fit; it is the filter doing the work.

If you take one thing from this piece, take this: when the empirical setting tells you no individual instrument has signal, the productive question is not which instrument did I miss. It is what process can I apply to the population such that the population behaves better than its individuals?

That is what the rest of Daru Finance is about.