Sharpe vs Sortino Ratio: Which One Should You Use?
Sharpe ratio, Sortino ratio, and Calmar ratio explained. When to use each, why upside volatility should not penalize you, and how risk-adjusted return metrics fit into a real investment workflow.
You cannot evaluate an investment strategy by return alone. A strategy that returns 20% per year sounds great until you learn it has 60% drawdowns and 80% annualized volatility. Risk-adjusted return metrics (Sharpe, Sortino, Calmar) exist to put return and risk on a single scale. The question is which one to use, and the answer depends on what you mean by "risk."
Sharpe ratio: the default
The Sharpe ratio is the workhorse of quantitative finance. Defined by William Sharpe in 1966, it measures excess return per unit of total volatility:
Sharpe = (R_p − R_f) / σ_p
where R_p is the annualized strategy return, R_f is the risk-free rate over the same period, and σ_p is the annualized standard deviation of strategy returns. A Sharpe of 1.0 means the strategy earns one unit of excess return per unit of volatility, historically a good benchmark for active strategies. Sharpes above 2.0 are excellent; above 3.0 should make you suspicious that something is wrong (look-ahead bias, survivorship bias, or simply not enough data to estimate honestly).
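As a minimal sketch of the formula above, the function below computes an annualized Sharpe ratio from periodic returns using only the Python standard library. The function name, the `risk_free_per_period` and `periods_per_year` parameters, and the sample returns are illustrative assumptions, and the square-root-of-time annualization assumes returns are independent from one period to the next.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from a list of periodic simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    mean_excess = statistics.fmean(excess)
    vol = statistics.stdev(excess)  # sample standard deviation (n - 1 divisor)
    if vol == 0:
        raise ValueError("volatility is zero; Sharpe is undefined")
    # Mean scales with time and volatility with sqrt(time),
    # assuming independent periods (an assumption, not a given).
    return (mean_excess / vol) * math.sqrt(periods_per_year)

# Hypothetical monthly returns, for illustration only.
monthly_returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04,
                   0.00, 0.01, -0.01, 0.02, 0.03, -0.01]
print(round(sharpe_ratio(monthly_returns, risk_free_per_period=0.002), 2))
```

Doubling every return (with a zero risk-free rate) leaves the result unchanged, which is the scale invariance that makes Sharpe comparable across strategies of different sizes.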
Sharpe is popular because it is simple, scale-invariant (you can compare it across strategies of different sizes), and well-understood. It is the metric most quant shops report by default.
It also has a problem.
The upside-volatility problem
Sharpe treats all volatility as bad. A single huge winning month can lower a strategy's Sharpe, because that month inflates σ by more than it lifts the average return. This is mathematically defensible (variance is variance), but it does not match how investors actually think about risk. Nobody thinks of "the day I gained 10%" as a risk event. Risk is loss, not surprise.
For strategies with symmetric return distributions (most diversified equity portfolios), this does not matter much: upside and downside vol are roughly equal, and Sharpe is approximately right. For asymmetric strategies, it can be very wrong. Trend-following strategies, for example, often have positive skew: lots of small losses punctuated by occasional huge wins. Sharpe penalizes the wins as much as the losses, which underrates the strategy.
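The upside penalty is easy to see on a toy example. The sketch below compares two hypothetical 12-month series that are identical except that the final month of the second is replaced by a 20% gain. Every month of the second series is at least as good as the first, yet its per-period Sharpe is lower. The helper name and the numbers are made up for illustration.

```python
import statistics

def per_period_sharpe(returns, risk_free=0.0):
    """Unannualized Sharpe: mean excess return over sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.fmean(excess) / statistics.stdev(excess)

# Hypothetical series: steady small gains every month.
steady = [0.01, 0.02] * 6
# Same series, but the last month is a 20% windfall instead of 2%.
with_big_win = steady[:-1] + [0.20]

print(per_period_sharpe(steady))
print(per_period_sharpe(with_big_win))
```

The windfall doubles the average monthly return, but it raises the standard deviation roughly tenfold, so the ratio falls.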
Sortino ratio: only downside counts
The Sortino ratio, developed by Frank Sortino in the 1980s, fixes the upside-volatility problem by replacing total volatility with downside deviation:
Sortino = (R_p − R_f) / σ_d
where σ_d is the downside deviation: the root-mean-square of the shortfalls below a target threshold (typically zero or the risk-free rate). Returns above the threshold contribute nothing to the calculation. The denominator captures "how much do bad outcomes vary?" while ignoring the variation of good outcomes.
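A minimal sketch of the Sortino calculation follows. It assumes the common convention in which shortfalls below the target are squared and averaged over all periods (not only the losing ones), and it uses one `target` value both as the threshold and as the rate subtracted in the numerator. Function names, parameters, and sample data are illustrative assumptions.

```python
import math

def downside_deviation(returns, target=0.0):
    """Root-mean-square shortfall below the target, averaged over all periods."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))

def sortino_ratio(returns, target=0.0, periods_per_year=12):
    """Annualized Sortino ratio from periodic simple returns."""
    mean_excess = sum(r - target for r in returns) / len(returns)
    dd = downside_deviation(returns, target)
    if dd == 0:
        raise ValueError("no returns below target; Sortino is undefined")
    # Same sqrt-of-time annualization assumption as for Sharpe.
    return (mean_excess / dd) * math.sqrt(periods_per_year)

# Hypothetical monthly returns, for illustration only.
sample = [-0.03, 0.05, 0.01, -0.01]
print(downside_deviation(sample))
print(sortino_ratio(sample))
```

Replacing the 5% month with a 50% month leaves `downside_deviation` untouched, which is exactly the property Sharpe lacks.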
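For strategies with positive skew, Sortino is materially higher than Sharpe. For strategies with negative skew (most carry trades, short-volatility strategies), the advantage shrinks: most of the volatility sits on the downside, so downside deviation is close to total volatility and Sortino ends up only modestly above Sharpe. When the mean excess return is positive and downside deviation is computed as the root-mean-square shortfall over all periods, it cannot exceed total volatility, so Sortino does not fall below Sharpe; negative skew shows up as a narrower gap.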
Sortino is the better default metric for evaluating strategies because it more closely matches the intuitive definition of risk. The only downside of Sortino is that it requires more data to estimate stably (you are computing statistics on a subset of the data, which has higher sample-size requirements).
Calmar ratio: focus on drawdowns
The Calmar ratio, named after the California Managed Account Reports newsletter that popularized it in the 1990s, focuses on drawdown rather than volatility:
Calmar = R_p / MaxDrawdown
A strategy that returns 15% per year with a 30% max drawdown has a Calmar of 0.5. A strategy that returns 10% with a 10% max drawdown has a Calmar of 1.0. Calmar treats max drawdown as the relevant risk metric, which it is, psychologically, for many investors.
Calmar is harder to estimate stably than Sharpe or Sortino because max drawdown is a single-point statistic that depends heavily on the time window. A 3-year backtest will usually show a smaller max drawdown than a 10-year backtest of the same strategy, simply because a longer window has more chances to contain a deep drawdown. For honest reporting, Calmar should always be reported alongside the time window used to compute it, and should be computed over at least 5-10 years to be meaningful.
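The window dependence can be sketched in a few lines. The code below computes max drawdown from a compounded equity curve and a Calmar ratio on top of it, then compares a short window against the full history. The function names, the `periods_per_year` parameter, and the annual returns are hypothetical; drawdown is expressed as a positive fraction.

```python
def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def calmar_ratio(returns, periods_per_year=12):
    """Compound annual return divided by max drawdown over the same window."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    years = len(returns) / periods_per_year
    annual_return = growth ** (1.0 / years) - 1.0
    mdd = max_drawdown(returns)
    if mdd == 0:
        raise ValueError("no drawdown in window; Calmar is undefined")
    return annual_return / mdd

# Hypothetical annual returns over six years, for illustration only.
annual = [0.10, -0.20, 0.30, 0.10, -0.40, 0.50]
print(max_drawdown(annual[:3]))   # drawdown seen in the first three years
print(max_drawdown(annual))       # drawdown seen in the full history
print(calmar_ratio(annual, periods_per_year=1))
```

The first three years contain only the milder of the two losing years, so the short window reports a much smaller drawdown than the full six-year history.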
When to use each
- Use Sharpe when: the strategy has a roughly symmetric return distribution; you want a single, widely-understood, comparable metric; you have at least 36 months of monthly data or 252 days of daily data.
- Use Sortino when: the strategy has asymmetric returns (trend-following, options strategies, anything with skew); you want to penalize bad outcomes specifically; you have enough data to estimate downside deviation stably.
- Use Calmar when: you care about max drawdown as the primary risk concern; you are evaluating long-horizon strategies (ideally 5-10 years of data, since shorter windows rarely contain a representative drawdown); you are presenting to investors who think in drawdown terms.
- Report all three when: writing serious due diligence material. Each captures something different and they should reinforce or contradict each other in ways that are themselves informative.
Some real-world numbers
For calibration, here are typical ratios for various asset classes over the 2010-2024 period (approximate, illustrative):
- S&P 500: Sharpe ≈ 0.9, Sortino ≈ 1.3, Calmar ≈ 0.5 (max DD around -34% in 2020 and -25% in 2022)
- NASDAQ 100: Sharpe ≈ 1.0, Sortino ≈ 1.5, Calmar ≈ 0.5
- 60/40 portfolio: Sharpe ≈ 0.7, Sortino ≈ 1.0, Calmar ≈ 0.6
- Managed futures (BTOP50): Sharpe ≈ 0.4, Sortino ≈ 0.7, Calmar ≈ 0.5
- Long-volatility strategies: Sharpe ≈ -0.2, Sortino ≈ -0.4 (consistently bleeding except in crisis)
These numbers shift substantially over different windows: 2010-2024 is a particular regime (a long bull market in equities, with suppressed volatility for most of it). Be skeptical of any ratio reported without a clear specification of the window used.
Common abuses of risk-adjusted metrics
- Cherry-picking the window. A strategy reported "Sharpe 2.5 since 2019" might be cherry-picking a particularly favorable period. Always ask for the longest available track record.
- Smoothing illiquid returns. Private investments (PE, real estate, illiquid hedge funds) often report monthly returns that are smoothed by appraisal-based marks rather than true market prices. This artificially lowers measured volatility and inflates Sharpe. Adjust for smoothing or refuse to compare smoothed to unsmoothed returns.
- Reporting Sharpe with hourly or minute data on a strategy that turns over slowly. Annualizing from high-frequency returns assumes each period is independent; serial correlation and stale prices at short horizons break that assumption and can distort the annualized figure, often upward. Use monthly or daily.
- Ignoring transaction costs. Pre-cost Sharpe is meaningless. The relevant number is net-of-cost.
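The smoothing effect in the list above can be illustrated with a toy model. The sketch below takes a hypothetical series of "true" monthly returns and builds a "reported" series in which each month is the average of the current and previous true return. This two-period average is only a stand-in for appraisal-based marks, not a model of any real valuation process. The series is chosen so both versions have the same average return.

```python
import statistics

def per_period_sharpe(returns):
    """Unannualized Sharpe with a zero risk-free rate."""
    return statistics.fmean(returns) / statistics.stdev(returns)

def smooth(returns):
    """Hypothetical appraisal smoothing: average each return with the previous one."""
    reported = [returns[0]]
    for prev, cur in zip(returns, returns[1:]):
        reported.append(0.5 * prev + 0.5 * cur)
    return reported

# Hypothetical true monthly returns (first and last equal, so the mean is preserved).
true_returns = [0.04, -0.02, 0.03, -0.01, 0.05, -0.03,
                0.02, 0.00, 0.03, -0.02, -0.01, 0.04]
reported_returns = smooth(true_returns)

print(statistics.stdev(true_returns), per_period_sharpe(true_returns))
print(statistics.stdev(reported_returns), per_period_sharpe(reported_returns))
```

The economic return is identical in both series; only the measured volatility changes, and the Sharpe ratio rises with it.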
Conclusion
For most retail and professional investors, Sortino is the better default than Sharpe, with Calmar as a useful sanity check on drawdown tolerance. The Sharpe ratio is not wrong; it is just optimized for a world of symmetric returns, which is not the world most strategies live in. Report all three when in doubt.
See risk-adjusted metrics live on every backtest in ARIA, including walk-forward Sharpe, Sortino, Calmar, and the Deflated Sharpe Ratio that we cover in our walk-forward backtesting guide. Free tier includes basic metrics, Premium includes the full battery.
Frequently asked questions
Is a higher Sharpe ratio always better?
Higher is better all else equal, but Sharpe alone can mislead. A strategy with a very high Sharpe over a short window may have just been lucky. A strategy with a moderate Sharpe but a long track record across multiple regimes is more reliable. Always look at Sharpe in the context of sample size and regime coverage. Sharpe above 3 over a short window should make you suspicious of overfitting or look-ahead bias.
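To get a feel for how much luck a short window allows, the sketch below uses a widely cited large-sample approximation for the standard error of an estimated Sharpe ratio, sqrt((1 + SR²/2) / T), where SR is the per-period Sharpe and T is the number of periods. It assumes independent, identically distributed, roughly normal returns, which real strategies often violate, so treat the output as a rough lower bound on uncertainty. The function name and parameters are illustrative.

```python
import math

def sharpe_standard_error(annual_sharpe, n_periods, periods_per_year=12):
    """Approximate standard error of an annualized Sharpe estimate (i.i.d. assumption)."""
    per_period = annual_sharpe / math.sqrt(periods_per_year)
    se_per_period = math.sqrt((1.0 + 0.5 * per_period ** 2) / n_periods)
    return se_per_period * math.sqrt(periods_per_year)

# A reported annualized Sharpe of 2.0, measured over 1 year vs 10 years of monthly data.
print(round(sharpe_standard_error(2.0, n_periods=12), 2))
print(round(sharpe_standard_error(2.0, n_periods=120), 2))
```

Under these assumptions, a Sharpe of 2.0 measured on one year of monthly data has a standard error of about 1.1, so it is statistically hard to tell apart from a Sharpe of 1.0 or lower; ten years of data cuts the standard error to roughly 0.34.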
Why is Sortino usually higher than Sharpe?
Downside deviation counts only the shortfalls below the target, so it is smaller than total volatility, and dividing the same excess return by a smaller number gives a higher ratio. The bigger the positive skew in returns (i.e., the more of the volatility that comes from the upside), the bigger the gap. For strategies with negative skew (carry trades, short-vol), the gap narrows because the downside is concentrated and large: Sortino moves closer to Sharpe, although for a strategy with positive excess return it does not fall below it under the standard definition of downside deviation.
What is a good Calmar ratio?
For long-only equity strategies, a Calmar above 0.5 over a 10-year window is solid; above 1.0 is excellent. For trend-following CTAs, 0.3-0.5 is typical. Be especially skeptical of Calmar ratios computed over short windows: three years is not enough to see a real max drawdown for most strategies. Always ask for the time window.