Cointegration and Statistical Arbitrage: A Pairs-Trading Primer
How cointegration powers pairs trading. The Engle-Granger and Johansen tests, the spread construction, half-life estimation, and the position-sizing math that makes the strategy survive transaction costs.
Pairs trading is one of the oldest statistical-arbitrage strategies still in active use. The intuition is simple: find two securities whose prices move together for fundamental reasons, build a spread between them, and trade the spread when it deviates from its long-run average. The implementation hangs on a single technical concept, cointegration, and much of the literature on statistical arbitrage is a refinement of three questions: how do you test for cointegration honestly, build the spread cleanly, and size positions sensibly?
This article walks through the cointegration machinery, the standard pairs-trading workflow, and the pitfalls that turn a profitable backtest into an unprofitable live strategy.
What cointegration is
Two time series are cointegrated if each is non-stationary (each has a unit root, like a random walk with or without drift) but some linear combination of the two is stationary (mean-reverting). The classic example is KO and PEP: each price wanders on its own, but the ratio (or log-ratio) mean-reverts because Coca-Cola and Pepsi compete in the same product market, so any persistent divergence tends to be competed away as consumers switch.
Mathematically, if X_t and Y_t are I(1) (integrated of order 1, meaning their first differences are stationary), they are cointegrated if there exists a coefficient β such that Y_t − β X_t is I(0) (stationary). The coefficient β is the "cointegrating coefficient" (strictly, the cointegrating vector is (1, −β)); the residual Y_t − β X_t is the "spread."
For trading, the cointegrating relationship is the source of the signal. The spread is stationary: it is not strictly bounded, but it keeps returning to its long-run mean. When it deviates far from that mean, you bet on reversion: go long the underpriced leg and short the overpriced leg, in proportion to the cointegrating coefficient β.
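The definition can be illustrated with simulated data. The sketch below is only an illustration under assumed parameters (β = 1.5, an AR(1) spread with coefficient 0.9, unit-variance shocks); none of these numbers come from real prices.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
beta = 1.5  # assumed cointegrating coefficient, chosen for illustration

# X is a pure random walk, so it is I(1).
x = 100 + np.cumsum(rng.normal(size=n))

# The spread is a stationary AR(1) process: each shock decays by 10% per step.
shocks = rng.normal(size=n)
spread = np.zeros(n)
for t in range(1, n):
    spread[t] = 0.9 * spread[t - 1] + shocks[t]

# Y inherits the random-walk trend of X, so it is I(1) as well,
# but the combination Y - beta * X recovers the stationary spread.
y = beta * x + spread
recovered = y - beta * x

print("sample std of X:     ", round(float(x.std()), 2))
print("sample std of Y:     ", round(float(y.std()), 2))
print("sample std of spread:", round(float(recovered.std()), 2))
```

Both legs wander over a wide range while the spread stays in a narrow band around zero, which is the property a pairs trade relies on.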
Testing for cointegration: Engle-Granger
The Engle-Granger two-step test is the simplest cointegration test. Step 1: regress Y_t on X_t (and a constant) using OLS to get the cointegrating coefficient β. Step 2: run an Augmented Dickey-Fuller test on the residuals from step 1 to test the null hypothesis that the residuals have a unit root. If you reject the unit-root null, you conclude cointegration.
The critical values for the ADF test in step 2 differ from those of the standard ADF test because the residuals are estimated rather than observed: OLS chooses β to make the residuals look as stationary as possible, which biases the test toward rejecting the unit root. You have to use the MacKinnon critical values that account for this. Most stats packages do this automatically; if you are rolling your own, do not skip this step or you will get a flood of false positives.
A practical issue with Engle-Granger is that the choice of which variable to put on the left-hand side affects the estimated β. The standard fix is to run the regression both ways and check that the two estimates are consistent, that is, that β from regressing Y on X is close to the reciprocal of the coefficient from regressing X on Y. If they disagree substantially, the relationship is weak and probably not cointegrated.
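The two steps and the both-ways check can be sketched with NumPy alone. This is a simplified illustration, not a replacement for a library routine: step 2 uses a plain Dickey-Fuller regression with no lag augmentation, and the 5% critical value of about -3.34 is the approximate asymptotic MacKinnon value for two variables with a constant. In practice a packaged test such as `statsmodels.tsa.stattools.coint` handles lag selection and critical values for you. The simulated data and the helper name `engle_granger` are assumptions made for the example.

```python
import numpy as np

def engle_granger(y, x):
    """Return (beta, t_stat) from a simplified Engle-Granger two-step test."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)

    # Step 1: cointegrating regression y = alpha + beta * x + u.
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef

    # Step 2: Dickey-Fuller regression on the residuals (no constant, no lags):
    # delta_u_t = rho * u_{t-1} + e_t, and the test statistic is rho / se(rho).
    du = np.diff(resid)
    lag = resid[:-1]
    rho = (lag @ du) / (lag @ lag)
    err = du - rho * lag
    s2 = (err @ err) / (len(du) - 1)
    t_stat = rho / np.sqrt(s2 / (lag @ lag))
    return coef[1], t_stat

# Simulated cointegrated pair with a true beta of 2 (illustrative only).
rng = np.random.default_rng(7)
n = 1000
x = 50 + np.cumsum(rng.normal(size=n))
noise = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    noise[t] = 0.5 * noise[t - 1] + eps[t]
y = 1.0 + 2.0 * x + noise

CRIT_5PCT = -3.34  # approximate asymptotic value; an assumption of this sketch

beta_yx, t_yx = engle_granger(y, x)   # Y on X
beta_xy, t_xy = engle_granger(x, y)   # X on Y
print(f"Y on X: beta={beta_yx:.3f}, t={t_yx:.2f}, reject unit root: {t_yx < CRIT_5PCT}")
print(f"X on Y: implied beta={1 / beta_xy:.3f}, t={t_xy:.2f}")
```

If `beta_yx` and `1 / beta_xy` are far apart, that is the disagreement described above and a reason to distrust the pair.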
Johansen test for multivariate cointegration
For more than two series, the Johansen procedure is the standard. It uses a vector error-correction model (VECM) to test for the number of cointegrating relationships in a system of N variables. The test produces two statistics. The trace statistic tests the null of at most r cointegrating relationships against the alternative of more than r; the maximum eigenvalue statistic tests the null of r against exactly r+1.
The Johansen approach is more powerful than Engle-Granger and handles more than two assets cleanly. For sector baskets, ETF arbitrage, and multi-leg statistical arbitrage strategies, it is the standard tool. The cost is complexity: you have to specify the lag length, the deterministic terms (constant in cointegration vs. constant in VAR), and you have to be careful about how to interpret the cointegrating vectors when there are multiple of them.
Spread construction and half-life
Once you have a cointegrating pair, the spread is S_t = Y_t − β X_t. The trading signal is based on how far S_t is from its long-run mean. The standard implementation is a z-score: z_t = (S_t − μ_S) / σ_S, where μ_S and σ_S are the rolling mean and standard deviation. Enter long the spread when z_t < -2 (spread is unusually low), enter short when z_t > 2, and exit when z_t returns to within 0.5 of zero.
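The entry and exit rules translate into a small state machine. The sketch below assumes a trailing window for the rolling moments and uses the thresholds from the text (enter at |z| > 2, exit at |z| < 0.5); the function names and the window length are illustrative choices.

```python
import numpy as np

def rolling_zscore(spread, window):
    """Z-score of each point against the trailing `window` observations.

    Only data up to and including time t is used, so no future
    information enters the signal. Early points are NaN.
    """
    spread = np.asarray(spread, dtype=float)
    z = np.full(len(spread), np.nan)
    for t in range(window - 1, len(spread)):
        hist = spread[t - window + 1 : t + 1]
        sd = hist.std(ddof=1)
        if sd > 0:
            z[t] = (spread[t] - hist.mean()) / sd
    return z

def positions_from_z(z, entry=2.0, exit_band=0.5):
    """Return +1 (long spread), -1 (short spread), or 0 (flat) per period."""
    pos = np.zeros(len(z), dtype=int)
    current = 0
    for t, zt in enumerate(z):
        if not np.isnan(zt):
            if current == 0:
                if zt < -entry:
                    current = 1      # spread unusually low: buy it
                elif zt > entry:
                    current = -1     # spread unusually high: sell it
            elif abs(zt) < exit_band:
                current = 0          # spread back near its mean: close
        pos[t] = current
    return pos

example_z = np.array([0.0, -2.5, -1.0, -0.3, 0.1, 2.2, 1.0, 0.4])
print(positions_from_z(example_z).tolist())  # [0, 1, 1, 0, 0, -1, -1, 0]
```

"Long the spread" here means long Y and short β units of X; the position on the X leg is the spread position multiplied by −β.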
The crucial parameter is the half-life of mean reversion, the expected time for the spread to revert halfway to its mean from a deviation. The half-life is estimated from an OLS regression of the spread's first difference on its lagged level: ΔS_t = α + γ S_{t-1} + ε_t. The half-life is then h = -ln(2) / ln(1+γ), which is defined only for −1 < γ < 0; an estimate of γ at or above zero means the spread showed no mean reversion over the sample. A half-life of 5 trading days is fast; 20 days is moderate; 100 days is slow.
Half-life matters for two reasons. First, it sets your holding period: you should hold the position for roughly the half-life, not longer. Second, it sets your transaction-cost budget. A slow spread allows only a few round trips per year, so each trade has to clear its costs by a comfortable margin: if the half-life is 100 days and you are paying 10 bp round-trip, you need the spread's standard deviation to be at least ~25 bp; otherwise the strategy bleeds.
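One way to make the cost budget concrete is to compute the expected reversion over the holding period and subtract costs. The sketch below is a back-of-envelope model, not the author's formula: it assumes the spread follows an AR(1) process, so a deviation is expected to shrink by a factor of 0.5 for every half-life that passes; it assumes entry at a z-score of 2; and it treats `cost_bp` as the all-in round-trip cost across both legs. The function name is hypothetical.

```python
def expected_net_edge_bp(sigma_bp, cost_bp, half_life_days, holding_days, entry_z=2.0):
    """Expected profit per trade in basis points, net of round-trip costs.

    sigma_bp:       standard deviation of the spread, in bp
    cost_bp:        all-in round-trip cost for both legs, in bp (assumption)
    half_life_days: estimated half-life of the spread
    holding_days:   planned holding period
    """
    deviation_bp = entry_z * sigma_bp
    # Under AR(1) dynamics the expected remaining deviation halves every half-life.
    fraction_reverted = 1.0 - 0.5 ** (holding_days / half_life_days)
    return deviation_bp * fraction_reverted - cost_bp

# Numbers from the text: 100-day half-life, 10 bp costs, 25 bp spread volatility,
# held for one half-life.
print(expected_net_edge_bp(25, 10, 100, 100))   # 15.0
# A tighter spread with the same costs loses money on average.
print(expected_net_edge_bp(5, 10, 100, 100))    # -5.0
```

The model ignores slippage, borrow fees, and the capital tied up during a slow trade, so the true margin is thinner than the output suggests.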
Position sizing in pairs trading
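The natural position size follows the cointegrating coefficient. If β was estimated on price levels, it is a share ratio: long 1 share of Y and short β shares of X. If it was estimated on log prices, it is a dollar ratio: long $1 of Y and short $β of X. Either way, the hedge neutralizes the common trend and exposes you only to the spread. In practice, you scale this by the absolute value of the z-score (larger deviations get larger positions) and by your overall risk budget.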
A subtle issue: the cointegrating coefficient β drifts. If you fix β at its training-period value and the true relationship changes, you accumulate directional exposure unintentionally. The standard fix is rolling re-estimation: refit β every N days using the trailing K days of data. Kalman filtering of β is a more elegant approach that handles slow drift smoothly.
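A Kalman filter for the hedge ratio can be sketched as follows. The model is an assumption of this example: the state [β, α] follows a random walk, and each observation is y_t = β_t x_t + α_t plus noise. The process variance `q`, observation variance `r`, and diffuse initial variance `p0` are illustrative tuning values that would need calibration on real data.

```python
import numpy as np

def kalman_hedge_ratio(x, y, q=1e-5, r=1.0, p0=1e6):
    """Track a time-varying slope and intercept of y on x.

    Returns an array of shape (n, 2) whose rows are the filtered
    [beta, alpha] estimates after seeing each observation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    state = np.zeros(2)              # [beta, alpha]
    P = np.eye(2) * p0               # state covariance, diffuse start
    Q = np.eye(2) * q                # random-walk process noise
    out = np.empty((len(x), 2))

    for t in range(len(x)):
        # Predict: a random-walk state keeps its mean; uncertainty grows by Q.
        P = P + Q
        # Update with the observation y_t = H @ state + noise.
        H = np.array([x[t], 1.0])
        innovation = y[t] - H @ state
        S = H @ P @ H + r            # innovation variance
        K = P @ H / S                # Kalman gain
        state = state + K * innovation
        P = P - np.outer(K, H) @ P
        out[t] = state
    return out

# Sanity check on noiseless data generated with beta = 2 and alpha = 5.
x_demo = np.linspace(10.0, 110.0, 300)
y_demo = 2.0 * x_demo + 5.0
estimates = kalman_hedge_ratio(x_demo, y_demo)
print("final beta estimate:", round(float(estimates[-1, 0]), 3))
```

The estimate in row t already uses y_t, so a backtest should size the trade at time t with the estimate from row t − 1 to avoid look-ahead. A larger `q` lets β adapt faster at the price of a noisier hedge ratio.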
Common pitfalls
- Spurious cointegration. Independent random walks can pass the Engle-Granger test by chance. Always require a fundamental reason to believe the relationship is real before trusting the statistical test.
- Look-ahead in spread normalization. If you compute the z-score using the full-sample mean and standard deviation, you have leaked future information. Use rolling moments only.
- Ignoring transaction costs. Spreads with a 30-day half-life and 50 bp standard deviation can easily be unprofitable once typical retail transaction costs, including borrow costs on the short leg, are counted. Be ruthless about the cost-adjusted Sharpe.
- Trading too many pairs. The expected number of false-positive cointegration results scales with the number of pairs you test. If you scan 10,000 pairs with no real relationships among them, you should expect roughly 500 (5% of 10,000) to look cointegrated at the 5% level by chance. Use Bonferroni correction or false-discovery rate control.
- Position concentration. Pairs that have worked historically tend to cluster (same sector, same factor exposure). Holding 10 pairs from the same sector is one bet, not ten.
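The two corrections named in the multiple-testing pitfall can be sketched in plain Python. The Benjamini-Hochberg step-up procedure is used here as one common form of false-discovery rate control; the p-values in the example are made up for illustration.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure at false-discovery rate q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])

    # Find the largest rank k whose sorted p-value is at most q * k / m.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff = rank

    # Reject every hypothesis ranked at or below that cutoff.
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject

# Hypothetical cointegration-test p-values for five candidate pairs.
pvals = [0.03, 0.5, 0.001, 0.02, 0.012]
print(bonferroni(pvals))          # [False, False, True, False, False]
print(benjamini_hochberg(pvals))  # [True, False, True, True, True]
```

Bonferroni is the stricter of the two: it controls the chance of even one false positive, while Benjamini-Hochberg tolerates a fixed expected share of false discoveries among the pairs that survive.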
Where pairs trading still works
The classic pairs-trading edge in single-name equities is largely arbitraged away: the spreads are too tight and the transaction costs too high for retail-scale strategies. The edge still exists in (a) ETF vs. underlying basket (ETF arbitrage), (b) cross-listed shares (ADR arbitrage), (c) calendar spreads in futures (intra-curve mean reversion), and (d) crypto pairs across exchanges (with appropriate execution infrastructure). For each of these, the fundamental reason for cointegration is strong and the spreads are wide enough to support the strategy.
Conclusion
Cointegration is the mathematical foundation of pairs trading and statistical arbitrage. Engle-Granger and Johansen tests identify cointegrating relationships; rolling spread normalization produces trading signals; half-life estimation calibrates the holding period; and rolling β re-estimation handles slow drift. The strategy is no longer easy money in single-name equities, but the toolkit remains essential for ETF arbitrage, futures spreads, and crypto market-making.
ARIA Analyst uses cointegration-based screens to flag mean-reversion opportunities in its sector basket views. Create a free account to see live pair signals, or read how regime detection integrates with pairs trading (the strategy works in stable regimes and breaks in regime transitions). See also our walk-forward backtesting guide for honest validation methodology.
Frequently asked questions
What is the difference between correlation and cointegration?
Correlation measures the linear relationship between two return series, a moment-to-moment association. Cointegration is a long-run equilibrium relationship between two price series. Two assets can be highly correlated in returns without being cointegrated in prices (their long-run ratio can drift), and two assets can be cointegrated in prices without having particularly high return correlation. For pairs trading, you want cointegration, not correlation.
How long does a cointegrating relationship typically last?
For fundamentally driven pairs (e.g., a stock vs. its sector ETF, or two stocks in the same industry with similar business models), cointegration can be stable for years. For relationships driven by transient factors (a temporary M&A premium, a calendar effect), it can break within months. Always re-test cointegration on a rolling window: a relationship that was cointegrated last year may not be this year, and a backtest that assumes a static cointegrating relationship will overestimate live performance.
Can I pairs-trade individual stocks profitably as a retail investor?
Probably not, at scale. Single-name equity pairs trading is heavily competed and the spreads are tight. The transaction cost wall, round-trip 10-20 bp plus borrow costs on the short leg, eats most of the edge. Better retail applications of cointegration logic are ETF-vs-basket arbitrage, sector-rotation strategies (cointegration between sector ETFs and benchmark), and crypto pairs across exchanges where infrastructure friction creates persistent spreads.