## Abstract

In this article, the authors use machine learning tools to analyze industry return predictability based on the information in lagged industry returns. Controlling for post-selection inference and multiple testing, they find significant in-sample evidence of industry return predictability. Lagged returns for the financial sector and commodity- and material-producing industries exhibit widespread predictive ability, consistent with the gradual diffusion of information across economically linked industries. Out-of-sample industry return forecasts that incorporate the information in lagged industry returns are economically valuable: Controlling for systematic risk using leading multifactor models from the literature, an industry-rotation portfolio that goes long (short) industries with the highest (lowest) forecasted returns delivers an annualized alpha of over 8%. The industry-rotation portfolio also generates substantial gains during economic downturns, including the Great Recession.

**TOPICS:** Big data/machine learning, analysis of individual factors/risk premia, portfolio construction, performance measurement

A voluminous literature investigates aggregate stock market return predictability.^{1} In contrast, a relatively limited number of studies examine stock return predictability along industry lines, even though analyst reports and asset allocations are often industry based. Studies that analyze industry return predictability typically rely on popular predictor variables from the literature on aggregate market return predictability, such as the aggregate dividend yield, nominal yields, and yield spreads (e.g., Ferson and Harvey 1991, 1999; Ferson and Korajczyk 1995; Avramov 2004). In this article, we examine industry return predictability using a different information set—namely, lagged industry returns from across the entire economy.

Our study is the first to directly analyze industry return predictability based on lagged industry returns from across the economy. The lack of previous studies investigating this topic is perhaps due to the statistical challenges inherent in estimating regression models with a large number of predictors.^{2} Our use of lagged industry returns to forecast individual industry returns is motivated by the theoretical model of Hong, Torous, and Valkanov (2007), who introduce information frictions into an economy with multiple linked industries. Due to the industry links, cash flow shocks originating in one industry can affect expected cash flows in related industries. In a frictionless, rational-expectations equilibrium, investors readily recognize all of the interindustry implications of a cash flow shock in a particular industry. Consequently, equity prices across all relevant industries immediately adjust to fully impound the interindustry effects of the cash flow shock, and lagged industry returns do not have predictive power. However, incorporating insights from Merton (1987) and Hong and Stein (1999), Hong, Torous, and Valkanov (2007) posit that investors with limited information-processing capabilities specialize in specific market segments. In this environment, when a cash flow shock arises in a particular industry, information-processing limitations prevent investors specializing in related industries from quickly working out the full implications of the shock. Information thus diffuses gradually across industries, and the resulting delayed adjustment in equity prices gives rise to industry return predictability on the basis of lagged industry returns.

We analyze the predictive ability of lagged industry returns in a general predictive regression framework that allows each industry’s return to respond to the lagged returns for all industries, thereby accommodating a wide array of industry links, both direct and indirect. Given the plethora of potential predictors in the predictive regression models, conventional ordinary least squares (OLS) estimation has potential drawbacks. First, if all lagged returns are used, the abundance of predictors makes OLS estimation susceptible to overfitting. Second, if only a few lagged returns are chosen, it is difficult to know a priori which are the most important. Therefore, we use a machine learning approach to guard against overfitting the data in our high-dimensional setting and to select the most relevant predictors. Along this line, Rapach, Strauss, and Zhou (2013); Gu, Kelly, and Xiu (2018); Chinco, Clark-Joseph, and Ye (2019); Han et al. (2019); and Freyberger, Neuhierl, and Weber (forthcoming) recently use machine learning to predict aggregate and/or individual stock returns. Although we focus on industry return predictability in the present article, we share with these studies the use of machine learning to deal with the challenges posed by high-dimensional data in finance.

We use the least absolute shrinkage and selection operator (LASSO; Tibshirani 1996), a powerful and popular technique in machine learning. Similarly to ridge regression (Hoerl and Kennard 1970), the LASSO induces shrinkage in the estimated coefficients by including a convex penalty term in the objective function for fitting a model. In contrast to the ℓ_{2} penalty term in ridge regression, the LASSO relies on an ℓ_{1} penalty term, so that it permits shrinkage to exactly zero for some coefficients. It thus performs variable selection, which typically yields a sparse model. Sparsity has two key advantages. First, it helps to avoid overfitting the data by setting unimportant coefficients to zero. Second, it facilitates interpretation of the estimated model by identifying the most relevant predictor variables.

Although the LASSO’s ℓ_{1} penalty term mitigates overfitting the data via sparsity, it also tends to overshrink the coefficients for the selected variables. This tendency can lead to substantial downward biases (in magnitude) in the estimated coefficients (Fan and Li 2001). Recent studies propose OLS post-LASSO estimation to reduce the biases (e.g., Efron et al. 2004; Meinshausen 2007; Belloni and Chernozhukov 2011, 2013). The idea is to first use the LASSO to reduce the dimension of the model; to lessen the biases in the LASSO coefficient estimates, we then re-estimate the coefficients for the selected predictor variables using OLS. We use OLS post-LASSO to estimate predictive regression models for each industry, where the set of candidate predictors includes the lagged returns for all 30 industries that we consider. OLS post-LASSO estimation allows us to identify the most relevant set of lagged industry returns for predicting a given industry’s return, while generating more accurate estimates of the coefficients for the relevant lagged industry returns.
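As a concrete sketch, OLS post-LASSO can be implemented in two steps with off-the-shelf tools. The simulated data, the fixed penalty `alpha=0.04`, and the variable names below are illustrative assumptions, not the article's actual specification (the article chooses the penalty via an information criterion):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T, N = 684, 30                          # months, industries (matching the sample size)
X = rng.standard_normal((T, N))         # stand-in for lagged industry excess returns
beta = np.zeros(N)
beta[[4, 17]] = [0.12, -0.08]           # two "true" predictors for illustration
y = X @ beta + 0.3 * rng.standard_normal(T)

# Step 1: the LASSO selects a sparse set of lagged industry returns.
lasso = Lasso(alpha=0.04).fit(X, y)
selected = np.flatnonzero(lasso.coef_)

# Step 2: refit OLS on the selected columns to undo the LASSO's overshrinkage.
Xs = np.column_stack([np.ones(T), X[:, selected]])
coef_post, *_ = np.linalg.lstsq(Xs, y, rcond=None)
```

In this setup the refitted coefficients are larger in magnitude than the shrunken LASSO estimates, which is the bias reduction that motivates the two-step procedure.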

We use both in-sample and out-of-sample tests to analyze the ability of lagged industry returns to predict individual industry returns. With respect to the in-sample tests, we address the important issues of *post-selection inference* and *multiple testing*. In the context of OLS post-LASSO estimation, post-selection inference recognizes that the predictor variables that appear in the regression model are first selected from a large group of candidate predictors, so that conventional inferences for the selected variables’ coefficients are not necessarily reliable, because the data have effectively been used twice. We incorporate insights from Berk et al. (2013); Leeb, Pötscher, and Ewald (2015); and Zhao, Shojaie, and Witten (2017) to make accurate post-selection inferences concerning industry return predictability.

Multiple testing relates to the fact that we test a large number of individual null hypotheses. In this context, conventional *p*-values—which implicitly assume that a researcher tests a given null hypothesis in isolation—can present a misleading picture of statistical significance. To account for multiple testing, we use the Benjamini and Hochberg (2000) adaptive version of the Benjamini and Hochberg (1995) linear step-up procedure to control the false discovery rate (FDR) when testing hypotheses concerning individual coefficients for the entire set of LASSO-selected lagged industry returns.^{3}

For the in-sample analysis, we estimate predictive regression models via OLS post-LASSO using monthly return data spanning 1960 to 2016 for 30 industry portfolios from Kenneth French’s Data Library. The LASSO selects at least one lagged industry return as a predictor for 29 of the individual industries, while multiple lagged industry returns are selected for 22 of the individual industries. Furthermore, the OLS post-LASSO estimation results indicate that the lagged industry returns selected by the LASSO are often statistically significant predictors of industry returns, even after controlling for post-selection inference and multiple testing. Although monthly equity returns inherently contain a relatively small predictable component, the Campbell and Thompson (2008) metric indicates that the degree of return predictability is economically meaningful for nearly all industries.

Lagged returns for the financial sector and commodity- and material-producing industries are selected as return predictors for numerous individual industries. The prominent role of these sectors is economically intuitive from the perspective of the theoretical model in Hong, Torous, and Valkanov (2007). Firms in many industries rely on financial intermediaries for financing. When the financial sector experiences a positive return shock, financial firms have larger capital buffers and thus become more willing to provide credit on favorable terms to industries across the economy; borrowers benefit directly from the better terms, and their customers benefit indirectly. In the presence of information frictions, we thus expect lagged financial sector returns to positively affect future returns in many industries, which is what we find. Furthermore, commodity price shocks raise product prices and returns for sectors located in earlier production stages, whereas they squeeze profit margins and lower returns for sectors located in later production stages. With information frictions, we thus expect returns for lagged commodity- and material-producing industries to negatively affect future returns for industries located in later stages of the production chain, which is again what we find.

The out-of-sample analysis simulates the situation of an investor in real time and measures the economic value of industry return predictability. We first compute out-of-sample forecasts of monthly industry returns based on predictive regressions estimated via OLS post-LASSO. We then sort the 30 industries according to their forecasted returns over the next month and construct a zero-investment portfolio that goes long (short) the top (bottom) quintile of sorted industries. For the January 1970 to December 2016 out-of-sample period, the industry-rotation portfolio generates an average return of 7.33% per annum, annualized Sharpe (Sortino) ratio of 0.65 (1.16), and Goetzmann et al. (2007) manipulation-proof performance measure (MPPM) of 4.84% for a relative risk aversion coefficient of four. The industry-rotation portfolio performs especially well during business-cycle recessions, thereby providing a hedge against bad times for the macroeconomy. The portfolio also exhibits negative exposures to the broad equity market and Fama and French (1993) value factors, and it generates a substantial annualized alpha of more than 8% in the context of both the Carhart (1997) four-factor and Hou, Xue, and Zhang (2015) *q*-factor models.

## PREDICTIVE REGRESSION FRAMEWORK

Our basic framework is the following general predictive regression model specification:

$$
\mathbf{r}_i = \mathbf{X}\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i \quad (1)
$$

where **r**_{i} is the *T*-vector of monthly observations for the return on industry portfolio *i* in excess of the one-month Treasury bill return (with typical element *r*_{i,t}), *T* is the usable number of monthly observations, **X** is a *T*-by-*N* data matrix of lagged industry excess return observations, *N* is the number of industry portfolios, **β**_{i} is an *N*-vector of regression coefficients, and **ε**_{i} is a *T*-vector of zero-mean disturbance terms. Equation 1 allows for lagged returns for all industries across the economy to affect a given industry’s excess return, thereby accommodating general industry links. Because of the high-dimensional nature (*N* = 30) of Equation 1, conventional OLS estimation runs the risk of overfitting. To deal with the challenges posed by the high dimensionality of the predictive regression models, we use the LASSO from machine learning.

Tibshirani (1996) introduces the LASSO as a regularization device for performing shrinkage in regressions with a large number of candidate predictor variables. For Equation 1, the LASSO objective function can be expressed as

$$
\hat{\boldsymbol{\beta}}_i = \arg\min_{\boldsymbol{\beta}_i} \left( \sum_{t=1}^{T} \left( r_{i,t} - \sum_{j=1}^{N} \beta_{i,j} r_{j,t-1} \right)^{2} + \lambda_i \sum_{j=1}^{N} \left| \beta_{i,j} \right| \right) \quad (2)
$$

where λ_{i} ≥ 0 is a regularization parameter. The first component in parentheses in Equation 2 is the familiar sum of squared residuals; the objective function thus reduces to that for OLS when λ_{i} = 0. The second component in parentheses is an ℓ_{1} penalty term that shrinks the slope coefficient estimates to prevent overfitting. Unlike ridge regression—which relies on an ℓ_{2} penalty—the ℓ_{1} penalty in Equation 2 allows for shrinkage to zero (for sufficiently large λ_{i}), so that it performs variable selection. By reducing the dimension of the model, variable selection helps to guard against overfitting and facilitates interpretation of the estimated model. Powerful algorithms are available for computing the solution to Equation 2, such as cyclical coordinate descent (Friedman, Hastie, and Tibshirani 2010), making LASSO estimation feasible even when *N* is large.

The LASSO generally performs well in selecting the most relevant predictor variables (e.g., Zhang and Huang 2008; Bickel, Ritov, and Tsybakov 2009; Meinshausen and Yu 2009). However, LASSO estimates of the coefficients for the selected predictors suffer from downward biases in magnitude (Fan and Li 2001). Intuitively, the LASSO penalty term overshrinks the coefficients for the selected predictors. To alleviate the biases in the LASSO estimates, Efron et al. (2004), Meinshausen (2007), and Belloni and Chernozhukov (2011, 2013) propose re-estimating the coefficients for the LASSO-selected predictors via OLS (OLS post-LASSO estimation).

Because the predictor variables are first selected from a set of candidate predictors, statistical inference based on OLS post-LASSO estimation constitutes post-selection inference, an active area of recent research (e.g., Wasserman and Roeder 2009; Meinshausen, Meier, and Bühlmann 2009; Berk et al. 2013; Leeb, Pötscher, and Ewald 2015; Lee et al. 2016; Tibshirani et al. 2016; Zhao, Shojaie, and Witten 2017).^{4} When making post-selection inferences, we first need to identify the inferential target. The traditional target is the set of true regression coefficients for the full model that includes all of the potential predictors (Leeb, Pötscher, and Ewald 2015), which, in our context, corresponds to **β**_{i} in Equation 1. Alternatively, Berk et al. (2013) advocate targeting the true regression coefficients for a submodel:

$$
\mathbf{r}_i = \mathbf{X}_{\mathcal{M}} \boldsymbol{\beta}_{i,\mathcal{M}} + \boldsymbol{\varepsilon}_{i,\mathcal{M}} \quad (3)
$$

where $\mathcal{M} \subseteq \{1, \ldots, N\}$ is the index set of predictors for the submodel, and $\mathbf{X}_{\mathcal{M}}$ is composed of the columns of **X** indexed by $\mathcal{M}$. Recognizing that the actual data-generating process is essentially unknowable, a useful submodel provides a parsimonious representation of the data that focuses on the most relevant predictors.^{5} Because they measure different things, the true regression coefficients for the full model and submodel are generally different: $\beta_{i,j}$ measures the response to a given predictor, conditional on the remaining *N* − 1 predictors in the full model; $\beta_{i,\mathcal{M},j}$ measures the response to the predictor, conditional on the other $|\mathcal{M}| - 1$ predictors in the submodel, where $|\mathcal{M}|$ is the cardinality of $\mathcal{M}$.

Let $\mathrm{CI}_{j}$, for any $j \in \hat{\mathcal{M}}$, denote the conventional OLS confidence interval for $\beta_{i,\hat{\mathcal{M}},j}$ in the model selected by the LASSO, where $\hat{\mathcal{M}}$ is the index set of predictors for the LASSO-selected model. Leeb, Pötscher, and Ewald (2015) use simulations to investigate the coverage properties of $\mathrm{CI}_{j}$. For the case in which the true regression coefficients for a submodel are targeted, they find that $\mathrm{CI}_{j}$ has approximately correct coverage. This finding is surprising, since the naïve OLS post-LASSO confidence interval does not explicitly account for the fact that the data are used twice—first to select the predictor variables from the complete set of potential predictors via the LASSO and then to estimate the regression coefficients for the LASSO-selected predictors via OLS. Zhao, Shojaie, and Witten (2017) provide a theoretical explanation for this finding. Under reasonably weak assumptions, they show that the set of LASSO-selected predictors is deterministic with high probability, so that the conventional OLS post-LASSO confidence intervals and *t*-statistics are asymptotically valid—asymptotically, it is as if the data are only used once. We target the true regression coefficients for the LASSO-selected submodel. In line with the results of Leeb, Pötscher, and Ewald (2015) and Zhao, Shojaie, and Witten (2017), we assess statistical significance using conventional OLS post-LASSO *t*-statistics.

The results of Leeb, Pötscher, and Ewald (2015) and Zhao, Shojaie, and Witten (2017) demonstrate that the *p*-value for the conventional OLS post-LASSO *t*-statistic is valid for making post-selection inferences for an individual coefficient in the LASSO-selected submodel. However, when analyzing the significance of a large number of individual coefficient estimates, the issue of multiple testing remains. When testing many individual null hypotheses, conventional *p*-values can present a misleading picture of statistical significance.

Suppose that we test *m* individual null hypotheses (*m* = 167 in our application in the next section), each at significance level *q*. One approach for addressing multiple testing is to control the family-wise error rate (FWER): FWER = Pr(*V* ≥ 1) ≤ *q*, where *V* is the number of wrongly rejected null hypotheses. The familiar Bonferroni procedure controls the FWER by multiplying each of the unadjusted *p*-values by *m*. However, FWER control tends to be extremely conservative, with little power to detect false null hypotheses.

Instead of the FWER, Benjamini and Hochberg (1995) propose to control the FDR: FDR = *E*(*V*/*R*) ≤ *q*, where *R* is the number of rejections. They develop a popular linear step-up procedure based on adjusted *p*-values to control the FDR. The Benjamini and Hochberg (1995) procedure orders the unadjusted *p*-values and their corresponding null hypotheses (*p*_{(1)}, …, *p*_{(}_{m}_{)} and *H*_{(1)}, …, *H*_{(}_{m}_{)}, respectively) and rejects the first *k* null hypotheses (*H*_{(1)}, …, *H*_{(}_{k}_{)}), where

$$
k = \max \left\{ j : \tilde{p}_{(j)} \leq q \right\} \quad (4)
$$

and

$$
\tilde{p}_{(j)} = \min_{l \geq j} \min \left( \frac{m}{l} \, p_{(l)}, 1 \right) \quad (5)
$$

is the adjusted *p*-value for *H*_{(}_{j}_{)}. Benjamini and Hochberg (1995) show that the linear step-up procedure controls the FDR when *p*_{(1)}, …, *p*_{(}_{m}_{)} are independent, while Benjamini and Yekutieli (2001) demonstrate that the procedure controls the FDR for a particular type of dependence (positive regression dependence) in the *p*-values.^{6}
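A minimal implementation of the step-up rule in Equations 4 and 5; the *p*-values fed in below are made-up inputs for illustration:

```python
import numpy as np

def bh_adjusted(pvals):
    """Benjamini-Hochberg (1995) step-up adjusted p-values (Equation 5)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                            # p_(1) <= ... <= p_(m)
    ranked = p[order] * m / np.arange(1, m + 1)      # (m / l) * p_(l)
    adj = np.minimum.accumulate(ranked[::-1])[::-1]  # min over l >= j
    out = np.empty(m)
    out[order] = np.minimum(adj, 1.0)
    return out

adj = bh_adjusted([0.001, 0.008, 0.039, 0.041, 0.042, 0.60])
k = int(np.sum(adj <= 0.05))   # number of rejections at q = 0.05; here k = 2
```

The Benjamini and Hochberg (2000) adaptive version used in the article additionally replaces *m* with a conservative estimate of the number of true null hypotheses.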

When the number of true null hypotheses *m*_{0} is less than *m*, the Benjamini and Hochberg (1995) procedure controls the FDR at too low a level: FDR ≤ *π*_{0}*q*, where π_{0} = *m*_{0}/*m*. To improve its performance, a number of studies (e.g., Benjamini and Hochberg 2000; Storey, Taylor, and Siegmund 2004; Benjamini, Krieger, and Yekutieli 2006; Blanchard and Roquain 2009) adapt the Benjamini and Hochberg (1995) procedure to the data by replacing *m* in Equation 5 with

$$
\hat{m}_0 = \hat{\pi}_0 m \quad (6)
$$

where $\hat{\pi}_0$ is a conservative estimate of *π*_{0}. We control for multiple testing via the FDR using the Benjamini and Hochberg (2000) adaptive procedure, which employs the Hochberg and Benjamini (1990) estimator for π_{0}. For values of *m* relevant to our application, simulations by Benjamini, Krieger, and Yekutieli (2006) indicate that the Benjamini and Hochberg (2000) adaptive procedure delivers good overall performance in terms of FDR control and power, including for dependent *p*-values.^{7}

Finally, to implement OLS post-LASSO estimation, we need to choose the regularization parameter λ_{i} in Equation 2. We choose λ_{i} using the Hurvich and Tsai (1989) corrected version of the Akaike information criterion (AIC; Akaike 1973). In the realistic case in which the true model is not included among the candidate models, Flynn, Hurvich, and Simonoff (2013) show that the corrected AIC (AIC_{c}) is asymptotically efficient for choosing λ_{i} for OLS post-LASSO estimation. This means that the AIC_{c} asymptotically selects the best-performing model from among the candidate models in terms of the ℓ_{2} loss criterion for predictive performance. In addition, Flynn, Hurvich, and Simonoff (2013) find that the AIC_{c} performs well for selecting the best-performing model in finite-sample simulations.

A popular alternative strategy for selecting λ_{i} in Equation 2 is *K*-fold cross-validation. We use the AIC_{c} for simplicity, because *K*-fold cross-validation involves the choice of the number of folds *K*, as well as how the folds are defined. Nevertheless, the use of a typical five-fold cross-validation yields results similar to those based on the AIC_{c}.
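To illustrate the tuning step, λ_{i} can be chosen by computing the AIC_c for the OLS refit on each LASSO-selected predictor set across a grid of candidate values. The AIC_c expression below is one standard form of the Hurvich and Tsai (1989) correction applied with the number of fitted parameters in the refit; the grid and simulated data are illustrative assumptions, not the article's exact implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso

def aicc(rss, T, k):
    # Corrected AIC for a Gaussian regression with k fitted mean parameters
    return T * np.log(rss / T) + T * (T + k) / (T - k - 2)

def choose_lambda(X, y, grid):
    """Return (lambda, selected indices) minimizing AICc of the post-LASSO OLS refit."""
    T = len(y)
    best = (np.inf, None, None)
    for lam in grid:
        sel = np.flatnonzero(Lasso(alpha=lam).fit(X, y).coef_)
        Xs = np.column_stack([np.ones(T), X[:, sel]])
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ coef) ** 2))
        crit = aicc(rss, T, Xs.shape[1])
        if crit < best[0]:
            best = (crit, lam, sel)
    return best[1], best[2]

rng = np.random.default_rng(1)
T, N = 684, 30
X = rng.standard_normal((T, N))
y = 0.12 * X[:, 4] - 0.08 * X[:, 17] + 0.3 * rng.standard_normal(T)
lam, sel = choose_lambda(X, y, np.logspace(-3, -1, 25))
```

With strong simulated signals, the criterion retains the two true predictors while penalizing larger selected sets that merely fit noise.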

## IN-SAMPLE RESULTS

We estimate predictive regressions via OLS post-LASSO using monthly excess return data for 30 value-weighted industry portfolios from Kenneth French’s Data Library,^{8} in which the industries are defined based on the Standard Industrial Classification system. Exhibit 1 reports summary statistics for the industry portfolio excess returns for December 1959 to December 2016. We refer to the industries by their Data Library abbreviations, which are given in the notes to Exhibit 1. Along with the fact that the industry portfolios are value weighted, starting the sample in December 1959 mitigates illiquidity and thin-trading concerns. We start the sample in December 1959 to account for the lagged predictors when estimating the predictive regressions, so that the available estimation sample covers January 1960 to December 2016 (684 observations). Smoke (Tobacco Products) displays the highest annualized average excess return and Sharpe ratio at 11.79% and 0.56, respectively, in Exhibit 1, whereas Steel (Steel Works, Etc.) has the lowest annualized average excess return and Sharpe ratio at 3.49% and 0.14, respectively.

Exhibit 2 reports the OLS post-LASSO coefficient estimates for each industry.^{9} As discussed previously, we target the true regression coefficients for the LASSO-selected submodel. To conserve space, we use a bold (italicized bold) entry to indicate that a coefficient estimate is significant at the 10% (5%) level based on the conventional OLS post-LASSO *t*-statistic.

Overall, the OLS post-LASSO estimates in Exhibit 2 highlight the importance of lagged industry returns for predicting individual industry returns. The LASSO selects 167 lagged industry returns (out of a possible 900) as individual industry return predictors. At least one lagged industry return is selected as a return predictor by the LASSO for 29 of the 30 individual industries, and multiple lagged industry returns are selected for 22 individual industries. According to the conventional OLS post-LASSO *t*-statistics, 82 (53) of the 167 LASSO-selected lagged industry returns are significant predictors at the 10% (5%) level.^{10} In general, autocorrelation plays a limited role in Exhibit 2: An industry’s own lagged return is only selected by the LASSO for seven industries.

Exhibit 3 controls for multiple testing across all 167 of the OLS post-LASSO coefficient estimates in Exhibit 2. The exhibit depicts the first 85 elements of the sequence of sorted unadjusted *p*-values for the conventional OLS post-LASSO *t*-statistics, along with corresponding adjusted *p*-values based on the Benjamini and Hochberg (2000) adaptive procedure. In line with the bold (italicized bold) entries in Exhibit 2, there are 82 (53) rejections of the null hypothesis that an individual coefficient is zero at the 10% (5%) significance level based on the unadjusted *p*-values. According to the Benjamini and Hochberg (2000) adjusted *p*-values, 72 (21) of these rejections remain when we control for multiple testing via the FDR. In conjunction with the out-of-sample results in the next section, the numerous significant coefficient estimates in Exhibit 2 that survive multiple-testing control suggest that our evidence of industry return predictability is not simply an artifact of data mining.

Many of the coefficient estimates in Exhibit 2 appear economically plausible from the perspective of gradual information diffusion across economically related industries (Hong, Torous, and Valkanov 2007). For example, lagged Fin (Banking, Insurance, Real Estate, and Trading) returns are selected by the LASSO for 19 of the 30 individual industries, and 11 (7) of the coefficient estimates are significant at the 10% (5%) level according to the conventional OLS post-LASSO *t*-statistics. Furthermore, all of the coefficient estimates for lagged Fin returns are positive. This is economically reasonable, as firms in many industries rely extensively on financial intermediaries for financing. A positive return shock in the financial sector increases financial firms’ capital buffers, so that financial firms become more willing to make credit available to firms throughout the economy. In contrast, adverse shocks to the financial sector curtail intermediaries’ capacity to lend, thereby driving up borrowing costs and driving down returns for many sectors. Financial sector shocks have direct effects for firms that borrow from financial intermediaries, as well as indirect effects for the customers of the borrowing firms. The coefficient estimates for lagged Fin returns are typically sizable in Exhibit 2, reaching as high as 0.18 (Txtls, Textiles).

Another noteworthy pattern in Exhibit 2 involves industries located in different stages of the production process. Lagged returns for commodity- and material-producing industries located in earlier stages of the production chain—such as Coal (Coal) and Oil (Petroleum and Natural Gas)—are often negatively related to returns for industries located in later stages of the production chain—such as Smoke, Books (Printing and Publishing), Txtls, Paper (Business Supplies and Shipping Containers), Whlsl (Wholesale), and Meals (Restaurants, Hotels, and Motels). Lagged Coal and Oil returns are selected by the LASSO for 16 and 13 of the individual industries, respectively, in Exhibit 2. Eleven and 13 (7 and 9) of the coefficient estimates for lagged Coal and Oil returns, respectively, are significant at the 10% (5%) level based on the conventional OLS post-LASSO *t*-statistics. The estimated coefficients for lagged Coal and Oil returns are all negative in Exhibit 2, with the exception of the autoregression coefficient for Coal. These negative relationships presumably stem from supply shocks that raise product prices and returns for sectors located in earlier production stages but squeeze profit margins and lower returns for sectors located in later production stages. The significant coefficient estimates are again sizable in magnitude.

Although additional predictive relationships in Exhibit 2 readily accord with gradual information diffusion across related industries, there are other relationships that are more challenging to explain; for example, it is not obvious what economic channel links lagged Beer (Beer and Liquor) to future Coal returns. It is well known that machine learning is an effective means for discovering new relationships in the data, and we uncover some unusual relationships in Exhibit 2. Future research is needed to better understand their economic underpinnings.

For the industries with at least one lagged return selected by the LASSO, the *R*^{2} statistics at the bottom of Exhibit 2 range from 0.78% (Chems, Chemicals) to 7.93% (Clths, Apparel). Because monthly stock returns inherently contain a sizable unpredictable component, the degree of monthly stock return predictability will necessarily be limited. To assess the economic importance of the *R*^{2} statistics, we use the convenient metric suggested by Campbell and Thompson (2008):

$$
\mathrm{CT}_i = \frac{R_i^2 / \left( 1 - R_i^2 \right)}{S_i^2} \quad (7)
$$

where $R_i^2$ is the *R*^{2} statistic for the predictive regression when *r*_{i,t} is the regressand, and $S_i$ is the unconditional Sharpe ratio for *r*_{i,t}. Equation 7 measures the proportional increase in average excess return for a mean–variance investor who allocates between the industry *i* equity portfolio and risk-free bills when the investor uses return predictability relative to when the investor ignores return predictability.

The parentheses below the *R*^{2} statistics in Exhibit 2 report the CT_{i} metric for each industry based on its monthly Sharpe ratio in Exhibit 1 and *R*^{2} statistic in Exhibit 2.^{11} According to the CT_{i} metrics at the bottom of Exhibit 2, industry return predictability based on lagged industry returns increases the average excess return for a mean-variance investor by proportional factors ranging from 0.64 (ElcEq, Electrical Equipment) to 14.25 (Autos, Automobiles and Trucks). The vast majority of the CT_{i} measures are above one, so that the average excess return for a mean-variance investor more than doubles. The substantial proportional increases in average excess return indicate that the *R*^{2} statistics represent an economically meaningful degree of return predictability for nearly all of the industries.
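The Campbell and Thompson (2008) metric is straightforward to compute; the *R*^{2} and Sharpe ratio below are made-up inputs for illustration, not values from the exhibits:

```python
import math

def campbell_thompson(r2, sharpe):
    """Proportional increase in a mean-variance investor's average excess
    return from exploiting predictability (Campbell and Thompson 2008)."""
    return (r2 / (1.0 - r2)) / sharpe ** 2

# A monthly R^2 of 2% with a monthly Sharpe ratio of 0.40 / sqrt(12)
ct = campbell_thompson(0.02, 0.40 / math.sqrt(12))   # about 1.53
```

Note that the *R*^{2} statistic and Sharpe ratio must be measured at the same (here monthly) frequency.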

Compared to OLS estimation of the full model, the LASSO sets numerous unimportant coefficients to zero, thereby reducing the noise in the data to better identify the predictive signal in lagged industry returns. As previously discussed, however, the LASSO tends to overshrink the full-model OLS coefficient estimates for the predictors selected by the LASSO. Indeed, on average, the LASSO coefficient estimates are less than half as large (in magnitude) as the corresponding full-model OLS estimates for the lagged industry returns selected by the LASSO.^{12} The degree of shrinkage induced by OLS post-LASSO estimation is more limited: On average, the OLS post-LASSO estimates are 15% smaller (in magnitude) than the corresponding full-model OLS estimates for the LASSO-selected predictors. Of course, the *R*^{2} statistics for the full models estimated via OLS are larger than the corresponding statistics at the bottom of Exhibit 2, since OLS estimation of the full model maximizes the *R*^{2} statistic by construction. However, the *R*^{2} statistics for the full models likely overstate the actual degree of return predictability, as OLS estimation fails to adequately filter the noise in the data. In summary, OLS post-LASSO estimation more accurately measures the predictive signal in lagged industry returns by avoiding both the overfitting associated with full-model OLS estimation and overshrinking of coefficient estimates engendered by the LASSO for the LASSO-selected predictors.

We check the robustness of the results along a number of dimensions. First, to examine the robustness of the post-selection inferences, we compute confidence intervals using the approach of Lee et al. (2016). Demonstrating that the LASSO selection event can be characterized as a union of polyhedra, they construct valid confidence intervals conditional on the selection event. Many of the coefficient estimates that are significant in Exhibit 2 are also significant based on Lee et al. (2016) confidence intervals.^{13}

Next, we check whether the return predictability in Exhibit 2 reflects time variation in risk premiums by augmenting Equation 1 with four lagged predictor variables similar to those used by Ferson and Harvey (1991, 1999), Ferson and Korajczyk (1995), and Avramov (2004): the S&P 500 dividend yield, three-month Treasury bill yield, difference between yields on a 10-year Treasury bond and a three-month Treasury bill (term spread), and difference between yields on BAA- and AAA-rated corporate bonds (credit spread).^{14} These variables represent popular return predictors from the literature, and they are often viewed as capturing time-varying risk premiums. When the lagged economic variables are included in Equation 1, the lagged industry returns selected by the LASSO and corresponding OLS post-LASSO coefficient estimates are typically quite similar to those reported in Exhibit 2.^{15} Popular measures of time-varying risk premiums thus do not readily account for the predictive power of lagged industry returns.

Furthermore, because the LASSO tends to select one predictor variable from among a group of correlated predictors while dropping the other predictors, we select the lagged industry returns in Equation 1 using the elastic net (ENet; Zou and Hastie 2005). The penalty term for the ENet is a convex combination of ℓ_{1} (LASSO) and ℓ_{2} (ridge) components. The ENet has a stronger tendency than the LASSO to select correlated predictor variables as a group. The ENet and LASSO typically select similar sets of predictor variables for the individual industries. Indeed, the two procedures select the same set of lagged industry returns as predictors for many individual industries.^{16} Practically speaking, the LASSO’s simpler penalty term appears sufficient for identifying the relevant lagged industry returns for predicting individual industry returns.
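The grouping behavior that motivates this check can be illustrated with two nearly collinear predictors. The data and penalty settings below are illustrative assumptions, not the article's specification:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(2)
T = 600
z = rng.standard_normal(T)
x1 = z
x2 = z + 0.01 * rng.standard_normal(T)   # nearly identical to x1
noise = rng.standard_normal((T, 3))      # irrelevant predictors
X = np.column_stack([x1, x2, noise])
y = z + 0.5 * rng.standard_normal(T)

lasso = Lasso(alpha=0.1).fit(X, y)
# l1_ratio mixes the l1 (LASSO) and l2 (ridge) penalty components,
# making the ENet penalty a convex combination of the two.
enet = ElasticNet(alpha=0.1, l1_ratio=0.3).fit(X, y)
```

Because the ridge component makes the objective strictly convex, the ENet spreads weight across the correlated pair, whereas the LASSO tends to load on just one member of the pair.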

## OUT-OF-SAMPLE RESULTS

This section reports out-of-sample measures of the economic value of industry return predictability in the context of a monthly long-short industry-rotation portfolio. We construct a long-short industry-rotation portfolio for January 1970 to December 2016 using out-of-sample forecasts of monthly industry excess returns based on OLS post-LASSO estimation of predictive regressions for each industry.

We construct the long-short industry-rotation portfolio as follows. We first use data from the beginning of the sample through December 1969 to estimate Equation 1 for each industry via OLS post-LASSO and generate a set of 30 out-of-sample industry excess return forecasts for January 1970. We sort the industries in ascending order according to the excess return forecasts and form equal-weighted quintile portfolios; we then create a zero-investment portfolio that goes long (short) the top (bottom) quintile portfolio. Next, we use data through January 1970 to compute an updated set of out-of-sample industry excess return forecasts for February 1970 based on OLS post-LASSO estimation of Equation 1, sort the industries according to the forecasts, and form equal-weighted quintile portfolios; the zero-investment portfolio again goes long (short) the top (bottom) quintile portfolio. Continuing in this fashion, we construct a monthly long-short industry-rotation portfolio based on the OLS post-LASSO industry excess return forecasts for the January 1970 to December 2016 out-of-sample period (564 months). In generating the out-of-sample excess return forecasts, observe that we only use data available at the time of forecast formation when selecting the predictor variables via the LASSO and estimating the predictive regressions.
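The monthly sorting step in this recursive procedure reduces to a simple function. The sketch below assumes 30 industries and equal weighting within quintiles, as in the article; the forecasts themselves would come from OLS post-LASSO regressions estimated on data through the prior month.

```python
import numpy as np

def long_short_return(forecasts, realized, n_quintile=None):
    """Return of a zero-investment portfolio that is long the
    equal-weighted top quintile of forecasted excess returns and
    short the bottom quintile, evaluated on realized returns."""
    forecasts = np.asarray(forecasts, dtype=float)
    realized = np.asarray(realized, dtype=float)
    k = n_quintile or len(forecasts) // 5   # 30 industries -> 6 per quintile
    order = np.argsort(forecasts)           # ascending by forecast
    short_leg = realized[order[:k]].mean()
    long_leg = realized[order[-k:]].mean()
    return long_leg - short_leg
```

Repeating this each month with an expanding estimation window, so that only information available at forecast formation is used, produces the 564-month out-of-sample return series for the industry-rotation portfolio.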

For the purpose of comparison, we construct a pair of benchmark long-short industry-rotation portfolios. The first benchmark is constructed in the same manner as the portfolio described previously, except that we use out-of-sample industry excess return forecasts based on the prevailing mean. The prevailing mean forecast corresponds to the constant expected excess return model—Equation 1 with β_{*j*} = 0 for *j* = 1, …, *N*—and assumes that individual industry excess returns do not depend on lagged industry returns. The forecast is simply the mean industry excess return based on data from the beginning of the sample through the month of forecast formation. The prevailing mean forecast is a popular benchmark in out-of-sample tests of return predictability (e.g., Goyal and Welch 2008). We use this benchmark to assess portfolio performance under the assumption of no industry excess return predictability (apart from the mean excess return).
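The prevailing mean forecast is just an expanding average of past excess returns. A minimal sketch (the input series is simulated, and the function is a hypothetical helper, not the authors' code):

```python
import numpy as np

def prevailing_mean_forecasts(excess_returns):
    """Forecast for month t+1 = mean of excess returns through month t."""
    r = np.asarray(excess_returns, dtype=float)
    cummean = np.cumsum(r) / np.arange(1, len(r) + 1)
    return cummean[:-1]   # forecast for months 2..T uses data through t
```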

The second benchmark portfolio is again constructed in the same manner, but it relies on out-of-sample industry excess return forecasts based on OLS estimation of Equation 1. Comparing the performance of the benchmark portfolio based on the OLS forecasts to that of the portfolio based on the OLS post-LASSO forecasts provides a sense of the ability of machine learning to increase the economic value of out-of-sample industry return forecasts.

Exhibit 4 reports performance measures for the industry-rotation portfolio based on the OLS post-LASSO forecasts and the two benchmark portfolios. In addition to the annualized mean, volatility, and Sharpe ratio, the exhibit reports the annualized downside risk and Sortino ratio, maximum drawdown, and annualized Goetzmann et al. (2007) manipulation-proof performance measure (MPPM) for a relative risk aversion coefficient of four. The MPPM measures the continuously compounded certainty equivalent return in excess of the risk-free return for a power-utility investor.
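For reference, the Goetzmann et al. (2007) MPPM for monthly data can be computed as below; γ = 4 follows the article, while the function name and simulated inputs in the test are assumptions.

```python
import numpy as np

def mppm(portfolio_returns, risk_free_returns, gamma=4, periods_per_year=12):
    """Annualized Goetzmann et al. (2007) manipulation-proof performance
    measure: the certainty-equivalent continuously compounded return in
    excess of the risk-free return for a power-utility investor with
    relative risk aversion gamma."""
    r = np.asarray(portfolio_returns, dtype=float)
    rf = np.asarray(risk_free_returns, dtype=float)
    growth = ((1 + r) / (1 + rf)) ** (1 - gamma)
    return periods_per_year / (1 - gamma) * np.log(growth.mean())
```

A portfolio that merely matches the risk-free rate every month has an MPPM of zero, and one that beats it every month has a positive MPPM, which is the sense in which the measure is an excess certainty-equivalent return.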

The benchmark portfolio based on the prevailing mean forecasts generates an annualized average return of −2.22%, so that simply going long (short) industries with the highest (lowest) historical returns is not a profitable strategy over the out-of-sample period. Because of the negative average return, the Sharpe and Sortino ratios are both negative. The prevailing mean benchmark portfolio also has a sizable annualized downside risk (8.34%), large maximum drawdown (73.97%), and negative annualized MPPM (−4.73%).

In contrast to the prevailing mean benchmark, the benchmark portfolio based on the OLS forecasts produces a positive annualized average return of 5.52%. Although it has somewhat higher volatility than the prevailing mean benchmark, the OLS benchmark evinces a smaller downside risk (7.17%), much lower maximum drawdown (29.39%), positive Sharpe and Sortino ratios (0.47 and 0.77, respectively), and positive MPPM (2.84%). Overall, incorporating the information in lagged industry returns, as captured by the OLS forecasts, improves portfolio performance relative to ignoring such information.

Although the OLS benchmark portfolio generally outperforms the prevailing mean benchmark in Exhibit 4, OLS estimation of the predictive regressions is vulnerable to overfitting, which potentially detracts from the usefulness of the OLS forecasts as inputs for constructing the industry-rotation portfolio. The performance measures for the industry-rotation portfolio based on the OLS post-LASSO forecasts indicate that our machine learning approach increases the economic value of industry return forecasts, as the OLS post-LASSO forecasts improve portfolio performance vis-à-vis the OLS forecasts across all measures in Exhibit 4. The annualized average return and volatility of 7.33% and 11.29%, respectively, for the OLS post-LASSO portfolio translate into a substantive Sharpe ratio of 0.65. The downside risk and maximum drawdown (6.31% and 25.65%, respectively) are smaller for the OLS post-LASSO portfolio vis-à-vis the OLS portfolio, whereas the Sortino ratio (1.16) is considerably larger. Furthermore, the annualized MPPM increases by 200 bps (to 4.84%) when we allocate across industries using the OLS post-LASSO forecasts in lieu of the OLS forecasts.
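The downside-oriented measures in Exhibit 4 follow standard definitions; one common convention is sketched below. The zero target return and the √12 annualization are assumptions about the authors' exact choices.

```python
import numpy as np

def downside_risk(monthly_returns, periods_per_year=12):
    """Annualized root-mean-square of returns below a zero target."""
    r = np.asarray(monthly_returns, dtype=float)
    return np.sqrt(np.mean(np.minimum(r, 0.0) ** 2) * periods_per_year)

def sortino_ratio(monthly_returns, periods_per_year=12):
    """Annualized mean return divided by annualized downside risk."""
    r = np.asarray(monthly_returns, dtype=float)
    return r.mean() * periods_per_year / downside_risk(r, periods_per_year)

def max_drawdown(monthly_returns):
    """Largest peak-to-trough decline of the cumulative return index."""
    wealth = np.cumprod(1.0 + np.asarray(monthly_returns, dtype=float))
    peaks = np.maximum.accumulate(wealth)
    return (1.0 - wealth / peaks).max()
```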

Exhibit 5 depicts log cumulative returns for the long-short industry-rotation portfolios. In line with its negative average return in Exhibit 4, the prevailing mean benchmark portfolio (dashed line) experiences prolonged periods of poor performance in Exhibit 5. The OLS benchmark (dotted line) produces gains on a relatively consistent basis, and it typically performs well during business-cycle recessions as dated by the National Bureau of Economic Research (NBER). The industry-rotation portfolio based on the OLS post-LASSO forecasts (solid line) also delivers sizable gains on a consistent basis and appears to perform even better than the OLS benchmark during recessions, especially the recent Great Recession.

Exhibit 6 examines portfolio performance over the business cycle in greater detail. The exhibit reports annualized average portfolio returns during good and bad macroeconomic states, measured as months when the economy is in an NBER-dated expansion or recession, respectively. As a robustness check, we also measure the state of the economy using the Chicago Fed national activity index (CFNAI), in which months in the bottom quintile (top 80%) of CFNAI observations are deemed a bad (good) state.^{17}
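The CFNAI-based state classification is a simple quantile rule over the sample of CFNAI observations. A sketch (the index values in the test are simulated):

```python
import numpy as np

def classify_states(cfnai, quantile=0.20):
    """Flag months in the bottom quintile of CFNAI observations as the
    bad macroeconomic state; all other months are the good state."""
    cfnai = np.asarray(cfnai, dtype=float)
    cutoff = np.quantile(cfnai, quantile)
    return cfnai <= cutoff   # True = bad state
```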

The average returns for the benchmark portfolio based on the prevailing mean forecasts are negative during both expansions and recessions (although neither is significant), and the difference in average returns during recessions vis-à-vis expansions is insignificant. Confirming the visual impression in Exhibit 5, the benchmark portfolio based on the OLS forecasts generates annualized average returns of 4.20% and 12.54% during expansions and recessions, respectively, both of which are significant. The difference in average returns is a sizable 8.34%, which is significant at the 10% level. The portfolio based on the OLS post-LASSO forecasts produces a significant annualized average return of 4.62% during expansions and a striking (and significant) average return of 21.75% during recessions. The difference (17.13%) is significant at the 1% level. When we measure the macroeconomic state using the CFNAI, the OLS post-LASSO portfolio provides significant average returns of 5.07% and 16.14% during good and bad states, respectively, and the difference of 11.08% is again significant at the 1% level. In contrast, the average returns across good and bad states (5.56% and 5.34%, respectively) become much closer for the OLS benchmark portfolio, and the difference (−0.22%) is no longer significant.^{18}

The performance of the industry-rotation portfolio based on the OLS post-LASSO forecasts during cyclical downturns is difficult to explain with a rational macroeconomic risk-based story. By paying out handsomely in bad states when marginal utility is likely to be high, the long-short portfolio provides a hedge against bad times for the macroeconomy. From a rational asset pricing perspective, such a portfolio should have a high price (i.e., low expected return); however, the portfolio generates a sizable annualized average return (7.33%) in Exhibit 4.

Next, we test whether exposures to equity risk factors in leading multifactor models from the literature can account for the behavior of the long-short industry-rotation portfolios. Specifically, we estimate a portfolio’s alpha in the context of the popular Carhart (1997) four-factor and Hou, Xue, and Zhang (2015) *q*-factor models. The former includes the three Fama and French (1993) factors—market, size, and value—and a momentum factor (Jegadeesh and Titman 1993). Motivated by the *q* theory of investment, the Hou, Xue, and Zhang (2015) model contains market, size, investment, and profitability factors.^{19}
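Estimating a portfolio's alpha amounts to a time-series regression of its monthly returns on the factor returns, with the intercept annualized. A minimal sketch (the factor data in the test are simulated, and the ×12 annualization assumes the monthly frequency used in the article):

```python
import numpy as np

def annualized_alpha(portfolio_returns, factor_returns, periods_per_year=12):
    """Intercept of an OLS time-series regression of monthly portfolio
    returns on factor returns, annualized. Rows of factor_returns are
    months; columns are factors (e.g., the four Carhart factors)."""
    F = np.atleast_2d(np.asarray(factor_returns, dtype=float))
    y = np.asarray(portfolio_returns, dtype=float)
    X = np.column_stack([np.ones(len(y)), F])   # prepend intercept
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[0] * periods_per_year

# With an exactly linear model (monthly alpha 0.5%, single-factor beta
# 0.2), the regression recovers the annualized alpha of 6%.
```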

The benchmark portfolio based on the prevailing mean forecasts evinces significant negative (positive) exposures to the size and value (momentum) factors in the Carhart model. The annualized alpha is −1.84%, which is insignificant at conventional levels. The portfolio’s annualized alpha is also negative (−2.91%, significant at the 10% level) in the *q*-factor model, and it exhibits significant negative (positive) exposure to the investment (profitability) factor. The factors collectively explain 27.80% and 18.19% of the variation in portfolio return in the Carhart and *q*-factor models, respectively.

Turning to the benchmark portfolio based on the OLS forecasts, the portfolio displays significant negative exposures to the size and value factors (the latter at the 10% level) in the Carhart model. The portfolio also generates an economically sizable annualized alpha of 6.64% (significant at the 1% level). For the *q*-factor model, the portfolio again exhibits significant negative exposure to the size factor and produces a sizable annualized alpha (5.57%, significant at the 1% level). The *R*^{2} statistics are below 2.50% for both multifactor models. Comparing the multifactor model estimation results for the prevailing mean and OLS benchmark portfolios in Exhibit 7, as in Exhibit 4, we see that incorporating the information in lagged industry returns improves portfolio performance.

For the Carhart model, the industry-rotation portfolio based on the OLS post-LASSO forecasts delivers an annualized alpha of 8.78% (significant at the 1% level); the portfolio thus generates a substantial risk-adjusted average return. The risk-adjusted average return is nearly 150 bps higher than its unadjusted counterpart in Exhibit 4, due to the portfolio’s significant negative exposures to the market and value factors (−0.13 and −0.12, respectively) and insignificant exposures to the size and momentum factors in Exhibit 7.^{20} The portfolio’s annualized alpha is again above 8% (8.04%, significant at the 1% level) in the *q*-factor model, with significant negative exposure to the market factor and insignificant exposures to the remaining factors.

The annualized alphas for the OLS post-LASSO portfolio are over 200 bps higher than those for the OLS benchmark in Exhibit 7, providing further evidence that the OLS post-LASSO forecasts add value by reducing the noise in the OLS forecasts. The *R*^{2} statistics for the Carhart and *q*-factor models are relatively small (4.06% and 3.28%, respectively), indicating that the equity risk factors have limited explanatory power for the OLS post-LASSO portfolio return.^{21} Results from Frazzini, Israel, and Moskowitz (2015) suggest that, at least for a large institutional investor, transaction costs will have relatively little impact on the substantial average return and alpha generated by the industry-rotation portfolio based on the OLS post-LASSO forecasts.

## CONCLUSION

We use machine learning tools to analyze industry return predictability based on the information in lagged industry returns from across the economy. We begin by specifying a general predictive regression model for an individual industry’s return that includes lagged returns for all 30 of the industries that we consider as regressors. Because conventional OLS estimation of predictive regressions with such a plethora of predictor variables runs the risk of overfitting, we use the LASSO from machine learning to fit sparse models. To circumvent the downward biases (in magnitude) in the LASSO coefficient estimates themselves, we re-estimate the coefficients for the LASSO-selected predictor variables via OLS (OLS post-LASSO estimation). Controlling for post-selection inference and multiple testing, in-sample results provide extensive evidence of industry return predictability, pointing to the existence of industry-related information frictions in the equity market.

We also compute out-of-sample forecasts of industry returns based on OLS post-LASSO estimation of the predictive regressions to construct a zero-investment industry-rotation portfolio that goes long (short) industries with the highest (lowest) forecasted returns. The long-short industry-rotation portfolio earns a significant average return, performs well during cyclical downturns, and delivers a substantial annualized alpha of over 8% in the context of leading multifactor models. The information in lagged industry returns thus appears quite valuable for generating risk-adjusted average returns.

Our article shows that the historical information in industry returns contains statistically and economically significant predictive power, thereby providing machine learning-based evidence against the weak form of market efficiency. With additional information, such as industry characteristics, return predictability may be even stronger. Along this line, the encompassing LASSO approach developed by Han et al. (2019), which currently provides the most accurate out-of-sample forecasts of cross-sectional expected stock returns according to the Lewellen (2015) test, holds promise for further improving predictive performance. In summary, we expect machine learning techniques to continue to enhance our understanding of industry return predictability in future research.

## ACKNOWLEDGMENTS

The authors are grateful to conference and seminar participants at the 2015 Australasian Finance and Banking Conference, 2015 International Symposium on Forecasting, 2015 Midwest Econometrics Group Meeting, 2016 Midwest Finance Association Meeting, 2016 SMU Summer Camp, 2017 Conference on Financial Predictability and Big Data, 2018 Alliance Bernstein Boston Quantitative Finance Conference, Saint Louis University, West Virginia University, and especially Robert Connolly (MFA discussant), Rossen Valkanov (SMU Summer Camp discussant), and Dacheng Xiu for insightful comments that substantively improved the article. The authors are also grateful to Frank Fabozzi (the editor) for very helpful comments.

## ADDITIONAL READING

**A Backtesting Protocol in the Era of Machine Learning**

Rob Arnott, Campbell R. Harvey, and Harry Markowitz

*The Journal of Financial Data Science*

**https://jfds.pm-research.com/content/1/1/64**

**ABSTRACT:** *Machine learning offers a set of powerful tools that holds considerable promise for investment management. As with most quantitative applications in finance, the danger of misapplying these techniques can lead to disappointment. One crucial limitation involves data availability. Many of machine learning’s early successes originated in the physical and biological sciences, in which truly vast amounts of data are available. Machine learning applications often require far more data than are available in finance, which is of particular concern in longer-horizon investing. Hence, choosing the right applications before applying the tools is important. In addition, capital markets reflect the actions of people, who may be influenced by the actions of others and by the findings of past research. In many ways, the challenges that affect machine learning are merely a continuation of the long-standing issues researchers have always faced in quantitative finance. Although investors need to be cautious—indeed, more cautious than in past applications of quantitative methods—these new tools offer many potential applications in finance. In this article, the authors develop a research protocol that pertains both to the application of machine learning techniques and to quantitative finance in general.*

## ENDNOTES

^{1} See Rapach and Zhou (2013) for a survey.

^{2} Moskowitz and Grinblatt (1999) analyze industry momentum, and Cohen and Frazzini (2008) and Menzly and Ozbas (2010) investigate customer-supplier links. In contrast, our study focuses on whether lagged industry returns have predictive power for future industry returns and on how returns across all industries are related to one another.

^{3} Harvey, Liu, and Zhu (2016); Green, Hand, and Zhang (2017); Linnainmaa and Roberts (2018); and Hou, Xue, and Zhang (forthcoming) highlight the relevance of multiple testing for identifying significant firm characteristics in the cross section of stock returns. Arnott, Harvey, and Markowitz (2019) stress the importance of accounting for multiple testing when using machine learning techniques.

^{4} See Dezeure et al. (2015) and Taylor and Tibshirani (2015) for recent surveys of the post-selection inference literature.

^{5} Targeting the submodel is in the spirit of George Box's famous dictum, "All models are wrong, but some are useful" (Box 1979, p. 202).

^{6} Benjamini and Yekutieli (2001) modify the Benjamini and Hochberg (1995) procedure so that it controls the FDR under arbitrarily dependent *p*-values. However, the modified procedure is typically very conservative (at times more so than the Bonferroni procedure), so that it has limited ability to identify false null hypotheses.

^{7} The adaptive procedures can be shown to control the FDR when *p*_{(1)}, …, *p*_{(*m*)} are independent. Theoretical results concerning FDR control for adaptive procedures under dependent *p*-values are more limited (e.g., Farcomeni 2007; Blanchard and Roquain 2009), and simulations are typically used to analyze FDR control and power.

^{8} Available at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

^{9} All computations are performed using R (R Core Team 2017). We use the glmnet package (Friedman et al. 2017) to compute the LASSO solution. Following convention, the predictor variables are standardized before solving Equation 2. The estimated coefficients in Exhibit 2 correspond to the scales of the original predictor variables.

^{10} As shown in Exhibit S1 of the online supplement, inferences are similar for *t*-statistics based on White (1980) heteroskedasticity-robust standard errors.

^{11} We divide the annualized Sharpe ratio in Exhibit 1 by √12 to arrive at the monthly Sharpe ratio.

^{12} The LASSO and full-model OLS coefficient estimates are reported in Exhibits S2 and S3, respectively, of the online supplement.

^{13} The complete results are reported in Exhibit S4 of the online supplement. We compute Lee et al. (2016) confidence intervals using the selectiveInference package (Tibshirani et al. 2017).

^{14} Data for these variables are from Global Financial Data at https://www.globalfinancialdata.com.

^{15} Complete results for the augmented predictive regressions are reported in Exhibit S5 of the online supplement.

^{16} The complete ENet results are reported in Exhibit S6 of the online supplement.

^{17} Monthly indicator variables for NBER recessions and CFNAI observations are from Federal Reserve Economic Data at https://fred.stlouisfed.org.

^{18} The inferences in Exhibit 6 are similar for *t*-statistics based on White (1980) heteroskedasticity-robust standard errors; see Exhibit S7 of the online supplement.

^{19} Factor data for the Carhart model are from Kenneth French's Data Library. We thank Kewei Hou for providing us with factor data for the *q*-factor model.

^{20} We also constructed a cross-sectional industry-momentum portfolio along the lines of Moskowitz and Grinblatt (1999). Specifically, each month we sort the 30 industries according to their cumulative excess returns over the previous 12 months and go long (short) the top (bottom) quintile of sorted industries. Consistent with Moskowitz and Grinblatt (1999), the cross-sectional industry-momentum portfolio evinces large exposure (0.87, significant at the 1% level) to the momentum factor in the Carhart model and generates insignificant alpha.

^{21} As shown in Exhibit S8 of the online supplement, the inferences in Exhibit 7 are similar for *t*-statistics based on White (1980) heteroskedasticity-robust standard errors. The portfolio based on the OLS post-LASSO forecasts also generates an annualized alpha of 8.12% (significant at the 1% level) in the Fama and French (2015) five-factor model and an annualized alpha of 8.17% (significant at the 1% level) in a six-factor model composed of the five Fama and French (2015) factors and the momentum factor from the Carhart model.

- © 2019 Pageant Media Ltd