TY  - JOUR
T1  - PCA for Implied Volatility Surfaces
JF  - The Journal of Financial Data Science
SP  - 85
LP  - 109
DO  - 10.3905/jfds.2020.1.032
VL  - 2
IS  - 2
AU  - Marco Avellaneda
AU  - Brian Healy
AU  - Andrew Papanicolaou
AU  - George Papanicolaou
Y1  - 2020/04/30
UR  - https://pm-research.com/content/2/2/85.abstract
N2  - Principal component analysis (PCA) is a useful tool when trying to construct factor models from historical asset returns. For the implied volatilities of US equities, there is a PCA-based model with a principal eigenportfolio whose return time series lies close to that of an overarching market factor. The authors show that this market factor is the index resulting from the daily compounding of a weighted average of implied-volatility returns, with weights based on the options’ open interest and Vega. The authors also analyze the singular vectors derived from the tensor structure of the implied volatilities of S&P 500 constituents and find evidence indicating that some type of open interest- and Vega-weighted index should be one of at least two significant factors in this market. TOPICS: Statistical methods, simulations, big data/machine learning. Key Findings: • Principal component analysis of a comprehensive dataset of implied volatility surfaces from options on US equities shows that their collective behavior is captured by just nine factors, whereas the effective spatial dimension of the residuals is closer to 500 than to the nominal dimension of 28,000, revealing the large redundancy in the data. • Portfolios of implied volatility surface returns, weighted suitably by open interest and Vega, track the principal eigenportfolio associated with a market portfolio of options, in analogy to equity portfolios. • Retention of the tensor structure in the eigenportfolio analysis improves the tracking between the open interest–Vega weighted (tensor) implied volatility surface returns portfolio and the (tensor) eigenportfolio, indicating that data structure matters.
ER  - 