%0 Journal Article
%A Jim Kyung-Soo Liew
%A Tamas Budavari
%A Zixiao Kang
%A Fengxu Li
%A Xuzhi Wang
%A Shihao Ma
%A Brandon Fremin
%T Pairs Trading Strategy with Geolocation Data—The Battle between Under Armour and Nike
%D 2019
%R 10.3905/jfds.2019.1.024
%J The Journal of Financial Data Science
%P jfds.2019.1.024
%X In this article, the authors examine fundamental linkages between geolocation (human movement) data and financial market equity price behavior. The geolocation positions were recorded intraday and cover the period from January 1, 2018 to July 17, 2018. The authors concentrate their study on a popular hedge fund trading strategy known as pairs trading. First, the authors collect data regarding Under Armour (UA) and Nike’s stock market price and volume. Second, they investigate the volume of people visiting each physical store of UA and Nike, as proxied from anonymous cell phone geolocation traffic. Third, they monitor the relative activities for tweets related to UA and Nike. Fourth, after combining all the data, the authors glean the following fascinating results: (1) Geolocation information is proven to be an important factor in a pairs trading strategy between UA and Nike, according to the results of feature selection and prediction accuracy for price ratio change; (2) both ensemble methods of machine learning and rolling analysis could significantly raise the prediction accuracy; and (3) a pairs trading strategy incorporating geolocation information can have a cumulative return of 13.72% from January 2018 to June 2018, with an annualized Sharpe ratio of 3.88.
TOPICS: Statistical methods, simulations, big data/machine learning
Key Findings
• Geolocation data can encapsulate the relative influx and outflux of consumers from department stores.
This is an extremely useful feature in training machine learning models that can be used in lucrative pairs-trading strategies.
• In the era of big data, it is tempting to throw every feature into a model and assume that the model will always figure out which features are most important. However, the model will usually perform better when trained on a subset of distinct features.
• Ensemble methods are an effective means of eliminating bias in machine learning models because they use the output from a series of independently trained submodels to generate one balanced result.
%U https://jfds.pm-research.com/content/iijjfds/early/2019/12/20/jfds.2019.1.024.full.pdf