## Abstract

Generative adversarial networks (GANs) have been shown to be able to generate samples of complex financial time series, particularly by employing the concept of path signatures, a universal description of the geometric properties of a data stream whose expected value uniquely characterizes the time series. Specifically, the SigCWGAN model (Ni et al. 2020) can generate time series of arbitrary length; however, the parameters of the neural network employed grow exponentially with the dimension of the underlying time series, which makes the model intractable when seeking to generate large financial market scenarios. To overcome this problem of dimensionality, the authors propose an iterative generation procedure relying on the concept of hierarchies in financial markets. The authors construct an ensemble of GANs that they call the Hierarchical-SigCWGAN, which is based on hierarchical clustering that approximates signatures in the spirit of the original model. The Hierarchical-SigCWGAN can scale to higher dimensions and generate large-dimensional scenarios in which the joint behavior of all the assets in the market is replicated. The model is validated by comparing its performance on a series of similarity metrics with respect to the original SigCWGAN on a dataset in which it is still tractable and by showing its scalability on a larger dataset.

**Key Findings**

▪ The authors propose a framework to simulate high-dimensional financial time series based on generative adversarial networks (GANs).

▪ The resulting implementation solves the exponential scaling problem of the signature-based SigCWGAN by approximating the market's correlation structure with a hierarchical clustering.

▪ Key metrics show the generated data to have more variation than benchmark approaches.

Modeling financial time series requires large volumes of data, and this demand has become even stronger with the advent of data-driven algorithms. From a modeling perspective, historical data are nothing more than a single realization of the unknown data-generation process (DGP) that drives financial markets, and limiting our understanding of financial systems to a single realization clashes with their not only stochastic but also nonstationary and irregular nature (López de Prado 2018, 2019, 2020). It is, thus, essential to develop data-generation procedures that can capture the complexities of entire financial markets. Potential applications of synthetic financial data include hedging (Wiese et al. 2019; Kim 2021), model calibration (Cuchiero, Khosrawi, and Teichmann 2020), and portfolio construction (Jaeger et al. 2021; Schwendner et al. 2021). López de Prado (2020) offered an overview of synthetic data generation structured into resampling and Monte Carlo methods. Although resampling methods such as bootstrapping draw from existing data with replacement, Monte Carlo methods can generate new innovations beyond the empirically observed.

Financial return time series form a class of data that presents particular challenges. They are nonstationary and exhibit a series of statistical properties usually referred to as stylized facts (such as volatility clustering). They tend to have fat tails (caused by market shock events) and, ultimately, their DGP is unknown. Because financial markets from time to time surprise market participants with previously unseen phenomena, resampling alone does not seem to be sufficient. Within the group of Monte Carlo methods, López de Prado (2020) discussed parametric and nonparametric methods. Parametric Monte Carlo methods have been used extensively in finance, especially for volatility modeling and for pricing derivatives. The advantage of these models is that they offer parametric sensitivities even for derivatives with complex, path-dependent payoffs. The disadvantage is the necessary decision for a specific parametric model and the associated model risk (Schoutens, Simons, and Tistaert 2004).

In recent years, nonparametric methods have gained increasing attention because of their promise to generate samples without a priori assumptions on the underlying DGP. Therefore, these models are well suited for scenario-based stress testing and challenging results gained from parametric approaches. Prominent examples are using restricted Boltzmann machines (RBMs) (Kondratyev and Schwarz 2019), self-organizing maps (SOMs) (Kohonen 2022; Huber 2019), variational autoencoders (VAEs) (Bühler et al. 2020; Jhirad 2021; Bergeron et al. 2022), and transformers (Dong and Scoullos 2022; Jhirad and Dong 2022). Papenbrock et al. (2021) introduced the matrix evolutions concept based on an evolutionary algorithm to simulate correlation matrices. The concept is set up as a multiobjective optimization problem with constraints and, therefore, is very flexible.

GANs are a novel neural network methodology for generating synthetic data. Through adversarial training of its two networks, the generator and the discriminator, via stochastic gradient descent, a GAN can learn a mapping between manifolds of noise seed inputs and target manifolds in which training data points are located (Goodfellow et al. 2014; Goodfellow 2016). Training GANs is a complex process that currently is an active area of research (Arjovsky and Bottou 2017; Roth et al. 2017). Originally, GANs applied to complex datasets exhibited training instabilities, experienced mode collapse, and/or lacked the capacity to generate the target dataset properly among other training issues.

Current research efforts focus on improving GAN training issues through various techniques. They include analyzing training dynamics directly (Li, Swersky, and Zemel 2015; Salimans et al. 2016; Kodali et al. 2017; Mescheder, Geiger, and Nowozin 2018; Nie and Patel 2018; Jolicoeur-Martineau 2018, 2019), focusing on the topological properties of the metrics used during training (Mroueh et al. 2017; Khayatkhoei, Elgammal, and Singh 2018), and proposing novel architectures for GAN neural networks (Radford, Metz, and Chintala 2015; Chen et al. 2016; Karras et al. 2017; Brock, Donahue, and Simonyan 2018; Karras, Laine, and Aila 2018; Karras et al. 2019), among others. GANs primarily have been applied to image datasets because qualitatively evaluating a GAN’s performance by visual inspection is relatively straightforward. Therefore, most GAN evaluation metrics are specific to image datasets. As a result, there is a comparatively smaller body of literature applying GANs to other data domains.

## GANS FOR FINANCIAL TIME-SERIES GENERATION

Several approaches to generating time series with GANs, such as adapting image-generation GANs (De Meer Pardo 2019; De Meer Pardo and López 2019), using recurrent networks (Esteban, Hyland, and Rätsch 2017), using temporal convolutional networks (Wiese et al. 2019; Yoon, Jarrett, and van der Schaar 2019), and approximating expected time-series signatures (Ni et al. 2020), have been explored.

A discrete time series is a sequence (*x*_{t})_{t}, *x*_{t} ∈ ℝ^{d}, governed by an unknown distribution ℙ_{data}. The aim of generative modeling is to find a mapping from a noise distribution into a distribution ℙ_{model} in the space of sequences of ℝ^{d} such that ℙ_{model} approximates ℙ_{data} according to some metric. Generally, with other datasets, only static characteristics of the feature distribution must be accounted for, but time series add temporal dynamics to the generation problem.

The SigCWGAN model presented in Ni et al. (2020) is trained to perform a conditional generation of future values when presented with a window of past values *x*_{t−p+1:t} so that the (truncated) signature of the window of generated future values approximates that of the real future data window. Signatures are a universal feature-extraction technique originating from the theory of rough paths (Chevyrev and Kormilitzin 2016; Chevyrev and Oberhauser 2018) and, under certain assumptions fulfilled by discrete time series, uniquely determine paths up to time parametrization (Hambly and Lyons 2005). Approximating the signature is, therefore, equivalent to approximating the time series itself. The signature acts as an equivalent to the Taylor series expansion of a continuous function in the path space. An even stronger result states that the expected signature characterizes the law of a (time-augmented) path-valued random variable (Chevyrev and Lyons 2013); hence, in our setting, approximating an expected signature is enough. This can be accomplished via a Wasserstein-1 loss, which is well known for its desirable properties for GAN training (Arjovsky and Bottou 2017; Arjovsky, Chintala, and Bottou 2017; Gulrajani et al. 2017). The SigCWGAN is a flexible model for generating time series of arbitrary length while outperforming a series of benchmarks in a set of metrics. However, one shortcoming of employing signatures is their exponential scaling with regard to the dimensions of the underlying time series (Bonnier et al. 2019; Liao et al. 2019; Kidger and Lyons 2020). This makes the model intractable in practice when seeking to generate large-dimensional financial markets because the number of units the calculation of the signature adds to the GAN generator makes it too large to be back-propagated through.
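To make this scaling concrete, the number of terms in a degree-*m* truncated signature of a *d*-dimensional path is Σ_{k=0}^{m} *d*^{k}, which grows as *d*^{m}. The following short sketch (purely illustrative, not part of the original implementation) tabulates this count for the dimensions that appear later in the article:

```python
def sig_terms(d: int, m: int) -> int:
    """Number of terms in the degree-m truncated signature of a d-dimensional
    path, including the constant zeroth term: sum_{k=0}^{m} d^k."""
    return sum(d ** k for k in range(m + 1))

# signature size explodes with the path dimension d (degree m = 4 here)
for d in (2, 10, 35):
    print(d, sig_terms(d, 4))   # 2 -> 31, 10 -> 11111, 35 -> 1544761
```

The jump from *d* = 10 to *d* = 35 illustrates why the generator becomes too large to back-propagate through in the high-dimensional setting.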

## OUR CONTRIBUTIONS

We tackle this scaling problem and adapt the signature-based generation philosophy to a large-dimensional context. We develop an iterative generation procedure based on hierarchical structures in financial markets (Mantegna 1999; López de Prado 2016) to employ an ensemble of GANs. We call this type of GAN the Hierarchical-SigCWGAN. It can generate large-dimensional scenarios in which the joint behavior of all the assets of the market is approximated.

## OUTLINE

In the following section, we give the mathematical definition of signatures and outline the theoretical results presented in Ni et al. (2020) that allow for their use in the generation process. Later, we outline our proposed generation procedure and the setup of each of the GANs we employ. The subsequent section contains experiments on real data, comparing our model to the original SigCWGAN in a dataset in which the latter is still tractable and showcasing the Hierarchical-SigCWGAN’s ability to scale to higher dimensions. Finally, we conclude and present an outlook for future research.

## METHODOLOGY

In this section, we collect the concepts that lead to our generative models: the signature transform of a path and the expected signature of a stochastic process. For simplicity, we restrict the discussion to the space of continuous paths mapping from a compact set *J* to ℝ^{d} with finite 1-variation and starting from the origin. We define all integrals in the Riemann–Stieltjes sense. The signature of a path can be defined as follows (see Appendix A for the precise definitions of 1-variation and iterated integrals).

### Definition: Signature of a Path

Let *J* := [*a*, *b*] be a compact interval and let *X* : *J* → ℝ^{d} be a path of finite 1-variation. The signature *S*(*X*) of *X* over the time interval *J* is defined as the ordered collection of all the iterated integrals of *X*, that is,

*S*(*X*)_{a,b} := (1, *S*(*X*)^{1}_{a,b}, …, *S*(*X*)^{d}_{a,b}, *S*(*X*)^{1,1}_{a,b}, *S*(*X*)^{1,2}_{a,b}, …),

where the *zeroth* term by convention is equal to 1, the superscripts run along the set of all multi-indexes

{(*i*_{1}, …, *i*_{k}) | *k* ≥ 1, *i*_{1}, …, *i*_{k} ∈ {1, …, *d*}},

and the terms of the signature are ordered according to the lexicographic ordering induced by the usual ordering of ℕ. The signature can alternatively be defined as

*S*(*X*)_{a,b} := (1, *S*^{1}, *S*^{2}, …),

where, for each integer *k* ≥ 1, *S*^{k} is the collection of all the *k*-fold iterated integrals of *X*. The *S*^{k} are referred to as the levels of *S*(*X*)_{a,b}.
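As an illustrative check of this definition (our own sketch, not the authors' code), the routine below approximates iterated integrals of a piecewise-linear path with left-point Riemann–Stieltjes sums; for a straight-line path with increment Δ, the *k*-fold iterated integral along (*i*_{1}, …, *i*_{k}) has the closed form Δ^{i_1}⋯Δ^{i_k}/*k*!:

```python
import numpy as np

def iterated_integral(path, index):
    """k-fold iterated integral S(X)^{i_1,...,i_k}_{a,b} of a piecewise-linear
    path, approximated with left-point Riemann-Stieltjes sums.
    path: (n, d) array of sampled points; index: tuple of coordinate indexes."""
    n = path.shape[0]
    inner = np.ones(n)                      # level-0 integrand is identically 1
    for i in index:
        dx = np.diff(path[:, i])
        inner = np.concatenate([[0.0], np.cumsum(inner[:-1] * dx)])
    return float(inner[-1])

# straight-line path from (0, 0) to (1, 2); closed form: prod(increments) / k!
t = np.linspace(0.0, 1.0, 2001)
path = np.outer(t, np.array([1.0, 2.0]))
print(iterated_integral(path, (0,)))        # ≈ 1.0
print(iterated_integral(path, (1, 1)))      # ≈ 2*2/2! = 2.0
print(iterated_integral(path, (0, 1, 1)))   # ≈ 1*2*2/3! ≈ 0.667
```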

We will treat financial time series as stochastic processes; hence, we are interested in their expected signatures.

### Definition: Expected Signature of a Stochastic Process

Let *X* denote a stochastic process defined on a probability space (Ω, ℱ, *P*). Suppose that for ω ∈ Ω the signature of *X*(ω), denoted by *S*(*X*(ω)), is well defined *a.s.* and its expectation 𝔼[*S*(*X*(ω))] is finite under the measure *P*. We call 𝔼[*S*(*X*(ω))] the expected signature of *X*.

The following result justifies the approximation of the expected signature as a GAN training objective. It applies to random variables taking values in the space of time-augmented paths. We embed discrete time series into this space via their time-joined transformation; see Definition 4.3 in Levin, Lyons, and Ni (2013) and Appendix A for details.

### Theorem: Expected Signature of a Stochastic Process

Let *X* and *Y* be two such random variables. If 𝔼[*S*(*X*)] = 𝔼[*S*(*Y*)] and 𝔼[*S*(*X*)] has infinite radius of convergence, then *X* and *Y* are equal in distribution. For the proof, see Chevyrev and Lyons (2013, Proposition 6.1).

Finally, because we want to approximate future expected signatures given past ones, the following result allows us to approximate this nonlinear supervised learning problem via a linear regression on the past signatures.

### Theorem: Signature Approximation

Consider a compact set *K* of paths. Let *f* : *K* → ℝ be a continuous function. Then, for any ε > 0, there exist *M* > 0 and a linear functional *L* acting on the truncated signature of degree *M* such that

sup_{X ∈ K} |*f*(*X*) − *L*(*S*_{M}(*X*))| < ε.

For the proof, see Levin et al. (2013, Theorem 3.1).
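The following minimal sketch illustrates how the theorem is used in practice: we fit an ordinary least-squares regression of the next value of a toy AR(1) series on the level-1 and level-2 signature terms of the time-augmented past window. All names and parameters are our own illustrative choices; a basepoint feature is added because the signature only sees increments:

```python
import numpy as np

def sig_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path, computed
    exactly segment by segment via Chen's identity."""
    d = path.shape[1]
    s1, s2 = np.zeros(d), np.zeros((d, d))
    for a, b in zip(path[:-1], path[1:]):
        dx = b - a
        s2 += np.outer(s1, dx) + np.outer(dx, dx) / 2.0   # Chen update, level 2
        s1 += dx                                          # level 1
    return np.concatenate([s1, s2.ravel()])

# toy AR(1) series; the next value is a continuous functional of the past window
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

p = 5
feats, targets = [], []
for t in range(p, 499):
    w = x[t - p + 1:t + 1]
    path = np.column_stack([np.linspace(0.0, 1.0, p), w])   # time augmentation
    # basepoint w[0] compensates for the signature's translation invariance
    feats.append(np.concatenate([[1.0, w[0]], sig_level2(path)]))
    targets.append(x[t + 1])

A, y = np.array(feats), np.array(targets)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
r2 = 1.0 - np.sum((A @ coef - y) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 2))   # close to the theoretical AR(1) R^2 of 0.64
```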

## PROPOSED MODELS AND ALGORITHMS

### Hierarchical Generation

We propose to employ an ensemble of GANs to carry out an iterative generation of a given market without having to calculate the signature of the original large-dimensional time series. We obtain the structure of this ensemble using the hierarchical structure of the underlying market (Mantegna 1999).

Given an ℝ^{d}-valued time series *X* = (*X*_{t})_{t}, the main hypothesis of the SigCWGAN is that the process *X* has an autoregressive nature, so that it satisfies *X*_{t+1} = *f*(*X*_{t−p+1:t}) + ε_{t} with i.i.d. noise ε_{t}. In our case, we maintain the same hypothesis for a set of dimensions of *X* that we will call the base of the market and hypothesize that the rest of the dimensions of *X* can be divided into a hierarchical clustering, so that the dimensions of each cluster satisfy an analogous autoregressive relation conditional on the base.

### Hierarchical Clustering

Hierarchies have been studied in the context of financial markets (Mantegna 1999). These structures imply that the dynamics of financial markets can be described by the dynamics of a set of clusters determined in a hierarchical way, that is, clusters that can be divided into subclusters and so on down to individual assets.

We assume that it is possible to break down the generation of a financial market into two tasks, the generation of inter- and intra-hierarchical cluster dynamics.

**1.** Inter-cluster dynamics can be modeled by the joint generation of a set of assets, each taken as representative of its cluster, with a SigCWGAN. We will call this set of assets the base of the market.

**2.** Intra-cluster dynamics can be modeled by the conditional generation of the clusters' assets, conditioning on the dynamics of the base of the market, a conditional generation across dimensions.

In this article, we apply the single-linkage clustering algorithm that was used in the context of hierarchical risk parity (HRP) (López de Prado 2016; León et al. 2017; Jaeger et al. 2021). Following Mantegna (1999), Papenbrock and Schwendner (2015), and López de Prado (2016), we transform the empirical correlation matrix *C* into a distance matrix using a correlation-based metric. Alternatives to single-linkage clustering are other static or adaptive tree-based clustering algorithms or more general seriation methods as discussed in Schwendner et al. (2021). For a review of correlations, hierarchies, networks, and clustering in financial markets, see Marti et al. (2021). We will choose the last dimension added to each cluster according to the single linkage as the representative of each cluster and, hence, part of the base (see the Cross-Dimensional SigCWGAN section).
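This clustering step can be sketched as follows. The exact correlation-to-distance metric is elided in the text; we use the common HRP choice *d*_{ij} = ((1 − ρ_{ij})/2)^{1/2} and SciPy's single-linkage routine on a toy two-factor market (all data here are synthetic, for illustration only):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
# toy market: two blocks of three assets, each block driven by a common factor
factor_a, factor_b = rng.standard_normal((2, 1000))
rets = np.column_stack(
    [factor_a + 0.3 * rng.standard_normal(1000) for _ in range(3)]
    + [factor_b + 0.3 * rng.standard_normal(1000) for _ in range(3)]
)

corr = np.corrcoef(rets, rowvar=False)
# HRP-style correlation distance: d_ij = sqrt((1 - rho_ij) / 2)
dist = np.sqrt(0.5 * np.clip(1.0 - corr, 0.0, 2.0))
link = linkage(squareform(dist, checks=False), method="single")
labels = fcluster(link, t=2, criterion="maxclust")
print(labels)   # the two factor blocks end up in two different clusters
```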

### Cross-Dimensional SigCWGAN

The Expected Signature Theorem tells us that, under the regularity condition of the theorem, a stochastic process *X* is characterized by its expected signature 𝔼[*S*(*X*)]. Intuitively, the signature can be considered an equivalent of a Taylor series for paths. Therefore, the expected signature of a random process can be viewed as an analog of the moment-generating function of a *d*-dimensional random variable.

Given an ℝ^{d}-valued time series (*X*_{t})_{t}, when presented with a window of size *p* of past values *X*_{t−p+1:t}, the generator of a SigCWGAN is trained to generate values whose signature approximates the future expected truncated signature of degree *m*,

𝔼[*S*_{m}(*X*_{t:t+q}) | *X*_{t−p+1:t}],

under the assumption that the process *X* is stationary so that it satisfies *X*_{t+1} = *f*(*X*_{t−p+1:t}) + ε_{t} with i.i.d. noise ε_{t}; hence, the conditional law of *S*_{m}(*X*_{t:t+q}) given *X*_{t−p+1:t} = *x* does not depend on *t*. The Signature Approximation Theorem tells us that this nonlinear supervised learning problem can be reduced to a linear regression on the truncated signature of the past path *X*_{t−p+1:t}. The linear functional can, thus, be estimated from true data, and the generator can then be trained via the Wasserstein loss applied to the truncated signatures of generated data and the approximations obtained from the regression, the Sig-*W*_{1} distance from Ni et al. (2020).

To avoid the exponential scaling problem arising from the calculation of the signature of the entire time series, we propose to modify the procedure above and break down the time-series generation process. We first determine the base of the market through a hierarchical clustering procedure: intuitively, a set of dimensions, with each dimension belonging to a different cluster *C*^{i} from the hierarchical clustering, that is sufficiently representative of the market behavior. That is, for all clusters *C*^{i}, the conditional distribution of the windows of the cluster's dimensions (minus the cluster representative *r*_{i} included in the base) given the windows of size *w* of the base of the market does not depend on *t*, where *w* is a window length. We will call each such dimension of the base the *representative* *r*_{i} of its cluster.

Once the base has been determined, we train a SigCWGAN to learn to generate all the cluster representatives jointly and then train a new cross-dimensional SigCWGAN for each of the clusters of the hierarchical clustering. Given a cluster *C*^{i}, we want to have a generator *G*_{i} that, when fed with values produced by the SigCWGAN trained on the base, will produce values of the dimensions of cluster *C*^{i} (Exhibit 1) so that the window of generated values concatenated to the input has a truncated signature that resembles that of the real data window. The Signature Approximation Theorem tells us that this nonlinear supervised learning problem can be reduced to a linear regression on the truncated signature of the base path, and the Expected Signature Theorem tells us that the expected signature characterizes the law of the dimensions belonging to the cluster.

As in the SigCWGAN, we calibrate the linear functional that estimates the expected signature by performing a linear regression on the signatures of the real data windows. Then we train a generator *G*_{i} to generate the cluster's dimensions when presented with the values generated by the base SigCWGAN, so that the truncated signature of the generated window will approximate the signature obtained from applying the calibrated functional to the base path. The full pseudocode of the algorithm is detailed in Algorithm 1 and illustrated in Exhibit 2. We carry out the same path transformations as in the original SigCWGAN implementation to augment the paths, in the hope that these transformations will be informative of the temporal characteristics of the time-series data.
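The iterative generation flow can be sketched as follows, with fixed linear maps standing in for the trained base and cross-dimensional generators (all loadings, shapes, and names here are hypothetical stand-ins, not the trained networks of the article):

```python
import numpy as np

rng = np.random.default_rng(0)

def base_generator(noise):
    """Stand-in for the base SigCWGAN: a fixed linear map instead of a
    trained network, used only to illustrate the data flow."""
    return noise @ np.array([[1.0, 0.2], [0.2, 1.0]])

# hypothetical loadings of each cluster's remaining dimensions on the 2 base dims
loadings = {
    "C1": np.array([[0.9, 0.8], [0.1, 0.0]]),  # cluster with 2 extra dimensions
    "C2": np.array([[0.0], [1.1]]),            # cluster with 1 extra dimension
}

T = 250
# step 1: jointly generate the cluster representatives (the base of the market)
base = base_generator(rng.standard_normal((T, 2)))
parts = [base]
# step 2: generate each cluster's remaining dimensions conditional on the base
for name, load in loadings.items():
    noise = 0.1 * rng.standard_normal((T, load.shape[1]))
    parts.append(base @ load + noise)          # stand-in conditional generator
market = np.concatenate(parts, axis=1)
print(market.shape)   # (250, 5)
```

The key structural point is that no generator ever sees more than the base dimensions plus one cluster at a time, which is what keeps the signature sizes tractable.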

## EXPERIMENTS AND EVALUATION

It is highly challenging to determine the appropriate metrics to evaluate the quality of generated time-series data. Static aspects of the generated data distributions need to be considered as well as the temporal dynamics of each sample. We will calculate the following metrics:

▪ Metric on the **marginal distribution**: Absolute difference between the empirical probability distribution functions, approximated via histograms, of real and generated returns, averaged over all dimensions.

▪ Metric on the **autocorrelation**: Absolute difference between the lag-1 autocorrelation coefficients of real and generated returns, averaged over all dimensions.

▪ Metric on the **correlation**: Absolute difference between the cross-correlation coefficients of real and generated data.

▪ *R*^{2} comparison: A linear regression^{1} is performed on the real input–output pairs (*X*_{t−p+1:t}, *X*_{t+1}) used during training, and the corresponding *R*^{2} (TRTR) is calculated. Then the same linear model is applied to pairs in which the output is simulated by the generator fed *X*_{t−p+1:t} as input, and the corresponding *R*^{2} (TSTR) is calculated. The difference between the two is reported.

▪ Metric on the **signature**: Absolute difference between the (truncated) signatures of the generated future windows and the approximations produced by the linear functional obtained from calibrating a linear regression on the pairs (*S*(*X*_{t−p+1:t}), *S*(*X*_{t:t+q})).

▪ Discriminative score: A two-layer long short-term memory (LSTM) network is optimized to distinguish segments of real and generated scenarios by minimizing the binary cross-entropy of its predicted labels in a supervised setting. The model is then applied to a held-out test set, and the discriminative score is calculated as |0.5 − *accuracy*|.
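The first two metrics admit a direct sketch. The following illustrative implementation (our simplified reading of the descriptions above, not the authors' code) computes the marginal-distribution and lag-1 autocorrelation metrics for toy data:

```python
import numpy as np

def marginal_metric(real, fake, bins=50):
    """Mean absolute difference between histogram-estimated marginal densities
    of real and generated returns, averaged over dimensions."""
    diffs = []
    for i in range(real.shape[1]):
        lo = min(real[:, i].min(), fake[:, i].min())
        hi = max(real[:, i].max(), fake[:, i].max())
        hr, _ = np.histogram(real[:, i], bins=bins, range=(lo, hi), density=True)
        hf, _ = np.histogram(fake[:, i], bins=bins, range=(lo, hi), density=True)
        diffs.append(np.abs(hr - hf).mean())
    return float(np.mean(diffs))

def acf1_metric(real, fake):
    """Absolute difference of lag-1 autocorrelations, averaged over dimensions."""
    def acf1(x):
        x = x - x.mean(axis=0)
        return (x[1:] * x[:-1]).sum(axis=0) / (x * x).sum(axis=0)
    return float(np.abs(acf1(real) - acf1(fake)).mean())

rng = np.random.default_rng(2)
real = rng.standard_normal((2000, 3))
fake = rng.standard_normal((2000, 3))
print(marginal_metric(real, fake), acf1_metric(real, fake))
```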

To benchmark our model, we train the Hierarchical-SigCWGAN on a dataset of daily returns of *d* = 10 futures markets because at this scale the SigCWGAN is still tractable. We also employ a simple block bootstrap benchmark (with replacement and block size 3, the same size as the input of the SigCWGANs). Then, we enlarge the dataset with 25 more assets, to 35 assets in total, and illustrate the Hierarchical-SigCWGAN's ability to scale. The Hierarchical-SigCWGAN implementation can be found at https://github.com/FernandoDeMeer/Hierarchical-SigCWGAN. We use NVIDIA DGX hardware with 256 GB of RAM for the computations.
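The block bootstrap benchmark admits a compact sketch (an assumed implementation consistent with the description above: sampling with replacement, block size 3):

```python
import numpy as np

def block_bootstrap(returns, block_size=3, rng=None):
    """Block bootstrap with replacement: draw random contiguous blocks of rows
    and concatenate them until the original sample length is reached."""
    rng = rng or np.random.default_rng()
    n = len(returns)
    n_blocks = int(np.ceil(n / block_size))
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    blocks = [returns[s:s + block_size] for s in starts]
    return np.concatenate(blocks, axis=0)[:n]

rets = np.random.default_rng(3).standard_normal((1000, 10))
synth = block_bootstrap(rets, block_size=3)
print(synth.shape)   # (1000, 10)
```

Because every block is copied verbatim, this benchmark preserves short-range temporal structure and cross-sectional correlations exactly within each block, but it cannot produce innovations beyond the empirically observed data.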

## FUTURES DATASET

The first dataset we employ consists of the daily closing prices from May 3, 2000, to May 7, 2021, of the 10 continuously rolled futures from Exhibit 3. We calculate discrete returns from the prices and train the SigCWGAN and each base and cross-dimensional GAN of the Hierarchical-SigCWGAN for 5,000 stochastic gradient descent steps. We calculate the mean ± standard deviation values of the metrics for 100 trials with different noise seeds. In each trial, a synthetic dataset of the same size as the empirical dataset is created. The results can be seen in Exhibit 4. We use discrete returns instead of logarithmic ones to account for potentially negative prices in empirical market data. Negative prices were observed on April 20, 2020, for West Texas Intermediate (WTI) oil futures (Corbet, Goodell, and Günay 2020), although this market is not included in this dataset.

We can observe how the Hierarchical-SigCWGAN achieves a fit of the marginal distribution that is very similar to that of the SigCWGAN and even achieves a closer autocorrelation fit. The results on the correlation metric and the discriminative score are to be expected: The Hierarchical-SigCWGAN generates different clusters independently, that is, a cross-dimensional generator does not receive any dimensions other than those of the base as input and, hence, cannot learn the dependence between clusters beyond the information carried by the base. This is the price to pay for scalability to larger datasets. A comparison of the three correlation matrices is presented in Exhibit 5. It is necessary to point out a shortcoming of the *R*^{2}(%) and Sig-*W*_{1} metrics. Their implementation implies that only one-to-one comparisons are made between generated and real data; that is, the more a generator memorizes the finite set of pairs of real inputs and outputs, the better it will score on these metrics. It is impossible for the Hierarchical-SigCWGAN to memorize real data because the cross-dimensional SigCWGANs are trained on the outputs of the base generator, of which there are infinitely many; in contrast, the same real data windows are consistently fed to the generator during SigCWGAN training, allowing the latter to memorize.

The strong similarity of correlation matrices in Exhibit 5 is a motivation to explore further the variations across generated scenarios and to compare these variations for the Hierarchical-SigCWGAN, the SigCWGAN, and the block bootstrap as a benchmark. Exhibit 6 shows six different panels: average return correlations between markets across 100 bootstrapped scenario trials in the upper left panel, standard deviations of return correlations between markets across these scenarios in the upper middle panel, the correlation between the starting times of the maximum drawdowns (MDDs) between markets across the same scenarios in the upper right panel, a boxplot of per annum (p.a.) returns across scenarios for each market in the lower left panel, a boxplot of p.a. volatilities across scenarios for each market in the lower middle panel, and a boxplot of MDDs across scenarios for each market in the lower right panel. Exhibit 7 shows the same evaluations for the SigCWGAN and Exhibit 8 for the hierarchical GAN. At first glance, the three methods deliver very similar results for the average return correlations (upper left panel). Consistent with Exhibit 5, the hierarchical GAN yields a less pronounced correlation matrix than the SigCWGAN, especially for the block of negative (dark blue) correlations between the equity index futures and bond futures; this is a result of their being classified into different clusters (Exhibit 9). The standard deviations of return correlations across scenarios (upper middle panel) are of similar magnitude for all methods but are the most pronounced for the SigCWGAN. The boxplots for per annum returns (lower left panel) and volatilities (lower middle panel) look similar across methods.
The largest differences between methods are visible in the outliers of the boxplots of the MDDs (lower right panel), and in the correlations between the start times of the MDDs (upper right panel): The hierarchical GAN exhibits more outliers in the MDD statistics than the other methods and a substantially weaker correlation block structure between the start times than the other methods. Depending on the application of the synthetic data, this can be an undesirable or a desirable feature.

## EXTENDED DATASET

To showcase the Hierarchical-SigCWGAN's ability to generate large-dimensional datasets, we add 25 new markets as additional dimensions to the previous dataset (the full list can be seen in Appendix B). The SigCWGAN is no longer tractable with our hardware on this full dataset because the resulting generator is too large to be stored in memory. Because of this, we report only the mean ± standard deviation values of the metrics across 100 trials of the Hierarchical-SigCWGAN and the block bootstrap in Exhibit 10.

Exhibit 11 shows the cross correlations of real and generated returns. The extended dataset includes not only additional futures markets but also foreign exchange (FX) spot markets. Strong correlations between FX futures and spot FX markets are visible in the bottom right block of the correlation matrices. The preservation of the pronounced correlation block structure between the asset classes (Papenbrock and Schwendner 2015) is key for the resulting synthetic data.

The Hierarchical-SigCWGAN achieves performance on the marginal distribution and autocorrelation metrics similar to that in the lower-dimensional regime. If we normalize the correlation metric by the number of correlations being compared, the model again performs similarly. On the *R*^{2}(%) and Sig-*W*_{1} metrics, the Hierarchical-SigCWGAN's inability to memorize real data is again evident; however, if we normalize the Sig-*W*_{1} metric by the number of dimensions of the signature, the average deviation per term is again comparable to the lower-dimensional case.

Exhibits 12 and 13 present the same evaluation for the extended dataset as Exhibits 6 and 8 present for the initial dataset. The average correlations across 100 synthetic scenarios look very similar between the bootstrapped and hierarchical GAN results but are a bit less pronounced for the hierarchical GAN, again because of the clustering structure of the dataset shown in Exhibit 14. The standard deviation of return correlations also looks less pronounced for the hierarchical GAN. The block structure of the average correlation of the MDD start times looks substantially weakened for the hierarchical GAN compared with the block bootstrap. We cannot generate SigCWGAN results because of the exponential computational effort as a function of the number of assets for this method.

Overall, the results show that evaluating the performance of a GAN via metrics is challenging. Ideally, a well-performing GAN model produces data that are realistic, novel, and diverse, but the latter two characteristics do not necessarily lead to better metrics. At the other extreme, a model that simply memorizes training data will yield a very good result in all metrics (a perfect one in the case of exact memorization), but the data it produces will not be of any use in downstream tasks. In addition, very simple methods such as a block bootstrap yield very good results in many metrics, as we have illustrated, but the scenarios they produce may be of limited utility depending on the task at hand. This is why we believe the quality of synthetic data should be evaluated according to its usefulness on a case-by-case basis. The models we present are likely to be more useful in tasks in which geometric/signature information of the generated scenarios is key.

## CONCLUSIONS AND OUTLOOK

We have presented the Hierarchical-SigCWGAN architecture, an iterative generation method that relies on the hierarchical structure of financial markets with an ensemble of GANs derived from the hierarchical clustering of a given market that approximates (truncated) expected signatures. The method performs similarly to the SigCWGAN in low-dimensional datasets. Moreover, it can scale to high-dimensional settings in which the SigCWGAN is no longer tractable.

In future research, we will consider alternatives for the different components of the hierarchical GAN and use its core idea as an element in a more comprehensive process chain. An example would be to first determine correlation regimes (Papenbrock and Schwendner 2015) and then enlarge the distribution of correlation matrices over different periods as in Marti (2019) and Papenbrock et al. (2021), with the regime correlation matrices as starting points. The resulting synthetic correlation matrices could then condition the market time series generated by the hierarchical GAN.

An alternative to the single-linkage clustering step used in this article would be clustering methods based on the signatures of the time series (Bilokon, Jacquier, and McIndoe 2021) or other clustering alternatives as discussed in Schwendner et al. (2021). These approaches also could motivate portfolio construction schemes derived from signature distances (signature-HRP) rather than from correlation distances exclusively (López de Prado 2016).

Another interesting research direction would involve leveraging pretrained large-scale models, such as the different attention-based transformers (Vaswani et al. 2017), shown to be extremely proficient in natural language processing (NLP) and computer vision (CV) tasks (Bommasani et al. 2021), to replace the networks of the base and cross-dimensional SigCWGAN generators of the hierarchical GAN, given their ability to process time-series data.

Our research opens the doors to several applications. First, high-dimensional synthetic scenarios can be used to find optimal solutions with deep learning, such as dynamic delta hedging (Cuchiero, Khosrawi, and Teichmann 2020) or mean–variance portfolio optimization with neural networks. Furthermore, joint synthetic scenarios can be useful for stress testing as well as reverse stress testing. A recent preprint (Flaig and Junike 2022) highlighted the advantage of GAN-based economic scenario generators as an essentially assumption-free approach toward modeling dependencies between different risk factors.

## ACKNOWLEDGMENTS

This article is based on work from COST Action 19130, supported by COST (European Cooperation in Science and Technology; www.cost.eu). We also thank an anonymous reviewer for useful comments.

## APPENDIX A

### DEFINITIONS

This appendix contains the precise definitions omitted from the main text. We mostly follow the notation in Ni et al. (2020) and Chevyrev and Kormilitzin (2016). We begin with the definition of *p*-variation in the following.

#### Definition

Let *X* : *J* → ℝ^{d} be a *d*-dimensional path, where *J* := [*a*, *b*] is a compact interval. We say that *X* is of finite *p*-variation for *p* ≥ 1 if

‖*X*‖_{p-var} := ( sup_{D} Σ_{i} |*X*_{t_{i+1}} − *X*_{t_{i}}|^{p} )^{1/p} < ∞,

where the supremum is taken over all possible partitions *D* = {*t*_{0}, …, *t*_{n}} of *J*.

Each of the values of a signature is an iterated integral of the path, which are defined in the following.

#### Definition

For a path *X* : *J* := [*a*, *b*] → ℝ^{d}, we denote the coordinate paths of *X* by (*X*^{1}, …, *X*^{d}), where each *X*^{i} : *J* → ℝ is a real-valued path. For any single index *i* ∈ {1, …, *d*}, we define the quantity

*S*(*X*)^{i}_{a,t} := ∫_{a<s<t} d*X*^{i}_{s} = *X*^{i}_{t} − *X*^{i}_{a},

which is the increment of the *i*th coordinate of the path at time *t* ∈ [*a*, *b*]. Now for any pair *i*, *j* ∈ {1, …, *d*}, let us define the double-iterated integral

*S*(*X*)^{i,j}_{a,t} := ∫_{a<s<t} *S*(*X*)^{i}_{a,s} d*X*^{j}_{s}.

We can continue recursively, and for any integer *k* ≥ 1 and collection of indexes *i*_{1}, …, *i*_{k} ∈ {1, …, *d*}, we define

*S*(*X*)^{i_{1},…,i_{k}}_{a,t} := ∫_{a<s<t} *S*(*X*)^{i_{1},…,i_{k−1}}_{a,s} d*X*^{i_{k}}_{s}.

The real number *S*(*X*)^{i_{1},…,i_{k}}_{a,b} is called the *k*-fold iterated integral of *X* along the indexes *i*_{1}, …, *i*_{k}.

Only truncated signatures (i.e., up to a certain level) need to be calculated because the values of the iterated integrals decay factorially with the depth of the level:

**Lemma (Factorial Decay of the Signature).** Let *X* be a path of finite 1-variation on [*a*, *b*]. Then there exists a constant *C* > 0 such that for all *k* ≥ 0,

‖*S*^{k}(*X*)_{a,b}‖ ≤ *C*^{k}/*k*!.
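The factorial decay is easy to verify for a straight-line path, whose *k*-th signature level is Δ^{⊗k}/*k*!. The sketch below (illustrative only) builds these levels by outer products and compares their norms with ‖*X*‖_{1-var}^{k}/*k*!, for which equality holds in this linear case:

```python
import math
import numpy as np

# straight-line path from the origin with increment delta: its k-th signature
# level is the tensor delta^{(x)k} / k!, so the level norms decay factorially
delta = np.array([0.8, -0.5, 1.1])
one_var = np.linalg.norm(delta)       # 1-variation of the linear path

level, norms = delta.copy(), []
for k in range(1, 7):
    norms.append(np.linalg.norm(level))
    # S^{k+1} = (S^k tensor delta) / (k + 1) for a linear path
    level = np.multiply.outer(level, delta).ravel() / (k + 1)

for k, nk in enumerate(norms, start=1):
    # equality for a linear path: ||S^k|| = ||X||_{1-var}^k / k!
    print(k, round(nk, 5), round(one_var ** k / math.factorial(k), 5))
```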

Finally, for any discrete time series, we use the following definition to augment the time dimension and embed the series into the space of time-augmented paths.

#### Definition (Time-Joined Transformation)

Let *X* = (*x*_{i})_{i=m,…,n} be a univariate time series. Let *R* : [2*m*, 2*n* + 1] → ℝ^{+} × ℝ be the two-dimensional time-joined path of *X*, defined piecewise on consecutive unit intervals so that it alternately advances in the time direction *e*_{1} and in the value direction *e*_{2}, where *i* = *m*, *m* + 1, …, *n* − 1 and {*e*_{i}}_{i=1,2} is an orthonormal basis of ℝ^{2}.

The signature of the time series *X* is defined as the signature of its time-joined path *R*(*s*)_{s∈[2m,2n+1]}.

## APPENDIX B

## ENDNOTE


^{1} In Ni et al. (2020), a linear signature model is referenced, but only a linear regression is implemented. For reference, see the official GitHub: https://github.com/SigCGANs/Conditional-Sig-Wasserstein-GANs/blob/master/evaluate.py.

- © 2022 Pageant Media Ltd