Abstract
We model the evolution of the ex-ante weighted spread (EWS) embedded in an open Limit Order Book (LOB) and investigate the impact of observed market-related variables on the spread. Our modeling involves decomposing the joint distribution of the weighted spread into simple and interpretable distributions. Our main results have several implications: (i) EWS features high persistence in autocorrelation; (ii) lower-level LOB remains liquid even after a high trade imbalance; (iii) lower- and higher-level LOB react to temporal spread change and trade imbalance in different ways; and (iv) both trade durations and quote durations have seasonality effects. We also show, through a simple high frequency trading exercise, that the use of the model can be economically important. Further, our model provides an estimation of market resilience.
Disclosure statement
No potential conflict of interest was reported by the authors.
Supplemental data
Supplemental data for this article can be accessed at https://doi.org/10.1080/14697688.2019.1690160.
Notes
1 This is because most exchanges also allow iceberg and hidden orders. In addition, because of latency, the realized cost at execution can differ from the expected cost computed from the state of the LOB before the market order is submitted. We thank an anonymous referee for pointing out the importance of latency.
2 See Kyle (1985), Glosten and Milgrom (1985), and Glosten (1994).
3 The factors used to model the dynamics of the EWS include trade duration, quote duration, activity, direction, and size factors.
4 A Smart Order Router (SOR) is a system designed to submit orders in the best available way, based on market conditions and predefined rules. Usually, an SOR searches for the best execution price across fragmented markets. In this paper, we use SOR to refer to an execution practice that has the same objective as a general SOR but is applied to the temporal dimension (over the course of the trading day).
5 Short-run variables are variables at a given time point, whereas long-run variables summarize the information over an interval.
6 There are two types of trading mechanisms during normal trading hours: call auction and continuous auction. A call auction can be held once or several times during the trading day. Its clearing price is determined by the state of the LOB and serves as the opening price for the continuous auction that follows.
7 Fully hidden orders and the hidden part of an iceberg order are not observable in our dataset. However, because we observe the state of the LOB before and after each transaction, we can evaluate whether a market order hits hidden orders. Our backtest results show that fewer than 3% of market orders run into hidden orders, representing about 6% of trade volume. The presence of hidden orders causes our EWS to slightly underestimate actual liquidity. We thank an anonymous referee for pointing out the impact of hidden orders on the EWS.
8 To avoid the impact of stock price and outstanding shares across different stocks, for each stock, we choose q from its own trade volume distribution.
9 Given that q is ad-hoc, hereafter, we use the notations and
for low-level spread and high-level spread, respectively. In addition, we keep
as a general term for ex-ante weighted spread.
10 Lu and Abergel (2018) show that trades are more likely to be the driving force of the trade-quote and LOB dynamics.
11 We also estimate the model by supposing that the quote updates initiate the dynamics. The results remain similar.
12 The augmented Dickey-Fuller test rejects the hypothesis of a unit root for for all stocks in our sample. For the sake of brevity, we do not present our test results here; they are available from the authors.
13 The ACD (Autoregressive Conditional Duration) model was proposed by Engle and Russell (1998); it is widely used for duration modeling.
14 The augmented Dickey-Fuller test rejects the hypothesis of a unit root for the activity, direction, and size factors for all stocks in our sample. For the sake of brevity, we do not present our test results; they are available from the authors.
15 McFadden's R squared measure is defined as $R^2_{\text{McF}} = 1 - \ln(L_M)/\ln(L_0)$, where $L_M$ denotes the likelihood value from the current fitted model and $L_0$ denotes the corresponding value for the null model. ROC evaluates binary model accuracy at various threshold settings (Swets 1986, 1988) by considering Type I and Type II errors. Count accuracy measures the share of in-sample observations correctly predicted by the model.
16 The general probability distribution function is
17 Results for the two other months are similar and available from the authors.
18 Because the size factor is not a binary variable, we present the Ljung-Box statistics and adjusted R squared to validate the model.
19 The variance of is equal to
. It is a function of T and
. In our simulations,
sometimes diverge gradually at large times even though
is stationary with zero mean and our model captures well the autocorrelation of empirical data. The variance of
may become very large when the summation containing
does not efficiently cancel the effect of T. In practice, to avoid this blow-up at very large times, the aggregation of simulated
should be done with a given number of ticks. This is how we proceed when showing the economic significance of our model in Section 5. We thank a referee for pointing out this issue.
20 In this paper, market risk is related to uncertainty about stock mid-quote price, whereas liquidity risk is uncertainty about the shape of the LOB when liquidating the position.
21 In this typical exercise, we use a moving window of 10 periods to calculate the historical average.
22 Consistent with many model forecasting studies, the results confirm that a simple moving-average strategy could be hard to beat when using real data.
23 n is an arbitrary number of ticks; in our simulation, it is set to 1000.
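As an illustration of the goodness-of-fit measure mentioned in note 15, the following minimal sketch computes McFadden's R squared for a binary outcome, using the standard definition of one minus the ratio of the fitted model's log-likelihood to that of an intercept-only null model. The function names, the sample outcomes, and the predicted probabilities are hypothetical and are not taken from the paper's data or estimation code.

```python
import math


def bernoulli_log_likelihood(y, p):
    """Log-likelihood of binary outcomes y under predicted probabilities p.

    Assumes every probability lies strictly between 0 and 1.
    """
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def mcfadden_r2(y, p_model):
    """McFadden's pseudo R squared: 1 - lnL(model) / lnL(null).

    The null model is intercept-only, so it predicts the sample
    frequency of y == 1 for every observation. Assumes y contains
    both zeros and ones, otherwise the null likelihood is degenerate.
    """
    p_null = sum(y) / len(y)
    ll_model = bernoulli_log_likelihood(y, p_model)
    ll_null = bernoulli_log_likelihood(y, [p_null] * len(y))
    return 1.0 - ll_model / ll_null


# Hypothetical outcomes and fitted probabilities, for illustration only.
y = [1, 0, 1, 1, 0, 0, 1, 0]
p = [0.8, 0.3, 0.7, 0.9, 0.2, 0.4, 0.6, 0.1]
r2 = mcfadden_r2(y, p)
print(round(r2, 3))
```

A model that predicts no better than the sample frequency gives a value of zero, and values closer to one indicate a larger likelihood gain over the null model.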