Quantitative Finance Volume 22, 2022 - Issue 3

Open access

Research Papers

State-dependent Hawkes processes and their application to limit order book modelling

Maxime Morariu-Patrichi† Department of Mathematics, Imperial College London, South Kensington Campus, London SW7 2AZ, UK

Mikko S. Pakkanen† Department of Mathematics, Imperial College London, South Kensington Campus, London SW7 2AZ, UK; ‡ CREATES, Aarhus University, Aarhus, Denmark. Correspondence: [email protected]

https://orcid.org/0000-0002-0696-4914

Pages 563-583 | Received 31 Aug 2019, Accepted 13 Sep 2021, Published online: 07 Dec 2021

https://doi.org/10.1080/14697688.2021.1983199

Figures & data

Figure 1. Simulation of a state-dependent Hawkes process with $d_{e} = 1$, $d_{x} = 2$. The upper plot shows the evolution of the state process. The blue dots indicate the event times and the lower plot represents the intensity. The process is specified so that $\nu_{1} = 1$ and $k_{11}(t, x) = \exp(-4t)\,\mathbf{1}_{\{x = 2\}}$, that is, in state 2 the process exhibits exponential self-excitation whereas no self-excitation occurs in state 1.
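
A process of the kind shown in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the caption does not give the state transition distribution, so the matrix `phi` below is hypothetical; the kernel is assumed to be evaluated at the state entered at each event; and sampling uses Ogata-style thinning, which is valid here because the intensity only decays between events. The function name `simulate_sd_hawkes` is likewise made up for this example.

```python
import numpy as np

def simulate_sd_hawkes(T, nu=1.0, alpha=1.0, beta=4.0, phi=None, x0=1, seed=0):
    """Simulate one event type with two states on [0, T) by thinning.

    Intensity: nu + sum over past events that entered state 2 of
    alpha * exp(-beta * (t - t_n)), i.e. k(t, x) = alpha*exp(-beta*t)*1{x = 2}.
    """
    rng = np.random.default_rng(seed)
    if phi is None:
        # Hypothetical transition matrix: row = current state, column = next state.
        phi = np.array([[0.5, 0.5], [0.5, 0.5]])
    t, x = 0.0, x0
    excite = 0.0  # current value of the self-excitation term
    times, states = [], []
    while True:
        lam_bar = nu + excite            # upper bound: intensity decays until the next event
        w = rng.exponential(1.0 / lam_bar)
        excite *= np.exp(-beta * w)      # decay the excitation over the waiting time
        t += w
        if t >= T:
            break
        if rng.uniform() * lam_bar <= nu + excite:  # accept the candidate point
            x = int(rng.choice([1, 2], p=phi[x - 1]))  # draw the new state
            if x == 2:
                excite += alpha          # only events landing in state 2 excite
            times.append(t)
            states.append(x)
    return np.array(times), np.array(states)

times, states = simulate_sd_hawkes(T=50.0, seed=1)
```

With the default values, `nu = 1`, `alpha = 1` and `beta = 4` match the specification in the caption; changing `phi` alters how much time the process spends in the self-exciting state.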

Figure 2. Descriptive statistics of level-I order flow of INTC. Except for Figure 2(a), only the data between 12:00 and 14:30 are used. In Figure 2(a), the sample mean of the arrival rate of level-I orders is computed over 10-minute bins. The translucent area represents the range of the arrival rate across the 250 trading days, excluding the bottom and top 5% values. (a) Arrival rate of level-I orders. (b) Number of level-I orders. (c) Fraction of level-I orders with non-unique timestamp. (d) Distribution of level-I order types.

Figure 3. Joint distribution of events and states for INTC, depicting the empirical distribution of the marks $(E_{n}, X_{n})$ for the two considered state processes. (a) ${Model}_{S}$ (state process: bid–ask spread). (b) ${Model}_{QI}$ (state process: queue imbalance).

Table 1. Summary of ${Model}_{S}$ and ${Model}_{QI}$.

Figure 4. Estimated transition distributions $\hat{\phi}$ of ${Model}_{S}$ and ${Model}_{QI}$. We report the average of $\hat{\phi}^{(i)}$ across the 250 trading days. (Daily estimates vary little from these averaged values.) (a) Transition probabilities of the bid–ask spread (${Model}_{S}$). (b) Transition probabilities of the queue imbalance (${Model}_{QI}$).

Figure 5. The estimated kernel $\hat{k}$ under ${Model}_{S}$ and ${Model}_{QI}$. Each panel describes self- or cross-excitation as indicated by its title, whilst each colour corresponds to a different state. For example, in Figure 5(a), the red curves in the second panel represent the estimates $\hat{k}_{e'e}^{(i)}(\cdot, x)$ where $e' = \mathrm{ask}$, $e = \mathrm{bid}$ and $x = 1$. All daily estimates are superposed with one translucent curve for each day. An ‘aggregate’ kernel is represented by a solid line, computed using the median of $\hat{\alpha}^{(i)}$ and $\hat{\beta}^{(i)}$ across the 250 trading days. (a) ${Model}_{S}$ (state variable: bid–ask spread). (b) ${Model}_{QI}$ (state variable: queue imbalance).

Figure 6. Estimated base rate vector $\hat{\nu}^{(i)}$ for ${Model}_{S}$ and ${Model}_{QI}$ over time (in number of events per second). (a) ${Model}_{S}$. (b) ${Model}_{QI}$.

Figure 7. Estimated kernel norms $\|\hat{k}\|_{1, \infty}$ for ${Model}_{QI}$ over time. The dashed line marks 5 February 2018 (‘the return of volatility’), a day when the CBOE Volatility Index (VIX) jumped by 116% to 38 points, a level not seen since August 2015. This day seems to have introduced a systematic change in the magnitude of cross- and self-excitation. The spike in cross-excitation that occurs on 21 February 2018 is linked to a sudden change in market behaviour around 14:00 on that day, when Intel Corporation rolled out patches for its most recent generation of processors.

Figure 8. The upper panel depicts the evolution of the queue imbalance and level-I order flow of INTC on 13 February 2018 and the ask (red dots) and bid (blue dots) events. The lower panel displays the estimated intensities of ${Model}_{QI}$ . (The self- and cross-excitation kernel norms of ${Model}_{QI}$ on 13 February 2018 are visualised in Figure .)

Figure 9. In-sample (12:00–14:30) and out-of-sample (14:30–15:00) Q–Q plots of event residuals under ${Model}_{S}$, ${Model}_{QI}$ (state-dependent) and an ordinary Hawkes process (simple). The residuals of the $i$th day are computed using the ML estimates $(\hat{\nu}^{(i)}, \hat{\alpha}^{(i)}, \hat{\beta}^{(i)})$ obtained from the 12:00–14:30 period. The empirical quantiles are obtained by pooling the residuals of all 250 trading days. The two panels in each sub-figure correspond to the sequences of residuals $(r_{n}^{e})$ for $e \in E = \{\mathrm{ask}, \mathrm{bid}\}$. (a) ${Model}_{S}$: in sample. (b) ${Model}_{S}$: out of sample. (c) ${Model}_{QI}$: in sample. (d) ${Model}_{QI}$: out of sample.
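
As a rough illustration of how residuals of this kind can be formed, the sketch below covers only the simplest case: an ordinary univariate Hawkes process with an assumed exponential kernel $\alpha e^{-\beta t}$. Each residual is the integral of the intensity between two consecutive events; by the time-change theorem, under a correctly specified model these residuals are i.i.d. unit exponential, which is what a Q–Q plot against Exp(1) quantiles checks. The function names are hypothetical, and the state-dependent, multivariate residuals in the figure would need the full model rather than this simplification.

```python
import numpy as np

def event_residuals(times, nu, alpha, beta):
    """Integrate the intensity nu + sum_n alpha*exp(-beta*(t - t_n)) between events."""
    times = np.asarray(times, dtype=float)
    # Compensator at each event time:
    # Lambda(t) = nu*t + (alpha/beta) * sum_{t_n < t} (1 - exp(-beta*(t - t_n)))
    comp = np.array([
        nu * t + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (t - times[times < t])))
        for t in times
    ])
    # Increments of the compensator; the first residual integrates from time 0.
    return np.diff(comp, prepend=0.0)

def exp_qq_points(residuals):
    """Return (theoretical Exp(1) quantiles, sorted empirical residuals)."""
    r = np.sort(np.asarray(residuals, dtype=float))
    n = len(r)
    probs = (np.arange(1, n + 1) - 0.5) / n
    return -np.log1p(-probs), r

# With alpha = 0 the model is a Poisson process and residuals are nu * waiting times.
r_poisson = event_residuals([1.0, 3.0, 6.0], nu=2.0, alpha=0.0, beta=1.0)
r_hawkes = event_residuals([1.0, 3.0, 6.0], nu=2.0, alpha=0.5, beta=1.0)
theo, emp = exp_qq_points(r_hawkes)
```

Plotting `emp` against `theo` gives the Q–Q plot; points close to the diagonal suggest a good fit.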

Figure 10. The estimated spectral radius $\hat{\rho}(x)$ as a function of $x \in X$ under ${Model}_{S}$ and ${Model}_{QI}$. The daily profiles $x \mapsto \hat{\rho}^{(i)}(x)$, $i = 1, \dots, 250$, are represented by the red translucent curves. (a) ${Model}_{S}$. (b) ${Model}_{QI}$.
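
A state-wise spectral radius can be computed along the following lines. The sketch assumes, beyond what the caption states, that the kernels are exponential, $k_{e'e}(t, x) = \alpha_{e'xe} \exp(-\beta_{e'xe} t)$, so that each $L^{1}$ norm equals $\alpha_{e'xe} / \beta_{e'xe}$, and that $\rho(x)$ is the spectral radius of the matrix of these norms for fixed $x$. The array layout and the function name are hypothetical.

```python
import numpy as np

def spectral_radius_by_state(alpha, beta):
    """alpha, beta: arrays of shape (d_e, d_x, d_e), indexed [e', x, e].

    Returns an array of length d_x holding the spectral radius of the
    kernel-norm matrix for each state x.
    """
    norms = np.asarray(alpha, dtype=float) / np.asarray(beta, dtype=float)
    d_x = norms.shape[1]
    return np.array([
        np.max(np.abs(np.linalg.eigvals(norms[:, x, :])))  # largest eigenvalue modulus
        for x in range(d_x)
    ])

# Toy parameters (made up): two event types, two states.
alpha = np.zeros((2, 2, 2))
alpha[:, 0, :] = [[1.0, 0.0], [0.0, 2.0]]   # state 0: self-excitation only
alpha[:, 1, :] = [[2.0, 2.0], [2.0, 2.0]]   # state 1: self- and cross-excitation
beta = np.full((2, 2, 2), 4.0)
rho = spectral_radius_by_state(alpha, beta)
```

In this toy example the second state sits exactly at the boundary value 1, while the first state is well below it.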

Figure 11. In-sample (12:00–14:30) Q–Q plots of total residuals of ${Model}_{QI}$ (state-dependent) and the alternative model given by Equation (8) (complex) on 11 May 2018. Each panel corresponds to a sequence of residuals $(\tilde{r}_{n}^{ex})$ for all $e \in E$ and $x \in X$. Equation (8) reads

$$\tilde{\lambda}_{ex}(t) = \nu_{ex} + \sum_{e' \in E,\, x' \in X} \int_{[0, t)} k_{e'x'ex}(t - s)\, d\tilde{N}_{e'x'}(s), \quad t \geq 0,\ e \in E,\ x \in X. \tag{8}$$

Table A1. Parameter values for Specification 1.

Table A2. Parameter values for Specification 2.

Figure A1. Violin plots of the worst estimation errors, as defined in Equation (A2), under two different sets of parameter values (specifications 1 and 2). For every specification and sample size $N$, we simulate 100 paths with sample size $N$ and perform ML estimation for each of them. The true parameters are used as the initial guess in the optimisation procedure to speed up estimation and reduce the computational cost. (a) Specification 1. (b) Specification 2. Equation (A2) reads

$$\epsilon_{\mathrm{rel}} := \frac{\hat{\theta}_{i_{j^{\star}}} - \theta_{i_{j^{\star}}}}{\theta_{i_{j^{\star}}}}, \quad \text{where } j^{\star} = \operatorname*{arg\,max}_{j} \frac{|\hat{\theta}_{i_{j}} - \theta_{i_{j}}|}{\theta_{i_{j}}}. \tag{A2}$$
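
The worst relative error in Equation (A2) is straightforward to compute once the estimates and true values are available as arrays. The sketch below assumes the inputs already contain only the selected parameters $\theta_{i_{j}}$ and that all true values are strictly positive; the function name and the toy numbers are made up for illustration.

```python
import numpy as np

def worst_relative_error(theta_hat, theta):
    """Signed relative error of the parameter with the largest absolute relative error."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    theta = np.asarray(theta, dtype=float)
    rel_abs = np.abs(theta_hat - theta) / theta   # |theta_hat - theta| / theta per parameter
    j_star = int(np.argmax(rel_abs))              # index of the worst-estimated parameter
    return (theta_hat[j_star] - theta[j_star]) / theta[j_star]

# Toy example: the second parameter is underestimated by 50%.
eps_rel = worst_relative_error([1.1, 1.0, 4.2], [1.0, 2.0, 4.0])
```

Note that the sign is preserved, so the quantity distinguishes the worst overestimate from the worst underestimate.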

Figure A2. Uncertainty quantification of the estimated excitation profiles for INTC on 13 February 2018 under ${Model}_{QI}$ . The parametric bootstrap procedure involves simulating 100 paths covering 2.5 hours of trading using the ML estimates of the parameters on the considered day and applying ML estimation again to each of the simulated paths. Three random sets of parameters are used as the initial guess in the optimisation procedure. We use the 100 estimates to compute a 99%-confidence interval for the truncated kernel norm (translucent area). The solid line corresponds to the ML estimates using the original INTC data.
