1,324
Views
11
CrossRef citations to date
0
Altmetric
Articles

High order concentrated matrix-exponential distributions

, &
Pages 176-192 | Received 29 May 2019, Accepted 04 Dec 2019, Published online: 18 Dec 2019

Abstract

This paper presents matrix-exponential (ME) distributions, whose squared coefficient of variation (SCV) is very low. Currently, there is no symbolic construction available to obtain the most concentrated ME distributions, and the numerical optimization-based approaches to construct them have many pitfalls. We present a numerical optimization-based procedure which avoids numerical issues.

1. Introduction

Highly concentrated matrix-exponential functions play an important role in many research fields, for example, they turned out to be essential for numerical inverse Laplace transform methods as well.[Citation8]

The least varying phase type (PH) distribution of order N is known to be the Erlang distribution[Citation1] with SCV=1/N (defined as μ0μ2μ121, where μi,i{0,1,2}, are the moments of the distribution). Much less is known about the least varying ME distribution for order N. It is known that for order 2 the class of ME distributions is identical to the class of PH distributions, and it is also known that there exists an order 3 ME distribution with SCV = 0.200902<1/3, but it is still only a conjecture that this is the least varying order 3 ME distribution. Concentrated ME distributions are provided in Éltető et al.[Citation4] up to order 17 and in Horváth et al.[Citation7] up to order 47. These preliminary results indicate that, as N increases, the minimal SCV of order N ME distributions tends to be less than 2/N2 for odd N, and for even N, the minimal SCV is close to the minimal SCV of order N – 1 ME distributions. In this paper, we propose numerical procedures by which much higher odd order concentrated ME distributions can be computed, and based on that, we refine the dependence of the minimal SCV on the order.

The rest of the paper is organized as follows. In Section 2, we provide the basic definition of the considered set of functions and the SCV. Section 3 presents an algorithm for efficient SCV computation of exponential cosine-square functions, while Section 4 introduces the numerical optimization procedure to obtain ME functions with low SCV. A heuristic approach with only 3 parameters is proposed in Section 5 and Section 6 concludes the paper.

2. Concentrated ME distributions

In this paper, we focus on real functions and a distribution on R+.

Definition 2.1.

Order N ME functions (referred to as ME(N)) are given by (1) f(t)=α¯eAt(A)1,t0,(1) where α¯ is a real row vector of size N, A is a real matrix of size N × N and 1 is the column vector of ones of size N, α¯ is such that α¯1>0 and f(t)0 for t0.

According to Equation(1), vector α¯ and matrix A define the ME function. We refer to the pair (α¯,A) as a matrix representation in the sequel. The matrix representation is not unique. Applying a similarity transformation with a nonsingular matrix T, for which T1=1 holds, we obtain vector β¯=α¯T1 and matrix B=TAT1, that define the same ME function, since (2) β¯eBt(B)1=α¯T1eTAT1t(TAT1)1=α¯T1TeAtT1T(A)T11=α¯eAt(A)1=f(t).(2)

Definition 2.2.

The SCV of the real function f(t) is defined as (3) SCV(f(t))=μ0μ2μ121,(3) where μi=t=0tif(t)dt for i{0,1,2}.

According to Definition 2.2, the SCV is insensitive to multiplication and scaling, because t=0tic1f(c2t)dt=c1c2i+1μi and (4) SCV(c1f(c2t))=c1c2μ0c1c23μ2(c1c22μ1)21=μ0μ2μ121=SCV(f(t)).(4)

Definition 2.3.

[Asmussen and Bladt[Citation2] and Asmussen and O’Cinneide[Citation3]] The probability density function of an order N ME distribution is an ME(N) function with the following properties: f(t)0 for t0 and t=0f(t)dt=1.

If f(t) is a density function of an ME distribution, then μi denotes the ith moment and SCV(f(t)) the squared coefficient of variation (SCV) of the ME distribution. An ME distribution with density function f(t) is said to be concentrated if SCV(f(t)) is low.

Although ME functions have been used for many decades, there are still many questions open regarding their properties. Such an important question is how to decide efficiently if an ME function is non-negative for all t > 0. In general, f(t)0 does not hold for given (α¯,A) matrix representation for all t > 0, unless it has been constructed to be non-negative. In this paper, we are going to restrict our attention to such a special construction, the exponential-cosine square functions.

For the least varying ME(N) distributions only conjectures are available for N3.[Citation4] According to the current conjecture, for odd N, the density function of the most concentrated ME(N) distribution belongs to a special class characterized by f(t)=c1f+(c2t), where c1 and c2 are positive constants and f+(t) is an exponential cosine-square function defined below.

Definition 2.4.

The set of exponential cosine-square functions of order n has the form (5) f+(t)=eti=1ncos2(ωtϕi2),t0,(5) where ω0 and 0ϕi<π for i{1,,n}.

Since the SCV is insensitive to multiplication and scaling, in the rest of the paper we focus on SCV(f+(t)) with f+(t) as defined in Equation(5) (that is, without normalization and scaling).

An exponential cosine-square function is defined by n + 1 parameters: ω and ϕi for i{1,,n}. f+(t) is an ME(2n+1) function. Although the representation in Equation(5), which we refer to as the cosine-square representation, is not a matrix representation, [Horváth et al.,[Citation7] Appendix A] presents the associated matrix representation of size N=2n+1. Consequently, the set of exponential cosine-square functions of order n define a special subset of ME(2n+1) since f+(t), by construction, is non-negative. The SCV of f+(t) is a complicated function of the parameters, whose minimum does not exhibit a closed analytic form. That is why we have resorted to the following numerical problem. For a given odd order N=2n+1, we are looking for efficient numerical methods for finding the ω and ϕi (i{1,,n}) parameters which result in a low SCV(f+(t)). For efficient numerical minimization of the SCV for N > 47 (i.e., n > 23) we need

  1. an accurate computation of the SCV based on the parameters with low computational cost and

  2. an efficient optimization procedure with low computational cost.

In this paper we present a method that addresses i) in Section 3, and one that addresses ii) in Section 4.

3. Efficient computation of the squared coefficient of variation

To evaluate the objective function of the optimization problem, namely the SCV, we need efficient methods to compute μ0, μ1 and μ2. Deriving the μi parameters based on Equation(5) is difficult for large N. Hence we propose to compute them based on a different representation.

3.1. The hyper-trigonometric representation

The following theorem defines the hyper-trigonometric form of the exponential cosine-square functions and provides a recursive procedure to obtain its parameters from ω and ϕi,i{1,,n}.

Theorem 3.1.

An order N=2n+1 exponential cosine-square function can be transformed to a hyper-trigonometric representation of form (6) f+(t)=  c(n)·et+etk=1nak(n)cos(kωt)+etk=1nbk(n)sin(kωt),(6) where t,ω0,c(n)=12a0(n) and the coefficients ak(n),bk(n) are calculated recursively:

  • for n = 1: (7) a0(1)=1,  b0(1)=0,  a1(1)=12cosϕ1,  b1(1)=12sinϕ1,(7)

  • for k=0,n1: (8) a0(n)=12a0(n1)+12a1(n1)cosϕn+12b1(n1)sinϕn,(8) (9) b0(n)=0,(9)

  • for 1kn,n2 (10) ak(n)=12ak(n1)+12ak1(n1)+ak+1(n1)2cosϕn+12bk+1(n1)bk1(n1)2sinϕn,(10) (11) bk(n)=12bk(n1)+12bk1(n1)+bk+1(n1)2cosϕn+12ak1(n1)ak+1(n1)2sinϕn(11)

where ak(n)=bk(n)=0 for k > n.

Proof.

The theorem is proved by induction. For the order N=2n+1 cosine-square function we introduce the notation f(n)(t)=i=1ncos2(ωtϕi2). For n = 1, f(1)(t)=cos2(ωtϕ12)=12(1+cos(ωtϕ1))=12(1+cos(ωt)cos(ϕ1)+sin(ωt)sin(ϕ1)), and the theorem holds. Assuming that the theorem holds for n – 1, we are going to show that it holds for n as well. We have that f(n)(t)=f(n1)(t)cos2(ωtϕn2)=f(n1)(t)(12+12cos(ωtϕn))=f(n1)(t)/2+f(n1)(t)cos(ωt)cos(ϕn)/2+f(n1)(t)sin(ωt)sin(ϕn)/2=c(n1)/2+k=1n1ak(n1)cos(kωt)/2+k=1n1bk(n1)sin(kωt)/2+c(n1)cos(ωt)cos(ϕn)/2+c(n1)sin(ωt)sin(ϕn)/2+k=1n1ak(n1)cos(kωt)cos(ωt)cos(ϕn)/2+k=1n1ak(n1)cos(kωt)sin(ωt)sin(ϕn)/2+k=1n1bk(n1)sin(kωt)cos(ωt)cos(ϕn)/2+k=1n1bk(n1)sin(kωt)sin(ωt)sin(ϕn)/2.

Using the identities cos(kωt)cos(ωt)=cos((k+1)ωt)/2+cos((k1)ωt)/2,sin(kωt)sin(ωt)=cos((k+1)ωt)/2+cos((k1)ωt)/2,cos(kωt)sin(ωt)=sin((k+1)ωt)/2sin((k1)ωt)/2,sin(kωt)cos(ωt)=sin((k+1)ωt)/2+sin((k1)ωt)/2, and collecting the coefficients corresponding to cos(kωt) and sin(kωt) provides Equation(10) and Equation(11). Terms from cos((k1)ωt) at k = 1 contribute to c(n), leading to c(n)=12a0(n). The relations for the boundaries can be obtained in a similar manner. □

The hyper-trigonometric representation makes it possible to express the Laplace transform (LT) and the μi moments for i{0,1,2} in a simple and compact way.

Corollary 3.2.

The LT and the μi,i{0,1,2}, moments of the exponential cosine-square function, for sC, are given by (12) f*(s)=0estf+(t) dt=c(n)1+s+k=1nak(n)(1+s)+bk(n)kω(1+s)2+(kω)2,(12) and (13) μ0=c(n)+k=1nak(n)+bk(n)kω1+(kω)2,μ1=c(n)+k=1nak(n)+2bk(n)kωak(n)(kω)2(1+(kω)2)2,μ2=2c(n)+k=1n2ak(n)+6bk(n)kω6ak(n)(kω)22bk(n)(kω)3(1+(kω)2)3.(13)

Proof.

Since Equation(6) is linear, it can be Laplace transformed term-by-term using the relations LT(et)=1s+1,LT(etcos(ωt))=s+1(s+1)2+ω2,LT(etsin(ωt))=ω(s+1)2+ω2.

Based on f*(s), the μi moments can be computed using the LT moment relation μi=(1)ididsif*(s)|s=0.

In order to compute the matrix representation of the exponential cosine-square function, we introduce one more representation, and the associated transformation.

3.2. The spectral representation

The hyper-trigonometric representation makes it easy to obtain the spectral form of f+(t) as (14) f+(t)= c(n)et+12k=1n(ak(n)+ibk(n))e(1+ikω)t+(ak(n)ibk(n))e(1ikω)t,(14) where i denotes the imaginary unit. We refer to Equation(14) as the spectral representation because these exponential coefficients are the eigenvalues of matrix A in the matrix representation.

3.3. The matrix representation

Corollary 3.3.

The f+(t)=eti=1ncos2(ωtϕi2) function has a size N=2n+1 matrix representation f+(t)=α¯eAt(A)1, where the row vector α¯ is composed by the elements (15) α1=c(n),(15) (16) α2k=12ak(n)(1+kω)bk(n)(1kω)1+(kω)2,(16) (17) α2k+1=12ak(n)(1kω)+bk(n)(1+kω)1+(kω)2,(17) and the matrix A is given by (18) A=[1Δ1Δn],(18) where Δk=[1kωkω1], for k{1,,n}.

Proof.

According to Equation(18), (19) eAt=[eteΔ1teΔnt](19) where (20) eΔkt=e[1kωkω1]t=[etcos(kωt)etsin(kωt)etsin(kωt)etcos(kωt)].(20)

Based on Equation(19) and Equation(20), we can write (21) α¯eAt(A)1=c(n)et+k=1n[α2kα2k+1]eΔkt[1+kω1kω]=c(n)et+k=1n[α2kα2k+1] [etcos(kωt)etsin(kωt)etsin(kωt)etcos(kωt)]  [1+kω1kω]=c(n)et+k=1net(ak(n)cos(kωt)+bk(n)sin(kωt)),(21)

which is identical with Equation(6). □

3.4. Numerical computation of the moments

Theorem 3.1 together with Corollary 3.2 provides a very efficient explicit method to compute the SCV based on the parameters ω,ϕi,i{1,,n}.

There is one numerical issue that has to be taken care of when applying this numerical procedure with floating point arithmetic for large values of n. To evaluate the SCV, coefficients ak(n),bk(n),c(n) need to be obtained from the ω and ϕi,i{1,,n} parameters. The recursion defined in Theorem 3.1 involves multiplications between bounded numbers (sine and cosine always fall into [1,+1]), which is beneficial from the numerical stability point of view, but subtractions are unfortunately also present, leading to a loss of precision. To overcome this loss of precision, we introduce increased precision floating point arithmetic both in our Mathematica and C++ implementations1.[Citation9,Citation10] Mathematica can quantify the precision loss, enabling us to investigate this issue experimentally. According to , the number of accurate decimal digits lost when evaluating the SCV from the ω,ϕi parameters (computed by the Precision function of Mathematica), denoted by Ln, is nearly linear and can be approximated by (22) Ln1.487+0.647n.(22)

Figure 1. The precision loss while computing the SCV.

Figure 1. The precision loss while computing the SCV.

In the forthcoming numerical experiments we have set the floating point precision to Ln+16 decimal digits to obtain an accuracy of results up to 16 decimal digits, and this precision setting eliminated all numerical issues.

It is important to note that the high precision is needed only to calculate the ak(n),bk(n),c(n) coefficients and the SCV itself. Representing parameters ω,ϕi,i{1,,n} themselves does not need extra precision, and the resulting exponential cosine-square function f+(t) can be evaluated with machine precision as well (in the range of our interest, n1000).

A basic pseudo-code of the computation of the SCV with the indications where high precision is needed is provided by Algorithm 1.

Algorithm 1.

Pseudo-code for the computation of the SCV

  1. procedure ComputeSCV(n,ω,ϕi,i{1,,n})

  2. Compute the required precision, Ln, from Equation(22)

  3. Convert ω,ϕi,i{1,,n} to Ln+16 digits precision

  4. Calculate ak(n),bk(n),k{1,,n},c(n) by Theorem 3.1 (high precision)

  5. Calculate moments μ0,μ1,μ2 according to Equation(13) (machine precision)

  6. Calculate SCV=μ0μ2μ121 (machine precision)

  7. return SCV

  8. end procedure

4. Minimizing the squared coefficient of variation

Given the size of the representation N=2n+1, the f+(t) function providing the minimal SCV is obtained by minimizing Equation(3) subject to ω and ϕi,i{1,,n}. The form of the SCV does not allow a symbolic solution, and its numerical optimization is challenging, too. The surface to optimize has many local optima, hence simple gradient descent procedures failed to find the global optimum and are sensitive to the initial guess.

4.1. Optimizing the parameters

In the numerical optimization of the parameters, we had success with evolutionary optimization methods, in particular with evolution strategies. The results introduced in Horváth et al.[Citation7] were obtained by one of the simplest evolution strategies, the Rechenberg method.[Citation11]. In Horváth et al.,[Citation7] it was the high computational demand of the numerical integration needed to obtain the SCV and its reduced accuracy that prevented the optimization for N > 47 (n > 23).

However, computing the SCV based on the hyper-trigonometric representation using the results of Section 3.1 allows us to evaluate the moments orders of magnitudes faster and more accurately, enabling the optimization for higher n values. With the Rechenberg method (Rechenberg[Citation11], also referred to as (1 + 1)-ES in the literature) it is possible to obtain low SCV values relatively quickly for orders as high as n = 125, but these values are suboptimal in the majority of cases.

With more advanced evolution strategies the optimal SCV can be approached better. Our implementation[Citation9] supports the covariance matrix adoption evolution strategy (CMA-ES[Citation5]), and one of its variants, the BIPOP-CMA-ES with restarts.[Citation6]. Starting from a random initial guess, we got very low SCV values much quicker with the CMA-ES than with the (1 + 1)-ES with similar suboptimal minimum values (cf. ). The limit of applicability of CMA-ES is about n = 180. The best solution (lowest SCV for the given order), however, was always provided by the BIPOP-CMA-ES method, although it was by far the slowest among the three methods we studied. In fact, we believe that BIPOP-CMA-ES returned the global optimum for n{1,,74}, and we investigate the properties of those solutions in the next sections. For n > 74, we can still compute low SCV functions with the BIPOP-CMA-ES method, but its computation time gets to be prohibitive, and we are less confident about the global minimality of the results.

Figure 2. The minimal and the heuristic SCV as a function of order n in log-log scale.

Figure 2. The minimal and the heuristic SCV as a function of order n in log-log scale.

For our particular problem, the running time, T, and the quality of the minimum, Q (how low the SCV is), obtained by the different optimization methods can be summarized as follows TCMAES< T(1+1)ESTBIPOPCMAES,QCMAES Q(1+1)ES<QBIPOPCMAES.

4.2. Properties of the minimal SCV solutions

The minimal SCV values obtained by the BIPOP-CMA-ES optimization, which we conjecture to be optimal for n{1,,74}, are depicted in . Apart from the minimal SCV values of the exponential cosine-square functions, also plots 1/(2n+1) and 2/(2n+1)2, for comparison. The SCV=1/N=1/(2n+1) is known to be the minimal SCV value for PH distributions of order N,[Citation1] which form a subset in the set of order N ME distributions.[Citation2] The 2/N2=2/(2n+1)2 curve is reported to be the approximate decay rate in Horváth[Citation7], up to n = 23 (N = 47).

Figure 3. The minimal SCV of the exponential cosine-square functions as the function of n in log-log scale.

Figure 3. The minimal SCV of the exponential cosine-square functions as the function of n in log-log scale.

indicates that the SCV decreases much faster than 1/N and a bit faster than 2/N2. Indeed, 2/N2 is a good approximation up to n = 23, but the decay seems to decrease below 2/N2 for n > 23.

We suspect that the decrease is asymptotically polynomial (at n), that we checked by plotting the SCVn2.14 function in . While the exponent is determined empirically and might be slightly off, it is visible that the convergence is faster than 1/n2 for the examined range (n < 1000).

Figure 4. The function S(n)=SCVn·n2.14 with logarithmic y axis.

Figure 4. The function S(n)=SCVn·n2.14 with logarithmic y axis.

4.3. The parameters providing the minimal SCV

In this section, we investigate the parameters corresponding to the minimal SCV and provide an intuitive explanation for the observed behavior. depicts the optimal ω parameter as a function of n. It shows a slow decrease with some inhomogeneity around n{14,28,43,60}. suggests that ω tends to 0 as n0, and the inhomogeneity is related to the behavior of the ϕk parameters, as it is detailed below.

Figure 5. The ω parameter providing the minimal SCV in lin-lin and log-log scales.

Figure 5. The ω parameter providing the minimal SCV in lin-lin and log-log scales.

The visual appearance of the optimal ϕk parameters in reveals many more interesting properties. First, since the period of the cosine-square function is π, the ϕk parameters in Equation(5) are 2π periodic. That is, adding an integer times 2π to any of the ϕk parameters does not change the f+(t) function. In we transformed all ϕk parameters to the (π,π) range and depicted ϕk+π instead of ϕk because ϕk+π indicates the location where the cosine-square term with ϕk is zero in the (0,2π) cycle, which is the (0,2π/ω) interval of f+(t). The nth row in depicts n points, which are the optimal ϕk+π values for k=1,,n. In these n points of the (0,2π) cycle f+(t) equals zero. In between these zeros f+(t) has humps.

Figure 6. The location of the ϕk,k=1,,n parameters providing the minimal SCV.

Figure 6. The location of the ϕk,k=1,…,n parameters providing the minimal SCV.

and demonstrate the effect of ϕk on f+(t) for ω = 1, n = 3, ϕi+π{0.1,1,2} for i{1,2,3}. In the (0,2π) interval, both, f+(t) and i=1ncos2(tϕi2), have zeros at 0.1, 1 and 2. depicts f+(t) with the exponential attenuation, while depicts i=1ncos2(tϕi2) without the exponential attenuation. In the sizes of the humps depend on the distance of the neighboring zeros. The hump between 0.1 and 1 is smaller than the one between 1 and 2, which indicates that the closer the neighboring zeros are the smaller the humps are. In , the exponential attenuation also affects the sizes of the humps. With the exponential attenuation the hump between 0.1 and 1 is larger than the one between 1 and 2. Additionally, the SCV is more sensitive to the humps further from the main peak, which motivates the fact that the ϕk+π parameters are more concentrated around 0 to make the function as flat as possible near t = 0, where the exponential attenuation is rather weak. In , right of the main peak, f+(t) is suppressed by the exponential attenuation, while left of the peak the zeros of the cosine square terms keep the function low (c.f. ).

Figure 7. f+(t) with ω = 1, n = 3, ϕi+π{0.1,1,2} for i{1,2,3}.

Figure 7. f+(t) with ω = 1, n = 3, ϕi+π∈{0.1,1,2} for i∈{1,2,3}.

Figure 8. The initial part of i=1ncos2(ωtϕi2) (without exponential attenuation), with ω = 1, n = 3, ϕi+π{0.1,1,2} for i{1,2,3}.

Figure 8. The initial part of ∏i=1n cos 2(ωt−ϕi2) (without exponential attenuation), with ω = 1, n = 3, ϕi+π∈{0.1,1,2} for i∈{1,2,3}.

In , for n < 14, the zeros are located between 0 and 5, which means that in this range of n f+(t) is kept close to zero by the cosine-square functions in the interval (0, 5). It has a peak between 5 and 2π, and the next cycles are suppressed by the exponential attenuation.

For n14, there is a gap in the sequence of ϕk parameters, indicating the location of a spike of f+(t). It means that the exponential attenuation is not strong enough for suppressing f+(t) right after the spike, and the minimal SCV is obtained when some cosine-square terms are used to enforce the suppression beyond the spike.

The number of cosine-square terms used to suppress f+(t) beyond the spike changes at n{14,28,43,60}. These changes result in the small inhomogeneity in the ω values at n{14,28,43,60} in .

According to our intuitive understanding ω tends to 0 as n, because the cosine squared terms are more efficient in suppressing f+(t) than the exponential attenuation, and for large n, the cosine squared terms create a sharp spike inside the (0,2π) cycle (which is the (0,2π/ω) interval), such that the zeros are located on both sides of the spike. The cosine squared terms are 2π/ω periodic and consequently, there is a spike also in the (2π/ω,4π/ω) interval which is suppressed by exponential attenuation. The exponential attenuation in a 2π/ω long interval is e2π/ω. In order to efficiently suppress the spike in the (2π/ω,4π/ω) interval, ω has to be small.

5. Heuristic optimization with 3 parameters

According to the previously discussed approach the number of parameters to optimize increases with n. This drawback limits the applicability of the general optimization procedures to about n74 in the case of BIPOP-CMA-ES and about n180 in the case of the basic CMA-ES. Using these n values the optimization procedure takes several days to terminate on an average PC clocked at 3.4 GHz.

While the f+(t) function obtained this way for n = 180 has an extremely low SCV (105) already, some applications might benefit from ME distributions with even lower SCV. To overcome this limitation we developed a suboptimal heuristic procedure, that aims to obtain low SCV for a given large order n.

Our heuristic procedure has to optimize only three parameters, independent of the order n. The procedure is based on the assumption that the location of the spike in the (0,2π) cycle of the cosine-squared function plays the most important role in the SCV, and the exact values of the ϕk parameters are less important, the only important feature is that the cosine-squared terms characterized by the ϕk parameters should suppress f+(t) uniformly in the (0,2π) cycle – apart from the spike (cf. ). In case of ME distributions, the spike is the dominant mode of the density function.

Figure 9. The spike and the zeros of f+(t).

Figure 9. The spike and the zeros of f+(t).

Based on this assumption we set the ϕk parameters of the cosine-squared terms equidistantly. This way the position of the spike (denoted by p) and its width (w) inside the (0,2π) interval completely define the ϕk values for a given order n.

The distance of the ϕk parameters (d) and the number of ϕk parameters before the spike (i) can be computed from p and w by (23) d=2πwn,   i=pw/2d+12,(23) and for k{1,,n} the ϕk parameters are (24) ϕk={(k1/2)dπif ki,(k1/2)d+wπif k>i.(24)

The obtained heuristic procedure has only 3 parameters to optimize: ω, p and w (see Algorithm 2). The SCV values computed by this heuristic optimization procedure are depicted in for n{10,,74}. The figure indicates that for small n (n < 15) the procedure is inaccurate, but it is not a problem because the minimal SCV can be computed quickly in these cases. For larger n (n15) the SCV provided by the heuristic procedure is less than twice the minimal SCV in the given range. Assuming that this ratio to the optimal SCV remains valid also for n > 74 the heuristic procedure which is applicable up to n = 1000, is an efficient tool to compute highly concentrated ME distributions for large order n values.

Figure 10. The minimal and the heuristic SCV as a function of order n in log-log scale.

Figure 10. The minimal and the heuristic SCV as a function of order n in log-log scale.

Algorithm 2.

The objective function of the heuristic method

  1. procedure ComputeSCV-heuristic(ω,p,w)

  2. Obtain the distance of zeros, d, and threshold i by Equation(23)

  3. Obtain ϕi for i{1,,n} by Equation(24)

  4. Compute SCV by Algorithm 1

  5. return SCV

  6. end procedure

depicts the SCV obtained by the heuristic procedure for large n values, compared with the outputs of the highly accurate BIPOP-CMA-ES and the faster CMA-ES optimization procedures. suggests that the heuristic optimization remains very close to the minimum also for larger n values and the SCV obtained by the heuristic optimization maintains its polynomial decay between n2.1 and n2.2.

5.1. Implementation notes

The reader can validate the results of the paper based on the parameters we computed for a representative set of n values between 1 and 1000 and made available at https://github.com/ghorvath78/iltcme.[Citation10] The file itccme.json contains (among other parameters) the ak(n),bk(n),c(n) parameters for the computed set of n values from which f+(t) can be plotted using Equation(14) (to check the non-negativity of f+(t)), and the SCV can be computed using Equation(13) in any general mathematical programing environments. Some related functions processing the parameters in itccme.json are also available at https://github.com/ghorvath78/iltcme[Citation10] in various programing environments. Additionally, the implementation of the optimization procedure at https://github.com/ghorvath78/cmefinder[Citation9] allows the reader to reproduce the parameters of the CME distributions presented in itccme.json.

The parameter setting of the code in https://github.com/ghorvath78/cmefinder[Citation9] resulted from the trends of the parameters depicted in and , which can also be projected to unexplored orders with some confidence. Our numerical experiments suggest that there are several local optima with similar SCV in the range of initial guesses and outside that range the obtained SCV is several orders of magnitude larger. That is, the optimization is sensitive to the initial guess, but with a proper initial guess it is insensitive to implementation details like the stopping criteria.

6. Conclusion

The paper presents a method to generate ME distributions with very low SCV. The hyper-trigonometric representation of the subset of ME distributions with exponential cosine-squared density function enables the efficient, explicit computation of the squared coefficient of variation. By selecting the appropriate numerical precision and a suitable numerical optimization method, we managed to create ME functions up to order n = 1000 with very low SCV (<106 for n = 1000). Such non-negative, low-SCV matrix exponential functions are important ingredients in several numerical procedures, including the numerical inverse Laplace transform and representing deterministic delay in stochastic models.

Note

Additional information

Funding

This work is partially supported by the OTKA K-123914 and the TUDFO/51757/2019-ITM grants.

Notes

1 In C++ we used the mpfr library for multi-precision computations.

References

  • Aldous, D.; Shepp, L. The least variable phase type distribution is Erlang. Stoch. Models 1987, 3, 467–473.
  • Asmussen, S.; Bladt, M. Renewal theory and queueing algorithms for matrix-exponential distributions. In Matrix-Analytic Methods in Stochastic Models. Volume 183 of Lecture Notes in Pure and Applied Mathematics; Alfa, A. S., Chakravarthy, S. R., Eds.; New York: Marcel Dekker, 1996; pp 313–341.
  • Asmussen, S.; O’Cinneide, C. A. Matrix-exponential distributions – Distributions with a rational Laplace transform. In Encyclopedia of Statistical Sciences; Kotz, S.; Read, C., Eds.; John Wiley & Sons: New York, 1997; pp 435–440.
  • Éltető, T.; Rácz, S.; Telek, M. Minimal coefficient of variation of matrix exponential distributions. Presented at 2nd Madrid Conference on Queueing Theory, Madrid, Spain, Jul 2006.
  • Hansen, N. The CMA evolution strategy: A comparing review. In Towards a New Evolutionary Computation; Lozano, J. A., Larrañaga, P., Inza, I., Bengoetxea, E., Eds.; Springer: Berlin, Heidelberg, 2006; pp 75–102.
  • Hansen, N. Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed. Presented at Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, ACM, Montreal, Québec, Canada, July 8–12, 2009, pp 2389–2396.
  • Horváth, I.; Sáfár, O.; Telek, M.; Zámbó, B. Concentrated matrix exponential distributions. In European Workshop on Performance Engineering; Fiems, D., Paolieri, M., Platis, A. N., Eds.; Cham, Switzerland: Springer, 2016; pp 18–31.
  • Horváth, I.; Talyigás, Z.; Telek, M. An optimal inverse Laplace transform method without positive and negative overshoot – An integral based interpretation. Electron. Notes Theor. Comput. Sci. 2018, 337, 87–104. DOI: 10.1016/j.entcs.2018.03.035.
  • Optimization procedure for finding ME(n) with low SCV. https://github.com/ghorvath78/cmefinder [accessed Sep 22, 2019].
  • Parameters of ME(n) distributions with low SCV. https://github.com/ghorvath78/iltcme [accessed Sep 22, 2019].
  • Rechenberg, I. Evolutionsstrategien. In Simulationsmethoden in der Medizin und Biologie; Schneider, B., Ranft, U., Eds.; Springer-Verlag: Berlin, Heidelberg, 1978; pp 83–114.