The variances of non-parametric estimates of the cross-sectional distribution of durations: Econometric Reviews: Vol 41, No 10

Abstract

This paper focuses on the link between non-parametric survival analysis and three distributions. The delta method is applied to derive the variances of the non-parametric estimators of three distributions: the distribution of durations (DD), the cross-sectional distribution of ages (CSA) and the cross-sectional distribution of (completed) durations (CSD). The non-parametric estimator of the CSD was defined and derived by Dixon (2012) and used in the generalized Taylor price model (GTE) by Dixon and Le Bihan (2012). The Monte Carlo method is applied to evaluate the variances of the estimators of DD and CSD and how their performance varies with sample size and the censoring of data. We apply these estimators to two data sets: the UK CPI micro-price data and waiting-time data from UK hospitals. Both the estimates of the distributions and their variances are calculated. In both applications, the estimated variances indicate that the DD and CSD estimates are all statistically significant.

Acknowledgments

We are grateful for very helpful comments from Patrick Minford, Kul Luintel, Walter Distaso, seminar participants at Cardiff University and also from participants at the 2018 China Meeting of the Econometric Society. We would also like to thank the editor and referee for their comments and advice.

Notes

1 We use the term distribution as shorthand for discrete probability density function.

2 This is also known as the unconditional hazard function.

3 Durations are censored if we do not observe their beginning (left-censored) or their end (right-censored). It is common practice in survival analysis not to use left-censored data, which is why we focus on right-censored data.

4 We summarise this derivation in the online appendix.

5 DD is NA (not available) when i = 0, because ${\hat{a}}_{0}^{d} = {\hat{S}}_{-1} {\hat{h}}_{0}$ and ${\hat{S}}_{-1}$ is not defined.
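A minimal sketch of how the DD estimate is assembled from the hazard and survival estimates, assuming the convention $\hat{a}_{i}^{d} = \hat{S}_{i-1} \hat{h}_{i}$ for $i \geq 1$ with $\hat{S}_{0} = 1$, so that no entry exists for i = 0. The function name and the counts are hypothetical and chosen only for illustration.

```python
def dd_from_counts(at_risk, exits):
    """Return (hazard, survival, dd) lists for periods i = 1..F.

    at_risk[i-1] is N_i, the number of spells at risk in period i.
    exits[i-1]   is D_i, the number of uncensored spells ending in period i.
    """
    hazard, survival, dd = [], [], []
    s_prev = 1.0  # S_0 = 1: every spell is alive at the start of period 1
    for n_i, d_i in zip(at_risk, exits):
        h_i = d_i / n_i
        hazard.append(h_i)
        dd.append(s_prev * h_i)   # a_i^d = S_{i-1} * h_i
        s_prev *= 1.0 - h_i       # S_i = S_{i-1} * (1 - h_i)
        survival.append(s_prev)
    return hazard, survival, dd


# Hypothetical counts: 100 spells, none censored, all ended by period 3.
hazard, survival, dd = dd_from_counts([100, 60, 30], [40, 30, 30])
```

Because the last hazard in this toy example equals one, the DD estimates sum to one.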

6 The cross-section is length biased, so that the probability of observing a spell is proportional to length. The CSA has an interruption bias, since the spells are incomplete. With a constant hazard, the two biases exactly cancel out. This happens when DD follows a Bernoulli distribution with a hazard rate that is constant (in macroeconomics this is used in the discrete-time Calvo model of pricing).
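The cancellation under a constant hazard can be checked numerically. The sketch below assumes that the CSA is proportional to the survival function, $a_{i}^{A} = S_{i-1} / \sum_{k} S_{k-1}$, and that DD is $S_{i-1} h_{i}$; both conventions and all names are assumptions made for illustration, and the horizon is truncated at a point where the remaining survival probability is negligible.

```python
def dd_and_csa(hazards):
    """Return (dd, csa) for periods i = 1..F given hazards h_1..h_F."""
    s_before = []  # S_{i-1}, survival up to the start of period i
    s = 1.0
    for h in hazards:
        s_before.append(s)
        s *= 1.0 - h
    dd = [sb * h for sb, h in zip(s_before, hazards)]
    total = sum(s_before)
    csa = [sb / total for sb in s_before]  # ages are proportional to survival
    return dd, csa


# Constant hazard of 0.25 over a long (truncated) horizon of 200 periods.
dd, csa = dd_and_csa([0.25] * 200)
max_gap = max(abs(a - b) for a, b in zip(dd, csa))
```

With a constant hazard, `max_gap` is at the level of floating-point error; with a hazard that varies across periods the two distributions generally differ.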

7 Slutsky’s theorem states that if two sequences of random variables or vectors $X_{i}$ and $Y_{i}$ satisfy $X_{i} \overset{d}{\to} X$ and $Y_{i} \overset{p}{\to} c$, then, for a continuous function $f$,

$f(X_{i}, Y_{i}) \overset{d}{\to} f(X, c)$

where $X_{i} \overset{d}{\to} X$ means that $X_{i}$ converges in distribution to the random variable $X$, and $Y_{i} \overset{p}{\to} c$ means that $Y_{i}$ converges in probability to the constant $c$.
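As an illustration of the theorem, the simulation below studentizes a sample mean. Here $X_{n} = \sqrt{n}(\bar{x} - \mu)$ converges in distribution to a normal variable and $Y_{n} = s$ converges in probability to $\sigma$, so the ratio $X_{n} / Y_{n}$ should be approximately standard normal. The exponential data, the sample sizes and the function name are assumptions for this sketch and are not taken from the paper.

```python
import random
import statistics


def studentized_means(n, reps, seed=0):
    """Simulate sqrt(n) * (mean - mu) / s for exponential(1) samples."""
    rng = random.Random(seed)
    mu = 1.0  # an exponential variable with rate 1 has mean 1 and sd 1
    values = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        x_n = (n ** 0.5) * (statistics.fmean(sample) - mu)  # -> N(0, 1) in distribution
        y_n = statistics.stdev(sample)                       # -> 1 in probability
        values.append(x_n / y_n)                             # f(X_n, Y_n) = X_n / Y_n
    return values


t_values = studentized_means(n=200, reps=1000)
print(statistics.fmean(t_values), statistics.stdev(t_values))
```

The printed mean and standard deviation should be close to 0 and 1, which is what the limit $f(X, c) = X / \sigma$ implies.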

8 In large samples, the maximum likelihood estimator ${\hat{S}}_{i}$ is close to the mean value of $S_{i}$, so $S_{i}$ can be replaced by ${\hat{S}}_{i}$ in Greenwood's formula. Likewise, we replace $x_{i}$ by ${\hat{x}}_{i}$ and $y$ by $\hat{y}$.
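A short sketch of the plug-in step, using the standard form of Greenwood's formula, $\widehat{\mathrm{Var}}(\hat{S}_{i}) = \hat{S}_{i}^{2} \sum_{j \leq i} \frac{D_{j}}{N_{j}(N_{j} - D_{j})}$, with the estimated survival function in place of the true one. The function name and counts are hypothetical, and the code assumes $D_{j} < N_{j}$ in every period it is applied to.

```python
def km_with_greenwood(at_risk, exits):
    """Return a list of (S_hat_i, Var_hat(S_hat_i)) for periods i = 1..F.

    at_risk[i-1] is N_i and exits[i-1] is D_i; requires D_i < N_i.
    """
    s_hat = 1.0
    cumulative = 0.0  # running sum of D_j / (N_j * (N_j - D_j))
    out = []
    for n_i, d_i in zip(at_risk, exits):
        s_hat *= 1.0 - d_i / n_i
        cumulative += d_i / (n_i * (n_i - d_i))
        out.append((s_hat, s_hat ** 2 * cumulative))  # S_hat replaces S
    return out


# Hypothetical counts without censoring: 100 spells, 40 exits, then 30 exits.
estimates = km_with_greenwood([100, 60], [40, 30])
```

Without censoring, the Greenwood variance reduces to the binomial variance $\hat{S}(1 - \hat{S}) / N$, here 0.0024 and 0.0021, which gives a quick sanity check.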

9 This method could also be extended to include left-censored data or other data imperfections.

10 The interval $(0, r_{1}]$ can be defined as the “first” period, and $(r_{i - 1}, r_{i}]$ as the “i”-th period. In this case, all the formulae differ slightly from the previous results. For example, the estimator of CSD is $a_{i} = \frac{i S_{u_{i - 1}} h_{u_{i}}}{\sum_{k = 0}^{u_{F}} S_{k}}$.
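The structure of the CSD estimator can be sketched for the simpler case of unit-length periods, where the formula becomes $a_{i} = i S_{i-1} h_{i} / \sum_{k} S_{k}$ with $S_{0} = 1$. This simplification, the function name and the hazards are assumptions for illustration; the interval version in the note replaces the indices by the interval endpoints $u_{i}$.

```python
def csd_from_hazards(hazards):
    """Return the CSD a_1..a_F for hazards h_1..h_F (unit-length periods)."""
    s_before = []  # S_{i-1} for i = 1..F, starting from S_0 = 1
    s = 1.0
    for h in hazards:
        s_before.append(s)
        s *= 1.0 - h
    # Denominator: S_0 + S_1 + ... + S_F, where s now holds S_F.
    denominator = sum(s_before) + s
    return [
        i * sb * h / denominator  # a_i = i * S_{i-1} * h_i / sum_k S_k
        for i, (sb, h) in enumerate(zip(s_before, hazards), start=1)
    ]


# Hypothetical hazards; the final hazard of 1 means every spell ends by period 3.
csd = csd_from_hazards([0.4, 0.5, 1.0])
```

When the final hazard equals one, the numerators sum to the mean duration, which equals the denominator, so the CSD sums to one. The weight i shifts mass towards longer spells relative to DD, reflecting length bias.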

11 The parameters of the exponential distributions of the censored time and the observed time are 0.5 and 2, respectively, so the right-censored proportion of the total sample is given as 0.8 = $\frac{0.5}{2 + 0.5}$. The algebra is shown by Efron (1981).
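For two independent exponential variables, the probability that the censoring time falls before the event time is the censoring rate divided by the sum of the two rates. The note does not say whether 0.5 and 2 are rates or means, so the sketch below evaluates both readings; the function name is hypothetical.

```python
def censored_share(censor_rate, event_rate):
    """P(censoring time < event time) for independent exponential times."""
    return censor_rate / (censor_rate + event_rate)


# Reading 0.5 and 2 as rates of the censoring and event times.
share_if_rates = censored_share(0.5, 2.0)

# Reading 0.5 and 2 as means, so the rates are the reciprocals 2 and 0.5.
share_if_means = censored_share(1 / 0.5, 1 / 2.0)
```

Under the rate reading the share is 0.2; under the mean reading it is 0.8, which matches the proportion stated in the note.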

12 That is, we include the right-censored data in $N_{i}$, but have only uncensored data in $D_{i}$.
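The counting rule can be sketched as follows, assuming each spell is recorded as a hypothetical pair (observed duration, right-censored flag) and that a spell censored at duration d is still counted as at risk in periods 1 to d.

```python
def risk_and_exit_counts(spells, horizon):
    """Return (N, D) lists for periods i = 1..horizon.

    spells is a list of (duration, censored) pairs, where censored is True
    when the end of the spell was not observed.
    """
    at_risk, exits = [], []
    for i in range(1, horizon + 1):
        # N_i: censored and uncensored spells that last at least i periods.
        at_risk.append(sum(1 for d, _ in spells if d >= i))
        # D_i: only uncensored spells that end exactly in period i.
        exits.append(sum(1 for d, censored in spells if d == i and not censored))
    return at_risk, exits


spells = [(1, False), (2, False), (2, True), (3, False)]
at_risk, exits = risk_and_exit_counts(spells, horizon=3)
```

In this toy sample the censored spell raises $N_{1}$ and $N_{2}$ but never enters $D_{i}$, so it lowers the estimated hazards without being treated as a completed duration.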

13 The benchmark value is calculated from the Monte Carlo simulation. It is very close to the true value.

14 The CSA is a special case of the CSD, so we only provide the empirical results for the CSD.

15 In the simulation, the coefficient i in equation (24) is ignored, because i is a constant for each $a_{i}$.

16 As Cox (1990) and Franz (2007) have shown, the delta method is a robust method for calculating the confidence interval for a ratio variable if the coefficient of variation of the denominator, $CV = \sigma / \mu$, is small. In the CPI micro-data, CV = 0.0061795. Since we use the delta method, we can interpret the ratio of the estimator to its standard deviation as a t-statistic and so test whether the estimate is significantly different from zero. In Tian and Dixon (2022), we evaluate the empirical size of the delta approximation for the CSD estimator. The empirical results indicate that the delta method is valid for testing the null hypothesis and constructing the confidence interval for the CSD estimator using critical values from Student's t-distribution.
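A minimal sketch of the delta-method calculation for a ratio $x / y$, using the standard first-order approximation $\mathrm{Var}(x/y) \approx (\mu_{x}/\mu_{y})^{2} \left[ \sigma_{x}^{2}/\mu_{x}^{2} + \sigma_{y}^{2}/\mu_{y}^{2} - 2\,\mathrm{Cov}(x, y)/(\mu_{x}\mu_{y}) \right]$. The function name and the input values are hypothetical; they are not the paper's data, and the variances are those of the numerator and denominator estimators themselves.

```python
def ratio_delta(mean_x, mean_y, var_x, var_y, cov_xy=0.0):
    """Delta-method variance of x / y, denominator CV, and t-statistic."""
    ratio = mean_x / mean_y
    var_ratio = ratio ** 2 * (
        var_x / mean_x ** 2
        + var_y / mean_y ** 2
        - 2.0 * cov_xy / (mean_x * mean_y)
    )
    cv_denominator = var_y ** 0.5 / abs(mean_y)  # CV = sigma / mu
    t_stat = ratio / var_ratio ** 0.5            # estimate over its standard error
    return ratio, var_ratio, cv_denominator, t_stat


# Hypothetical inputs with a small denominator CV of 0.01.
ratio, var_ratio, cv, t_stat = ratio_delta(2.0, 10.0, 0.04, 0.01)
```

The approximation rests on a first-order Taylor expansion around the means, which is why a small coefficient of variation in the denominator matters: it keeps the denominator away from zero, where the expansion breaks down.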

17 As with the CPI data, the CV = 0.01071 is small.

18 As shown in the online appendix, the KM estimator can be derived as a maximum likelihood estimator (MLE), so that ${\hat{S}}_{i - 1}$ converges to the true value $S_{i - 1}$ and the marginal hazard function ${\hat{h}}_{i}$ converges to the true value $h_{i}$. Here, we show that these results can also be derived from the delta method.
