Abstract
Many undergraduate students are introduced to frequentist or classical methods of parameter estimation such as maximum likelihood estimation, uniformly minimum variance unbiased estimation, and minimum mean square error estimation in a reliability, probability, or mathematical statistics course. Rossman, Short, and Parks (1998) present some thought-provoking insights on the relationship between Bayesian and classical estimation using the continuous uniform distribution. Our aim is to explore these relationships using the exponential distribution. We show how the classical estimators can be obtained from various choices made within a Bayesian framework.
1. Introduction
The one-parameter exponential distribution is often used to illustrate concepts such as parameter estimation in undergraduate courses in mathematical statistics, probability, and reliability. The exponential density is easy to manipulate analytically and provides a good starting point for discussions of more general distributions. Furthermore, its analytical tractability allows exploration of the relationships between classical and Bayesian estimation.
Consider a random sample of independent observations X1, …, Xn from an exponential distribution with probability density function
$$f(x \mid \theta) = \theta e^{-\theta x}, \qquad x > 0,\ \theta > 0.$$
Among the classical estimators of θ, it is easy to show that the maximum likelihood estimator (MLE) and the uniformly minimum variance unbiased estimator (UMVUE) are $n / \sum_{i=1}^{n} X_i$ and $(n-1) / \sum_{i=1}^{n} X_i$, respectively. In the class of estimators of the form $c / \sum_{i=1}^{n} X_i$, the one that minimizes the mean squared error is $(n-2) / \sum_{i=1}^{n} X_i$.
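As a minimal sketch of these three estimators, the Python code below assumes the rate parametrization f(x | θ) = θ exp(−θx). The function names `classical_estimators` and `relative_mse` are illustrative and do not come from the paper. The second function evaluates the mean squared error of c/ΣXi divided by θ² in closed form, using the fact that ΣXi has a gamma distribution with shape n and rate θ, so the minimizing constant can be checked numerically.

```python
def classical_estimators(sample):
    """Return (MLE, UMVUE, minimum-MSE) estimates of the rate theta.

    Assumes the density f(x | theta) = theta * exp(-theta * x), x > 0.
    """
    n = len(sample)
    total = sum(sample)
    if n < 3 or total <= 0:
        raise ValueError("need at least 3 observations with a positive sum")
    return n / total, (n - 1) / total, (n - 2) / total


def relative_mse(c, n):
    """MSE of the estimator c / sum(X_i), divided by theta**2, for n > 2.

    Uses E[1/S] = theta / (n - 1) and E[1/S^2] = theta^2 / ((n - 1)(n - 2)),
    where S = sum(X_i) has a gamma distribution with shape n and rate theta.
    """
    if n <= 2:
        raise ValueError("the second moment of 1/S requires n > 2")
    return c * c / ((n - 1) * (n - 2)) - 2 * c / (n - 1) + 1


if __name__ == "__main__":
    toy_sample = [1.0, 2.0, 3.0, 4.0]  # hypothetical data, for illustration only
    print(classical_estimators(toy_sample))
    # Compare c = n - 2, n - 1, n for a sample size of 20.
    print([relative_mse(c, 20) for c in (18, 19, 20)])
```

Under these assumptions the relative MSE is a quadratic in c, so comparing c = n − 2, n − 1, and n shows directly why the MLE and the UMVUE are both dominated within this class.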
In Section 2 we consider the problem of estimating the parameter θ using the Bayesian approach. Bayesian estimators derived from an improper prior distribution can be used to derive the classical estimators given above. The technique of deriving the classical estimators from the Bayesian estimator is not new. Rossman, Short, and Parks (1998) present a very helpful paper for teaching connections between Bayesian and classical estimators using the continuous uniform distribution.
The roots of Bayesian analysis lie in Bayes' Theorem:
$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{m} P(A \mid B_i)\,P(B_i)},$$
where A is an event and the Bj's, j = 1, …, m, are mutually exclusive and collectively exhaustive events in a sample space with P(Bj) > 0 for all j. The same results can be translated to random variables, both discrete and continuous. Let U and Y be continuous random variables and let fU(u) be the prior density of U and g(y | u) be the conditional density of Y given U. Bayes' Theorem for continuous random variables then can be represented by
$$g(u \mid y) = \frac{g(y \mid u)\,f_U(u)}{\int g(y \mid u)\,f_U(u)\,du},$$
where g(u | y) is called the posterior density function of U. For more information see Rohatgi (1984) or Berger (1988). In the next section we illustrate the continuous case using the exponential distribution with a single parameter θ. Additionally, interval estimators for θ are compared and an example is given.
2. Derivation of Point and Interval Estimators
Bayesian statistics has traditionally been dominated by the notion of conjugate priors. A class C of prior distributions is a conjugate family for F, where F denotes a class of density functions, if the posterior distribution is also in the class C for all density functions in F and all prior density functions in C. When dealing with conjugate priors, the posterior distribution can be easily calculated. In this section we derive the posterior distribution by using an improper prior distribution for the parameter θ.
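Consider the improper prior distribution (i.e., $\int_0^\infty \pi(\theta)\,d\theta = \infty$) for θ of the form
$$\pi(\theta) \propto \theta^{\alpha - 1} e^{-\beta\theta}, \qquad \theta > 0.$$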
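Notice that this prior distribution is the kernel of a gamma distribution when α > 0 and β > 0. However, such a restriction is not necessary and decreases the flexibility of the resulting parameter estimator. Applying Bayes' Theorem
$$\pi(\theta \mid X_1, \dots, X_n) = \frac{f(X_1, \dots, X_n \mid \theta)\,\pi(\theta)}{f(X_1, \dots, X_n)},$$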
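where f(X1, …, Xn) is the marginal distribution of X, it follows that the posterior distribution of θ is
$$\pi(\theta \mid X_1, \dots, X_n) \propto \theta^{\alpha + n - 1} \exp\Big\{-\theta\Big(\beta + \sum_{i=1}^{n} X_i\Big)\Big\}, \qquad \theta > 0. \tag{1}$$
The posterior distribution π(θ | X1, …, Xn) is proper when α + n > 0 and has a constant of proportionality given by $\big(\beta + \sum_{i=1}^{n} X_i\big)^{\alpha + n} / \Gamma(\alpha + n)$. The estimator is derived by choosing the value $\hat{\theta}$ that minimizes the posterior expected loss $E\big[(\theta - \hat{\theta})^2 \mid X_1, \dots, X_n\big]$ (assuming squared error loss). The Bayes estimator of θ is therefore the posterior mean, given by
$$\hat{\theta}_B = E(\theta \mid X_1, \dots, X_n) = \frac{\alpha + n}{\beta + \sum_{i=1}^{n} X_i}.$$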
The classical estimators derived in Section 1 can be obtained from the Bayes estimator by choosing different values of α and β. If α = 0 and β = 0, then the estimator corresponds to the MLE and the prior distribution is the Jeffreys prior, π(θ) ∝ 1/θ, a standard noninformative prior as well as an improper prior. For more information on the Jeffreys prior see Berger (1988). Choosing α = −1 and β = 0 yields the UMVUE. The minimum mean squared error estimator corresponds to setting α = −2 and β = 0. For α = 1 and β = 0 the prior density function is π(θ) = 1 (the flat improper prior). The resulting estimator in the case of the flat prior is $(n+1) / \sum_{i=1}^{n} X_i$.
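The correspondence between prior choices and classical estimators can be sketched in a few lines of Python. The sketch assumes that the Bayes estimator under squared error loss is the mean of a gamma posterior with shape α + n and rate β + ΣXi. The function name `bayes_estimator`, the dictionary `PRIOR_CHOICES`, and the toy sample are hypothetical and serve only as illustration.

```python
def bayes_estimator(sample, alpha, beta):
    """Posterior mean (alpha + n) / (beta + sum(sample)) under squared error loss.

    Assumes the prior kernel theta**(alpha - 1) * exp(-beta * theta) and the
    density f(x | theta) = theta * exp(-theta * x).
    """
    n = len(sample)
    total = sum(sample)
    if alpha + n <= 0 or beta + total <= 0:
        raise ValueError("posterior is not a proper gamma distribution")
    return (alpha + n) / (beta + total)


# Values of alpha (each paired with beta = 0) discussed in the text.
PRIOR_CHOICES = {
    "minimum MSE": -2,
    "UMVUE": -1,
    "MLE (Jeffreys prior)": 0,
    "flat prior": 1,
}

if __name__ == "__main__":
    toy_sample = [1.0, 2.0, 3.0, 4.0]  # hypothetical data, for illustration only
    for label, alpha in PRIOR_CHOICES.items():
        print(f"{label:22s} alpha = {alpha:2d}  estimate = "
              f"{bayes_estimator(toy_sample, alpha, 0.0):.4f}")
```

Keeping α and β as explicit arguments makes it easy to see that, with β = 0, each unit decrease in α lowers the numerator of the estimate by one.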
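A 100C% confidence interval for a parameter θ is obtained by finding L and U such that P(L < θ < U) = C. When X1, …, Xn are independent and identically distributed exponential random variables, Kapur and Lamberson (1977) show that $2\theta\sum_{i=1}^{n} X_i$ has a chi-square distribution with 2n degrees of freedom. Using this transformation, the interval estimate is developed by solving for θ in
$$P\Big(\chi^2_{(1-C)/2,\,2n} < 2\theta\sum_{i=1}^{n} X_i < \chi^2_{(1+C)/2,\,2n}\Big) = C,$$
where $\chi^2_{p,\,\nu}$ denotes the pth quantile of the chi-square distribution with ν degrees of freedom, and the resulting 100C% confidence interval for θ is
$$\left(\frac{\chi^2_{(1-C)/2,\,2n}}{2\sum_{i=1}^{n} X_i},\ \frac{\chi^2_{(1+C)/2,\,2n}}{2\sum_{i=1}^{n} X_i}\right).$$
The Bayesian analog to the confidence interval is called a credibility interval. In general, a 100C% credibility interval for a parameter θ given a random sample X1, …, Xn is an interval (l(X1, …, Xn), u(X1, …, Xn)) such that
$$P\big(l(X_1, \dots, X_n) < \theta < u(X_1, \dots, X_n) \mid X_1, \dots, X_n\big) = \int_{l(X_1, \dots, X_n)}^{u(X_1, \dots, X_n)} \pi(\theta \mid X_1, \dots, X_n)\,d\theta = C.$$
Kapur and Lamberson (1977) show that $2\theta\big(\beta + \sum_{i=1}^{n} X_i\big)$ has a chi-square distribution with 2(α + n) degrees of freedom. By using the posterior distribution in Equation (1), a 100C% Bayesian credibility interval is easily developed beginning with
$$P\Big(\chi^2_{(1-C)/2,\,2(\alpha+n)} < 2\theta\Big(\beta + \sum_{i=1}^{n} X_i\Big) < \chi^2_{(1+C)/2,\,2(\alpha+n)} \,\Big|\, X_1, \dots, X_n\Big) = C,$$
which gives the interval
$$\left(\frac{\chi^2_{(1-C)/2,\,2(\alpha+n)}}{2\big(\beta + \sum_{i=1}^{n} X_i\big)},\ \frac{\chi^2_{(1+C)/2,\,2(\alpha+n)}}{2\big(\beta + \sum_{i=1}^{n} X_i\big)}\right).$$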
So, when α = 0 and β = 0 the Bayesian and classical interval estimates are the same.
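The interval calculation can be sketched in Python using only the standard library. The sketch assumes an equal-tailed interval, an integer value of α with α + n > 0, and the chi-square pivot with 2(α + n) degrees of freedom. Because the degrees of freedom are then even, the chi-square distribution function has a closed form (an Erlang sum), and quantiles are found here by bisection. All function names are hypothetical. A statistics library's chi-square quantile function could be substituted for `chi2_quantile_even_df`.

```python
import math


def chi2_cdf_even_df(x, k):
    """CDF at x of a chi-square variable with 2k degrees of freedom (k >= 1 integer)."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, partial_sum = 1.0, 1.0
    for j in range(1, k):
        term *= half / j          # (x/2)^j / j!
        partial_sum += term
    return 1.0 - math.exp(-half) * partial_sum


def chi2_quantile_even_df(p, k):
    """p-th quantile of a chi-square variable with 2k degrees of freedom, by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_df(hi, k) < p:   # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even_df(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def credibility_interval(sample, alpha, beta, level):
    """Equal-tailed interval for theta; alpha = 0, beta = 0 gives the classical interval."""
    k = alpha + len(sample)
    if not isinstance(alpha, int) or k <= 0:
        raise ValueError("this sketch needs an integer alpha with alpha + n > 0")
    scale = 2.0 * (beta + sum(sample))
    lower = chi2_quantile_even_df((1.0 - level) / 2.0, k) / scale
    upper = chi2_quantile_even_df((1.0 + level) / 2.0, k) / scale
    return lower, upper


if __name__ == "__main__":
    toy_sample = [1.0, 2.0, 3.0, 4.0]  # hypothetical data, for illustration only
    print(credibility_interval(toy_sample, 0, 0.0, 0.95))
```

Restricting α to integers is a simplification of this sketch only; the interval itself is defined for any α with α + n > 0.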
3. Example
Consider the following random sample of cycles to failure (in ten thousands) for 20 heater switches subject to an overload voltage:
0.0100, 0.0340, 0.1940, 0.5670, 0.6010, 0.7120, 1.2910, 1.3670
1.9490, 2.3700, 2.4110, 2.8750, 3.1620, 3.2800, 3.4910, 3.6860
3.8540, 4.2110, 4.3970, 6.4730
These data are from Kapur and Lamberson (1977, p. 240). The accompanying table summarizes the Bayesian point and interval estimates of θ. It also identifies the values of α and the Bayesian interpretation of the prior distribution as well as the corresponding classical counterpart to each point estimate of θ. Notice that negative values of α produce lower values for the posterior mean for β = 0. Negative values of α and positive values of β put more prior weight on the small values of θ, resulting in lower estimates of the posterior mean.
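The point estimates for these data can be recomputed with a short Python sketch. It assumes β = 0 and the posterior mean (α + n)/ΣXi as the Bayes estimator. The variable names are illustrative, and the output is not claimed to reproduce the layout of the paper's table.

```python
# Cycles to failure (in ten thousands) for the 20 heater switches listed above.
switch_data = [
    0.0100, 0.0340, 0.1940, 0.5670, 0.6010, 0.7120, 1.2910, 1.3670,
    1.9490, 2.3700, 2.4110, 2.8750, 3.1620, 3.2800, 3.4910, 3.6860,
    3.8540, 4.2110, 4.3970, 6.4730,
]

n = len(switch_data)
total = sum(switch_data)

# alpha values discussed in Section 2, each with beta = 0 (an assumption of this sketch).
labels = {-2: "minimum MSE", -1: "UMVUE", 0: "MLE", 1: "flat prior"}
estimates = {alpha: (alpha + n) / total for alpha in labels}

if __name__ == "__main__":
    print(f"n = {n}, sum of observations = {total:.4f}")
    for alpha, label in labels.items():
        print(f"alpha = {alpha:2d}  {label:12s}  estimate of theta = {estimates[alpha]:.4f}")
```

Since the denominator is the same for every row, the estimates differ only through α + n, which makes the ordering described in the text easy to verify.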
4. Conclusion
We have shown the relationship of Bayesian estimators of the parameter θ of the one-parameter exponential distribution to three classical estimators, namely the MLE, UMVUE, and minimum MSE estimator. We considered both point and interval estimators. Our Bayesian estimators were derived from an improper prior distribution that is rather general. In practice, α and β are parameters whose values depend on the experimenter's a priori knowledge of the unknown parameter θ and its distribution. An example was used to demonstrate the methods presented and to illustrate how Bayesian methods can yield classical estimators.
Acknowledgements
The authors are very grateful to the editor and to three anonymous referees, all of whom contributed toward the improvement of the manuscript.
References
- Berger, J. O. (1988), Statistical Decision Theory and Bayesian Analysis (2nd ed.), New York: Springer-Verlag.
- Kapur, K. C. and Lamberson, L. R. (1977), Reliability in Engineering Design, New York: John Wiley & Sons, Inc.
- Lindley, D. V. (1972), Bayesian Statistics, A Review, Philadelphia: SIAM.
- Raiffa, H. and Schlaifer, R. (1961), Applied Statistical Decision Theory, Harvard University, Boston: Graduate School of Business Administration.
- Rohatgi, V. K. (1984), Statistical Inference, New York: John Wiley & Sons, Inc.
- Rossman, A. J., Short, T. H. and Parks, M. T. (1998), “Bayes Estimators for the Continuous Uniform Distribution,” Journal of Statistics Education, [Online], 6(3). (http://ww2.amstat.org/publications/jse/v6n3/rossman.html)