Testing the mean of skewed distributions applying the maximum likelihood estimator


Abstract

The sample third central moment can be used to estimate the population third central moment, $μ_{3}$ , in Johnson’s modified t-statistic for skewed distributions. However, the moment estimator is non-unique and is not sufficient for the population parameter. In this paper, we present the maximum likelihood estimator (MLE) of $μ_{3}$ in the modified t-statistic when the parent distribution is asymmetrical. A Monte Carlo study shows that the MLE procedure is more powerful than Student’s t-test and the ordinary Johnson’s modified t-test for a variety of positively skewed distributions with small sample sizes.


PUBLIC INTEREST STATEMENT

The effect of the skewness of a random variable on test statistics has been a popular research topic in statistics. Student’s t-test is commonly adopted to test a null hypothesis about a mean. However, Student’s t-test may lose power when the research focuses on positively skewed data. This study proposes Johnson’s modified t-test with the maximum likelihood estimator (MLE) of the third central moment for positively skewed data. After controlling the Type I error, the proposed Johnson’s modified t-test could be more significant than Student’s t-test in a demonstration using laboratory mice data. Johnson’s modified t-test with the MLE procedure is worth recommending for a variety of positively skewed distributions with small sample sizes.

1. Introduction

The central limit theorem is widely used when a random sample is drawn from a non-normal population with mean $μ$ and variance $σ^{2}$ . Suppose that the mean $μ$ of a population is to be estimated. In practice, a random sample of size $n$ is typically taken from the population, and the sample mean is computed to estimate $μ$ . The sample mean is a random variable: it varies from sample to sample and cannot be predicted deterministically. The notation $\overset{ˉ}{X}$ is used when the sample mean is regarded as a random variable, and $X_{i}$ , where $i = 1, 2, . . ., n$ , denotes the corresponding observations. The random variable $\overset{ˉ}{X}$ follows a sampling distribution with mean $μ_{\overset{ˉ}{X}}$ and standard deviation $σ_{\overset{ˉ}{X}}$ . According to the central limit theorem, the distribution of the sample mean $\overset{ˉ}{X}$ can be approximated by a normal distribution with mean $μ_{\overset{ˉ}{X}}$ = $μ$ and standard deviation $σ_{\overset{ˉ}{X}}$ = $σ / \sqrt{n}$ for a large sample size $n$ , where $σ$ is the standard deviation of the population. By this theorem, the test statistic $\sqrt{n} (\overset{ˉ}{X} - μ) / σ$ can be used to test the hypothesis that the mean of a non-normal population is $μ$ when the standard deviation $σ$ is known and the sample size is large.

The Student’s t-test was proposed to overcome the inefficiency of the z-test with small samples. The sample variance $S^{2}$ is used for the population variance if $σ^{2}$ is unknown. The Student’s t-test (i.e., $\sqrt{n} (\overset{ˉ}{X} - μ) / S$ ) can be used for hypotheses where the sample standard deviation $S$ is used to estimate $σ$ . It performs well when $σ$ is finite and the sample size is large. It is now assumed that the distribution of a random variable, such as the random variable $\overset{ˉ}{X}$ , should be studied. The first two moments (i.e., the mean and the variance) can be obtained as a step toward understanding the distribution, and the unbiased estimators for the mean and the variance can be obtained from a random sample. However, there are several situations that require higher-order moments. For a scenario where the sample size is small and the parent distribution is asymmetrical (e.g., Gamma distribution), Johnson (Citation1978) proposed a modified procedure for the Student’s t-test using the first few terms of the inverse Cornish–Fisher expansion, proposed by Cornish and Fisher (Citation1937), as follows:

$t = [(\overset{ˉ}{X} - μ) + \frac{μ_{3}}{6 σ^{2} n} + \frac{μ_{3}}{3 σ^{4}} {(\overset{ˉ}{X} - μ)}^{2}] {(\frac{S^{2}}{n})}^{- 1 / 2},$

where $μ_{3}$ is the population third central moment. It can be estimated by the sample third central moment, denoted by ${\hat{μ}}_{3}$ . When the hypothesis $H_{0} : μ_{x} = μ_{0}$ is stated, the ordinary Johnson’s modified t-statistic is

$t_{1} = [(\overset{ˉ}{X} - μ_{0}) + \frac{{\hat{μ}}_{3}}{6 S^{2} n} + \frac{{\hat{μ}}_{3}}{3 S^{4}} {(\overset{ˉ}{X} - μ_{0})}^{2}] {(\frac{S^{2}}{n})}^{- 1 / 2},$

where ${\hat{μ}}_{3} = \sum_{i = 1}^{n} (X_{i} - \overset{ˉ}{X})^{3} / n$ .
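The two statistics defined so far can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `student_t` and `johnson_t1` are hypothetical, and the sample variance $S^{2}$ is assumed here to use the divisor $n - 1$.

```python
import math


def student_t(x, mu0):
    """Student's t statistic: sqrt(n) * (xbar - mu0) / S."""
    n = len(x)
    xbar = sum(x) / n
    # Assumption: S^2 is the unbiased sample variance (divisor n - 1).
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    return math.sqrt(n) * (xbar - mu0) / math.sqrt(s2)


def johnson_t1(x, mu0):
    """Johnson's modified statistic t1 with the sample third central moment."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    # Moment estimator of mu_3 (divisor n, as in the text).
    mu3_hat = sum((xi - xbar) ** 3 for xi in x) / n
    d = xbar - mu0
    adjusted = d + mu3_hat / (6 * s2 * n) + mu3_hat / (3 * s2 ** 2) * d ** 2
    return adjusted / math.sqrt(s2 / n)
```

For a perfectly symmetric sample the moment estimate of $μ_{3}$ is zero, so `johnson_t1` reduces to `student_t`; for a right-skewed sample the two correction terms are positive and shift the statistic upward.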

Under violations of both normality and variance homogeneity, Cressie and Whitford (Citation1986) examined the problem of using the conventional Student’s t-test with inappropriate standard deviation. The Welch's t-test is most frequently used to tackle the violations of classical assumptions. Alternatively, this situation can be improved by correcting the t variables using transformations, such as Johnson’s transformation and Hall’s transformation proposed by Hall (Citation1983).

For the asymmetric distribution of upper-tailed tests, Sutton (Citation1993) verified that Johnson’s $t_{1}$ -test could be used, as Student’s t-test lacks statistical power. Furthermore, it reduces the probability of Type I error. However, Johnson’s $t_{1}$ -test may yield incorrect results if skewness is inflated and the sample size is small.

To test the mean of a positively skewed distribution with the upper-tailed test, Chen (Citation1995) conducted a novel testing procedure using the Edgeworth expansion under several positively skewed distributions, such as Gamma, Weibull, exponential, and lognormal. According to the results of a simulation study, the new test statistic is more powerful than Student’s t-values and Johnson’s $t_{1}$ -values regardless of which positively skewed distribution and critical value were selected.

To test the mean of asymmetric distributions, Johnson (Citation1978) proposed modified t-tests that can be applied to a wide range of parent distributions, from normal to asymmetric distributions, for example, exponential distributions with sample sizes as small as 13. In several real situations, owing to the cost limitations of the sampling procedures, the sample size is small and the deviation of the parent distribution may be larger than that in Johnson’s study. In this case, Johnson’s test may lack accuracy. To resolve this, Sutton (Citation1993) proposed a compound test to improve Johnson’s upper-tailed t-test. Chen (Citation1995) proposed an upper-tailed test for the mean of a positively skewed distribution. According to a Monte Carlo study, Chen’s test proved to be more accurate than Johnson’s modified t-test and Sutton’s compound test for various positively skewed distributions and small samples. The studies above used sophisticated mathematical expansions to improve the accuracy of Johnson’s test.

Diaconis and Efron (Citation1983) described computer-intensive methods that, although time-consuming, can be used to evaluate the small-sample behavior of such modifications in terms of Type I error rate and statistical power. However, relatively few studies have considered the statistical properties of different estimators of $μ_{3}$ in the ordinary Johnson’s modified t-statistic. In this study, the maximum likelihood estimator (MLE) of $μ_{3}$ is proposed in such a modified t-statistic for asymmetrical parent distributions. A Monte Carlo simulation is performed to examine the statistical power of the MLE in the context of Johnson’s modified t-statistic for each scenario. It is demonstrated that this procedure is more powerful than both Student’s t-test and the ordinary Johnson’s modified t-test for a variety of positively skewed distributions and small sample sizes.

2. MLE of μ₃ for the upper-tailed test

Skewness can be used to measure the level of asymmetry of a probability distribution. The skewness coefficient can be positive or negative and is denoted by $γ_{3}$ . It has a greater effect on a t-type variate compared with the kurtosis coefficient. Neyman and Pearson (Citation1928) and Pearson (Citation1928) demonstrated that the power of the short right tail in the sampling distribution of the Student’s t-test is small for upper-tailed tests of the population mean. Sutton (Citation1993) performed a Monte Carlo analysis to examine the statistical properties of Student’s t-test and Johnson’s modified t-test for skewed distributions. Sutton demonstrated that the power performance of Johnson’s modified t-test was better than that of the conventional t-test in several cases. When skewness was high, the Type I error was inaccurate for both tests, as the sample size was not sufficiently large. However, both procedures indicated a tendency for greater accuracy (in the Type I error) with an increase in sample size and a decrease in skewness.

In a field such as statistics, all inventions are necessarily conceptual. MLEs are arguably the most valuable invention in the history of statistics. Although MLEs are often mathematically non-trivial, and the likelihood equations are tractable only for specific distributions, MLEs are still widely used in a large number of models. In general, maximum likelihood estimation may instead be carried out numerically. This study begins with a familiar model, namely, the exponential family, as it is relatively simple from a computational perspective. The definition of the exponential family is as follows:

Definition 2.1. Let $f (x | θ) = exp \{Q (θ) T (x) + c (θ) + h (x)\},$ where $θ \in Ω .$ Suppose $f$ is a probability mass function (or probability density function) that belongs to the one-parameter exponential family with natural parameter space $Ω$ where $Q (θ)$ is called the natural parameter of $f$ , $T (x)$ is called the natural statistic, $c (θ)$ is the cumulant generating function, and $h (x)$ is the carrier density.

For simplicity, it is assumed that the shape parameters are known. Moreover, for completeness, the theorem on MLEs when $f$ belongs to the exponential family with parameter $θ$ is stated as follows:

Theorem 2.1. Let $X_{1}, X_{2}, . . ., X_{n} \overset{iid}{\sim} f (x | θ) = exp \{Q (θ) T (x) + c (θ) + h (x)\} .$ If $\hat{θ}$ is the MLE of the parameter $θ$ , then $Q^{'} (\hat{θ}) \sum_{i = 1}^{n} T (x_{i}) + n c^{'} (\hat{θ}) = 0.$

Here, three positively skewed distributions are considered: (i) a Weibull distribution, (ii) a Gamma distribution, and (iii) an exponential distribution. Of course, they belong to the one-parameter exponential family. The MLEs of the unknown parameters of these distributions are, according to Theorem 2.1, as follows.

Remark: (i) Weibull distribution ( $a, b$ )

(1) The density of a Weibull distribution with variable $x_{i}$ , where $i = 1, 2, . . ., n$ , is given by
$f (x_{i} | a, b) = a b x_{i}^{b - 1} exp (- a x_{i}^{b}) = exp \{log a b + log x_{i}^{b - 1} - a x_{i}^{b}\},$

where $0 < x_{i} < \infty$ and $a, b > 0$ . Furthermore, $Q (a) = - a,$ $T (x_{i}) = x_{i}^{b},$ and $c (a) = log a b$ , where $b$ is known.

(2) The MLE of $a$ satisfies $Q^{'} (\hat{a}) \sum_{i = 1}^{n} T (x_{i}) + n c^{'} (\hat{a}) = - \sum_{i = 1}^{n} x_{i}^{b} + \frac{n}{\hat{a}} = 0$ . Then $\hat{a} = n / \sum_{i = 1}^{n} x_{i}^{b} .$

Remark: (ii) Gamma distribution ( $λ, r$ )

(1) The density of a Gamma distribution with variable $x_{i}$ , where $i = 1, 2, . . ., n$ , is given by $f (x_{i} | λ, r) = \frac{λ}{Γ (r)} (λ x_{i})^{r - 1} e^{- λ x_{i}} = exp \{log \frac{λ^{r}}{Γ (r)} + log x_{i}^{r - 1} - λ x_{i}\},$

where $0 < x_{i} < \infty$ and $r, λ > 0$ . Furthermore, $Q (λ) = - λ,$ $T (x_{i}) = x_{i},$ and $c (λ) = log \frac{λ^{r}}{Γ (r)}$ , where $r$ is known.

(2) The MLE of $λ$ satisfies $Q^{'} (\hat{λ}) \sum_{i = 1}^{n} T (x_{i}) + n c^{'} (\hat{λ}) = - \sum_{i = 1}^{n} x_{i} + \frac{n r}{\hat{λ}} = 0$ . Then $\hat{λ} = n r / \sum_{i = 1}^{n} x_{i} .$

Remark: (iii) Exponential distribution ( $λ$ )

(1) The density of an exponential distribution with variable $x_{i}$ , where $i = 1, 2, . . ., n$ , is given by
$f (x_{i} | λ) = λ e^{- λ x_{i}} = exp \{log λ - λ x_{i}\},$

where $0 < x_{i} < \infty$ and $λ > 0$ . Furthermore, $Q (λ) = - λ,$ $T (x_{i}) = x_{i},$ and $c (λ) = log λ$ .

(2) The MLE of $λ$ satisfies $Q^{'} (\hat{λ}) \sum_{i = 1}^{n} T (x_{i}) + n c^{'} (\hat{λ}) = - \sum_{i = 1}^{n} x_{i} + \frac{n}{\hat{λ}} = 0$ . Then $\hat{λ} = n / \sum_{i = 1}^{n} x_{i} .$
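The closed forms obtained by solving the likelihood equation of Theorem 2.1 can be written as three small Python functions. This is a sketch with hypothetical function names. It uses $c^{'} (a) = 1 / a$ for the Weibull case, $c^{'} (λ) = r / λ$ for the Gamma case (since $c (λ) = log \frac{λ^{r}}{Γ (r)}$ ), and $c^{'} (λ) = 1 / λ$ for the exponential case; the shape parameters $b$ and $r$ are treated as known, as assumed in the text.

```python
def mle_weibull_a(x, b):
    """MLE of a for the Weibull(a, b) density with known shape b: n / sum(x_i^b)."""
    return len(x) / sum(xi ** b for xi in x)


def mle_gamma_rate(x, r):
    """MLE of lambda for the Gamma(lambda, r) density with known shape r: n r / sum(x_i)."""
    return len(x) * r / sum(x)


def mle_exponential_rate(x):
    """MLE of lambda for the exponential density: n / sum(x_i)."""
    return len(x) / sum(x)
```

The exponential distribution is the Gamma distribution with $r = 1$ and the Weibull distribution with $b = 1$ , so the three functions agree in that special case, which gives a quick consistency check.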

According to the invariance property of MLE, it is convenient to derive the MLE of $μ_{3}$ , denoted as ${\hat{μ}}_{3}^{*}$ (see Appendix A) in each case. Then, the test statistic is

$t_{2} = [(\overset{ˉ}{X} - μ) + \frac{{\hat{μ}}_{3}^{*}}{6 S^{2} n} + \frac{{\hat{μ}}_{3}^{*}}{3 S^{4}} {(\overset{ˉ}{X} - μ)}^{2}] {(\frac{S^{2}}{n})}^{- 1 / 2} .$

The decision rule for testing $H_{0} : μ_{x} = μ_{0}$ versus $H_{1} : μ_{x} > μ_{0}$ is to reject $H_{0}$ when $t_{2} > t_{n - 1, α}$ under a significance level of $α$ . The theoretical derivation of $t_{2}$ is provided in Appendix B.
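The statistic and decision rule above can be sketched as follows. The function names are hypothetical, $S^{2}$ is assumed to use the divisor $n - 1$ , and the critical value $t_{n - 1, α}$ is passed in by the caller (for example, from a printed t table) rather than computed, so that the sketch needs only the standard library.

```python
import math


def johnson_t2(x, mu0, mu3_star):
    """Johnson's modified statistic with a distribution-based estimate of mu_3.

    mu3_star is the MLE of the third central moment under the assumed
    parent distribution (for example, 2 / lambda_hat**3 for the exponential).
    """
    n = len(x)
    xbar = sum(x) / n
    # Assumption: S^2 is the unbiased sample variance (divisor n - 1).
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    d = xbar - mu0
    adjusted = d + mu3_star / (6 * s2 * n) + mu3_star / (3 * s2 ** 2) * d ** 2
    return adjusted / math.sqrt(s2 / n)


def reject_upper_tailed(t_stat, critical_value):
    """Reject H0: mu = mu0 in favour of H1: mu > mu0 when t_stat exceeds t_{n-1, alpha}."""
    return t_stat > critical_value
```

With `mu3_star = 0` the statistic reduces to Student's t, and a larger positive `mu3_star` moves the statistic upward, which is the mechanism behind the power gain for positively skewed data.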

3. Monte Carlo simulation

Chen (Citation1995) proposed a new procedure for the upper-tailed test of the means of positively skewed distributions. Monte Carlo analysis can be used to investigate the statistical properties of a new procedure in each case. Here, random samples are generated from positively skewed distributions with a range of $γ_{3}$ values. These distributions are the Weibull ( $a = 1, b = 2$ ), Gamma ( $λ = 1, r = 5.3$ ), Gamma ( $λ = 1, r = 4$ ), Gamma ( $λ = 1, r = 2.3$ ), Gamma ( $λ = 1, r = 1.5$ ), Gamma ( $λ = 1, r = 1.2$ ), and exponential ( $λ = 1$ ) distributions, whose $γ_{3}$ values are 0.63, 0.87, 1.00, 1.32, 1.63, 1.83, and 2.00, respectively.
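The listed skewness coefficients can be reproduced from the moment formulas in Appendix A. The sketch below uses hypothetical function names. For the Gamma distribution, $μ_{3} = 2 r / λ^{3}$ and $σ^{2} = r / λ^{2}$ give $γ_{3} = 2 / \sqrt{r}$ ; for the Weibull distribution, the raw moments $E (X^{i}) = a^{- i / b} Γ (i / b + 1)$ are used.

```python
import math


def weibull_skewness(a, b):
    """Skewness of the Weibull(a, b) density a*b*x^(b-1)*exp(-a*x^b)."""
    m1, m2, m3 = (a ** (-i / b) * math.gamma(i / b + 1) for i in (1, 2, 3))
    variance = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    return mu3 / variance ** 1.5


def gamma_skewness(r):
    """Skewness of the Gamma(lambda, r) distribution; it does not depend on lambda."""
    return 2 / math.sqrt(r)


if __name__ == "__main__":
    print(round(weibull_skewness(1, 2), 2))
    for r in (5.3, 4, 2.3, 1.5, 1.2, 1):  # r = 1 is the exponential case
        print(r, round(gamma_skewness(r), 2))
```

Rounded to two decimals, these functions return the seven $γ_{3}$ values quoted above.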

The test procedures studied are Student’s t-test ( $t$ ) and Johnson’s modified t-tests ( $t_{1}, t_{2}$ ). For all tests, the rejection regions are based on the $t$ -distribution. The notation of the parameters of the distributions is consistent with that in Mood, Graybill, and Boes (Citation1974). In this study, 100,000 Monte Carlo samples were generated for each simulation. The tests are compared under the same conditions (i.e., sample size) to calculate the Type I error rate and the statistical power. For upper-tailed tests, let $μ_{0} = μ_{x} - k σ_{x} / \sqrt{n}$ , where $μ_{x}$ and $σ_{x}$ are the true mean and standard deviation, respectively, and $k$ = 0.5, 1.0, 1.5, 2.0, 2.5 for each scenario.
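A simulation of this kind can be sketched for the exponential case as follows. This is an illustrative outline, not the authors' program: the function names are hypothetical, $S^{2}$ is assumed to use the divisor $n - 1$ , the critical value is supplied by the caller (1.729 is the value quoted in Section 5 for 19 degrees of freedom at the 5% level), and the number of replications in the usage line is far below the 100,000 used in the study. Setting `k = 0` makes $H_{0}$ true and yields Type I error rates; `k > 0` yields power.

```python
import math
import random


def three_statistics(x, mu0):
    """Return (t, t1, t2) for one sample, with t2 using the exponential MLE of mu_3."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # assumed divisor n - 1
    se = math.sqrt(s2 / n)
    d = xbar - mu0

    def johnson(mu3):
        return (d + mu3 / (6 * s2 * n) + mu3 / (3 * s2 ** 2) * d ** 2) / se

    mu3_moment = sum((xi - xbar) ** 3 for xi in x) / n
    mu3_mle = 2 * xbar ** 3  # 2 / lambda_hat^3 with lambda_hat = 1 / xbar
    return d / se, johnson(mu3_moment), johnson(mu3_mle)


def rejection_rates(n, k, crit, reps, seed=0, rate=1.0):
    """Empirical upper-tailed rejection rates of t, t1, t2 for exponential samples."""
    rng = random.Random(seed)
    mu_x = sigma_x = 1 / rate  # exponential: mean and sd both equal 1 / lambda
    mu0 = mu_x - k * sigma_x / math.sqrt(n)
    counts = [0, 0, 0]
    for _ in range(reps):
        x = [rng.expovariate(rate) for _ in range(n)]
        for j, stat in enumerate(three_statistics(x, mu0)):
            if stat > crit:
                counts[j] += 1
    return [c / reps for c in counts]


if __name__ == "__main__":
    print(rejection_rates(n=20, k=0.0, crit=1.729, reps=2000))  # Type I error
    print(rejection_rates(n=20, k=2.0, crit=1.729, reps=2000))  # power
```

Fixing the seed makes the runs reproducible, so that the three tests are compared on exactly the same samples, in line with the same-conditions comparison described above.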

4. Simulation results

Tables 1 and 2 show the empirical Type I error rates for Student’s t-test (the number at the top of each set) and Johnson’s modified t-tests ( $t_{1}$ and $t_{2}$ are the numbers in the middle and at the bottom of each set, respectively). The procedures tend toward more accurate Type I error rates as the sample size increases and skewness decreases. It is evident that the Type I error rates of Student’s t-test may differ from the nominal significance levels of 0.01 and 0.05.

Table 1. Comparison of Type I error rates for Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $H_{0} : μ_{x} = μ_{0}$ is true at $α = 0.01$


Table 2. Comparison of Type I error rates for Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $H_{0} : μ_{x} = μ_{0}$ is true at $α = 0.05$


It should be noted that when the skewness coefficient is less than 2.00 and $n = 20$ , the Type I error rates of the $t$ and $t_{1}$ tests can be approximately doubled when $α = 0.01$ . Furthermore, they can be approximately 50% larger when $α = 0.05$ . However, the Type I error rate for $t_{2}$ shows only slight inflation at the significance level of $0.01$ or $0.05$ when skewness is not severe and the sample size is as small as 20. The inflation of the Type I error rate increases as the sample size increases.

Tables 3 and 4 compare the power of Student’s $t$ -test, Johnson’s modified $t_{1}$ -test, and Johnson’s modified $t_{2}$ -test using the $t$ -critical point ( $t_{n - 1, α}$ ). In all cases, as skewness and the value of $k$ vary, the statistical power of Johnson’s modified $t_{2}$ -test is higher than that of Student’s $t$ -test and Johnson’s modified $t_{1}$ -test.

Table 3. Power comparison of Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $n = 20$ and $H_{1} : μ_{x} = μ_{0} + k σ_{x} / \sqrt{n}$ is true at $α = 0.01$


Table 4. Power comparison of Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $n = 20$ and $H_{1} : μ_{x} = μ_{0} + k σ_{x} / \sqrt{n}$ is true at $α = 0.05$


5. Demonstration using real data

The real data used here to illustrate the t-tests are from an experiment to determine the nitrogen binding capacity of laboratory mice (Dolkart, Halpern, & Perlman, Citation1971). The design comprised a control group of 20 normal mice and an experimental group of 19 diabetic mice. Both groups were treated with bovine serum albumin (BSA) for 28 days. The amount of BSA nitrogen bound was measured on the 29th day in micrograms per milliliter of undiluted mouse serum. The data from the two groups were used to test whether the average amount of BSA nitrogen bound in the normal control group is higher than that in the experimental group (known average binding capacity is 112.72). Both tests $t_{1}$ and $t_{2}$ were used to test $H_{0} : μ_{normal} = 112.72$ against $H_{1} : μ_{normal} > 112.72$ . In this demonstration, we have $γ_{3}$ = 1.504 and kurtosis = 1.976 for the binding capacity of the experimental group. A goodness-of-fit test was used for the distribution fitting, and the result (p-value = 0.426) indicates that there is no significant evidence to reject the null hypothesis. This implies that the experimental group data are consistent with the exponential distribution. The MLE of $μ_{3}$ for $t_{2}$ is therefore computed under the exponential distribution assumption. Then, the data were tested by each of Johnson’s $t$ -tests, and $t_{1}$ = 2.56 and $t_{2}$ = 3.20 were obtained. The values of $t_{1}$ and $t_{2}$ should be compared with the critical value in Student’s t tables for 19 degrees of freedom at a significance level of 5% (i.e., $t_{19, 0.05}$ ). It was found that the data supported $H_{1}$ rather than $H_{0}$ , and thus it is concluded that the normal mice have a significantly higher binding capacity than the diabetic mice at the critical point $t_{19, 0.05}$ = 1.729. The p-values of the tests were also calculated: 0.006 and 0.001, corresponding respectively to $t_{1}$ = 2.56 and $t_{2}$ = 3.20. The p-value of $t_{2}$ is smaller than that of $t_{1}$ , indicating stronger evidence against $H_{0}$ for this dataset.

6. Conclusion and future work

This study was concerned with the MLE of $μ_{3}$ in Johnson’s modified t-test and the $t_{2}$ -test of the means of positively skewed distributions. An empirical study indicated that the $t_{2}$ -test is accurate in terms of the Type I error rate when the sample size is small and skewness is not severe. Moreover, the $t_{2}$ -test is more powerful than the $t$ -test and $t_{1}$ -test given that the sampling distributions are known.

When skewed or known distributions are used, the parameters can be inferred by the MLE method more effectively than by moment estimators, as expected for known distributions. In this study, the distributions were selected with shape and scale parameters, and it was assumed that the shape parameters were known for simplicity in the setting of skewness in the simulations.

In practice, Johnson’s modified t-test is preferable when the distribution is unknown, except for its asymmetry. Therefore, the population third central moment (i.e., $μ_{3}$ ) is estimated by the sample third central moment in the calculation of Johnson’s modified t-test (i.e., frequentist) rather than by distribution-based estimators (such as MLEs). However, one may calculate the skewness coefficient of the empirical data and test them for distribution-based fit before applying the $t_{2}$ -test. It is suggested that both the $t_{1}$ and $t_{2}$ tests be performed and their results be compared for minimally skewed empirical data. Moreover, the $t_{2}$ -statistic greatly depends on the shape of the parent distribution through the goodness of fit test. Furthermore, it involves the scale for the MLE of the parent distribution. To derive a robust and powerful test, future studies should examine another estimator for $μ_{3}$ of Johnson’s modified t-test with fewer restrictions.

Additional information

Funding

The authors received no direct funding for this research.

Notes on contributors

I-Shiang Tzeng

I-Shiang Tzeng is a biostatistician at Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, New Taipei City, Taiwan. In the past years, he was a doctoral researcher in the National Translational Medicine and Clinical Trial Resource Center (NTCRC) composed of Academia Sinica, National Taiwan University and National Yang-Ming University, Taiwan. He served as a bioinformatics and biostatistics consultant in NTCRC. He is also an adjunct assistant professor in the Department of Statistics, National Taipei University, Taiwan. His areas of research include biostatistics and epidemiologic methods, and further studies proposing potentially powerful methods for age-period-cohort (APC) analysis. Furthermore, his research interests include the field of machine learning, from biological issues to medical issues, in all potential applications.

References

Student. (1908). The probable error of a mean. Biometrika, 6, 1–13.
Chen, L. (1995). Testing the mean of skewed distributions. Journal of the American Statistical Association, 90, 767–772.
Cornish, E. A., & Fisher, R. A. (1937). Moments and cumulants in the specification of distributions. Review of the International Statistics Institute, 5, 307–327.
Cressie, N. A. C., & Whitford, H. J. (1986). How to use the two sample t test? Biometrical Journal, 28, 131–148.
Diaconis, P., & Efron, B. (1983). Computer-intensive methods in statistics. Scientific American, 248, 116–130.
Dolkart, R. E., Halpern, B., & Perlman, J. (1971). Comparison of antibody responses in normal and alloxan diabetic mice. Diabetes, 20, 162–167.
Hall, P. (1983). Inverting an Edgeworth expansion. The Annals of Statistics, 11, 569–576.
Johnson, N. J. (1978). Modified t tests and confidence intervals for asymmetrical populations. Journal of the American Statistical Association, 73, 536–544.
Mood, A. M., Graybill, F. A., & Boes, D. C. (1974). Introduction to the theory of statistics. New York, NY: John Wiley.
Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference, part I. Biometrika, 20A, 175–240.
Pearson, E. S. (1928). The distribution of frequency constants in small samples from symmetrical populations. Biometrika, 20A, 356–360.
Sutton, C. D. (1993). Computer-intensive methods for tests about the mean of an asymmetrical distribution. Journal of the American Statistical Association, 88, 802–810.
Welch, B. L. (1947). The generalization of “Student’s” problem when several different population variances are involved. Biometrika, 34, 28–35.

Appendix A

(i) Weibull ( $a, b$ )

First, initial moments can be calculated using $E (X^{i}) = {(\frac{1}{a})}^{i / b} Γ (\frac{i}{b} + 1), i = 1, 2, 3.$

Second, $μ_{3} = E (X - μ)^{3} = E (X^{3}) - 3 E (X^{2}) E (X) + 2 E^{3} (X)$

= {(\frac{1}{a})}^{3 / b} [Γ (\frac{3}{b} + 1) - 3 Γ (\frac{2}{b} + 1) Γ (\frac{1}{b} + 1) + 2 Γ^{3} (\frac{1}{b} + 1)] .

Hence, the MLE of $μ_{3}$ , denoted by $μ_{3}^{*}$ , is ${(\frac{1}{\hat{a}})}^{3 / b} [Γ (\frac{3}{b} + 1) - 3 Γ (\frac{2}{b} + 1) Γ (\frac{1}{b} + 1) + 2 Γ^{3} (\frac{1}{b} + 1)],$

where $\hat{a}$ is the MLE of $a$ .

(ii) Gamma ( $λ, r$ )

First, initial moments can be calculated using $E (X^{i}) = \frac{r (r + 1) \dots (r + i - 1)}{λ^{i}}, i = 1, 2, 3.$

Second, $μ_{3} = E (X - μ)^{3} = E (X^{3}) - 3 E (X^{2}) E (X) + 2 E^{3} (X) = \frac{2 r}{λ^{3}} .$

Hence, the MLE of $μ_{3}$ , denoted by $μ_{3}^{*}$ , is $\frac{2 r}{{\hat{λ}}^{3}},$ where $\hat{λ}$ is the MLE of $λ$ .

(iii) Exponential ( $λ$ )

First, initial moments can be calculated using $E (X^{i}) = \frac{i!}{λ^{i}}, i = 1, 2, 3.$

Second, $μ_{3} = E (X - μ)^{3} = E (X^{3}) - 3 E (X^{2}) E (X) + 2 E^{3} (X) = \frac{2}{λ^{3}} .$

Hence, the MLE of $μ_{3}$ , denoted by $μ_{3}^{*}$ , is $\frac{2}{{\hat{λ}}^{3}},$ where $\hat{λ}$ is the MLE of $λ$ .
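The three expressions for $μ_{3}$ in this appendix can be written as plain functions of the parameters; by the invariance property, evaluating them at the MLE of the scale-type parameter gives the MLE of $μ_{3}$ . The function names below are hypothetical, and the shape parameters $b$ and $r$ are assumed known.

```python
import math


def mu3_weibull(a, b):
    """Third central moment of the Weibull(a, b) distribution."""
    g1, g2, g3 = (math.gamma(i / b + 1) for i in (1, 2, 3))
    return a ** (-3 / b) * (g3 - 3 * g2 * g1 + 2 * g1 ** 3)


def mu3_gamma(lam, r):
    """Third central moment of the Gamma(lambda, r) distribution: 2 r / lambda^3."""
    return 2 * r / lam ** 3


def mu3_exponential(lam):
    """Third central moment of the exponential(lambda) distribution: 2 / lambda^3."""
    return 2 / lam ** 3
```

For example, under the exponential assumption with $\hat{λ} = n / \sum_{i = 1}^{n} x_{i}$ , calling `mu3_exponential` at $\hat{λ}$ gives $2 {\overset{ˉ}{x}}^{3}$ . Setting $b = 1$ or $r = 1$ recovers the exponential value, which serves as a consistency check across the three cases.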

Appendix B

Derivation of $κ$ and $δ$ in $t_{2}$

Let $\overset{ˉ}{X}$ be a random variable that follows a sampling distribution with mean $μ_{\overset{ˉ}{X}}$ = $μ$ and standard deviation $σ_{\overset{ˉ}{X}}$ = $σ / \sqrt{n}$ for a large sample size $n$ , where $σ$ is the standard deviation of the population. First, we consider Student’s t-statistic

t = \frac{\sqrt{n} (\overset{ˉ}{X} - μ)}{S},

where the sample standard deviation $S$ is used to estimate $σ$ . According to the Cornish–Fisher expansion, under the assumption that all moments of the population exist,

C F (\overset{ˉ}{X}) = μ + σ_{\overset{ˉ}{X}}^{} ξ + \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{2}} (ξ^{2} - 1) + O (n^{- \frac{3}{2}}),

where $ξ$ is a random variable that follows a standard normal distribution. Let $μ_{3}$ be the population third central moment and $μ_{3, \overset{ˉ}{X}}$ be the third central moment of $\overset{ˉ}{X}$ , which is equal to $μ_{3} / n^{2}$ ; then

C F (\overset{ˉ}{X}) = μ + \frac{σ}{\sqrt{n}} ξ + \frac{μ_{3}}{6 n σ^{2}} (ξ^{2} - 1) + O (n^{- \frac{3}{2}}),

C F (t) = ξ + (\frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ) ξ^{2} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}} - \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] ξ η .

The Cornish–Fisher expansion of $S^{2}$ , ignoring higher-order terms, is

$C F (S^{2}) = σ_{\overset{ˉ}{X}}^{2} + (\sqrt{\frac{μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4}}{n}}) η = σ_{\overset{ˉ}{X}}^{2} + σ_{\overset{ˉ}{X}}^{2} (\sqrt{\frac{μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4}}{n σ_{\overset{ˉ}{X}}^{4}}}) η = σ_{\overset{ˉ}{X}}^{2} [1 + (\sqrt{\frac{μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4}}{n σ_{\overset{ˉ}{X}}^{4}}}) η] .$ Let $η = ρ ξ + ξ^{*}$ , $ξ^{*}$ be a normal variable independent of $ξ$ . Replacing the values of $\overset{ˉ}{X}$ and $S_{}^{2}$ by their respective expansions and rewriting $η = ρ_{} ξ + ξ^{*}$ , where $ρ = \frac{μ_{3, \overset{ˉ}{X}}}{\sqrt{σ_{\overset{ˉ}{X}}^{2} (μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}}$ is the correlation between $\overset{ˉ}{X}$ and $S^{2}$ , the Cornish–Fisher expansion of $t_{}$ is

C F (t) = ξ + (\frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ) ξ^{2} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}}

- \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] (ξ^{2} ρ + ξ ξ^{*}),

where $ρ = \frac{μ_{3, \overset{ˉ}{X}}}{\sqrt{σ_{\overset{ˉ}{X}}^{2} (μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}}$ and $μ_{4, \overset{ˉ}{X}}$ is the fourth central moment of $\overset{ˉ}{X}$ .

Substituting $ρ = \frac{μ_{3, \overset{ˉ}{X}}}{\sqrt{σ_{\overset{ˉ}{X}}^{2} (μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}}$ into $C F (t)$ gives

C F (t) = ξ + (\frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ) ξ^{2} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}}

- \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] (ξ^{2} \frac{μ_{3, \overset{ˉ}{X}}}{\sqrt{σ_{\overset{ˉ}{X}}^{2} (μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}} + ξ ξ^{*})

= ξ + (\frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ) ξ^{2} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}}

- \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] ξ^{2} \frac{μ_{3, \overset{ˉ}{X}}}{\sqrt{σ_{\overset{ˉ}{X}}^{2} (μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}} - \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] ξ ξ^{*}

= ξ + (\frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} + \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ - \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}}) ξ^{2} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}}

- \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] ξ ξ^{*}

= ξ + (\frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}}) ξ^{2} + \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}}

- \frac{1}{2} [\sqrt{\frac{(μ_{4, \overset{ˉ}{X}} - σ_{\overset{ˉ}{X}}^{4})}{n σ_{\overset{ˉ}{X}}^{4}}}] ξ ξ^{*} .

Select $κ$ and $δ$ through the following constraints:

\frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} κ - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} = 0 \quad \text{and} \quad \frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{κ σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}} = 0

κ = \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} \cdot \frac{\sqrt{n}}{σ_{\overset{ˉ}{X}}} = \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} .

Substituting $κ = \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}}$ into the constant term of $C F (t)$ gives

\frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} \cdot \frac{σ_{\overset{ˉ}{X}}^{}}{\sqrt{n}} = 0

\frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} = 0

\frac{δ \sqrt{n}}{σ_{\overset{ˉ}{X}}} - \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} = 0

δ = \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{3} \sqrt{n}} \cdot \frac{σ_{\overset{ˉ}{X}}}{\sqrt{n}} = \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{2} n} .

Hence, following Johnson’s method, $t$ is modified using $\overset{ˉ}{X}$ and $S^{2}$ as

t = \frac{(\overset{ˉ}{X} - μ) + δ + κ \{{(\overset{ˉ}{X} - μ)}^{2} - \frac{σ_{\overset{ˉ}{X}}^{2}}{n}\}}{\frac{S}{\sqrt{n}}},

where $δ$ and $κ$ are related to $μ_{3, \overset{ˉ}{X}}$ , $σ_{\overset{ˉ}{X}}^{2}$ , and $n$ .

With $δ = \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{2} n}$ and $κ = \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}}$ , we obtain

t = \frac{(\overset{ˉ}{X} - μ) + \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{2} n} + \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} \{{(\overset{ˉ}{X} - μ)}^{2} - \frac{σ_{\overset{ˉ}{X}}^{2}}{n}\}}{\frac{S}{\sqrt{n}}} .

In our study, the modified $t$ variable is expanded and simplified as follows:

t = \frac{(\overset{ˉ}{X} - μ) + \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{2} n} + \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} {(\overset{ˉ}{X} - μ)}^{2} - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} \cdot \frac{σ_{\overset{ˉ}{X}}^{2}}{n}}{\frac{S}{\sqrt{n}}}

= \frac{(\overset{ˉ}{X} - μ) + \frac{μ_{3, \overset{ˉ}{X}}}{2 σ_{\overset{ˉ}{X}}^{2} n} - \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{2} n} + \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} {(\overset{ˉ}{X} - μ)}^{2}}{\frac{S}{\sqrt{n}}}

= \frac{(\overset{ˉ}{X} - μ) + \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{2} n} + \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} {(\overset{ˉ}{X} - μ)}^{2}}{\frac{S}{\sqrt{n}}}

= [(\overset{ˉ}{X} - μ) + \frac{μ_{3, \overset{ˉ}{X}}}{6 σ_{\overset{ˉ}{X}}^{2} n} + \frac{μ_{3, \overset{ˉ}{X}}}{3 σ_{\overset{ˉ}{X}}^{4}} {(\overset{ˉ}{X} - μ)}^{2}] \cdot {(\frac{S^{2}}{n})}^{- \frac{1}{2}} .

Writing $δ = \frac{μ_{3}^{*}}{2 σ^{2} n}$ and $κ = \frac{μ_{3}^{*}}{3 σ^{4}}$ , we represent the above statistic as follows:

t_{2} = [(\overset{ˉ}{X} - μ) + \frac{μ_{3}^{*}}{6 σ^{2} n} + \frac{μ_{3}^{*}}{3 σ^{4}} {(\overset{ˉ}{X} - μ)}^{2}] \cdot {(\frac{S^{2}}{n})}^{- \frac{1}{2}} .

Use the MLEs ${\hat{μ}}_{3}^{*}$ and ${\hat{σ}}^{2}$ to estimate $μ_{3}^{*}$ and $σ^{2}$ , respectively. Then

t_{2} = [(\overset{ˉ}{X} - μ) + \frac{{\hat{μ}}_{3}^{*}}{6 {\hat{σ}}^{2} n} + \frac{{\hat{μ}}_{3}^{*}}{3 {\hat{σ}}^{4}} {(\overset{ˉ}{X} - μ)}^{2}] \cdot {(\frac{S^{2}}{n})}^{- \frac{1}{2}} .

The MLE ${\hat{σ}}^{2}$ is equal to $\frac{\sum_{i = 1}^{n} {(x_{i} - \overset{ˉ}{x})}^{2}}{n} = S^{2}$ , and therefore

t_{2} = [(\overset{ˉ}{X} - μ) + \frac{{\hat{μ}}_{3}^{*}}{6 S^{2} n} + \frac{{\hat{μ}}_{3}^{*}}{3 S^{4}} {(\overset{ˉ}{X} - μ)}^{2}] \cdot {(\frac{S^{2}}{n})}^{- \frac{1}{2}} .
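The statistic above can be evaluated directly from a sample. The following is a minimal sketch, not the author's code: the function name `t2_statistic` is hypothetical, and `mu3_hat` is assumed to be supplied by the caller, since the MLE of $μ_{3}^{*}$ depends on the parent distribution. $S^{2}$ uses the divisor $n$, as stated in the text.

```python
import math


def t2_statistic(sample, mu0, mu3_hat):
    """Modified t statistic t_2 for testing the mean against mu0.

    mu3_hat is a caller-supplied estimate of the third central moment
    (an assumption of this sketch; its MLE depends on the parent distribution).
    """
    n = len(sample)
    xbar = sum(sample) / n
    # S^2 with divisor n, matching the MLE of the variance used in the text.
    s2 = sum((x - xbar) ** 2 for x in sample) / n
    if s2 == 0:
        raise ValueError("sample variance is zero; t_2 is undefined")
    d = xbar - mu0
    numerator = d + mu3_hat / (6 * s2 * n) + mu3_hat / (3 * s2**2) * d**2
    return numerator / math.sqrt(s2 / n)


# Hypothetical data, for illustration only.
data = [1.0, 2.0, 3.0, 4.0]
print(t2_statistic(data, mu0=2.0, mu3_hat=0.0))
print(t2_statistic(data, mu0=2.0, mu3_hat=1.0))
```

With `mu3_hat = 0` both correction terms vanish and the statistic reduces to $(\overset{ˉ}{X} - μ) / \sqrt{S^{2} / n}$; a positive `mu3_hat` increases the value, which is what matters for an upper-tailed test.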

To demonstrate the use of the $t_{2}$ variable in testing real data, suppose we test the hypothesis $H_{0} : μ = μ_{0}$ against $H_{1} : μ > μ_{0}$ . The rejection criterion is

[(\overset{ˉ}{X} - μ) + \frac{{\hat{μ}}_{3}^{*}}{6 S^{2} n} + \frac{{\hat{μ}}_{3}^{*}}{3 S^{4}} {(\overset{ˉ}{X} - μ)}^{2}] \cdot {(\frac{S^{2}}{n})}^{- \frac{1}{2}} > t_{n - 1, α},

where the critical value $t_{n - 1, α}$ is obtained from Student’s $t$-distribution.
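As a rough illustration of the decision rule, the sketch below compares a value of $t_{2}$ with the upper-tail critical value. The critical values are the standard tabulated ones for 19 degrees of freedom ($n = 20$); the function name and the statistic value 2.10 are made up for the example.

```python
# Upper-tail Student's t critical values for df = 19 (n = 20), from standard tables.
T_CRIT_DF19 = {0.05: 1.729, 0.01: 2.539}


def upper_tailed_decision(t2_value, alpha, table=T_CRIT_DF19):
    """Return True when H0: mu = mu0 is rejected in favour of H1: mu > mu0."""
    return t2_value > table[alpha]


# 2.10 is a hypothetical value of the t_2 statistic.
print(upper_tailed_decision(2.10, 0.05))
print(upper_tailed_decision(2.10, 0.01))
```

For other sample sizes the critical value would have to come from a table or a statistics library rather than from this hard-coded dictionary.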

Table 1. Comparison of type I error rates for Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $H_{0} : μ_{x} = μ_{0}$ is true at $α = 0.01$

Table 2. Comparison of type I error rates for Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $H_{0} : μ_{x} = μ_{0}$ is true at $α = 0.05$

Table 3. Power comparison of Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $n = 20$ and $H_{1} : μ_{x} = μ_{0} + k σ_{x} / \sqrt{n}$ is true at $α = 0.01$

Table 4. Power comparison of Student’s t-test and Johnson’s modified t-tests for upper-tailed rejection areas when $n = 20$ and $H_{1} : μ_{x} = μ_{0} + k σ_{x} / \sqrt{n}$ is true at $α = 0.05$
