Abstract
Traditional statistical modeling of continuous outcome variables relies heavily on the assumption of a normal distribution. However, in some applications, such as the analysis of microRNA (miRNA) data, normality may not hold. Skewed distributions play an important role in such studies and may yield more robust results in the presence of extreme outliers. We apply a skew-normal (SN) distribution, which is indexed by three parameters (location, scale and shape), in the context of miRNA studies. We developed a test statistic for comparing the means of two conditions in which the normality assumption is replaced by an SN distribution, and we compared its performance with that of other Wald-type statistics through simulations. Two real miRNA datasets were analyzed to illustrate the methods. Our simulation findings showed that the use of an SN distribution can result in improved identification of differentially expressed miRNAs, especially when the data are markedly skewed and when the two groups have different variances. The statistic based on the SN assumption also appeared to perform comparably with other Wald-type statistics irrespective of the sample size or distribution. Moreover, the real dataset analyses suggest that the statistic based on the SN assumption can be used effectively to identify important miRNAs. Overall, the SN-based statistic is useful when the data are asymmetric and when the two groups have different variances.
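As a rough illustration of how the three SN parameters determine the mean being compared, the sketch below evaluates the standard closed-form mean and variance of a skew-normal distribution and cross-checks the mean by numerical integration of the density. The function names and parameter values are hypothetical and chosen only for illustration; this is not the test statistic developed in the paper.

```python
import math


def sn_pdf(x, location, scale, shape):
    """Skew-normal density: (2/scale) * phi(z) * Phi(shape * z), z = (x - location)/scale."""
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi


def sn_mean_var(location, scale, shape):
    """Closed-form mean and variance of SN(location, scale, shape)."""
    delta = shape / math.sqrt(1.0 + shape ** 2)
    mean = location + scale * delta * math.sqrt(2.0 / math.pi)
    var = scale ** 2 * (1.0 - 2.0 * delta ** 2 / math.pi)
    return mean, var


def numeric_mean(location, scale, shape, half_width=10.0, steps=20000):
    """Trapezoid-rule approximation of E[X], used only as a sanity check."""
    lower = location - half_width * scale
    h = 2.0 * half_width * scale / steps
    total = 0.0
    for i in range(steps + 1):
        x = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * sn_pdf(x, location, scale, shape)
    return total * h


# Illustrative parameter values only (not taken from the paper's datasets).
mean, var = sn_mean_var(location=0.0, scale=1.0, shape=4.0)
print(mean, var, numeric_mean(0.0, 1.0, 4.0))
```

With shape equal to zero the expressions reduce to the usual normal mean and variance, which is why a normal-theory comparison of location parameters alone can be misleading when the data are skewed.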
Acknowledgements
A.H. acknowledges Canadian Institutes of Health Research (CIHR) fellowship funding from the Drug Safety and Effectiveness Cross-Disciplinary Training (DSECT) Program. J.B. acknowledges Discovery Grant funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant number 293295-2009) and funding from CIHR (grant number 84392). J.B. holds the John D. Cameron Endowed Chair in the Genetic Determinants of Chronic Diseases, Department of Clinical Epidemiology and Biostatistics, McMaster University. We thank Professor Adelchi Azzalini for contributing ideas on estimating parameters in high-dimensional settings. We also thank two anonymous reviewers and the editor for insightful comments that improved the presentation and clarity of our manuscript.