Multivariate Behavioral Research Volume 54, 2019 - Issue 6

What Causes the Mean Bias of the Likelihood Ratio Statistic with Many Variables?

Ke-Hai Yuan, University of Notre Dame; Renmin University of China (correspondence)

Chao Fan, University of Notre Dame

Yanyun Zhao, Renmin University of China

Pages 840-855 | Published online: 08 Apr 2019

https://doi.org/10.1080/00273171.2019.1596060

Abstract

Survey data often contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. However, conventional SEM methods are not crafted to handle data with a large number of variables ($p$). A large $p$ can cause $T_{ml}$, the most widely used likelihood ratio statistic, to depart drastically from the assumed chi-square distribution even with normally distributed data and a relatively large sample size $N$. A key element affecting this behavior of $T_{ml}$ is its mean bias. The focus of this article is to determine the cause of the bias. To this end, empirical means of $T_{ml}$ via Monte Carlo simulation are used to obtain the empirical bias. The most effective predictors of the mean bias are subsequently identified and their predictive utility examined. The results are further used to predict Type I error rates of $T_{ml}$. The article also illustrates how to use the obtained results to determine the required sample size for $T_{ml}$ to behave reasonably well. A real data example is presented to show the effect of the mean bias on model inference as well as how to correct the bias in practice.
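The idea of estimating the mean bias by Monte Carlo simulation can be illustrated with a minimal sketch. It is not the authors' simulation design. To stay self-contained it assumes the simplest possible covariance structure, a fully specified model $\Sigma = I$ with no free parameters, for which (following the common SEM convention of using the unbiased sample covariance $S$) the statistic has the closed form $T_{ml} = (N-1)[\mathrm{tr}(S) - \log|S| - p]$ with $p(p+1)/2$ degrees of freedom. The function name `empirical_mean_bias` and its arguments are hypothetical.

```python
import numpy as np


def empirical_mean_bias(p, n_obs, n_reps=500, seed=0):
    """Monte Carlo estimate of E[T_ml] - df under a correctly specified model.

    Assumption (illustration only): the hypothesized model is Sigma = I with
    no free parameters, so no iterative SEM fitting is needed.
    Returns (empirical bias, degrees of freedom).
    """
    rng = np.random.default_rng(seed)
    df = p * (p + 1) // 2  # nonredundant elements of S, zero free parameters
    stats = np.empty(n_reps)
    for r in range(n_reps):
        # Normally distributed data generated from the hypothesized model.
        x = rng.standard_normal((n_obs, p))
        s = np.cov(x, rowvar=False)  # unbiased sample covariance matrix
        _, logdet = np.linalg.slogdet(s)
        # ML discrepancy between S and Sigma = I, scaled by N - 1.
        stats[r] = (n_obs - 1) * (np.trace(s) - logdet - p)
    return stats.mean() - df, df


if __name__ == "__main__":
    for p in (5, 10, 20):
        bias, df = empirical_mean_bias(p, n_obs=100)
        print(f"p={p:2d}  df={df:3d}  empirical mean bias={bias:.2f}")
```

Under these assumptions the empirical mean tends to exceed the nominal degrees of freedom, and the gap tends to widen as $p$ grows with $N$ held fixed, which is the kind of behavior the article sets out to explain for general SEM models.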

Keywords:

Chi-square distribution
mean bias
small sample size
large number of variables
best-subset regression

Article information

Conflict of Interest Disclosures: Each author signed a form for disclosure of potential conflicts of interest. No authors reported any financial or other conflicts of interest in relation to the work described.

Ethical Principles: The authors affirm having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: This work was supported by Grant SES-1461355 from the National Science Foundation.

Role of the Funders/Sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgments: The authors would like to thank Stephen West, Brenna Gomer and two reviewers for their comments on prior versions of this manuscript. The ideas and opinions expressed herein are those of the authors alone, and endorsement by the authors' institutions or the National Science Foundation is not intended and should not be inferred.

Notes

1 The function $h_{1}(x) = \log(x)$ is regarded as a transformation of $x$ with power 0 in the development of Box and Cox (1964).

2 The selected number of best subsets might be arbitrary, but our experience indicates that the additional gain becomes minimal as we select more subsets. Also, best-subset regression becomes less effective with too many variables being included in the following step that involves product terms.

3 The option “model y = v1-v10/selection = maxR; weight w;” under Proc Reg allows us to select the best predictors from v1 to v10 according to weighted least squares.

4 The variables in these subsets are reported in .

5 Note that the dots corresponding to $r_{w1} = -61.69$ and $r_{w1} = -61.20$ nearly overlap in , and so do the two corresponding to $r_{w1} = -51.73$ and $r_{w1} = -50.91$.

6 We use $\lambda_{j,k}$ to represent the factor loading of the $j$th variable on the $k$th factor. For example, $\lambda_{13,3}$ is the loading of the 13th variable (Straight-Curved Capitals) on the 3rd factor (Speed); including $\lambda_{13,1}$ makes variable 13 also load on the 1st factor (Spatial).

7 We used 5 decimals in order to see the change in p-values with different models and different methods of evaluation.
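The weighted best-subset selection mentioned in Notes 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it performs an exhaustive search over predictor subsets and ranks them by weighted $R^2$, whereas the `selection = maxR` option of Proc Reg is a stepwise-type search that need not examine every subset. The function names `weighted_r2` and `best_subsets` are hypothetical.

```python
from itertools import combinations

import numpy as np


def weighted_r2(X, y, w):
    """R^2 of a weighted least-squares fit of y on X (intercept included)."""
    sw = np.sqrt(w)
    design = np.column_stack([np.ones(len(y)), X])
    # WLS is OLS after scaling each row by the square root of its weight.
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    resid = y - design @ beta
    y_bar = np.average(y, weights=w)
    sse = np.sum(w * resid**2)
    sst = np.sum(w * (y - y_bar) ** 2)
    return 1.0 - sse / sst


def best_subsets(X, y, w, max_size):
    """For each subset size k, return (weighted R^2, column indices) of the best subset."""
    n_pred = X.shape[1]
    best = {}
    for k in range(1, max_size + 1):
        scored = (
            (weighted_r2(X[:, list(cols)], y, w), cols)
            for cols in combinations(range(n_pred), k)
        )
        best[k] = max(scored)  # highest weighted R^2 among size-k subsets
    return best
```

With ten candidate predictors, as in Note 3, the exhaustive search fits at most $2^{10} - 1 = 1023$ models, which is inexpensive. The cost roughly doubles with each additional predictor, consistent with the remark in Note 2 that best-subset regression becomes less effective once many product terms are added.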
