Econometric Reviews Volume 33, 2014 - Issue 1-4: Bayesian Inference and Information: In Memory of Arnold Zellner

Original Articles

I Got More Data, My Model is More Refined, but My Estimator is Getting Worse! Am I Just Dumb?

Xiao-Li Meng Department of Statistics, Harvard University, Cambridge, Massachusetts, USA (corresponding author)

Xianchao Xie Department of Statistics, Harvard University, Cambridge, Massachusetts, USA

Pages 218-250 | Published online: 25 Sep 2013

https://doi.org/10.1080/07474938.2013.808567

Abstract

Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994; Statistical Science) and adopting a guiding working model.

Using the example of estimating the autocorrelation ρ under a stationary AR(1) model, we also demonstrate the interaction between model assumptions and observation structures in seeking additional information, as the sampling interval s increases. Furthermore, for a given sample size, the optimal s for minimizing the asymptotic variance of the estimator of ρ is s = 1 if and only if ρ² ≤ 1/3; beyond that region the optimal s increases at the rate of log⁻¹(ρ⁻²) as ρ approaches a unit root, as does the gain in efficiency relative to using s = 1. A practical implication of this result is that the so-called “non-informative” Jeffreys prior can be far from non-informative even for stationary time series models, because here it converges rapidly to a point mass at a unit root as s increases. Our overall emphasis is that intuition and conventional wisdom need to be examined via critical thinking and theoretical verification before they can be trusted fully.
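The threshold ρ² ≤ 1/3 and the log⁻¹(ρ⁻²) rate can be checked numerically with a rough sketch. It is not taken from the article: it assumes 0 < ρ < 1, that observing every s-th point gives an AR(1) series with coefficient φ = ρ^s, that the MLE of φ has large-sample variance (1 − φ²)/n, and that the delta method carries this back to ρ = φ^(1/s). The function names `avar_factor`, `optimal_s`, and `jeffreys_cdf` are hypothetical, and the Jeffreys-type prior here is built only from this large-sample information, which may differ from the exact construction used in the paper.

```python
import math


def avar_factor(rho: float, s: int) -> float:
    """Return n times the assumed large-sample variance of the MLE of rho.

    Under the stated assumptions (delta method on phi = rho**s), this is
    rho**2 * (rho**(-2*s) - 1) / s**2, which reduces to 1 - rho**2 at s = 1.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("this sketch assumes 0 < rho < 1")
    if s < 1:
        raise ValueError("sampling interval s must be a positive integer")
    log_inv_rho_sq = -2.0 * math.log(rho)  # log(rho**-2) > 0
    # expm1 keeps precision when rho is close to a unit root.
    return rho**2 * math.expm1(s * log_inv_rho_sq) / s**2


def optimal_s(rho: float) -> int:
    """Integer sampling interval minimizing avar_factor for fixed sample size.

    The factor is (e**x - 1) / x**2 in x = s * log(rho**-2), up to terms free
    of s; that function falls and then rises, so a forward scan that stops at
    the first non-improvement finds the minimizer.
    """
    s = 1
    while avar_factor(rho, s + 1) < avar_factor(rho, s):
        s += 1
    return s


def jeffreys_cdf(rho: float, s: int) -> float:
    """Prior mass on (0, rho] for a Jeffreys-type prior on (0, 1).

    Assumption: the prior density is proportional to the square root of
    1 / avar_factor(rho, s); integrating it gives (2 / pi) * asin(rho**s).
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("this sketch assumes 0 < rho < 1")
    return (2.0 / math.pi) * math.asin(rho**s)


if __name__ == "__main__":
    for rho in (0.5, 0.6, 0.9, 0.99):
        s_best = optimal_s(rho)
        gain = avar_factor(rho, 1) / avar_factor(rho, s_best)
        print(f"rho={rho}: optimal s={s_best}, efficiency gain={gain:.2f}")
    for s in (1, 10, 50):
        print(f"s={s}: prior mass below rho=0.9 is {jeffreys_cdf(0.9, s):.4f}")
```

Under these assumptions, comparing s = 1 with s = 2 gives a ratio of variances equal to (ρ⁻² + 1)/4, which is at least 1 exactly when ρ² ≤ 1/3, in line with the threshold stated in the abstract. The prior mass below any fixed ρ < 1 shrinks toward zero as s grows, which is one way to see the concentration at a unit root.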

Keywords:

AR(1) model
Estimating equation
Fraction of missing information
Fisher information
Generalized method of moments (GMM)
Jeffreys prior
Non-informative prior
Observation structures
Partial plug-in
Relative information
Self-efficiency
Unit root

JEL Classification:

C130
C140

ACKNOWLEDGMENTS

We thank Editor Ehsan Soofi for the invitation (and for his extraordinary patience) to contribute to this special volume in honor of Professor Arnold Zellner, who was a colleague and friend of one of us (Meng) during his Chicago years (1991–2001). We also thank many colleagues, especially Joseph Blitzstein, Ngai Hang Chan, and Ehsan Soofi for very helpful exchanges and conversations, Alex Blocker, Steven Finch, and Nathan Stein for proofreading and constructive comments, and the National Science Foundation for partial financial support.

Notes

See Prof. Arnold Zellner's CV at http://faculty.chicagobooth.edu/arnold.zellner/more/vita.pdf

The term “data augmentation” (Tanner and Wong, 1987, 32) is also well-known in the EM and MCMC literature, where it refers to creating artificial (missing) data for the purpose of constructing useful statistical algorithms. The connection with the discussion here is that the algorithmic efficiencies of these algorithms are (almost) exactly determined by the amount of augmented Fisher information; see van Dyk and Meng (2001, 2010) for an overview and some detailed investigations.

Tanner, M. A., Wong, W. H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528–540.

van Dyk, D. A., Meng, X.-L. (2001). The art of data augmentation (with discussion). Journal of Computational and Graphical Statistics 10: 1–111.

van Dyk, D. A., Meng, X.-L. (2010). Cross-fertilizing strategies for better EM mountain climbing and DA field exploration: A graphical guide book. Statistical Science 25: 429–449.
