Statistics
A Journal of Theoretical and Applied Statistics
Volume 49, 2015 - Issue 5
Original Articles

Prior near ignorance for inferences in the k-parameter exponential family

Pages 1104-1140 | Received 09 Apr 2013, Accepted 11 Aug 2014, Published online: 02 Oct 2014
 

Abstract

This paper proposes a model of prior ignorance about a multivariate variable based on a set of distributions M. In particular, we discuss four minimal properties that a model of prior ignorance should satisfy: invariance, near ignorance, learning and convergence. Near ignorance and invariance ensure that our prior model behaves as a vacuous model with respect to some statistical inferences (e.g. mean, credible intervals, etc.) and some transformation of the parameter space. Learning and convergence ensure that our prior model can learn from data and, in particular, that the influence of M on the posterior inferences vanishes with increasing numbers of observations. We show that these four properties can all be satisfied by a set of conjugate priors in the multivariate exponential families if the set M includes finitely additive probabilities obtained as limits of truncated exponential functions. The obtained set M is a model of prior ignorance with respect to the functions (queries) that are commonly used for statistical inferences and, because of conjugacy, it is tractable and easy to elicit. Applications of the model to some practical statistical problems show the effectiveness of the approach.
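The near-ignorance and convergence properties described above can be illustrated with a one-parameter toy case. This is only a sketch under stated assumptions, not the paper's general construction: it assumes a Normal likelihood with unknown mean w and unit variance, and a set of priors proportional to exp(ℓw) with ℓ ∈ [−c, c], each viewed as the limit of the same function truncated to [−r, r]. Under these assumptions the posterior for a given ℓ is Normal with mean ȳ + ℓ/n and variance 1/n, so the posterior mean ranges over [ȳ − c/n, ȳ + c/n]. The function names are hypothetical.

```python
import math


def truncated_prior_mean(ell: float, r: float) -> float:
    """Mean of the density proportional to exp(ell * w) truncated to [-r, r].

    Closed form: r * coth(ell * r) - 1 / ell (and 0 for ell == 0).
    """
    if r <= 0:
        raise ValueError("r must be positive")
    if ell == 0:
        return 0.0  # uniform on [-r, r]
    return r / math.tanh(ell * r) - 1.0 / ell


def posterior_mean_bounds(y_bar: float, n: int, c: float) -> tuple[float, float]:
    """Lower and upper posterior means over priors exp(ell * w), ell in [-c, c].

    Assumes a Normal(w, 1) likelihood, so the posterior under exp(ell * w)
    is Normal(y_bar + ell / n, 1 / n).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    if c < 0:
        raise ValueError("c must be non-negative")
    return y_bar - c / n, y_bar + c / n


if __name__ == "__main__":
    # Near ignorance: as the truncation widens, the prior means for
    # ell = -1 and ell = +1 drift towards -infinity and +infinity.
    for r in (10.0, 100.0, 1000.0):
        print(r, truncated_prior_mean(-1.0, r), truncated_prior_mean(1.0, r))

    # Learning and convergence: the interval of posterior means has
    # width 2 * c / n, which shrinks as n grows.
    for n in (1, 10, 100, 1000):
        print(n, posterior_mean_bounds(y_bar=2.0, n=n, c=1.0))
```

In this toy setting the prior bounds on the mean are vacuous in the limit r → ∞, while the posterior bounds depend on the prior set only through the term c/n, which vanishes as the sample size increases.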

AMS Subject Classification:

Acknowledgements

The authors would like to thank the anonymous referees for comments and constructive criticism that helped us to improve the presentation of the paper.

Notes

1. More precisely, $F$ denotes a semigroup of transformations of $W$. That is, each $f \in F$ maps $W$ into itself, and the composition $f_1 f_2$, defined by $f_1(f_2(w))$, is in $F$ whenever $f_1, f_2 \in F$. The semigroup $F$ is Abelian if $f_1 f_2 = f_2 f_1$ whenever $f_1, f_2 \in F$.

2. In this paper we mainly focus on translation invariance. However, for multivariate models, we will impose other invariance properties: invariance to permutations and invariance to representation.

3. Note that $I_{\{A\}}$ is the indicator function of the set $A$, that is, $I_{\{A\}}(x) = 1$ if $x \in A$ and zero otherwise.

4. Equivalently, if $-g$ and $-g(f)$ belong to $G_1$ then $\underline{E}[-g(f)] = \underline{E}[-g]$, which implies that $\overline{E}[g(f)] = \overline{E}[g]$, since $\overline{E}[g] = -\underline{E}[-g]$ for any $g$.

5. We point the reader to [16, Chapter 20] for a general discussion of dominated priors. When the likelihood belongs to the exponential families (the focus of this paper), we may take as a dominated prior any proper conjugate prior, the improper uniform prior or other sufficiently regular priors. The posterior becomes asymptotically Normal in these cases.

6. Let the least term φ of a sequence be a term which is smaller than all but a finite number of the terms which are equal to φ. Then φ is called the lower limit of the sequence.

7. The differences are due to the Jacobians of the transformations.

8. By sufficiently smooth RVBFs we mean functions that are integrable w.r.t. the kernel $\exp(n(\hat{y}_n^T w - b(w)))\exp(\ell^T w)$ for any $\ell \in [-c, c]$, $n \in \mathbb{N}$ and $\hat{y}_n \in \mathrm{Cl}(Y_0)$, with support in $W$, and continuous on a neighbourhood of the point where the posterior relative to the improper uniform prior concentrates for $n \to \infty$.

9. This holds for any $\ell \in (0, c]$. All $\ell \in (0, c]$ are equivalent w.r.t. this property, since all $\exp(\ell w)$ are increasing in $W$ for $\ell > 0$.

10. Notice that this behaviour is in general not monotone and depends on how $\hat{y}_n$ converges with the number of observations.

11. This also holds for the exponential distribution for n0<−1.

12. Since the priors in M are all countably additive, the model also satisfies strong coherence as defined in [6, Chapter 7].

13. In the formulation we need to impose the additional constraint that the argument of the square root is positive. This implies that the parameters ℓ·1 and ℓ·2 cannot vary independently in [−c·1,c·1] and, respectively, [−c·2,c·2]. However, for suitably large n and non-degenerate distributions, the argument of the square root is usually positive for any ℓ·1 and ℓ·2 in the intervals.

14. For Δ=0 the plot actually reports the Type I error.

15. We have chosen this interval in analogy with that of the IEM test. The distribution of the p-values for a different boundary of the ‘no decision zone’ can easily be deduced from Figure (c) and (d).

16. These lower and upper probabilities are obtained by densities in M approaching the extreme priors in Equations (42)–(43).

Additional information

Funding

This work was partly supported by the Swiss NSF [grant number 200021146606/1], [grant number 200020137680/1].
