Abstract
We derive a stochastic word length distribution model based on the concept of compound distributions and show its relationships with and implications for Wimmer et al.'s (1994) synergetic word length distribution model.
Notes
1. This might consequently be considered a Bayesian approach, cf. Lee (2009).
2. Approaches that have used these more general alignment models include Galescu & Allen (2001), Jiampojamarn et al. (2007), and Bisani & Ney (2008).
3. Or similar sizes of other segmental writing systems such as Greek, Cyrillic, etc.
4. Taken from the Pascal G2P challenge, http://www.pascal-network.org/Challenges/PRONASYL/
5. If M also takes the value M = 0, then M might be perceived as having a peak at M = 1.
6. Note that compound distributions can be considered a special case of mixed distributions.
7. Here, by we denote polynomial coefficients (cf. Comtet, 1974; Caiado, 2007).
8. The question is what the scope of recursive schemes such as (4) is. To sketch an extreme: if they contained all possible probability distributions, they would certainly be useless as theoretical models.
9. Plus something termed “local modifications” in Wimmer & Altmann (1996) that mainly consists of marginal “corrections” to fit the data at hand, such as displacing probability distributions by 1, etc.