Abstract
The Herdan-Heaps law and Lotka’s law are two important laws in linguistics and many other fields, and they are often found to hold simultaneously in many languages. The Herdan-Heaps law describes the type-token relation between the number of distinct words and text length. Lotka’s law concerns the fraction of words that occur a given number of times. Utilising a variant of the Simon model, this work demonstrates that if the growth rate of distinct words follows the Herdan-Heaps law, with an exponent in the interval (0,1), then the exponents of Lotka’s law and the Herdan-Heaps law are identical. A two-parameter power-law distribution, namely the Waring distribution, is derived within the framework of Simon’s model. Estimators of the Waring distribution parameters are determined and numerical illustrations are provided.
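As an informal illustration of the setting described above, the following minimal Python sketch simulates a Simon-type text-generation process. It is not the authors' exact variant, which the abstract does not specify. The sketch assumes that at step t a new word is introduced with probability min(1, λ t^(λ-1)), so that the expected number of distinct words grows roughly like t^λ, and that otherwise an existing word is repeated with probability proportional to its current number of occurrences. The function names `simulate_simon_heaps` and `frequency_spectrum` and the parameter `lam` are hypothetical, chosen only for this example.

```python
import random
from collections import Counter


def simulate_simon_heaps(n_tokens, lam, seed=0):
    """Generate a token sequence from a Simon-type process (illustrative assumption).

    At step t a new word appears with probability min(1, lam * t**(lam - 1)),
    an assumed form chosen so that the number of distinct words grows like t**lam.
    Otherwise an earlier token is copied uniformly at random, which selects a
    word with probability proportional to its current occurrence count.
    """
    rng = random.Random(seed)
    tokens = []      # tokens[i] is the integer id of the word at position i
    n_types = 0      # number of distinct words introduced so far
    for t in range(1, n_tokens + 1):
        p_new = min(1.0, lam * t ** (lam - 1.0))
        if n_types == 0 or rng.random() < p_new:
            tokens.append(n_types)   # introduce a brand-new word
            n_types += 1
        else:
            tokens.append(rng.choice(tokens))  # repeat an existing word
    return tokens


def frequency_spectrum(tokens):
    """Return a Counter mapping k to the number of words occurring exactly k times."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())
```

Calling `simulate_simon_heaps(20000, 0.5, seed=1)` and passing the result to `frequency_spectrum` gives the type-token count and the occurrence spectrum of one simulated text. Plotting both on log-log axes is one way to inspect Herdan-Heaps-like and Lotka-like behaviour informally, although a single run of this sketch does not by itself confirm the paper's result.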
Acknowledgements
The work was conducted at Shanghai University, China. This work was supported by the National Natural Science Foundation of China under grant number 61174160; the China Scholarship Council under grant number 201306890025; and the Shanghai Educational Commission Key Project under grant number 14ZS085. The authors are grateful to Professor Ronald Rousseau and Dr Alexander N W Taylor, who kindly discussed several issues and carefully polished the paper. The authors are also grateful to the anonymous referees for their significant and constructive comments and suggestions.