Research Article

Why One Cannot Estimate the Entropy of English by Sampling

References

  • Algoet, P. H., & Cover, T. M. (1988). A sandwich proof of the Shannon-McMillan-Breiman theorem. The Annals of Probability, 16, 899–909.
  • Brown, P. F., Della Pietra, V. J., Mercer, R. L., Della Pietra, S. A., & Lai, J. C. (1992). An estimate of an upper bound for the entropy of English. Computational Linguistics, 18, 31–40. Retrieved from http://www.aclweb.org/anthology/J92-1002
  • Chomsky, N. (1957). Syntactic structures. The Hague/Paris: Mouton Publishers.
  • Cover, T. M., & King, R. C. (1978). A convergent gambling estimate of the entropy of English. IEEE Transactions on Information Theory, 24, 413–421. Retrieved from https://protect-us.mimecast.com/s/LL62BWsGEwgoc6?domain=www-isl.stanford.edu
  • Cover, T. M., & Thomas, J. A. (2006). Elements of information theory (2nd ed.). Hoboken, NJ: Wiley.
  • Davies, M. (2008–2012). The corpus of contemporary American English: 450 million words, 1990–present. Retrieved from http://corpus.byu.edu/coca/
  • Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., & Zimmermann, P. (2007). MPFR: A multiple-precision binary floating-point library with correct rounding. ACM Transactions on Mathematical Software, 33(2), 13:1–13:15. doi:10.1145/1236463.1236468
  • Goldreich, O., Sahai, A., & Vadhan, S. (1999). Can statistical zero knowledge be made noninteractive?, or On the relationship of SZK and NISZK. In M. Wiener (Ed.), Advances in cryptology: Proceedings of CRYPTO 1999 Santa Barbara, CA, Vol. 1666 (pp. 467–484). Berlin: Springer-Verlag. doi:10.1007/3-540-48405-1_30
  • Guibas, L. J., & Odlyzko, A. M. (1981). String overlaps, pattern matching, and nontransitive games. Journal of Combinatorial Theory, Series A, 30, 183–208. doi:10.1016/0097-3165(81)90005-4
  • Hilbert, M., & López, P. (2011). The world’s technological capacity to store, communicate, and compute information. Science, 332. doi:10.1126/science.1200970
  • Jacquet, P., Knessl, C., & Szpankowski, W. (2012). Counting Markov types, balanced matrices, and Eulerian graphs. IEEE Transactions on Information Theory, 58, 4261–4272. doi:10.1109/TIT.2012.2191476
  • Kasiski, F. W. (1863). Die Geheimschriften und die Dechiffrir-Kunst. Berlin: E. S. Mittler und Sohn.
  • Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
  • Rosenfeld, R. (2000). Two decades of statistical language modeling: Where do we go from here? Proceedings of the IEEE, 88, 1270–1278. doi:10.1109/5.880083
  • Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423 and 623–656. Reprinted in Claude E. Shannon and Warren Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, IL, 1949.
  • Shannon, C. E. (1949). Communication theory of secrecy systems. Bell System Technical Journal, 28, 656–715.
  • Shannon, C. E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30, 50–64. Retrieved from https://protect-us.mimecast.com/s/kJYMBRfW0J7bcl?domain=princeton.edu
  • von zur Gathen, J. (2015). Cryptoschool (p. 32). Heidelberg: Springer-Verlag.
