Regular articles

How useful are corpus-based methods for extrapolating psycholinguistic variables?

Pages 1623-1642 | Received 10 May 2014, Accepted 24 Oct 2014, Published online: 19 Feb 2015
