References
- Agerri, Rodrigo, Josu Bermudez, and German Rigau. 2014. “IXA Pipeline: Efficient and Ready to Use Multilingual NLP tools.” In Proceedings of the Ninth International Conference on Language Resources and Evaluation, Vol. 2014, 3823–3828.
- Agerri, Rodrigo, and German Rigau. 2016. “Robust Multilingual Named Entity Recognition with Shallow Semi-supervised Features.” Artificial Intelligence 238 (2): 63–82.
- Agerri, Rodrigo, and German Rigau. 2019. “Language Independent Sequence Labelling for Opinion Target Extraction.” Artificial Intelligence 268: 85–95.
- Agerri, Rodrigo, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, and Eneko Agirre. 2020. “Give Your Text Representation Models some Love: The Case for Basque.” In Proceedings of The 12th Language Resources and Evaluation Conference, 4781–4788.
- Alegria, Iñaki, Nora Aranberri, Pere R. Comas, Víctor Fresno, Pablo Gamallo, Lluis Padró, Iñaki San Vicente, Jordi Turmo, and Arkaitz Zubiaga. 2015. “TweetNorm: a Benchmark for Lexical Normalization of Spanish Tweets.” Language Resources and Evaluation 49 (4): 883–905.
- Al Zamal, Faiyaz, Wendy Liu, and Derek Ruths. 2012. “Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors.” In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 270, 2012.
- Baldwin, Timothy, Marie-Catherine de Marneffe, Bo Han, Young-Bum Kim, Alan Ritter, and Wei Xu. 2015. “Shared Tasks of the 2015 Workshop on Noisy User-Generated Text: Twitter Lexical Normalization and Named Entity Recognition.” In Proceedings of the Workshop on Noisy User-generated Text, 126–135.
- Basile, Valerio, Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso, and Manuela Sanguinetti. 2019. “SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter.” In Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval 2019), 54–63.
- Bastian, Mathieu, Sebastien Heymann, and Mathieu Jacomy. 2009. “Gephi: An Open Source Software for Exploring and Manipulating Networks.” Proceedings of the International AAAI Conference on Web and Social Media 8 (2009): 361–362.
- Blondel, Vincent D., Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. “Fast Unfolding of Communities in Large Networks.” Journal of Statistical Mechanics: Theory and Experiment 2008 (10): P10008.
- Bojanowski, Piotr, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. “Enriching Word Vectors with Subword Information.” Transactions of the Association for Computational Linguistics 5: 135–146.
- Brown, Peter F., Peter V. Desouza, Robert L. Mercer, Vincent J. Della Pietra, and Jenifer C. Lai. 1992. “Class-based N-gram Models of Natural Language.” Computational Linguistics 18 (4): 467–479.
- Cesare, Nina, Christan Grant, and Elaine O. Nsoesie. 2017. “Detection of User Demographics on Social Media: A Review of Methods and Recommendations for Best Practices.” arXiv preprint arXiv:1702.01807.
- Clark, Alexander. 2003. “Combining Distributional and Morphological Information for Part of Speech Induction.” In 10th Conference of the European Chapter of the Association for Computational Linguistics.
- Conneau, Alexis, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2019. “Unsupervised Cross-Lingual Representation Learning at Scale.” arXiv:1911.02116.
- Conover, Michael D., Jacob Ratkiewicz, Matthew Francisco, Bruno Gonçalves, Filippo Menczer, and Alessandro Flammini. 2011. “Political Polarization on Twitter.” In Proceedings of the International AAAI Conference on Web and Social Media.
- Derczynski, Leon, Kalina Bontcheva, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, and Arkaitz Zubiaga. 2017. “SemEval-2017 Task 8: RumourEval: Determining Rumour Veracity and Support for Rumours.” In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 69–76.
- Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), 4171–4186.
- Eusko Jaurlaritza, E. J. G. V., and Nafarroako Gobernua, and Office Public de la Langue Basque. 2016. VI. Inkesta Soziolinguistikoa. irekia.euskadi.eus.
- Fernandez de Landa, Joseba, Rodrigo Agerri, and Iñaki Alegria. 2019. “Large Scale Linguistic Processing of Tweets to Understand Social Interactions Among Speakers of Less Resourced Languages: The Basque Case.” Information 10 (6): 212–00.
- Grover, Aditya, and Jure Leskovec. 2016. “Node2vec: Scalable Feature Learning for Networks.” In Association for Computing Machinery, 855–864.
- Jacomy, Mathieu, Tommaso Venturini, Sebastien Heymann, and Mathieu Bastian. 2014. “ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software.” PloS One 9 (6): e98679.
- Jones, Rhys James, Daniel Cunliffe, and Zoe R. Honeycutt. 2013. “Twitter and the Welsh Language.” Journal of Multilingual and Multicultural Development 34 (7): 653–671.
- Karthikeyan, K., Zihan Wang, Stephen Mayhew, and Dan Roth. 2020. “Cross-Lingual Ability of Multilingual BERT: An Empirical Study.” In International Conference on Learning Representations.
- Marquardt, James, Golnoosh Farnadi, Gayathri Vasudevan, Marie-Francine Moens, Sergio Davalos, Ankur Teredesai, and Martine De Cock. 2014. “Age and Gender Identification in Social Media.” In Proceedings of CLEF 2014 Evaluation Labs, 1129–1136.
- McMonagle, Sarah, Daniel Cunliffe, Lysbeth Jongbloed-Faber, and Paul Jarvis. 2019. “What Can Hashtags Tell Us About Minority Languages on Twitter? A Comparison of #cymraeg, #frysk, and #gaeilge.” Journal of Multilingual and Multicultural Development 40 (1): 32–49.
- Mhichíl, Mairéad Nic Giolla, Theo Lynn, and Pierangelo Rosati. 2018. “Twitter and the Irish Language, #Gaeilge – Agents and Activities: Exploring a Data Set with Micro-implementers in Social Media.” Journal of Multilingual and Multicultural Development 39 (10): 868–881.
- Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. “Distributed Representations of Words and Phrases and Their Compositionality.” In Advances in Neural Information Processing Systems, 3111–3119.
- Mohammad, Saif, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, and Colin Cherry. 2016, June. “SemEval-2016 Task 6: Detecting Stance in Tweets.” In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), 31–41, San Diego, CA: Association for Computational Linguistics.
- Morgan-Lopez, Antonio A., Annice E. Kim, Robert F. Chew, and Paul Ruddle. 2017. “Predicting Age Groups of Twitter Users Based on Language and Metadata Features.” PloS One 12 (8): e0183537.
- Nguyen, Dong, Rilana Gravel, Dolf Trieschnigg, and Theo Meder. 2013. “‘How Old Do You Think I Am?’ A Study of Language and Age in Twitter.” In Proceedings of the International AAAI Conference on Web and Social Media.
- Nguyen, Dong, A. Seza Doğruöz, Carolyn P. Rosé, and Franciska de Jong. 2016. “Computational Sociolinguistics: A Survey.” Computational Linguistics 42 (3): 537–593.
- Pennacchiotti, Marco, and Ana-Maria Popescu. 2011. “Democrats, Republicans and Starbucks Afficionados: User Classification in Twitter.” In Association for Computing Machinery, 430–438. ACM.
- Rao, Delip, David Yarowsky, Abhishek Shreevats, and Manaswi Gupta. 2010. “Classifying Latent User Attributes in Twitter.” In Proceedings of the 2nd International Workshop on Search and Mining User-Generated Contents, 37–44. ACM.
- Ritter, A., S. Clark, and O. Etzioni. 2011. “Named Entity Recognition in Tweets: An Experimental Study.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1524–1534.
- Rosenthal, Sara, Noura Farra, and Preslav Nakov. 2017. “SemEval-2017 Task 4: Sentiment Analysis in Twitter.” In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 502–518.
- Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention is all you Need.” In Advances in Neural Information Processing Systems, 5998–6008.
- Villena, Julio, Sara Lana, Eugenio Martínez, and José Carlos González. 2013. “TASS-Workshop on Sentiment Analysis at SEPLN.” Sociedad Española para el Procesamiento del Lenguaje Natural.
- Zaghouani, Wajdi, and Anis Charfi. 2018. “Arap-Tweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification.” In Proceedings of the Eleventh International Conference on Language Resources and Evaluation.
- Zotova, Elena, Rodrigo Agerri, and German Rigau. 2021. “Semi-Automatic Generation of Multilingual Datasets for Stance Detection in Twitter.” Expert Systems with Applications 170: 114547.
- Zubiaga, Arkaitz, Inaki San Vicente, Pablo Gamallo, José Ramom Pichel, Inaki Alegria, Nora Aranberri, Aitzol Ezeiza, and Víctor Fresno. 2016. “TweetLID: A Benchmark for Tweet Language Identification.” Language Resources and Evaluation 50 (4): 729–766.
- Zubiaga, Arkaitz, Bo Wang, Maria Liakata, and Rob Procter. 2017. “Stance Classification of Social Media Users in Independence Movements.” Catalonia 2 (8, 599): 10–960.