References
- Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D. G., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P. A., Vanhoucke, V., Vasudevan, V., Viégas, F. B., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., & Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous distributed systems.
- Alipanahi, B., Delong, A., Weirauch, M. T., & Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33, 831–838. https://doi.org/10.1038/nbt.3300
- Alom, M. Z., Taha, T., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M., Hasan, M., Essen, B., Awwal, A., & Asari, V. (2019). A state-of-the-art survey on deep learning theory and architectures. Electronics, 8, 292. https://doi.org/10.3390/electronics8030292
- Angoff, W. H. (1972). A technique for the investigation of cultural differences. Paper presented at the annual meeting of the American Psychological Association, Honolulu, May 1972. https://eric.ed.gov/?id=ED069686
- Anwar, S. M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., & Khan, M. K. (2018). Medical image analysis using convolutional neural networks: A review. Journal of Medical Systems, 42, 1–13. https://doi.org/10.1007/s10916-018-1088-1
- Asparouhov, T., & Muthén, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21, 495–508. https://doi.org/10.1080/10705511.2014.919210
- Baldi, P., Sadowski, P., & Whiteson, D. (2014). Searching for exotic particles in high-energy physics with deep learning. Nature Communications, 5, 4308. https://doi.org/10.1038/ncomms5308
- Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling, 13, 186–203. https://doi.org/10.1207/s15328007sem1302_2
- Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57, 289–300. http://www.jstor.org/stable/2346101
- Bhimji, W., Farrell, S. A., Kurth, T., Paganini, M., Prabhat, & Racah, E. (2018). Deep neural networks for physics analysis on low-level whole-detector data at the LHC. Journal of Physics: Conference Series, 1085, 042034. https://doi.org/10.1088/1742-6596/1085/4/042034
- Breen, P. G., Foley, C. N., Boekholt, T., & Portegies Zwart, S. (2019). Newton vs the machine: Solving the chaotic three-body problem using deep neural networks. Monthly Notices of the Royal Astronomical Society. http://arxiv.org/abs/1910.07291
- Brown, T. A. (2015). Confirmatory factor analysis for applied research. Guilford Publications.
- Browning, E., Bolton, M., Owen, E., Shoji, A., Guilford, T., & Freeman, R. (2018). Predicting animal behaviour using deep learning: GPS data alone accurately predict diving in seabirds. Methods in Ecology and Evolution, 9, 681–692. https://doi.org/10.1111/2041-210X.12926
- Buchholz, J., & Hartig, J. (2020). Measurement invariance testing in questionnaires: A comparison of three Multigroup-CFA and IRT-based approaches. Psychological Test and Assessment Modeling, 62, 29–53. https://www.psychologie-aktuell.com/fileadmin/Redaktion/Journale/ptam-2020-1/03_Buchholz.pdf
- Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456. https://doi.org/10.1037/0033-2909.105.3.456
- Dahl, G. E., Jaitly, N., & Salakhutdinov, R. (2014). Multi-task neural networks for QSAR predictions. ArXiv:1406.1231 [Cs, Stat]. http://arxiv.org/abs/1406.1231
- Bock, R. D., & Zimowski, M. F. (1997). Multiple group IRT. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). Springer.
- Davidov, E., Meuleman, B., Billiet, J., & Schmidt, P. (2008). Values and support for immigration: A cross-country comparison. European Sociological Review, 24, 583–599. https://doi.org/10.1093/esr/jcn020
- Davidov, E., Meuleman, B., Cieciuch, J., Schmidt, P., & Billiet, J. (2014). Measurement equivalence in cross-national research. Annual Review of Sociology, 40, 55–75. https://doi.org/10.1146/annurev-soc-071913-043137
- Davidov, E., Schmidt, P., Billiet, J., & Meuleman, B. (Eds.). (2018). Cross-cultural analysis: Methods and applications. Routledge.
- De Boeck, P. (2008). Random item IRT models. Psychometrika, 73, 533–559. https://doi.org/10.1007/s11336-008-9092-x
- De Jong, M. G., Steenkamp, J. B. E. M., & Fox, J.-P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34, 260–278. https://doi.org/10.1086/518532
- ESS. (2020). European social survey cumulative file, ESS 1-9. Data file edition 1.0. NSD - Norwegian Centre for Research Data, Norway - Data Archive and distributor of ESS data for ESS ERIC. https://doi.org/10.21338/NSD-ESS-CUMULATIVE
- European Social Survey. (2014). ESS round 7: European social survey round 7 data. Data file edition 2.2. NSD - Norwegian Centre for Research Data, Norway – Data Archive and distributor of ESS data for ESS ERIC. https://doi.org/10.21338/NSD-ESS7-2014
- European Social Survey. (2018). ESS round 9: European social survey round 9 data. Data file edition 2.0. NSD - Norwegian Centre for Research Data, Norway – Data Archive and distributor of ESS data for ESS ERIC. https://doi.org/10.21338/NSD-ESS9-2018
- Ferrag, M. A., Maglaras, L., Moschoyiannis, S., & Janicke, H. (2020). Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. Journal of Information Security and Applications, 50, 102419. https://doi.org/10.1016/j.jisa.2019.102419
- Fidalgo, Á. M., & Scalon, J. D. (2010). Using generalized Mantel-Haenszel statistics to assess DIF among multiple-groups. Journal of Psychoeducational Assessment, 28, 60–69. https://doi.org/10.1177/0734282909337302
- Finch, W. H. (2016). Detection of Differential Item Functioning for More than Two Groups: A Monte Carlo Comparison of Methods. Applied Measurement in Education, 29, 30–45. https://doi.org/10.1080/08957347.2015.1102916
- Fitzgerald, R., & Jowell, R. (2010). Measurement equivalence in comparative surveys: The European Social Survey (ESS)—from design to implementation and beyond. John Wiley & Sons.
- Flake, J. K., & McCoach, D. B. (2018). An investigation of the alignment method with polytomous indicators under conditions of partial measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 25, 56–70. https://doi.org/10.1080/10705511.2017.1374187
- Fox, J.-P. (2010). Bayesian item response modeling: Theory and applications. Springer.
- Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. Beijing-Cambridge-Farnham-Koln-Sebastopol-Tokyo: O'Reilly Media.
- Gomer, B., Jiang, G., & Yuan, K. H. (2019). New effect size measures for structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 26, 371–389. https://doi.org/10.1080/10705511.2018.1545231
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. http://www.deeplearningbook.org
- Haerpfer, C., Inglehart, R., Moreno, A., Welzel, C., Kizilova, K., Diez-Medrano, J., Lagos, M., Norris, P., Ponarin, E., & Puranen, B. (2020). World values survey: Round seven – country-pooled datafile. JD Systems Institute & WVSA Secretariat. https://doi.org/10.14281/18241.1
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. ArXiv:1207.0580 [Cs].
- Holland, P. W., & Wainer, H. (Eds.). (1993). Differential item functioning: Theory and practice. Hillsdale: Lawrence Erlbaum.
- Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In PMLR (pp. 448–456).
- ISSP. (2021). History of ISSP. Website information: http://issp.org/about-issp/history/ [accessed 15 November 2021]
- Jöreskog, K. G. (1970). A general method for estimating a linear structural equation system. In A. S. Goldberger & O. D. Duncan (Eds.), ETS Research Bulletin Series, 1970. Seminar Press, i–41. https://doi.org/10.1002/j.2333-8504.1970.tb00783.x
- Kaplan, D. (1989). Model modification in covariance structure analysis: Application of the expected parameter change statistic. Multivariate Behavioral Research, 24, 285–305. https://doi.org/10.1207/s15327906mbr2403_2
- Kennamer, N., Kirkby, D., Ihler, A., & Sanchez-Lopez, F. J. (2018). ContextNet: Deep learning for star galaxy classification. International Conference on Machine Learning, 2582–2590. https://proceedings.mlr.press/v80/kennamer18a.html
- Kim, E. S., Cao, C., Wang, Y., & Nguyen, D. T. (2017). Measurement invariance testing with many groups: A comparison of five approaches. Structural Equation Modeling: A Multidisciplinary Journal, 24, 524–544. https://doi.org/10.1080/10705511.2017.1304822
- Kim, E. S., Yoon, M., & Lee, T. (2012). Testing measurement invariance using MIMIC: Likelihood ratio test with a critical value adjustment. Educational and Psychological Measurement, 72, 469–492. https://doi.org/10.1177/0013164411427395
- Kim, S.-H., Cohen, A. S., & Park, T.-H. (1995). Detection of differential item functioning in multiple-groups. Journal of Educational Measurement, 32, 261–276. https://doi.org/10.1111/j.1745-3984.1995.tb00466.x
- Kingma, D. P., & Ba, J. (2017). Adam: A method for stochastic optimization. ArXiv:1412.6980 [Cs]. https://arxiv.org/abs/1412.6980
- Koc, P. (2021). Measuring non-electoral political participation: Bi-factor model as a tool to extract dimensions. Social Indicators Research, 156, 1–17. Online First. https://doi.org/10.1007/s11205-021-02637-3
- Kopf, J., Zeileis, A., & Strobl, C. (2015). Anchor selection strategies for DIF analysis: Review, assessment, and new approaches. Educational and Psychological Measurement, 75, 22–56. https://doi.org/10.1177/0013164414529792
- Kostelka, F. (2014). The state of political participation in post-communist democracies: Low but surprisingly little biased citizen engagement. Europe-Asia Studies, 66, 945–968. https://doi.org/10.1080/09668136.2014.905386
- Kuha, J., & Moustaki, I. (2015). Nonequivalence of measurement in latent variable modeling of multigroup data: A sensitivity analysis. Psychological Methods, 20, 523–536. https://doi.org/10.1037/met0000031
- Lemos, C. M., Gore, R. J., Puga-Gonzalez, I., & Shults, F. L. (2019). Dimensionality and factorial invariance of religiosity among Christians and the religiously unaffiliated: A cross-cultural analysis based on the international social survey programme. PLoS ONE, 14, e0216352. https://doi.org/10.1371/journal.pone.0216352
- Lewkowycz, A., Bahri, Y., Dyer, E., Sohl-Dickstein, J., & Gur-Ari, G. (2020). The large learning rate phase of deep learning: The catapult mechanism. ArXiv:2003.02218 [Cs, Stat]. http://arxiv.org/abs/2003.02218
- Li, C. H. (2016). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48, 936–949. https://doi.org/10.3758/s13428-015-0619-7
- Li, X., Chen, S., Hu, X., & Yang, J. (2018). Understanding the disharmony between dropout and batch normalization by variance shift. ArXiv:1801.05134 [Cs, Stat]. https://doi.org/10.1109/CVPR.2019.00279
- Lin, L. (2020). Evaluate measurement invariance across multiple-groups: A comparison between the alignment optimization and the random item effects model (Doctoral dissertation). University of Pittsburgh.
- Liu, X. (2012). Classification accuracy and cut point selection. Statistics in Medicine, 31, 2676–2686. https://doi.org/10.1002/sim.4509
- Lo Bosco, G., & Di Gangi, M. A. (2016). Deep learning architectures for DNA sequence classification. International Workshop on Fuzzy Logic and Applications, 162–171. https://doi.org/10.1007/978-3-319-52962-2_14
- Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum.
- MacCallum, R. C., & Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51, 201–226. https://doi.org/10.1146/annurev.psych.51.1.201
- MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107–120. https://doi.org/10.1037/0033-2909.100.1.107
- Magis, D., Béland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847–862. https://doi.org/10.3758/BRM.42.3.847
- Magis, D., Raîche, G., Béland, S., & Gérard, P. (2011). A generalized logistic regression procedure to detect differential item functioning among multiple-groups. International Journal of Testing, 11, 365–386. https://doi.org/10.1080/15305058.2011.602810
- Marblestone, A. H., Wayne, G., & Kording, K. P. (2016). Toward an integration of deep learning and neuroscience. Frontiers in Computational Neuroscience, 10, 94. https://doi.org/10.3389/fncom.2016.00094
- Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143. https://doi.org/10.1016/0883-0355(89)90002-5
- Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525–543. https://doi.org/10.1007/BF02294825
- Meuleman, B., & Billiet, J. (2009). A Monte Carlo sample size study: How many countries are needed for accurate multilevel SEM? Survey Research Methods, 3, 45–58. https://doi.org/10.18148/srm/2009.v3i1.666
- Millsap, R. E. (2011). Statistical approaches to measurement invariance. Routledge. https://doi.org/10.4324/9780203821961
- Mitchell, M. (2019). Artificial intelligence: A guide for thinking humans. Penguin UK.
- Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K., Cao, J., & Bessa, M. A. (2019). Deep learning predicts path-dependent plasticity. Proceedings of the National Academy of Sciences, 116, 26414–26420. https://doi.org/10.1073/pnas.1911815116
- Muthén, B., & Asparouhov, T. (2012). Bayesian structural equation modeling: A more flexible representation of substantive theory. Psychological Methods, 17, 313–335. https://doi.org/10.1037/a0026802
- Muthén, B., & Asparouhov, T. (2014). IRT studies of many groups: The alignment method. Frontiers in Psychology, 5, 978. https://doi.org/10.3389/fpsyg.2014.00978
- Muthén, L. K., & Muthén, B. O. (2017). Mplus user's guide. Eighth edition. Los Angeles, CA: Muthén & Muthén.
- Nesterov, Y. (1983). A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). https://www.semanticscholar.org/paper/A-method-for-unconstrained-convex-minimization-with-Nesterov/ed910d96802212c9e45d956adaa27d915f57469
- Nielsen, M. A. (2015). Neural networks and deep learning. Determination Press. http://neuralnetworksanddeeplearning.com/
- Nye, C., & Drasgow, F. (2011). Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups. The Journal of Applied Psychology, 96, 966–980. https://doi.org/10.1037/a0022955
- OECD. (2010). PIAAC background questionnaire MS version 2.1 d.d. 15-12-2010. http://www.oecd.org/skills/piaac/piaacdesign/
- Oort, F. J. (1998). Simulation study of item bias detection with restricted factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 5, 107–124. https://doi.org/10.1080/10705519809540095
- Penfield, R. D., & Lam, T. C. (2000). Assessing differential item functioning in performance assessment: Review and recommendations. Educational Measurement: Issues and Practice, 19, 5–15. https://doi.org/10.1002/j.2333-8504.1993.tb01525.x
- Penfield, R. D. (2001). Assessing differential item functioning among multiple-groups: A comparison of three Mantel-Haenszel procedures. Applied Measurement in Education, 14, 235–259. https://doi.org/10.1207/S15324818AME1403_3
- Pokropek, A., Borgonovi, F., & McCormick, C. (2017). On the cross-country comparability of indicators of socioeconomic resources in PISA. Applied Measurement in Education, 30, 243–258. https://doi.org/10.1080/08957347.2017.1353985
- Pokropek, A., Lüdtke, O., & Robitzsch, A. (2020). An extension of the invariance alignment method for scale linking. Psychological Test and Assessment Modeling, 62, 305–334. https://www.psychologie-aktuell.com/fileadmin/Redaktion/Journale/ptam-2020-2/05_Pokropek.pdf
- Pokropek, A., Schmidt, P., & Davidov, E. (2020). Choosing priors in Bayesian measurement invariance modeling: A Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary Journal, 27, 750–764. https://doi.org/10.1080/10705511.2019.1703708
- Polyak, B. T. (1964). Some Methods of Speeding up the Convergence of Iteration Methods. USSR Computational Mathematics and Mathematical Physics, 4, 1–17. https://doi.org/10.1016/0041-5553(64)90137-5
- Rasp, S., Pritchard, M. S., & Gentine, P. (2018). Deep learning to represent subgrid processes in climate models. Proceedings of the National Academy of Sciences, 115, 9684–9689. https://doi.org/10.1073/pnas.1810286115
- Ravanbakhsh, S., Lanusse, F., Mandelbaum, R., Schneider, J., & Poczos, B. (2017). Enabling dark energy science with deep generative models of galaxy images. Thirty-First AAAI Conference on Artificial Intelligence. https://doi.org/10.5555/3298239.3298456
- Reed, R., & Marks II, R. J. (1999). Neural smithing: Supervised learning in feedforward artificial neural networks. Cambridge, MA: A Bradford Book The MIT Press. https://robertmarks.org/REPRINTS/NS/NS-html/NSindex.htm
- Robitzsch, A. (2020). Robust Haebara linking for many groups: Performance in the case of uniform DIF. Psych, 2, 155–173. https://doi.org/10.3390/psych2030014
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533–536. https://doi.org/10.1038/323533a0
- Saris, W. E., Satorra, A., & Sörbom, D. (1987). The detection and correction of specification errors in structural equation models. Sociological Methodology, 17, 105–129. https://doi.org/10.2307/271030
- Saris, W. E., Satorra, A., & Van der Veld, W. M. (2009). Testing structural equation models or detection of misspecifications? Structural Equation Modeling, 16, 561–582. https://doi.org/10.1080/10705510903203433
- Skjak, K. K. (2010). The international social survey programme: Annual cross-national social surveys since 1985. In J. A. Harkness, M. Braun, B. Edwards, T. Johnson, L. Lyberg, P. P. Mohler, B.-E. Pennell, & T. W. Smith (Eds.), Survey methods in multinational, multiregional, and multicultural contexts (pp. 485–495). Wiley.
- Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384. https://doi.org/10.1007/BF02294623
- Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15, 1929–1958. https://doi.org/10.5555/2627435.2670313
- Staar, B., Lütjen, M., & Freitag, M. (2019). Anomaly detection with convolutional neural networks for industrial surface inspection. Procedia CIRP, 79, 484–489. https://doi.org/10.1016/j.procir.2019.02.123
- Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. The Journal of Applied Psychology, 91, 1292–1306. https://doi.org/10.1037/0021-9010.91.6.1292
- Steinmetz, H. (2013). Analyzing observed composite differences across groups. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9, 1–12. https://doi.org/10.1027/1614-2241/a000049
- Svetina, D., Rutkowski, L., & Rutkowski, D. (2020). Multiple-group invariance with categorical outcomes using updated guidelines: An illustration using Mplus and the lavaan/semTools packages. Structural Equation Modeling: A Multidisciplinary Journal, 27, 111–130. https://doi.org/10.1080/10705511.2019.1602776
- Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370. http://www.jstor.org/stable/1434855
- Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–115). Hillsdale: Lawrence Erlbaum.
- Tijmstra, J., Bolsinova, M., Liaw, Y. L., Rutkowski, L., & Rutkowski, D. (2020). Sensitivity of the RMSD for detecting item-level misfit in low-performing countries. Journal of Educational Measurement, 57, 566–583. https://doi.org/10.1111/jedm.12263
- Van de Schoot, R., Kluytmans, A., Tummers, L., Lugtig, P., Hox, J., & Muthén, B. (2013). Facing off with Scylla and Charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance. Frontiers in Psychology, 4, 770. https://doi.org/10.3389/fpsyg.2013.00770
- Van de Vijver, F. J., & Leung, K. (2011). Equivalence and bias: A review of concepts, models, and data analytic procedures. In Cross-cultural research methods in psychology (pp. 17–45). Cambridge University Press.
- van der Linden, W. J., & Hambleton, R. K. (Eds.). (1997). Handbook of modern item response theory. Springer-Verlag.
- Wallach, I., Dzamba, M., & Heifets, A. (2015). AtomNet: A deep convolutional neural network for bioactivity prediction in structure-based drug discovery. ArXiv Preprint ArXiv:1510.02855.
- Wellcome Global Monitor. (2018). How does the world feel about science and health. London, UK: Wellcome Trust. https://wellcome.org/sites/default/files/wellcome-global-monitor-2018.pdf
- Whittaker, T. A. (2012). Using the modification index and standardized expected parameter change for model modification. The Journal of Experimental Education, 80, 26–44. https://doi.org/10.1080/00220973.2010.531299
- Woods, C. M., Cai, L., & Wang, M. (2013). The Langer-improved Wald test for DIF testing with multiple groups: Evaluation and comparison to two-group IRT. Educational and Psychological Measurement, 73, 532–547. https://doi.org/10.1177/0013164412464875
- Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research, 44, 1–27. https://doi.org/10.1080/00273170802620121
- Zhang, R., Deng, W., & Zhu, M. Y. (2017). Using deep neural networks to automate large scale statistical analysis for big data applications. Proceedings of the Ninth Asian Conference on Machine Learning, 311–326. https://proceedings.mlr.press/v77/zhang17d.html