ABSTRACT
Secondary data offer many advantages but also carry limitations, notably the lack of relevant information. This article draws on a previous study that used secondary data to investigate substance use in young elite athletes. Three types of missing information appeared: missing data, lack of information about the data collection process, and unavailable data. Other concerns were also highlighted, such as coverage and sampling errors. The impact of using secondary data on research results stems from two sources: unavoidable changes and researchers’ choices. The research question should guide the decision to use secondary data, and the level of constraint that this decision imposes should be assessed early on. In addition to the quality of the available information, consistency across questionnaires is vital for broadening the scope of research and ensuring its progress.
Acknowledgments
The authors are grateful to the Swiss National Science Foundation (SNSF) for its financial assistance. The authors also thank the reviewers who gave their time to peer review this article.
Disclosure statement
No potential conflict of interest was reported by the authors.
Additional information
Notes on contributors
Ibrahima Dina Diatta
Ibrahima Dina Diatta is a doctoral student in statistics applied to humanities and social sciences at the University of Lausanne. His research focuses on the impact of missing data on statistical analysis and on the application of statistics in the medical field.
André Berchtold
André Berchtold is a professor of statistics at the Institute of Social Sciences of the University of Lausanne. He is a specialist in Markov models, categorical data and the treatment of missing data. His areas of application include adolescent health, substance use, and life courses.