Abstract
Attending the Joint Statistical Meetings (JSM) can be frustrating because multiple sessions on the same topic are often scheduled at the same time. This article uses seeded latent Dirichlet allocation and a scheduling optimization algorithm to substantially reduce overlapping content relative to the original schedule for the 2020 JSM program. Specifically, on a measure based on total variation distance that ranges from 0 (random scheduling) to 1 (no overlapping content), the original schedule scores 0.058, whereas our proposed schedule scores 0.371. An improvement of this size would (i) increase participant satisfaction as measured by the post-JSM satisfaction survey, and (ii) save the American Statistical Association significant money by obviating the need for the traditional in-person meeting of the 47 program chairs and other organizers. The methodology developed in this work applies immediately to future JSMs and is easily modified to improve scheduling for any other scientific conference that has parallel sessions.
6 Supplementary Materials
The supplementary materials include three datasets used in the analysis: “abstract_info” contains detailed information about the sessions of JSM 2020, “seeded_words” is the table of seed words used in the seeded LDA, and “trayvon_martin” contains all the words of the Trayvon Martin Corpus used in data cleaning. In addition, “allocation_matrix_list.RData”, a list of 100 allocation matrices generated by our algorithm, is included so that readers can avoid rerunning the algorithm's final loop.