ABSTRACT
In this article, we present strategies for collecting and coding a large longitudinal communication data set gathered across multiple sites, consisting of more than 2000 hours of digital audio recordings from approximately 300 families. We describe our methods within the context of implementing a large-scale study of communication during cancer home hospice nurse visits, but the procedure could be adapted to communication data sets across a wide variety of settings. This research is the first study designed to capture home hospice nurse–caregiver communication, a highly understudied location and type of communication event. We present a detailed example protocol encompassing data collection in the home environment; large-scale, multisite secure data management; the development of theoretically based communication coding; and strategies for preventing coder drift and ensuring reliability of analyses. Although each of these challenges has the potential to undermine the utility of the data, reliability between coders is often the only issue consistently reported and addressed in the literature. Overall, our approach demonstrates rigor and provides a “how-to” example for managing large, digitally recorded data sets from collection through analysis. These strategies can inform other large-scale health communication research.
Funding
Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under award number P01CA138317. Data management is supported by the National Center for Advancing Translational Sciences at the National Institutes of Health under award number 8UL1TR000105 (formerly UL1RR025764). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors thank the nurse, patient, and caregiver participants who made this research possible.