ABSTRACT
Facial recognition technology (FRT) has been widely studied and criticized for its racialising impacts and its role in the overpolicing of minoritised communities. However, a key aspect of facial recognition technologies is the dataset of faces used for training and testing. In this article, we situate FRT as an infrastructural assemblage and focus on the history of four facial recognition datasets: the original dataset created by W.W. Bledsoe and his team at the Panoramic Research Institute in 1963; the FERET dataset collected by the Army Research Laboratory in 1995; MEDS-I (2009) and MEDS-II (2011), the datasets containing dead arrestees, curated by the MITRE Corporation; and the Diversity in Faces dataset, created in 2019 by IBM. Through these four exemplary datasets, we suggest that the politics of race in facial recognition are about far more than simply representation, raising questions about the potential side-effects and limitations of efforts to simply ‘de-bias’ data.
Acknowledgements
We are tremendously grateful to advisors, reviewers and friends, past and present, including Jacqueline Wernimont, David Ribes, Adam Hyland, Kate Crawford, Danya Glabau, Anna Lauren Hoffman—and each other. Our thanks further go to the editors for their precise and painstaking work in putting together this special edition.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Correction Statement
This article has been corrected with minor changes. These changes do not impact the academic content of the article.
Notes
1 Documents from our corpus will be referenced using the abbreviation ‘DFR’ followed by their index number, rather than author name and year, which are not always easily available. Note to reviewers: the full corpus will be available online at the culmination of our research project.
2 For critical perspectives on AI more broadly, please see Alison Adam’s Artificial Knowing: Gender and the Thinking Machine.
3 The subject’s metadata may come from a variety of sources other than the subject, including third-party assessments of the subject’s identity.
4 For more work on data as politically charged and/or negotiated, see (Ribes Citation2017; Shilton Citation2018; Maienschein et al. Citation2019; Williams Citation2018).
5 For more on the connection between imagery and disciplinary control, see Sekula (Citation1986) and Tagg (Citation1993).
6 Mugshots may have been the first dataset of faces. See Finn (Citation2009) for an analysis of the mugshot. As police departments discovered, mugshot books did not scale—they became less useful the more faces they contained.
7 The slow process of replacing Shirley cards did not start until 1995 (Roth Citation2009).
8 The history of photography is also intertwined with anthropometry and phrenology, both of which were/are mobilized for social control (Cole Citation2009).
9 See (Turk and Pentland Citation1991) for a review of early automation attempts between Bledsoe and eigenfaces.
10 For information on the disproportionate harms maintained by the US carceral system, see Kristian Williams’ Our Enemies in Blue: Police and Power in America and Alex S. Vitale’s The End of Policing. For more information on technology in the carceral system, see Captivating Technology: Race, Carceral Technoscience, and Liberatory Imagination in Everyday Life, edited by Ruha Benjamin.
11 A ‘false positive’ occurs when the system incorrectly identifies two photographs as containing the same subject.
12 We are unwilling to reproduce the nonconsensual sharing of people’s images. Interested readers can learn more and download the datasets at https://www.nist.gov/itl/iad/image-group/special-database-32-multipleencounter-dataset-meds
13 The documentation for MEDS and other FRT datasets is not a neutral reporting of the contours of a technical object, but is itself a technical object produced through a variety of sociocultural interactions. We hope other researchers will analyze these documentation technologies.
14 IBM does not offer their definition of diversity. We assume, based on the imagery and language, that they are referencing racial, gender, and age diversity.
15 For information on the history of racialization as a concept, please see (Murji and Solomos Citation2005).
Additional information
Funding
Notes on contributors
Nikki Stevens
Nikki Stevens is a PhD candidate at Arizona State University and a research associate at Dartmouth College. Their background as a software engineer informs their research on proxy surveillance, corporate data collection and the affective power of data.
Os Keyes
Os Keyes is a researcher and writer at the University of Washington, where they study gender, technology and (counter)power. They are a frequently published essayist on data, gender and infrastructures of control, and a winner of the inaugural Ada Lovelace Fellowship.