Original Articles

Hearing facial identities

Stefan R. Schweinberger, David Robertson, & Jürgen M. Kaufmann
Pages 1446-1456 | Received 22 Sep 2005, Accepted 21 Sep 2006, Published online: 12 Sep 2007
 

Abstract

While audiovisual integration is well known in speech perception, faces and speech are also informative with respect to speaker recognition. To date, audiovisual integration in the recognition of familiar people has never been demonstrated. Here we show systematic benefits and costs for the recognition of familiar voices when these are combined with time-synchronized articulating faces, of corresponding or noncorresponding speaker identity, respectively. While these effects were strong for familiar voices, they were smaller or nonsignificant for unfamiliar voices, suggesting that the effects depend on the previous creation of a multimodal representation of a person's identity. Moreover, the effects were reduced or eliminated when voices were combined with the same faces presented as static pictures, demonstrating that the effects do not simply reflect the use of facial identity as a “cue” for voice recognition. This is the first direct evidence for audiovisual integration in person recognition.

Acknowledgments

A huge thanks to B.T.J., M.L., D.D., P.O.D., and four additional Glasgow University staff members who agreed to temporarily donate their faces and voices in order to create the stimulus material for this study. This research was supported by a grant from the Deutsche Forschungsgemeinschaft (DFG) to S.R.S. The study further benefited from a summer studentship from the Nuffield Foundation to D.R. J.M.K. has been supported by a British Academy Postdoctoral Fellowship.

Notes

1 An analysis of variance (ANOVA) with the factors familiarity (familiar vs. unfamiliar), correspondence (corresponding, noncorresponding within familiarity, and noncorresponding across familiarity), and parameter (initial, intermediate, and final positions) for absolute asynchronies only revealed a significant effect of parameter, F(2, 6) = 6.2, p < .05. Unsurprisingly perhaps, this indicated that synchronization was somewhat better for initial than for intermediate and final sentence positions (M = 41 ms, 104 ms, and 85 ms for initial, intermediate, and final positions, respectively). This analysis revealed no effects involving familiarity (M = 77 ms for both familiar and unfamiliar speakers) or correspondence (M = 65 ms, 82 ms, and 83 ms for corresponding, noncorresponding within familiarity, and noncorresponding across familiarity conditions, respectively).

Additional information

Notes on contributors

Stefan R. Schweinberger

Stefan R. Schweinberger is now at the Department of General Psychology, Friedrich-Schiller University of Jena, Jena, Germany.

David Robertson

David Robertson is now at the Department of General Psychology, Friedrich-Schiller University of Jena, Jena, Germany.

Jürgen M. Kaufmann

Jürgen M. Kaufmann is now at the Department of General Psychology, Friedrich-Schiller University of Jena, Jena, Germany.
