Question to Siri: “What’s The Wizard of Oz about?” Siri: “It’s about some Dorothy, her intelligent assistants, and her little dog too. Some are not so intelligent, I guess.”
The development of speech recognition and speech synthesis on smartphones (e.g. Siri) brings to mind how important the voice is in helping people feel engaged, at least with the devices they carry. But it’s really more about fantasy and sci-fi than convenience … which is in turn about the city and film.
“All the world’s a stage,” wrote Shakespeare. Baroque architects arranged the urban environment as if it were a stage set. Now city inhabitants are more likely to view the world as if through a cinematic frame. Those miniature screens we carry around, plus the animated billboards, urban screens and projections, remind me that the world is mediated by and assessed as if a cinematic experience, at least for some of the time. As well as the visuals, we carry around sound tracks, and voices.
The disembodied voice
The sound theorist Michel Chion bases his studies of voice in cinema on the concept of the acousmêtre, the acousmatic being, the voice whose source is invisible and unknown — like Siri. In film this is the voice of the narrator, someone off-stage, or behind the curtain. For Chion, the acousmêtre conveys “ubiquity, panopticism, omniscience, and omnipotence” (24), in other words authority.
So the acousmêtre is all-seeing, knows everything, and is all powerful. Apparently the voice-over narration in documentaries and news broadcasts inherits its authority from this tendency. As a hapless acousmêtre, the fated Wizard of Oz concealed his modest human frame behind a screen and used his voice to convey the authority he lacked otherwise.
Chion refers to the “already visualised acousmêtre,” the voice whose face we know but currently don’t see, and that generally projects reassurance. The voice of a friend or known voice on the phone can have this function … or perhaps some of the pre-wired, indefatigable and soothing quips from Siri.
The “commentator acousmêtre” is a category of disembodied voice, but a voice that has no personal stake in the action. The public address announcement, whether live or recorded, is comparable to the narrator of a news item or nature programme.
The commentator acousmêtre is otherwise close to Chion’s “radio acousmêtre,” the voice over the radio that doesn’t have the option of showing its face.
The “complete acousmêtre” is the voice to whom we can’t yet attach a face, “but who remains liable to appear in the visual field at any moment” (21). The complete acousmêtre is a powerful element in cinema. Film directors play with the possibilities of revelation at any time, and the suspense this builds, the “epiphany of the acousmêtre,” as in Alfred Hitchcock’s Psycho: “the acousmêtre brings disequilibrium and tension” (24).
There’s an amusing scene from The Big Bang Theory (and on YouTube) where Raj actually gets to meet Siri (clearly an impossibility), though the tension comes from wondering whether Raj can speak back.
The grafted voice
You need to have an up-to-date phone and operating system to interact with Siri, and to be online. It’s grafted on and sometimes drops out. But we’re used to that with voices, which are frequently detached from the bodies they serve. Cinema rarely uses the voice as recorded by the actors at the same time they are filmed. Voices are added afterwards by the actors and engineers in a sound studio, and adjusted and placed with the image as part of the editing process.
This over-dubbing, or post-synchronisation, indicates the disjunction between voices and bodies, sounds and visual images. In some cases the connection can be disengaged deliberately, as in the case of Fellini’s use of voices that “hang on the bodies of actors in only the loosest and freest sense, in space as well as in time” (85).
Cinema draws on the false attribution of vocal agency, the simulation of vocal apparatus (dubbing, post-synchronisation, Foley editing). You can change Siri’s gender and accent, and sometimes it goes mute. This is a cinematic treatment of the voice.
Algorithms simulate the voice. Digital devices read documents out loud, advise on what movies are showing nearby, and report on the progress of software processes. Satellite navigation systems provide vocal directions.
The detached voice is also a bit menacing. In 2001: A Space Odyssey, Stanley Kubrick presents the breakdown of the HAL 9000 computer as an acousmatic event. Throughout the film, HAL is an all-seeing and insistent voice. HAL’s dismantling is accompanied unnervingly by his reversion to a childish state, revealing his digital substrate as he recites a nursery rhyme. For Chion this is “a strange death, leaving no trace, no body,” (46) and no echo.
The vehicle for this speculation is the acousmêtre, a cinematic term that recognises the voice as already disembodied, floating, caught in cycles and cuts, at times mute, and often grafted.
Question to Siri: “What’s 2001: A Space Odyssey about?” Siri: “It’s about an assistant named HAL who tries to make contact with a higher intelligence. These two guys get in the way and mess it all up.”
- This post has been published at the conclusion of an exciting symposium called “What is sound design?” at the University of Edinburgh.
- This post is adapted and updated from a book chapter Martin Parker and I wrote: Coyne, Richard, and Martin Parker. 2009. Voice and space: The agency of the acousmêtre in spatial design. In P. Turner, S. Turner, and E. Davenport (eds.), Exploration of Space, Technology and Spatiality: Interdisciplinary Perspectives: 102-112. Hershey PA: Information Science Reference. Siri wasn’t around then.
- Page references in this post are to Chion, Michel. 1999. The Voice in Cinema. Trans. C. Gorbman. New York: Columbia University Press.
- Slavoj Zizek writes: “We have thus arrived at the formula of the relationship between voice and image: voice does not simply persist at a different level with regard to what we see, it rather points towards a gap in the field of the visible, toward the dimension of what eludes our gaze. In other words, their relationship is mediated by an impossibility: ultimately, we hear things because we cannot see everything” (Zizek, 1996, p. 93). The voice points towards a series of gaps.
- Also see The King’s speech impediment, Haunted by media and Where is that sound?.
- Chion, Michel. 1994. Audio-Vision: Sound on Screen. Trans. C. Gorbman. New York: Columbia University Press.
- Chion, Michel. 1999. The Voice in Cinema. Trans. C. Gorbman. New York: Columbia University Press. First published in French in 1982.
- Coyne, Richard, and Martin Parker. 2005. Sounding Off: The Place of Voice in Ubiquitous Digital Media. In K. Nyíri (ed.), Proc. Seeing, Understanding, Learning in the Mobile Age, Budapest, April 28–30: 129-134.
- Coyne, Richard, and Martin Parker. 2006. Voices out of place: Voice, non-place and ubiquitous digital communications. In K. Nyíri (ed.), Mobile Understanding: The Epistemology of Ubiquitous Communication: 171-182. Vienna: Passagen Verlag.
- Coyne, Richard, and Martin Parker. 2009. Voice and space: The agency of the acousmêtre in spatial design. In P. Turner, S. Turner, and E. Davenport (eds.), Exploration of Space, Technology and Spatiality: Interdisciplinary Perspectives: 102-112. Hershey PA: Information Science Reference.
- Zizek, Slavoj. 1996. “I hear you with my eyes”; or, the invisible master. In R. Salecl and S. Zizek (eds.), Gaze and Voice as Love Objects: 90-126. Durham, NC: Duke University Press.