My friend Carolyn passed this link my way yesterday. It raises quite a few interesting questions about the future of sound in cinema:
If you don’t have time to watch the video in full, it tells us about a new development in audio-visual technology known as 3D sound. According to Dr. Edgar Choueiri, 3D sound is unique in that it actually spatially places every sound that comes from the speakers in a specific location. As near as I can tell, this means that if, for instance, a character on the far right of the screen speaks, 3D sound can make it seem like her voice is actually coming from her exact spot onscreen. In theory, 3D sound is able to simulate the actual auditory space of the image you see on-screen. In an age where 3D films are becoming more and more immersive, this certainly seems like the next step in cinematic world-building.
But while the idea is intriguing, it comes with major conceptual obstacles if we try to apply it to narrative cinema. When I first saw this video, my response was simply, “well, that’s a great idea, but good luck getting the studios and movie theater chains to invest in it!” Upon further reflection, though, the concept of 3D sound faces even bigger challenges than funding. The overriding hook of this new technology is that it gives sounds in film specific locations. The problem with this idea, however, is that it assumes that every sound in film actually has a concrete location. This is not the case, because A) not every sound has an actual visual referent on-screen, and B) a sound that does have a visual referent isn’t always meant to match that visual referent’s location in space.
That’s probably a mouthful, so let me break this down in slightly simpler terms. In film, there are roughly two major categories of sound – diegetic sound and non-diegetic sound. Diegetic sound is sound that exists within the film’s story world – sound that characters themselves can hear. Dialogue and sound effects tend to be diegetic sounds, for example. Non-diegetic sound is trickier – it’s sound that exists outside of a film’s story world. Omniscient narrators and instrumental film music (hey, that’s the topic of this blog!) are generally non-diegetic, because only the audience has access to those sounds. Why is this relevant? Because if we’re thinking about sound in terms of three-dimensional film space, non-diegetic sound is impossible to localize. Elements like film music are not actually a part of the three-dimensional space that the audience can see – these audio components exist outside of that world. Where, then, can non-diegetic sound fit in this new 3D sound technology? It might make sense to have a character’s voice come from that character’s spot onscreen, but what does one do with the massive unseen orchestra that isn’t even part of that onscreen world? Do we place the score behind the audience, to keep it firmly separated from the visuals? Do we try to integrate it seamlessly into the images in front of the audience? Do we arbitrarily assign spatial locations for every instrument of the orchestra? No matter where we place this music, we’re stuck trying to integrate a huge body of sound into a fictional space where that sound is not actually supposed to exist. And once we create an expectation that every sound does need to exist in a three-dimensional space, every sound that doesn’t fit into that logic is going to seem distracting – we’ll be that much more conscious that these non-diegetic sounds don’t belong in this auditory space.
It actually gets even more complicated when we talk about diegetic sound. In theory, creating a three-dimensional sound space for the dialogue and sound effects should be as simple as attaching each sound to its object in the film’s visual space. But in practice, sound perspective is not always synchronized to visual perspective. Frequently, for example, we’ll encounter a long shot of two characters having a conversation that we can hear perfectly. Even though the camera is very far away from these characters, we can still hear them as though we’re standing right next to them. Film sacrifices spatial fidelity for audio clarity all the time – if it didn’t, we’d only be able to hear dialogue when the camera was literally right next to the characters. This means that if we want to develop three-dimensional sound-space, it’s not going to be as simple as placing each diegetic sound at its corresponding location on-screen.
Here, however, I do think 3D sound actually does have potential if it’s handled carefully. In particular, it has the potential to take what theorist Michel Chion refers to as the “point of audition” to exciting new places. The “point of audition” is essentially the audio equivalent of point of view – our auditory perspective. Where our point of view is the position from which we see, our point of audition is the position from which we hear. While it seems like these two points would always be the same – that we would see and hear from the same position – in practice this isn’t the case. Frequently, film will actually place our points of view and audition into counterpoint with each other. In a horror film, for example, we might see a man running away from a monster. Our point of view is that of an outside observer – we don’t see things the way the character sees things, because we’re looking at the character himself. Our point of audition, however, may very well be designed to situate us inside the character’s head. While we see the character in front of us, the character’s breathing and the monster’s footsteps have been amplified so that we hear these sounds as the character hears them. 3D sound has the potential to make our point of audition even more vivid. If it works as well as Dr. Choueiri claims, 3D sound could actually construct a detailed audio perspective in which all sounds come from specific locations around the listener. A monster’s footsteps wouldn’t simply be amplified – they’d actually sound like they were directly behind us. In this regard, 3D sound may move us one step closer to crossing the line that separates our subjectivity from a character’s subjectivity.
But in order for that to work, filmmakers would have to think of sound as an entity that is entirely separate from the images onscreen. This is ultimately the only way I can see this three-dimensional sound concept working in film. In the video linked above, Choueiri speaks of audio depth as though it were an extension of visual depth. This logic might be conducive to filming a live event from a single camera, but it is not conducive to narrative cinema. If 3D sound is going to have a place in our movie-going future, then it needs to develop a logic that isn’t strictly image-bound. This logic will likely be extremely complex, and it will force filmmakers to constantly make decisions that weren’t even possible prior to this technology. It will mean constantly making very specific decisions about point of audition – filmmakers will need to account for the audience’s specific location in the film’s soundscape for literally every moment in the film. Non-diegetic sound is still going to be an obstacle, but it wouldn’t be such an insurmountable obstacle if we didn’t think of sound as a literal extension of the visual space. In an ideal scenario, three-dimensional audio equipment would not simply attempt to make CGI-enhanced 3D fantasies seem more real. Rather, three-dimensional audio equipment would enhance the independent power that sound already carries in the cinema.