The ethics of a deepfake Anthony Bourdain voice

[The use of artificial intelligence technology in a new documentary to create the illusion of the late Anthony Bourdain’s voice has sparked controversy with important implications for the ethics of presence. This story from The New Yorker provides the facts and explores the reasons some are upset, including the presence experiences many have had in the form of parasocial relationships with Bourdain, and the evolving standards and techniques for disclosing illusions to audiences. –Matthew]

[Image: The artificial voice may trouble people in large part because of the close connection they feel with Bourdain, shown here in 2011. Credit: Jose Sena Goulao / EPA-EFE / Shutterstock]

The Ethics of a Deepfake Anthony Bourdain Voice

The new documentary “Roadrunner” uses A.I.-generated audio without disclosing it to viewers. How should we feel about that?

By Helen Rosner
July 17, 2021

The documentary “Roadrunner: A Film About Anthony Bourdain,” which opened in theatres on Friday, is an angry, elegant, often overwhelmingly emotional chronicle of the late television star’s life and his impact on the people close to him. Directed by Morgan Neville, the film portrays Bourdain as intense, self-loathing, relentlessly driven, preternaturally charismatic, and—in his life and in his death, by suicide, in 2018—a man who both focussed and disturbed the lives of those around him. To craft the film’s narrative, Neville drew on tens of thousands of hours of video footage and audio archives—and, for three particular lines heard in the film, Neville commissioned a software company to make an A.I.-generated version of Bourdain’s voice. News of the synthetic audio, which Neville discussed this past week in interviews with me and with Brett Martin, at GQ, provoked a striking degree of anger and unease among Bourdain’s fans. “Well, this is ghoulish”; “This is awful”; “WTF?!” people said on Twitter, where the fake Bourdain voice became a trending topic. The critic Sean Burns, who had reviewed the documentary negatively, tweeted, “I feel like this tells you all you need to know about the ethics of the people behind this project.”

When I first spoke with Neville, I was surprised to learn about his use of synthetic audio and equally taken aback that he’d chosen not to disclose its presence in his film. He admitted to using the technology for a specific voice-over that I’d asked about—in which Bourdain improbably reads aloud a despairing e-mail that he sent to a friend, the artist David Choe—but did not reveal the documentary’s other two instances of technological wizardry. Creating a synthetic Bourdain voice-over seemed to me far less crass than, say, a C.G.I. Fred Astaire put to work selling vacuum cleaners in a Dirt Devil commercial, or a holographic Tupac Shakur performing alongside Snoop Dogg at Coachella, and far more trivial than the intentional blending of fiction and nonfiction in, for instance, Errol Morris’s “The Thin Blue Line.” Neville used the A.I.-generated audio only to narrate text that Bourdain himself had written. Bourdain composed the words; he just—to the best of our knowledge—never uttered them aloud. Some of Neville’s critics contend that Bourdain should have the right to control the way his written words are delivered. But doesn’t a person relinquish that control anytime his writing goes out into the world? The act of reading—whether an e-mail or a novel, in our heads or out loud—always involves some degree of interpretation. I was more troubled by the fact that Neville said he hadn’t interviewed Bourdain’s former girlfriend Asia Argento, who is portrayed in the film as the agent of his unravelling.

Besides, documentary film, like nonfiction writing, is a broad and loose category, encompassing everything from unedited, unmanipulated vérité to highly constructed and reconstructed narratives. Winsor McCay’s short “The Sinking of the Lusitania,” a propaganda film, from 1918, that’s considered an early example of the animated-documentary form, was made entirely from reënacted and re-created footage. Ari Folman’s Oscar-nominated “Waltz with Bashir,” from 2008, is a cinematic memoir of war told through animation, with an unreliable narrator, and with the inclusion of characters who are entirely fictional. Vérité is “merely a superficial truth, the truth of accountants,” Werner Herzog wrote in his famous manifesto “Minnesota Declaration.” “There are deeper strata of truth in cinema, and there is such a thing as poetic, ecstatic truth. It is mysterious and elusive, and can be reached only through fabrication and imagination and stylization.” At the same time, “deepfakes” and other computer-generated synthetic media have certain troubling connotations—political machinations, fake news, lies wearing the HD-rendered face of truth—and it is natural for viewers, and filmmakers, to question the boundaries of their responsible use. Neville’s offhand comment, in his interview with me, that “we can have a documentary-ethics panel about it later,” did not help assure people that he took these matters seriously.

On Friday, to help me unknot the tangle of ethical and emotional questions raised by the three bits of “Roadrunner” audio (totalling a mere forty-five seconds), I spoke to two people who would be well-qualified for Neville’s hypothetical ethics panel. The first, Sam Gregory, is a former filmmaker and the program director of Witness, a human-rights nonprofit that focusses on ethical applications of video and technology. “In some senses, this is quite a minor use of a synthetic-media technology,” he told me. “It’s a few lines in a genre where you do sometimes construct things, where there aren’t fixed norms about what’s acceptable.” But, he explained, Neville’s re-creation, and the way he used it, raise fundamental questions about how we define ethical use of synthetic media.

The first has to do with consent, and what Gregory described as our “queasiness” around manipulating the image or voice of a deceased person. In Neville’s interview with GQ, he said that he had pursued the A.I. idea with the support of Bourdain’s inner circle—“I checked, you know, with his widow and his literary executor, just to make sure people were cool with that,” he said. But early on Friday morning, as the news of his use of A.I. ricocheted, his ex-wife Ottavia Busia tweeted, “I certainly was NOT the one who said Tony would have been cool with that.” On Saturday afternoon, Neville wrote to me that the A.I. idea “was part of my initial pitch of having Tony narrate the film posthumously à la Sunset Boulevard—one of Tony’s favorite films and one he had even reenacted himself on Cook’s Tour,” adding, “I didn’t mean to imply that Ottavia thought Tony would’ve liked it. All I know is that nobody ever expressed any reservations to me.” (Busia told me, in an e-mail, that she recalled the idea of A.I. coming up in an initial conversation with Neville and others, but that she didn’t realize that it had actually been used until the social-media flurry began. “I do believe Morgan thought he had everyone’s blessing to go ahead,” she wrote. “I took the decision to remove myself from the process early on because it was just too painful for me.”)

A second core principle is disclosure—how the use of synthetic media is or is not made clear to an audience. Gregory brought up the example of “Welcome to Chechnya,” the film, from 2020, about underground Chechen activists who work to free survivors of the country’s violent anti-gay purges. The film’s director, David France, relied on deepfake technology to protect the identities of the film’s subjects by swapping their faces for others, but he left a slight shimmer around the heads of the activists to alert his viewers to the manipulation—what Gregory described as an example of “creative signalling.” “It’s not like you need to literally label something—it’s not like you need to write something across the bottom of the screen every time you use a synthetic tool—but it’s responsible to just remind the audience that this is a representation,” he said. “If you look at a Ken Burns documentary, it doesn’t say ‘reconstruction’ at the bottom of every photo he’s animated. But there’s norms and context—trying to think, within the nature of the genre, how we might show manipulation in a way that’s responsible to the audience and doesn’t deceive them.”

Gregory suggested that much of the discomfort people are feeling about “Roadrunner” might stem from the novelty of the technology. “I’m not sure that it’s even all that much about what the director did in this film—it’s because it’s triggering us to think how this will play out, in terms of our norms of what’s acceptable, our expectations of media,” he said. “It may well be that in a couple of years we are comfortable with this, in the same way we’re comfortable with a narrator reading a poem, or a letter from the Civil War.”

“There are really awesome creative uses for these tools,” my second interviewee, Karen Hao, an editor at the MIT Technology Review who focusses on artificial intelligence, told me. “But we have to be really cautious of how we use them early on.” She brought up two recent deployments of deepfake technology that she considers successful. The first, a 2020 collaboration between artists and A.I. companies, is an audio-video synthetic representation of Richard Nixon reading his infamous “In Event of Moon Disaster” speech, which he would have delivered had the Apollo 11 mission failed and Neil Armstrong and Buzz Aldrin perished. (“The first time I watched it, I got chills,” Hao said.) The second, an episode of “The Simpsons,” from March, in which the character Mrs. Krabappel, voiced by the late actress Marcia Wallace, was resurrected by splicing together phonemes from earlier recordings, passed her ethical litmus test because, in a fictional show like “The Simpsons,” “you know that the person’s voice is not representing them, so there’s less attachment to the fact that the voice might be fake,” Hao said. But, in the context of a documentary, “you’re not expecting to suddenly be viewing fake footage, or hearing fake audio.”
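[Technical aside from the editor: the phoneme-splicing technique Hao credits in the “Simpsons” episode is, at its core, concatenative synthesis, in which short recorded sound units are stitched end to end to form words the speaker never said. The Python sketch below is a toy illustration of that general idea only; the clip file names are hypothetical, the show’s actual production pipeline has not been published in this form, and voice clones like the one in “Roadrunner” reportedly rely instead on machine-learning models trained on recordings of the speaker, a harder-to-detect approach.

# Toy illustration of concatenative synthesis: joining pre-recorded
# mono WAV phoneme clips end to end. All clips are assumed to share
# one sample rate; clip names are hypothetical. Real systems also
# smooth the joins and model prosody, which this naive version ignores.
import wave

def splice_phonemes(clip_paths, out_path):
    frames, params = [], None
    for path in clip_paths:
        with wave.open(path, "rb") as clip:
            if params is None:
                params = clip.getparams()  # channels, sample width, rate
            frames.append(clip.readframes(clip.getnframes()))
    with wave.open(out_path, "wb") as out:
        out.setparams(params)  # frame count is corrected when the file closes
        for chunk in frames:
            out.writeframes(chunk)

# Hypothetical phoneme clips for the word "hello": /h/ /eh/ /l/ /ow/
splice_phonemes(["h.wav", "eh.wav", "l.wav", "ow.wav"], "hello.wav")

Even this crude stitching suggests why such audio can sound uncannily like, and yet not quite like, the original speaker: every sound is authentically theirs, while the sequence never happened. –Matthew]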

A particularly unsettling aspect of the Bourdain voice clone, Hao speculated, may be its hybridization of reality and unreality: “It’s not clearly faked, nor is it clearly real, and the fact that it was his actual words just muddles that even more.” In the world of broadcast media, deepfake and synthetic technologies are logical successors to ubiquitous—and more discernible—analog and digital manipulation techniques. Already, face renders and voice clones are up-and-coming technologies in scripted media, especially in high-budget productions, where they promise to provide an alternative to laborious and expensive practical effects. But the potential of these technologies is undermined “if we introduce the public to them in jarring ways,” Hao said, adding, “It could prime the public to have a more negative perception of this technology than perhaps is deserved.” The fact that the synthetic Bourdain voice was undetected until Neville pointed it out is part of what makes it so unnerving. “I’m sure people are asking themselves, How many other things have I heard where I thought this is definitely real, because this is something X person would say, and it was actually fabricated?” Hao said. Still, she added, “I would urge people to give the guy”—Neville—“some slack. This is such fresh territory. . . . It’s completely new ground. I would personally be inclined to forgive him for crossing a boundary that didn’t previously exist.”

Setting aside questions of technological ethics, the artificial voice may trouble people in large part because of the close connection they feel with Bourdain—what psychologists call a parasocial relationship. “There’s this visceral reaction of, Hey, whoa, you potentially manipulated our understanding of Anthony Bourdain—what he would have said, how he would have portrayed himself—without his consent and without our knowing,” Hao said. Parasocial intimacy can be profound and deeply precious; the depth and duration of grief over Bourdain’s death—ironically, the same grief that Neville navigates and dissects in “Roadrunner”—is proof of that. The synthetic line readings trigger our revulsion at the uncanny valley of artificial intelligence, which in turn threatens to corrode the sanctity of our private relationships with Bourdain and his memory.

A common refrain on Twitter in recent days has been that Bourdain would surely have hated the use of his A.I.-generated voice. He is often considered a champion of “authenticity,” and some fans have criticized the technology as antithetical to that standard. But authenticity is a slippery concept, especially in the world of food and travel—something that Bourdain himself knew. “The word authentic has become a completely ridiculous, snobbish term,” he once told Time. We can’t know how he would have felt about the vocal clone, of course. When it comes to matters of journalistic ethics, the opinion of the subject isn’t really relevant, and maybe it’s irresponsible of us to speculate. But, given Bourdain’s formalist, cinematic approach to his TV storytelling—not to mention his documented love of Werner Herzog—it’s not unreasonable to imagine that he would have subscribed to Herzog’s notion of ecstatic truth. He almost certainly would have relished the passionate dialogue about cinematic ethics that “Roadrunner” has kicked off. “Our process is much more manipulative than the fly-on-the-wall approach,” Bourdain said, of his CNN travel show, “Parts Unknown,” in an interview from 2013. “We try to get the audience to a particular place. Our shows are very subjective, with no attempt of being fair. . . . If I feel oppressed or a particular mood, I want the audience to feel that way, too, by all means possible.”

