Technology Revives Past Voices for Audiobooks

From a Wall Street Journal story by Jeffrey A. Trachtenberg headlined “Technology Revives Past Voices for Audiobooks”:

Edward Herrmann, a prolific actor who narrated dozens of audiobooks, has been dead for almost a decade. But that hasn’t prevented him from being the voice of several recent audiobooks.

Mr. Herrmann’s latest work is generated by DeepZen Ltd., a London-based artificial-intelligence startup that was given access to the actor’s past recordings with his family’s permission. From that trove, DeepZen said it is able to generate any sound and intonation that Mr. Herrmann would have used if he were narrating these new books himself.

“We felt it was an amazing way to carry on his legacy,” said Rory Herrmann, a Los Angeles restaurateur and Mr. Herrmann’s son. He said he was astonished when he first listened to an audiobook featuring his father’s synthetic voice.

“It’s a wow moment,” he said.

Generative AI technology, a type of artificial intelligence that can create various types of content including text, images and audio, has become a buzzword since OpenAI’s ChatGPT was launched late last year. The chatbot—which can eloquently answer seemingly any question, but is sometimes spectacularly wrong—became an overnight global phenomenon, fueling speculation that AI could fundamentally reshape many professions.

AI’s reach into audiobook narration isn’t merely theoretical. Thousands of AI-narrated audiobooks are available on popular marketplaces including Alphabet Inc.’s Google Play Books and Apple Inc.’s Apple Books. Amazon.com Inc., whose Audible unit is the largest U.S. audiobook service, doesn’t offer any for now, but says it is evaluating its position.

The technology hasn’t been widely embraced by the largest U.S. book publishers, which mostly use it for marketing efforts and some foreign-language titles. But it is a boon for smaller outfits and little-known authors, whose books might not have the sales potential to warrant the cost—traditionally at least $5,000—of recording an audio version.

Apple and Google said they allow users to create audiobooks free of charge that use digitally replicated human voices. The voices featured in audiobooks generated by Apple and Google come from real people, whose voices helped train their automated-narration engines.

Charles Watkinson, director of the University of Michigan Press, said the publisher has made about 100 audiobooks using Google’s free auto-narrated audiobook platform since early last year. The new technology made those titles possible because it eliminated the costs associated with using a production studio, support staff and human narrators.

“From what I can see, human narrators are freaking out,” said Dima Abramov, chief executive of Speechki, an Austin, Texas-based audiobook producer that uses synthetically narrated voices.

Scott Brick, who has narrated more than 1,000 audiobooks by such authors as Tom Clancy and Nelson DeMille, said AI auto-narration is best suited for nonfiction titles, where narrators and readers aren’t as emotionally invested as with works of fiction.

“There’s realism there, but no soul,” Mr. Brick said.

DeepZen has worked with more than 30 professional actors to help its AI engine capture all the ranges of human emotion, said Taylan Kamis, its CEO and co-founder.

Melissa Papel, a Paris-born actress who records from her home studio in Los Angeles, said she recorded eight hours of content for DeepZen, reading in French from different books. “One called for me to read in an angry way, another in a disgusted way, a humorous way, a dramatic way,” she said.

Ms. Papel said there is still plenty of work for professional narrators because the new era of AI auto-narration is just getting under way, though she said that might not be the case in the future.

“I understood that they would use my voice to teach software how to speak more humanly,” Ms. Papel said. “I didn’t realize they could use my voice to pronounce words I didn’t say. That’s incredible.”

DeepZen pays its narrators a flat fee plus a royalty based on the revenue the company generates from different projects. The agreements span multiple years, Mr. Kamis said.

Jeffrey Bennett, general counsel for the Screen Actors Guild-American Federation of Television and Radio Artists, a national union that represents performers, including professional audiobook narrators, said he expects AI to eventually disrupt the industry.

“Anything we’re seeing and hearing now will get better and better,” he said. Mr. Bennett added that the union is working to protect voice and likeness rights. “We do not believe the disruptions are unmanageable for professional voice talent,” he said.

Simi Linton said she isn’t a fan of “Mary,” the synthetic voice that the University of Michigan Press picked to narrate “My Body Politic,” her 2005 memoir that focuses on her life as a disabled woman.

“The reason I agreed to have this synthetic creation made was to increase the accessibility of my book for blind readers and those with other impairments,” she said of the audiobook, which became available in 2020.

Audiobooks remain a bright spot in an industry that has struggled since Americans returned to prepandemic activities. Audiobook sales rose 7% last year, according to the Association of American Publishers, while print book sales declined by 5.8%, according to book tracker Circana BookScan.

James Daunt, CEO of Barnes & Noble, which is already selling AI-generated titles, said such audiobooks are welcome as long as they are clearly labeled.

Audible is evaluating its approach to audiobooks narrated by artificial intelligence, according to a spokeswoman. “Professional narration has always been and will remain core to the Audible listening experience,” she said, but “we see a future in which human performances and text-to-speech generated content can coexist.”

The country’s biggest publishers are on the fence. Ana Maria Allessi, publisher of Hachette Audio, the audiobook arm of Lagardère SA’s Hachette Book Group, said the publisher hasn’t yet tested titles made with AI technology. She said she is open-minded about computer-generated recordings, as long as they are clearly marked.

HarperCollins Publishers, which like The Wall Street Journal is owned by News Corp, is relying on AI for limited uses. The publisher is testing some AI-generated narration for its audiobooks in foreign markets to gauge quality and consumer reaction. HarperCollins used Google to make those audiobooks.

HarperCollins doesn’t sell any such audiobooks in the U.S., but it recently started sending audio files generated by DeepZen to retailers and reviewers ahead of a book’s publication. The company deems those more environmentally friendly than printing advance copies, said Chantal Restivo-Alessi, the publisher’s chief digital officer. The audio files aren’t for sale.

DeepZen says it has signed deals with 35 publishers in the U.S. and abroad and is working with 25 authors.

The artificial intelligence engine created by DeepZen uses machine-learning software that replicates how a person speaks as well as the characteristics of that person’s voice. It can add emotion and focus on hard-to-pronounce words, such as names of characters in science-fiction novels or the smallest towns in China, Mr. Kamis said.

“It’s easier than using a human narrator,” he said.

Josiah Ziegler, a psychiatrist in Fort Collins, Colo., last year created Intellectual Classics, which focuses on nonfiction works that are out of copyright and don’t have an audiobook edition.

He chose Mr. Herrmann as the narrator for “The War with Mexico,” a work by Justin H. Smith that won the 1920 Pulitzer Prize for history and whose audiobook version Dr. Ziegler expects to publish later this year.

“I knew his voice,” Dr. Ziegler said. “He was very good, and I thought I’d give it a try.”

DeepZen, which has created nearly a hundred audiobooks featuring Mr. Herrmann’s voice, is pursuing the rights of other well-known stars who have died.

“We are looking to expand our library, but we can’t divulge anything else,” Mr. Kamis said.
