Can You Tell If This Headline Was Written by a Robot?

From a Wall Street Journal story by Christopher Mims headlined “Can You Tell Whether This Headline Was Written by a Robot?”:

You probably haven’t noticed, but there’s a good chance that some of what you’ve read on the internet was written by robots. And it’s likely to be a lot more soon.

Artificial-intelligence software programs that generate text are becoming sophisticated enough that their output often can’t be distinguished from what people write. And a growing number of companies are seeking to make use of this technology to automate the creation of information we might rely on, according to those who build the tools, academics who study the software, and investors backing companies that are expanding the types of content that can be auto-generated.

“It is probably impossible that the majority of people who use the web on a day-to-day basis haven’t at some point run into AI-generated content,” says Adam Chronister, who runs a small search-engine optimization firm in Spokane, Wash. Everyone in the professional search-engine optimization groups of which he’s a part uses this technology to some extent, he adds. Mr. Chronister’s customers include dozens of small and medium businesses, and for many of them he uses AI software custom-built to quickly generate articles that rank high in Google’s search results—a practice called content marketing—and so draw potential customers to these websites.

“Most of our customers don’t want it being out there that AI is writing their content,” says Alex Cardinell, chief executive of Glimpse.ai, which created Article Forge, one of the services Mr. Chronister uses. “Before applying for a small business loan, it’s important to research which type of loan you’re eligible to receive,” begins a 1,500-word article the company’s AI wrote when asked to pen one about small business loans. The company has many competitors, including SEO.ai, TextCortex AI and Neuroflash.

Google knows that the use of AI to generate content surfaced in search results is happening, and is fine with it, as long as the content produced by an AI is helpful to the humans who read it, says a company spokeswoman. Grammar checkers and smart suggestions—technologies Google itself offers in its tools—are of a piece with AI content generation, she adds.

“Our ranking team focuses on the usefulness of content, rather than how the content is produced,” says Danny Sullivan, public liaison for search at Google. “This allows us to create solutions that aim to reduce all types of unhelpful content in search, whether it’s produced by humans or through automated processes.”

AI content services are thriving. They make content creators more productive, but they also are able to produce content that no one can tell was made by a machine. This is also often true of AI-generated content of other kinds, including images, video, audio, and synthetic customer service representatives.

As with other types of automation, there are many potential benefits to having AI handle basic writing tasks that are often mere drudgery for humans. That said, widespread and undetectable synthetic content also poses considerable dangers. For one, it risks replacing a vast and thriving ecosystem of human workers, as has happened in so many industries subject to automation before, with a shrinking number of big entities that will thereby have greater power to shape what people think. At its worst, it could give bad actors a powerful tool to spread deception in moments of crisis, like war.

The rise of AI-generated content is made possible by a phenomenon known variously as computational creativity, artificial creativity or generative AI. This field, which had only a handful of companies in it two or three years ago, has exploded to more than 180 startups at present, according to data gathered by entrepreneur Anne-Laure Le Cunff. These companies have collected hundreds of millions of dollars in investment in recent months even as the broader landscape for tech funding has become moribund.

A lot of the content we are currently encountering on the internet is auto-generated, says Peter van der Putten, an assistant professor at Leiden Institute of Advanced Computer Science at Leiden University in the Netherlands. And yet we are only at the beginning of the deployment of automatic content-generation systems. “The world will be quite different two to three years from now because people will be using these systems quite a lot,” he adds.

By 2025 or 2030, 90% of the content on the internet will be auto-generated, says Nina Schick, author of a 2020 book about generative AI and its pitfalls. It’s not that nine out of every 10 things we see will be auto-generated, but that automatic generation will hugely increase the volume of content available, she adds. Some of this could come in the form of personalization, such as marketing messages containing synthetic video or actors tuned to our individual tastes. In addition, a lot of it could just be auto-generated content shared on social media, like text or video clips people create with no more effort than what’s required to enter a text prompt into a content-generation service.

Here are a few examples of the coming bounty of synthetic media: Artists, marketers and game developers are already using services like Dall-E, Midjourney and Stable Diffusion to create richly detailed illustrations in the style of different artists, as well as photo-realistic flights of fancy. Researchers at the Meta AI division of Facebook parent Meta Platforms unveiled in September a system that can automatically generate videos from a text prompt, and Google unveiled what appears to be an even more sophisticated version of such a system in October.

Dr. van der Putten and his team have created a system that can write newspaper articles in the style of any paper fed into their software. (The Wall Street Journal has its own AI article-writing tool, created in collaboration with Narrativa, a “language generation AI system” which helps a human writer produce some market updates.)

Automatic text-generation systems are helping novelists speed up their writing process and powering customer service chatbots. They also power Replika, a service that hundreds of thousands of people treat as their artificial boyfriend or girlfriend, and with whom many say they’ve fallen in love.

One downside of this type of artificial creativity is the potential erosion of trust. Take online reviews, where AI is exacerbating deceptive behavior. Algorithmically generated fake reviews are on the rise on Amazon and elsewhere, says Saoud Khalifah, CEO of Fakespot, which makes a browser plug-in that flags such forgeries. While most fraudulent reviews are still written by humans, about 20% are written by algorithms, and that figure is growing, according to his company’s detection systems, he adds.

In the past, Amazon has said that Fakespot can’t tell which reviews are real or not on its site, because it lacks access to the company’s internal data. It has also said that more than 99% of the reviews read by customers on its site were authentic.

It’s important to note that much of the content created by these systems has errors or eccentricities of a type that humans don’t introduce. Some of what AI produces must still be reviewed and in some sense edited by a human.

Dr. van der Putten’s newspaper article writing AI, for example, can automatically rewrite a straight news article in the tone and with the political slant of a more partisan outlet, but its output also can contain factual errors. (In one article, it identified the capital of the Netherlands as The Hague.)

AI-generated images often have strange artifacts in them. Dall-E, in particular, is bad at rendering hands or correctly drawing the right number of limbs on a person or animal. Systems to automatically generate video from text prompts can only generate short clips, and are so far the most primitive of all these systems in terms of their output.

But at the intersection of skilled humans and sophisticated AI, the results can be as good as or better than what can be produced by humans alone, and can be produced much faster, making human creators more productive. For example, while Mr. Chronister’s team uses AI text-generation services to create libraries of content for some customers (say, answers to common plumbing questions, meant to attract people to the site of a local plumber), his writers still review that content, and may edit it to further enhance its attractiveness to Google’s search algorithm. “It doesn’t replace the writer, but it can supplement their process,” he adds.

Hour One is a company that virtually “clones” real people, by creating a photo-realistic version of them that can automatically speak any text fed to it, in the original person’s voice. The results are still a bit stiff. In a video in which YouTube personality Dom Esposito uses the technology to make a virtual copy of himself, it’s obvious which parts of the video are automatically generated and which are the real Mr. Esposito.

But this technology is rapidly evolving. Recently, a deepfake version of the actor Keanu Reeves has taken TikTok by storm, racking up more than 550 million views on the app and fooling many viewers into thinking it’s the real thing. Other celebrities, like Tom Cruise, are also getting the deepfake treatment.

Aside from their believability, the major difference between the virtual actors generated by Hour One and these deepfakes is how they’re licensed, says Natalie Monbiot, head of strategy at Hour One. Ms. Monbiot’s company pays actors who agree to license out their virtual selves for clients, including companies seeking virtual hosts for instructional videos. Deepfakes, on the other hand, currently exist in a legal gray area, and there often isn’t a licensing relationship between those who create them and the people they simulate.

The dangers of an internet full of AI-generated content are myriad. For one, many content generation AIs have well-known biases. For example, an AI researcher recently documented how putting the term “ambitious CEO” into the Stable Diffusion AI image generator yielded no images of women executives. Arguments about bias in AI can go both ways—on the one hand, without accountability, AI can make biased decisions just like humans do. On the other, when it’s software producing content or making decisions, it could be easier to audit and systematically correct these biases.

There is also a broader danger of what happens as the best creators and companies become more productive, and able to generate much more content than mere humans can manage on their own. Depending on how much people like this content and how much cheaper it is to create, we could eventually get to a world in which much of the content we are all consuming on the internet on a daily basis is created by fewer and fewer people and companies. In the extreme version of this dystopia, some parts of the internet, from search results and short videos to social media posts calculated to go viral, are eventually generated almost entirely by artificial intelligences, says Andrei Lyskov, a developer at Coinbase who outlined this possibility in a recent essay.

Even if AI-generated content doesn’t take over the whole internet, as it becomes more widespread, there’s a danger that all of us come to trust whatever we see even less than we do now, says Ms. Schick. Arguably, this has already happened with AI curating our feeds and showing us content that plays to our biases and increases polarization. But with algorithms pumping out more content than ever, potentially tailored to our worldviews and inclinations, this situation could become even worse.

“It’s the liar’s dividend,” says Ms. Schick. “If anything can be faked, then why should I believe anything is real?”

Christopher Mims is a columnist who writes about technology for The Wall Street Journal’s tech bureau in San Francisco.
