Misinformation, mistakes and the Pope in a puffer: what rapidly evolving AI can – and can’t – do

·8-min read

Generative AI – including large language models such as GPT-4, and image generators such as DALL-E, Midjourney, and Stable Diffusion – is advancing in a “storm of hype and fright”, as some commentators have observed.

Recent advances in artificial intelligence have yielded warnings that the rapidly developing technology may result in “ever more powerful digital minds that no one – not even their creators – can understand, predict, or reliably control”.

Related: I thought I was immune to being fooled online. Then I saw the pope in a coat | Joel Golby

That’s according to an open letter signed by more than 1,000 AI experts, researchers and backers, which calls for an immediate pause on the creation of “giant” AIs for six months so that safety protocols can be developed to mitigate their dangers.

But what is the technology currently capable of doing?

It can generate photorealistic images

Midjourney creates images from text descriptions. It has improved significantly in recent iterations, with version five capable of producing photorealistic images.

These include the faked images of Trump being arrested, which were created by Eliot Higgins, founder of the Bellingcat investigative journalism network.

Midjourney was also used to generate the viral image of Pope Francis in a Balenciaga puffer jacket, which has been described by web culture writer Ryan Broderick as “the first real mass-level AI misinformation case”. (The creator of the image has said he came up with the idea after taking magic mushrooms.)

Image generators have raised serious ethical concerns around artistic ownership and copyright, with evidence that some AI programs have being trained on millions of online images without permission or payment, leading to class action lawsuits.

Tools have been developed to protect artistic works from being used by AI, such as Glaze, which uses a cloaking technique that prevents an image generator from accurately being able to replicate the style in an artwork.

It can convincingly replicate people’s voices

AI-generated voices can be trained to sound like specific people, with enough accuracy that it fooled a voice identification system used by the Australian government, a Guardian Australia investigation revealed.

In Latin America, voice actors have reported losing work because they have been replaced by AI dubbing software. “An increasingly popular option for voice actors is to take up poorly paid recording gigs at AI voiceover companies, training the very technology that aims to supplant them,” a Rest of World report found.

It can write

GPT-4, the most powerful model released by OpenAI, can code in every computer programming language and write essays and books. Large language models have led to a boom in AI-written ebooks for sale on Amazon. Some media outlets, such as CNET, have reportedly used AI to write articles.

Video AI is getting a lot better

There are now text-to-video generators available, which, as their name suggests, can turn a text description into a moving image.

It can turn 2D images into 3D

AI is also getting better at turning 2D still images into 3D visualisations.

It makes factual errors and hallucinates

AI, particularly large language models that are used for chatbots such as ChatGPT, is notorious for making factual mistakes that are easily missed because they seem reasonably convincing.

For every example of a functional use for AI chatbots, there is seemingly a counter-example of its failure.

Prof Ethan Mollick at the Wharton School of the University of Pennsylvania, for example, tested GPT-4 and was able to provide a fair peer review of a research paper as if it were an economic sociologist.

However, Robin Bauwens, an assistant professor at Tilburg University in the Netherlands, had an academic paper rejected by a reviewer, who had likely used AI as the reviewer suggested he familiarise himself with academic papers that had been made up.

The question of why AI generates fake academic papers relates to how large language models work: they are probabilistic, in that they map the probability over sequences of words. As Dr David Smerdon of the University of Queensland puts it: “Given the start of a sentence, it will try to guess the most likely words to come next.”

In February, Bing launched a pre-recorded demo of its AI. As the software engineer Dmitri Brereton has pointed out, the AI was asked to generate a five-day itinerary for Mexico City. Of five descriptions of suggested nightlife options, four were inaccurate, Brereton found. In summarising the figures from a financial report, Brereton found, it also managed to fudge the numbers badly.

It can create (cursed) instructions and recipes

ChatGPT has been used to write crochet patterns, resulting in hilariously cursed results.

Related: Robot recruiters: can bias be banished from AI hiring?

GPT-4, the latest iteration of the AI behind the chatbot, can also provide recipe suggestions based on a photograph of the contents of your fridge. I tried this with several images from the Fridge Detective subreddit, but not once did it return any recipe suggestions containing ingredients that were actually in the fridge pictures.

It can act as an assistant to do administrative tasks

“Advances in AI will enable the creation of a personal agent,” Bill Gates wrote this week. “Think of it as a digital personal assistant: It will see your latest emails, know about the meetings you attend, read what you read, and read the things you don’t want to bother with.”

“This will both improve your work on the tasks you want to do and free you from the ones you don’t want to do.”

For years, Google Assistant’s AI has been able to make reservations at restaurants via phone calls.

OpenAI has now enabled plugins for GPT-4, enabling it to look up data on the web and to order groceries.