How could AI identify you?
When Tabong Kima checked his Twitter feed early Wednesday morning, the hashtag of the moment was #ImageNetRoulette.
Everyone, it seemed, was uploading selfies to a website where some kind of artificial intelligence analyzed each face and described what it saw.
The site, ImageNet Roulette, labeled one man an “orphan.”
Another was a “nonsmoker.”
A third, who wore glasses, was a “swot, grind, nerd, wonk, dweeb.”
In Kima’s Twitter feed, these labels – some accurate, some strange, some wildly off base – were played for laughs. So he joined in.
But Kima, a 24-year-old African American, didn’t like what he saw. When he uploaded his smiling photo, the site tagged him as a “criminal.”
“I may have a bad sense of humor,” he tweeted, “but I don’t think this is particularly funny.”
As it turned out, his reaction was exactly what the site was aiming for.
ImageNet Roulette is a digital art project intended to shed light on the bizarre, unsound and offensive behavior that can creep into the artificial intelligence technologies that are rapidly changing our everyday lives, including the facial recognition services used by internet companies, police departments and other government agencies.
Face recognition and other AI technologies learn their skills by analyzing vast amounts of digital data.
Drawn from old websites and academic projects, this data often contains subtle biases and other flaws that have gone unnoticed for years.
ImageNet Roulette, designed by American artist Trevor Paglen and Microsoft researcher Kate Crawford, aims to show the depth of this problem.
“We want to show how prejudice, racism and misogyny move from one system to the next,” said Paglen in a telephone interview from Paris.
“It’s about letting people see what’s going on behind the scenes to see how we’re being processed and categorized all the time.”
Presented this week as part of an exhibition at the Fondazione Prada Museum in Milan, the site draws attention to a huge photo database called ImageNet.
ImageNet was first put together more than a decade ago by a team of researchers at Stanford University, in California’s Silicon Valley, and played a critical role in the rise of “deep learning,” the mathematical technique that enables machines to recognize images, including faces.
With more than 14 million photos from across the Internet, ImageNet was a way to train AI systems and gauge their accuracy.
By analyzing different types of images, like flowers, dogs and cars, these systems learned to identify them.
What has rarely been discussed among those familiar with ImageNet is that it also contained photos of thousands of people, each sorted into categories of their own.
These included straightforward tags like “cheerleader,” “welder” and “boy scout,” as well as far more loaded labels like “failure, loser, non-starter, unsuccessful person” and a string of misogynistic slurs.
By building a project that applies such labels, whether seemingly harmless or not, Paglen and Crawford show how opinions, biases and sometimes offensive points of view can drive the development of artificial intelligence.
The ImageNet labels were applied by thousands of unknown workers, most of them in the United States, hired by the Stanford team.
Working through the crowdsourcing service Amazon Mechanical Turk, they earned pennies for each photo they labeled, churning through hundreds of labels an hour.
As they worked, biases were baked into the database, though it is impossible to know whether those biases were held by the people doing the labeling.
They defined what a “loser” looked like. And a “slut”. And a “criminal”.
The labels originally came from another large collection of data called WordNet, a type of conceptual dictionary for machines developed by researchers at Princeton University in the 1980s.
But when they adopted those inflammatory labels, the Stanford researchers may not have realized what they were doing.
Artificial intelligence is often trained on vast data sets that even its creators have not fully understood.
“This is happening all the time, and on a very large scale – and there are consequences,” said Liz O’Sullivan, who oversaw data labeling at the artificial intelligence startup Clarifai and is now part of a civil rights and privacy group called the Surveillance Technology Oversight Project, which aims to raise awareness of the problems with AI systems.
Many of the labels used in the ImageNet dataset were extreme. But the same problems can creep into labels that may seem harmless. After all, what constitutes a “man” or a “woman” is controversial.
“When labeling photos of women or girls, people may not include nonbinary people or women with short hair,” O’Sullivan said. “Then you end up with an AI model that only includes women with long hair.”
In recent months, researchers have shown that facial recognition services from companies like Amazon, Microsoft and IBM can be biased against women and people of color.
With this project, Paglen and Crawford hoped to draw more attention to the problem – and they did.
At one point this week, as the project went viral on services like Twitter, ImageNet Roulette was generating more than 100,000 labels an hour.
“It was a complete surprise to us that it took off the way it did,” said Crawford, who was with Paglen in Paris.
“It let us really see what people think of this and genuinely engage with it.”
For some it was a joke.
But others, like Kima, got the message.
“They do a pretty good job of showing what the problem is – not that I wasn’t aware of the problem before,” he said.