Bad science journalism: Gay facial recognition

Journalistic accounts of a soon-to-be-published study called “Deep neural networks are more accurate than humans at detecting sexual orientation from facial images” (by Michal Kosinski and Yilun Wang) have gone viral and have already prompted outraged reactions from the LGBT groups GLAAD and the Human Rights Campaign. The study trained a deep neural network face-recognition program on photos of white homosexual and heterosexual adults obtained from a dating website, and used it to create a “classifier” that rates which photographs are most distinctively those of gay or lesbian people. This classifier’s ability to distinguish gay and lesbian individuals was compared with that of human observers, on test samples from the data and on Facebook profile pictures with a stated sexual orientation.

This is all a vaguely interesting computer science project about self-presentation (all of the images were curated by the people involved and put on profiles stating an “interest in” one sex or the other), machine learning, and perception. Interesting, that is, until it is attached to fears about artificial omniscience and ubiquitous surveillance, and to debates about nature and nurture. Then it becomes by turns frightening and polemical.

Before we get there (and I’ll update this post with some comments about the authors’ dubious understanding of the many social layers that separate, say, pre-natal hormones from early adult physical presentation, the fluidity of sexual orientation, and the presumed future capacity of artificial intelligence to make omniscient predictions), we have to ask whether the results of this study justify such grand implications. In other words, we first need to know what exactly the study shows.

Let me begin with four simple asks for journalists reporting on science:

  1. Read the whole scientific paper and explain to readers what actual evidence is being presented!
  2. Also, remember that “discussion” sections of papers lack the scientific validity that is attached to results of the research method involved.
  3. Be literate in math.
  4. Never ever present a numerical result without explaining what that number means.

Unfortunately, major accounts of the paper (such as this one in the Guardian) fail to follow these simple rules. And, as is often the case, the problem starts with the headline:

New AI can guess whether you’re gay or straight from a photograph
An algorithm deduced the sexuality of people on a dating site with up to 91% accuracy, raising tricky ethical questions

Now, does the paper show that the AI can guess your sexuality from a photograph with 91% accuracy? Nope.

As the paper states:

The AUC = .91 does not imply that 91% of gay men in a given population can be identified, or that the classification results are correct 91% of the time.

Here’s where the 91% figure comes from. The AI is shown five photos each of two individuals on the dating website. Based on what it has learned from other photos, it offers a guess as to which of the two is more likely to be gay. In 91% of the cases where a gay man and a straight man are compared, it guesses correctly. An accurate headline:

AI can pick the gay man out of a pair of dating profiles, given five pics each, 91% of the time.

When presented with just one pair of images of men, the AI guessed right 81% of the time. Human judges—recruited via Mechanical Turk and untrained on any of the images—guessed right just 61% of the time. For women, both were right less often: 71% for the AI and 54% for the humans. In this test, 50% is rock bottom, the equivalent of zero gaydar.
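
It’s worth pausing on what that kind of pairwise number does and doesn’t measure. Here’s a minimal sketch, with scores invented purely for illustration (this is not the authors’ data or code), of how an AUC-style figure is computed:

```python
# A minimal sketch of what an AUC-style number measures: the fraction
# of (gay, straight) pairs in which a classifier scores the gay
# profile higher. All scores here are invented for illustration.
import random

random.seed(0)

# Hypothetical classifier scores; higher means "rated more likely gay".
gay_scores = [random.gauss(0.65, 0.15) for _ in range(70)]
straight_scores = [random.gauss(0.45, 0.15) for _ in range(930)]

# Pairwise ranking accuracy over every gay/straight pairing.
pairs = len(gay_scores) * len(straight_scores)
wins = sum(g > s for g in gay_scores for s in straight_scores)
print(f"pairwise (AUC-style) accuracy: {wins / pairs:.2f}")

# Note what this does NOT measure: how often a single individual
# would be labeled correctly, which depends on a chosen threshold
# and on the base rate of gay men in the population.
```

A high pairwise score, in other words, says nothing by itself about how often a single person would be labeled correctly.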

But it gets worse. Let’s try to apply the paper to the original question raised by the headline. How well can this AI judge an individual person’s sexuality? That’s the critical ability, the one from which dystopian surveillance fears arise. For this, the researchers seem to have tuned the data very carefully. Remember, too, that this is still an operation performed on profile pics, this time from Facebook.

First, the AI classifier still seems to work, though not as well:

The classifier could accurately distinguish between gay Facebook users and heterosexual dating-website users in 74% of cases…

But when presented with the task not of telling a gay profile pic from a straight one, but of evaluating whether a given profile pic is gay, the machine’s performance fell apart:

The performance of the classifier depends on the desired trade-off between precision (e.g., the fraction of gay people among those classified as gay) and recall (e.g., the fraction of gay people in the population correctly identified as gay). Aiming for high precision reduces recall, and vice versa.

Let us illustrate this trade-off… We simulated a sample of 1,000 men by randomly drawing participants, and their respective probabilities of being gay, from the sample used in Study 1a. As the prevalence of same-gender sexual orientation among men in the U.S. is about 6–7%, we drew 70 probabilities from the gay participants, and 930 from the heterosexual participants. We only considered participants for whom at least 5 facial images were available; note that the accuracy of the classifier in their case reached an AUC = .91. Setting the threshold above which a given case should be labeled as being gay depends on a desired trade-off between precision and recall. To maximize precision (while sacrificing recall), one should select a high threshold or select only a few cases with the highest probability of being gay. Among 1% (i.e., 10) of individuals with the highest probability of being gay in our simulated sample, 9 were indeed gay and 1 was heterosexual, leading to the precision of 90% (9/10 = 90%). This means, however, that only 9 out of 70 gay men were identified, leading to a low recall of 13% (9/70 = 13%). To boost recall, one needs to sacrifice some of the precision. Among 30 individuals with the highest probability of being gay, 23 were gay and 7 were heterosexual (precision = 23/30= 77%; recall = 23/70= 33%). Among the top 100 males most likely to be gay, 47 were gay (precision = 47%; recall = 68%).

Tuned to its most precise setting, the machine could find nine of the seventy gay men and threw one straight man into the gay box. Set to a broader setting, the machine found 47 of the 70 gay men, but also labelled 53 straight men as gay.

Now we have a big technical problem: the artificial gaydar can only find most of the gay people by producing a pool of “gay looking” people that is majority straight. So no matter how repressive and homophobic the society, it’s hard to imagine that the flagged “gay looking” 10% of the population, most of them straight, would put up with this kind of system.
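
To make the base-rate arithmetic concrete, here’s a minimal sketch of the kind of simulation the paper describes. It is my own illustration with invented scores, not the authors’ code, so the exact precision and recall figures won’t match the paper’s 90%/13%, 77%/33%, and 47%/68%:

```python
# A minimal sketch of the paper's base-rate simulation, with invented
# scores (not the authors' data). 70 gay and 930 straight men
# approximate a 7% base rate in a sample of 1,000.
import random

random.seed(0)

people = [("gay", random.gauss(0.65, 0.15)) for _ in range(70)]
people += [("straight", random.gauss(0.45, 0.15)) for _ in range(930)]
people.sort(key=lambda p: p[1], reverse=True)  # most "gay looking" first

for k in (10, 30, 100):  # the paper's three thresholds
    flagged = people[:k]  # the k highest-scoring men get labeled gay
    hits = sum(label == "gay" for label, _ in flagged)
    print(f"top {k:3d}: precision {hits / k:.0%}, recall {hits / 70:.0%}")
```

The pattern, not the exact numbers, is the point: at a roughly 7% base rate, widening the net to raise recall necessarily fills the “gay” pool with straight men.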
Of course, if we imagine that gay and straight people really have different faces and we just haven’t found the magic formula yet (and, for what it’s worth, the authors seem to leap to this conclusion), then we can imagine a better AI figuring out how to tell the difference. But there are plenty of reasons to doubt that this ever has been or ever will be the case.

Trawling for terrorists… aggressive prosecution, racial profiling, one million names

For those of you as curious as I am about how exactly the Justice Department has pursued its so-called terrorism cases (the sketchy Liberty City 7 case has been discussed here in the past), there’s now a fascinating look at one of its few “successful” prosecutions thus far: the Detroit Sleeper Cell case. This American Life devoted an entire episode to Richard G. Convertino’s prosecution of four men of Middle Eastern descent for an alleged plot to attack Disneyland. Leaps of logic and imagining the worst appear to have combined with a zealous effort at prosecution. The case unraveled not through any search for justice but, it seems, through internal Justice Department politics, which raises huge questions about public accountability. Reporter Petra Bartosiewicz’s The Best Terrorists We Could Find should make an interesting read when it comes out next year.

Meanwhile the FBI is proposing to let race and travel schedules tell them who is a terrorist, according to the Center for Constitutional Rights:

The proposed guidelines would give the domestic intelligence agency authority to investigate American citizens and residents without any evidence of criminal acts, relying instead on a “terrorist profile” that would include race, ethnicity and “travel to regions of the world known for terrorist activity” to spark an initial “national security investigation.”

These proposed guidelines would also allow, according to the reports, for FBI agents to ask “open-ended questions” about the activities of Muslim or Arab Americans, or investigate them if their jobs and backgrounds match other criteria considered to be “suspect.” Once this initial investigation stage was completed, a full investigation could be opened – allowing for wiretapping of phone calls or deep investigation of personal data – all guided merely by a “terrorist profile” that openly relies on race, ethnicity, religion and community connections.

Do something about it by pressuring Attorney General Michael Mukasey.

Also, the “terrorist watch list” is now over a million names. More on who’s on it from the ACLU.

Your rights in New York

Having people visit from out of town is a reminder of what inconveniences & impositions we get used to living in New York. Aside from the random convergence of police cars in downtown & midtown to practice, definitely one of the most unnerving is the randomly asserted right to search your “large backpacks and packages” on the subway. Somehow, the knowledge that this search is not exactly mandatory, courtesy of the Fourth Amendment (that whole “search and seizure” thing), and therefore can be refused, makes me feel better. So as a public service to New Yorkers, here’s what you can do if you’re not interested in having your bag searched:

If you choose to walk through a random search area and are stopped, you may refuse to be searched. In fact, Police Commissioner Raymond Kelly has said that you are free to “turn around and leave” any subway station where police are conducting random searches. Calmly and clearly say, “Officer, I do not consent to any searches. I’m going to exit the station.” Then immediately exit the station — and do not return through the same entrance.

Thanks to Flex Your Rights’ Citizen’s Guide to Refusing New York Subway Searches for this information, and for this handy (if somewhat alarming) flyer.

More on how to expand your First Amendment rights in New York City soon…