Wearing a mask protects you from more than just the coronavirus. A new federal study of 89 facial recognition algorithms has found that every single one of them performed worse – many over 30 times worse – when analyzing images of masked faces.
At a time when governments worldwide are turning to surveillance technologies to enforce COVID-19 quarantines, and mask-wearing is increasingly common, the results of this study have major implications for today’s face recognition developers.
The Face Recognition Vendor Test (FRVT) at the National Institute of Standards and Technology (NIST) has for years been the preeminent source of third-party assessments of face analysis algorithms. A nonpartisan federal agency, NIST has published dozens of reports since 2017, auditing over 200 facial analysis algorithms developed by private companies.
In December 2019, the team released a landmark report examining demographic disparities in facial recognition systems – the most comprehensive such study to date. NIST found that the vast majority of algorithms performed best when recognizing the faces of middle-aged white men, and performed markedly worse when analyzing faces of women, darker-skinned individuals, children, and older adults.
In their latest study, NIST researchers digitally superimposed masks onto images of 1 million faces from existing datasets, taking care to vary the shape and color of masks as well as the degree of nose coverage. The study matched high-quality images of individuals’ unmasked faces to low-quality images of the same individuals’ masked faces. This design mimics the likely scenario in which a masked individual might attempt to authenticate against a formal database photo of themselves without a mask.
This time around, the NIST team chose to look only at facial verification (1:1, one-to-one) tasks, which attempt to determine if two photos are of the same person. A proposed future study will expand the analysis to evaluate the accuracy of facial identification (1:N, one-to-many) algorithms on masked faces: the more complex task of comparing one photo against a large database of photos.
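The difference between the two tasks can be sketched in code. This is an illustrative toy, not NIST’s methodology: real systems use learned face embeddings and carefully tuned thresholds, whereas here the embeddings are hand-made vectors, the similarity measure (cosine similarity) and the 0.8 threshold are arbitrary assumptions, and the names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two face-embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(probe, reference, threshold=0.8):
    """1:1 verification: do these two images show the same person?"""
    return cosine_similarity(probe, reference) >= threshold

def identify(probe, gallery, threshold=0.8):
    """1:N identification: which enrolled identity, if any, best matches
    the probe? Must compare against every entry in the database."""
    best_id, best_score = None, threshold
    for identity, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score >= best_score:
            best_id, best_score = identity, score
    return best_id

# Verification checks one claimed identity; identification searches them all.
probe = [1.0, 0.0]
gallery = {"person_a": [1.0, 0.1], "person_b": [0.0, 1.0]}
print(verify(probe, gallery["person_a"]))   # True
print(identify(probe, gallery))             # person_a
```

The 1:N task is harder in part because every additional gallery entry is another chance for a false match, which is one reason NIST split the two analyses.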
Results show that all – that’s 100 percent – of the tested algorithms performed worse when attempting to verify the identity of a masked face as compared to a non-masked face. For the most accurate algorithms, the false non-match rate (how often an algorithm fails to match two images of the same person) increased by about 16 times, from ~0.3 percent to ~5 percent. However, in some algorithms the masked false non-match rate rose by over 75 percent. And in multiple cases, mask-wearing completely interrupted algorithms’ ability to detect a face in the first place, much less analyze or identify it.
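The false non-match rate is simple to compute: it is just the fraction of same-person comparisons that an algorithm fails to match. A minimal sketch, using the article’s approximate figures (the list of pass/fail outcomes below is invented for illustration):

```python
def false_non_match_rate(genuine_pair_results):
    """Fraction of genuine (same-person) comparisons that failed to match."""
    failures = sum(1 for matched in genuine_pair_results if not matched)
    return failures / len(genuine_pair_results)

# Hypothetical: 3 failures out of 1,000 genuine unmasked comparisons
unmasked_fnmr = false_non_match_rate([True] * 997 + [False] * 3)   # 0.003, i.e. ~0.3%

# The article's approximate numbers for the most accurate algorithms
masked_fnmr = 0.05                                                 # ~5%
print(masked_fnmr / unmasked_fnmr)                                 # about 16.7x worse
```

This is why a jump from 0.3 percent to 5 percent, which sounds small in absolute terms, amounts to a roughly 16-fold increase in error.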
Furthermore, the type of mask matters: masks which cover the entire lower portion of the face (like many homemade masks) flummoxed the algorithms more than circular masks (like N95s), and masks which covered more of the nose led to worse algorithmic performance than masks worn below the nose.
The researchers took care to point out that their study didn’t explore some potentially important factors in algorithms’ success, such as the effects of masks with patterns on them, or differences between demographics. It’s possible that wearing masks with patterns, logos, or text might further decrease accuracy. And, given the findings from NIST’s demographic report, there’s good cause to worry that any problems caused by mask-wearing would be aggravated in dark-skinned, feminine, young, or elderly faces.
In the future, another study promised by the NIST team will assess the performance of so-called “face mask capable” algorithms, which were developed after the onset of the pandemic and claim to be better suited to analyzing masked faces.
Ultimately, NIST’s findings fly in the face of claims that face recognition could be a useful public health tool to enforce quarantines and fortify contact-tracing during COVID-19 outbreaks. Clearly, even today’s state-of-the-art face analysis algorithms are ill-suited to recognizing masked individuals.
And even if new “face mask capable” algorithms somehow find a way to accurately identify people by their forehead wrinkles alone, face recognition technology is dangerous when it doesn’t work – and when it does.
The government has no business using dystopian technologies to track its citizens, even when faced with a global pandemic. Flawed, privacy-threatening surveillance methods are not the answer to our current public health crisis. More governments should follow the example of the 14 communities in Massachusetts, California, and Maine that have already voted to press pause on face recognition by banning its use in local government.
Ultimately, we need law reform to protect all people from discriminatory, dangerous technology like face surveillance. For now, glitchy surveillance technology gives us one more good reason to wear a mask.
This blog post was written by ACLU of Massachusetts Staff Technologist Lauren Chambers.