Finalist category: AbilityNet Accessibility Award
AbilityNet Accessibility Award, Finalist, 2018
Using artificial intelligence and machine learning services to power accessible technology.
Facebook is using artificial intelligence and machine learning services to power accessible technology for people with low vision and vision loss.
In 2016, Facebook launched Automatic Alt Text (AAT), a feature that uses object recognition to describe photos to people who are blind or have low vision and who use screen readers. In December 2017, it launched a Face Recognition tool that tells people who use screen readers which friends appear in photos in their News Feed, even if those friends aren’t tagged.
How AAT Was Developed
By applying AI, Facebook was able to improve access to content at a global scale. AAT can currently detect more than 100 concepts, including the number of people in a photo, whether people are smiling, and physical objects such as “car”, “tree”, and “mountain”.
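To illustrate how detected concepts might become a sentence that a screen reader can announce, here is a minimal sketch. It is not Facebook’s implementation: the `Detection` type, the `build_alt_text` function, the 0.8 confidence threshold, and the output phrasing are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    concept: str       # hypothetical concept label, e.g. "tree"
    confidence: float  # hypothetical model score in [0, 1]


def build_alt_text(detections, people_count=0, smiling=False, threshold=0.8):
    """Compose a short description from concepts that clear the threshold.

    The threshold and the wording are illustrative assumptions.
    """
    parts = []
    if people_count == 1:
        parts.append("1 person")
    elif people_count > 1:
        parts.append(f"{people_count} people")
    if smiling and people_count > 0:
        parts.append("smiling")

    # Keep only confident detections, most confident first.
    confident = sorted(
        (d for d in detections if d.confidence >= threshold),
        key=lambda d: d.confidence,
        reverse=True,
    )
    parts.extend(d.concept for d in confident)

    if not parts:
        return "Image"  # nothing confident enough to describe
    return "Image may contain: " + ", ".join(parts)
```

In this sketch, low-confidence concepts are dropped rather than announced, on the assumption that a wrong description is worse for the listener than a shorter one.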
Detecting these concepts was an enormous technical challenge, which Facebook overcame by leveraging its Computer Vision Engine. The engine is based on state-of-the-art deep convolutional networks and provides scalability and reliability at web scale. The models for the concepts exposed through AAT were learned from millions of human-labeled examples and were evaluated by Facebook’s engineering group to ensure an adequate balance of precision and recall. The development team also ran multiple rounds of user research and updated AAT based on feedback.
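The balance between precision and recall mentioned above can be made concrete with a small sketch. Precision is the share of predicted concepts that are correct, and recall is the share of true concepts that are found. The function names, scores, and thresholds below are hypothetical and only show how moving a confidence threshold trades one measure against the other.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for one concept.

    predicted, actual: parallel lists of booleans, one entry per photo.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))        # true positives
    fp = sum(p and not a for p, a in zip(predicted, actual))    # false positives
    fn = sum(a and not p for p, a in zip(predicted, actual))    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def precision_recall_at(scores, labels, threshold):
    """Predict the concept wherever the score meets the threshold."""
    predicted = [s >= threshold for s in scores]
    return precision_recall(predicted, labels)


if __name__ == "__main__":
    # Made-up scores and human labels for five photos.
    scores = [0.95, 0.9, 0.7, 0.6, 0.3]
    labels = [True, True, False, True, False]
    for t in (0.8, 0.5):
        p, r = precision_recall_at(scores, labels, t)
        print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

With these made-up numbers, the stricter threshold yields higher precision and lower recall, while the looser one finds every true instance at the cost of one false positive.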
Leveraging Face Recognition for Accessibility
Facebook’s Face Recognition technology analyses the pixels in photos and videos, such as a user’s profile picture and photos and videos that the user has been tagged in, to calculate a unique number called a template. When photos and videos are uploaded to Facebook’s systems, those images are compared to the template to find matches. Using this technology, people who use screen readers can now learn which friends appear in photos in their News Feed, even if those friends aren’t tagged.
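The comparison step described above can be sketched as follows. The source does not say how templates are represented or compared, so this example assumes each template is a list of floats and uses cosine similarity with an arbitrary threshold of 0.9. The function names and data shapes are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_friends(face_vector, friend_templates, threshold=0.9):
    """Return names of friends whose stored template is close to the face.

    friend_templates: dict mapping a friend's name to a template vector.
    The vector form and the threshold are assumptions for illustration.
    """
    scored = [
        (cosine_similarity(face_vector, template), name)
        for name, template in friend_templates.items()
    ]
    # Best matches first; keep only those that clear the threshold.
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

For example, with templates `{"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}`, a face vector of `[0.9, 0.1, 0.0]` would match only Alice under these assumptions.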
The Challenge, and Facebook’s Solutions
Every day, people share over a billion photos on Facebook. The goal with the AAT and Face Recognition tools was to greatly improve the experience that people with vision loss have with this commonly shared media, including for the approximately 350,000 people registered as blind or partially sighted in the UK.
Through research with the vision loss community, Facebook knew that users of screen readers engage with photo content, but also that they desired more context for a photo’s content. That desire is not specific to Facebook, but broadly applicable to the interactions that users of screen readers have with digital experiences.
The traditional mechanism for describing photos to people with vision loss is alt text. Traditionally, alt text requires the content creator to supply a secondary description for each photo, which is time-consuming and something few consumers do.
To address this challenge, automatic tools powered by AI were created to describe photos on Facebook. Automatic solutions dramatically increase the number of photos that have supplemental text descriptions. About 75% of photos shared on Facebook now have at least one object identified by AAT.
As noted above, feedback from the community was integral to the development of AAT. To refine the AAT experience, Facebook ran multiple rounds of user research, including: (1) one-on-one interviews with users of screen readers to test early prototypes; and (2) a two-week experiment on Facebook for iOS with a control condition (no AAT) and an experimental condition (AAT), followed by surveys of both groups. Based on survey feedback, Facebook updated AAT to understand more about what people are doing in photos.
Facebook’s mission is to bring the world closer together, and accessibility is a core part of that mission. Facebook’s goal is not to simply include people with disabilities on its platform, but to create experiences that change what people believe is possible in the space of technology and disability.
As Facebook continues to improve its object and face recognition services, AAT and Face Recognition will continue to provide more descriptive narratives for visual content. Further, while Facebook built the AAT and Face Recognition tools to improve Facebook’s services and build a more inclusive platform for people with vision loss, Facebook believes that this deployed product experience demonstrates the importance of AI for enabling better access to content across the web for persons with disabilities. Facebook strongly believes that AI is the future of improving additional interaction experiences at scale, whether they are visual in nature or otherwise. As AI systems get better at understanding images, video, audio, and other media, Facebook believes that more novel and robust innovations in accessibility will follow.