Ethics in action: removing gender labels from Cloud’s Vision API

Tracy Frey is always learning. The daughter of an infectious disease physician at the forefront of pediatric cancer and AIDS research, and an early childhood development expert who worked as a child advocate on Capitol Hill, she learned from an early age that solving complex problems is often the culmination of many small steps taken with integrity. Her work as Director of Product Strategy & Operations for Google Cloud AI and Industry Solutions is no exception. 


A few years ago, Tracy’s team was one of the first groups at Google working to develop a set of principles that would help them review AI products from an ethical and responsibility standpoint. Today, she works closely with teams across Google to put what eventually became Google’s AI Principles into practice for Google Cloud. 

An impactful decision 


“There is no checklist for ethics,” says Tracy. “It doesn’t exist.” Instead, each decision is evaluated by a standing committee, whose members come from a variety of backgrounds, functions, and levels at Google, along with experts relevant to the product under review. 


One such decision, announced in February of this year, was to remove gender labels from Cloud’s Vision API, a machine learning service that developers can use to predict what objects are in an image. Previously, the Cloud Vision API would predict whether an image presented to it contained a man or a woman, based on all the accordingly labeled images of men and women in its dataset. The need to rethink the accuracy of those labels arose when a Googler put a photo of themselves into the API and was misgendered.
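To make the change concrete, here is a minimal sketch of the kind of label-detection request developers send to the Vision API. The field names follow the public `images:annotate` REST endpoint; the image bytes are a placeholder, and after the decision described above the response contains non-gendered labels such as “person” rather than “man” or “woman.”

```python
import base64

def build_label_request(image_bytes: bytes, max_results: int = 10) -> dict:
    """Build the JSON body for a Cloud Vision `images:annotate` request
    asking only for LABEL_DETECTION (labels for objects and scenes)."""
    return {
        "requests": [
            {
                # Images are sent inline as base64-encoded content.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }

# The body would be POSTed (with credentials) to:
#   https://vision.googleapis.com/v1/images:annotate
body = build_label_request(b"<raw JPEG bytes here>")
```

This is a sketch of the request shape only; a real call requires an authenticated client, for example Google’s `google-cloud-vision` library.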


According to Tracy, this is a perfect example of how technology should evolve alongside cultural understanding: “Our overall understanding of gender — including the difference between sex assigned at birth and gender identity and/or gender expression, as well as an understanding of gender fluidity — has grown. We have a clearer recognition of the harm of an error like this, which we might have mistakenly believed to be innocuous 10 years ago. Or worse, that we as a society didn’t even think to evaluate 10 years ago. So, going forward, we made a different decision than we had previously.” 


In reaching this conclusion, the committee had to weigh both positive and negative outcomes. “There are many applications in which gender labels are helpful,” says Tracy. One example is the Geena Davis Inclusion Quotient, which used gender labeling to identify gender disparities in the film industry. Of course, the implications of misgendering are also worrisome: “Imagine if TSA used this technology to incorrectly verify people’s identification at airport security.” 


The team also had to consider potential outcomes outside their frame of reference. “The systemic bias that would be perpetuated by this type of automated misgendering could be harmful in so many ways,” says Tracy. Ultimately, the committee agreed that the decision to remove the gender labels aligned with Google’s AI Principle #2: avoid creating or reinforcing unfair bias.


Difficult discussions worth having 


“The AI Principles are high level for a reason,” says Tracy. “Google can and should make different decisions about the same technology depending on how it is going to be used.” The work of aligning with the AI Principles varies depending on the project, and requires robust, complex, and sometimes difficult discussions. 

For Tracy, the only thing certain about this work is that it will always be evolving. “Getting comfortable with that ambiguity is really challenging, but it’s also important,” says Tracy, who is already noticing the effects of the review process. “More and more, people are coming to us who never would have thought about anything like this before, asking us to help them identify bias, accountability, or safety issues that they haven’t considered. We are changing the way people think about AI. And that is incredibly exciting.”


Follow Tracy on Twitter @TracyFrey