Siri, Alexa and other programs sometimes have trouble with the accents and speech patterns of people from many underrepresented groups.
“Clow-dia,” I say once. Twice. A third time. Defeated, I say the Americanized version of my name: “Claw-dee-ah.” Finally, Siri recognizes it.
Having to adapt our way of speaking to interact with speech recognition technologies is a familiar experience for people whose first language is not English or who do not have conventionally American-sounding names. I have even stopped using Siri because of it.
The spread of speech recognition technology over the last few decades has exposed a problem ingrained in these systems: racial bias. One recent study, published in PNAS, showed that speech recognition programs are biased against Black speakers. On average, all five programs tested, from leading technology companies including Apple and Microsoft, showed significant racial disparities: they were twice as likely to incorrectly transcribe audio from Black speakers as from white speakers.
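For concreteness, disparities like these are typically quantified with word error rate (WER): the word-level edit distance between what a speaker said and what the program transcribed, divided by the number of words the speaker said. The snippet below is a minimal sketch of that calculation in Python; the transcripts are made-up examples, not the study's data or code.

```python
# Minimal sketch of word error rate (WER), the standard metric behind
# transcription-accuracy comparisons like the PNAS study's.
# The example transcripts below are hypothetical.

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical example: a name the recognizer fails to parse.
print(wer("my name is claudia", "my name is claw dee ah"))  # 0.75
```

A WER twice as high for one group of speakers, as the study reports, means roughly twice as many garbled words for every sentence spoken.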
In everyday conversations with other people, we might choose to code-switch, alternating between languages, accents or ways of speaking depending on our audience. But with automated speech recognition programs, there is no code-switching: either you assimilate, or you are not understood. This effectively censors voices that are not part of the “standard” languages or accents used to create these technologies.