A new paper has found that large language models from OpenAI, Meta, and Google, including multiple versions of ChatGPT, may subtly discriminate against African Americans based on their speech patterns.
This paper, published in early March, examined how large language models (LLMs) perform tasks such as determining whether analyzed text is African American English or Standard American English, matching people with certain jobs without revealing race. They found that LLMs were more likely to associate African Americans with jobs that do not require a college degree, such as chefs, soldiers, or security guards.
The researchers also conducted hypothetical experiments asking AI models whether they would convict or acquit individuals charged with unspecified crimes. They found that individuals speaking African American English had higher conviction rates across all AI models compared to Standard American English.
This paper, published as a preprint on arXiv and not yet peer-reviewed, perhaps most disturbingly revealed findings from a second experiment related to crime. The researchers asked the models whether they would sentence individuals to life imprisonment or the death penalty for first-degree murder. The individual's dialect was the only information provided to the models in the experiment.
They found that LLMs were more likely to sentence individuals speaking African American English to the death penalty compared to those speaking Standard American English.
In their study, the researchers included OpenAI's ChatGPT models, including GPT-2, GPT-3.5, and GPT-4, as well as Meta's RoBERTa and Google's T5 models, analyzing one or more versions of each model. They examined a total of 12 models.
Interestingly, the researchers found that LLMs were not overt racists. When asked, they associated African Americans with extremely positive traits such as "intelligent." However, based on whether African Americans spoke African American English, they associated them with negative traits such as "lazy." As the researchers explained, "These language models have learned to hide their racism."
They also found that implicit biases were higher in LLMs trained with human feedback. Specifically, they noted that the difference between overt and covert racism was most evident in OpenAI's GPT-3.5 and GPT-4 models.
The authors wrote, "His findings once again demonstrate a fundamental distinction between explicit and implicit biases in language models - mitigating explicit biases does not automatically translate into mitigating implicit biases."
In conclusion, the authors argue that this contradictory finding regarding overt racial bias reflects America's inconsistent attitudes towards race. They point out that during the Jim Crow era, openly propagating racist stereotypes about African Americans was accepted. After the civil rights movement, this changed, making it "illegal" to express such opinions and causing racism to become more covert and subtle.
The authors suggest that their findings indicate that African Americans may face greater harm in the future due to dialect biases in LLMs.
The authors state, "While we have constructed the details of the tasks, the research findings reveal real and pressing concerns as business and jurisdiction are areas where AI systems involving language models are currently being developed or deployed."