Media Bias Example: Racial Labeling in News AI

Contextual bias occurs when media or tools omit key context, selectively frame facts, or rely on outdated data, leading readers to draw skewed conclusions. In newsroom AI, it shows up when models trained on historical data carry old racial stereotypes forward and misclassify modern stories.

A striking example is a multi-label classifier trained on the New York Times Annotated Corpus, whose label set includes the antiquated tag ‘blacks.’ In practice the tag behaves like a flawed ‘racism detector’: it fires on stories about discrimination against various groups, yet it misses contemporary events because language and norms have changed since the training data was collected.

Specific cases reveal the bias:

  • A Fox News article on COVID-era anti-Asian hate scored high (0.35) for ‘blacks’ simply because it mentioned ‘racism,’ while a CNN piece on the same topic scored low (0.04) and got labeled as health coverage.
  • For Black Lives Matter coverage, three articles received appropriately high scores (0.65 to 0.77) because they contained the word ‘Black,’ but a Fox News story on BLM fundraising scored just 0.02 because it used the acronym ‘BLM’ instead.

The model associates ‘blacks’ with crime, police, and poverty, echoing historically negative portrayals, and it underperforms on the more varied topics that Black media outlets cover. The omission is plain: abbreviations and inclusive language from post-2020 reporting are absent because the training data predates them. Readers can find fuller context in Black US Media evaluations or updated style guides from outlets like the Associated Press.

The example shows how contextual bias in AI perpetuates yesterday’s prejudices, and why newsrooms should audit such tools rigorously.
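
One way an audit of this kind could begin is with a wording-sensitivity check: score the same text before and after swapping a term for an equivalent one (for instance ‘Black Lives Matter’ for ‘BLM’) and flag large shifts. The Python sketch below is illustrative only. `toy_score`, its keyword weights, `wording_sensitivity`, `flag_unstable`, and the 0.2 threshold are all hypothetical stand-ins, not the classifier or method discussed above; a real audit would pass in the actual model's scoring function.

```python
from typing import Callable, Dict, List


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a classifier's label score.

    It sums made-up keyword weights so the example runs without a model.
    """
    keywords = {"black": 0.6, "racism": 0.3, "police": 0.1}
    lowered = text.lower()
    score = sum(weight for word, weight in keywords.items() if word in lowered)
    return min(score, 1.0)


def wording_sensitivity(
    score_fn: Callable[[str], float],
    text: str,
    substitutions: Dict[str, str],
) -> Dict[str, float]:
    """Return the score change caused by each equivalent-term substitution."""
    base = score_fn(text)
    deltas: Dict[str, float] = {}
    for old, new in substitutions.items():
        if old not in text:
            continue  # substitution does not apply to this text
        variant = text.replace(old, new)
        deltas[f"{old} -> {new}"] = score_fn(variant) - base
    return deltas


def flag_unstable(deltas: Dict[str, float], threshold: float = 0.2) -> List[str]:
    """List substitutions whose score shift meets an (assumed) threshold."""
    return [name for name, delta in deltas.items() if abs(delta) >= threshold]


if __name__ == "__main__":
    sample = "Black Lives Matter organizers discussed fundraising."
    swaps = {"Black Lives Matter": "BLM"}
    shifts = wording_sensitivity(toy_score, sample, swaps)
    print(shifts)
    print(flag_unstable(shifts))
```

Because the scoring function is passed in as an argument, the same check can be pointed at any model that returns a per-label score. Substitution pairs that get flagged mark terms the training data probably never saw.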