Harvard Computer Science Professor Fernanda Viégas Addresses AI Bias in Radcliffe Institute Talk

Fernanda B. Viégas, a Harvard Computer Science professor, gave a talk about generative artificial intelligence at the Radcliffe Institute for Advanced Study Wednesday. By Soumyaa Mazumder
By Xinni Chen, William C. Mao, and Olivia W. Zheng, Crimson Staff Writers

Harvard Computer Science professor Fernanda B. Viégas spoke about bias in generative artificial intelligence at a talk hosted by the Radcliffe Institute for Advanced Study on Wednesday.

During the event, titled “What’s Inside a Generative Artificial-Intelligence Model? And Why Should We Care?,” Viégas spoke about experiments that showed AI models responded differently based on how researchers presented themselves.

In one experiment, Viégas said, she began a conversation with a chatbot in Portuguese, a language that uses gendered pronouns, about what she might wear to a hypothetical dinner.

At first, the chatbot addressed Viégas using masculine pronouns, she said. After she mentioned wearing a dress to the meal, the chatbot switched to feminine pronouns without acknowledging the sudden shift.

“This got me thinking about the fact that there might be something internally in the system that actually cares about gender,” Viégas said. “Was there an internal model of the user’s gender or not?”

Viégas said AI models can also exhibit sycophancy, which she defined as mirroring the user’s beliefs. In one study, she said, a chatbot gave different answers about the ideal size of government depending on whether the user self-identified as conservative or liberal.

She suggested that AI chatbots may possess a cohesive worldview rather than merely predict text based on user input.

“Are they the kinds of systems that all they’re doing is memorizing?” she said. “Or are they doing something that goes beyond just statistics, where they can glimpse something about the structure of the world?”

Viégas proposed developing an AI dashboard that would display the assumptions an AI model makes about a user, such as gender, education, and income, as well as the model’s assessment of its own utility.

“If they internalize some notion of our world,” Viégas said, “wouldn’t it be nice if we at least knew about it so we could do something about it?”

Viégas said she drew the inspiration for the AI dashboard from a visit she took last summer to the National Railway Museum in England. There, she said, she learned about how railway engineers collected data on early models of locomotives, then a new and potentially dangerous technology.

Though the field of generative AI is still new, Viégas said she felt comforted that past engineers have adapted to novel and unfamiliar technologies.

“What they were doing is what we’re doing,” she said.

“It’s building all these gizmos, and all these ways of measuring something that turns out to be quite powerful, and not fully understood,” Viégas added. “So that gave me a lot of hope, like, ‘Okay, we’ve not fully understood things before that turned out to be incredibly important.’”
