Undergraduates Ramp Up Harvard AI Safety Team Amid Concerns Over Increasingly Powerful AI Models

Harvard students formed the Harvard AI Safety Team last spring to reduce risks associated with AI. By Julian J. Giordano
By Makanaka Nyandoro, Crimson Staff Writer

Undergraduates in the Harvard AI Safety Team have ramped up activity amid heightened public concern over artificial intelligence’s rapid development and lack of regulation.

The group, which was founded last spring and has grown to 35 Harvard and MIT students with a fellowship program of roughly 50 other students, conducts research and hosts reading groups each semester. HAIST projects — which center around reducing risks from AI — have included new methods for detecting anomalous behavior by AI as well as a project locating the components within AI systems that are responsible for specific behavior.

HAIST Director and Founder Alexander L. Davies ’23 wrote in an email that he decided to start the group after he began conducting machine learning research, where he confronted arguments that “increasingly powerful AI systems might pose serious risks to humanity.”

In a joint statement, Davies and HAIST Deputy Director Max L. Nadeau ’23 wrote about the need for more research into AI safety and their concern that as AI systems become more powerful, they may “potentially learn to deceive and manipulate human evaluators.”

“Compared to the rapid progress in making AI systems more powerful and economically relevant, we’ve had less success in understanding these AIs and increasing their reliability,” Davies and Nadeau wrote. “Even the most advanced AI systems confidently assert falsehoods, respond maliciously to users, and write insecure code that evades the notice of the humans they are meant to assist.”

Nadeau wrote that there are external pressures influencing the development of potentially dangerous AI models.

“Economic or military competition between organizations building powerful AI may create a race to the bottom, in which organizations deploy dangerous models early to stay ahead of their competition, despite the fact that many of the leaders of top AI companies acknowledge extremely high levels of risk from future systems,” he wrote.

HAIST member and MIT graduate student Stephen M. Casper ’21 said much of HAIST’s work aims to ensure research and governance are not “so badly outpaced by technology,” emphasizing that technological progress can “put lots of power into the hands of very few people.”

“There are a million good things and a million bad things that AI systems could do,” Casper said. “I think there’s the mentality about AI being very impactful and a mentality about risk and reducing it.”

In addition to pursuing research on AI topics, HAIST offers an introductory reading group with a curriculum created in collaboration with Richard Ngo, a researcher at OpenAI — the company behind ChatGPT, a viral chatbot that generates responses to a user’s prompts.

“HAIST is an attempt to create the intellectual community around AI safety research that I wish I’d had when I was first getting interested in it,” Davies wrote.

“We believe that AI safety is an interesting technical problem, and may also be one of the most important problems of our time,” he added.

—Staff writer Makanaka Nyandoro can be reached at
