Google To Scan Library Books

Books from University collections will be searchable online in new initiative

Google is partnering with Harvard to make 40,000 of the University’s library books searchable online in a pilot project that could lead to the digitization of all 15 million books in Harvard’s collections.

The initiative, set to be announced today, will allow anyone on the internet to browse the uploaded works through Google Print, a Google project. The University of Michigan, Oxford University, Stanford University and the New York Public Library are participating in similar programs, which will also be announced today.

Harvard will make the books available, and Google will scan them and bear all costs, according to Pforzheimer University Professor Sid Verba ’53, who is director of the Harvard University Library (HUL). Verba said that making Harvard’s books available digitally has long been a priority but has been infeasible until now.

“The collaboration with Google allows us to do something that we could not have possibly afforded on our own,” Verba said. Google Product Manager Adam Smith declined to comment on the cost of the project, citing company practice of not disclosing the terms of business deals.

The 40,000 books will be selected mostly at random from the 5 million books at the Harvard Depository, with the most delicate works exempted from the process, according to Peter Kosewski, director of publications and communications for HUL. He said Google will keep the books at the depository and digitize them on site to minimize disruption to researchers.

The project, which is estimated to take six months, has been in the planning stages for more than a year and has received the approval of the University’s top governing body, the Harvard Corporation, Verba said.

Verba said the new partnership with Google marks an unprecedented step in making Harvard’s books accessible to the public. Harvard’s libraries have made a smaller number of books available digitally through their Open Collections project.

Verba explained that researchers will not “really get the book out of their computer.”

“What they’re going to get is information about what’s in books and information on where you can find the books,” Verba said.

Books in the public domain will be fully available online so researchers can use Google Print in place of a trip to the library, according to Smith. Only much smaller excerpts of copyrighted works will be displayed, he said.

While Google may upload some of Harvard’s copyrighted works, they will not be displayed for now, Kosewski said.

Verba said the collaboration was sparked when Google approached Harvard a few years ago. He originally expressed concerns about keeping the collections available to researchers and avoiding damage to the books. But he said that when representatives from Google came back to talk a year later, they had addressed these concerns with an effective scanning technology.

“It is very specifically designed to be non-destructive,” Smith said. “In working with these libraries and their collections, we need to be extremely careful.”

“One of the reasons we are doing this pilot study with 40,000 books is to see if that’s true,” Verba said. “We really don’t want to plunge into a really mega project without that assurance.”

Kosewski said the project would be a “learning experience.”

“We’ll figure out where the issues are, and there are a lot of issues from a lot of angles,” he said.

Verba said that he would have liked to discuss the idea with faculty and students before beginning the project but was precluded from doing so by an agreement with Google.

“This made me, to be honest, somewhat uneasy, because I wouldn’t like to do something major to the library without revealing it” to the Harvard community, he said.

He added that Harvard is not obligated to expand on the pilot program.

University President Lawrence H. Summers said yesterday he was excited because the project would make Harvard’s collections easier to search and more widely accessible.

“I think that it offers the opportunity to make our library an even stronger school resource,” he said.

—Staff writer Stephen M. Marks can be reached at marks@fas.harvard.edu.