Turning the Page

While Harvard's library digitization efforts have presented new opportunities for scholars to access and preserve sources for posterity, they have also brought their own set of challenges.
By Archie J.W. Hall and Lucy Wang

Librarians may scan over 7,000 pages each day to digitize the physical resources in Harvard's archives.

Hidden amid the labyrinthine stacks in the basement of Widener Library, more than a dozen machines slowly transform books and artifacts into digital files available for download. Librarians stand next to the hulking machines for hours, painstakingly scanning more than 7,000 pages every day. They take care not to damage the delicate binding of the original books.

This process—digitizing library collections—occurs across the Harvard Library System, which includes 79 libraries. The end result: Everything from rare books to pamphlet and map collections to medieval manuscripts, business records, architectural drawings, and photographs can now be found on various websites of the Harvard library, and not just in the stacks.

Librarians may scan over 7,000 pages each day to digitize the physical resources in Harvard's archives. By Archie J.W. Hall

Since 1998, when the University started the Library Digital Initiative, Harvard Library has gradually made many of its materials available online, and libraries across the University have opened facilities to digitize materials. Last year alone, Widener digitized more than 1.8 million artifacts, 200,000 more than in 2015 and 600,000 more than in 2014, according to Franziska Frey, Harvard Library’s Chief of Staff.

While these efforts have presented new opportunities for scholars to access and preserve sources for posterity, they have also brought their own set of challenges. Even as more people turn to the digital realm for their research, librarians and researchers say the physical archives remain invaluable. But some scholars must seek funding and support outside of Harvard to digitize many sources, a process some say can be complicated.

“If there are stakeholders who want all of the stuff online, Harvard is not really inclined to put its own money to that enterprise. What you want to do is to find money outside of Harvard, and then you’re good to go,” said Stephen Chapman, manager of digital strategy for Collections at the Harvard Law School library.


With millions of books, artifacts, and sources in Harvard’s system, librarians said they must prioritize some materials over others.

“The volume of work that there is to do exceeds our capacity to do it all at this point. There’s a component of selecting what gets digitized and what doesn’t. That’s a challenge,” said Thomas A. Hyry, lead librarian of Houghton Library.

Donations and grants from outside organizations or individuals can dictate priorities for the digitization of certain materials. If a researcher or scholar already has funding for their digitization project, the probability that the library will digitize the requested materials is much higher.

At Harvard Law School, librarians are digitizing 40 million pages of court decisions after Ravel Law, a startup that posts legal documents online, provided the funding. The project, called the Caselaw Access Project, will make United States case law available on Ravel’s website.

“When people come to us with their passion and their money, they’re much more likely to get their stuff digitized than if someone comes to us with their passion and no money,” Chapman said.

Similarly, outside funding allowed researchers to digitize 5,136 items over the course of several years for the Colonial North American Project, one of the Harvard Library’s largest ongoing projects. Eventually, librarians hope to digitize all known archival and manuscript material from 17th- and 18th-century North America in Harvard Library. The Arcadia Fund, a British organization that “supports charities and scholarly institutions that preserve cultural heritage and the environment,” has provided funding for the project.

At the Schlesinger Library at the Radcliffe Institute for Advanced Study, the National Historical Publications and Records Commission gave $150,000 to digitize the Blackwell family collection.


Once a collection of materials is digitized, professors and students say they encounter sources they might never have been able to access otherwise.

University Professor Ann M. Blair said that the 65 sources she needed for one research project were so spread out across the world that she might not have been able to access them had they not been digitized and put online.

“I just finished this project, which would have been completely unthinkable in the age before digitization,” she said. “But because of digitization, I was able to find all of the books except one online, which I was still able to find online on film.”

Beyond research projects for professors, digitized sources can also help students in their studies. For more than 10 years, University Professor Laurel Ulrich has used the digital materials of the Artemas Ward house, a historical house from the 18th century, in her class, History 84c: “This Old House: A Social and Environmental History.”

Nathaniel R. F. Bernstein ’17, a student in the course, said the meticulous photos of every item in the house and the ability to search the house’s family records have enabled his research for the course.

“For a lot of archival sources you don’t need to go into the library, you have these great high-quality prints and scans that you look at online,” Bernstein said. “This makes a huge difference in terms of your ability to look at a lot of sources for your historical research in your own time.”

In addition to making it easier to access the sources, some researchers said digitizing sources allows them to analyze and discuss the materials in more sophisticated ways.

Bonnie Burns, the Head of Geospatial Resources at the Harvard Map Collection, said she can perform quantitative analysis of digitized maps.

“I see all of this as a treasure trove of data. We scan these, we go through the process of georeferencing, where you assign real-world coordinates, latitude and longitude to the image,” she said. “Suddenly this piece of paper becomes data.”

For example, Burns pointed to maps of railroad networks in the 19th century. “You can trace out those railroads and do quantitative analysis on them, you can ask what happened to the population of an area after the railroad went through one spot but not another,” Burns said.

Making a digital copy of a source also preserves materials that can sometimes be physically sensitive.

“As a means of preservation, if there is something that is very fragile and falling apart, then we can capture it so that if it deteriorates further, we at least know what it looked like in 2016,” Hyry said.

But in order for digitizing resources to matter, librarians said students and professors must be able to find them online.

“We want to know the impact of our work and what we can do to make it easier for students to know that these materials are there,” Frey said.


Though advances in digitization have aided researchers in their projects, some say digitized copies cannot replace consulting a physical source. Both Ulrich and Bernstein said that accessing the original material remains important despite the prominence of digitized sources.

“I think, to be a serious researcher, and this is probably not always the case for undergraduates, if you’re at Harvard, you don’t want to sit in front of a computer all the time,” Ulrich said. “I really encourage my students to go to archives to look at the physical materials.”

Bernstein said he appreciates the process of physically accessing a source.

“It’s definitely a different experience than walking into an archive, picking something up, and physically touching it, and also having the conversation with the archivist that leads me to other sources,” Bernstein said.

Sometimes, the original artifact includes information not available in the digitized version, according to Blair.

“In one of the books I was using, the metadata—that is, the information describing the book—in the archives had noted that there was a fold-out table in the back. But it wasn’t in the digitized copy of the book,” she said.

For Frey, it’s important that students and professors consult both physical and digital sources.

“Human beings are funny. They say, I did this this way before, now I can do it this way, and they replace what they used to do with the new method, instead of combining the best of both methods and learning the most that way,” she said. “And that’s how research should be, a combination of digitization and looking at the original artifacts.”

—Staff writer Archie J.W. Hall can be reached at archie.hall@thecrimson.com.

—Staff writer Lucy Wang can be reached at lucy.wang@thecrimson.com.
