The sorts of “fragile” materials that often come into the Libraries’ care are easy to imagine, as are the challenges presented by their delicacy: a stack of warped vinyl records, a frayed and faded manuscript, or a scrap of ancient papyrus. Efforts to preserve such items, and in some cases to digitize them, are mission-critical for a research library. In just the past few decades, however, a new and urgent mode of preservation has emerged: web archiving. Web archives preserve vulnerable information that may disappear from the live web and capture the ways in which selected websites have evolved over time.
Websites, along with other born-digital assets (those with no physical counterparts), are at high risk of disappearing. When a publication folds, a government changes hands, or a political movement peters out, websites may be taken down or lost. It is now the necessary work of web archivists, including a team at Columbia Libraries, to capture and preserve elements of potential scholarly interest from the live, ever-changing web. Today, the Libraries offers a growing collection of web archives, all of which are publicly accessible through Archive-It, “a subscription web archiving service from the Internet Archive that helps organizations to harvest, build, and preserve collections of digital content.”
“Our culture, our society, and our entire world take place on the web,” said Web Resources Collection Coordinator Alex Thurman. “Along with most everyone else in the world, users of the Libraries consider the web a crucial tool. Web archiving ensures that web-based resources are preserved for future scholars, not lost to the fast pace and impermanence of the Internet.”
The Libraries began formally collecting web resources in 2008 with support from the Andrew W. Mellon Foundation and in partnership with the University of Maryland. Part of the program sought to assess the feasibility of incorporating web resources into existing collections.
As with standard print collecting, web collecting is limited by space and resources, so the Libraries had to decide exactly what to collect and archive from the web. It chose human rights as the focus of its exploratory web archiving work, given its existing collecting commitment in the area, which predates the Internet.
“Human rights worked well for web archiving because the Libraries already had a strong collection of print materials on the subject under the Center for Human Rights Documentation and Research,” said Pamela Graham, Director of Humanities & Global Studies in the Libraries. “And because so much of the work produced by organizations and scholars in the field is only available online, the web archive both complemented our existing collection and filled a pressing need to conserve these resources.”
With the Mellon Foundation’s support, the Libraries built the Human Rights Web Archive, a searchable collection of archived copies of human rights-related websites. The initial grant project also spurred two subsequent phases: expansion into other thematic collections and, later, collaboration with fellow research libraries and web archiving programs.
“Our intention was – and still is – to supplement the Libraries’ print and other digital collections with important material that is only accessible on the web,” Thurman said. “Our goal is to advance the collecting and teaching missions of Columbia’s libraries and schools.”
One example of this commitment is the Avery Architectural & Fine Arts Library’s Historic Preservation and Urban Planning Web Archive, which not only enhances the library’s print and special collections but also supports programs in preservation and urbanism at the Graduate School of Architecture, Planning and Preservation (GSAPP). Steered by Architecture Librarian Chris Sala, curation of this particular web archive is heavily informed by her work with GSAPP faculty and students.
For example, recent additions to this archive responded directly to a new course on urban planning and resiliency, an emerging field of study that surveys how developers and engineers proactively plan cities that can withstand natural disasters.
“Because the subject of urban planning and resiliency is still relatively new, there were few published articles or books available to students in the course,” said Sala. “But developers, engineers, and city planning officials share their work on their websites, which provided excellent content for the web collection and addressed many of the students’ research needs.”
In addition to building its own publicly accessible web archives, the Libraries has joined collaborative web collecting efforts with peer institutions through the IvyPlus partnership and with international partners.
“Collaborative collections are a unique and forward-thinking approach to web archiving,” said IvyPlus Web Resources Collection Librarian Samantha Abrams, who heads a cohort of academic research libraries in web-based collecting.
Though the volume of the Libraries’ web archives has not matched that of its print holdings, these collecting commitments are central to the mission to build, sustain, and facilitate access to scholarly material regardless of format, especially material that faces potential degradation or evanescence.
“It’s relevant and important work to engage in collecting something as fragile as web content,” said Abrams. “Digital content can simply disappear because the web is always changing, but it’s the responsibility of the Libraries to provide researchers with the most relevant and useful information possible – which is something that we, as archivists and librarians, are proud to stand by.”