Website Preservation

Managing University Records

Columbia website, 2009.

Websites are university records, and the University Archives is therefore committed to capturing the University’s entire web domain and providing continued access to it. The Columbia University Libraries have been capturing Columbia websites via the Archive-It service since June 2010. All sites with a columbia.edu web address are included in our regularly scheduled web crawls.
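
If you are unsure whether a site falls under the columbia.edu web address rule above, the check can be sketched roughly as follows. This is only an illustration: the function name `in_columbia_domain` is hypothetical, and the actual crawl scope is set in the Archive-It service, whose configuration is not described here.

```python
from urllib.parse import urlsplit


def in_columbia_domain(url: str) -> bool:
    """Return True if the URL's host is columbia.edu or one of its subdomains.

    Hypothetical helper for illustration; it does not reflect how the
    crawler itself decides what is in scope.
    """
    # urlsplit only recognizes the host when a scheme or "//" is present,
    # so bare addresses such as "www.columbia.edu" get a "//" prefix.
    if "://" not in url:
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Match the domain itself or a true subdomain, not look-alikes such as
    # "columbia.edu.example.org".
    return host == "columbia.edu" or host.endswith(".columbia.edu")
```

A site for which this returns False is one of the cases where you would contact uarchives@columbia.edu to have it added to future crawls.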

The Library’s Web Resources Collection program has put together the Guidelines for Preservable Websites. These guidelines not only help with our web archiving efforts but will also make it easier for search engines to find and index your site.
If you maintain Columbia content in a web domain outside of columbia.edu, please contact uarchives@columbia.edu so we can be sure to include your site in our future crawls.
If you are retiring a website, either as part of a redesign/relaunch effort or at the end of a project or initiative, please contact uarchives@columbia.edu so we can capture your site before it goes offline.
To learn more about our web archiving efforts and to learn how to access the archived pages, please check out the Web Archives page on this site.

RBML Instagram (@columbia_rbml), March 2021

Archiving social media has always been imperfect: how do you capture dynamic content that is constantly changing, and how do you navigate platforms that require users to log in to access any content? So far, the best approach is to manually archive the desired social media feeds directly as a logged-in user of the respective platforms. This way, the look and feel of the platform is preserved in the archived copy. There are two good options for this approach.

Conifer is a free tool, formerly called Webrecorder, that is still based at Rhizome (New Museum). With Conifer, you set up a free account, log in, and use the web interface to load the specific web content that you want to archive. As you click around in your browser, the tool archives each page or file that you open. You can stop recording at any time and immediately replay the content you have just archived. The data is stored in Conifer's cloud, but you can also download it and save it locally. Importantly, archives created in Conifer can be downloaded in .warc format, which can be uploaded into our University Archives web collection.

The Chrome extension ArchiveWeb.page is an even lighter-weight version of this manual approach to web archiving, created by the same developer, Ilya Kreymer. This tool is very easy to use: you only need to add the ArchiveWeb.page extension to your Chrome browser. Once you are logged into Twitter, Facebook, Instagram, etc., go to the feed you wish to archive and use the extension to archive exactly what you want by clicking on each relevant page or link in your browser. These captures can be stored as collections on the device running the browser and can also be downloaded as .wacz or, preferably, .warc files, which can be uploaded into our University Archives web collection.
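
If you only have a .wacz download and want to see which .warc data it contains before transferring it, a minimal sketch in Python might look like the following. It assumes the usual WACZ layout, in which the file is a ZIP package with its WARC data stored under an `archive/` folder; the function name `list_warc_members` is hypothetical and is not part of ArchiveWeb.page or Conifer.

```python
import zipfile
from typing import BinaryIO, List, Union


def list_warc_members(wacz: Union[str, BinaryIO]) -> List[str]:
    """Return the names of WARC files stored inside a WACZ package.

    Assumption: the package keeps its WARC data under "archive/", as in
    the common WACZ layout. Other files (page lists, indexes, metadata)
    are ignored.
    """
    with zipfile.ZipFile(wacz) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.startswith("archive/")
            and name.lower().endswith((".warc", ".warc.gz"))
        )
```

For example, `list_warc_members("my-collection.wacz")` (with a file name of your own) would list the WARC files in that package; they could then be extracted with the standard `zipfile` module if a plain .warc file is needed.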

For more information, please see our guidelines on how to transfer digital records.
