Examples of types of projects in scope

The Program welcomes projects that involve analyses, improvements and innovations in areas that include (but are not limited to):


Development of new tools for web harvesting, web archive curation and use

Examples:

●     Scoping crawls, analysis of website archivability      

●     Capturing web content that is difficult to harvest via traditional crawling (rich media, database-driven features, dynamically generated content, changing URIs)

●     Automated QA analysis of harvests

●     Automated metadata extraction/generation

●     Data mining of web archives

●     Data visualization in web archives

●     New APIs for analysis or discovery of web archives

●     Browser extensions for harvesting websites on demand                  


Improvement/extension of existing web platforms in collaboration with publishers/developers to support archiving

Examples:  

●     WordPress plugin for archiving

●     Drupal optimization for archiving                   

●     Solr 4 optimization for indexing the contents of .warc files, improving full-text search relevance, result clustering, and multi-language support

●     APIs (e.g. Twitter)

●     Browser configuration to improve navigation of archived websites (e.g. Chrome)

●     Promoting native compliance with Memento 

●     Integration of discovery services (such as widely adopted metasearch products)


Improvement/extension or innovative implementation of existing web archiving tools/services

Examples:

●     Memento

●     SiteStory

●     WAIL                

●     WebCite

●     ArchiveReady


Packaging/bundling of existing tools to assist in the creation or use of web archives

Examples:

●     Supplementing crawlers with headless browsers or modules for capturing special content, e.g. streaming media

●     Providing a suite of data mining and/or data visualization options

The Program seeks a wide diversity of proposals within the scope described above. Projects must focus on automated tools and must thoroughly document the tools' expected behavior. Preference will be given to projects that modularly extend an existing tool or that produce a fully functional tool, demo site, or working prototype. When appropriate, the creation of a full technical specification will be considered a final deliverable (an accompanying product specification is desired but secondary).