Project Roles

Harvest: IA, CDL, LOC, UNT
Curation/selection: LOC, IA, CDL
Curation tool development: UNT
Storage: LOC, IA, UNT
Content analysis: IA, UNT
Access: IA, CDL

Project Partners

The construction of the End of Term Web Archive was a truly collaborative project, drawing on the skills and resources of each partner institution.

California Digital Library (CDL)

The California Digital Library provides digital library services and tools to the University of California and the digital library community at large. CDL developed the open-source eXtensible Text Framework (XTF) behind this archive.


George Washington University (GWU)

The George Washington University Libraries has been building and using software tools to support researchers collecting social media data since 2012. With the support of grants from IMLS, NHPRC, and CEAL, GW Libraries developed the open-source Social Feed Manager (SFM) with capabilities to collect from Twitter, Flickr, Tumblr, and Sina Weibo. In addition to using SFM to support academic research in a wide array of disciplines and to support teaching and learning, GW Libraries builds and publicly shares social media data sets for reuse.

Internet Archive (IA)

The Internet Archive (IA) is a 501(c)(3) non-profit that was founded to build an internet library offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format. Founded in 1996 and located in San Francisco, IA's collections include texts, audio, moving images, television, and software as well as archived web pages. IA participates in the NDIIPP program and is a founding member of the National Digital Stewardship Alliance. Please visit the Internet Archive home page to learn more.


Library of Congress (LOC)

The Library of Congress has been archiving the web since 2000, with collections focusing on sites of Legislative Branch agencies, U.S. House and Senate offices and committees, and U.S. national election campaigns, among other thematic collections. More information about the Library's Web Archiving program is available at the Library of Congress Web Archives page; more information about the Library's other resources can be found at the Library of Congress home page.


Stanford University Libraries (SUL)

Stanford Libraries has been archiving the web for over 15 years and is an active member of the International Internet Preservation Consortium (IIPC), the National Digital Stewardship Alliance (NDSA), and other efforts like the End of Term Web Archive. Stanford librarians collect the institutional legacy and the research and learning output of the university, as well as at-risk sites like Iranian blogs, FOIA sites, Congressional Research Service (CRS) reports, fugitive US agencies, Bay Area governments, and the CA .gov domain. Stanford’s web archiving program is a growing complement to the library’s robust collection-building activities. The library stores web archives locally in the Stanford Digital Repository, provides discovery through its catalog, and enables browsing through a local instance of the Wayback web archive replay platform.

University of North Texas Libraries (UNT)

The University of North Texas Libraries began web archiving in 1997, when, as part of the Federal Depository Library Program, they created the CyberCemetery to capture and provide permanent public access to the web sites and publications of defunct U.S. government agencies and commissions. UNT also participates in the NDIIPP program and is a founding member of the National Digital Stewardship Alliance. More information about the UNT Libraries' web archiving activities can be found at the following link: About Web Archiving at UNT.


U.S. Government Publishing Office (GPO)

The U.S. Government Publishing Office manages the Federal Depository Library Program and is charged with providing permanent public access to government publications.