Monday, December 12, 2016

Harvesting Government History, One Web Page at a Time

Harvesting Government History, One Web Page at a Time.  Jim Dwyer. New York Times. December 1, 2016.
     With the arrival of any new president, large amounts of information on government websites are at risk of vanishing within days. Digital federal records, reports and research are very fragile. "No law protects much of it, no automated machine records it for history, and the National Archives and Records Administration announced in 2008 that it would not take on the job."  Referring to government websites: “Large portions of dot-gov have no mandate to be taken care of. Nobody is really responsible for doing this.”  The End of Term Presidential Harvest 2016  project is a volunteer, collaborative effort by a small group of university, government and nonprofit libraries to find and preserve valuable pages that are now on federal websites. The project began before the 2008 elections. Harvested content from previous End of Term Presidential Harvests is available at http://eotarchive.cdlib.org/.

The project has two phases of harvesting:
  1. Comprehensive Crawl: The Internet Archive crawled the .gov domain in September 2016 and will crawl it again after the inauguration in 2017.
  2. Prioritized Crawl: The project team will create a list of related URLs and social media feeds.
Political changes at the end of presidential terms over the past eight years have made many people worried about the longevity of federal information.
