"What does it take to archive a linear foot of the Web?" Anna Perricci asked rhetorically in our web archiving metrics breakout discussion group two weeks ago. I don't yet have a good answer for what the question's getting at, but I was gratified by the level of interest and engagement in web archiving as archiving at the just-concluded Society of American Archivists (SAA) Annual Meeting and inaugurally coscheduled Archive-It Partner Meeting.
There were no fewer than five(!) dedicated web archiving events over the course of the week: the Archive-It Partner Meeting, the SAA Web Archiving Roundtable Meeting, and three web archiving-centric sessions.
My overall impression was that the SAA web archiving community is "growing out" faster than it is "growing deep" - i.e., noticeably more institutions are doing web archiving, but not many more are doing so at greater levels of institutional investment year over year. I felt that there was a pervasive, half-joking acknowledgement that doing web archiving well (and quality assurance, in particular) was an impossible time commitment for most institutions, which made me wonder how we normalize (and celebrate) realistic best effort, what we measure to ensure our time on web archiving is best spent, and how we can collectively level up our web archiving activities.
Those are all topics I'd like to revisit later, but for now I'll conclude by providing some brief notes on the various web archiving events:
The Archive-It Partner Meeting featured presentations on a range of topics proposed by Partners, including web archives fulfilling the role of the traditional visual resources reference collection, the challenges and opportunities in running a regional web archiving consortium, the extent to which web archives can be treated as archives for description and discovery, micro-grants for incentivizing tool development, and leveraging a cloud preservation service provider for redundant storage of Archive-It data. I contributed a presentation on web archiving metrics: what we're measuring, why we're measuring, and how we could be measuring better. We also heard from Internet Archive staff about the latest Archive-It program and platform developments, the K-12 Web Archiving Program, and Internet Archive's strategic plan.
With over 850 members, an active mailing list, and the annual meeting, the SAA Web Archiving Roundtable has become an increasingly important forum for development and exchange of web archiving best practices. After an administrative update, there were two presentations: from Rosalie Lack on the transition of the California Digital Library Web Archiving Service to Archive-It and from Karl-Rainer Blumenthal on the quality assurance documentation he developed for the New York Art Resources Consortium. I closed out the meeting with a follow-on facilitated discussion on quality assurance, focusing on what practices had the greatest impact.
The popularity of the session was unsurprising given the finding in the 2013 NDSA Web Archiving Survey Report (PDF) that 81% of U.S. organizations doing web archiving devoted half or less of the equivalent of one full-time staff person's time to it; web archiving with limited resources is the status quo. Rather than offer individual presentations, the panelists addressed in succession their own experiences with exploring use cases, advocating for web archiving, setting up their programs, making collecting decisions, discovering collecting targets, managing crawls, describing content, performing quality assurance, and planning for the future.
The session featured four presentations dedicated entirely to outreach to different web archive stakeholders: researchers, selectors, developers, and (my presentation) campus webmasters. The audience asked questions about the experience of developers who participated in the Andrew W. Mellon Foundation / Columbia University Libraries Web Archiving Incentive Program, outreach to third-party hosts of institutional web content, and possible research access typologies for web archives and other masses of digital content.
Panelists shared their efforts to document institutional scandals in real time using web archiving and other approaches. Much of the discussion was broader in scope than web archiving, but it underscored the importance of having collecting and access policies in place beforehand in order to respond quickly and confidently to incidents.