Gathering the ’Net: Efforts and Challenges in Archiving Pacific Websites
In addition to more traditional material (books, journals and other serial publications, brochures, music, films, manuscripts, photographs, postcards, and archives), the University of Hawai‘i-Mānoa (UHM) Library’s Hawaiian and Pacific Collections are now actively collecting websites. With so many new websites being created in and about the Pacific Islands region, and so much more information being made available online, at times exclusively so, it has become increasingly clear to the librarians of these collections that adequately documenting this period in history requires collecting and preserving websites. The UHM Library has been attempting to archive websites in one form or another since 2001. This essay will discuss the importance of collecting Pacific websites, describe how the Hawaiian and Pacific Collections are finding solutions for the inherent challenges of preserving websites, and explore some potential future directions that would strengthen the project and meet the information and research needs of the Pacific Islands region.
Why Preserve Websites
The UHM Library’s Hawaiian and Pacific Collections have long recognized the research value of websites. In many ways the UH effort seeks to parallel a web archiving project being conducted at the Library of Congress (LOC), which (as noted on the LOC website) is “composed of sites selected by subject specialists to represent web-based information on a designated topic. It is part of a continuing effort by the Library to evaluate, select, collect, catalog, provide access to, and preserve digital materials for researchers today and in the future.”1 The most recent survey of web archiving initiatives identified forty-two institutions worldwide, with the vast majority of those focused on archiving websites limited to the hosting institution, its nation, or its region or state within a nation (Gomes, Miranda, and Costa 2011). This same survey identified the UHM web archiving project as one of only two initiatives that seek to collect internationally, based on a specific geographic area.2 As will be discussed, however, the national libraries of Australia and New Zealand do collect beyond their national boundaries.
Vast amounts of information from and about the Pacific are being created digitally and made available via the Internet. Governments, nongovernmental organizations, and businesses use web pages as informational brochures, as tools for interacting with and serving citizens and customers, and as repositories for their work. Citizens frequently use blogs to voice individual opinions, at times counter to government preference. Pacific websites are also a meeting place for the Pacific diaspora, connecting far-flung communities and collecting their voices as new cultural identities are navigated and shaped. And, of course, Pacific websites create strong connections between the Pacific region and the rest of the world. Pacific history, culture, and identity are being created and explored using the medium of the web, and librarians in the UHM Hawaiian and Pacific Collections feel it is important that all of this information be preserved for the long-term benefit of researchers and of the citizens of the Pacific.
Although of extreme importance, websites are not a stable medium and thus pose multiple challenges for long-term preservation. For one, websites are constantly changing and being updated, much as if a new edition of the same book were being published daily. Therefore the same website must be repeatedly “collected.” Organizations that produce serial publications may maintain only the most current issue online, assuming that older issues are not of interest (when in fact they can be of great historical value). Then there is the regularity with which entire websites completely and irretrievably disappear. Broken links and “404 not found” messages are well-known and frustrating realities to even the most casual user of the Internet. While printed annual reports do not disintegrate when organizations shut down, websites disappear as soon as there is no funding to maintain them on a server. Government websites are highly susceptible to change or deletion in response to shifting political agendas. With so many economic and social fluctuations that have immediate impact on the existence of a website, libraries should not and cannot...