
The HeritageCrowd Project

A Case Study in Crowdsourcing Public History

Shawn Graham, Guy Massie, and Nadine Feuerherm

Digital history is public history: when we put materials online, we enter into a conversation with individuals from all walks of life, with various voices and degrees of professionalism. In this essay, we discuss our experience in relinquishing control of the historical voice in order to crowdsource cultural heritage and history. What is the role of the historian when we crowdsource history? Whose history is it anyway—the historian's or the crowd's? Which crowd can lay claim to it?

Wikipedia, the exemplar par excellence of what crowdsourcing can accomplish, has perhaps the most succinct and elegant definition of the term: “a distributed problem-solving and production model.”1 This definition dovetails nicely with recent polemics about the nature of the digital humanities more generally, where digital work is not just about solving a problem but also about “building things,” as Stephen Ramsay has argued.2 Notice that this definition says nothing about the nature of the crowd, its professionalism, or its training; there is an implicit suggestion that “anyone” can be part of the crowd. Notable projects that crowdsource historical problems range from Ancient Lives, a project to transcribe the Oxyrhynchus papyri; to Transcribe Bentham, a project to transcribe the papers of Jeremy Bentham; to the National Geographic Society's Field Expedition: Mongolia, where contributors study satellite images of Mongolia to help direct the archaeological survey team on the ground.3

Roy Rosenzweig has made the case for the need for historians to engage audiences outside the discipline, as well as for the power of historical narratives to bring about social justice.4 On a similar note, in 1932, Carl Becker, taking part in what was already an old discussion about the professionalization of history, wrote, “If the essence of history is the memory of things said and done, then it is obvious that every normal person, Mr. Everyman, knows some history.”5 In the age of Wikipedia as the go-to place for historical knowledge and of increased funding cuts to humanities research, the need to reach out to the public has never been greater. Edward L. Ayers argued that while a “democratization of history” has taken place since the emergence of new historical fields in academia, a “democratization of audience” has yet to come.6 Digital history has the potential to address these concerns by linking members of a community together to collaborate on historical projects.

Nevertheless, the Internet is not an inherently even playing field; to digitize is not to democratize.7 Technical literacy, closed algorithms for search engines, unequal access to quality hardware, and poor Internet connections mean that there is a disparity among users in their ability to manipulate the Internet for their own purposes.8 Colleen Morgan points out that “when even considered,” the audience for digital work “is almost always assumed to be male, white, western users of technology, a broadly defined ‘public’ for whom digitality is an obvious boon.”9 To put historical materials online is not a neutral process; to ask the crowd to solve a problem has the effect of creating self-selected groups, people who participate not just by interest but also by technological proficiency.

Our own project, which we christened HeritageCrowd, attempts to take these issues into account as we provide tools for the group expression of local history and heritage in certain rural communities in Eastern Canada, using low-tech “old digital media,” such as short message service (SMS) and voice mail, built into a web-based system.10 We wanted to bring the potential of digital technology to bear on a region with relatively low Internet access but also a relatively high interest in local history. (See the images in the web version of this essay at http://WritingHistory.trincoll.edu.)

Canadians may lead the world in Internet use,11 but this usage is not distributed equitably—for instance, across the rural and urban divide.12 Many rural museums and cultural heritage organizations do not have the technical expertise, human resources, or funding to effectively curate and interpret their materials, let alone to present them in a comprehensive manner over the Internet. These organizations constituted our ideal “crowds” for this project. We used two web-based platforms. The first platform is Ushahidi, a system developed in Kenya in the wake of the 2008 election violence, allowing for quick “reports” to be posted to a map via SMS messaging, voice mail (using voice-to-text software), Twitter, e-mail, and web forms.13 The second platform is Omeka, from the Rosenzweig Center for History and New Media at George Mason University, which we use to archive and tell “stories” built around the contributions submitted on the Ushahidi platform.14

Local history associations and other heritage groups form the backbone of a community's collective memory, preserving and performing their sense of historicity. At its most elementary level, the goal of our project was simply to assist local heritage initiatives by creating a web-based system that could accept and store short text contributions. The submissions that came in were then approved by members of the project team and enabled on the Ushahidi-powered site, where they were placed as reports on a map of the region.15

Research Objectives

In the initial proposal for this project, we were particularly interested in trying to address the rural-urban digital divide in Canada, by using the SMS system as the project's backbone. We asked, can public history be crowdsourced? What does that even mean? How could the SMS system be used to collect local knowledge of heritage resources? What can be curated in this way? In what ways would such a system change the nature of local knowledge, once that knowledge becomes available to the wider world on the web?

We targeted a local area with which we were familiar, Pontiac County in Western Quebec, known locally as “the Pontiac.”16 The Pontiac has only recently transitioned away from dial-up Internet connections.17 More important, over half the population does not have a high school diploma,18 an indicator of low Internet use.19 The Pontiac's sister county in the neighboring province of Ontario, Renfrew, was also a target region, for similar reasons.20 Together, these two counties are known as the “Upper Ottawa Valley.” Could a low-tech approach to crowdsourcing history reach this particular crowd, and what kind of history would emerge?

Strong institutional narratives were already at play, given the provincial boundary between our two target counties. Education is a provincial responsibility in Canada, and the province of Quebec teaches a very different historical narrative than the province of Ontario.21 The histories of the regions and of minority groups do not have any real role in the “official” history taught at the high school level. Our project, then, has the political and social goal of validating those marginalized histories, to give a sense of legitimacy to the historical narratives of the local community. This made us question the role of the historian in this context; by crowdsourcing local history, we had transcended the traditional role of the historian as being an arbiter of historical truth.22 Historians who crowdsource the writing of historical narratives may be able to empower members of a given community who may not have the same institutionalized or professional authority conceded to “experts” in the discipline. This mission is distinctly different from that of most academic historians, whose work is centered around the construction of historical narratives based on the analysis of sources, and from that of museum or public historians, who attempt to provide an impartial and objective narrative of the past for public consumption.

Initial Results

To encourage submissions from visitors to the website, we created a number of reports to “seed” it, assuming that visitors would be less likely to submit reports if the site was empty or contained few reports. As of the end of July 2011, we had received 25 reports (5 contributions by voice mail, 7 by SMS, and 13 by e-mail, from unique contributors), and the site had 50 reports listed (this number includes the 25 reports just mentioned plus reports submitted via the website). At the time of writing, the site had been open to the public for a total of 54 days. As the Upper Ottawa Valley has a population of approximately 90,000 people, this suggests that about one in four thousand people living in the targeted area made a submission to the project.
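The participation estimate can be checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are ours, and the population is the approximate figure given in the text, so the result should be read as a rough order of magnitude rather than a measured rate.

```python
# Figures as reported in the text; the population is an approximation.
contributions = {"voice mail": 5, "SMS": 7, "e-mail": 13}
population = 90_000

# Each contribution came from a unique contributor, so the total number of
# contributions doubles as the number of participants.
total = sum(contributions.values())

# How many residents there are for every one person who contributed.
people_per_submission = population / total

print(f"{total} contributions, or one per {people_per_submission:,.0f} residents")
```

The result, one contributor per 3,600 residents, is consistent with the rounded figure of about one in four thousand.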

It is difficult to judge whether or not this figure represents a low participation rate, since we have no comparable data. The promotion of the project took place by contacting local history associations and genealogical groups, churches, and museums via mail and e-mail. A brief labor disruption with Canada Post, the national postal operator, occurred in the early phases of the project, but we do not believe it to have been responsible for any significant delays in processing our mail. A large spike in submissions took place immediately after the publication of a newsprint article about the HeritageCrowd project in the urban newspaper the Ottawa Citizen (Renfrew and the Pontiac are in the city of Ottawa's hinterland).23 As Amanda Sikarskie describes in this volume, her experience with the Quilt Index database, another important historical crowdsourcing project, shows that an effective and well-organized social media campaign has the ability to vastly increase the size of the “crowd” that participates in the project.24

Reflections

From a technological point of view, our mission was simply to give people the digital tools to more easily express and share their sense of heritage and local history. During the course of the project, however, it became evident that a second crowdsourcing method could be used for a similar goal. This approach, which could be called “retroactive crowdsourcing” (for lack of a better term), involves gathering representations of local history and heritage from disparate online sources that already exist and then collecting them in an online database.25 This is different from our original concept of crowdsourcing, where we actively solicited submissions to our project from a wide community.

We trawled through a number of different kinds of sites (such as Flickr.com), other amateur and local historical and genealogical websites (such as Bytown.net), blog posts, and online exhibits. This produced a sizable collection of heritage materials. We created an example report, “St. John's Lutheran Church and Cemetery, Sebastopol Township.”26 A picture of the church taken by a Flickr user was uploaded (with permission) to the report, and a link was provided to a website that had photographed all of the headstones in the cemetery. The use of automated spiders and other software tools, such as DownThemAll or DevonAgent, could speed up this process and broaden its reach considerably.27 Indeed, this example shows one sense in which our project's focus was misplaced. Crowdsourcing should not be a first step. The resources are already out there; why not trawl, crawl, spider, and collect what has already been uploaded to the Internet? Once the knowledge is collected, one could call on the crowd to fill in the gaps. This would perhaps be a better use of time, money, and resources.
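As a rough illustration of what the "collect" step might look like, the following sketch pulls heritage-related links out of a page that has already been downloaded. It is not how DownThemAll or DevonAgent work, and it is not the procedure we followed; the class name, function name, keyword list, and sample page are all hypothetical. It deliberately leaves out downloading, politeness rules for crawlers, and the question of permission, which still has to be settled with each rights holder before anything is republished.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (label, absolute URL) pairs for links whose text matches a keyword."""

    def __init__(self, base_url, keywords):
        super().__init__()
        self.base_url = base_url
        self.keywords = [k.lower() for k in keywords]
        self.matches = []
        self._href = None   # href of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace in the link text before matching keywords.
            label = " ".join("".join(self._text).split())
            if any(k in label.lower() for k in self.keywords):
                self.matches.append((label, urljoin(self.base_url, self._href)))
            self._href = None


def collect_links(html, base_url, keywords):
    """Return the keyword-matching links found in one HTML document."""
    parser = LinkCollector(base_url, keywords)
    parser.feed(html)
    parser.close()
    return parser.matches


# Invented sample page standing in for a local history website.
SAMPLE_PAGE = """
<ul>
  <li><a href="/churches/st-johns.html">St. John's Lutheran Church and Cemetery</a></li>
  <li><a href="/contact.html">Contact us</a></li>
  <li><a href="http://example.org/headstones">Headstone photographs</a></li>
</ul>
"""

links = collect_links(SAMPLE_PAGE, "http://example.org/",
                      ["church", "cemetery", "headstone"])
for label, url in links:
    print(label, url)
```

A list of candidate links like this is only a starting point: someone still has to review each item, seek permission, and write the report, which is where the crowd could then be asked to fill the gaps.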

In hindsight, one of the ways in which the project could have attracted more submissions lay in implementing what Jane McGonigal calls “classic game rewards”—in other words, building a series of gamelike mechanics into the project. These include giving the participants “a clear sense of purpose,” as well as giving them the impression that they are “making an obvious impact” and contributing to “continuous progress.”28 Gamification is a troubled term, in that while it implies using the classical tools of games to foster engagement, it can also be taken to suggest the trivialization of the task at hand or, worse, exploitation of the user/visitor.29 Be that as it may, McGonigal cites major crowdsourced collaborations, such as Wikipedia, as being successful because of subtle systems of rewards, satisfaction, and, to some extent, social interaction.30 HeritageCrowd could foster engagement through its “comments” feature on the individual reports in the Ushahidi platform, but here we have a clear case of where the technology, the medium, shapes the message: Ushahidi is for quickly reporting crisis incidents, not for fostering a dialogue about them. For our purpose, a great deal of modification needs to be done to the core platform, perhaps by merging the reporting system with the autocreation of wiki pages.

Although the accumulation of reports on the Ushahidi-powered website's map could be seen as an indicator of progress over time, these reports first had to be approved by us before becoming visible (a decision taken to filter out potential spam or otherwise unsuitable material). The instant satisfaction of having made a contribution to the project was therefore lost. Similarly, one would not have been able to track one's own individual progress (that is, with a personal account and information interface that lists the number of contributions). Either further development of the Ushahidi platform or the use of an additional platform to track this data for users could provide this benefit.

The concept behind the project (crowdsourcing local history and heritage using SMS networks and voice mail) proved to be an obstacle in some cases. When we visited community events or corresponded with individuals who expressed interest, some people were unsure what exactly we were asking them to do. This was most likely because the project was centered on a concept with which many people in the region were unfamiliar. We could easily explain it in person whenever we were asked about the project, but it is entirely plausible that some contributors made submissions to the project (by sending a text message or voice mail, for instance) without having fully understood how the submissions were compiled onto our website. (The article in the Ottawa Citizen was published digitally for a while with the headline “Text If You Are a Descendant of Philemon Wright.”31 We duly received a number of text messages with the exact message “I am a descendant of Philemon Wright.”) The layout of the main website also causes some confusion, as it is not immediately obvious what visitors can do on the site or how to do it. We believe that this confusion was partly responsible for the evolution of the project from a tool where collaboration and community support were envisioned, a process of sharing authority, to one where we the historians seem to be using the crowd more as a reservoir, contrary to our intentions.

Finally, we had a number of potential contributors who were worried that what they had to contribute was not “professional” enough and who were thus reluctant to actually contribute; in these cases, our role seemed to be to reassure them that what they knew, what they valued, did have “official” historical value. One community activist approached us with a body of materials that she had collected as part of a continuing negotiation with a local city council in Quebec over the development of a neighborhood. This neighborhood is predominantly Anglophone, while the city itself is largely Francophone. The history and memory of this one neighborhood were thus caught up in larger issues of identity, power, and institutionalized interpretations of history. The city council wishes to rezone the neighborhood to allow for high-rise condominiums. The activist approached us to see if we could “legitimize” what she had collected, in the hopes of forcing the city to adopt specific heritage recommendations into its planning process. The act of collecting community knowledge, since it was being done via our university-funded project, seems to put an imprimatur of “truth” and legitimacy on anything submitted and displayed. On all submissions, the Ushahidi platform uses the term verified in the sense of crisis management, to indicate that what is described in the submission actually happened. Our approach was initially one where we used the term simply as a spam filter. Clearly, this was far too simplistic and carries implications far beyond what we initially imagined.

Early Conclusions

At this early stage in our project, the single most important observation is the role our project seems to have in validating individuals' and groups' historical knowledge. Even if we have not yet collected masses of documentation, we provide a new avenue for nonprofessional knowledge to enter into the academic world of knowledge production. Consequently, when a platform meant for one domain is adapted to another, its procedural rhetoric needs to be taken into account in designing how the project works.32 Our authority was not shared; rather, the platform and our use of it seem to have reinforced the primacy of the historian.

Were we to start this project over, we would spend more time modifying the basic platform to combat this result. The terminology and structure of the platform as it currently stands give more authority to the data displayed than might be warranted. We had imagined that if a contribution was made that might not be factually accurate or that carried political bias, a discussion would take place in the comments for that item and would result in the issue resolving itself (much like what happens on Wikipedia). This has not yet happened. Perhaps the fact that this project is university funded and carried out by university researchers and students also gives immediate “weight” and authority to anything displayed on the website, thus inhibiting discussion.

When the aim of a crowdsourced project is to transcribe documents, it is self-evident what needs to be done. When the aim is a bit more nebulous, as in the case of HeritageCrowd, we suggest the following guidelines:

 

• Choose your base platform carefully, thinking through the technological and epistemological implications. (As it happens, Ushahidi as a platform does work in terms of widening access beyond the tech-savvy: we did get voice and SMS contributions and so met at least that aim of our project.)

• Collect what already exists.

• Seed your site with the collected existing material so that you can identify the gaps.

• Narrow your target when communicating with the public: get them to fill the holes.

• Make sure to design for engagement.

• Put initial resources into publicity. Building your crowd is key. Get out, walk the walk, and talk to people. Identify, contact, and cultivate key players.

• Have an “elevator pitch.” Make sure that the project can be described completely in 30 seconds or less. Build your outreach and social media strategy around getting that pitch in front of as many eyes in your target crowd as possible.

 

The funding for HeritageCrowd was limited to only a few summer months. However, because the project uses open-source, freely available software, its continuing operating costs amount only to the cost of web hosting. We will be taking the lessons we learned in the summer of 2011 and using them to improve our approach. With time, we hope to reach more of our target audience. HeritageCrowd will also become a platform for the training of students in digital history, outreach, and exhibition. As we collect more materials, we will be developing the Omeka-based “Stories” part of our site, allowing individuals, societies, students, and researchers to tell the stories that emerge from the crowdsourced contributions. It is still our hope that the role of the digital historian might be shifted away from that of the expert, dictating historical narratives from an academic podium, and toward an activist role for grassroots community empowerment. Digitally crowdsourced history has the potential to be like a cracked mirror: it could reflect what looks into it, and while it might not (cannot?) produce a polished, singular view, the aesthetic pleasure will lie in the abundance of perspectives that it provides.

Acknowledgments: The HeritageCrowd project was funded by a 2011 Junior Research Fellowship from the Faculty of Arts and Social Sciences at Carleton University, whose support is gratefully acknowledged. We would like to thank James Miller, Jim Opp, John Walsh, Lisa Mibach, and the contributors to HeritageCrowd for their interest, support, and feedback. Errors and omissions are our own.

Notes

1. “Crowdsourcing,” Wikipedia, http://en.wikipedia.org/w/index.php?title=Crowdsourcing&oldid=470989039.

2. Stephen Ramsay, “Who's In and Who's Out” (text of paper delivered at MLA2011, Los Angeles, January 8, 2011, posted to personal blog), http://lenz.unl.edu/papers/2011/01/08/whos-in-and-whos-out.html.

3. Ancient Lives, University of Oxford, http://www.ancientlives.org; Transcribe Bentham, University College London, http://www.ucl.ac.uk/transcribe-bentham/; Field Expedition: Mongolia, National Geographic Society, http://exploration.nationalgeographic.com/mongolia.

4. Roy Rosenzweig, “Afterthoughts: Roy Rosenzweig,” The Presence of the Past, 1998, http://chnm.gmu.edu/survey/afterroy.html.

5. Carl Becker, “Everyman His Own Historian,” American Historical Review 37 (1932): 223.

6. Edward L. Ayers, “The Pasts and Futures of Digital History,” http://www.vcdh.virginia.edu/PastsFutures.html.

7. Compare Evgeny Morozov, The Net Delusion: The Dark Side of Internet Freedom (New York: Public Affairs, 2011).

8. Lorna Richardson, “The Internet Delusion and Public Archaeology Online” (paper presented at the annual conference of the Central Theoretical Archaeology Group, London, May 14, 2011), excerpted online at http://digipubarch.org/2011/12/14/inequalities-in-public-archaeology-online/.

9. Colleen Morgan, “Contextualized Digital Archaeology—Chapter 3,” draft of PhD diss., Anthropology Department, University of California, Berkeley, p. 3, http://middlesavagery.wordpress.com/2011/12/19/contextualized-digital-archaeology-dissertation-chapter/.

10. HeritageCrowd, Carleton University, http://heritagecrowd.org. In May 2012, the site was maliciously hacked and, as of this writing, is off-line, as described in Shawn Graham, “How I Lost the Crowd: A Tale of Sorrow and Hope,” Electric Archaeology, May 18, 2012, http://electricarchaeologist.wordpress.com/2012/05/18/how-i-lost-the-crowd-a-tale-of-sorrow-and-hope/.

11. ComScore, “The 2010 Canada Digital Year in Review,” March 2011, http://www.comscore.com/content/download/7717/133765/version/5/file/comScore+2010+Canada+Digital+Year+in+Review.pdf.

12. Compare Ian Marlow and Jacquie McNish, “Canada's Digital Divide,” Globe and Mail, April 2, 2010, http://www.theglobeandmail.com/report-on-business/canadas-digital-divide/article1521631/.

13. Ushahidi, “About Us,” http://ushahidi.com/about-us. See also “Mobile Services in Poor Countries: Not Just Talk,” Economist, January 27, 2011, http://www.economist.com/node/18008202.

14. Roy Rosenzweig Center for History and New Media, Omeka, http://omeka.org.

15. “Approving” a report was a step built into the platform; no report could be viewed unless it was approved. We did not edit or turn away submissions unless they were manifestly spam.

16. One of us has deep family ties in the area.

17. MRC de Pontiac, “Plan stratégique—Vision Pontiac 2020,” April 2009, http://www.mrcpontiac.qc.ca/documents/vision2020/Diagnostic%20-%20MRC%20de%20Pontiac.pdf.

18. MRC Pontiac, “Demographic and Socio-economic Profile, Pontiac Municipal Regional County,” 2006, http://web.archive.org/web/20111011002302/http://mrcpontiac.qc.ca/en/regional/regional_demographic.htm.

19. Statistics Canada, “Internet Use by Individuals, by Selected Characteristics,” 2005–9, http://www.statcan.gc.ca/tables-tableaux/sum-som/l01/cst01/comm35aeng.htm.

20. The proportion of individuals in Renfrew County without a high school diploma is about 26 percent. Statistics Canada, “2006 Community Profiles—Renfrew County and District Health Unit,” http://www12.statcan.ca/census-recensement/2006/dp-pd/prof/92-591/index.cfm?Lang=E.

21. Problems with the provincial history curriculum, as it pertains to the Anglophone history of Quebec, have long been recognized. See, for instance, Sam Allison and Jon Bradley, “Quebec Exam Is Bad History, Written in Bad English,” Montreal Gazette, July 5, 2011, http://j.mp/gazette-bad-english.

22. See, for instance, the papers in the special edition edited by Steven High, Lisa Ndejuru, and Kristen O'Hare, “Sharing Authority: Community-University Collaboration in Oral History, Digital Storytelling, and Engaged Scholarship,” Journal of Canadian Studies/Revue d'études canadiennes 43, no. 1 (2009).

23. Matthew Pearson, “Text If You Are a Descendant of Philemon Wright,” Ottawa Citizen, June 25, 2011.

24. Amanda Grace Sikarskie, “Citizen Scholars: Facebook and the Co-creation of Knowledge,” in this volume.

25. Guy Massie, “Photos, Exhibit Research, and Thoughts about Crowdsourcing,” HeritageCrowd Journal, June 24, 2011, http://www.heritagecrowd.org/journal/?p=38.

26. “St. John's Lutheran Church and Cemetery, Sebastopol Township,” June 23, 2011, http://heritagecrowd.org/reports/view/39.

27. William J. Turkel, “Spider to Collect Sources,” March 23, 2011, http://williamjturkel.net/2011/03/22/spider-to-collect-sources/; DownThemAll, http://www.downthemall.net/; DevonAgent, http://www.devontechnologies.com/.

28. Jane McGonigal, Reality Is Broken: Why Games Make Us Better and How They Can Change the World (New York: Penguin, 2011), 222–23.

29. Ian Bogost, “Gamification Is Bullshit: My Position Statement at the Wharton Gamification Symposium,” August 8, 2011, http://www.bogost.com/blog/gamification_is_bullshit.shtml.

30. McGonigal, Reality Is Broken, 219–46.

31. Philemon Wright was the first major colonist and landowner in the region. See Dictionary of Canadian Biography Online, http://www.biographi.ca/009004-119.01-e.php?id_nbr=3738.

32. See Ian Bogost, Persuasive Games: The Expressive Power of Videogames (Cambridge, MA: MIT Press, 2007), on how software processes force a particular rhetoric of expression in the final representation of digital data.
