Reviewed by:
  • The Nineteenth-Century Press in the Digital Age by James Mussell
  • Simon J. Potter (bio)
The Nineteenth-Century Press in the Digital Age, by James Mussell; pp. xviii + 232. Basingstoke and New York: Palgrave Macmillan, 2012, £55.00, $95.00.

The digitisation of nineteenth-century newspapers and periodicals allows us to interact with Victorian print culture in quite novel ways. We can ask new questions about the period and provide new answers to old ones. We can design research projects that would not have been possible a decade ago. We can channel primary material not just into our own offices, but also into the classroom, bringing students into more intimate contact with Victorian journalism and literature. Yet scholars have been slow to explore this brave new online world, to think about the opportunities it offers, or the difficulties it creates. James Mussell is an exception to this rule. He worked as a postdoctoral research assistant on the Nineteenth-Century Serials Edition (ncse) project before moving to the English department at the University of Birmingham and then to the University of Leeds; The Nineteenth-Century Press in the Digital Age is informed not just by his experience using digital texts, but also by his involvement in their production. This makes the book all the more valuable.

Mussell’s central argument is that digitisation does not simply make Victorian print culture more widely and easily available: it also has a transformative effect. When we sit in our offices and access digital newspapers and periodicals, we are not using the same thing as when we go to the library and read the original nineteenth-century printed copies. Access. Use. Read. The differences between these words are significant for Mussell. Digitisation might claim to present to us an exact replica of the original source, through page facsimiles, for instance. Yet we are actually accessing new resources that take data from print versions and sort, manipulate, and present that data in different ways.

If we mistakenly assume that digital newspapers and print newspapers are the same thing, we may fall victim to a range of other errors. When, for example, we use digital newspapers to conduct text searches, we do something that no Victorian reader could ever have done. We thus interact with the data contained in these newspapers in quite a different way than contemporaries did. Moreover, although our searches end up producing page images (or sections of page images) on our screen, we are not actually searching those images. Rather, we are searching digital text that has been generated from those page images through optical character recognition (OCR) and processed in a number of ways. The most obvious issue here is that the OCR process does not generate reliable transcripts. The poor quality of nineteenth-century print, the poor quality of the microfilms from which many digital newspaper archives have been produced (because it is cheaper to digitise from microfilm than from print), and simple issues like the curvature of text in the margins of the page caused by the binding of newspapers into volumes, mean that OCR text is often garbled. A percentage (often a significant percentage) of the page is thus not actually searchable. When we run a search, we think we are searching the entire newspaper archive, but in fact some proportion is being ignored. Some digital newspaper archives, like the National Library of Australia’s wonderful TROVE website, allow you to see the OCR text side-by-side with the page image, and thus often to realise how unreliable searching is. TROVE even allows users to correct the OCR text. Few commercially produced digital newspaper archives, however, would want to make users aware of such problems.

Ideally, all digital newspapers would be editions rather than archives: extensive editorial intervention would bring us closer to the printed version. Yet the vast amount of material to be digitised means that this is not possible, even for a project like ncse, which aimed to select a relatively limited body of material and subject it to very careful processing. As Mussell suggests, probably the best we can hope for is an edited archive.

Mussell argues convincingly that all...
