Life after the Historical Thesaurus of the OED
Christian Kay and Marc Alexander

The Historical Thesaurus of the OED (HTOED) was published by Oxford University Press (OUP) on 22 October 2009 (Kay et al. 2009). It consists of two substantial volumes, the first containing some 800,000 meanings arranged in semantic categories, the second an index. Publication was the culmination of 44 years of work by a team in the English Language department at the University of Glasgow, led initially by Professor M. L. Samuels, and from 1989 by Professor Christian Kay.

Final Days

Through accident rather than design, the very end stages of data entry in July 2008 overlapped with a conference at Glasgow involving a number of team members, fraying the nerves of all involved. So it was that midway through the conference the project director returned to the office to be informed that not only had the last slip been typed into the database, but by serendipity or careful planning this final entry was the word thesaurus itself. A week later, a CD containing all the data files was posted to OUP; the project team insisted on a formal handing-over ceremony on the department doorstep, with photographs of Christian Kay solemnly delivering the envelope into the hands of the University janitor entrusted with the task of conveying it to the mailroom. However, the nature of the data and the complexity of the project were such that, after this milestone, there followed a long series of further deadlines and other landmarks, rather than any single point of completion.

The data went on a long journey in the final year. Once the paper slips had been categorized, they were typed into dBASE files using an in-house data entry program in batches of around 30. These files were combined into larger batches, converted to Microsoft Excel format, and then combined once more into 186 files, each roughly representing a large semantic domain. At OUP, these were converted to plain text format, checked using Perl scripts, and then converted once more to three XML files to be provided to the typesetters, who then produced InDesign files for printing. These typeset pages were supplied as PDFs to Glasgow, where they were printed, proofed, scanned as TIFFs, and converted back to PDFs. The annotations on these were inserted into the XML and thus updated on the InDesign files. At each stage of this process, nicknamed the "dance of the files," the possibilities for disaster in the conversion process increased; in many ways, it was a blessing that those problems which did arise were usually relatively minor, although distinctly trying under tight publication deadlines.

There then began a short break while the volumes were typeset abroad, although any relief this provided was marred by a cautious member of the team calculating that the typesetter's claim of a "99.995% accuracy rate," when applied to HTOED's 22.74 million pieces of data, would result in over 1,100 new errors being created. A welcome distraction from such speculations was provided by the arrival of OUP's designs for the HTOED wallchart, included with every copy to give an overview of its conceptual hierarchy. Proofreading then went quickly and with a minimum of hiccups, although it naturally revealed some blunders we were glad to remove before publication — one such was an unfortunate creature noted within the Life section as being "devoid of Brian," soon corrected to "devoid of brain."

The first advance copy arrived in Glasgow on 21 August 2009. A host of colleagues from across the entire campus, almost all of whom had lived with the presence of the Thesaurus project from their very first days at the University, found reasons to come to the department and finger its binding somewhat incredulously. It was at this point that we realized that we had actually succeeded despite all the obstacles along the way. As Philip Pullman wrote in his endorsement of the book: "... here is the information we had to spend hours hunting down through the thickets and coverts of the great OED, shot, stuffed, and mounted for us".

Publication

The calendar...
