Johns Hopkins University Press
Closing the Loop: Bridging Machine Learning (ML) Research and Library Systems
Abstract

This article argues that if libraries are to take leadership in conversations about the ethics and application of machine learning (ML) to cultural materials, they must move beyond the "perpetual future tense" of most library ML proposals and experiments, narrowing the gap that separates promises that ML will enhance the discoverability of library materials from the library systems through which most users encounter those materials. Even as ML methods have grown more powerful, nuanced, and sophisticated, ambitious hopes that ML might help better identify and describe vast library collections have been largely unmet, at least from the perspective of library patrons, researchers, and students. To address this gap, the article argues that libraries and ML researchers should work together to develop iterative, experimental, and even speculative interfaces that allow users to explore collections through ML-derived patterns that can enhance library data while educating users about ML processes, decisions, and biases.

Introduction: ML + Libraries

In the past decade, both popular and academic discourse about machine learning (ML) and artificial intelligence (AI) has increased in volume and intensity.1 This increased attention stems in large part from the fact that ML systems increasingly intersect with and influence our daily lives, from recommendation and search engines to the virtual assistants in our homes and on our phones and even hiring and lending decisions. In particular, there has been a welcome surfeit of attention to the biases that underlie and shape the outputs of ML systems, though that attention has not always translated into action by the companies, institutions, and government agencies that employ those systems. In addition, scholars and activists have begun to more deeply investigate the environmental consequences of resource-intensive ML processes and to propose alternative approaches that minimize the climate impact of such research. In short, while ML and AI are not new areas of research, they are newly salient and increasingly urgent.

For libraries, Lise Jaillant argues in the introduction to Archives, Access and Artificial Intelligence, "Focusing on the preservation of born-digital and digitized records, or on the selection of these records, is not enough. Access and the production of new knowledge are issues that need to move to the center of the scholarly debate. In particular," Jaillant continues, "Artificial Intelligence can be used by archivists to identify sensitive records, but also by researchers to process large amounts of digital archival data" (Jaillant 2022, 9). The chapters of that book outline applications of ML, including flagging sensitive materials in born-digital collections and facilitating text research without violating confidentiality; creating tags and other metadata for image archives; and improving optical character recognition (OCR) for printed originals and handwritten text recognition for manuscripts. These applications have implications not only for libraries and their users, but for ML and AI researchers. As Eun Seo Jo and Timnit Gebru argue, the ML field has begun to focus more fully on how data are gathered, annotated, and documented, but "there are still open questions regarding power imbalance, privacy, and other ethical concerns." Jo and Gebru look to libraries, and archival studies in particular, as fields that ML can learn from. As disciplines primarily concerned with documentation collection and information categorization, archival studies have come across many of the issues related to consent, privacy, power imbalance, and representation among other concerns that the ML community is now starting to discuss (Jo and Gebru 2020, 2). The relationship of ML/AI to libraries, in other words, should be not simply one of adoption or adaptation but one of mutual influence and collaboration.

In 2020, I was commissioned by the Library of Congress (LOC) to write a report outlining the current state of ML research in libraries, clarifying the primary challenges facing that work, and offering recommendations to guide future library ML projects. As I wrote in the first paragraph of "Machine Learning + Libraries" (hereafter MLLR),

The majority of machine learning (ML) experiments in libraries stem from a simple reality: human time, attention, and labor will always be severely limited in proportion to the enormous collections we might wish to describe and catalog. ML methods are proposed as tools for enriching collections, making them more useable for scholars, students, and the general public. ML is posited as an aide to discoverability and serendipity amidst informational abundance. We might imagine, for example, patrons browsing automatically-derived topics of interest across a digital library comprising thousands or millions of texts—more texts, certainly, than typical constraints on labor or expertise would allow us to imagine labelling manually.

The recommendations in MLLR seek to address many potential intersections of libraries and ML, from libraries' public mission—which might inspire programs that bolster public literacy about ML and AI—to their research agendas—which might sponsor ML experiments that foster new modes for exploring and analyzing library collections. In MLLR, in short, I advocate that libraries and information science professionals take leadership roles in current ML conversations, as both practitioners and advocates. This article does not summarize the full MLLR but draws from it and highlights some of its key recommendations. For example, MLLR outlines in far greater detail than I can here the practical, personnel, financial, and ethical challenges that libraries face when considering ML approaches. For a more robust account of those challenges, and potential solutions to them, please refer to the longer report.

In this article, I expand one core idea from MLLR, arguing that if libraries seek to lead broader ML conversations, we must narrow the gap separating promises that ML will enhance the discoverability of library materials from the library systems through which most users encounter those materials. Aside from a few exceptions, such as OCR—which I will discuss below—in MLLR I describe the "perpetual future tense" of most library ML experiments. In project and grant applications, ML experiments promise to enhance collection metadata, illuminate new connections between holdings, or generate alternative user interfaces for navigating collections. Looking forward, the literature points to a time when ML will be incorporated into library digitization workflows and help structure core discovery and research experiences. For example, a project might propose to automatically improve the metadata in a catalog of photographs by extracting embedded text, identifying subjects, or simply identifying the compositional palette.

At present, however, the results of most ML research tend to end up in data or code repositories, separate from the digital libraries from which they drew their data, and where the research findings will be seen only by a few other technically inclined researchers. ML research products are rarely ingested or incorporated into the core library interfaces through which most users, students, and domain researchers encounter collections, such that the contributions of ML to discoverability remain largely theoretical rather than practical. To frame this another way, the push to make research claims about the content of digital libraries perhaps obscures the most substantial contribution ML could make to cultural heritage collections, which is to enable new modes of browsing, relating, and serendipity for other researchers, students, and the public.

Another reason the research-to-interface loop rarely closes is that within libraries one finds significant anxiety about the reliability of ML-derived data. Throughout the field's literature, librarians and researchers alike reckon with the consequences of providing information created through probabilistic means to users who may not fully understand its provenance. In other words, even as ML methods have grown more powerful, nuanced, and sophisticated, ambitious hopes that ML might help better identify and describe vast library collections have been largely unmet, at least from the perspective of library patrons, researchers, and students. To address this gap, I argue that libraries and ML researchers must work together to develop iterative, experimental, and even speculative interfaces that allow users to explore collections through ML-derived patterns that can enhance library data while educating users about ML processes, decisions, and biases.

Histories of Library AI

In her 1976 article "Artificial Intelligence in Information Retrieval Systems," Linda C. Smith observed that the movement from local information storage, using tape and similar storage technologies, to networked systems would enable more complex, machine-assisted research. Where "batch tape-based retrieval systems" were slow and "required a trained intermediary to formulate search strategies to be processed by the system," she foresaw that "on-line systems have the technology to permit: (a) random access; (b) interactive browsing, heuristic searches; (c) person with a need for information can conduct his own search if he desires; (d) no time delay" (Smith 1976, 194). Smith's article proceeds to outline a number of specific interventions that AI might make in information retrieval, from methods of "pattern recognition" for "extracting useful information from data" that can then be incorporated into IR systems (195), to the ways AI might facilitate "alternative forms of representation" for information beyond document surrogates (201), AI techniques for automatic problem solving (204) or even "heuristic search" (209), and the development of "dynamic systems" that will learn and "improve performance over time" (212). Smith's article is striking not only for its prescient understanding of both the immediate and subsequent effects that would stem from network and AI technologies but also for the ways its vision still seems largely ahead of our time. While researching and interviewing information professionals to write MLLR, I read and heard many ideas about what ML could do for libraries in the future that were not dissimilar from what Smith's article posited nearly fifty years ago.

Reviewing the ambitions in Smith's list, we might identify a few technologies that have overhauled library systems in the intervening decades, most of which fall into the first category Smith outlines: pattern recognition and information extraction. Certainly "random access" and "interactive browsing" are now a baseline expectation for research across library catalogs and databases, while ML technologies such as OCR have made vast collections of digitized print materials available to keyword search, as well as other forms of computational exploration and research. Though OCR is undertheorized, at least from the perspective of its use by researchers, it is the central technology enabling navigation across mass digital archives of books, newspapers, magazines, and related media (Cordell 2017). The pervasiveness of OCR has shifted scholarly attention to different media, in fact, such as dramatically increasing the use of primary sources such as newspapers, which were difficult to index or cross-reference prior to digitization. This reality has knock-on effects for historical research, driving attention toward digitized media over undigitized, whether or not the digitized actually reflects historical importance (Milligan 2013). Nonetheless, when writing MLLR, OCR stood out as the one ML technology well integrated into many library digitization workflows and deployed as metadata in digital library systems.

OCR would not be so transformative, however, without the most pervasive and thus, ironically, most invisible innovation in information retrieval, keyword search, which has radically altered scholars', students', and the general public's primary mode of interaction with information of many kinds. As Ted Underwood has written, "Algorithmic mining of large electronic databases has been quietly central to the humanities for two decades," despite many humanists' resistance to algorithmic methods, but has been normalized, and its probabilistic nature hidden, under the broad term "search" (Underwood 2014, 64). From library catalogs to digital archives and collections to Google itself, we would be justified in naming keyword search as the dominant mode of information retrieval in 2023, though the precise underpinnings of different search engines can be opaque to—or even proprietary and thus secret from—both users and information professionals. As the cliché of "doing my own research" has aptly illustrated in the age of the COVID-19 pandemic, for good or ill Smith's vision of user-driven, immediate, random-access information retrieval has become the norm.

Not all search engines, particularly those deployed in library systems, make use of AI or ML techniques, though more sophisticated search deployed by companies such as Google certainly does. Some search engines reference indices of manually created metadata, such as can be found in library catalogs. Nevertheless, even simple search over an OCR collection is enabled by ML, as that metadata would not exist without an ML process. I reference search here because it and OCR represent significant shifts from discovery systems almost entirely founded on expert metadata, such as library catalogs or finding aids, to hybrid systems that blend expert metadata with algorithmically derived metadata, such as OCR text. We might even point to search as a technology that fulfills the second item in Smith's list, as a list of search results offers "alternative forms of representation" for library collections, distinct from the order one might find by call number, the subject headings and cross references one might find in a library catalog, or the groupings one might expect in a finding aid. A list of keyword results, banal as it might seem in 2022, is a dynamic shuffling of the order of materials, based on a user inquiry. Moreover, contemporary hybrid systems often blend algorithmic methods of browsing, such as keyword search, with modes of filtering search results based on expert metadata, such as format or genre tags, dates, and so forth. In MLLR, I point to these normalized technologies as models for how we might reconsider the unrealized facets of ML/AI for library collections and activities.
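To make the hybrid model concrete, the blend of algorithmic retrieval (keyword matching against ML-derived OCR text) and filtering by expert metadata described above can be sketched in a few lines. The records, field names, and naive scoring below are hypothetical illustrations, not any actual catalog's schema:

```python
# A minimal sketch of hybrid discovery: keyword search over
# ML-derived OCR text, filtered by expert-created metadata.
# Records, fields, and scoring are illustrative only.

records = [
    {"title": "The Liberator", "genre": "newspaper", "year": 1831,
     "ocr_text": "no union with slaveholders liberty for all"},
    {"title": "Godey's Lady's Book", "genre": "magazine", "year": 1850,
     "ocr_text": "fashion plates and domestic advice for ladies"},
    {"title": "The North Star", "genre": "newspaper", "year": 1847,
     "ocr_text": "right is of no sex truth is of no color liberty"},
]

def hybrid_search(query, genre=None, year_range=None):
    """Blend algorithmic retrieval (keyword matching on OCR text)
    with filtering on expert metadata (genre tags, dates)."""
    terms = query.lower().split()
    results = []
    for rec in records:
        # expert-metadata filters narrow the candidate set first
        if genre and rec["genre"] != genre:
            continue
        if year_range and not (year_range[0] <= rec["year"] <= year_range[1]):
            continue
        # naive relevance: count of query terms in the OCR text
        score = sum(rec["ocr_text"].count(t) for t in terms)
        if score > 0:
            results.append((score, rec["title"]))
    return [title for score, title in sorted(results, reverse=True)]

print(hybrid_search("liberty", genre="newspaper"))  # both newspapers match
```

The point of the sketch is the division of labor: the filters consult hand-assigned metadata, while the ranking depends entirely on text that exists only because an ML process (OCR) produced it.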

ML Opportunities

In MLLR and similar reports, such as Thomas Padilla's "Responsible Operations: Data Science, Machine Learning, and AI in Libraries" (2019) and Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, and Chulwoo Pack's "Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project" (2020), we can identify both practical and theoretical obstacles to integrating ML into library systems. Technical and human limitations—everything from sufficient computing power to employee training and support—present significant hurdles that MLLR attempts to delineate and suggest paths forward to address. In this article, however, I focus on the theoretical objections to ingesting probabilistic data derived from a ML process into library systems largely comprising expert-created data. I ask in MLLR,

How can the reliability of ML data and metadata be assessed, and how can probabilistic information be integrated with human-created information, or integrated into systems designed around hand-assigned categories, tags, summaries, and so forth? To phrase this central question in another way, the ML and libraries field must develop means to bridge a world that prioritizes expert data and metadata, created slowly, and a set of methods that generate useful but flawed data and metadata, more quickly and at a larger scale.

As the examples of OCR and keyword search above indicate, such a bridge has already been constructed for at least two key discovery mechanisms. While the limitations of OCR and search are well known, their affordances for identifying information of interest within large-scale text collections have led to widespread adoption even as research continues seeking, for example, to improve OCR performance for historical and multilingual materials (Smith and Cordell 2018). While cautions around ML-derived data and metadata are well warranted, we need mechanisms to help libraries assess whether the affordances of particular ML processes for access and discovery merit infrastructural development. Fairly assessing these affordances will also require cultural shifts, similar to what has already happened for OCR and search, to help library professionals, researchers, students, and the public distinguish and evaluate data of human and algorithmic provenance.
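One way to preserve that distinction between human and algorithmic provenance at the data level is to record it explicitly in catalog records. The following sketch shows one hypothetical approach; the schema, field names, and confidence threshold are assumptions for illustration, not a proposal for any particular system:

```python
# A sketch of a catalog record carrying both expert and ML-derived
# metadata without flattening the distinction between them.
# The schema and confidence values are hypothetical.

def merge_metadata(expert, algorithmic, min_confidence=0.5):
    """Combine hand-assigned and ML-derived fields into one record,
    tagging each value with its provenance so that interfaces can
    display, and users can evaluate, the difference."""
    merged = {}
    for field, value in expert.items():
        merged[field] = {"value": value, "provenance": "expert"}
    for field, (value, conf) in algorithmic.items():
        # never let probabilistic data silently overwrite expert data,
        # and exclude low-confidence predictions entirely
        if field in merged or conf < min_confidence:
            continue
        merged[field] = {"value": value, "provenance": "ml", "confidence": conf}
    return merged

expert = {"title": "Harper's Weekly", "date": "1862-05-10"}
algorithmic = {
    "subjects": (["Civil War", "illustration"], 0.82),
    "language": ("French", 0.31),  # below threshold, so excluded
}

record = merge_metadata(expert, algorithmic)
```

Because every value retains a provenance tag, a downstream interface could render expert and algorithmic fields differently rather than presenting them as equivalent.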

For the broader constellation of ML technologies, I argue in MLLR and here that a generative bridge between these cultures of expert and algorithmic data can be constructed through a key library competency: public pedagogy. As central nodes between the research, educational, and public sectors, libraries could play an integral role in helping the wider ML field develop, test, and deploy explainable ML systems. As Michael Ridley notes, "It is concerning that these [ML] innovations are happening outside the field of academic librarianship and with little or no involvement of library expertise" (2019, 38). In another piece, Ridley and Danica Pawlick-Potts argue that

The need for algorithmic literacy arises from two key and equally important perspectives, both of which essentially focus on power: control and empowerment. Algorithms, especially those using machine learning and deep learning, are complex, opaque, invisible, shielded by intellectual property protection, and most importantly, consequential in the everyday lives of people.

(2021, 4)

They too suggest that libraries can begin "addressing their role in relation to AI and algorithmic literacy" through both pedagogical programs and explainable AI systems (5, 7).

That latter facet of literacy, focused on building explainable AI systems, is where I would identify the most generative overlap between ML researchers and libraries. Specifically, we might look toward human-computer interaction, as expressed through digital library interface design, as a site for explainable AI that offers new paths into cultural heritage collections while educating users about the research methods that generated those pathways. Where computer scientists bring expertise in ML methods, libraries understand information literacy and can help construct interfaces that communicate ML data to a wide range of prospective users. As Andrea Gasparini and Heli Kautonen note in their review of AI in research libraries, design has not been central to most library AI strategies:

The perspective from which design has been viewed is rather narrow and limited to solving problems of user interface design (human-AI interaction). There have been some experiments using design methods, such as blueprints and customer journeys, to map out tension points among stakeholders, but none of these experiments seems to extend this mapping to the special characteristics or needs of AI.

(2022, 19)

Rather than thinking of design as a late-stage process, however, we might instead prioritize the creation of what Mitchell Whitelaw calls "generous interfaces" that "invite exploration and support browsing" rather than shrinking vast digital resources into a search bar. Whitelaw calls for "multiple, fragmentary representations to reveal the complexity and diversity of cultural collections, and to privilege the process of interpretation." Whitelaw highlights generous interfaces he helped design that allow users, for example, to browse a collection of Australian periodicals through a mosaic view that can be dynamically shifted between chronological presentation and other algorithmically generated views, such as one that organizes the collection based on the similarity of color palettes in the magazines' front pages (Whitelaw 2015). Importantly, generous interfaces do not require the sheen we might associate with commercial software, which would hide the very qualities of complexity and diversity generous interfaces should strive to make apparent.

While AI and ML are often discussed in terms of technical feasibility and implementation, it is equally important to consider them in terms of imagination, justice, and equity. As Bethany Nowviskie asks, "Are we designing libraries that activate imaginations—both their users' imaginations and those of the expert practitioners who craft and maintain them?" For Nowviskie, the "speculative collections" of the future will require radically reimagined intellectual frameworks:

Are we designing libraries emancipated from . . . an externally-imposed, linear and fatalistic conception of time? Are we at least designing libraries that dare to try, despite the fundamental paradox of the Anthropocene era we live in—which asks us to hold unpredictability and planetary-scale inevitability simultaneously in mind? How can we design digital libraries that admit alternate futures—that recognize that people require the freedom to construct their own, independent philosophical infrastructure, to escape time's arrow and subvert, if they wish, the unidirectional and neoliberal temporal constructs that have so often been tools of injustice?

(2016)

Generous or speculative library interfaces should not seek to use ML/AI to simply reconfigure the technical ordering of materials, in other words—which is only one "alternative form . . . of representation" (Smith 1976, 201), to return to Smith. Instead, they might prioritize methods for reconfiguring, and helping users understand, the biases that underlie those collections and their ordering.

In MLLR, I found the most effective projects for explaining ML and AI were often those with creative and activist outputs that seek to interrogate, or expose by example, ML processes that might otherwise seem opaque to users. Libraries might look to the work of Joy Buolamwini and the Algorithmic Justice League, such as the Coded Gaze project, which demonstrates how wearing a white mask makes Black people more visible to facial recognition software and thus explains by example the "exclusion from seemingly neutral machines programmed with algorithms" for people of color (Buolamwini 2016). I also highlight Kate Crawford and Trevor Paglen's ImageNet Roulette, which asked users to upload a picture to be tagged using tags from ImageNet, a "gold standard" training data set for ML image classification (Crawford and Paglen 2019). As I write in MLLR, by serving users a mixture of funny, strange, and offensive classifications, ImageNet Roulette cultivates broader public literacy about the actual contours of ML training data, an area commentators about ML or AI often gesture toward but rarely engage in specific detail.

I look also to more playful projects like Janelle Shane's fantastic AI Weirdness blog and book, which document "experiments [that] have included computer programs that try to invent human things like recipes, paint colors, cat names, and candy heart messages" (2020). Projects like these are as much about interrogating the possibilities and limitations of their underlying technologies as about the aesthetic objects created, but as such these experiments are intriguing models of explainable ML, in which readers (or hearers or users) come to better understand how ML works through engagement with the sometimes strange, sometimes delightful, and sometimes unsettling artifacts it produces. We need speculative library interfaces that forefront their createdness and ask patrons to better understand both cultural heritage data and ML methods through conscious exploration of ML's enlightening and odd creations.

Through their interfaces, which forefront resistance rather than seeking to project surety or objectivity, projects like these take up calls by researchers such as Rediet Abebe for computing that does not "treat . . . problematic features of the status quo as fixed" but instead

can serve as a diagnostic, helping us to understand and measure social problems with precision and clarity. As a formalizer, computing shapes how social problems are explicitly defined—changing how those problems, and possible responses to them, are understood. Computing serves as rebuttal when it illuminates the boundaries of what is possible through technical means. And computing acts as synecdoche when it makes long-standing social problems newly salient in the public eye.

These four modalities, I suggest, intersect with Whitelaw's and Nowviskie's calls for "generous interfaces" and "speculative collections" to suggest alternative pathways for libraries seeking to undertake ML research in ways that forefront algorithmic justice and public literacy. Library ML projects could serve as diagnostics to the biases and gaps in existing digitized or born-digital collections, as rebuttals to claims—whether from scholars or from Silicon Valley—about ML objectivity, or as a synecdoche that helps patrons, scholars, or students better understand the historical stakes of library collections and archives.

Generous interfaces create new perspectives on the library's primary materials, useful for scholarly or disciplinary pedagogical aims, but they also provide opportunities for patrons to explore, or even to poke, prod, experiment, and cocreate with the output of ML processes. Prioritizing such generosity, exploration, unpredictability, and technological explanation may require interfaces that prioritize transparency—even awkward transparency—over polish. Rather than seeking to project certainty in presenting ML-derived results, for example—or worse, to hide the ML processes beneath the interface—ML-integrated interfaces could instead explicitly report the confidence scores for relationships, annotations, or other metadata determined algorithmically, seek to make users more aware of the probabilistic basis of this data, and perhaps even seek user contributions to improve the ML process in future iterations. We might contrast this approach with the OCR underlying most large-scale digital archives, which hide the OCR data beneath the interface and provide little context about the specific OCR processes or the reliability of the data.
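A confidence-forward presentation of the kind described above might look something like the following sketch at the interface layer. The thresholds, hedging phrases, and example classifier outputs are illustrative assumptions, not established interface conventions:

```python
# A sketch of confidence-forward display for ML-derived tags,
# rather than presenting algorithmic labels as settled fact.
# Thresholds and phrasings are illustrative choices, not standards.

def describe_label(label, confidence):
    """Translate a raw classifier confidence into hedged,
    user-facing language that keeps the probabilistic basis visible."""
    if confidence >= 0.9:
        hedge = "very likely"
    elif confidence >= 0.7:
        hedge = "probably"
    elif confidence >= 0.5:
        hedge = "possibly"
    else:
        hedge = "uncertain guess:"
    return f"{hedge} {label} (model confidence {confidence:.0%})"

# hypothetical classifier output for a digitized photograph
predictions = [("portrait", 0.94), ("outdoor scene", 0.62), ("lithograph", 0.35)]
for label, conf in predictions:
    print(describe_label(label, conf))
```

Even this trivial translation layer changes the rhetorical posture of the interface: the same underlying numbers are shown, but framed as claims to be evaluated rather than facts to be accepted.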

This pedagogical reorientation toward explainability does not immediately solve concerns about the biases or reliability of ML/AI processes or the data created through them. However, it does help bridge cultures of expert- versus algorithmically-created data—not by flattening the distinction between them but instead by highlighting their differences and making plain the iterative, dialogic loop between them in research. The aim of explicit reporting of ML uncertainties is not simply to cast doubt—though skepticism is healthy in this domain—but to make presentations of ML data opportunities for cultivating literacy for ML and probabilistic methods. Helping users understand the confidence rating behind a particular label or category helps contextualize any claims they might make from those data. A sense of ML's limitations could, perhaps counterintuitively, serve to increase overall confidence in ML because its claims will be understood as contextual and relational rather than totalizing. Through such explanatory interfaces, libraries can help meet their goals of cultivating information literacy among patrons. In fact, I would suggest that, building on their existing roles as hubs of intellectual exchange, libraries could become focal sites for the translation of ML/AI research to the public and leaders in discussions around explainable AI in research communities.

Ryan Cordell

Ryan Cordell is associate professor in the School of Information Sciences and Department of English at the University of Illinois Urbana-Champaign, as well as affiliated faculty in Informatics. Previously he was associate professor of English and a core founding faculty member in the NULab for Texts, Maps, and Networks at Northeastern University. His scholarship seeks to illuminate how technologies of production, reception, circulation, and remediation shape the sociology of texts. He primarily studies circulation and reprinting in nineteenth-century American newspapers, but his interests extend to the influence of computation and digitization on contemporary reading, writing, and research. He collaborates with colleagues in English, history, and computer science on the Viral Texts project, which uses robust data mining tools to discover borrowed texts across large-scale archives of nineteenth-century periodicals. He is also a senior fellow in the Andrew W. Mellon Society of Critical Bibliography at the Rare Book School.

Notes

1. The terms "machine learning" (ML) and "artificial intelligence" (AI), while closely related, are not synonymous. AI is the broader term, referring to a wide range of research, from machine learning to robotics. Though there are library applications for AI more broadly, my LOC report (Cordell 2020) and this article alike focus on ML, which is the domain of most research using library collections in 2022. I cite some literature that employs the broader term, AI, when its insights apply to my argument about ML. In the context of this article, then, one can read AI and ML as rough synonyms, even if this is not always true outside the context of this article.

References

Abebe, Rediet, Solon Barocas, Jon Kleinberg, Karen Levy, Manish Raghavan, and David G. Robinson. 2020. "Roles for Computing in Social Change." In FAT* '20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 252–60. New York: ACM. https://doi.org/10.1145/3351095.3372871.
Buolamwini, Joy. 2016. "The Algorithmic Justice League: Unmasking Bias." December 14, 2016. https://medium.com/mit-media-lab/the-algorithmic-justice-league-3cc4131c5148.
Cordell, Ryan. 2017. "'Q i-jtb the Raven': Taking Dirty OCR Seriously." Book History 20:188–225. https://doi.org/10.1353/bh.2017.0006.
———. 2020. "Machine Learning + Libraries: A Report on the State of the Field." Washington, DC: Library of Congress, July 14, 2020. https://labs.loc.gov/static/labs/work/reports/Cordell-LOC-ML-report.pdf.
Crawford, Kate, and Trevor Paglen. 2019. "Excavating AI: The Politics of Training Sets for Machine Learning." September 19, 2019. https://www.excavating.ai.
Gasparini, Andrea, and Heli Kautonen. 2022. "Understanding Artificial Intelligence in Research Libraries: An Extensive Literature Review." LIBER Quarterly 32 (1): 1–36. https://doi.org/10.53377/lq.10934.
Jaillant, Lise. 2022. "Introduction." In Archives, Access and Artificial Intelligence: Working with Born-Digital and Digitized Archival Collections, edited by Lise Jaillant, 7–28. Bielefeld, Germany: Bielefeld University Press. https://doi.org/10.14361/9783839455845-001.
Jo, Eun Seo, and Timnit Gebru. 2020. "Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning." In FAT* '20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 306–16. New York: ACM. https://doi.org/10.1145/3351095.3372829.
Lorang, Elizabeth, Leen-Kiat Soh, Yi Liu, and Chulwoo Pack. 2020. "Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project." January 10, 2020. https://digitalcommons.unl.edu/libraryscience/396/.
Milligan, Ian. 2013. "Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010." Canadian Historical Review 94 (4): 540–69. https://doi.org/10.3138/chr.694.
Nowviskie, Bethany. 2016. "Speculative Collections." Bethany Nowviskie, October 27, 2016. http://nowviskie.org/2016/speculative-collections/.
Padilla, Thomas. 2019. "Responsible Operations: Data Science, Machine Learning, and AI in Libraries." OCLC Research Position Paper. Dublin, OH: OCLC Research. https://www.oclc.org/content/dam/research/publications/2019/oclcresearch-responsible-operations-data-science-machine-learning-ai.pdf.
Ridley, Michael. 2019. "Explainable Artificial Intelligence." Research Library Issues, no. 299:28–46. https://doi.org/10.29242/rli.299.3.
Ridley, Michael, and Danica Pawlick-Potts. 2021. "Algorithmic Literacy and the Role for Libraries." Information Technology and Libraries 40 (2): 1–15. https://doi.org/10.6017/ital.v40i2.12963.
Shane, Janelle. 2020. "AI Weirdness: The Strange Side of Machine Learning." https://aiweirdness.com.
Smith, David A., and Ryan Cordell. 2018. "A Research Agenda for Historical and Multilingual Optical Character Recognition." https://repository.library.northeastern.edu/files/neu:f1881m035.
Smith, Linda C. 1976. "Artificial Intelligence in Information Retrieval Systems." Information Processing & Management 12 (3): 189–222. https://doi.org/10.1016/0306-4573(76)90005-4.
Underwood, Ted. 2014. "Theorizing Research Practices We Forgot to Theorize Twenty Years Ago." Representations 127 (1): 64–72. https://doi.org/10.1525/rep.2014.127.1.64.
Whitelaw, Mitchell. 2015. "Generous Interfaces for Digital Cultural Collections." Digital Humanities Quarterly 9 (1). http://www.digitalhumanities.org/dhq/vol/9/1/000205/000205.html.
