
In February 2016 the Academy of Motion Picture Arts and Sciences came under intense scrutiny because of its failure to nominate any actors of color in any of the top four award categories. The controversy, which became known as #OscarsSoWhite, was a topic of conversation for students at the University of California, Los Angeles, too, since we tend to be very aware of our proximity to Hollywood.1 Students in the Digital Humanities program wondered whether a digital project might offer a productive intervention in the conversation. It struck us that most of the conversations about race and Hollywood framed the Academy's failure to acknowledge actors of color as a kind of inability to see the vibrant work being done by actors and other film workers of color. But we knew that the story was actually longer and more disturbing than that: a vibrant, viable, discrete, and artistically innovative community of Black filmmakers had existed for about fifty years in the United States. Its economic annihilation, and its subsequent expulsion from popular memory, were the result not of Hollywood's failure to see filmmakers of color but of a deliberate campaign of suppression by Hollywood studios.2

We were aware of this sphere of activity, often called the "race film industry," because the University of California, Los Angeles, holds in its Library Special Collections an extraordinarily valuable set of documents called the George P. Johnson Negro Film Collection.3 Johnson was a writer, producer, and distributor for the Lincoln Motion Picture Company, one of the most active and influential production companies in the race film industry. He documented his own work as well as the work of other film companies, films, and actors from 1916 until his death in 1977.4 When a group of students and faculty gathered in spring 2016 to formulate a digital humanities capstone project, we knew that we wanted to use the Johnson collection to expose for a wider audience the films and artistic contributions of this group of actors and filmmakers.

In developing Early African American Film: Reconstructing the History of Silent Race Films, 1909–1930, a database about early American race film, we, a research team composed of undergraduates, a graduate student, and a faculty member, wrestled with a host of problems. Some are familiar to any scholar who grapples with questions of race, and some are particular to the problem of translating complex, multifaceted historical identities into "data."

Anyone who has led discussions about race with undergraduate students will be familiar with the line of questioning: if race is, as the professor claims, a social invention, then why are we reifying it as a category by designating such areas as African American history or a Black studies department? The answer, of course, is that discursive categories such as race, whether or not they are based in fact, have wide-ranging, deeply experienced, and historically critical implications. But our students are right to wonder how a historian can acknowledge the diversity and variation of human experience while being explicit about how these experiences are affected by structures of power such as race, gender, nationality, or ability.

The scholar's dilemma, that of honoring individual experience on the one hand and acknowledging structure on the other, is magnified when working with data, which often tolerate less nuance than one might wish. One can write about race film, a term historians have applied to films made for African American audiences in the twentieth century, with a great deal of subtlety, acknowledging that a film can simultaneously be and not be a race film: not all race films, for example, were made by Black people, or seen by Black audiences, or circulated in the Black theatrical circuit.5 Not all Black actors or producers were Black throughout their careers, or necessarily Black in the same way, since colorism was rife within the race film industry. But a data set must either include a film or person within its ambit or exclude it or them; there is no room for half-measures. Here we describe the ways in which we both struggled with and embraced the strictures of categorization as we developed the first publicly available database of silent race film. In the following section, we discuss the details of the project and its specifications. In the final section, we discuss how we dealt with the theoretical and methodological challenges this project presented and what they contribute to American studies scholarship and practice.

Project Description and Details

The aim of our project, built in the spring of 2016 under the aegis of the Digital Humanities program at the University of California, Los Angeles, was to construct for the first time a comprehensive database of the early African American race film industry in the silent era (between 1909 and 1930). With the database, we were able to expose for multiple audiences the complex history of silent films made for and by African Americans, and the community of practice that developed around the industry in the first three decades of the twentieth century. These films, and those who made and viewed them, are underrecognized, despite their significance to film and media histories, African American studies, and American studies.

The George P. Johnson Negro Film Collection, an archival collection held in the UCLA Library Special Collections at the Charles E. Young Research Library, was our starting point in developing this project. But shortly after beginning our archival research, we discovered that while the materials in the Johnson Collection are an invaluable resource, they also reflect a significant number of omissions and errors. As a result, we expanded the scope of our research to consider additional archival records, including African American newspapers from the period such as the Chicago Defender. We also drew on scholarly books and articles in film studies and African American studies, including filmographies such as Larry Richards's African American Films through 1959: A Comprehensive, Illustrated Filmography and Henry Sampson's Blacks in Black and White: A Source Book on Black Films.6

We decided to build a relational database, a well-established form that arranges data in multiple, linked tables. This arrangement separates data into discrete tables and allows users to make complex queries without a great deal of technical knowledge. A user can manage and query the data at a fairly specific level, for example, by searching for "all films produced by companies based in Missouri before 1920." We developed the database using Airtable, user-friendly, web-based software for the creation of versioned, collaborative databases. The database contains a series of linked tables that provide details on different facets of the race film industry. The first table, "Film," contains information about each race film produced in this period that the research team was able to identify and verify. The second table, "People," contains all the names, roles, and biographical details we were able to obtain about the actors and other film personnel. "Companies," the third table, consists of entries on the production companies involved in the race filmmaking industry prior to 1930. The final table, "Sources," charts the sources, both secondary and archival, used to obtain information for the database. As of December 2017, the database comprises the 303 silent race films we uncovered in our research, linked to 759 actors and personnel, 176 race film companies, and a wide range of archival materials and scholarly works on Black film and filmmaking. Each record is supported by all the descriptive and archival information we were able to discover.
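To make the idea of linked tables concrete, the following is a minimal sketch of how the example query quoted above might be expressed against a relational structure. It uses Python's built-in sqlite3 module rather than Airtable, and the table names, column names, and rows are hypothetical placeholders invented for illustration; they are not the project's actual fields or data. It also assumes, for simplicity, that each film is linked to a single company.

```python
import sqlite3

# Hypothetical, simplified schema: table and column names are illustrative
# assumptions, not the project's actual field names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        state TEXT
    );
    CREATE TABLE films (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        year INTEGER,
        company_id INTEGER REFERENCES companies(id)
    );
""")

# Placeholder rows, invented purely to exercise the query.
conn.executemany("INSERT INTO companies VALUES (?, ?, ?)", [
    (1, "Example Company A", "Missouri"),
    (2, "Example Company B", "Illinois"),
])
conn.executemany("INSERT INTO films VALUES (?, ?, ?, ?)", [
    (1, "Placeholder Film One", 1917, 1),
    (2, "Placeholder Film Two", 1922, 1),
    (3, "Placeholder Film Three", 1915, 2),
])


def films_by_state_before(conn, state, year):
    """Return (title, year, company) for films whose company is based in
    `state` and whose release year is earlier than `year`."""
    rows = conn.execute(
        """
        SELECT f.title, f.year, c.name
        FROM films AS f
        JOIN companies AS c ON c.id = f.company_id
        WHERE c.state = ? AND f.year < ?
        ORDER BY f.year, f.title
        """,
        (state, year),
    )
    return rows.fetchall()


result = films_by_state_before(conn, "Missouri", 1920)
print(result)
```

The join is what the linked tables make possible: the film record carries no state of its own, so the question can only be answered by following the link from each film to its company.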

The project we built is not only the first publicly available database on early race film; it also offers users a number of possibilities for engagement, augmentation, and reuse of the data. The project exists as a browsable database on Airtable. The tables are also downloadable as raw comma-separated values (CSV) files posted on the code-sharing site GitHub. The latter option gives others the opportunity to use, build on, and correct the data based on their own research or on evolving scholarly knowledge in these areas. In publishing the data set online, we worked to comply with current best practices in curating and publishing humanities data. These practices included documenting the decisions made while assembling and fine-tuning our data, and making the data dictionary and other explanatory materials available within the data package; storing the data set in a relatively durable repository; obtaining a stable identifier for the data set; attaching a license to the data set; and making some provision for versioning and citation. After considering several options, we chose a solution that is both lightweight and consistent with current thinking about data sustainability. Once the data set was complete, we created a repository for it on GitHub, and then linked that repository to the data repository Zenodo.7

Theoretical and Practical Challenges and Opportunities

As we discuss below, the transformation of complex phenomena into structured data involves difficult decisions and, at times, the loss of information. But structured data can also offer great benefits, because of all the ways they can be used. The project's website features maps and visualizations created by the research team. These visualizations are designed to showcase the diversity of ways the data set might be used. For example, we include a map drawing on the locations of all production companies for which we could find that information. Each data visualization we have published is accompanied by the specific data used to create it and by a set of instructions on how to create visualizations similar in form and content. We also included on the website a set of beginner-friendly tutorials to aid less-experienced users' own work with the data set. The materials accompanying our maps, for example, cover the basics of mapping using our data set and refer users to further resources on mapping and other data visualization practices.
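A map of production companies of the kind described above typically begins with a tabulation step: counting records by place and setting aside those with no known location. The following is a minimal sketch of that step, using only Python's standard library. The CSV column names (`name`, `city`, `state`) and the rows are hypothetical placeholders, not the project's actual export, and the sketch stops short of geocoding or drawing the map itself.

```python
import csv
import io
from collections import Counter

# Stand-in for a downloaded "Companies" CSV. The column names and rows are
# hypothetical placeholders, not the project's actual data.
companies_csv = io.StringIO(
    "name,city,state\n"
    "Example Company A,Kansas City,Missouri\n"
    "Example Company B,Chicago,Illinois\n"
    "Example Company C,Chicago,Illinois\n"
    "Example Company D,,\n"
)


def count_by_location(csv_file):
    """Count companies per (city, state) and report how many rows were
    skipped because no location is recorded."""
    counts = Counter()
    skipped = 0
    for row in csv.DictReader(csv_file):
        city, state = row["city"].strip(), row["state"].strip()
        if city and state:
            counts[(city, state)] += 1
        else:
            skipped += 1
    return counts, skipped


counts, skipped = count_by_location(companies_csv)
for (city, state), n in counts.most_common():
    print(f"{city}, {state}: {n}")
print(f"rows without a location: {skipped}")
```

Reporting the number of skipped rows alongside the counts keeps visible the gap between the companies that can be mapped and the companies that are known to have existed.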

While we faced a number of technological hurdles in building the database, our most vexing problem was how to determine which people and films would be included. Scholars and popular writers working on these and related films offer various definitions of a "race film," but we found that each definition proposed had meaningful exceptions. One could not say, for example, that race films were always made by Black people, since a number of White-owned companies specialized in films for Black audiences. Nor could one say that only Black actors appeared in race films, for many widely acknowledged race films included both Black and White actors. The definitions that we found most generative defined a race film in terms of its intended audience—but, given the paucity of available production documents, how could we determine this in every case? Our solution, which we arrived at through a great deal of discussion, was to include in our database those films that were discussed as race films at the time. If race, as Stuart Hall tells us, is a discursive category, then it made sense to define our category in terms of the way in which it was discussed. We thus cast as wide a net as possible, gathering film titles and production information from numerous primary and secondary sources. We then winnowed the list down and preserved the "discarded data" as part of the race film data set, so that interested researchers can investigate our decisions for themselves.8
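The winnowing step described above, in which excluded records are kept rather than deleted, can be sketched in a few lines. This is an illustration under stated assumptions, not the team's actual workflow: the `Candidate` record, its `discussed_as_race_film` flag, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    title: str
    source: str
    # Hypothetical flag: was the film discussed as a race film at the time?
    discussed_as_race_film: bool


def winnow(candidates):
    """Split candidates into included and discarded lists.

    Nothing is dropped: every candidate ends up in exactly one list, so the
    discarded records remain available for others to re-examine.
    """
    included = [c for c in candidates if c.discussed_as_race_film]
    discarded = [c for c in candidates if not c.discussed_as_race_film]
    return included, discarded


# Placeholder entries, invented for illustration only.
candidates = [
    Candidate("Placeholder Title One", "period newspaper (placeholder)", True),
    Candidate("Placeholder Title Two", "later filmography (placeholder)", False),
    Candidate("Placeholder Title Three", "period newspaper (placeholder)", True),
]

included, discarded = winnow(candidates)
print(len(included), "included;", len(discarded), "preserved as discarded")
```

Reducing the criterion to a single true-or-false flag is itself the kind of simplification the essay describes; in practice that judgment would rest on the sources recorded for each film.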

The database, then, is an example of theory-engaged practice. To build the database with integrity and accuracy, all project participants needed to have a shared understanding of how race works: not only the way it structures lives and livelihoods, but also the way it fails to fully describe the entirety of any individual's lived experience. Students decided, for example, that while the entire data set helped us understand Black history, "race" itself could not be a column in our "People" table, since individual people could and did change races during their lifetimes, a fluidity that exceeded our own ability to capture the data properly. In our mode of collaboration, too, the team relied on pedagogical principles drawn from feminist and antiracist traditions: we resolved to honor every voice in the team, to value the process of collaboration over the final product, to acknowledge each other's labor continually, and to respect the diversity of viewpoints represented in our varied group.9 When considering whether a digital humanities project is an "American studies project," the most obvious question to ask might be whether its subject matter is consistent with the established domains of American studies scholarship. As our project demonstrates, however, the values represented by American studies as a field can be much more deeply embedded in a project than simply dictating its subject matter. To be an "American studies digital humanities project," a work must honor the critical praxis of American studies in its design, mode of collaboration, data ontology, and accessibility to new audiences.

Miriam Posner

Miriam Posner is assistant professor at the UCLA School of Information. She is also a digital humanist with interests in labor, race, feminism, and the history and philosophy of data. As a digital humanist, she is particularly interested in the visualization of large bodies of data from cultural heritage institutions, and the application of digital methods to the analysis of images and video. A film, media, and American studies scholar by training, she frequently writes on the application of digital methods to the humanities. She is at work on two projects: the first on what "data" might mean for humanistic research; and the second on how multinational corporations are making use of data in their supply chains.

Marika Cifor

Marika Cifor is assistant professor of information science in the School of Informatics, Computing and Engineering at Indiana University, Bloomington. She holds a doctorate in information studies as well as graduate certificates in gender studies and the digital humanities from the University of California, Los Angeles. Her research focuses on archives, cultural and critical theories, gender and sexuality, and digital cultures.


For their valuable contributions, we would like to thank Shanya Norman, William Lam, Hanna Girma, Karla Contreras, Monica Berry, Aya Grace Yoshioka, Allyson Field, Brian Graney, Thomas Padilla, Charles Musser, Cara Caddoo, Peggy Alexander, and Jan-Christopher Horak.

1. On the controversy, see, e.g., "Oscar Nominees Discuss Diversity in Hollywood amid the #OscarsSoWhite Backlash," Los Angeles Times, February 25, 2016.

2. On these campaigns of suppression, see Clyde R. Taylor, "Black Silence and the Politics of Representation," in Oscar Micheaux and His Circle: African-American Filmmaking and Race Cinema of the Silent Era, ed. Charles Musser and Pearl Bowser (Bloomington: Indiana University Press, 2001), 4–6; and Cedric Robinson, Forgeries of Memory and Meaning: Blacks and the Regimes of Race in American Theater and Film before World War II (Chapel Hill: University of North Carolina Press, 2012).

3. "George P. Johnson Negro Film Collection, 1916–1977, LSC.1042," University of California, Los Angeles, Library Special Collections, accessed April 4, 2018.

4. George P. Johnson, Elizabeth Dixon, and Adelaide Tusler, "Collector of Negro Film History," tape, July 11, 1967.

5. See, e.g., the discussion of Ebony Films in Jacqueline Stewart, Migrating to the Movies: Cinema and Black Urban Modernity (Berkeley: University of California Press, 2005), 189–218.

6. Larry Richards, African American Films through 1959: A Comprehensive, Illustrated Filmography (Jefferson, NC: McFarland, 1998); Henry T. Sampson, Blacks in Black and White: A Source Book on Black Films, 2nd ed. (Metuchen, NJ: Scarecrow, 1995).

7. Miriam Posner, Monica Berry, Marika Cifor, Karla Contreras, Hanna Girma, William Lam, and Aya Yoshioka, "Race Film Database," Zenodo, October 13, 2016.

8. For more on these technical issues, see Miriam Posner and Marika Cifor, "Tracing a Community of Practice: A Database of Early African-American Race Film," Moving Image 17.2 (2018): 101–5.

9. On feminist and antiracist pedagogy, see, e.g., Lynne M. Webb, Myria W. Allen, and Kandi L. Walker, "Feminist Pedagogy: Identifying Basic Principles," Academic Exchange Quarterly 6.1 (2002): 67–72; and Alda M. Blakeney, "Antiracist Pedagogy: Definition, Theory, and Professional Development," Journal of Curriculum and Pedagogy 2.1 (September 23, 2011): 119–32.
