Back to the Sources: Practicing and Teaching Quantitative History in the 2020s
This article elaborates on our experience of teaching quantitative methods to historians and writing an introductory book on this topic. We promote respect for principles of source criticism as the cornerstone of the constitution of data from historical sources, and argue that a conversation on this constitution is as important for new historians of capitalism as it is for economic historians and business historians, among others. The first part of the article explains what led us to promote constructivist, small-scale, experimental quantitative history. In terms of teaching, this choice translates into a learning-by-doing approach focused on the construction and categorization of data from sources. The article then presents practical methods of teaching and research, borrowing examples from economic history and beyond, as well as from the history of capitalism. The second part also addresses the transformation of sources into quantifiable data, while the third part discusses data categorization and analysis.
quantitative history, cliometrics, historical sources, data, teaching
This article elaborates on our joint experience of teaching quantitative methods, mostly to historians, beginning in the early 2000s, and of writing an introductory book on this topic, first in French and then in a revised and expanded English edition.1 In what follows, we pursue two aims, each related to a different audience—as we have done in the book and in our teaching.2
First, we aim to make quantitative methods accessible for all historians—and humanists in general, especially those who do not think that such methods are for them, either because they do not enjoy mathematics or because they study topics that are not traditionally considered suited to quantification. We sometimes note that we wish to make quantitative history banal, in the sense that, for example, we hope it could be published in mainstream historical journals without being singled out as quantitative. We also think that any historian who would like to embed quantitative history in their methods courses should be able to do so after having read a book like ours, along with some empirical papers using these methods, and after having practiced a bit on their own sources. For us, this introduction would ideally be part of any history curriculum—not an option for the sole use of students interested in economic (or demographic) questions, and not a course taught by nonhistorians. We have written this article primarily for historians—including those who are sometimes referred to as new historians of capitalism and business historians who do not consider themselves quantifiers. As their approach is in part a political and methodological reaction against mainstream cliometrics, some new historians of capitalism tend to associate any type of quantification or measurement with neoclassical economics.3 On the contrary, we hold that there are many different ways to quantify. The constitution of data as we teach it is conducive to a specific but very close reading of sources; as such, it can be one tool among others for fostering humanistic interpretation.
Second, we promote respect for the basic tenets of the historical profession, starting with the principles of source criticism, which we believe should be reflected in the constitution of data from historical sources. This second goal has become more and more central for us over the years—hence its presence in the title of this article. It is a message that we wish to convey to non-historians, as well as to those colleagues of ours who, because they are not practitioners of quantification, tend to equate it with automatically reading and interpreting a source using a computer, or with imposing simplistic categories onto the heterogeneity of primary sources. We know that few cliometricians in economics departments will think that they need advice from self-trained historians whose book mostly discusses examples taken from social, cultural, and political history, does not give formulas or address standard deviation, and dedicates many pages to sequence analysis and network analysis. We nevertheless hope that they will be interested in what we write about constituting data—not cleaning it—and about categorization, be it for their own research or because they would like to include these research stages in the training of their graduate students. Although our questions and our teaching methods differ from theirs, because we do not teach the same students, write for the same audiences, or publish in the same journals, we believe that many of them share our goal of using historical evidence in ways that are both innovative and true to its sources—and that we have original insights to share on these matters.
We are aware of the risk we face by seeking to borrow from both the critical humanities and the quantitative social sciences: we might end up convincing neither of the two audiences.4 In terms of careers, young researchers in both groups probably have no incentive to learn non-standard quantification methods; many journals and hiring committees value standard methods above all. We can only hope that tenured faculty members among our readers will try, however slowly, to change this pattern. We know that, in each of the two groups, some colleagues are already sufficiently fond of sources and experiments to listen to us.5 We also derive hope from our teaching experience with graduate students, which allowed us to bridge the gap between the majority of students, who tend to be traditional historians, and a minority who are social-science-oriented quantifiers. It is our teaching, not only our personal research experience, that led us to make our stance on quantification more consistent and explicit over the years—as we saw what worked and did not work with diverse sources and research questions. We have now been teaching quantitative methods for almost twenty years, often together, in different universities in Paris and occasionally elsewhere, too.6 And we convinced many students, some of whom are now colleagues, that the approaches they encountered in our seminars could be useful in their research agendas and professional development. This article sums up what we learned from this experience.
In what follows, we begin by explaining where we speak from. As practices of quantification differ among countries and sub-disciplines, we first say a few words about our own experience with quantitative history in the context of its recent evolution, and why we promote small-scale, constructivist, and experimental quantitative history.7 In terms of teaching, this choice translates into a learning-by-doing approach focused on the construction and categorization of data from sources. The second and third parts of the article elaborate on the implications of these general principles in terms of practical methods of teaching and research, using examples from economic history and beyond, as well as from the history of capitalism. The second part addresses the transformation of sources into quantifiable data; the third part discusses data categorization and analysis.
Teaching Quantitative History with Sources rather than with Equations
We came to quantitative methods more out of necessity than out of faith: our archival material came in quantities too large to be manageable using only close reading and word processing software. This exigency also explains how we teach and practice quantitative history. We teach the courses we wish we could have taken when we were graduate students in history, but we also realize that not everyone will be able to attend one of our courses. That is why our books are written in the style of self-help handbooks, so that any historian beginning a research project can use them to learn the basics on their own. This first section presents these general principles and positions them (and us) in the history of quantitative history. It begins with a clarification: which kind of quantitative history are we talking about?
We did not begin our research, in the late 1990s, thinking that we would become quantitative historians. Claire Zalc's master's thesis on German and Austrian migrants in France in the 1930s was entirely based on discursive sources. But for her doctoral dissertation, which focused on immigrant shopkeepers in the interwar period, her main source, the business register of the Seine département, was massive, with over 1 million inscriptions.8 Quantification was not an end in itself, but rather one tool among others, needed to sample this massive source and write a better history. Just before she began her Ph.D., Claire Lemercier learned the basics of Excel, such as mundane but crucial functions like "freeze panes" or "pivot table," from a staff member in a historical demography research center. She gathered and analyzed data on the institutional careers and personal ties of a few hundred Parisians in the nineteenth century, and this became just one part of her dissertation on the advisory role of the Chamber of Commerce. For her, too, this was not "quantitative history." Neither of us had any reason to identify with a label that was definitely old-fashioned: in the generation of our thesis advisers, many had learned how to quantify and some had followed through, but almost none thought that those methods still had purchase.
In our view, our story says a lot about the role of quantification in history in the early 2000s. We were lucky enough to know social scientists who shared tools with us, but it was not easy for us to find either method courses or history professors who were able to advise us; and for others, this proved nearly impossible. Not until 2001 did a handbook illustrating how to use Excel and Access for the purpose of advancing historical research appear in French.9 Before then, the main book on quantitative history available in French addressed the use of a calculator to compute moving averages or variance in aggregate economic or demographic data.10 Our sources did not offer aggregate data: what we needed were ways to produce numbers in order to make sense of long lists of names, occupations, and addresses. We mostly had to find these ways by ourselves, in texts written for other disciplines. This was not easy, yet perhaps we were lucky to come along at a moment when the war against the first wave of quantification had been won. Many historians still equated quantification with anachronism, thought that it led scholars to ignore agency, or just considered it boring, but indifference was more widespread than hostility.
The story of this first wave has been told elsewhere.11 There is no reason to recount it again here, but we wish to insist on the fact that it should not be forgotten. On the one hand, it produced excellent work that we assign to our students. On the other, it receded for important reasons that we need to keep in mind if we do not want to reproduce past mistakes. What we call "the first wave" here is the so-called new economic history (also known as cliometrics), new social history, and new political history of the 1960s. Two reasons for being disillusioned with all three approaches remain vital for us. First, some of their promoters dispensed with source criticism, imposing anachronistic analytical categories or treating numbers found in past sources as objective data. If historians are to quantify again, they need to make source criticism a routine element of their data construction. Second, as Lawrence Stone put it in 1979: "Squads of diligent assistants assemble data, encode it, program it, and pass it through the maw of the computer, all under the autocratic direction of a team-leader."12 As students who came after the first wave receded, that was not our experience. We learned quantification as an individual craft, working on our personal computers, and we came to appreciate this labor process in which the close reading of sources, the construction of the data, and quantitative analysis and interpretation of it all happened in the same head. We do not reject collaborative projects as such, of course, but we do not think that they should be equated, once again, with the subcontracting of source criticism to (poorly paid) research assistants.
What happened to the use of numbers and computers in historical research while we were making up our minds about these questions? In fact, it had not completely disappeared in the 1980s, and we developed our work in conversation with four different strands of research, which were mostly unconnected with one another. The texts that we share with our students come from all four research communities—as well as from the first wave and more idiosyncratic corners of academia.13
The first strand of research, cliometrics, is essentially divorced from history and tied to econometrics, micro-economics, and neo-institutionalist economics. Its mainstream approach in the 2010s involved multivariate regression as a tool aimed at disentangling causes from effects, whereas cliometrics in the 1960s often used quantification in a more descriptive way. This approach is also less concerned with the internal logics of the economic systems of specific periods in the past, and more focused on detecting the very long-term effects of particular causes. Moreover, definitions of causation in econometrics have become more and more stringent, leading many economic historians to focus on specific historical situations, deemed "natural experiments," where an important event occurred in one country or region, but not in another, otherwise similar one.14 As cliometrics is almost only practiced in economics departments and published in economics journals, it follows the standards of the discipline, which rewards sound data construction from historical sources less than statistically robust results.
Second, "digital humanities" became a popular phrase in the mid-2000s, as did "big data" in the 2010s. Suddenly, there were new series of handbooks and workshops addressing topics that overlapped with ours (network analysis, text analysis) and we were invited to give our opinion or even perceived as part of this emerging community. However, this community was indifferent to much of what we care deeply about, such as the social sciences, contingency tables, and longitudinal data. And many of its members reverted to what, in our view, had failed in the first wave. Once again, the slogan was "retooling" history so that it functions more like the "hard sciences," with more money, more teamwork, and more objectivity—turning the historian into a programmer. The History Manifesto went as far as to criticize the undue interest of many historians in archives.15 Many—although certainly not all—digital humanities projects, like many of their predecessors in the first wave, seem to channel a lot of resources without producing original historical results. The "Venice Time Machine" project launched in 2013 was perhaps the most ambitious and expensive of all collaborations to date between historians and computer scientists, as it aimed to "build a multidimensional model of Venice and its evolution covering a period of more than 1000 years" by digitizing kilometers of archives.16 It would have arguably advanced research in computer science, notably in optical character recognition, but the benefits for historians were far from clear, given that it was premised on supposedly all-purpose data collection, which was not guided by any specific historical questions. Perhaps predictably, the project ran into serious problems and was put on hold in 2019; it is not clear it will ever be resumed.17
As data analysis inevitably begins with whatever is most accessible, the digital humanities have often tended to reproduce old scholarly or social hierarchies, focusing on the artistic canon or on printed sources.18 It is striking that much of the criticism launched against the "big science" of the first wave around 1980 could be repeated today without any change: lavish funding, mathematical sophistication, a quest for exhaustive data—yet very few new insights about the past.19 Those who marvel, for example, at Google NGrams without pausing to wonder what exactly is included in Google Books are, for us, not very different from those who admired figures for deflated wages without questioning their sources. Similarly, we find the search for general, unambiguous "ontologies" that would allow machines to read, interpret, and compare all historical texts as risky and pointless as were past anachronistic occupational classifications.20 The good news is that there are exceptions to this mainstream version of the digital humanities.21
Third, as continental Europeans, we were exposed early on to Italian (social) micro-history. Contrary to most English-speaking versions of microhistory, its criticism of the first wave did not imply that all quantification was to be avoided. Italian micro-historians were deeply engaged in the question of standards of evidence.22 Their answers allowed us to think of quantification in constructivist terms, suited to the scales of interaction among individuals, and to approach it experimentally—open to non-standard procedures, including the idea that meaningful results can be obtained by studying a small, situated unit, such as a family or a village, systematically and even quantitatively. From this perspective, quantification is not an aim in itself, but can open up new questions or reveal exceptions suited to a more narrative approach. As micro-historians advocate "incorporating into the main body of the narrative the procedures of research itself," source criticism and data construction may become an integral part of quantitative research—not preliminary operations to be dispensed with or standardized.23 And this can all be fun. It can be "experimental history," in the sense of trying weird, idiosyncratic categorizations or correlations, in the hope of finding something new—the opposite of the "big science" of the first wave.24
Fourth, sociologists and political scientists went on quantifying from historical sources when most historians stopped doing it—as did economists.25 We drew a lot of inspiration from their broad toolkit, which is often more capacious than the one used by economists.26 In this respect, we found Andrew Abbott's writings particularly exciting when he called for unheard-of alliances of methods, questions, and materials.27 Ivan Ermakoff studied the "collective abdications" of the German and French parliaments in 1933 and 1940—that is, their votes granting full powers to Adolf Hitler and Marshal Pétain, respectively—combining source criticism on testimonies, a keen interest in the notion of exception, and the ambition to model processes of alignment among actors in the context of high-stakes situations with uncertain outcomes.28 In a series of articles on Renaissance Florence, John Padgett paid closer and closer attention to his sources, without abdicating his ambition to provide multi-scalar explanations, from the individual and the event to long-term trends, of the co-evolution of economic, political, and family organizations.29 These three social scientists belong to different schools, but they share a reflective posture that, as much as their interest in original ways of linking methods, proved an inspiration for us.
In France as elsewhere, most students enter doctoral studies in the humanities and history via what might be called a literary path, convinced that mathematics is not for them. As historian Antoine Prost has pointed out, the social stigma of not being "good at math" can be turned around: "certain self-styled princes of the intellect commonly express a haughty disdain for insistence on rigor or quantitative discipline of any sort, as though these were trivial concerns, menial chores to be left to subordinates."30 Our students thus lack a role model halfway between such "princes of the intellect" and those social scientists who treat quantification as a merit badge. We try to provide them with that third way.
With this aim in mind, we have practiced two main formats. First, in compulsory courses for beginning graduate students in history, we assign some readings and, crucially, guide students through hands-on research, from the close reading of a source to the writing of short essays. On occasion, we have used the same teaching strategies with students in quantitative sociology, having them experiment with non-standard ways to input and categorize data from narrative sources or images. We believe that this format could also work in economics.
Second, we have also developed non-compulsory workshops mostly attended by doctoral students in history. Each participant presents their sources and research questions, and as a group, we discuss what datasets could be built and which kind of quantitative analysis could be most useful. Curiously, the composition of the class has changed over the years, shifting from mostly male to mostly female. The topics and areas of research are extremely diverse, but most students use sources organized around persons and seek advice on prosopography, with a significant minority interested in textual analysis. We have thus come to believe that most historians could apply quantification—and would like to do it, if properly taught—to populations or samples of a few dozen to a few thousand individuals (often persons or texts, sometimes events such as lawsuits) described with a large number of variables, which generally change over time.31
For example, Elsa Génard attended our workshop as she began her research on prisons and prisoners in interwar France. Having located several thousand individual prisoner files from multiple prisons—themselves rare sources and very rich in information, but also very time-consuming to translate into data—the first thing she needed to do was to devise sampling procedures, which she did quite inventively. The next year, she carefully constituted data and was ready to test a technique that is not usual in history departments: logistic regression. It helped her show how prisoners' experiences depended on their social identities prior to detention (age, gender, geographic origin), on penal categories applied to them, and also, less predictably, on the prison's occupancy rate and the person who was its director.32 After this first experiment with the software R, she returned to our workshop, a few years later, to discuss various techniques for how to visualize her data and their relevance for writing her dissertation. Each time, we were able to provide her with advice tailored to her particular situation; but this happened before an audience of twenty to thirty fellow students, each of them learning from advice given to others. The important thing is that the advice was specific, showing, in practice, that methodological choices depend on the sources and questions at hand.
These two teaching formats, as well as our book, thus differ from the rare examples that we encountered as students. The main difference is the fact that we emphasize, in our courses even more than in our books, what precedes quantitative analysis: how to go from an archive (or any other source) to the "clean" datasets of social scientists, such as a .csv file or a "tidy" file in R. What we had not anticipated in our first years of teaching, at a time when we focused on introducing regression, network analysis, and so on, was that those first, preliminary steps were at least as exotic for students—whether they were historians, sociologists, or economists—as statistics per se. Over time, we have come to appreciate that data preparation is itself an important stage of analysis, and certainly not the chore often called "cleaning" and therefore delegated to underlings. Many among our students had never heard of variables and never seen a .csv file. And those who knew what to do with "clean" rows and columns did not know any better than the others how to get there from a source. Students would either write down narrative details in their database and only use queries for analysis or, trying to emulate an implicit model of "clean data," they would translate their sources directly into supposedly unambiguous, rigid categories: 1 for upper-class, 2 for middle-class, 3 for workers, and so on. We knew the pitfalls of these two solutions, but it took us a long time to formulate alternative principles—to decide exactly which quantitative history we wanted to teach. We flesh this process out in the second and third parts of this paper. But how should it be taught? In this respect, we want to emphasize three ideas.33
First, the best way to teach and learn "quanti" is to do research together. One of our favorite metaphors in the classroom is that of assembling a piece of Ikea furniture. You can always learn the instructions by heart, but the important things happen when the parts to be assembled are in front of you; and each piece of furniture has a specific manual, even if there are some generic parts. Hence our hands-on approach, always starting with actual research questions and sources. Our book offers a version of the invariants in Ikea manuals, so to speak, in the "Ten Commandments of Inputting Data."34 We think that it is important for students to read those, if only because they advise them against deep-rooted (if never explicitly taught) reflexes, such as not keeping track of the source of each piece of information when creating a database. But the commandments are just the beginning. In master's courses, we give the students one source—for example the official biographies of French members of parliament from 1946 to 1958—and they collectively decide on how to sample or focus on specific populations, how to enter data, how to categorize it, how to formulate reasonable research questions, how to produce provisional answers based on the close reading of specific cases and on correlations (such as through contingency tables and chi-squared tests), and how to discuss the results. We have experimented with this format with historians and sociologists, but it would arguably be as challenging and as rewarding for economists. In graduate workshops, each participant brings their questions and sources and shares them in turn with the group, who exchange ideas about how to constitute, categorize, and analyze the data. We try to create a friendly atmosphere of mutual help, so that students do not hesitate to show their first attempts at creating databases, especially when they consider them ugly. We thus fight what sociologists call "math anxiety."35 And we share our own databases in order to explain that data just extracted from a source should look dirty, in the sense of heterogeneous, full of comments and contradictions, not yet made simpler in order to answer one specific question.
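To make the idea of deliberately "dirty" input concrete, here is a minimal sketch in Python; it is our illustration, not a procedure taken from the book. The identifiers, archive references, and occupational wording are invented, and the column names and the `categorize` helper are hypothetical. Each row keeps its source reference, the verbatim wording, and any comment, while categories are added later in a separate column that can be revised without touching the transcription.

```python
import csv
import io

# Invented transcriptions: person ids, references, and wording are
# placeholders for illustration, not data from any real register.
RAW_ROWS = [
    {"person_id": "P001", "source_ref": "register A, fol. 12",
     "occupation_verbatim": "marchand de vins", "comment": ""},
    {"person_id": "P001", "source_ref": "register B, fol. 3",
     "occupation_verbatim": "cafetier", "comment": "contradicts register A?"},
    {"person_id": "P002", "source_ref": "register A, fol. 40",
     "occupation_verbatim": "", "comment": "occupation not stated"},
]

# One provisional coding scheme among many possible ones (an assumption).
CODING_V1 = {"marchand de vins": "drink trade", "cafetier": "drink trade"}


def categorize(rows, scheme, column="occupation_verbatim", uncoded="UNCODED"):
    """Return copies of the rows with a category column added.

    The raw rows are left untouched, so another scheme can be tried later.
    """
    coded = []
    for row in rows:
        new_row = dict(row)
        key = row[column].strip().lower()
        new_row["occupation_category"] = scheme.get(key, uncoded)
        coded.append(new_row)
    return coded


def to_csv(rows):
    """Serialize the rows as CSV text, one column per key."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


coded_rows = categorize(RAW_ROWS, CODING_V1)
print(to_csv(coded_rows))
```

Because the contradiction between the two registers survives as a comment instead of being resolved at input, the choice of how to handle it can be made later, in light of a specific research question.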
Second, our hands-on teaching methods involve actual historical sources (which digitization has made easier to use), not forged examples and simulated data, as is often the case in introductions to quantitative sociology, or aggregate, "clean" data from published syntheses. Even if the course does not allow us to perform each step leading from the source to the analysis, it is important to mention all the steps and show in class what the successive files look like. And spending more time on the initial steps makes sense: if one has to resort to learning alone, following tutorials (the Ikea manual) is doable if the aim is to learn regression or network analysis on structured, categorized data. It is, however, almost impossible to learn how to constitute such data on one's own. Hence we advocate including a few sessions on data constitution even in the most hardcore courses designed for economists.
Third, it is possible to understand quantification, learn the necessary skills, and develop an appetite for quantitative methods without being good at mathematics. We have to address math anxiety, not reinforce it. What we teach is not statistics, but some principles derived from statistics (such as sampling)—principles that can be explained in prose—and the proper use of software based on statistics for the exploration of historical sources. For example, to explain factor analysis (also called "correspondence analysis" or "geometric data analysis") in our book, we wrote that it is a tool for exploring rich corpora in order to identify salient features, a way of eliciting motifs from a mass of information. Then we presented the method by working from a case study: a prosopography of the administrators of a business union in the early twentieth century. Presenting the results of this study was an opportunity to give more general advice, along the way, on which numbers are important for interpretation and on classic pitfalls to be avoided.36 In our experience, students in history identify with the authors of such case studies and thus become more receptive to general advice—as long as it is given in accessible prose rather than in formulas. We do not want to hijack history classes to train students in the mathematics that has put them off in the past. We want them to become informed users of quantitative methods—to understand what a method does and does not do, its advantages and disadvantages, not to demonstrate the theorem behind the software. Of course, for those who want to use factor analysis, not just be able to read a paper using it, following our workshop or reading our book is not enough, but both point them to the appropriate advanced handbooks, tutorials, and summer schools. What we have done is dispel the belief that these tools are not for them and helped them choose one specific method among others—something that, quite naturally, advanced materials focused on a particular method do not address. Often, however, a contingency table and chi-squared test are all they need for their research.37
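For readers who have never seen one, the following sketch shows what a contingency table and a chi-squared test amount to in practice. It is an illustration under stated assumptions: the records and the variable names (`term`, `tie`) are invented, the functions are ours, and in actual research one would more likely rely on a statistical package than on hand-written code.

```python
from collections import Counter


def contingency_table(records, row_var, col_var):
    """Cross-tabulate two categorical variables into a table of counts."""
    counts = Counter((rec[row_var], rec[col_var]) for rec in records)
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a table.

    Each observed count is compared with the count expected if the two
    variables were independent. The approximation is usually considered
    unreliable when expected counts are very small.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented records for illustration only: 60 individuals, two variables.
records = (
    [{"term": "early", "tie": "no"}] * 10
    + [{"term": "early", "tie": "yes"}] * 20
    + [{"term": "late", "tie": "no"}] * 20
    + [{"term": "late", "tie": "yes"}] * 10
)

rows, cols, table = contingency_table(records, "term", "tie")
statistic, dof = chi_squared(table)
print(rows, cols, table)
print(round(statistic, 2), dof)
```

With one degree of freedom, the conventional 5 percent threshold is a statistic of about 3.84; a value above it only indicates that the association is unlikely to be due to chance alone, and interpreting it still requires going back to the sources.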
From Close Reading of Historical Sources to Data
Almost all textbooks on quantification begin with data arranged in neat rows and columns and standardized in simple codes: there are just two genders, no unknowns, a finite number of occupations, one occupation per individual, and so on. It is then tempting either to conclude that complicated sources are not amenable to this format and give up on quantification, or to naïvely simplify the source in the interest of so-called cleanness. Lack of interest in the constitution of data leads to, and is amplified by, subcontracting the task to others—research assistants or even computers are left to deal with the variety found in the sources. Discussions of the topic tend to be confined to promoters of re-usable datasets who often advocate one-size-fits-all data structures and categorizations. By contrast, in our "Ten Commandments" we advise researchers to tailor the construction and categorization of their data to their own sources and questions. By making their choices explicit, scholars might allow others to use their data as new sources in the future, but these unlikely future users should not constrain choices in the present. The crucial first step in any research project should be a critical close reading of numerical as well as textual sources.
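One possible way to avoid the "one occupation per individual, no unknowns" template is to record one row per statement found in a source, instead of one row per individual. The sketch below illustrates that design choice only; the `Mention` structure, its field names, and the example rows are hypothetical and invented for the purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mention:
    """One statement found in one source about one individual."""
    person_id: str
    source_ref: str        # where the statement was read
    variable: str          # e.g. "occupation", "gender"
    value: Optional[str]   # None: the source was read and says nothing


# Invented examples for illustration only.
MENTIONS = [
    Mention("P001", "census 1901", "occupation", "weaver"),
    Mention("P001", "marriage record 1905", "occupation", "innkeeper"),
    Mention("P001", "census 1901", "gender", None),
    Mention("P002", "census 1901", "occupation", "illegible"),
]


def values_for(mentions, person_id, variable):
    """All stated values for a person, in input order, one per mention."""
    return [
        m.value for m in mentions
        if m.person_id == person_id
        and m.variable == variable
        and m.value is not None
    ]


def is_silent(mentions, person_id, variable):
    """True if sources were consulted for this variable and none gave a value."""
    relevant = [
        m for m in mentions
        if m.person_id == person_id and m.variable == variable
    ]
    return bool(relevant) and all(m.value is None for m in relevant)


print(values_for(MENTIONS, "P001", "occupation"))
print(is_silent(MENTIONS, "P001", "gender"))
```

In this layout, an individual can have several occupations, a source's silence is distinguished from a source never consulted, and a wording such as "illegible" is kept as found until the researcher decides how to treat it.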
When we teach the practice of research, we begin with a source. We do not privilege the use of one type of source over another: archival records lend themselves to quantification as much as newspapers, printed poems, biographical dictionaries, and many other sources do. Similarly, no medium is inherently better suited than any other for the purpose of introducing students to quantitative historical research, though whether they work with actual dusty pieces of paper or with digital pictures or texts should of course be acknowledged in source criticism. Any source can be quantified, and research during Covid-19 has made it clear that open-access digitized sources, even though they should not become our only sources, are extremely useful. What matters is to subject second-hand digital sources to the same level of scrutiny as dusty archival papers whenever one intends to create a dataset.
Given that prosopography is a widespread practice in history and that we need a source that is easily available online for our introductory courses, we often use second-hand sources: biographies compiled by historians or by institutions for memorial purposes. For example, several classes have worked on the official biographies of French members of parliament in the 1950s.38 We ask students to read a few biographies and think about what type of information we could extract from them and how we might try to quantify it. They generally first come up with sociology's usual suspects—gender, age, occupation—with no critical reflection on how these variables are recounted in the (second-hand) source. Since the aim is to quantify, they tend to treat all information as objective. However, once prompted to think of other possibilities, the students start discussing what can be made of, for example, a certain phrasing, the lack of mention of marital status or religious affiliation, or the headgear worn in a picture.
One of the most brilliant essays produced by master's students examined the photographs of representatives of the colonial Empire in the French parliament. The initial idea was to create variables (columns in the dataset) describing what the students saw in these pictures. The students soon discovered that questions about what to describe in the pictures and how to describe it had non-obvious answers: Was there a hat? A tie? In which direction did the person look? Could they input something, and what, about the physical appearance or presumed race of these representatives? They also discovered that such questions were intimately linked to those of interpretation—what can we learn from such a systematic description?—and intention: Who had taken these pictures? Who had decided on the pose, and on the traditional attire in some photographs, and why? The result was a very interesting reflection on the use of photographs as a source for historians. Some students decided to go beyond a close reading of the online source, investigating its production by doing archival research about photographers employed by the parliament. The database's main result was that the photographs of the representatives, whatever their origin, had become more standardized over time. The students reached the conclusion that more important than the origins of the members of parliament was the process of professionalization that photography had undergone. Other students similarly quantified the phrasing of biographies, correlating it with their authors or dates of writing as well as with the gender or actions of their subjects.
We wanted them to learn that many different aspects of a source can be quantified—not just those generally thought of as quantifiable. We achieved the same goal in our graduate workshop. For example, Claire-Lise Gaillard is a graduate student working on marriage agencies and personal advertisements in nineteenth- and early-twentieth-century France. She has created multiple datasets from the agencies' books and from series of personal ads. Drawing on a newspaper from the interwar period, she was able to compare not only the stated ages of men and women but, more importantly, the qualities that marriage-seekers described themselves as having and those they said they were looking for—or would refuse.39 Moreover, as the newspaper kept track of successful matches attributed to its personal ads, Gaillard was able to make inferences on what made a man or a woman actually eligible. Women who presented themselves as blond, cheerful, and already good homemakers married more often than others, perhaps unsurprisingly; but men who advertised less stereotypical qualities, such as being good company, succeeded the most. Gaillard also zooms in on couples who appear mismatched according to the criteria of the marriage press, and tries to make sense of their situation—something that she can do because quantification has allowed her to establish what the norms were in a sample of four thousand advertisements. In another, as yet unpublished part of her research, she uses a more intriguing source, a sort of Facebook of the 1930s: the last page of a teen magazine, Midinette, where messages (not always sentimental) addressed to one pseudonym, several, or "all," were published and often answered. Using network analysis, she shows how centralized these communications were—whereas in a cursory reading, they would seem random—and proceeds to try to understand which pseudonyms were central, which types of messages they were associated with, and why.
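To give a sense of what "how centralized" can mean in practice, here is a minimal sketch of one common indicator, Freeman's degree centralization, computed on a handful of invented exchanges. The pseudonyms and ties are hypothetical, each message is reduced to an undirected tie between two pseudonyms, and nothing here is meant to reproduce Gaillard's data or her choice of measure.

```python
from collections import defaultdict

# Invented (sender, addressee) pairs for illustration only;
# these are not taken from Midinette.
messages = [
    ("Hub", "A"), ("Hub", "B"), ("Hub", "C"), ("Hub", "D"), ("A", "B"),
]

# Assumption: a message creates an undirected tie between two pseudonyms.
neighbors = defaultdict(set)
for sender, addressee in messages:
    neighbors[sender].add(addressee)
    neighbors[addressee].add(sender)

# Degree = number of distinct correspondents of each pseudonym.
degree = {name: len(others) for name, others in neighbors.items()}
n = len(degree)
max_degree = max(degree.values())

# Freeman degree centralization: 1.0 for a perfect star, 0.0 when all
# pseudonyms have the same number of correspondents (requires n > 2).
centralization = sum(max_degree - d for d in degree.values()) / ((n - 1) * (n - 2))
most_central = max(degree, key=degree.get)

print(most_central, round(centralization, 3))
```

A value close to 1 indicates that exchanges revolve around one pseudonym; the indicator only becomes interesting once one returns to the messages to understand who that pseudonym was and why.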
We are persuaded that a whole variety of sources can be treated quantitatively—not only statistical sources or individual records, but all sorts of texts and images. We oppose the still frequent description, made by first-wave cliometrics, of marriage records, census forms, probate inventories, and the like as inherently quantitative sources. Like all other texts, these, too, can be subjected to all kinds of readings. Conversely, any type of content in any type of source—or its absence—can be quantified.
In the end, what we teach is that it is not its type or format that makes a source suitable for quantification. A source is not particularly appropriate for creating a database because it deals with economic matters, because it contains numbers, or because it lacks an aesthetic quality. Sources of any type that come in large quantities (or, more precisely, any source that can be conceived as a long series of units—records, images, paragraphs, and so on) lend themselves to quantification, if only because it is difficult to grasp their contents without creating a database. But the main question is: will the constitution of data from this source help answer your questions and produce fresh knowledge? It is the research question, not the source, that should drive choices as to how to quantify, and without a good understanding of the source, what it shows, and what it hides, quantification will fail.
Historical statistics as a source are a case in point. We accumulated many questions in our workshops, over the years, about how to use them, and the answer matters as much, we think, for the new historians of capitalism as for economic historians. Statistics of the past were a favorite source of the first wave of quantification, too often taken at face value—as is still sometimes the case in cliometrics. By contrast, in history departments since the 1990s, these numbers and their accompanying texts have mostly become the source for a vibrant but non-quantitative history of statistics as an administrative and scientific practice, interested, for example, in how statistics shaped categorizations related to race or gender—a subdiscipline that is admittedly too little known by non-specialists.40 As with any other source, however, it is also possible to create new numbers, substantively meaningful for our own questions, from documents (in this case, numbers) of the past.41 Statistics of the past are a biased source, but no more and no less than any other. Using them requires external and internal source criticism, with general questions such as: Who commissioned these statistics? Who was supposed to respond to queries, and who actually did? How did the promoters of the project define their categories? Are we sure that those who devised the questions, those who asked and answered them, and those who turned them into numbers understood them in the same way? Which quantitative techniques were used to aggregate answers into numbers, what do they emphasize, and what do they hide? Who subsequently used the numbers and categories and for what purpose? If we keep these questions in mind—even if we do not have definite answers to them all—it is possible to constitute new data from old statistics, and to answer some of our own questions.
Feminist scholars are among those who have utilized this reconstructionist approach to statistics in order to overcome biases in records of female labor. Their idea of "cunning historians" able to "read their sources against the grain" to produce not just new narratives but also new numbers interests us because many historians think of quantification as violence done to the original sources through undue simplification and abstraction.42 Here, the historian's motto of complicating the story—that is, avoiding a naïve reading of the source and acknowledging the subtle power dynamics it incorporates—is compatible with building new abstractions and measurements. In this spirit, one of us (Claire Lemercier) used the very same statistics that some historians had taken for accurate and objective, and that Joan Scott later analyzed for their gender biases, to elicit new knowledge about female apprentices in nineteenth-century Paris.43 She reconstructed estimates of the proportion and numbers of female apprentices and, more importantly, showed that they tended to be employed in two very different types of contexts: in family workshops in food trades and with subcontracting mistresses in fashion trades. These were quite separate from two other contexts where there were many male apprentices: not only small businesses in trades that were deemed highly skilled, but also medium-size workshops with a very advanced division of labor. The fact that female apprentices were less likely than male ones to become well-paid workers is not a surprise for historians, even though it is interesting to retrieve it from a source that was meant to hide it; the typology of trades and workshops, however, is an original result, and the result of a reading that is against the grain as much as it is quantitative.
Fresh knowledge can also be found beyond the proverbial streetlight of extant datasets and of sources routinely quantified to answer our question or similar ones.44 This is a moving target. For example, using probate inventories to investigate consumption was once a pioneering idea; they are still useful, but their biases and limitations have become more apparent and scholars have looked for complementary sources. Our starting point is never to complain about the fact that a source has biases (they all do) or to find "the best" source, but rather to assess which questions it could help answer.45 The history of working hours, productivity, and female labor in early modern Europe offers a comforting example of debates that have partly evolved thanks to the addition of new sources and discussion of their biases. Reading witnesses' testimonies about crimes against the grain yields indications on working hours; systematically listing the verbs in diverse texts expands the range of documented occupations and tasks.46 Passionate debates on the productivity of English spinners have led to a rare, exciting occurrence: a graph comparing measurements across types of sources in an economic history journal, accompanied by a text discussing categorizations, and specifically whether workhouse spinners should be considered "workers."47
This exercise in looking for new sources and assessing what exactly they document and what they hide is something that need not pit economists against historians—if only each discipline rewarded it better. Non-standard sources, categorizations, and methods tend to take time and space and to increase the risk of rejection by many peer-reviewed journals. It is striking, for example, that economists who took pains to delineate different types of advertising in early twentieth-century newspapers, and to produce numbers by measuring column space with a ruler, had to remove the part of their working papers where they carefully described this process from the published version.48 The situation in the historical discipline is not more encouraging, if only because the number of historical journals that accept submissions with quantification exercises, and have the expertise to evaluate them, is meager. However, teaching data constitution more systematically is one way of raising expectations, in younger audiences, as to what we need to know about what was done with the source in order to be convinced by published results.
Having established that all sources can be quantified, we would like to explain why, in our view, the constitution of data is a key stage in research. Too often, handbooks about quantification only address the so-called cleaning of data—a metaphor that we refuse because it suggests that the complexity of sources should be erased rather than analyzed. By contrast, the creation of a dataset should be the very point at which scholars read their sources closely and take into account their complications.
Some economists do recognize historians' distinct craft of reading sources and want to learn from it, though this is not to say that all historians practice it equally well. According to Naomi Lamoreaux, the best historians know "how to read texts over and over in the context of related documents […], and how to derive meaning from what was not said as well as from what was said."49 In our view, this reading is not a mere complement to interpreting numbers; it is rather a pre-requisite for the constitution of data. In the same spirit, Philip Hoffman, Gilles Postel-Vinay, and Jean-Laurent Rosenthal recently wrote about working on a type of credit that was not recorded in past statistics and only appeared in scattered, non-digitized archives—but proved important even from the perspective of economists: measuring it showed that financial markets could thrive without large banks and beyond the stock exchange. They concluded that "ignoring any of these elements can cause enormous problems" in reference to "how the original historical evidence was generated, how it was preserved, and how they [scholars] go about collecting it."50 Likewise, Howard Bodenhorn, Timothy Guinnane, and Thomas Mroz, discussing historical height, a classic proxy for the good health and well-being of populations, took pains to explain that some problems could not be solved by econometric techniques, but required scholars to "ask hard questions about potential source bias" by investigating the historical production of their sources.51 After having examined profits from the slave trade on the basis of accounts published by historians, Guillaume Daudin, a specialist in international trade, is now building a database from a primary source: official statistics of trade in the eighteenth century. He insists that their very inaccuracies and contradictions document not just the history of statistics, but also trade itself.52
These scholars are, however, an exception. As historical data becomes more readily available online, many economists, physicists, and others analyze it without questioning its provenance. We like historian Mateusz Fafinski's admonition that "historical data is not your familiar kitten. It's a saber-toothed tiger that will eat you and your village of data scientists for breakfast if you don't treat it with respect."53 But we are not sure that the distribution of power across disciplines is such, for now, as to allow this type of retribution. At a minimum, we can warn against the infatuation with "big data" in history—as if the aggregation of small datasets could produce such a thing (even the data of Hoffman and his colleagues are not "big" in computer scientists' sense of the term). The new fashion for "bigness" is often based on the old naïve idea that many biases will cancel one another out. Archaeology as well as ancient and medieval history, in which the compilation of sources and the digitization of such compilations are more advanced than in early modern and modern history, offer many cautionary tales in this respect. For example, Søren Michael Sindbæk, a pioneer in the careful network analysis of archaeological and narrative medieval sources, claims that "'big data' is rarely good." This statement concludes his experiment on a large heterogeneous repository of data on maritime networks. A network visualization of this data mostly revealed patterns in archaeological knowledge—a useful result per se, but not to be confused with patterns in medieval transportation.54
This is one of the reasons why we are firm advocates of what we call dirty—that is, not artificially homogenized—datasets. In our view, the fact that historical sources are complicated, heterogeneous, and scattered is what makes them interesting and a basis for the construction of deep and dense, rather than big, data. Therefore, contrary to assumptions generally associated with quantitative history, we teach the value of "outliers and weirdness" as well as of missing data.55
For example, one of us (Claire Zalc) leads a large research project, funded by the European Research Council, on the social and geographical trajectories of Jews in Lubartów, a village in Poland, in the interwar period. One of the sources is a 1932 population register that lists all the households, apartment by apartment.56 The constitution of data is teamwork, with a lot of discussions, and Zalc is fully involved in this process. Beyond obvious information such as dates of birth, the team recorded, for example, the fact that some of the entries were hastily handwritten with a pencil—perhaps considered provisional, likely to be erased. As there is very little external information on the history of this source, descriptive statistics proved crucial for internal criticism and, therefore, interpretation. Mentions in pencil were correlated with an unknown direction (kierunku) of "departure from the city" in the fall of 1939 and a mention of "Jewish" religion. The unknown direction and use of pencil were in fact a low-noise description of processes of flight to the USSR as well as to other cities in Poland at the time of the declaration of war. More astonishing is the mention, again in pencil, of "expelled/displaced" (wysiedlony), followed systematically by the date April 9, 1942. That was the last day of Jewish Passover and the time when the SS organized a deportation from Lubartów. The team concluded that it was likely that all pencil entries were from the Second World War, which led to the re-interpretation of other events mentioned in the source.
In sum, the complications of the source are not an obstacle to the constitution and quantitative analysis of data. On the contrary, a quantitative analysis that does not preemptively erase those complications often allows us to make sense of them. For example, what are often described as errors in the sources, such as when dates of birth change or double-entry accounts are not balanced, might become important proxies for individual strategies or abilities.57
Similarly, we always advise students to record missing data carefully, rather than be ashamed and hide it. Patterns in missing data may reveal aspects of the constitution of the source. The fact that data are missing might even become a substantive piece of information in its own right when it reveals an ability to hide from the authorities or a fictitious police operation, for example.58 Likewise, in the context of occasionally teaching quantitative sociologists, the simple instruction to create a mock spreadsheet with at least three rows (cases) and at least six columns (variables) from narrative or visual documents elicited fascinating discussions as to what could be done with, for example, a variable listing phone numbers given on posters. The number itself might indicate something (such as a geographical location), but the presence or absence of a phone number could also be interesting in itself—in terms of intended audience, for example. The data is not missing here: the fact that we lack it indicates something interesting.
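The mock-spreadsheet exercise can be sketched in a few lines. The following is only an illustration under stated assumptions: the three posters, their contents, the column names, and the two codes for missing information are all invented, and a real project would devise its own codes from the source.

```python
# Hypothetical mock spreadsheet: three posters (rows), six variables (columns).
# Every value below is invented for illustration.
MISSING_CODES = {
    "not_in_source": "the poster gives no phone number",
    "illegible": "a number is printed but cannot be read",
}

posters = [
    {"id": "P1", "title": "Public meeting", "year": "1978", "place": "town hall",
     "phone_verbatim": "01 00 00 00 00", "phone_status": "present"},
    {"id": "P2", "title": "Concert", "year": "1979", "place": "",
     "phone_verbatim": "", "phone_status": "not_in_source"},
    {"id": "P3", "title": "Strike call", "year": "", "place": "factory gate",
     "phone_verbatim": "", "phone_status": "illegible"},
]


def count_by(rows, column):
    """Count how many rows take each value of a column."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


# The absence of a number is kept as a value in its own right,
# so it can be tabulated like any other variable.
phone_counts = count_by(posters, "phone_status")
print(phone_counts)
```

Keeping the verbatim column separate from the status column means that an empty cell is never ambiguous: one can always tell whether the source was silent or the reader was unable to decipher it.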
Accordingly, in our teaching and books, we emphasize the importance of the input phase of research—the moment when we iteratively read and copy the source into a table and make decisions about its rows and columns. This step is when we decide which type of unit (a person or a book, for instance) will be described in a row and which distinct bits of information about this unit will be typed in columns.59 The important thing is that, at this stage, the information is not homogenized or categorized, but copied from the source. We emphasize this phase not because we wish to impose an arduous and unpleasant rite of passage on the researcher, but because we believe that this work is crucial to any research project. Doing the data entry ourselves is our chance to discover the kinds of unusual things that lead to innovative results. As tedious as inputting data may be, it offers an opportunity to become truly familiar with the source. In recent years, large research projects have made delegating data entry more common for the more privileged among us, as was the case during the first wave of quantification. This privilege carries great risk, although we recognize that it can come with considerable benefits, such as when it affords the opportunity to gather original data in a language that one does not speak. We have therefore sought to adapt it to our principles.
In our courses for beginning graduate students, discussions of how to elaborate collective data entry instructions take a lot of time, but students enjoy them and we think of this time as perhaps the most productive in the course. Such instructions must be clear so that the data entry is consistent, but they must also aim to keep the complications of the source as visible as possible for the next steps in research.60 Even in the apparently simple situation of biographies, we encountered limit cases. Should a member of parliament who died just after his election and never took part in debates get his own row? (It depends on whether you're studying elections or parliamentary work!) What should we write in a column labeled "political action?" What counts as "political?" What counts as an "action?" Even without standardizing the contents of the columns (at this stage, the students mostly enter quotations from the source), a clarification of categories, based on research questions but compatible with the actual contents of the source, is in order. In our teaching, we want to make the point that quantification does not imply that we skip such complicated issues. On the contrary, trying to be systematic can help us think about our own, often implicit definitions. This process of devising instructions based on a close reading of the text is not only useful for source criticism: it helps clarify the very questions that the research is supposed to answer. It is or should be, in our view, as important for economists as for historians.
Constructivist and Experimental Quantification
This view of data entry is intimately linked to our practice of the further stages of quantitative history. We want data entry to keep the words of the source, its lacunae and inconsistencies, because we have no unique, predetermined categorization scheme: experiments in categorization arise from the encounter between our questions and the source. The research practice that we teach thus often results in spreadsheets with hundreds of variables (including those directly copied from the source and their categorized variants, with separate columns for each source and often each date or period), whatever the number of individuals. Most variables are qualitative—that is, expressed in words, not numbers. This approach is only manageable because we know that different methods will allow us to explore different corners of this complicated whole. Producing aggregate numbers and correlations allows us to single out interesting exceptions suited to a more qualitative interpretation: quite often, quantification is one step, but not the last, in a historical research process using systematically gathered data.
Categorization has been the focus of criticisms of quantitative methods since the 1980s. The potential pitfalls are many, according to historians: anachronistic use of nomenclature, the reification of individuals, and improper aggregation of diverse entities. Economists are also concerned with some of these problems, even if they prefer to call them bad proxies or heroic assumptions. No categorization choice is ever perfect, but some are more suited to certain research objectives than others, whether for practical reasons (number of classes), theoretical reasons (classification criteria), or rhetorical choices (the names assigned to the groups in a chosen classification scheme). As statistician Alain Desrosières put it, "The question is not: 'Are these objects really equivalent?' but: 'Who decides to treat them as equivalent and to what end?'"61 Distinguishing the data entry phase from the categorization phase is a first step if we want to make the latter acceptable to historians and more meaningful for everyone, including economists.
How should we categorize occupations? This is a classic question in our workshops, as in most circles interested in quantitative history, and always a daunting one. We believe that the answer depends on one's research questions and sources. We emphasize the fact that it is useful to have more than one classification scheme and offer concrete examples. During the interwar period in Paris, all entrepreneurs were obliged to register with local authorities. One of us (Zalc) used the business register to categorize their activities and created a field called "business purpose" in her database. Businesses were as diverse as souvenir stands, insurance brokers, laundries, belt manufacturers, and fruit and vegetable wholesalers, but the precision of self-reporting varied widely. Some terms, such as "textile," do not indicate whether the business involved manufacturing or retailing. As a result, Zalc faced many choices; her categorization followed her research questions, which had emerged from the alleged peculiarities of immigrants in the business world (such as hairdressers who complained of an "invasion" of their trade) and constraints imposed on them (for example, most immigrants were not allowed to sell alcohol, hence food shops and bars had to be filed under different categories). She built categories by following these guiding principles and iteratively aggregating mentions that she deemed close enough: beginning with many small classes, then merging them.62
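The movement from many small classes to merged ones can be sketched as two explicit mappings applied to a verbatim field. This is a minimal illustration, not Zalc's actual scheme: the verbatim entries echo the examples above, but the class names, the merges, and the last entry in the list are hypothetical.

```python
# Verbatim "business purpose" entries; both mappings are hypothetical.
verbatim_entries = [
    "souvenir stand", "insurance broker", "laundry",
    "belt manufacturer", "fruit and vegetable wholesaler", "textile",
    "bar and grocery",  # invented entry with no class yet
]

# Step 1: many small classes, staying close to the wording of the source.
fine_classes = {
    "souvenir stand": "souvenir retail",
    "insurance broker": "insurance",
    "laundry": "laundry",
    "belt manufacturer": "belt manufacturing",
    "fruit and vegetable wholesaler": "food wholesale",
    "textile": "textile (manufacturing or retail unknown)",
}

# Step 2: merges decided later, in light of the research questions.
merged_classes = {
    "souvenir retail": "retail",
    "insurance": "services",
    "laundry": "services",
    "belt manufacturing": "manufacturing",
    "food wholesale": "wholesale",
    "textile (manufacturing or retail unknown)": "undetermined",
}


def categorize(entry):
    """Return the verbatim entry with its fine and merged classes."""
    fine = fine_classes.get(entry, "unclassified")
    broad = merged_classes.get(fine, "unclassified")
    return {"verbatim": entry, "fine": fine, "broad": broad}


table = [categorize(entry) for entry in verbatim_entries]
for row in table:
    print(row)
```

Because the verbatim column is never overwritten, either mapping can be revised, or a second scheme added alongside it, without returning to the register.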
In her dissertation, one of us (Lemercier) focused on the Paris business elite in the nineteenth century. Each member of the Chamber of Commerce was listed as having a different occupation in different sources from different years: many individuals were described as bankers at one time, and as more or less specialized manufacturers or wholesale merchants at another. Lemercier produced a simple table showing changes in the proportions of "bankers," "merchants," and "others" over time. The important thing here is the legend explaining, for example, that a "banker, textile manufacturer, and wine merchant" counts as a banker—because this table had a specific purpose: showing that the share of "others," those with one specialty rather than banking or commerce generally, rose during the period from the 1800s to the 1850s. In a different chapter, Lemercier used a definition of la haute banque (merchant banking) based on family ties in addition to stated occupation because it was more appropriate for the matter at hand: evaluating the possibility that specific families or wider social groups were able to influence public institutions.63
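A legend of this kind amounts to a small, explicit rule. The sketch below assumes a priority order in which any mention of banking wins, then any mention of "merchant," and everything else falls under "others." Only the first part is stated in the text; the rest of the rule, the function name, and the four sample records are hypothetical.

```python
from collections import Counter


def legend_category(stated_occupation):
    """Apply the table's legend to one verbatim occupation string.

    Assumed priority: banker > merchant > others. Only the precedence of
    "banker" comes from the text; the remainder is an illustrative guess.
    """
    text = stated_occupation.lower()
    if "banker" in text:
        return "bankers"
    if "merchant" in text:
        return "merchants"
    return "others"


# Invented records (period, occupation as stated in the source).
members = [
    ("1800s", "banker, textile manufacturer, and wine merchant"),
    ("1800s", "wholesale merchant"),
    ("1850s", "textile manufacturer"),
    ("1850s", "banker"),
]

counts = Counter((period, legend_category(occ)) for period, occ in members)
for key in sorted(counts):
    print(key, counts[key])
```

Writing the rule down in this way makes the legend testable: a reader who disagrees with the priority order can change one line and see how the table moves.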
In these cases, we began with one field in the original data from one source, or several similar fields (denoting a type of business) from several sources, and we ended up with several categorization schemes. Scholars often create classifications from even more composite criteria. For example, digital humanists Miriam Posner and Marika Cifor have reported on an interesting teaching experience.64 The aim was the creation of a database of early African American silent "race films." Students had to decide how to define the genre, but the decision was not to be a priori: the various possibilities had to depend on the available data. They settled on a definition of the "race film" as "a film with African American cast members, produced by an independent production company, and discussed or advertised as a race film in the African American press," for reasons that they take care to explain. They also kept a separate file of all "discarded data" so that other scholars could make different decisions. Similarly, our students break into small groups to discuss various categorization schemes, which they then report to the entire class. We stress that there are no intrinsically bad categorization schemes, only insufficiently explicit ones, and many that are not well suited to one's research questions. For example, as regards occupations in the biographies we use, students are often intrigued by the fact that an individual had the same occupation as one of his or her parents. We then ask them to explain what "the same" means exactly, and emphasize the fact that they will probably need to create an "impossible to know from the source" category and think about how to interpret it.
Of course, we touch here on fundamental differences among disciplines. Some historians start with a source but no explicit research question; we teach students in history that they need one if they want to categorize data—they should not look for ready-made ontologies. By contrast, many economists start with a well-defined question and a standard method, then go look for the least inconvenient source—one that includes some type of proxy for what they want to measure. In our book, we warn against the pitfalls of a frequent categorization scheme: the use of names as proxies for origin, ethnicity, or religion.65 We do not, however, indict proxies in general. We insist on the fact that reading the source closely during the data entry phase, as well as using descriptive statistics or clustering techniques on not-too-simplified data, can produce interesting ideas for categorization. But we do not believe that analytic categories should—or can—directly come from historical materials. Rather, we try to balance the encounter between preliminary questions and surprises that come from interacting with the sources. Historians have often criticized cliometricians for their tendency to look for pre-defined entities such as GDP, "human capital," or "skilled occupations" in whatever source is most easily available, only producing numbers thanks to "heroic assumptions."66 When it involves measurements of population, economic activity, ethnic fragmentation, or conflicts in the pre-colonial period, the economic history of Africa is particularly susceptible to criticisms aimed at bad proxies.67 The increasing availability of easily downloadable data has compounded the tendency to use questionable proxies. The problem is not that the assumptions are heroic but that they are, from an historian's point of view, neither explicitly stated nor justified, so that they pave the way for over-interpretation.
As historians, we do welcome readings of sources against the grain and bold experiments in the creation of proxies—as long as the rules of the interpretive game are made explicit.68
As advocates of close reading as the first stage of quantification, and of dense and sometimes weird rather than big data, do we abdicate any attempt to generalize or answer big questions? Absolutely not. We simply do not believe that quantitative history should only focus on the typical or the average.69 Datasets that do not involve many individuals can nonetheless provide answers to big questions; shallow data on a large number of cases, produced without source criticism, may not.70 This is the point where we diverge from most cliometricians.
We are not overly interested in averages at the expense of exceptions because our questions are often not similar to those addressed by standard cliometrics papers. Some of our questions are descriptive—because when we lack a good description, causal questions can be pointless. Visualization tools, for example, are particularly apt for tackling descriptive questions, as shown by a paper using geographical information systems and excellent source criticism to provide a new understanding of the Dust Bowl and open new causal questions.71
Our definition of causal questions is also more inclusive than that of standard econometrics. Hence we teach a wide range of tools: there is life outside regressions.72 When we first began to teach quantification in the early 2000s, we focused on what were then called the new methods—network analysis, sequence analysis, and event history analysis. We thought that these methods were well-suited to explore what micro-historians found interesting: life trajectories and interactions among individuals. Discussions and further readings led us to appreciate that factor analysis, regression, and text analysis, though older, remained useful for the types of data that we and our students wanted to analyze, and were therefore also worth teaching.
We now focus more on the constitution of data, but one thing has not changed: we always mention several methods and advocate the use of the one most suited to the user's data and questions, rather than the supposedly easiest or newest. A single quantitative method, however refined, cannot answer all historical questions. As readers, we would like cliometrics to go beyond regression more often, but we are also surprised when digital humanists use a strange—for us—version of one of their favorite tools, such as a map, word-cloud, or network representation, where we would have used a mere contingency table or perhaps a regression. In our teaching, we emphasize the fact that any historian can learn how to produce contingency tables and chi-squared tests, and that those can help tackle many historical questions. We also encourage beginners not to be shy if their data and questions require the use of one or several allegedly more advanced methods.
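As a minimal sketch of the contingency table and chi-squared test mentioned above, the following Python code computes the statistic by hand, so that each step stays visible. The counts are invented for illustration and come from no source discussed here; the function names `chi_squared` and `p_value_one_dof` are our own, and the p-value shortcut applies only to tables with one degree of freedom (such as a 2 x 2 table). The sketch assumes that no row or column of the table is entirely empty.

```python
import math


def chi_squared(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts, all rows of equal length.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Invented counts: rows are two groups of individuals found in a register,
# columns are the presence or absence of some characteristic.
table = [
    [30, 20],
    [10, 40],
]

stat, dof = chi_squared(table)
p = p_value_one_dof(stat)
print(f"chi-squared = {stat:.2f}, degrees of freedom = {dof}, p = {p:.5f}")
```

Writing out the expected counts, rather than calling a ready-made routine, keeps the logic of the test in view: the statistic only measures how far the observed table lies from what independence between rows and columns would produce. It says nothing about whether the categories themselves were well chosen.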
It would be difficult for economic historians to adopt our curriculum because their students need to learn regression. We are privileged in that there is no standard curriculum on quantification in history departments. We do not want to impose one single method, and know that our students would not learn it as statisticians do anyway. Instead, we present the menu and rules from which to choose a method, and give pointers about where to learn it in depth when it becomes necessary. We promote knowledge of diverse tools rather than the advanced mastery of a few—and we try to keep our eyes open for the discovery of new ones in disciplines, specialties, or countries that we do not know well. We think that all willing historians, perhaps using our book as a first step, can become skilled readers of papers based on diverse methods (even without using those in their own research) and then teach this type of numeracy.73
Indeed, each method can play a role at different stages of one's reasoning, and there is often great heuristic value in combining more than one. In our research, we have taken advantage of being able to switch among methods that offer different perspectives on our data. In a paper focused on the value of this diversity of tools, Pierre Mercklé and Zalc show that the successive use of regression, network analysis, and sequence analysis brought to the surface different facets of the logic of persecution and survival of the Jews of Lens, a town in northern France.74 Regression showed that foreigners and persons declaring no occupation had a higher risk of being arrested. Network analysis added a counterintuitive factor: the more local ties someone had, the more visible they were; in that context, ties were a risk rather than a source of support. But those two methods only allowed for synchronic analysis. Considering individuals from a longitudinal perspective, from the interwar period to the end of the war, led to an understanding of how resources that eventually became key to survival were built cumulatively by some and not others. Migration patterns that preceded the war ultimately proved decisive in the fate of single individuals.
Mercklé and Zalc thus complicated narratives and explanations, in the positive sense in which historians often use that word: the narratives and explanations became more nuanced and accounted for more cases (not just the most frequent) and for differences among historical contexts. Did their account become too complicated? Social scientists with a taste for parsimonious models might think so, yet their approach still relies on quantification—data have been abstracted and hypotheses made explicit. We know that some economists—if only a few—have added tools other than variants of regression, including descriptive ones, to their toolbox. For example, Marc Flandreau and Clemens Jobst have been among the first social scientists, aside from sociologists, to use network analysis on historical data in a study that also made an inventive use of the press as a source and carefully discussed its proxies.75
We welcome increased opportunities for discussing studies like these—which are neither standard qualitative history papers nor standard quantitative economics papers, yet distinctly recognizable as belonging to history or to economics—in workshops, seminars, and publications in peer-reviewed journals. In such contexts, practical and epistemological questions regarding the constitution and categorization of data are unavoidable. There is cause for optimism: if such discussions were not confined to online appendices in reputable scholarly publications or to articles about the management of large collective research projects in digital humanities journals, a common ground across disciplines could emerge about key issues concerning data construction and categorization. The choice of methods and the status of models might remain distinctive in each discipline, but we would like, at least, to disentangle substantive reasons for this difference from institutional ones, such as the standards of journals and hiring committees.
Cliometricians have invested a lot in a very singular definition of causation—a definition that is not even shared by all economists, especially in macroeconomics.76 Arguably, this is one of the sources of their ongoing misunderstandings with the new historians of capitalism.77 Historians benefit from a less rigid standard of causation in that they can more easily experiment; the drawback is that they rarely pause to wonder about what exactly they call a cause. There are, however, interesting exceptions, and the possibility of learning from the writings of philosophers, sociologists, and political scientists, who offer more diverse answers than economists.78 Abbott, for example, criticized the routine use of variables and regression in mainstream U.S. sociology and offered alternatives. He specifically discussed ways to deal with longitudinal data: not only do individual variables change over time, but their meanings and their possible effects also change across historical contexts; sometimes, it seems meaningless to disentangle one effect from a series of causes. Yet the solution is not purely narrative: all this complexity can be modeled, formalized, and visualized.79 At a time when some data scientists advocate for dropping causal questions altogether, we hope for more frequent interdisciplinary discussions of such questions among humanists and social scientists.
Quantification in history has not yet recovered from past excesses: in many circles, it still has a bad reputation or is considered extinct. This situation frees historians to use quantitative techniques in creative ways. Our intent is to promote diversity in methods and imagination in categorization schemes—going beyond the usual suspects in terms of sources, variables, and calculations. We think that all historians can and should include several quantitative methods in their teaching and research repertoire—not in order to use them every day, but whenever their sources and question require it. A repertoire leaves room for tinkering, experimentation, and variations on standard themes. The rules of quantification call for explicit mentions of choices and procedures; they do not preclude originality. In fact, they can encourage students and researchers to try many different approaches and make inevitable errors along the way, as well as to carry out analyses on different scales.
Rather than limiting historians' intuition or creativity, quantitative methods can stimulate it. We do not intend to ignore cultural history or to get only at the average or the typical; on the contrary, we have a keen interest in outliers and the atypical, which, incidentally, contribute to source criticism. Our quantification is constructivist, suited to the scale of interactions among individuals, and, hopefully, not boring. It is also reflective and cautious about the categories employed in the sources—and our own. The production of numbers is only one step among others: research does not have to stop there, and, more importantly, numbers derive from datasets that can only be created from a systematic and, especially, a close reading of the source.
Teaching quantification also means learning from one another in class—especially when teaching is centered on the idea that any source can be quantified. A class in quantitative methods is then a great opportunity to discover diverse sources and questions and discuss their mutual adjustment. This opportunity is open not only to historians, but also to economists, quantitative sociologists, digital humanists, and anyone who routinely uses formal methods, as long as the latter are willing to learn about the constitution of data.
In the end, we would like to stress that our focus is fundamentally on data, rather than on quantification as such. Data is now a fashionable term. Our teaching has led us to conclude, like many before us, that good data are key to good quantitative research. In 2019, the Social Science History Association, part of the surviving offspring of the first wave of quantitative history, issued a call for papers on "Data and its Discontents." Below we offer our answer—one that we have been teaching for years, and have had the pleasure to see reflected in many papers and books by our former students. It was originally aimed at traditional historians, but perhaps it will also interest some economists.80
Data exist. We tend to call them data when we take notes in a spreadsheet, but data really consist of any type of notes taken systematically from historical sources. Systematically, but not automatically, mind you: everyone should give themselves a set of rules suited to the sources and questions they are pursuing, make them explicit, and follow them consistently; but no one should delegate the reading of primary sources to a computer. Our spreadsheets are verbose and we cherish their dirtiness, while most digital humanists would want them cleaned. Their verbosity is a sign that the transformation from source to data (notes) happened through our own thought processes (not those of underlings paid to sweep the dirt under the rug) and retains the ambiguities of the sources. To improve the data, and to have more colleagues crave them and fewer abhor them, we want to keep them as close as possible to the source, even though they will be dirtier and more costly to produce in large quantities. But that is fine, because we do not believe data are good only if they are really big. They are good if they are complicated, in the sense of rich in information but still systematically acquired and noted in a structured way, so that we can simplify them in many different ways as we experiment in later stages. Data are good for thinking, even, or especially, when they produce new questions rather than definitive answers.
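The idea of keeping data close to the source so that they can be simplified in several ways later can be illustrated with a short Python sketch. Everything in it is hypothetical: the four rows, the field name `occupation_as_written`, and the two coding schemes are invented for the purpose of the example and do not reproduce any dataset or classification discussed in this article. The point is only that the verbatim wording is stored once, while each categorization is a separate, explicit rule that can be revised or replaced.

```python
from collections import Counter

# Invented rows standing in for notes taken systematically from a source.
# The wording of the source is kept as written, ambiguities included.
notes = [
    {"person": "A", "occupation_as_written": "tailor (also sells cloth)"},
    {"person": "B", "occupation_as_written": "merchant"},
    {"person": "C", "occupation_as_written": "none declared"},
    {"person": "D", "occupation_as_written": "tailor"},
]

# Two hypothetical coding schemes. Each maps the verbatim wording to a
# category, so every categorization choice is written down and can be contested.
scheme_by_sector = {
    "tailor (also sells cloth)": "craft",
    "merchant": "commerce",
    "none declared": "unknown",
    "tailor": "craft",
}
scheme_by_number_of_activities = {
    "tailor (also sells cloth)": "several activities",
    "merchant": "single activity",
    "none declared": "no activity declared",
    "tailor": "single activity",
}


def recode(rows, scheme):
    """Count rows per category under one coding scheme, leaving the rows untouched."""
    return Counter(scheme[row["occupation_as_written"]] for row in rows)


by_sector = recode(notes, scheme_by_sector)
by_activities = recode(notes, scheme_by_number_of_activities)
print(dict(by_sector))
print(dict(by_activities))
```

Because the notes themselves are never overwritten, a third scheme prompted by a new question can be added without returning to the archive, and a wording missing from a scheme raises an error instead of being silently dropped.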
We thank the editors of Capitalism for the rare opportunity to write about teaching, as well as the participants in a workshop held at the University of Pennsylvania in January 2020 and two anonymous reviewers for important suggestions on previous versions of this article. Finally, we thank Valentine Leÿs and Michelle Niemann for improving our English.
5. For positions similar to ours in economic and business history, see Lamoreaux, "Future of Economic History," and Rosenthal, "Seeking a Quantitative Middle Ground."
6. In addition to introductory courses, we have been running a research seminar in Paris (attended by students and colleagues based elsewhere as well) for fifteen years. We do not cultivate a fascination with the tools themselves, but rather promote the intellectual process of quantification. Our seminar is open to students working on any topic and period (and to more advanced colleagues who wish to join) in history and adjacent disciplines, and fosters a spirit of mutual advice. See also Lemercier, "Teaching with our Book."
7. In addition to our book, see Karila-Cohen et al., "Quantitative History." The fact that the co-editors are four women is not incidental for us. Freeing quantitative history from its gendered connotations is part of our more general effort to de-standardize it.
11. Even though the timing and the main tenets of the first wave (as summed up by Sewell in "Political Unconscious," for example) were commensurate in most countries, there were many differences in topics and methods. We give our own account in Lemercier and Zalc, Quantitative Methods, chapter 1.
13. The works cited in our book are listed here: https://www.zotero.org/clairelemercier/items/collectionKey/Y6DGTTKB We are aware of the limitations of our knowledge, esp. as regards research published neither in English nor in French, and hope to be able to expand this list.
17. Time Machine Europe, "Venice Time Machine Project—Current State of Affairs" (September 28, 2019), https://www.timemachine.eu/venice-time-machine-project-current-state-of-affairs/; Davide Castelvecchi, "Venice 'time machine' project suspended amid data row," Nature (October 25, 2019), https://www.nature.com/articles/d41586-019-03240-w.
19. Indeed, Stone made some of these points in 1979 in "The Revival of Narrative."
20. For example, Marco van Leeuwen and Ineke Maas ask, in the abstract of their book Hisclass: "How, for instance, can manual work be distinguished from non-manual work? Skilled from non-skilled? And what did 'supervision' really mean?" Most historians, including ourselves, would answer that such questions can only have contextual answers.
25. Ruggles and Magnuson, in "History of Quantification," notice that when American historians stopped submitting quantitative papers based on historical data, academics in other disciplines (along with some European historians) replaced them.
26. Among the venues in which to explore these tools, in addition to chapters 4–7 of our book, one can consult the journal Poetics, where sociological research on arts and literature is published, and Franzosi and Mohr, "New Directions in Formalization."
31. It is this prevalence of lists of names, businesses, or texts over aggregate data in participants' research that has led us to emphasize methods other than time series analysis in our teaching and our book.
38. With other groups, we used the official biographies of members of the French Academies of Science, a French-language dictionary of the workers' movements (Maitron), and narratives of the Righteous Among the Nations presented on the official Yad Vashem website—sources that have equivalents in many countries and languages.
68. See, for example, how a sociologist and a mathematician built data on possible exchanges of information between captains from records of ship voyages kept by the East India Company: Erikson and Samila, "Networks, Institutions, and Uncertainty."
70. In addition to the classics of micro-history, see, for example, Zalc and Bruttmann, Microhistories of the Holocaust, and Stephenson, "'Real' Wages?" The latter is an important paper for cliometricians, as Brownlow points out in "Economic History," 362.
71. Cunfer, "Scaling the Dust Bowl." Similarly, Claveau, in "Bibliometric History," uses network visualizations to ask new questions, as much as to provide new answers, about the history of economic thought.
73. For some ideas for reading lists, see the category "Good Reads" on our blog (https://quanthum.hypotheses.org/category/good-reads).
74. Mercklé and Zalc, "Can We Model Persecution?" For a combined use of regression and multiple correspondence analysis, see also François and Lemercier, "Financialization French-Style."
75. Flandreau and Jobst, "The Ties that Divide." A recent introduction to network analysis for economic historians shows that the field is finally blossoming (many years after it did in sociology and in other parts of history, and in relative isolation from those): Geisler Mesevage, "Network Analysis."