
The One Based on 738,032 Words
Language Use in the Friends-corpus
Paulo Quaglio

The American sitcom Friends (1994–2004) was one of the most successful shows produced by the network NBC. During its ten-year run, it received numerous awards and nominations, and was considered one of the most popular sitcoms all over the world. The show is about a group of three single males and three single females, all in their twenties when the series begins. These friends share their life experiences (and, in some cases, apartments) with each other, as each attempts to find success and happiness on professional and personal levels. Most of the friends’ interactions take place either at their homes or at Central Perk, a coffee house in New York City. The situations experienced by the characters in the show do not occur in a social vacuum; rather, the plots incorporate a series of relevant social topics, such as same-sex marriage, artificial insemination, surrogate mothers, age difference in romantic relationships, loyalty among friends, and, simply, friendship. The popularity of the show secured its status as a popular cultural artifact, and as such the show has been the object of academic research (for example, Quaglio 2009; Tagliamonte and Roberts 2005). In a previous study (2009), I concluded that the language of Friends shares most of the core linguistic features of natural conversation and can thus be used as a fairly accurate representation of natural conversation. In fact, video clips of the show with transcribed dialogues have even been part of several ESL/EFL (English as a Second/Foreign Language) courses in the United States and abroad. As Jennifer Rey (2001) states, the language of television dialogue is a representation of the perception that scriptwriters have of actual conversation, and this may have been one of the factors contributing to the success of the show.
This chapter introduces corpus linguistics, a fascinating field of study within applied linguistics. We will learn about the techniques used by corpus linguists in the investigation of language use and will learn how to create our own corpus (the data used in corpus linguistics) and do our own investigations. In this chapter, all of the examples of television dialogue come from the sitcom Friends, and the analyses are based on a corpus of all of the dialogue from the series’ ten seasons: a total of 738,032 words.

Corpus: Definition and Collection

Lynne Bowker and Jennifer Pearson (2002) define a corpus as “a large collection of authentic texts that have been gathered in electronic form according to a specific set of criteria” (9). When it comes to creating corpora (plural of corpus) based on television series, we’re in luck: there are many websites dedicated to providing episode transcripts, such as those referred to in many of the chapter examples throughout this volume. If no episode transcriptions of the target series are available online, then the corpus-building process would start with careful listening and transcribing. The transcriptions, whether copied from a website or self-created, would then need to be saved into a specially created folder on the computer. The corpus compiler decides how much material to include from a television series, but however targeted the corpus is, it could be a good idea to save individual episodes in separate files, grouped by seasons (if relevant), which would allow for easy comparisons of episodes or seasons, in turn allowing for a diachronic study (see, for example, Bednarek [2011a; 2011b], in which the author compiled and compared seasonal corpora of Gilmore Girls). It is important to mention that the collection of data available online, especially transcriptions of spoken language, must be checked for accuracy. In other words, the data must be ‘cleaned’ before analyses are conducted.
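To make the folder layout and the cleaning step concrete, here is a minimal Python sketch. It is not the procedure used to build the Friends-corpus. The directory layout (one folder per season, one plain-text file per episode), the function names, and the assumption that transcriber notes appear in parentheses or square brackets are all hypothetical choices made for illustration. Automatic cleaning of this kind would supplement, not replace, checking the transcripts against the videos.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed (hypothetical) layout: corpus/season01/episode01.txt, and so on.
# Assumed convention: transcriber notes appear in (...) or [...].
SCENE_NOTE = re.compile(r"\([^()]*\)|\[[^\[\]]*\]")
# A simple word pattern that keeps contractions such as "it's" together.
TOKEN = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")


def clean_transcript(text: str) -> str:
    """Remove bracketed transcriber notes and collapse leftover spaces."""
    without_notes = SCENE_NOTE.sub(" ", text)
    return re.sub(r"[ \t]+", " ", without_notes).strip()


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return [token.lower() for token in TOKEN.findall(text)]


def words_per_season(corpus_dir: str) -> dict[str, int]:
    """Count word tokens in each season folder of the corpus directory."""
    counts: Counter[str] = Counter()
    for episode in sorted(Path(corpus_dir).glob("*/*.txt")):
        text = episode.read_text(encoding="utf-8")
        # The parent folder name (e.g. "season01") serves as the season label.
        counts[episode.parent.name] += len(tokenize(clean_transcript(text)))
    return dict(counts)
```

Calling `words_per_season("corpus")` on such a folder would return one word count per season, which is the kind of grouping that makes season-by-season comparison straightforward. Note that this sketch leaves speaker labels (for example, a character name followed by a colon) in place, so they are counted as words; a fuller cleaning pass would need to decide how to treat them.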
For example, once the Friends-corpus was collected from a fan website in my 2009 study, the transcripts were compared against the show videos. Typos and transcription inaccuracies were fixed, and scene descriptions provided by the transcribers (for example, The intercom buzzes) were eliminated.

Does Size Matter?

Corpora today typically have several million words (such as the American National Corpus [ANC], the British National Corpus [BNC], and the Corpus of Contemporary American English [COCA]). The need for large corpora will depend on the research question. For example, if we want to find information on the frequency and use of a particular word such as unlike (which can be an adjective, a preposition, or...

