What I say, or how I say it? Ethnic accents and hiring evaluations in the Greater Toronto Area
This study investigated accent bias against job applicants with extralocal (non-Canadian) English accents in the Greater Toronto Area. Verbal guises recorded by British, Chinese, German, Indian, Jamaican, and Nigerian women and by Canadian women with at least one parent from these countries were evaluated by forty-eight human resources students, who rated the content of job interview responses and the candidates’ ‘expression’ and ‘employability’, determined what job they should be interviewed for, and provided commentary. Canadian voices were especially privileged in comments on speech. Quantitative analysis of responses reflected bias against extralocal voices. Consequently, we provide recommendations for relevant stakeholders.*
Keywords: linguistic discrimination, discrimination policy, immigrant experience, hiring, Toronto, accent
Supplementary material: http://muse.jhu.edu/article/930312
1. Introduction
The Ontario Human Rights Commission (OHRC), an arm’s-length agency of the Government of Ontario, is mandated under the Ontario Human Rights Code (1990, Revised Statutes of Ontario c. H.19) ‘to prevent discrimination and to promote and advance human rights’ in the most populous province of Canada. The OHRC’s policy on discrimination and language acknowledges that one’s language and accent are linked inextricably to one’s ‘ancestry, ethnic origin or place of origin’ (OHRC 1996:4). Despite this, the OHRC does not explicitly include language (or any linguistic criteria) in its grounds for discrimination. Perhaps this is because ‘[t]here is no figure to indicate how prevalent discrimination based on accent is in Ontario’ (Sathiyanathan & Xing 2018). Nevertheless, there is anecdotal, experimental, and legal evidence of prejudice against extralocal (i.e. non-Canadian/foreign) accents of immigrants in Canada (e.g. Creese & Kambere 2003, Kalin & Rayko 1978, Munro 2003). These findings indicate that, regardless of whether newcomers speak one of Canada’s official languages (English and French) as an additional language or as a first language (with an extralocal accent), this bias affects many domains of immigrants’ lives, from their social lives to access to housing to employment.
At the same time, the government of Canada hopes to welcome almost 1.5 million newcomers to Canada between 2023 and 2025, almost 60% of whom are expected to be economic immigrants (Fraser 2022:34), that is, ‘those selected for their ability to contribute to Canada’s economy’ (Statistics Canada 2019). In 2020, almost one third of such economic immigrants were from the Federal Skilled Workers (FSW) category (Immigration, Refugees and Citizenship Canada 2021), a program for ‘skilled workers with foreign work experience who want to immigrate to Canada permanently’ (Government of Canada 2021). The province of Ontario is the intended destination of almost 70% of admitted immigrants under the FSW, Canadian Experience Class, and Federal Skilled Trades programs (Immigration, Refugees and Citizenship Canada 2021). If extralocal accents, especially those associated with racialized individuals, are penalized during career recruitment, this can profoundly affect these immigrants’ satisfaction and quality of life. It is therefore important to investigate perceptions of extralocal accents, an area that ‘merits attention’ (Creese & Kambere 2003:566). This is a gap that the present research intends to fill.
This study focuses on the city of Toronto and the Greater Toronto Area (GTA). Toronto is one of Canada’s most multicultural cities (Anora 2019) and is immigrants’ top destination in Canada (Statistics Canada 2017a). Despite this, there is a lack of research exploring accent discrimination in hiring in the GTA. One may wonder if the city’s linguistic diversity fosters acceptance of extralocal accents. However, results from resume and housing audit studies discussed below indicate otherwise. In the context of economic migration, it is important to know what challenges newcomers face when applying for jobs. To partially address this, this study intends to answer the following overarching questions about the GTA: If language proficiency is controlled for, are candidates with extralocal accents penalized in the ratings and comments they receive in hiring evaluations when compared to candidates with a local Canadian English accent? Are different racialized accents similarly penalized? What variables influence listeners’ ratings of candidates? Can an ethnoracial hierarchy be inferred based on preference of accents linked to birthplace and origin group? Among a participant pool of soon-to-be human resources professionals, we show that (among other factors) speaking with an extralocal accent modulates the evaluation and perception of job applicants.1 These results add to the growing evidence of institutionalized linguistic discrimination in the province of Ontario, and, among other recommendations, we suggest that the OHRC explicitly take up language-based grounds for discrimination under their mandate.
2. Language discrimination in Canada
The requirements of the FSW program lead many immigrants to focus on improving their English proficiency in anticipation of arriving in Canada.2 In a first step toward immigration to Canada, candidates are assigned an overall score according to criteria from the Comprehensive Ranking System (Government of Canada 2022); only those with the highest scores are invited to apply for permanent residence. One gains core points based on age, level of education, amount of Canadian work experience, and level of proficiency in either of the two official languages, as determined by standardized assessment testing (e.g. the International English Language Testing System (IELTS)); additional points are awarded based on other factors. One hundred points (out of a 1,200-point system) are allocated to ‘skill transferability factors’. Here, one can gain fifty points for having a postsecondary degree and either high official-language proficiency in reading, writing, speaking, and listening or Canadian work experience. Fifty points can also be awarded for having foreign work experience and either high official-language proficiency in reading, writing, speaking, and listening or Canadian work experience. Many highly educated workers with foreign work experience do not have Canadian work experience; therefore, achieving a high English proficiency score is an attractive way to earn these one hundred points. As a result, many immigrants take English language assessment tests repeatedly to earn a high enough score to increase their points.
The bonus points assigned for high official-language proficiency paired with foreign work experience or a postsecondary degree may suggest to immigrants that having high English proficiency will position them competitively (relative to Canadian English speakers) in finding comparable work in Canada. However, Creese and Wiebe (2012:64) point out ‘contradictions’ between immigration policy and reality: highly skilled newcomers are often unable to find comparable work in Canada. Foreign credentials, non-Canadian work experience, and even non-English names are ‘undervalued or devalued’ on the Canadian job market (Colour of Poverty—Colour of Change 2019). Correlated with this, university-educated immigrants earn 20% less than their Canadian-born peers (Conference Board of Canada 2017a). Moreover, there is a well-documented racial wage gap in Canada: on average, ‘university-educated Canadian-born members of a visible minority earn … 87.4 cents for every dollar earned by their Caucasian peers’ (Conference Board of Canada 2017b). It is unsurprising, therefore, that ‘nonracialized immigrants do better in the Canadian labour market, and sooner, than racialized immigrants do’ (Block et al. 2019:14) and that this ‘income inequality extends to the second and third generations’ (ibid., p. 5).
According to Picot and Sweetman (2012:26), much of this wage gap can be explained by ‘[d]ifferences in French or English language ability between immigrants and the Canadian-born’ (our emphasis). However, they do not specify what ‘language ability’ means. Language tests for the FSW program evaluate speaking, reading, writing, and listening, but it is unclear which of these factors pose the biggest challenges for immigrants and what about these factors is perceived as deficient. Nevertheless, the authors conclude that ‘a continued focus on language ability is essential to improving economic outcomes’ (Picot & Sweetman 2012:26).
Findings from anecdotal, experimental, and legal reports indicate that another serious barrier for immigrants to Canada in many facets of life is language discrimination. Creese and Kambere (2003:565–66) discuss anecdotal reports of accent discrimination in employment with respect to the experiences of African immigrant women in Vancouver, British Columbia. These women report that their accents affected people’s perceptions of their language fluency and competence and that this was used to justify barring them from employment. The authors additionally note that British and Australian English accents ‘do not seem to elicit the same treatment’ (Creese & Kambere 2003:566), which points to the differential treatment of accents that may be associated with inner versus outer circle Englishes (see Kachru 1985), which themselves can be indexical of ethnoracial categories. In a later study in the same city involving both men and women from sub-Saharan Africa, Creese (2010:300) notes that ‘fluent English speakers were just as likely as those who learned English after migration to identify accent discrimination as a central feature of life in Canada’. Here, accent was identified as a barrier in education, the job market, and daily interactions (Creese 2010:301). Accent is similarly viewed as ‘a labour market obstacle’ in Branker’s (2017:213) study on Caribbean immigrants in Toronto, Ontario. Notably, the situation is not limited to racialized immigrants. For example, Vujinovic (2017) reports on three women from Southeast Europe who viewed their accents as obstacles in their professional life in locations across Canada.
Accent discrimination has also been anecdotally reported in housing applications. For instance, in Mensah & Williams 2013, Somali and Ghanaian immigrants in Toronto admitted to asking others to phone landlords on their behalf or to changing their accent when calling prospective landlords, indicating that they anticipated hesitance to rent to them based on their accent. Racialization may even play a role. In Dion 2001, Somali and Jamaican newcomers in Toronto perceived greater accent/language discrimination than Polish newcomers when seeking housing (cf. Purnell et al. 1999).
Immigrants also report experiences of accent discrimination in more general domains. For instance, Kayaalp (2016) reports on Turkish immigrants in Vancouver, who attest to being excluded from social circles due to their accent. Additionally, Derwing’s (2003:555) interviews with 100 adult ESL students from various backgrounds living in Edmonton, Alberta, revealed that around one third believed they had experienced accentism (i.e. accent prejudice) and that this was more common for racialized people; 53% of respondents agreed that Canadians would respect them more if they ‘pronounced English well’ (Derwing 2003:554). This points to accent as a critical point of contention. In summary, anecdotal accounts suggest that accent prejudice affects all aspects of immigrants’ lives, from employment to housing to their daily life.
While anecdotal evidence is abundant, there is limited experimental evidence that language is a barrier for immigrants to Canada in terms of employment. Kalin and Rayko (1978) had 203 English-speaking Canadian undergraduate students from Queen’s University in Kingston, Ontario, evaluate thirty-second clips of ten candidates for four jobs of varying social status: industrial plant cleaner, production assembler, industrial mechanic, and foreman. Candidates had Italian, Greek, Portuguese, West African, Slovak, or local Canadian English accents. Speakers with extralocal accents were rated lower than those with local accents and received higher ratings for lower-status jobs (Kalin & Rayko 1978:1206). No statistically significant differences were found among ratings of the extralocal accents in this study (Kalin & Rayko 1978:1208). However, in a later study, Kalin et al. (1980) used sixteen speakers with South Asian, German, Caribbean, and local Canadian accents and had sixty-four students rate them for the same jobs. This time, an ethnic hierarchy emerged, where the order of preferred accents for the highest-status job was local Canadian, German, South Asian, and then Caribbean; this order was reversed for the lowest-status job. Thus, the Caribbean accent was consistently devalued, despite English being the first language of this group.
Correspondence studies, that is, experiments where fictitious resumes are sent to real job openings to determine if there is bias in the selection of candidates, have also been useful in showing language discrimination in employment applications. Veit and Thijsen’s (2021:1299) recent cross-national correspondence study on Western European employers also found a hierarchy of preference for job applicants based on ‘place of birth and origin group’, which ‘varie[d] between countries’. Although that study was based on artificial written applications sent to real jobs and not on accent per se, it still points to the possibility of an ethnic hierarchy in hiring decisions, which accent can index. In a similar correspondence study set in the GTA, Oreopoulos (2011) submitted thousands of resumes online in response to job advertisements. Applicants’ names were either English, Chinese, Indian, or Pakistani; their degrees and work experience were either Canadian or foreign. Results showed substantial name discrimination, with a preference for individuals with English-sounding names. Country of education and experience were also the subject of discrimination.
Empirically, language discrimination has been shown to extend beyond the workplace to housing. The Centre for Equality Rights in Accommodation (2009) conducted a housing discrimination telephone audit across 417 apartment listings in Toronto. Results showed that almost 25% of South Asian men (identifiable by accent and name) and Black single mothers (identifiable by a Caribbean accent) experienced major barriers to accessing housing. In a later housing audit via telephone and email, which used 1,370 pairs of individuals who differed by one characteristic, it was found that housing applicants in Toronto faced discriminatory treatment if they disclosed newcomer status, and this was further compounded if their name or accent was linked to a racialized group (Canadian Centre for Housing Rights 2022).
Finally, there is legal evidence of accent discrimination in employment and housing, shown in Munro’s (2003) review of cases tried by the Canadian Human Rights Commission. Immigrants of Chilean and Iranian backgrounds in British Columbia (BC) all received favorable rulings when it was proven that they were discriminated against based on ancestry or place of origin, with language used as a proxy for this discrimination. In another instance, a Polish substitute teacher in BC won a case due to being denied work specifically because of his accent, when it was not a bona fide occupational requirement and it did not impede his job.
From this overview, we see that immigrants across Canada and from various backgrounds experience challenges in enriching their social lives and accessing housing and quality employment due to accentism. Experimental and legal evidence highlights the fact that these barriers are often due to people’s prejudices against language and accents, which is a challenge faced by both racialized and nonracialized immigrants, albeit to varying degrees. Picot and Sweetman’s (2012) finding that immigrants’ official-language proficiency can affect their earning potential does not negate the fact that listeners can and do exhibit bias (Kubota 2001, Lippi-Green 2012, Shuck 2004). The situation surrounding language and the workplace is complex and merits further investigation.
This study considers factors such as racialization and an ethnolinguistic hierarchy in its examination of potential accent bias in hiring evaluations in the GTA. We focus on self-identified women for two reasons. First, we focus on a single group for the practical reason of reducing the number of factors involved in the analysis. Second, we focus specifically on women, as a recent report from the Toronto Region Immigrant Employment Council (2022) notes that immigrant women earn less than immigrant men and the Canadian-born population. Of the 365 educated and experienced immigrant women surveyed for that report, 34% cited discrimination in the workplace based on language or accent, and 40% were in lower-level positions compared to their jobs before immigrating. Furthermore, 35% of the 608 hiring managers surveyed cited ‘English language skills’ as a reason they did not hire racialized immigrant women; 17% had not even interviewed racialized immigrant women in the last year. We note that these results are consistent with Eckert’s (1989) argument that women are more status-bound than men (and therefore more likely to be evaluated/judged on the basis of their presentation of self, including their voice).
3. Methodology
To investigate accent bias in the GTA, we make use of the verbal guise experimental paradigm. This technique involves presenting participants with audio recordings of set scripts spoken naturalistically by different people. After listening to a voice, participants are then asked questions about, or asked to evaluate, the person they heard.3
In this experiment, scripted answers to three different questions that one might be asked in any job interview were created by the first author.4 These questions were (1) ‘What makes you a good employee?’, (2) ‘How do you handle pressure?’, and (3) ‘Are you a leader or a follower?’. For each question, two responses were crafted.5 One was intended as a ‘good’ response and the other was intended as a ‘bad’ response. Answers were written so that formality, English lexical and grammatical proficiency, and the length of each response were kept as consistent as possible. We have included the full text of the six responses in the appendix.
Twelve individuals were recruited through our personal and extended networks to be recorded speaking these responses. Throughout, we refer to these individuals and their recordings as our voices or candidates to differentiate them from the other study participants who evaluated the voices. The twelve voices were all self-identified women who differed in their ethnonational background and Canadian immigration generation. Six voices were born and raised outside of Canada, in China, Germany, India, Jamaica, Nigeria, and the UK, respectively. The other six were born and raised in Canada but differed in that each had at least one parent from one of these countries. This results in a six-by-two design: six ethnonational backgrounds (Chinese, German, Indian, Jamaican, Nigerian, and British) and two localities: born and raised outside of Canada (‘extralocal’) and born and raised in Canada (‘local’).6 The choice of ethnonationalities here was to represent and compare reactions to nonracialized women (British and German) to reactions to women from Canada’s largest racialized groups (Statistics Canada 2022): South Asian (India), Chinese, and Black (Jamaican and Nigerian). Each of the twelve voices met with the first author over Zoom (Zoom Video Communications Inc. 2021) in July 2021 and February–March 2022 and was briefly trained to produce each response as naturally as possible. Audio recordings of multiple takes were made using Zoom’s native recording function. A final edited take was stitched together from these multiple takes. For each sentence in a response, the authors determined the clearest and most natural take to include, and idiosyncratic fillers, hesitations, and audible breaths were removed. Final tracks were normalized via Audacity (Audacity Team 2021) to a maximum amplitude of −1 dB in order to make each version of each response as consistent as possible across voices.
The experiment was created and hosted on the Gorilla Experiment Builder (http://www.gorilla.sc/; Anwyl-Irvine et al. 2020) and was conducted under the approval of the research ethics boards of five universities and colleges in the GTA. Data were collected between March 22, 2022, and June 1, 2023. We refer to the individuals who rated the voices in this study as the listeners. Listeners were students enrolled in a human resources course at a university or college in the GTA. Potential listeners were informed of the study via their course instructor or program coordinator. All listeners had previously completed a course in recruitment and selection and thus had some level of training in selection of qualified employees. Their ages ranged from eighteen to fifty-four years (μ = 24.33, σ = 7.50). Thirty-seven (77%) self-identified as female, and eleven (23%) self-identified as male. This is not unexpected given that women have higher enrollment rates at university and women are overrepresented in the field of human resources. Twenty-three listeners indicated their nationality as Canadian (48%). The second-largest nationality was Indian, with seven listeners (15%). There were also listeners from the Philippines (n = 3), China (n = 2), Vietnam (n = 2), Albania (n = 1), Bangladesh (n = 1), Brazil (n = 1), Honduras (n = 1), Indonesia (n = 1), Nigeria (n = 1), Somalia (n = 1), South Korea (n = 1), Syria (n = 1), and Taiwan (n = 1). One listener identified their nationality as Southeast Asian. Listeners were also asked to rate their own English proficiency on a scale of 1–5 (5 being ‘expert’): two chose 3, eleven chose 4, and thirty-five chose 5.
During recruitment and at the start of the experiment, including during the informed consent procedure, listeners were told that the purpose of the research was to identify ‘optimal approaches to interviewing over the phone’; the true purpose was revealed after evaluations were complete but before listeners decided to submit their responses. After consenting and providing demographic information, each listener was randomly assigned to one of forty-eight presentation lists. We adopted a between-subjects, partial Latin square design: each list included all six responses (three questions each with two distinct responses). Each list consisted of the local and extralocal voices from three of the six ethnic backgrounds; within each list, voices were presented in random order. Each of these six voices read exactly one of the six responses, so no response or voice was heard twice by a single listener. Each of the forty-eight lists was completed by one listener. In all, each voice was evaluated twenty-four times in total, four times for each response. An example of a list is provided in Table 1.
Table 1. Example experiment design.
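The counts stated for this design (forty-eight lists, six trials per list, each voice heard twenty-four times and four times per response) can be checked with a short sketch. The construction below is only one hypothetical way to build lists that satisfy those constraints; the choice of background triples (`TRIPLES`), the cyclic rotation of responses, and all names in the code are assumptions for illustration and are not taken from the study materials. Random presentation order within a list is omitted.

```python
from collections import Counter

BACKGROUNDS = ["British", "Chinese", "German", "Indian", "Jamaican", "Nigerian"]
LOCALITIES = ["extralocal", "local"]
# Six responses: three questions, each with a 'good' and a 'bad' version.
RESPONSES = [f"Q{q}-{version}" for q in (1, 2, 3) for version in ("good", "bad")]

# Hypothetical choice: four triples of backgrounds, each used together with
# its complement, so that every background appears once per triple pair.
TRIPLES = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]


def build_lists():
    """Return 48 lists, each a sequence of six (voice, response) pairs."""
    lists = []
    for triple in TRIPLES:
        complement = tuple(i for i in range(6) if i not in triple)
        for group in (triple, complement):
            # Local and extralocal voices from three backgrounds: six voices.
            voices = [(BACKGROUNDS[i], loc) for i in group for loc in LOCALITIES]
            # Cyclic 6x6 Latin square: across the six shifts, every voice
            # reads every response exactly once.
            for shift in range(6):
                lists.append(
                    [(voices[j], RESPONSES[(j + shift) % 6]) for j in range(6)]
                )
    return lists


lists = build_lists()
# How often each voice is heard, and how often with each response.
per_voice = Counter(voice for lst in lists for voice, _ in lst)
per_voice_response = Counter(pair for lst in lists for pair in lst)

print(len(lists), "lists;", sum(len(lst) for lst in lists), "trials in total")
print("evaluations per voice:", set(per_voice.values()))
print("evaluations per voice and response:", set(per_voice_response.values()))
```

The arithmetic behind the check is 48 lists × 6 trials = 288 evaluations, which gives 288 / 12 = 24 per voice and 24 / 6 = 4 per voice-response pairing.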
During the experiment, the listener would first see the written interview question on their screen and was then asked to click a button to listen to the audio-recorded response once. Following this, they were required to answer seven questions before being allowed to proceed to the evaluation of the next voice. These questions and their answer options are presented in Table 2. Specifically, listeners were asked (i) to rate the ‘content’ of the candidate’s response on a scale from 1 (‘not suitably answered’) to 7 (‘suitably answered’), (ii) to rate the candidate’s ‘expression’ on a scale from 1 (‘difficult to understand’) to 7 (‘easy to understand’), (iii) to provide any comments on the candidate in open-ended textboxes, (iv) to provide any advice for the candidate in open-ended textboxes, (v) to rate the ‘employability’ of the candidate on a scale from 1 (‘undesirable employee’) to 7 (‘desirable employee’), (vi) to recommend a job for the candidate to be interviewed for from the following fixed options: ‘sales clerk’, ‘customer service manager’, ‘data-entry clerk’, ‘database manager’, or ‘no recommendation’ (brief job descriptions were provided), and (vii) to rate their confidence in their assessment on a scale from 1 (‘not at all confident’) to 7 (‘completely confident’).
Table 2. Questions and answer options per voice.
When all six trials were completed, the listener was debriefed and informed that the true purpose of the experiment was not to evaluate optimal interview strategies, but to understand the effect of ethnoracial accents on human resource professionals’ evaluations of potential job candidates. Once debriefed, listeners were given the opportunity to revoke their consent to participate or to proceed and submit the data. In accordance with Article 3.7B of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans—TCPS 2 (2018)7 and our institution’s research ethics board, we offered withdrawal of consent after the debriefing, since deception was used. Only one individual opted to withdraw from the study after the debriefing. Listeners were compensated with course credit, $10 (CAD), or entry into a prize draw for $200 (CAD). Listeners tended to complete the task within twenty minutes.
Data analysis was conducted in two phases. The first involved deductive and inductive thematic analysis of the remarks supplied in open-ended comments and advice to candidates using NVivo 12 (Lumivero 2017). The first author coded all remarks under four broad topics: ‘positive comments on personality’, ‘positive comments on speech’, ‘negative comments on personality’, and ‘negative comments on speech’. A research assistant was trained to categorize the comments this way, and then independently categorized all remarks. Intercoder reliability was 71%. Discrepant choices were discussed by the first author, research assistant, and second author until we all agreed on categorizations. Following this, the first author coded the remarks within each topic more specifically and then grouped them into themes (details discussed below). In the second phase, quantitative analysis and data visualization were conducted using R (R Core Team 2021) and RStudio (version 1.4.1106; RStudio Team 2021) using the ‘tidyverse’ (Wickham et al. 2019) and ‘party’ (Hothorn et al. 2006) packages. Data were subjected to conditional inference tree modeling and random forest analysis (details explained below).
Through this mixed-methods approach, we attempt to answer the following specific research questions:
(i) What patterns emerge from the open-ended comments/advice given to the voices?
• We expected qualitative analysis of comments to provide insight into patterns that quantitative analysis may be unable to capture.
(ii) Are listeners sensitive to the intended quality of the response (i.e. ‘good’ versus ‘bad’ versions)?
• This question addresses the ‘what I say’ part of our title. We controlled for formality and lexical, pragmatic, and grammatical proficiency and created ‘good’ and ‘bad’ responses with the expectation (or hope) that listeners would be sensitive to the content of the answers in a job interview.
(iii) Are extralocal (i.e. non-Canadian) accents more likely than local accents to be penalized in terms of content, expression, and employability ratings or job recommendations? If so, does an ethnic hierarchy emerge?
• This question addresses ‘how I say it’. Based on prior findings of differential treatment of extralocal and local voices in other domains and/or locations, we expected non-Canadians, especially those who are racialized, to receive lower content, expression, and employability ratings and to be more often recommended for lower-level jobs or no jobs at all.
(iv) Which other variables influence the evaluations that listeners make for content, expression, employability, and job recommendations?
• Given that this is an exploratory study, we evaluated the impact of other variables on the different evaluations.
4. Qualitative findings
This section focuses on patterns in the comments and advice for each candidate that listeners left in open-ended textboxes. We categorize these comments and advice as related to either a candidate’s speech or a candidate’s personality (i.e. anything about the candidate/voice that is not about their language/way of speaking) with the qualifiers ‘positive’ and ‘negative’. These qualifiers are meant to represent the listeners’ perspectives: they represent whether listeners seemed to believe they were making positive or negative comments. Many listener responses contain more than one comment (e.g. a positive comment about speech but a negative comment about personality), so we analyzed 357 distinct comments in total.
4.1. Positive comments on personality
There were ninety-six positive comments on the personality of the voices. The distribution of these comments across the voices is shown in Figure 1, with extralocal voices in dark gray and local Canadian voices in light gray. An initial look shows encouraging results for immigrants: the Jamaican, Indian, and British extralocals all received more positive comments on personality than their Canadian counterparts.
Figure 1. Counts of positive comments on personality by ethnicity and locality.
In these comments they are called ‘intelligent’, ‘calm’, ‘confident’, and ‘team players’, as seen in examples 1a–c.8
(1)
a. Seems intelligent and calm when going through difficulty (J1, participant ID:7376741)
b. I liked their confidence (B1, participant ID:6128193)
c. Demonstrated a team-player skillset that can be useful in a variety of settings (I1, participant ID:6161416)
The extralocal Jamaican was described as ‘confident’ and a ‘team player’ more than all other voices. However, the German, Nigerian, and Chinese extralocals received minimal positive comments on their personality compared to their Canadian counterparts. Only nine of the ninety-six positive comments are directed at these voices. However, binomial tests revealed no statistically significant difference in the number of comments between locals and extralocals in general, nor within any specific ethnic group.
4.2. Negative comments on personality
There were only forty-four negative comments on the personality of the candidates. Here, local Canadian voices received more negative comments on personality than extralocal voices did; 64% of the negative comments were directed at local voices. As visualized in Figure 2, this trend is consistent across five of six ethnic groups. That said, these differences are mostly marginal, except for the Nigerian voices. A binomial test confirms that the number of negative comments on personality toward local voices was not statistically significantly higher than the number toward extralocal voices (p = 0.096).
Figure 2. Counts of negative comments on personality by ethnicity and locality.
While assessing the most common themes for the negative comments on personality, it became clear that there was an effect of which response was given. Table 3 presents the top four themes that emerged for the negative comments on personality, with tallies for each based on the answer supplied. In the table, ‘Q1’, ‘Q2’, and ‘Q3’ refer to the questions asked (outlined in §3). Examination of Table 3 shows that all but one of the comments about coming off as superior or arrogant, being overconfident, not being a team player, and being a workplace liability occurred when a bad response was given (see examples 2a–d below). Conversely, comments on being underconfident occurred on good responses (example 2e). A binomial test showed that the number of negative comments on personality for bad responses was statistically significantly higher than that for good responses (p < 0.001). It seems, therefore, that listeners were reacting to the intentionally negative aspects of the bad responses but did so slightly more for local voices.
Distribution of themes from negative comments on personality by response version.
(2)
a. Don’t come off as cocky, her answer seems as if she is better than other employees (1B, participant ID:6480707) (n. 9)
b. Don’t be overconfident (1B, participant ID:6192406)
c. Seems like she is not good at teamwork (2B, participant ID:7995487)
d. did not seem professional (2B, participant ID:6127498)
e. be a bit more confident during the interview (1G, participant ID:6368597)
4.3. Positive comments on speech
Figure 3. Counts of positive comments on speech by ethnicity and locality.
There were ninety-seven positive comments on the speech of the voices. Local Canadian voices received more of these than extralocal voices did: 66% of the positive comments went to local voices. In Figure 3, positive comments on the speech of the Indian, Jamaican, and British extralocals (who had more positive personality comments than their Canadian counterparts) are relatively close to those for their Canadian counterparts. In fact, the British extralocal still has more positive comments than her Canadian counterpart. However, again, the Chinese, German, and Nigerian extralocals are disadvantaged in these comments. They received the fewest positive comments on speech compared to their Canadian counterparts. A binomial test indicates that the number of positive comments about the speech of local voices was statistically significantly higher than the number of positive comments about the extralocal voices (p = 0.002). This effect was clearly driven by the contrast within the German and Chinese groups (each independently statistically significant) and to some degree the Nigerian group.
The top five themes from the positive comments on speech are presented in Table 4, with tallies for each based on the voice heard. All of the local voices, along with J1, received the greatest number of positive comments on their speech. They were frequently praised for speaking clearly, having confidence, being well spoken and understandable, and having a good tone. Among the remaining extralocal voices, however, far fewer positive comments fall into the top five themes. The Nigerian extralocal voice received three comments on two of the characteristics, and the German extralocal voice received three comments on three characteristics. The Chinese extralocal voice received no comments on any of these attributes.
Table 4. Distribution of themes from positive comments on speech by voice. Note: B: British, C: Chinese, G: German, I: Indian, J: Jamaican, N: Nigerian; 1: extralocal voices, 2: local voices.
4.4. Negative comments on speech
Negative comments on candidates’ speech were the most numerous type of comment in the data, with 120 comments total. Seventy-eight percent of these comments were directed at extralocal voices. The number of negative comments about speech directed toward extralocal voices was statistically significantly higher than the number toward local voices, as determined by a binomial test (p < 0.001). Figure 4 reveals that this pattern was consistent across all ethnic groups (note the change in size of the y-axis here in comparison to Figs. 1–3). Strikingly, though, 51% of the negative comments directed at extralocal voices were directed at the Chinese voice. Notably, the Jamaican and Indian extralocal voices enjoy a low number of negative comments on speech relative to the other extralocal voices, much closer to the local voices. Not apparent in Fig. 4 is another important pattern: local voices mostly received negative comments on their speech when listeners heard intentionally bad responses, but extralocal voices received negative comments on their speech on both good and bad responses.
Of particular interest to the present study is that, of 120 negative comments, just three referred specifically to candidates’ ‘accents’ (one comment each to B1, C1, and N1). Initially, this gives the impression that the training in recruitment and selection that listeners had previously received has successfully mitigated bias against extralocal accents. However, close examination of the top five themes from the negative comments on speech is quite revealing, as is the content of some of these comments.
The most popular theme was ‘not speaking clearly’ (n = 29). Listeners commented that responses contained slurred speech, or words were unclear, or even that the candidate’s accent would be a hindrance for interacting with customers (examples 3a,b). The second most popular theme was being ‘difficult to understand’ (n = 22). For instance, listeners said they could not understand or catch some of the details (examples 3c,d). More than half of the comments for these two themes were directed at the extralocal Chinese voice. Third was ‘speaking rate’ (n = 17); listeners commented that speakers needed to slow down to be better understood (examples 3e,f). Fourth was ‘enthusiasm’ (n = 9): candidates were advised to be less monotone and more enthusiastic (examples 3g,h). Finally, there were comments about having a ‘bad tone’ (n = 8). These were comments about sounding rude, abrupt, or unprofessional (examples 3i,j).
Figure 4. Counts of negative comments on speech by ethnicity and locality.
(3)
a. She had some slurring in her words (C1, participant ID:6154059)
b. … was a bit unclear, sometimes with her accent, which can be a hindrance while dealing with customers (N1, participant ID:6480818)
c. I could not understand a lot of what was being said (C1, participant ID:6129980)
d. But somehow I cannot catch some of the details of her answer (G1, participant ID:6378274)
e. maybe speak slower possibly so the interviewer can understand (C1, participant ID:7372525)
f. for more complicated terminology perhaps it would be best for her to articulate in a more slower manner to help listeners who may or may not be accustomed to listening to the distinct way of speaking (B1, participant ID:6154059)
g. Be less monotone (C1, participant ID:7372580)
h. Just a bit more enthusiasm would make this perfect (G1, participant ID:6376622)
i. Talks rather rude and abruptly (N2, participant ID:7997307)
j. I would advise her to have a professional tone of voice (B1, participant ID:6618769)
Reflecting on these five themes—not speaking clearly, being difficult to understand, speaking rate, enthusiasm, and tone of voice—we would argue that these are ultimately criticisms of one’s accent. In particular, the first three themes relate to comprehensibility, that is, how much effort is required to understand someone’s speech. In both the positive and negative comments on speech, what was valued was the need for clear speaking, being understandable, and having a good tone. It is clear who is assessed as (not) possessing these traits: local Canadian voices received 66% of the positive comments and extralocal voices received 78% of the negative comments, though there are differences across the different extralocal voices. We now turn to the quantitative findings to see if these biases are reflected in the ratings of and recommendations given to the voices.
5. Quantitative findings
Figure 5 shows overall scores given for content, employability, and expression in stacked bar graphs of raw responses. A little over half of the content and employability ratings and almost 75% of expression ratings were 6s and 7s (the high end of the scale). Overall, therefore, listeners’ evaluations were skewed toward positive ratings. This is also reflected in job interview recommendations, with only 25/288 (8.7%) decisions not to recommend a job interview. The remaining results explore the scores for content, expression, and employability and the job interview recommendations given to answer our research questions.
Figure 5. Overall scores for content, employability, and expression. The category that the horizontal bar crosses is the median response.
5.1. Content ratings
First, we focus on the ratings of each response’s content. Recall that all listeners heard the same six responses, which were crafted with the intention of having three ‘good’ and three ‘bad’ responses. Ideally, when asked to rate the content, individuals with recruitment and selection training like our listeners should focus on what is said and not on the various accents with which responses are delivered. To investigate this, we fitted a conditional inference tree (ctree) to see whether content scores were related to several independent variables. Ctrees are a nonparametric decision-tree modeling technique. They use binary, recursive partitioning to optimally split the data (according to the given levels of the predictor variables) into groupings (bins) that the model predicts will exhibit the same response variable value (Tagliamonte & Baayen 2012). The results can be visualized as a tree structure that straightforwardly reveals any complex quantitative interactions in the data when multiple predictors are modeled.
We examined the following thirteen independent variables: question asked, specific response (six options), response version (good/bad), voice heard (twelve options), locality (local/extralocal), ethnicity (six options), and listeners’ age, year group in school, years speaking English, self-identified English proficiency, gender, nationality, and race. The dependent variable was content score. Notably, ctree models can effectively handle the inclusion of nonorthogonal (i.e. correlated or overlapping) predictors (e.g. response version and specific response or voice heard and locality).
Figure 6. Effect of thirteen independent variables on content ratings.
The resultant ctree can be viewed in Figure 6. First, which response is heard is the most important variable (p < 0.001). The bad responses to questions 1 and 2 (1B and 2B) group together on the right branch of this tree, while the good responses for all three questions (1G, 2G, 3G) along with the intended bad response for question 3 (3B) group together on the left. Indeed, despite our intentions, it seems that listeners viewed 3B as a good response; it has the third-highest average content rating (5.75/7), above 2G, which averaged 5.71. The averages for 1B and 2B are 4.95 and 4.77, respectively. Thus, listeners rated bad responses differently from good responses. Focusing on the right branch, among the bad responses, self-assessed English proficiency of the listener is significant (p = 0.047). Listeners with a lower self-assessed English proficiency gave higher content scores on bad responses than those with the highest proficiency (n. 10). Perhaps these listeners misunderstood the responses or were more lenient.
On the left branch, among the good responses, locality is significant (p < 0.001). To the left of that branch, local Canadian voices receive the best content scores and a smaller range of scores compared to other voices. To the right of the locality branch, among the extralocal voices, the Chinese voice was singled out in the model as receiving the lowest scores and widest range, while the remaining extralocal voices to the left receive higher scores than that voice but not as high as the local Canadians.
The importance of the response given was anticipated, given that we intentionally designed good and bad guises. This means that listeners were sensitive to the quality of the response (what is said). The relevance of the listeners’ English proficiency is also understandable, since this may affect one’s ability to fully comprehend the information. However, the significance of locality in an assessment of response content is problematic since it indicates that the accent with which an answer is delivered—especially a good answer—can affect how favorably it is assessed.
To further corroborate the effect of these variables, we subjected the data to a random forest analysis (Levshina 2015, Tagliamonte & Baayen 2012). A random forest analysis estimates the variable importance of predictors by taking an average over the results of a large number of ctrees that have been modeled from random subsets of the data; in our case, each random forest is based on 1,000 trees. In a random forest, the larger a predictor’s variable importance, the more discriminatory power that predictor has; variable importance scores around 0.0 can be understood as ‘irrelevant’, but Levshina (2015:298) notes that the cut-off value can be understood as ‘the absolute importance value of the variable with the smallest score’. The results of a random forest analysis are typically visualized with a dot plot. Here we simply report the order of importance of variables, but dot plots are available in the supplementary material (n. 11). Consistent with the ctree in Fig. 6, the specific response and response type (good or bad) are the two most important variables, followed by locality, specific voice, question, and English proficiency of the listener. Listeners’ year of birth, nationality, and age and the ethnonationality of the voice also exceed the threshold for importance.
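The cut-off rule quoted from Levshina (2015:298) can be illustrated with a short sketch. The importance scores below are invented purely for illustration and are not the study's values; the function name `important_predictors` and the predictor labels are likewise hypothetical. The sketch assumes that importance scores have already been computed by some random forest implementation.

```python
def important_predictors(importance: dict[str, float]) -> list[str]:
    """Return predictors above the cut-off, ordered from most to least important.

    The cut-off is the absolute value of the smallest importance score,
    following the rule of thumb quoted from Levshina (2015:298).
    """
    cutoff = abs(min(importance.values()))
    kept = [name for name, score in importance.items() if score > cutoff]
    return sorted(kept, key=lambda name: importance[name], reverse=True)


# Hypothetical scores, for illustration only (not taken from the study).
scores = {
    "specific_response": 0.21,
    "locality": 0.08,
    "listener_gender": 0.001,
    "listener_race": -0.004,
}
print(important_predictors(scores))  # ['specific_response', 'locality']
```

Here the smallest score is -0.004, so the cut-off is 0.004, and a predictor scoring 0.001 is treated as irrelevant even though its score is positive.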
5.2. Expression ratings
Listeners were also asked to rate how difficult or easy it was to understand the voices, that is, how comprehensible they were. We modeled a ctree with the same thirteen independent variables listed in §5.1, with expression scores as the dependent variable. In the resultant ctree, seen in Figure 7, the most important variable was voice (p < 0.001), as one would expect with expression ratings. On the right branch, the extralocal Chinese (C1), German (G1), and Nigerian (N1) voices were grouped together. These voices received the lowest expression scores and widest range of scores. Their raw averages were 3.79, 4.46, and 5.12 out of 7, respectively. These voices were therefore seen as the hardest to understand. Within the remaining voices, locality was significant (p = 0.011). For the remaining extralocal voices, the range of scores on expression is skewed more toward negative ratings, though the median is the same as the locals. Within the local Canadian voices on the left branch, response version is significant (p = 0.005), with good responses receiving higher expression scores and a smaller range than bad responses. Overall, the ctree indicates that extralocal voices are evaluated as more difficult to understand than local Canadian voices. This is reflected in the negative and positive comments on their speech discussed above. A random forest analysis confirms that voice is the most important variable, followed by locality and the ethnonationality of the voice. Response version follows in importance. Listeners’ self-identified race and length of time speaking English also just barely cross the threshold of importance.
Figure 7. Effect of thirteen independent variables on expression ratings.
Given this result, a post-hoc study was conducted to determine if the claims of lower understandability for C1, G1, and N1 had any merit. Sixty students from fields other than human resources and linguistics participated in the post-hoc study. Each participant listened to twelve sentences in total, all of which came from recordings of the twelve voices in the main study. No participant heard a voice more than once. Voices were split into two groups of six, A and B. For the first six sentences, participants were asked to guess which country the speaker came from (this is discussed later). For the other six sentences, they were asked to transcribe what they heard after listening once. Each participant was assigned to one of two lists: in the first list, they identified the countries of group A and transcribed sentences of group B; in the second list, they identified the countries of group B and transcribed sentences of group A. Thus, thirty participants identified and transcribed a sentence for each voice. Following Derwing & Munro 1997, an intelligibility score was calculated for each voice by dividing the number of words accurately transcribed by the number of words in the recording. Trivial errors, such as writing ‘demand’ for ‘demands’ or ‘I am’ for ‘I’m’, were ignored.
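The intelligibility score described above (words accurately transcribed divided by words in the recording) can be sketched as follows. This is only one possible operationalization, under stated assumptions: words are aligned with Python's `difflib.SequenceMatcher`, which the source does not mention; the handling of trivial errors is reduced to a single hypothetical contraction mapping; and the two sentences are invented, not taken from the study's recordings.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny list of contractions treated as trivial.
CONTRACTIONS = {"i'm": "i am"}


def tokens(text: str) -> list[str]:
    """Lowercase, expand the listed contractions, and split into words."""
    text = text.lower().replace("’", "'")
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    return re.findall(r"[a-z']+", text)


def intelligibility(reference: str, transcript: str) -> float:
    """Share of reference words that also appear, in order, in the transcript."""
    ref, hyp = tokens(reference), tokens(transcript)
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


# Invented sentences, for illustration only.
score = intelligibility(
    "I'm confident that I meet the demands of this role",
    "I am confident that I meet the needs of the role",
)
print(f"{score:.1%}")
```

In this invented case nine of the eleven reference words are matched. A per-voice score in the study's sense would then be aggregated over the thirty transcriptions collected for that voice; how the authors aggregated is not specified here.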
N1’s, G1’s, and C1’s intelligibility scores were among the four lowest, at 62.7% (n. 12), 80.6%, and 81.2%, respectively, which is consistent with their low expression ratings and comments. However, G2, the Canadian with a German parent, had the second-lowest intelligibility score overall, 76.7%, but she was not penalized in the expression ratings in the main study: she was grouped with other local Canadians and received the second-highest expression score average (6.34/7). That is, although listeners in the post-hoc experiment did not necessarily understand this voice, listeners in the main experiment still ranked it as easy to understand. Furthermore, J1, the extralocal Jamaican voice, and B1, the extralocal British voice, earned the second- and fourth-highest intelligibility scores, 96% and 95.2%, respectively, but their speech was rated as more difficult to understand than that of all of the local Canadians in the main experiment. Hence, the expression ratings, like comprehensibility ratings in other studies, are not necessarily indicative of how intelligible the voices’ speech was. Critically, it is a local voice that was perceived as more comprehensible than her intelligibility score would suggest and two extralocal voices that were perceived as less comprehensible than their intelligibility scores would suggest.
Figure 8. Effect of thirteen independent variables on employability ratings.
5.3. Employability ratings
For this parameter, listeners were asked to rate whether the candidate was a desirable or undesirable employee. We modeled the aforementioned thirteen independent variables, with employability scores as the dependent variable. The resultant ctree, in Figure 8, showed that the most important variable for employability scores was response version, that is, whether a good or bad response was provided (p < 0.001). This meets our expectations. The bad responses, on the right branch, received a wider range of scores and lower median scores than the good responses in nodes 3 and 5. Among the good responses on the left branch, locality is significant (p < 0.001). Local Canadian voices, on the left side of that branch, received the highest scores and smallest range of scores on employability. They are the most desirable employees. On the right side of the branch are the extralocal voices, and among them, voice is significant (p = 0.04). C1 and G1, previously penalized for expression ratings, were given lower scores on employability (note that their median is the same as the bad responses). The other extralocal voices received higher scores than C1 and G1, but their scores indicate that they are not as desirable as local voices. Non-Canadians therefore seem to be experiencing a ceiling effect with scores, despite providing responses identical to those of the Canadians. A random forest analysis confirms the order of importance of variables: response version (good/bad) and specific response are most important, followed by specific voice and locality. The ethnonationality of the voice and the specific question also crossed the threshold of importance.
5.4. Job interview recommendations
For this parameter, listeners were asked to choose one of the following five recommendations for a job interview: customer service manager, sales clerk, database manager, data-entry clerk, or no recommendation. Descriptions of each position were given to listeners. The first two are customer-facing jobs, while the following two are non-customer-facing jobs. Two are managerial jobs and two are lower level. We modeled the thirteen independent variables, with job recommendation as the dependent variable. The resultant ctree, in Figure 9, shows that the length of time a listener has spoken English is the most important and only relevant variable (p = 0.007). The model divides the data into three nodes: listeners with ten or fewer years’ experience speaking English, those with eleven to thirty years’ experience, and those with over thirty years’ experience. However, the first group consists of only four listeners, while the third group is represented by just two listeners. Since a small number of listeners seemed to be skewing the data, we repeated the model with this independent variable removed.
Figure 9. Effect of thirteen independent variables on job recommendations.
The new ctree, in Figure 10, showed locality as the only variable of importance (p = 0.009): local Canadian voices were more likely to be recommended to be interviewed to be customer service managers and sales clerks. Notably, these are the two customer-facing positions. This may be related to the higher expression scores Canadian voices received: since they are believed to be more comprehensible, perhaps they are perceived as more suited to face the public. By contrast, extralocal voices are more likely to be recommended to be interviewed to be data-entry clerks, a low-ranking, non-customer-facing job. Extralocal voices are also more likely not to be recommended for a job interview at all. A random forest analysis confirmed that only locality crossed the threshold of importance (n. 13).
Figure 10. Effect of twelve independent variables on job recommendations.
To confirm the hierarchy of positions that listeners chose from, sixty participants in a post-hoc study were asked to rank the four jobs from highest to lowest status. Job rankings were as follows: (1) database manager (85% of participants ranked this job first), (2) customer service manager (61.7% ranked this job second), (3) data-entry clerk (46.7% ranked this job third), and (4) sales clerk (60% ranked this job last). While it is encouraging that extralocal and local voices were recommended for the highest-ranking job at equal rates (twenty-two vs. twenty-one times), a look at the proportions between the job recommendations is more illuminating. Local voices were recommended for customer service manager three times more than they were for database manager. By contrast, extralocal voices were recommended for this position only 1.6 times more than they were for database manager. Similarly, local voices were recommended as data-entry clerks around 71% as much as they were recommended as database managers, but extralocal voices were recommended for the lower-ranking data-entry job 180% as much as they were for the managerial post. Furthermore, local voices were recommended as customer service managers 1.8 times more than they were as sales clerks, but extralocal voices were recommended as customer service managers only 1.1 times more than they were as sales clerks. Finally, extralocal voices received 1.5 times as many ‘No’ decisions as local voices. The unequal proportions of higher-ranking to lower-ranking positions recommended for local and extralocal voices in the face of identical responses provided point to the fact that the immigrant wage gap cannot be largely explained by candidates’ language proficiency; candidates’ accents seem to invoke bias among recruiters.
6. Discussion
This study investigated how potential job applicants with local Canadian and extralocal non-Canadian accents are differentially evaluated in the context of hiring decisions in the Greater Toronto Area, one of the most multicultural cities in the world. Since official language proficiency is emphasized in permanent residence applications for economic immigrants and has been identified as an explanation for the wage gap between immigrants and Canadian-born individuals, this study controlled for English lexical, pragmatic, and grammatical proficiency by using a verbal guise to determine the extent to which bias against extralocal accents exists. Six responses were crafted: half of them were intentionally ‘good’ responses to interview questions, while half were intentionally ‘bad’ responses. These were recorded by British, Chinese, German, Indian, Jamaican, and Nigerian speakers, as well as by speakers who were born and raised in Canada with at least one parent from these countries. Forty-eight human resources students who had completed a course in recruitment and selection at one of five universities and colleges in the GTA rated the content of the responses, the candidates’ expression, and the candidates’ employability, determined what job they should be interviewed for, if any, and provided comments and advice. We conducted qualitative and quantitative analyses of the data. Below, we address our research questions directly and move into a more general discussion.
6.1. Patterns from comments and advice to voices
Qualitative analyses of the data showed that local voices were especially privileged in comments on speech. Canadian voices received 64% of the negative comments on personality, although these comments were significantly linked to whether the response itself was good or bad. Conversely, they received 66% of the positive comments on speech, significantly more than extralocals received. Their speech was often described as clear, confident, well spoken, understandable, and having a good tone. No local voice received more negative comments on their speech compared to their extralocal counterpart.
Analysis of comments directed at extralocal voices showed more nuance. The Jamaican, Indian, and British voices received more positive comments on their personality and similar numbers of positive comments on their speech in comparison to their Canadian counterparts. The Jamaican and Indian extralocals also received minimal negative comments on their speech. The German, Nigerian, and Chinese extralocals, however, were consistently disadvantaged in the comments. They received fewer positive comments on their personality compared to their Canadian counterparts and the fewest positive comments on their speech. Extralocal voices received 78% of the negative comments on speech, and more than half of these were directed at the Chinese voice. This resulted in a statistically significantly higher number of negative comments on speech for extralocals. Overall, extralocal voices were judged to have unattractive speech. Listeners were primed to be critical of what they heard in this study. Speech, however, was the salient characteristic that listeners reacted to, as seen by the significance of locality. Thus, much critical commentary on speech was used as a proxy to critique candidates’ accents, and critique of accents, in turn, is often used as a proxy for (less socially acceptable) discrimination (see Lippi-Green 2012:67, Milroy & Milroy 1999:2–3). As Matsuda (1991:1329) puts it, ‘[y]our self is inseparable from your accent. Someone who tells you they don’t like the way you speak is quite likely telling you that they don’t like you’.
6.2. Sensitivity to intended response quality
The ctrees for content (Fig. 6) and employability ratings (Fig. 8) indicated that listeners recognized and responded to differences in the intended quality of the responses. Good responses earned higher scores for both content and employability, and the response given was the most important factor in determining these scores, as demonstrated by our statistical analysis. This result is encouraging for immigrant employment opportunities in general, since the content of a candidate’s interview responses is ideally what employers should focus on in making hiring decisions. However, the (type of) response provided was not the only significant variable in our models.
6.3. Accent discrimination, other relevant variables, and ethnic hierarchies
The patterns of bias against extralocal voices in the qualitative analyses were reflected clearly in the quantitative analyses. Apart from the response provided, locality and listeners’ English proficiency were found to be important factors for predicting content scores (Fig. 6). Listeners with lower English proficiency gave higher scores on bad responses. This could be a result of their misunderstanding the content or being more willing to overlook negative attributes in the responses. When the responses were good, however, local voices received the best content scores, while extralocal voices received lower scores for giving identical responses. This clearly demonstrates bias against non-Canadian voices, since the task was to evaluate the content of the response. Locality should have been an irrelevant factor.
The negative comments on extralocal candidates’ speech were also reproduced in listeners’ evaluation of whether voices were easy or difficult to understand (i.e. in expression scores). The most important variable that influenced expression scores was the specific voice heard (Fig. 7). Local voices were rated easiest to understand, consistent with the highly positive comments on their speech. Among the extralocals, the Chinese, German, and Nigerian voices were rated as the most difficult to understand. Importantly, post-hoc intelligibility scores showed that expression ratings were not necessarily indicative of intelligibility. For instance, all local voices had average expression scores higher than all extralocal voices (all were above 6), even though, for example, the Jamaican extralocal voice was found to be the second-most intelligible voice in the post-hoc study, with 96% accurate transcriptions of her speech. Similarly to Niedzielski 1999, where perceived nationality of a speaker affected listeners’ perception of that speaker’s speech, in this study, listeners believed that Canadian voices were more comprehensible regardless of their intelligibility. As Dragojevic (2020:159) notes, processing fluency, which he defines as ‘the ease or difficulty listeners experience processing a person’s speech’ (which is similar to the idea of comprehensibility already discussed), can influence language attitudes positively or negatively. In the case of the local German voice, comprehensibility, as seen by expression scores, is high, while for the extralocal Jamaican, it is lower. Comprehensibility thus seems to matter more than the intelligibility of the voices in the evaluations. Expression ratings also help to explain the patterns found in comments on extralocal voices, where the Jamaican, Indian, and British voices often patterned together: these voices had the next three highest ratings after the local voices.
Employability scores were influenced not only by the response version heard (good vs. bad). When a bad response was given, the voice was seen as an undesirable employee, regardless of locality. However, when good responses were provided, the local Canadian voices were rated as the more desirable employees (Fig. 8). Among all of the voices, the extralocal Chinese and German voices were rated as the least desirable employees even when they provided the same good responses. Overall, locality was significant for scores on content, expression, and employability, always in favor of local voices. Evaluators thought local voices were easier to understand and were more desirable employees, and that the content of the same responses was better coming from their mouths than from an extralocal candidate’s. By contrast, we observe a ceiling effect for the extralocal voices: even when extralocal voices delivered the same good content as local voices, they were rated lower on average. This suggests that listeners were more critical when the content of the response was good, looking beyond content to distinguish candidates. Here, a listener’s focus shifts to a candidate’s voice, which some listeners evidently see as a legitimate way to filter candidates. This situation is similar to Johnson & Buttny 1982, where speech in a Black speaking guise was critiqued more than a white guise on intellectual content. While overall, listeners were generous in their evaluations (with overall median scores quite high on all scales), what extralocal voices said, how they said it, and their worth as employees were all devalued in comparison to locals. In real-world contexts, where often just one candidate is offered employment, the smallest downgrading can still make the critical difference between a candidate’s receiving a job offer and not receiving one.
Most notably, when listeners recommended job interviews for candidates, the responses candidates gave were not significant. Only one variable mattered: locality (Fig. 10). Local voices were recommended more for customer-facing jobs, likely because locals are believed to be easier to understand. This matches Creese and Kambere’s (2003:569) claim that Canadian English is preferred for public-facing jobs and Timming’s (2016) note that potential employees are evaluated differently depending on whether the job is customer-facing or not. In this study, extralocal voices were more often recommended for the low-ranking, non-customer-facing job and more often recommended for no interview at all. There were also disparities in the proportions of high-ranking to low-ranking job interview recommendations received by local and extralocal voices. De La Zerda and Hopper (1979) found that standard-sounding speakers were favored for supervisory positions, and a similar preference seems to be occurring based on the proportions examined.
In terms of an emergent ethnic hierarchy, local Canadian voices clustered at the top, as in Kalin and Rayko’s (1978) and Kalin, Rayko, and Love’s (1980) Kingston, Ontario, studies four decades ago. However, the other rankings differ from these studies. Little distinction was made among the six local voices, and none that would indicate that any local ethnolectal differences between the voices were relevant. That said, as discussed in n. 6 above, very few ethnolectal differences seem to exist in the Toronto context. With respect to the evaluations, the local voices are followed by the Jamaican, Indian, and British extralocal voices. The Chinese, German, and Nigerian extralocal voices are at the bottom of the hierarchy, with the Chinese voice most disfavored. We can only speculate as to why these voices fell at the bottom of the hierarchy. One possible explanation is the low expression ratings they received: they were the least comprehensible, and this may have influenced the other ratings.
We also investigated whether associations of the extralocal voices with their ethnicity may have been relevant. As part of our post-hoc study, additional participants were asked to identify the country of origin of our twelve voices after listening to a short clip of their speech. The extralocal Chinese, German, and Nigerian voices were linked to their specific country 63.3%, 13.3%, and 40% of the time, respectively. It is therefore unlikely that most listeners were responding specifically to what they thought was a German or Nigerian accent. Only for the Chinese voice can we speculate that listeners may have been reacting to associations with a particular country, viz. China. Given that the data were collected during the aftermath of the global pandemic, which has spurred heightened anti-Asian racism, it is possible that such sentiments contributed to the evaluation of this voice. Claims that her voice was more muffled or her speed was faster than that of other candidates are groundless. That said, when we consider regional associations (i.e. Asian countries linked with the Chinese voice, European countries linked with the German voice, and African countries linked with the Nigerian voice), the identifiability scores go up substantially to 73.3%, 70%, and 73.3%, respectively. It is possible, therefore, that many listeners were responding to Asian, European, and African accents in their evaluations and that accents from these regions in general are devalued (see n. 14). It is notable that one nonracialized voice (German) landed in the bottom three, while the other nonracialized extralocal voice (British) ranked in the middle group. Racialized voices were found in both the middle and bottom groups as well, so the hierarchy was not based simply on racialization, although racialization surely played a part in the evaluations.
Further research is needed to determine why the extralocal Chinese, German, and Nigerian voices were devalued and why the extralocal Jamaican, Indian, and British voices ranked higher.
6.4. Other remarks
Let us return to Picot and Sweetman’s (2012) claim that immigrants’ official-language proficiency can explain much of the immigrant wage gap. In this experiment, we presented an optimal scenario: we controlled for lexical, pragmatic, and grammatical proficiency by recording scripted guises for listeners to evaluate. Despite all of the candidates providing identical responses, bias against extralocal voices was evident. For an extralocal job applicant, although having high English proficiency may minimize bias against their accent, it does not negate it. As Subtirelu (2015) posits, the assumption that language must be commented on if the voice is extralocal decreases the possibility of extremely positive evaluations. This helps to explain the ceiling effect we observed. These results add to the limited empirical evidence of accent discrimination in the GTA, the top destination of immigrants to Canada, by showing that apart from language proficiency, language discrimination is a serious barrier for immigrants to Canada. We expect that these results would be more pronounced for speakers with lower proficiency in the linguistic domains we controlled for. Furthermore, Sato (1998:105), in an examination of ratings of extralocal voices by students in Alberta, suggests that if locals are familiar with an ethnic group, their perception of speakers from that group may be ‘largely determined at a personal level’; however, the perceptions of those unfamiliar with an ethnic group ‘may be more strongly shaped by social information such as stereotypes’. This could mean that the bias identified in the GTA in this study could be even more pronounced in other areas of Ontario that are less diverse, allowing locals less opportunity to interact with various ethnic groups.
Notably, the evaluations were completed by human resources students across the GTA who had already completed a course in recruitment and selection, thereby giving insight into the future of Canadian hiring. While their comments were not openly biased or discriminatory, much of their language pointed to a dispreference for extralocal accents, and their ratings showed statistically significant bias. Additionally, our group of listeners represented diverse nationalities, but the bias against extralocal voices was consistent. Scassa (1994:115) importantly points out that ‘where the dominant group conceives of its dominance as natural, inevitable and desirable, communicative failure will always be blamed on the nondominant speaker. Failure to master the dominant idiom becomes a fault; lack of employment or under-employment becomes a consequence of that fault’. Note that while many of the study listeners may not have been members of the dominant group, it does not mean that they have not accepted and perpetuated a hegemonic ideology. Much of their commentary places the communicative burden on speakers, and their expression ratings further absolve them of their responsibility as listeners. While we acknowledge that many listeners may have found some of the voices less comprehensible than others, we also note Lippi-Green’s (2012:73) observation that although accent ‘can sometimes be an impediment to communication’, often ‘breakdown of communication is due not so much to accent as it is to negative social evaluation of the accent in question, and a rejection of the communicative burden’. We recognize that this was a possibility in this experiment.
For instance, the extralocal Jamaican and Indian voices had relatively high expression and intelligibility scores but were still penalized relative to their local counterparts in evaluations (see n. 15). Overall, our study shows that one’s success on the job market depends not just on what is said and how it is said, but also on how the listener perceives what they hear.
Apart from its controlling for English lexical, pragmatic, and grammatical proficiency and the quality of the responses, another strength of this study is its mixed-methods approach. Examining the comments directed at local and extralocal voices provided insight into listeners’ perceptions and evaluative foci, which helped to make sense of their quantitative ratings and illuminated what exactly about language can constitute a barrier for immigrants seeking employment. Nevertheless, one limitation of the study is that when using a verbal guise, one must acknowledge that features beyond ethnoregional accent may be variable: individual perceptions of candidates’ tone, charisma, and enthusiasm, for instance, are difficult to control for but nonetheless may influence evaluations. That said, these descriptors are often used as colloquial proxies for describing someone’s accent (Lippi-Green 2012, Milroy & Milroy 1999). It was important for us to authentically represent the voices of possible newcomers to Canada. A matched-guise design (i.e. recording one person who uses the different accents naturally) was not appropriate given the seven accents of interest. It is unlikely that one speaker would be able to authentically produce these accents. Another potential criticism is that we should have used multiple representatives for each ethnicity-locality combination. However, this was not feasible for many reasons, the most critical being the difficulty in recruiting eligible listeners and avoiding listener attrition.
7. Concluding suggestions
In closing, we offer suggestions based on our findings.
7.1. Newcomer training
Since we controlled for English lexical, pragmatic, and grammatical proficiency in this study, lower ratings were due to evaluators’ linguistic biases more than to candidates’ apparent deficiencies. We are, therefore, limited in the suggestions we can make regarding newcomers to the GTA who are seeking employment. One recommendation regards the job interview training that the Governments of Canada and Ontario and immigrant-serving organizations offer immigrants before arriving in Canada, or shortly thereafter. In this training, a heavy focus should be placed on the content of responses, since content was, as it should be, an important factor in decision making. Confidence may also be an important trait to work on, since it was mentioned frequently in the comments we examined.
Some previous studies have recommended accent modification as a solution to accent discrimination (e.g. Bhatt 2013, Carlson & McHenry 2006:80; see also Guo 2009). Several companies, speech-language pathologists, and YouTube channels offer this option to immigrants to Canada. However, this places the communicative burden solely on the extralocal speaker. We do not support this suggestion. There is considerable evidence that not only is modifying one’s accent extremely difficult, but it is also often irrelevant if the underlying problem is prejudice (see Lindemann 2006, Lippi-Green 2012, Rosa & Flores 2017, Rubin 1992). In fact, several studies of one-and-a-half- and second-generation speakers in Canada with Canadian English accents have highlighted their experiences of others ignoring their local accents and assuming they were extralocals due to their racialization (e.g. Kobayashi & Preston 2014, Plaza 2006), with people still inquiring where they were (really) from (e.g. Creese 2019). Thus, accent reduction requires considerable effort from extralocals and forces them to compromise an aspect of their identity without guaranteeing the intended consequences.
Nevertheless, we acknowledge that the low expression (comprehensibility) ratings of the Chinese, Nigerian, and German extralocal voices may indicate that these voices were more difficult to understand for some listeners. In this case, these speakers may benefit from pronunciation instruction that ‘align[s] with learners’ needs, backgrounds and first languages’ (Lee et al. 2015:361). This kind of individualized instruction requires skilled teachers whose goals should be improving intelligibility and comprehensibility and not achieving a native-like pronunciation (Thomson & Derwing 2015:339). Such skilled, individualized instruction has been shown to improve the comprehensibility and intelligibility of speakers with entrenched accents in as little as seventeen hours over three months (see Derwing et al. 2014).
7.2. Government services
Despite the complexity of language training, we could not find any position statements on accent-reduction courses or guidelines about pronunciation instruction from any Canadian institutions. Furthermore, while Government of Canada information for newcomers emphasizes the importance of official-language fluency, information packets direct newcomers to both public-funded and private language services without providing any critical commentary or guidance about the effectiveness or usefulness of potentially vastly different services. We recommend a more thorough and critical approach to recommending appropriate language services for newcomers.
In the face of the current high immigration targets, governments should also provide information, tutorials, and workshops on language prejudice to members of the general public in order to improve sentiments about accent and language diversity. These could be offered optionally by libraries and agencies offering language classes (English or otherwise) and could be as simple as displaying posters about language discrimination with QR codes linking to online tutorials on linguistic discrimination hosted on the OHRC’s eLearning platform (such tutorials do not currently exist). International Mother Language Day would be an especially appropriate day to highlight this issue, although it should not be limited to one day.
Crucially, discussions on language prejudice need to be part of the school curriculum, beginning as early as possible, given Kubota’s (2001) finding that affirming linguistic diversity needs to begin at an early level, and Paquette-Smith et al.’s (2019) finding that five-year-old southern Ontario children already exhibit preference for English Canadian-accented peers despite exposure to different accents. This method would have the widest impact, but it would take the longest timeframe to be realized. Current curriculum documents for Ontario public schools mention encouraging an appreciation for linguistic diversity and avoiding discriminatory language but do not mention appreciating dialect or accent diversity or strategies for overtly addressing and discussing linguistic discrimination with students.
7.3. Human resources curriculum
The listeners in this study were all students of human resources courses offered at a college or university in the GTA. A recent survey of fourteen Human Resources (HR) university students in Calgary, Alberta, and Montreal, Quebec, found that students, despite not receiving explicit instruction about accentism, were aware of the negative effects of accent discrimination, believed that listeners and speakers should share responsibility in communicating, and were open-minded toward diverse accents (Trofimovich et al. 2023). It is not clear to us whether antidiscrimination is a part of the curriculum at the universities and colleges we surveyed. However, the results of this study show that the curricula for these programs should be reviewed and revised as necessary, since they are not providing adequate training to reduce linguistic discrimination among future decision-makers in Canadian hiring. Antibias training should not mean antibias masking, as we saw in the comments provided on candidates’ speech, which avoided explicitly critiquing accents but used related linguistic attributes to justify bias against extralocal voices. We intend to share our results with all program coordinators and course instructors who advertised the study. Trofimovich et al. (2023:17) suggest activities for the HR classroom for instruction in accent bias: taking perspective by doing mock job interviews or reading about discrimination; reflecting about similarities and differences in in-class group activities; course assignments aimed at reflecting on how identity influences evaluations of individuals; and specific training on language discrimination.
7.4. Workplace HR
It is difficult to assess the potentially vastly different approaches to and policies about linguistic diversity and discrimination across workplaces in Ontario. However, we can look to third-party organizations that work toward increasing equitable hiring practices across industries. For example, Hire Immigrants is a website and information hub, developed by the Global Diversity Exchange thinktank at Toronto Metropolitan University, that is employer-centered and provides information and resources ‘aimed to empower employers to build a diverse and inclusive workforce’ (https://hireimmigrants.ca/about/). We commend organizations like this for explicitly recognizing the need to train ‘hiring managers to look beyond language proficiency’ (Hire Immigrants 2016). However, very little practical advice is provided for how to implement this; one link is provided to a less-than-two-minute video from 2010 of 3M Manager of Recruitment and Talent Development, Sarah Tattersall, discussing a linguistic empathy-building exercise her company uses with HR staff. While such an exercise is a positive step in the right direction, it is telling that a large employer-focused organization committed to increasing hiring equity for newcomers to Canada has very little to say about linguistic bias. Moreover, there is no discussion specifically about accent bias here.
We recommend that workplaces adopt clear statements about their position on linguistic diversity and bias in their antidiscrimination/equity, diversity, and inclusion policies. Such a policy should be explicit about how to recognize, report, and respond to linguistic discrimination. In addition, the policy should be easily accessible to employees and committed to on a yearly basis by all members of the organization. For instance, employees could be required to complete a short training module on linguistic diversity and to affirm their support of anti-linguistic discrimination statements at the end. This should be included in the onboarding process.
Another aspect that could be beneficial for all employees is training in listening to diverse accents, given that the candidates in this study who were rated as most difficult to understand were penalized the most. Improving employees’ accent familiarity allows listeners to share the communicative burden and may provide employees with greater confidence and less frustration in interacting with different speakers. Training sessions could also include discussion of how to respectfully troubleshoot conflicts in communicating and explicitly address linguistic discrimination and its effects. Given that this would be time- and cost-intensive, upper management and HR employees could be the first to be trained before expanding to the general workforce. This would obviously be a large undertaking for smaller businesses, so for the sake of consistency, this training program should be developed by governments in partnership with linguists and immigrant-serving organizations, with companies who adopt the program receiving an incentive.
The selection process itself would benefit from some revision as well. Employers should consider more anonymized selection strategies, such as removing names and identifying information like country of education and experience from resumes before they are evaluated. In order to have evaluators focus on the content of responses in interviews, interviewers should be briefed on what is relevant to the job, and rubrics should be provided for them to justify their assessments. Levon et al. (2020) found that the most effective strategy in reducing accent bias in job candidate evaluations was raising awareness of accent bias. They provide a short text that they recommend for recruiters to read before assessing candidates, to remind them not to rely on accents in their decision making. Another consideration is the use of machine-learning technologies to evaluate candidates’ eligibility, as per Hoffmann et al. 2018, although this may be more useful for certain types of jobs and organizations, and these technologies would have to be assessed for the inherent bias that is inevitably built into the models behind such tools. This avenue requires much more work to ensure fairness before it is a feasible option (see Mihaljević et al. 2023).
7.5. Government policy
Of course, companies will not be motivated to adopt any of the aforementioned strategies without the government first taking a firm position on the issue. For instance, the Ontario Public Service’s inclusion and diversity blueprint does not mention language discrimination. Similarly, the OHRC’s guide to developing human rights policies and procedures for organizations does not consider language. It is encouraging that the City of Toronto lists ‘level of literacy’ under the prohibited grounds for discrimination for their organization, but it is not clear what this means, and this does not apply to all organizations in the GTA. We therefore have some recommendations for the Government of Ontario. First, language should be added to the OHRC’s prohibited grounds for discrimination. If we fail to recognize language discrimination as a problem, we cannot protect against it. The OHRC’s (1996) position is that complaints about language discrimination must be rooted in one’s ancestry, ethnic origin, place of origin, or race. Our post-hoc experiment shows that participants were not consistent at identifying the place of origin of our voices but exhibited bias nevertheless. Here, they were responding specifically to language that did not match the dominant group. There are other instances where language itself can be the object of discrimination, not linked to the grounds identified by the OHRC. For instance, one’s pitch, nasality, vocabulary size, and use of particular linguistic features such as the ‘be like’ quotative or vocal fry can all affect perceptions of one’s employability and one’s ability to be promoted, while not being linked to one’s ancestry, origin, or race. While language is not grounds for discrimination in any of the English-dominant provinces in Canada, it notably is grounds for discrimination in the only French-dominant province, Quebec (likely a legacy of the francophone effort to maintain political and cultural autonomy).
Adding language to the OHRC’s grounds will place more accountability on individuals and organizations and remove the burden of having to link language ideologies to racial or ethnic ideologies where they are not necessarily relevant. Following this logic, the call to add language as grounds for discrimination to Canadian human rights codes was made almost three decades ago by Scassa (1994).
If the Government of Canada is to make their immigration initiative worthwhile, they must identify obstacles that prevent immigrants from adding the most value and work with other levels of government to put policies in place to remove these hurdles. One way to incentivize companies to hire candidates from diverse language backgrounds is to introduce hiring subsidies for groups that are most penalized, in conjunction with independent assessments of candidates’ skills to validate their competence for coworkers (Valfort 2018:7). This would be similar to the apprenticeship tax credits, cooperative education tax credits, and job grants currently offered in Ontario. Incentives could also be offered to companies with specified levels of linguistic diversity. In both of these cases, companies should be required to monitor and periodically report on their recruitment, selection, interview, and promotion processes in order to be compensated. Finally, to actively combat linguistic bias against immigrants to Canada and to allow newcomers to successfully integrate into the community, strong penalties must be levied against companies that are found to engage in linguistic discrimination in any of these processes.
8. Prospectus
We close by discussing further avenues of research, although all of the suggestions above require empirical study. While we by no means recommend ‘accent reduction’, we agree with Lindemann (2006) that more research needs to be done to identify which (if any) linguistic features of extralocal speech impede intelligibility or comprehensibility. Additionally, there is a need for hiring personnel to be trained in listening to extralocal accented speech (e.g. Derwing et al. 2002, Derwing & Munro 2014), since familiarity with extralocal accents can reduce one’s perception of how difficult it is to understand them (Gass & Varonis 1984) and it was those voices who were perceived as most difficult to understand who were most penalized in this study. As Baese-Berk et al. (2020) state, more work needs to be done on refining methods to optimize this. Moreover, researchers need to be clearer about what is meant when they refer to shortcomings in immigrants’ language proficiency, since language tests focus on speaking, reading, listening, and writing and laymen can use ‘language proficiency’ to refer to some, all, or none of these aspects (for instance, see Yates 2004 about pragmatic awareness).
Our immediate future work has two lines of inquiry. In the first, we will examine a larger data set of ninety-six listeners drawn from colleges and universities across southern Ontario to determine if the bias examined here persists. Beyond this, we would like to investigate ways to reduce accent bias. Levon et al. (2020) provide a framework for determining the most effective of five strategies to reduce accent bias in evaluations, viz. raising awareness, identifying irrelevant information, committing to fairness and objectivity, increasing accountability, and appealing to multiculturalism. While raising awareness was most effective in that study, it was conducted in the UK, and we would like to determine what is most effective for Canada.
University of Toronto Mississauga
Maanjiwe nendamowinan Building, 4th floor
3359 Mississauga Road
Mississauga, ON, Canada L5L 1C6
[samantha.jackson@utoronto.ca]
[derek.denis@utoronto.ca]
revision invited 13 September 2023;
revision received 13 October 2023;
accepted pending revisions 19 December 2023;
revision received 16 January 2024;
accepted 5 February 2024
Appendix
Questions and scripted responses
Question 1: What makes you a good employee?
Well, I’m a perfectionist. I have very high standards and always submit my best work. You see, I’m not afraid to be the best at something because others might feel jealous. In fact, I make it a point to inspire my peers to do better. Also, I give very honest feedback, because it helps everyone to improve.
I think it’s my optimistic attitude. When things get difficult, I try to find the silver lining. I like to be that encouraging voice for the team and try to be flexible and helpful. I’ve also been told I’m very thorough. I’m usually pleased with my work because I stay mentally present and pay attention to the details.
Question 2: How do you handle pressure?
I honestly never get stressed. I’m known for my composure in the face of adversity. For instance, at my last job, there were always unexpected demands. The tight turnaround times would fluster my co-workers. I, on the other hand, never lost my cool. I’d just jump into action and push through it.
I don’t see pressure as a bad thing. It’s taught me better time management, among other skills. When I see a situation becoming stressful, I focus on allocating the appropriate amount of time and resources for each job that needs to be done. What also helps is reducing distractions and interruptions as much as possible so I can focus.
Question 3: Are you a leader or a follower?
I’ve always been a leader. Whether it’s at work, with family or friends, people always seem to fall into step with my plans. I think it might be because I’m skillful at planning, executing and delegating. Everyone’s always pleased with the results. I’m used to taking on that responsibility and I think that’s where I’m most successful.
Sometimes I’m a leader and sometimes I’m a follower. It’s situational. There have been times when, given my skillset, I was assigned as leader and was happy to step into that role because I enjoy it. However, there were other times when the situation called for me to take on a supporting role for the success of the team, and I committed myself to that with just as much dedication.
Footnotes
* This research was funded by a Provost’s Postdoctoral Fellowship from the University of Toronto awarded to Samantha Jackson. We are grateful to Emilia Dabrowski, our research assistant, who helped with coding participant comments, and to all the women who allowed us to record their voices as stimuli for this experiment. We would also like to thank our referees and the editors for their insightful guidance during the development of this paper. Additionally, we appreciate those who provided feedback and suggestions on prior versions of this work that were presented at the Workshop on Language Equity and Justice at the University of Toronto Mississauga (2022), the Language and Power conference at the Universität Münster (2023), and the Language Latitudes workshop at the University of British Columbia (2023).
1. We recognize a distinction between dialect and accent, the former a wider-ranging descriptor for a language variety and the latter focused exclusively on phonetic-phonological features. In this article, we use accent because we have controlled for factors such as vocabulary and grammar and focus specifically on pronunciation.
2. While Canada has two official languages, English and French, we focus on the GTA, where English is the majority language and French is a minority language. The 2016 Canadian Census reports that just 0.4% of Toronto residents use French in a work context and only 8% have knowledge of the language (Statistics Canada 2017b).
3. The main experiment and post-hoc experiments followed a research ethics protocol approved by the University of Toronto’s research ethics board (protocol #40784). The main experiment was also approved by all five institutions where data were collected.
4. This allowed us to control for English proficiency, since her scores on one of the English tests for immigrants to Canada would be considered ‘high proficiency’.
5. Answers were based on the guidance of several job-seeker advice columns on handling such interview questions and the advice of a working human resources professional.
6. Very little is known about local ethnolectal differences in the GTA, and generally it is assumed that second-generation Canadians acquire the same normative variety regardless of their ethnolinguistic background. That said, Hoffman and Walker (2010) find minor differences between second-generation Chinese, Italian, and Anglo-Irish Torontonians in the rates but not the variable conditioning of a handful of consonant and vowel phenomena, Baxter and Peters (2014) find differences in consonant cluster simplification between Black Torontonians and non-Black Torontonians, and Denis et al. (2023) find some vowel distinctions among racialized youth that they argue are features of a local multiethnolect. The extent to which these differences are recognizable and identifiable is not known. While we assume that our six local voices will all be perceived as speakers of normative Canadian English, we still consider it important to match our six extralocal voices with local voices from the same ethnolinguistic background in order to test the effects of local vs. extralocal accent independently from any effect of racialization of accent, just in case there are perceivable ethnolectal distinctions that we are otherwise unaware of.
8. Metadata is included in parentheses after each comment, including a letter-number code for voice commented on (ethnicity, generation) and seven-digit participant ID (as randomly created by Gorilla.sc). British, Chinese, German, Indian, Jamaican, and Nigerian ethnicities are represented by ‘B’, ‘C’, ‘G’, ‘I’, ‘J’, and ‘N’, respectively, while ‘1’ represents extralocal voices (first generation) and ‘2’ represents local voices (second generation). All comments in examples are presented verbatim but may have been extracted from a larger comment.
9. Note that in this set of examples, the number-letter code indicates the responses commented on, with ‘B’ and ‘G’ referring to intended bad and good responses, respectively (e.g. 1B refers to the bad response to question 1, while 1G refers to the good response to question 1).
10. While participants were asked to self-assess their English proficiency on a scale from 1 to 5, no participants rated themselves at the 1 or 2 level.
11. The supplementary materials are available at http://muse.jhu.edu/resolve/242.
12. It should be noted that the scores for N1 were almost equally split across the following ranges: 18–45%, 54–73%, and 82–100%.
13. A random forest analysis further justifies excluding the factor of participants’ length of time speaking English from the ctree model: this factor does not cross the threshold of importance when included in a random forest. This makes intuitive sense: if the effect we see in Fig. 9 is driven by a small number of outlier participants, a random forest, which creates a large number of trees based on a large number of subsets of the data, will be able to account for these outliers.
14. Interestingly, participants were the worst at identifying the nationality of the extralocal Jamaican voice. Jamaica was identified 20% of the time, Caribbean countries were chosen 33.3% of the time, and Caribbean and African countries were selected 63.3% of the time.
15. Extralocal voices still patterned separately from local voices in ctrees where the Chinese, German, and Nigerian voices were removed.