- Nabokov's Favorite Word is Mauve: What the Numbers Reveal About the Classics, Bestsellers, and Our Own Writing by Ben Blatt
Mathematical computer analysis of literature is blossoming. The Hawking Index grew out of the hunch that if an e-book is highlighted only at the beginning, the reader probably gave up on reading it.1 The books most likely to be abandoned, according to the Hawking Index, are: Ulysses by James Joyce, 1.7 percent,2 Les Miserables by Victor Hugo, 1.8 percent, Capital in the Twenty-First Century by Thomas Piketty, 2.4 percent, Hard Choices by Hillary Rodham Clinton, 4.2 percent, and, of course, A Brief History of Time by Stephen W. Hawking, 6.6 percent. The books most likely to be finished are: Harry Potter and the Sorcerer's Stone by J. K. Rowling, 95.9 percent, and The Goldfinch by Donna Tartt, 98.5 percent.
Nabokov's Favorite Word Is Mauve provided me with a spontaneous fantasy: I ask HAL-9000,3 "Can statistics confirm our admiration of the famous authors?" And HAL shows me several bar graphs. "Use of Thought Verbs per 10,000 Words" (94) studies the dictum to avoid "thought" verbs (for example, thinks, knows, understands, realizes, believes, wants, remembers, imagines, desires, loves, or hates), counting these verbs in all tenses used by fifty authors: Joyce with 56, J. R. R. Tolkien 60, Chuck Palahniuk 64, and Vladimir Nabokov 64. At the bottom are Ayn Rand 144, Agatha Christie 144, Alice Walker 145, and Elmore Leonard 150 (93-94).
"Use of Not per 10,000 Words" (98) evaluates the preference for the positive form, in effect, "dishonest" rather than "not honest." The winners are Joyce with 52, Dan Brown 61, Michael Chabon 66, Palahniuk 67, Virginia Woolf 68, and Nabokov 71. At the bottom are Theodore Dreiser 124, Ernest Hemingway 125, Jane Austen 126, William Faulkner 131, Christie 131, Stephenie Meyer 131, E. M. Forster 137, Veronica Roth 146, and Rand 151.
"Use of Exclamation Points per 100,000 Words" (85) lists Leonard with 49, Hemingway 59, John Updike 88, and Chabon 91. The heavy users are Brown with 411, E. M. Forster 418, Austen 449, Joseph Conrad 483, Mark Twain 512, D. H. Lawrence 609, George Orwell 620, Rowling 670, Charles Dickens 713, Tolkien 767, E. B. White 782, Sinclair Lewis 844, Tom Wolfe 929, and Joyce 1105. The Wake is the main contributor to Joyce's total, averaging 2,102 exclamation points per 100,000 words (90).
Joyce is the best and the worst in the graphs! Seeing that Nabokov appears in more bland positions, I wondered why Joyce's name was not used in the book's title. Some may find these results superficial, but they resemble scientific discoveries of hidden truths and presage a new kind of literary criticism. One may even sense in these numbers an aesthetic quality evoking the "Ithaca" episode of Ulysses.
"Use of Suddenly per 100,000 Words" (92) puts Joyce near the top: Palahniuk 2, Austen 8, Leonard 9, Chabon 12, Twain 12, and Joyce 12. At the bottom are F. Scott Fitzgerald with 64, Conrad 71, and Tolkien 78. "Use of Clichés per 100,000 Words" (158) tallies the 4,000 entries from The Facts on File Dictionary of Clichés,4 placing James Patterson at 160 (among five men) and Austen with 45 (among five women) at opposite ends of the spectrum and Joyce with 118 as the ninth most cliché-ridden writer. I became suspicious and focused on Joyce's "3 Novels" notation in the graphs, meaning A Portrait, Ulysses, and the Wake (237).
In the Wake, "every Tom, Dick, and Harry" only appears in variations such as "every tim, nick and larry," "every tome, thick and heavy," and "every toad, duck and herring" (FW 19.27-28, 325.34, 506.01-02). "Suddenly" appears six times with that spelling and additionally as, for instance, "he...