On misunderstandings and misrepresentations: A reply to Rao et al.
When Language forwarded me the reply from Rao, Lee, and colleagues (2015; henceforth Rao et al.) to my article in Language 90.2 (Sproat 2014), I was offered an opportunity to respond, but I was also instructed to be brief. Fortunately, it is easy to be brief since I believe my article lays out the case very well, and the reader need only look there and at Rao et al.’s original articles to see that I did not misunderstand or misrepresent their views, as they claim.
Nonetheless, I should still say a few things, so I will pick a few highlights of their reply and discuss the most prominent issues. The reader should understand that all of their objections can easily be dismissed, but I do not have the space here to do so.
First of all, Rao et al. seem to imply that I used circular reasoning to define linguistic versus nonlinguistic symbol systems, and set things up to suit my needs. On the contrary, I used a definition that most linguists would agree on, and I developed the corpora well before I subjected any of them to serious statistical analysis. Contrary to what Rao et al. imply, my notion of linguistic versus nonlinguistic was derived from first principles that had nothing to do with any point I might hope to prove, and the creation of these corpora has in any case been well documented. If the measures applied to those sets fail to make the distinctions Rao et al. prefer, it is not because I cooked the books.
Let me now turn to the issue of whether I have misrepresented or misunderstood Rao et al.’s claims. I do not think I misrepresented them, and if I have misunderstood anything, it is only because their views keep shifting. I give just two examples.
First, Rao et al. criticize my use of the decision tree in Lee et al. 2010a, claiming that I mischaracterized its intent since their classifier ‘was developed to try to determine the level of communication that a character communicates at, rather than a definition of writing’ (p. e200). That seems to me inconsistent with their original claims: if nothing else, it was very clear from their tree labels that everything on the right branch of their decision tree was intended to be linguistic: they used the labels ‘words’, ‘syllables’, and ‘letters’. One can obfuscate the issue by talking about a middle layer of ‘level of communication’, but the point of their method was clearly to decide whether something might count as writing: the very title of their article was ‘Pictish symbols revealed as a written language through application of Shannon entropy’. In their reply to Sproat 2010, Lee and colleagues (2010b) were happy with what I had thought was a problematic result, namely that their measure classified Mesopotamian deity symbols as writing: for them this was acceptable and came down to a ‘difference in viewpoint over terminology as to the definition of what constitutes “writing” ’ (p. 793). They were content with my application of their measure then and even replicated my result, but now they think I have misapplied it. But I applied it exactly the same way in both cases: the only difference is that now I have applied it to a much larger set of data, and there are many more cases that are inconsistent with their theory, and much less easy to reconcile with it. In any case I do not see how Rao et al.’s reinterpretation helps: given my results, one would say that Pennsylvania barn stars communicate at a ‘level of communication’ corresponding to letters. What does that even mean?
Second, Rao et al. claim that in a footnote in my article I misrepresent their views of ‘rigidly ordered systems’ and the expected behavior of unigram block entropy. The definition they now give of such a system, ‘one with very restrictive syntactic rules’ (p. e203), differs substantially from the one they give in their Science article (Rao et al. 2009), where the archetypal instance was a symbol system where after any symbol...