Legal Corpus Linguistics and the Half-Empirical Attitude

Legal writers have recently turned to corpus linguistics to interpret legal texts. Corpus linguistics, a social-science methodology, provides a sophisticated way to analyze large data sets of language use. Legal proponents have touted it as giving empirical grounding to claims about ordinary language, which pervade legal interpretation. But legal corpus linguistics cannot deliver on that promise because it ignores the crucial contexts in which legal language is produced, interpreted, and deployed.

First, legal corpus linguistics neglects the relevant legal context: the conditions that give legal language authority. As a result, the evidence that legal corpus studies offer about language use perversely obscures and misstates the issues legal interpreters face. Second, legal corpus linguistics overlooks the relevant institutional context: the way legal language is produced by particular speakers, taken up by particular audiences, and formulated in particular genres. By unrealistically treating language as undifferentiated, legal corpus work imagines a communicative world that is not reflected in its own data.

The underlying problem, I show, is a mismatch of method with goal. Within linguistics, corpus linguistics makes an empirical claim: that its analysis illuminates truths about the language in the corpus. Legal corpus linguistics, in contrast, uses empirical methods to support a normative claim: that its analysis ought to influence the interpretation of legal texts. Treating normative claims as though they were empirical findings constitutes what I call a half-empirical attitude. Because of this attitude, legal corpus work rests empirical results on fictional foundations. At the same time, I suggest ways that legal corpus linguistics could be useful to legal theory, if it embraces the other half of an empirical attitude.
