Murder! Mystery! Mayhem! These are not generally words one associates with linguistics. And yet it turns out that in some of the world’s most baffling criminal cases (notorious kidnappings, domestic terrorism, thinly veiled threats and collusion, false confessions, mysterious deaths), it was not the chance appearance of some wayward DNA, CSI-style, that finally cracked the code, but some seemingly harmless point about language.

Strange to think that a handful of mere words, short of a blatant confession, could end up pointing the finger at unknown perpetrators of a crime. Perhaps like DNA, words and the ways we use language can potentially reveal features of ourselves, our intentions, and our actions, left hastily at the scene without our being aware of it.

It’s thanks to the quirky use of idioms, oddly placed punctuation, vocal tics, and other idiolectal, dialectal, and stylistic markers that anonymous speakers and authors have often been identified. Linguistic evidence left behind in wiretaps, ransom notes, texts, tweets, and emails (and even by pet parrots!) has sometimes led to major breakthroughs, and even to the resolution of famous cases. Just like DNA analysis, however, these linguistic markers have to be used cautiously in a forensic context.

Used judiciously, though, these linguistic markers can provide the turning point in hard-to-crack cases. By the mid-1990s, one of the FBI’s most expensive and frustrating unsolved mysteries was that of the anti-technology domestic terrorist dubbed the Unabomber. Between 1978 and 1995, people at universities and airlines (hence the code name UNiversity and Airline BOMber) were targeted by homemade bombs, cobbled together with wood, metal pipe, and wire, and sent through the US Postal Service. Overall, the Unabomber killed three and injured many others, deliberately leaving a trail of red herrings and false clues in a long-running puzzle that stumped investigators for nearly 20 years.

Many who have heard of the Unabomber and seen the famous composite sketch of his hooded face remain unaware of the central role that language played in his eventual capture. A new Discovery true crime series (breathlessly entitled Manhunt: Unabomber) explores how forensic linguistics provided the turning point for finally identifying the Unabomber as Theodore Kaczynski, a former mathematics prodigy and UC Berkeley professor turned neo-Luddite Montana hermit.

The big break came when the FBI agreed to the Unabomber’s demand to be given a public voice, in the hope that someone would recognize something about him or his words and provide a lead. The New York Times and the Washington Post jointly published his 35,000-word manifesto, “Industrial Society and Its Future.” Many leads came in from the public, but it was Kaczynski’s sister-in-law, Linda Patrik (whom he had never met), who put two and two together and convinced her husband, David Kaczynski, to review the published manifesto. He immediately recognized distinctive phrases, idioms, and oddly familiar ideas that his brother often used, such as the unusual term “cool-headed logicians.” This was the crucial start of the FBI’s interest in Ted Kaczynski, but certainly not the end.

Anyone, even an amateur, can notice linguistic similarities between two texts and use them to guess at the author of a work. But when does a similarity become a clear correlation, strong enough to be admitted into evidence or acted upon? It’s not so surprising that language might play a part in any kind of conflict or crime, and consequently in legal settings, yet the wider acceptance of linguistic evidence from the field of forensic linguistics is still often on shaky ground. The journey from interesting linguistic coincidence to admissible evidence in court is often a circuitous one.

As Peter Tiersma and Lawrence M. Solan have pointed out in “The Linguist on the Witness Stand: Forensic Linguistics in American Courts,” “the vast majority of American lawyers and judges have little or no experience with linguistic expertise in a legal matter. Many have never even heard of it.” No wonder, then, that it may not occur to investigators that trained forensic linguists exist, much less that their expertise may be needed. The lack of experience with forensic linguistics also means judges, lawyers, the police, and profilers can be swayed by common language biases and assumptions, while potentially misunderstanding less common linguistic evidence.

The common-sense linguistic intuition that led David Kaczynski to identify his brother as the Unabomber through the written word was the spark, but it needed to be reinforced by more rigorous methods. The linguistic analysis was done by one of the FBI profilers working on the case, James Fitzgerald (who at the time was not a forensic linguist). Now that the family had provided access to Ted Kaczynski’s letters and papers, a closer comparison between Kaczynski’s language use and the Unabomber’s could be made.

The FBI used a simple computational method, looking at word frequencies, spelling variants, and the like, to build up a linguistic profile of each author and compare the two. For example, both authors used “analyse” for “analyze,” “licence” for “license,” “wilfully” instead of “willfully,” and “instalment” instead of “installment.” Fitzgerald also identified an unusual version of the common idiom “you can’t have your cake and eat it too!”: both Kaczynski and the Unabomber inverted it into “you can’t eat your cake and have it too.” There were many other similarities of content, style, and expression between Ted Kaczynski’s known work and the Unabomber’s manifesto, outlined in detail in the FBI’s affidavit.
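To make the idea of a spelling-and-frequency profile concrete, here is a minimal Python sketch. It is not the FBI’s actual procedure, which the affidavit describes only in general terms. The function names, the variant list (taken from the examples above), and the use of cosine similarity over word counts are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Spelling pairs taken from the examples in the article (variant -> standard US form).
# A real analysis would use a far larger, carefully curated list.
VARIANTS = {
    "analyse": "analyze",
    "licence": "license",
    "wilfully": "willfully",
    "instalment": "installment",
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def variant_profile(text):
    """Record which spelling of each pair the author prefers (ties and absences are skipped)."""
    counts = Counter(tokenize(text))
    profile = {}
    for variant, standard in VARIANTS.items():
        if counts[variant] > counts[standard]:
            profile[variant] = "variant"
        elif counts[standard] > counts[variant]:
            profile[variant] = "standard"
    return profile

def shared_preferences(text_a, text_b):
    """List the spelling pairs on which both texts make the same choice."""
    a, b = variant_profile(text_a), variant_profile(text_b)
    return sorted(word for word in a if b.get(word) == a[word])

def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw word-frequency vectors (0.0 if either text is empty)."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

if __name__ == "__main__":
    # Hypothetical sample sentences, not quotations from any real document.
    known = "I will analyse the licence and pay the instalment."
    questioned = "They analyse every licence wilfully."
    print(shared_preferences(known, questioned))
    print(round(cosine_similarity(known, questioned), 3))
```

A sketch like this only counts exact word forms, so it would miss inflections such as “analysed,” and it says nothing about how common a given preference is in the wider population. That second gap is one reason a matching profile is suggestive rather than conclusive.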

Together with David Kaczynski’s initial intuitions, this built up a much stronger linguistic case for the two authors being one and the same. Whether this constitutes an unassailable kind of “linguistic fingerprint” is another matter, but what’s clear is that it was these similarities in language and style that led to a search warrant being issued for Kaczynski’s off-grid cabin in the Montana woods, where more incriminating evidence was collected. It was a major breakthrough toward the conclusion of this 18-year mystery.

There are a couple of interesting points to note about the case. According to the search warrant affidavit written by Terry Turchie, the special agent in charge, none of the outside experts called in had identified Ted Kaczynski from the manuscript: “205. Numerous other opinions from experts have been provided as to the identity of the UNABOM subject. None of those opinions named Theodore Kaczynski as a possible author.”

So the link between Ted Kaczynski and the Unabomber could very well have been missed based on the expert opinion available, without help from David Kaczynski’s knowledge of his brother’s speech patterns and Fitzgerald’s linguistic analysis. Law enforcement’s lack of experience with forensic linguistics meant that many of the academic experts called in to consult had no training in linguistics, including the well-known (though not always well-regarded) Donald Foster, an English professor at Vassar and self-styled “literary detective” who has worked in the field of author identification. It is also interesting to note that, thanks to the linguistic puzzles in the successfully solved Unabomber case, investigator James Fitzgerald later went on to become the FBI’s first trained forensic linguist, attaining a master’s degree in linguistics in the mid-2000s.

Left to his own devices, however, Ted Kaczynski himself might have provided the words to his own undoing. In a curious study of Kaczynski’s writing and editing habits, researcher Catherine Prendergast reviewed his writings, housed at the University of Michigan Labadie Collection of Social Protest Literature, and even corresponded with Kaczynski.

It turns out one of the crucial items in his remote Montana cabin, along with a manual typewriter and bomb-making materials, was Strunk and White’s Elements of Style, the popular style bible for those who like their language rules black and white and prescriptive all over. While owning the book is not unusual, Kaczynski’s predilection for editing and “correcting” language down to bare expressions, hearkening back to a simpler linguistic time, apparently was. According to Prendergast, “though a terrorist, Kaczynski is also Strunk and White’s target audience: an amateur writer who hates to be wrong.”

Though Kaczynski has never admitted to being a killer, the Unabomber, or even writing the Unabomber’s manifesto, he could apparently not resist making pedantic editorial corrections to versions of the manifesto that would help solidify the case for him being its author, with firsthand knowledge of what was intended.

As Prendergast points out, Kaczynski first annotates the manifesto with: “Note. The corrections made on this copy of the ‘Manifesto’ are derived from the FBI’s transcription of the ‘Manifesto’ that accompanied the FBI’s application for a search warrant in April, 1996.”

And then:

The above note is false. I stated that the corrections were based on the FBI’s transcription of the Manifesto in order to give a plausible source for the information that enabled me to correct the Manifesto, and because in November of 2000 I thought that for legal reasons it would be imprudent to reveal the real source of the information on which I based the corrections of the Manifesto. (9 Oct. 2003)

(So perhaps Ted Kaczynski should have stuck to the language he knew best, that of mathematics—yes, if you’re curious, it turns out you can find the academic work of the Unabomber on JSTOR).

The Unabomber may have wanted to force the world back to a simpler age, but context and complexity are hard things to escape, particularly when it comes to language.

The popularity of true crime documentaries and police procedural dramas now allows us to binge on polished stories containing relatively clean clues in cases of very messy human tragedies, conditioning us to expect easy, black-and-white answers to “whodunnit,” through the use of DNA analysis and on the strength of the evidence we leave behind us. But DNA testing and other forensic techniques have also often proven to be disastrously flawed and have in fact contributed to the convictions of innocent people, with methods such as bite mark analysis now debunked as unreliable pseudoscience.

Is there a similar danger for the field of forensic linguistics, especially given “expert” witnesses who may not be trained linguists? Is merely matching and counting words enough? When does a linguistic coincidence become a smoking gun?

Stay tuned for next month’s installment of Lingua Obscura, in which we explore the darker side of forensic linguistics…


JSTOR is a digital library for scholars, researchers, and students. JSTOR Daily readers can access the original research behind our articles for free on JSTOR.

Language, Vol. 78, No. 2 (Jun., 2002), pp. 221-239
Linguistic Society of America
College English, Vol. 72, No. 1 (September 2009), pp. 10-28
National Council of Teachers of English
Transactions of the American Mathematical Society, Vol. 141 (Jul., 1969), pp. 107-125
American Mathematical Society