A history of the regular expression

Posted on June 15, 2021 at 9:14pm

My pal Buzz Andersen has written a wonderful meditation on the regular expression, the multitool of programmers everywhere.

The concept of a regular expression has a surprisingly interesting history that dates back to the optimistic, mid-20th Century heyday of artificial intelligence research.
The term itself originated with mathematician Stephen Kleene. In 1943, neuroscientist Warren McCulloch and logician Walter Pitts had just described the first mathematical model of an artificial neuron, and Kleene, who specialized in theories of computation, wanted to investigate what networks of these artificial neurons could, well, theoretically compute.
In a 1951 paper for the RAND Corporation, Kleene reasoned about the types of patterns neural networks were able to detect by applying them to very simple toy languages—so-called “regular languages.” For example: given a language whose “grammar” allows only the letters “A” and “B”, is there a neural network that can detect whether an arbitrary string of letters is valid within the “A/B” grammar or not? Kleene developed an algebraic notation for encapsulating these “regular grammars” (for example, ab in the case of our “A/B” language), and the regular expression was born.