# Grammar and Algorithms: Linguistics' Contribution to Computer Programming

Have you ever wondered why the codes used in programming are called languages? They do not seem to have anything in common with human languages, except that they can be learnt, and one can become fluent or proficient in using them. This point of view is only partially correct. While it is true that programming languages are substantially different from spoken languages as commonly conceived, it may be surprising to discover that some important contributions to the development of these artificial languages were made by linguists.

## Computer Science before Computer Science: Pāṇini's Grammar and Mechanical Procedures

As strange as it might seem, computer science started to develop long before the first computers were invented. If computers are considered nothing but computational devices able to manipulate symbols and perform operations, with no regard to their physical implementation, it is no surprise that they were conceived even before such devices became concrete. After all, the manipulation of symbols and the operations performed on them are procedures that need no electro-mechanical implementation in order to be conceived. But what kind of symbols and operations are these devices (abstract or concrete) supposed to manipulate and perform? One of the very first machines that theorists imagined was, for example, an abstract machine able to produce utterances.

One of these theorists was an Indian grammarian, Pāṇini, who lived around 350 B.C. (Cardona 1997, pp. 267-268). His grammar has been described as “one of the greatest monuments of human intelligence” (Bloomfield 1933, p. 19), and it is considered one of the very first attempts to construct an intelligent machine. Why is that? Before answering this question, it is necessary to clarify the meaning of the word ‘grammar’.

At school, children are taught that the grammar of a language is a series of prescriptions on how the language should be spoken or written. This is a prescriptive approach to grammar, not a descriptive one. Grammar in the Pāṇinian sense is entirely descriptive: it is a mechanism, a device, able to produce all the grammatical sentences of a language, in this case Sanskrit. So, not only does it describe the grammar of a language, but it also includes a generative procedure for that same language. It is not at all obvious how one can conceptually construct a machine of this kind, one that, for each input, gives the corresponding output and encodes all the relevant grammatical information. As we will see, this procedure partly resembles the notion of an algorithm. Pāṇini’s machine works through a series of rules and metarules that the grammarian described in detail in his masterpiece, the *Aṣṭādhyāyī* (‘eight books’). This Sanskrit machine manipulates the sounds of the language and constructs correct sentences through semantic, morpho-syntactic, morphological, and phonological operations. The series of rules that leads to the complete, final stage of the sentence is called a derivation of that sentence. Consider some simple grammatical operations of English:

(1) verb --> noun

a. derive --> derivation

b. collect --> collection

(2) singular --> plural

a. hope --> hopes

b. key --> keys

(1) and (2) contain morphological and phonological rules: (1) converts verbs into the corresponding nouns, while (2) forms the plural of nominal forms. Such rules, though productive to a large extent, are not exhaustive (what about *child* --> *children*?) and somewhat imprecise (the *s* in *hopes* is pronounced differently from the *s* in *keys*, although both words contain the same plural morphophoneme, so this grammar lacks phonetic information). Unlike grammars constructed with rules like (1) and (2), Pāṇini’s grammar is an exhaustive and precise system: it describes and explains all the possible grammatical operations, even exceptions, and their effective phonological output.
There has been some discussion of the exact computational capacity of Pāṇini’s grammar (cf. Staal 1965, 1966, Hyman 2009, Penn and Kiparsky 2012, and Lowe 2021), but scholars agree that the formal system Pāṇini depicted impressively anticipated the theoretical work that, in the 20th century, underpinned the concept of computation and, consequently, that of computing machines, or computers.
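Returning to toy rules like (1) and (2): as a rough illustration, they can be sketched in Python as string-rewriting functions. The rule tables below are illustrative inventions, not part of Pāṇini’s system, and the exception table shows exactly the kind of case a purely general rule misses:

```python
# Toy versions of rules (1) and (2); the tables are illustrative only.

NOUN_OF = {"derive": "derivation", "collect": "collection"}  # rule (1)
IRREGULAR_PLURALS = {"child": "children"}  # cases rule (2) cannot cover

def nominalize(verb: str) -> str:
    """Rule (1): verb --> noun, for the verbs listed in NOUN_OF."""
    return NOUN_OF[verb]

def pluralize(noun: str) -> str:
    """Rule (2): singular --> plural, with a fallback to the default -s."""
    return IRREGULAR_PLURALS.get(noun, noun + "s")
```

Note that, unlike Pāṇini’s system, this sketch says nothing about pronunciation: *hopes* and *keys* both receive the same written *-s*.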

## Dreams and Machines: From Leibniz to Turing

Pāṇini’s grammar has no equivalent in the ancient world. Yet, centuries later, philosophers and mathematicians explored the very same intellectual area, i.e., conceiving a mechanical procedure to solve specific problems. These computing procedures were now associated with thought in general (of which language constitutes a significant part) rather than with language alone: philosophers such as Thomas Hobbes (1588-1679) and Gottfried Wilhelm Leibniz (1646-1716) defined reasoning as nothing but a calculus. Leibniz himself hoped that, once all the rules of the mind had been discovered and clarified, discussion between humans could be reduced to a form of calculus able to resolve any debate. To achieve this, the following elements were required: a universal language composed of symbols and syntactic rules, rules to compute reasonings, and an encyclopedic knowledge of the world (Davis 2000). But what is a universal language, and how does it differ from natural languages?

Natural languages like Italian, English, Swahili, and Japanese are rather numerous and differ too much from each other (in their sounds, grammar, and vocabulary) to be absorbed into a language potentially computable and understandable by anyone. Moreover, they present all kinds of ambiguity at different levels: words and structures may have different meanings although they look the same on the surface. Consider metaphorical uses: *London has fallen*. What does this sentence mean? Not that the city fell like an apple from a tree, but, for example, that after a long battle it was conquered by the enemy. A universal language for reasoning should avoid such ambiguities, so that each symbol, or string of symbols, has a unique and clear meaning. Mathematics offers this kind of language, which is why the development of logical formalism began to exploit mathematical symbolism. (The idea that natural languages are unsuitable for logical formalization would only be reconsidered in the 20th century.) The next significant contribution to a mechanical approach to thinking, as envisaged by Hobbes and Leibniz, came from the mathematician and logician George Boole (1815-1864), who in his *The Mathematical Analysis of Logic* (1847) and *An Investigation of the Laws of Thought* (1854) formalized the rules of logical reasoning in algebraic symbols, bringing Leibniz’s dream closer to fruition.
The birth of mathematical logic in the last two decades of the 19th century started a new era of speculation on formal systems. In 1879, Gottlob Frege (1848-1925) published his *Begriffsschrift* ("concept-script"), which contained a more rigid formalization of logic, different from Boole’s. Boole saw logic as a part of mathematical reasoning, hence his reduction of logical concepts to algebraic operations. Frege’s idea was symmetrically opposite: mathematics had to be reduced to logic. Although Frege dedicated much of his work to this reduction, the ‘logicist programme’ proved unachievable due to internal contradictions (exemplified by Russell’s antinomy). The problems of the logicist programme will not be dealt with further in this article; it suffices to say that Frege’s work was essential to a complete mathematization of logic, as it drove the development of mathematical logic. This was the first important step towards a deeper comprehension of grammar and languages in a broad sense. The other step is the following: the idea that natural languages were too imperfect to be treated logically, or that it was impossible to construct grammars for these languages in a logico-mathematical fashion, was still present in logicians’ minds, but it was about to be reconsidered thanks to the following insights. Building on the progress in mathematical logic, between 1931 and 1934 contributions by Kurt Gödel (1906-1978), such as *Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme* ("On Formally Undecidable Propositions of Principia Mathematica and Related Systems"), started a new area of research: the theory of recursive functions. The description of this theory is highly technical and goes beyond the purpose of the present article, but one of its elements is central here: recursion.
Recursive systems have the peculiar property of enumerating all the elements of a set through a mechanical procedure: consider the natural numbers and the notion of successor; for a given natural number *x*, its successors can be enumerated by recursively adding single units through the simple operation *x+1*. This opened an entire area of research on the notion of algorithm.
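The successor construction can be sketched in a few lines of Python (the function name is an illustrative choice): every natural number below a bound is reached by recursively applying *x+1*, so the whole set is mechanically enumerable:

```python
def naturals_up_to(limit: int) -> list[int]:
    """Recursively enumerate 0, 1, ..., limit-1 via the successor operation."""
    if limit == 0:
        return []  # base case: nothing to enumerate
    # enumerate the predecessors, then take one more successor step (x + 1)
    return naturals_up_to(limit - 1) + [limit - 1]
```

For example, `naturals_up_to(5)` returns `[0, 1, 2, 3, 4]`.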

An algorithm is a mechanical procedure that, having received a given input, produces the corresponding output after a finite number of steps. The property of recursion can then be built into algorithmic procedures, making them extremely powerful in terms of effective computability. Although different but equivalent formalizations of the notion of algorithm were developed, the model constructed by Alan Turing (1912-1955) is acknowledged as the most influential on the development of computer science (Turing 1936). Turing’s formalization, referred to as the Turing Machine, is a mathematical model of computation. Its main features are the following (represented in Figure 2):

- a tape, divided into cells, finite but extendable as needed in both directions;

- a head, which receives and reads the input symbols on the tape, one cell at a time;

- at each stage, after reading the symbol on the current cell, the head may execute one (or more) of the following operations: stay, move right, move left, delete the symbol, print a symbol, stop.

It should be clear, then, that a pen and a piece of paper are more than sufficient to show the functioning of a Turing Machine. Indeed, the internal design of a Turing Machine is extremely simple: it receives input symbols of any kind, provided that they are adequately formalized, and, after a series of operations, produces output symbols. Now it may be clearer why Pāṇini’s grammar is considered so groundbreaking: it is a small Turing Machine; in fact, it is a Pāṇini Machine.
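The pen-and-paper character of the device is easy to verify: a minimal Turing Machine simulator fits in a few lines of Python. The state names, the sparse-tape representation, and the example program below are illustrative choices, not Turing’s own notation:

```python
def run_turing_machine(program, tape, state="start", blank="_"):
    """Run a Turing Machine. `program` maps (state, read_symbol) to
    (next_state, symbol_to_print, move), where move is -1 (left),
    0 (stay), or +1 (right). The machine stops when no rule applies."""
    cells = dict(enumerate(tape))  # sparse tape, extendable in both directions
    head = 0
    while (state, cells.get(head, blank)) in program:
        state, out, move = program[(state, cells.get(head, blank))]
        cells[head] = out  # print the symbol on the current cell
        head += move       # stay, or move the head left/right
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# Example program: invert every bit of a binary string, halting on the blank.
INVERT = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
}
```

With this table, `run_turing_machine(INVERT, "0110")` rewrites the input symbol by symbol and returns `"1001"`.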

## Formal Languages and Generative Grammar: The Chomsky Hierarchy

Seen through the now refined concept of algorithm, grammars in the traditional sense – i.e. sets of rules prescribing how to use a language in its correct form – vaguely resemble a mechanical procedure leading from an input to an output. But traditional grammars are rather loose, with their imperfections and limitations. They do not formally produce correct sentences in a general way; they only contain indications for judging constructions according to imprecise notions of correctness. Pāṇini’s grammar, however, stands apart from traditional grammars, for it is algorithmic at its core, running derivations from the semantic structure to its phonological output. Indeed, it takes symbols as input and, after a finite number of steps, derives grammatical sentences as output.

Major contributions between the 1930s and 1940s were crucial to new findings on the formal properties of human languages. At the end of the 1950s, a newly appointed professor of linguistics at MIT, Noam Chomsky (b. 1928, Figure 1), defined a grammar for a language L as a set of rules that allows us to generate all (and only) the sentences belonging to L. Grammars of this kind were then called generative grammars, for their computational power to create grammatical (i.e. acceptable) sentences from a given, finite alphabet of symbols through a series of rewriting rules. This new mathematical vision of human language made it possible to classify formal languages and grammars according to their formal properties; the classification covered artificial languages developed for mathematical and logical purposes, but also natural languages. Chomsky (1956) described such a classification, which now bears his name: the Chomsky Hierarchy (Figure 3). Note that '-->' refers to symbol rewriting in the Turing sense, capital letters to non-terminal symbols (i.e. they can be rewritten as something else), lowercase letters to terminal symbols (i.e. they cannot be rewritten), α, β, and γ to strings that may contain both kinds of symbols, and ε to the empty string.

Type 0: Turing Equivalent Grammar (TEG) α --> β (α≠ε)

Type 1: Context Sensitive Grammar (CSG) αAβ --> αγβ (γ≠ε)

Type 2: Context Free Grammar (CFG) A --> γ

Type 3: Regular Grammar (RG) A --> xB
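The difference in power between the grammar types can be made concrete with a small Python sketch (the function names are illustrative). The context-free rules S --> aSb and S --> ε generate strings in which the number of *a*s and *b*s must match, while a regular grammar, with rules like A --> aA and A --> b, produces its symbols one at a time and cannot keep the two counts linked:

```python
def cfg_anbn(n: int) -> str:
    """Derive a...ab...b (equal counts) with the context-free rules
    S --> aSb and S --> ε."""
    if n == 0:
        return ""  # S --> ε
    return "a" + cfg_anbn(n - 1) + "b"  # S --> aSb

def rg_a_star_b(n: int) -> str:
    """Regular rules A --> aA and A --> b: the count of a's is not
    remembered, so only strings of the shape a...ab are produced."""
    return "a" * n + "b"
```

The recursion in `cfg_anbn` is exactly what lets a CFG "remember" how many *b*s are still owed for the *a*s already emitted.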

Where do natural languages fit in the Chomsky Hierarchy? The early work in generative grammar consisted in spotting formal features of natural languages in order to relate them to the other, artificial languages. Natural languages cannot be regular: an RG cannot produce languages of the kind aⁿbⁿ, i.e. languages where the number of occurrences of *a* depends on that of *b* and vice versa, and this is exactly what natural languages can do. *If-then* sentences are an example of this kind, where the number of *if*s and *then*s in a sentence is exactly equal, given that each *then* depends on the corresponding *if* (*If you think that if it rains then you will bring your umbrella, then I will bring mine too*). Chomsky and Miller (1963) formulated the following sentence to show the potential power of these kinds of dependencies (illustrated by numbered indexes):

Anyone1 who feels that if2 so many3 more4 students5 whom we6 haven’t6 actually admitted are5 sitting in on the course than4 ones we have that3 the room had to be changed, then2 probably auditors will have to be excluded, is1 likely to agree that the curriculum needs revision.

The sentence is hardly understandable, but it is fully grammatical, for it respects the morpho-syntactic rules of English. Nor are all natural languages generable by CFGs, as evidenced by the following examples:

(3)

*Jan säit das mer em Hans es huus hälfed aastriiche* (literally: “Jan says that we to Hans1 the house2 have helped1 painting2”)

(4)

*Gianni, Luisa e Mario sono rispettivamente sposato, divorziata e scapolo* (literally: “Gianni1, Luisa2 and Mario3 are respectively married1, divorced2 and bachelor3”)

These examples show ABC…ABC (cross-serial) dependencies, which are not generable by a CFG. Since natural languages exceed the computational power of CFGs but do not seem to fit completely within CSGs, linguists have placed natural languages in an *ad-hoc* class of languages, called Mildly Context-Sensitive (MCS).

## Conclusion

Grammatical studies by Pāṇini in ancient India anticipated, to a degree, the impact that mechanical procedures would have not just on the study of language, but on areas as different as mathematics, logic, and modern computer science. While logic and mathematics followed distinct paths until the modern era, their encounter had a deep impact on the later developments of both disciplines. The 20th century brought a formal investigation of the basis of computation and algorithms, opening the way to Turing’s innovation, the Turing Machine. The concrete implementation of this abstract device led to computers as we know them nowadays, while Chomsky’s insights into the similarities and differences between natural and artificial languages helped form a complete view of the potential of what would shortly become programming languages. Indeed, while Turing Machines perform the most general kind of computation (rewrite α --> β, where β may be empty), not all mechanical procedures need this full computational power. Regular Expressions (RE), which in programming are used to find sets of strings in a text, are exactly equivalent to RGs: both generate regular languages.
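This equivalence is easy to see in practice; here is a sketch in Python (the pattern is chosen for illustration). A classical regular expression like (ab)+ describes a regular language and can be matched with finite memory, whereas no true regular expression can recognize aⁿbⁿ, since that would require unbounded counting:

```python
import re

# (ab)+ is a regular language: one or more repetitions of "ab".
AB_RUNS = re.compile(r"(?:ab)+")

def find_ab_runs(text: str) -> list[str]:
    """Return every maximal run of 'ab' repetitions found in `text`."""
    return AB_RUNS.findall(text)
```

For instance, `find_ab_runs("xxababxaby")` yields `["abab", "ab"]`. One caveat: practical engines such as Python’s `re` module extend the classical notation (e.g. with backreferences) beyond regular languages; it is the classical REs that are exactly equivalent to RGs.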

## References

Bloomfield, L. (1933). *Language*. New York: Henry Holt.

Chomsky, N. (1956). Three models for the description of language. *IRE Transactions on Information Theory*, 2(3), 113-124.

Chomsky, N., & Miller, G. A. (1963). Introduction to the formal analysis of natural languages. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), *Handbook of Mathematical Psychology* (Vol. 2, pp. 269-321). New York: Wiley.

Cardona, G. (1997). *Pānini: His Work and Its Tradition. Volume I: Background and Introduction* (Second Edition). Motilal Banarsidass Publishers.

Davis, M. (2000). *The Universal Computer: The Road from Leibniz to Turing*. New York: Norton. Reprinted as *Engines of Logic: Mathematicians and the Origin of the Computer*. New York: Norton, 2000.

Hyman, M. (2009). From Pāninian Sandhi to Finite State Calculus. In G. Huet, A. Kulkarni, & P. Scharf (Eds.), *Sanskrit Computational Linguistics – First and Second International Symposia* (pp. 253-265). Springer.

Lowe, J. J. (2021). Revisiting Pānini’s generative power. *Trends in South Asian Linguistics*, 367, 361.

Penn, G., & Kiparsky, P. (2012). On Pānini and the Generative Capacity of Contextualized Replacement Systems. *COLING 2012*, 943-950.

Staal, J. (1965). Context-Sensitive Rules in Pānini. *Foundations of Language*, 1 (1), 63-72.

Staal, J. (1966). Pānini Tested by Fowler's Automaton. *Journal of the American Oriental Society*, 86 (2), 206-209.

Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. *Proceedings of the London Mathematical Society*, s2-42, 230-265.

## Visual Sources

Figure 1. Noam Chomsky in his office, 1957, activism.org, via La Voce di New York: https://lavocedinewyork.com/arts/lingua-italiana/2017/11/14/noam-chomsky-ha-ancora-oggi-ragione-lingua-e-cervello-sono-una-cosa-sola/

Figure 2. Basic graphical representation of a Turing Machine. Kevd1337, Public domain, via Wikimedia Commons: https://commons.wikimedia.org/wiki/File:Turing_machine.png

Figure 3. A graphical representation of the sets of languages included in the Chomsky Hierarchy. J. Finkelstein, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons: https://commons.wikimedia.org/wiki/File:Chomsky-hierarchy.svg