The Chomsky hierarchy

Introduction:
  • Noam Chomsky proposed syntactic models of natural languages.
  • The Chomsky hierarchy is a containment hierarchy of classes of formal grammars.
  • Four families of grammars make up the chomsky hierarchy
Formal definition of Gramar:

A formal grammar G = (N; Σ ; P; S) consists of:
A finite set N of nonterminal symbols;
A finite set Σ of terminal symbols not in N;
A finite set P of production rules of the form u→v, where u and v are strings in (Σ U N) and u contains at least one non-terminal symbol;
a start symbol S in N.

  • A formal grammar defines (or generates) a formal language, which is a set of sequences of symbols that may be constructed by applying production rules to a sequence of symbols which initially contains just the start symbol.
  • A rule may be applied to a sequence of symbols by replacing an occurrence of the symbols on the left-hand side of the rule with those that appear on the right-hand side
Nonterminals are usually represented by uppercase letters, terminals by lowercase letters, and the start symbol by S
For example, the grammar with terminals {a,b}, nonterminals {S,A,B}, production rules
S → ABS
S → ε (where ε is the empty string)
BA → AB
BS → b
Bb → bb
Ab → ab
Aa → aa
and start symbol S, defines the language of all words of the form a^nb^n.

The Chomsky hierarchy consists of the following levels:

Type – 0 grammars (unrestricted grammars): They include all formal grammars.They generate exactly all languages that can be recognized by a turing machine.

An unrestricted grammar is a quadruple (V; Σ ; P; S) where V is finite set of variables, Σ (the alphabet) is a finite set of terminal symbols, P is a set of productions, and S is a distinguished element of V.

Type – 1 grammars (Context-Sensitive Grammars): The Generate the context sensitive languages.

No restrictions are placed on the left-hand side of the production, but the length of the right-hand is required to be at least that of the left.


Type-2 grammars (context-free grammars): They generate the context-free languages.

Type-3 grammars (regular grammars): They generate the regular languages. Every regular grammar is a context-free grammar.
Such a grammar restricts its rules to a single nonterminal on the left-hand side and a right-hand side consisting of a single terminal, possibly followed by a single nonterminal.