I have been having some fun lately exploring the development of language parsers in the context of how they fit into the Chomsky Hierarchy.
What is a good real-world (i.e., not theoretical) example of a context-sensitive grammar?
Good question. Although, as mentioned in the comments, very many programming languages are context-sensitive, that context-sensitivity is often resolved not in the parsing phase but in later phases. That is, a superset of the language is parsed using a context-free grammar, and some of the resulting parse trees are later filtered out.
However, that does not mean that those languages aren’t context-sensitive, so here are some examples:
Haskell allows you to define functions that are used as operators, and also to define the precedence and associativity of those operators. In other words, you can’t build the correct parse tree for an operator expression like:
a @@ b @@ c ## d ## e
unless you’ve already parsed the precedence/associativity declarations for @@ and ##:
infixr 8 @@
infixr 6 ##
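To make the dependency concrete, here is a minimal sketch in Python (not how GHC actually does it) of a precedence-climbing parser whose result depends entirely on a fixity table. The names FIXITY and parse_expr are made up for illustration, and the table is assumed to have been filled in from the infixr declarations above. The sketch also ignores Haskell's rule that rejects mixing operators of equal precedence but different associativity.

```python
# Hypothetical fixity table, standing in for parsed declarations such as
# "infixr 8 @@" and "infixr 6 ##": operator -> (associativity, precedence).
FIXITY = {"@@": ("right", 8), "##": ("right", 6)}


def parse_expr(tokens, fixity, min_prec=0):
    """Precedence climbing over a flat token list; consumes tokens in place."""
    left = tokens.pop(0)  # an operand
    while tokens and tokens[0] in fixity and fixity[tokens[0]][1] >= min_prec:
        op = tokens.pop(0)
        assoc, prec = fixity[op]
        # A right-associative operator lets its right operand absorb further
        # operators of the same precedence; a left-associative one does not.
        next_min = prec if assoc == "right" else prec + 1
        right = parse_expr(tokens, fixity, next_min)
        left = (op, left, right)
    return left


tree = parse_expr("a @@ b @@ c ## d ## e".split(), FIXITY)
# tree == ("##", ("@@", "a", ("@@", "b", "c")), ("##", "d", "e"))
```

Swap in a different table (say, both operators left-associative with the precedences reversed) and the same token sequence yields a different tree, which is exactly why the declarations have to be known first.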
A second example is Bencode, a data language that prefixes content with its length:
<length>:<contents>
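The issue with this format is that it’s pretty much impossible to parse without something context-sensitive, because the only way to figure out the field sizes is by parsing the string itself: the length prefix has to be read and interpreted as a number before the parser knows where the contents end.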
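As a rough illustration, a hand-written reader for a single length-prefixed string might look like the Python sketch below. The function name parse_bencode_string is my own, and it covers only the byte-string form shown above, not Bencode's integers, lists, or dictionaries.

```python
def parse_bencode_string(data: bytes, pos: int = 0):
    """Parse one '<length>:<contents>' byte string starting at pos.

    Returns (contents, position just past the contents).
    """
    colon = data.index(b":", pos)  # end of the decimal length prefix
    length = int(data[pos:colon].decode("ascii"))  # the prefix decides how much to read
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("declared length exceeds available data")
    return data[start:end], end


first, pos = parse_bencode_string(b"4:spam3:a:b")
second, _ = parse_bencode_string(b"4:spam3:a:b", pos)
# first == b"spam", second == b"a:b"
```

Note that the second string contains a colon of its own. The parser never looks for a delimiter inside the contents; it simply counts bytes, and that count is a value computed from earlier input.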
A third example is XML, assuming arbitrary tag names are allowed: every opening tag must be closed by a tag with the same name:
<hi>
<bye>
the closing tag has to match bye
</bye>
</hi> <!-- has to match "hi" -->
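In practice this check is done by remembering the open tag names and comparing each closing tag against the most recent one. Here is a small Python sketch of that idea. The names TAG and tags_match are hypothetical, and the regular expression is a deliberate simplification that ignores attributes, comments, and self-closing tags.

```python
import re

# Simplified tag pattern: no attributes, comments, or self-closing tags.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)>")


def tags_match(text: str) -> bool:
    """Return True if every closing tag names the most recently opened tag."""
    open_tags = []
    for closing, name in TAG.findall(text):
        if not closing:
            open_tags.append(name)  # remember the name for later comparison
        elif not open_tags or open_tags.pop() != name:
            return False  # closing tag with no matching open tag
    return not open_tags  # anything left open is also an error


ok = tags_match("<hi><bye>text</bye></hi>")
bad = tags_match("<hi><bye>text</hi></bye>")
```

The nesting alone could be handled by a context-free grammar. It is the comparison of arbitrary names (the != test above) that a grammar with a fixed, finite set of rules cannot express.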
Context-sensitive grammars are sometimes used in descriptions of programming language semantics. Perhaps the most comprehensive use of context-sensitive grammars was the Algol68 language definition. It used a two-level context-free grammar (see http://en.wikipedia.org/wiki/Two-level_grammar) to describe both the syntax and semantics of Algol68 programs.
A couple of my colleagues used the van Wijngaarden grammar to direct their implementation of Algol68 (see http://en.wikipedia.org/wiki/FLACC).
As far as I know, context-sensitive grammars are used only in natural language processing. Programming language interpreters and compilers do not try to parse a context-sensitive grammar directly because of the complexity involved (although some attempts have been made in the past).
You may be able to find examples of real use in one of these libraries:
http://en.wikipedia.org/wiki/List_of_natural_language_processing_toolkits
http://opennlp.sourceforge.net/projects.html
http://nltk.org/
http://nlp.stanford.edu/nlp/javadoc/javanlp/