I have experience with object-oriented programming languages (C++ and Java), but I am wondering what kinds of programming languages (including OOP languages) or programming paradigms might be suitable for a rule-based natural-language translation program.
Does existing machine-translation software tend to use a certain paradigm or language?
In all honesty: natural language translation is so incredibly, mind-bogglingly, insanely hard in general that the choice of language is completely and utterly irrelevant. Even when programming in Malbolge, 99.9999999% of your time, effort, resources, manpower and money is going to be spent researching and inventing the algorithms rather than implementing them.
There are basically two known approaches to machine translation that “work” (as in “produce results that are slightly better than complete garbage”): the classical approach and the Google approach:
- classical approach:
  - hire a dozen or so researchers (alternatively: donate a couple of million dollars to a university)
  - wait twenty years
- Google approach:
  - create a database of everything ever written in the entire history of humankind
  - pattern-match your input text against documents that exist in both the source language and the target language
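To make the "pattern match against parallel documents" idea a bit more concrete, here is a toy sketch in Python. It is nowhere near what a real statistical system does: the corpus, the function names `build_phrase_table` and `translate_by_lookup`, and the greedy longest-match strategy are all assumptions made up for illustration.

```python
# Hypothetical toy "parallel corpus": phrases that exist in both languages.
# A real system would mine millions of aligned documents instead.
PARALLEL_CORPUS = [
    ("good night", "buenas noches"),
    ("thank you", "gracias"),
    ("good", "bueno"),
    ("night", "noche"),
]


def build_phrase_table(corpus):
    """Map each source phrase (as a tuple of words) to its target phrase."""
    return {tuple(src.split()): tgt for src, tgt in corpus}


def translate_by_lookup(text, table):
    """Greedy longest-match lookup; unknown words pass through unchanged."""
    words = text.lower().split()
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # No phrase starting at this word is known: keep the word as-is.
            out.append(words[i])
            i += 1
    return " ".join(out)


table = build_phrase_table(PARALLEL_CORPUS)
print(translate_by_lookup("Good night", table))
print(translate_by_lookup("thank you friend", table))
```

Note that "good night" is matched as a whole phrase before the single words "good" and "night" are tried, which is the only reason the toy gets it right.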
There is also a third approach that IMO works best:
- use the Google Translate API
Now, if you insist on doing it yourself, then whenever you hear the words “rule-based”, you should think of languages like Prolog and Mercury or embedded DSLs such as Kanren and miniKanren.
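To give a feel for what "rule-based" means here, below is a minimal sketch. It is plain Python rather than Prolog or miniKanren, so it only imitates the style (a lexicon of facts plus rewrite rules) without any real unification or backtracking. The lexicon, the tag names, and the functions `reorder_adjectives` and `translate_by_rules` are hypothetical and exist only for this example.

```python
# Hypothetical lexicon: English word -> (Spanish word, part-of-speech tag).
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "black": ("negro", "ADJ"),
    "car": ("coche", "NOUN"),
    "dog": ("perro", "NOUN"),
}


def reorder_adjectives(tagged):
    """Rule: English ADJ NOUN becomes Spanish NOUN ADJ."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


# Rules are applied in order; a real system would have thousands of them.
RULES = [reorder_adjectives]


def translate_by_rules(sentence):
    """Look up each word, then apply every rule to the tagged sequence."""
    # Words missing from the lexicon are kept unchanged and tagged "UNK".
    tagged = [LEXICON.get(w, (w, "UNK")) for w in sentence.lower().split()]
    for rule in RULES:
        tagged = rule(tagged)
    return " ".join(word for word, _ in tagged)


print(translate_by_rules("the red car"))
print(translate_by_rules("the black dog"))
```

Even this toy shows where the effort goes: gender agreement, ambiguity, and idioms each need more rules, and the rules start to interact. That interaction is where a logic language, which can search over rule applications for you, starts to pay off.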