Apple’s scripting language AppleScript was designed with localization in mind, allowing the language to be represented in multiple dialects resembling natural languages from around the world.
In this way, using lexing and parsing tables, programs could be written in a style similar to the developer’s mother tongue (or any other supported language).
Although English is the only dialect supported at this time, Apple originally implemented additional dialects in French and Japanese.
Furthermore, a ‘professional’ dialect resembling Java was considered.
The benefit of this type of localization was considered to be that programs written in one dialect could be translated into another, in order to aid developers co-operating internationally.
However, the method was deemed too difficult to maintain, due to complex conjugation and other issues that arise when implementing dialects of languages other than English.
I have not heard of any other programming language adhering to localization schemes such as this, and I’m interested to hear the history of programming language dialect localization.
The only similar implementation I’ve found is that of Microsoft Excel, which may – apparently – be authored using nouns and verbs of multiple languages depending on user region.
My question is: what other languages, if any, implement a similar paradigm for programming language internationalization?
For those of you interested, Cook wrote a great article on the development of AppleScript for HOPL III, the third conference on the History of Programming Languages.
My question is: what other languages, if any, implement a similar paradigm for programming language internationalization?
Fortunately, as far as I know, there are none. I fervently hope it stays that way.
Localisation of anything important in a language or an API or any of the various defined symbols and strings that programs depend on, every time I’ve run across it, is an unmitigated disaster. We recently ran across the situation where software could not be installed on a Spanish version of Windows because the system-defined group usually called “Everyone” is called “Todos” in Spanish. We routinely have problems with simple things like date formats in the USA and the decimal separator in Germany. These are problems no-one needs to have embedded in their source code.
Lest it be said that I am seeking unfair advantages for English, in practice I think the opposite is true. Developers working in languages other than English can typically use a wide range of names drawn from their own language with no fear of confusion with the reserved words and API symbols provided by the system environment. I see this as a plus.
Localization and internationalization facilities exist for applications, often as library functions (e.g. Posix gettext).
In the 1960s and 1970s several programming languages appeared in France with French keywords, e.g. PAF and LSE.
However, it makes much less sense to localize the source code of programs and scripts (e.g. by changing keywords of programming languages like if to si in French) because the meaning of a program is also conveyed by the identifier names and the comments.
Automatic, reliable and faithful translation of such names and comments is IMHO beyond the state of the art.
And I believe it would be simpler to have the machine program itself, i.e. synthesize its own code, instead of translating programs to be humanly understandable by other cultures. Look at Artificial General Intelligence and e.g. J. Pitrat’s blog.
In practice, developers of software to be worked on by some international team (e.g. free software projects) should agree on some human language (often some form of English) and on some coding conventions or coding styles.
Some languages don’t have reserved keywords (e.g. APL, or even PL/1, where the same name IF can play both the role of a keyword and the role of an identifier, so that IF IF=THEN THEN; is a valid but cryptic PL/1 statement), but they do have identifiers, and developers do choose identifier names that are meaningful in their own culture. Translating these automatically is not realistic.
A very few publications mention using natural language processing techniques on comments and identifiers in static program analysis of source code. (I am interested in additional references.)
BTW, look into COBOL; I believe it was designed with the naive claim that source code would become readable by non-programmers.
Some French teachers used to teach programming in C by defining macros like
// French equivalent of some C keywords
#define si if
#define sinon else
#define faire do
#define tantque while
#define pour for
but this fell out of fashion. Now most programming teachers in France require some basic fluency in English.
(I would be interested to know whether something equivalent is done today in Chinese or Arabic universities.)
The Citrine Programming Language is an attempt to create a localized programming language. It currently supports English, Dutch, Hindi, Lithuanian and Romanian. They are working on Russian, Chinese, Turkish, Albanian and more, but need help. The language also offers a translation feature, so you can translate a program from one language into another. The language even supports localized punctuation: the danda for Hindi, for example.
https://citrine-lang.org/
This is very, very difficult.
What I would really, really dislike is having five languages “c++-English”, “c++-French”, “c++-German”, “c++-Italian” and “c++-Chinese”. Much better would be a system where the program text is tokenised and can be displayed and edited in a language of your choice.
But if you have a language like Swift, much of the readability comes from carefully crafted function and parameter names. This works reasonably well, though not perfectly, with English. Whether it works with other languages is hard to say. It wouldn’t work with switchable languages at all.