Internationalization of a non-English application

I know there are lots of posts about internationalization, but this is something I didn’t find while searching.

I have a PHP web application, which is pretty big right now. It has been actively developed for 4 years and wasn’t built with internationalization in mind. Text is everywhere: in plain HTML, in PHP variables, in echo statements, in the DB…

Now, I’m familiar with the concept of gettext and this is what I plan to use for the internationalization of the application. However, the app is not written in English, and here is my question:

Should I first translate everything to English while wrapping every string in the gettext() function, or can I use my native language as a base?

P.S. Any quick suggestions (links maybe) on making my life easier with the whole i18n project would also be greatly appreciated!

0

It is just my humble opinion, but I think you really should use English as the default language of your application (as for every other conceivable application).

There are at least three reasons for this:

  1. Most if not all of the existing tools recognize and handle just English (or at least just English-like languages). I’m Italian and I know very well what it means to handle non-ASCII (accented) characters in code and in UIs.

  2. Nowadays it is almost impossible to develop an application without exchanging code with people abroad. Just think of SO itself. You stumble on a problem, look for help on SO and publish a snippet of your code here. The people who could help you will have a hard time trying to understand your native-language variable names and comments.

  3. Most important, on the web almost everybody understands English. Maybe only a small percentage of people can actually speak it and understand the spoken language, but almost everybody can read it and understand the written language. English is arguably the best choice as a default (or “fail-safe” or “fall-back”) language.

Despite this, there is no compelling reason to translate everything into English as your first move. Just translate it as you move to the new version.

Moreover: do not try to translate everything. There are elements of code and of UI that will never get exposed, not even to other (foreign) programmers.

Just try to translate these elements:

  1. Messages that show up in the UI (notifications and the like) and in the console (error messages and the like).

  2. UI texts (of course…). That means any text in HTML and JavaScript.

  3. If possible, public class names, public variable names, public method names and comments in code. These elements will likely be read and used by other programmers, and at this point you cannot know what their native language will be.

  4. If you use some custom build script (Ant/Maven/Rake/whatever), some custom versioning script (Git, SVN, etc.) or some custom code-generation tool, translate them as well. They are part of your toolbox/toolchain and it is important that other (maybe foreign) programmers can understand and use them.

I know that translating source code is a huge and terrible effort. I do not mean you have to do it all at once and right now. Just translate the code while you refactor, and only when you stumble on it. It is better to have a mixed English/native-language source tree than a native-language-only one, because every decent programmer in your area will have to understand English in any case. This means that you cannot do any damage by translating your stuff into English.

Leave in your language the elements that do not show up in the UI and in code and/or that are too painful to translate. For example:

  1. DB table and column names. Create an external “glossary” file for them.

  2. Private (“inner”) classes, private variable and private method names. Most likely, these elements will never show up elsewhere and you can safely ignore them.

Keep in mind that most IDEs can help you a lot in this task. Also, you can easily write a custom PHP (or Python/Perl/Ruby) script to help you find and change the text strings.
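As a rough illustration of such a helper script, here is a minimal Python sketch that lists quoted string literals in PHP source that contain non-ASCII characters. It rests on a loud assumption: that native-language text can be spotted by accented or non-Latin characters, which will miss plain-ASCII native strings. The function name `find_untranslated_literals` and the sample source are made up for the example, and the regular expression ignores heredocs, comments and inline HTML.

```python
import re

# Single- and double-quoted literals with backslash escapes.
# Simplification: heredocs, comments and inline HTML are not handled.
SINGLE = r"'(?:\\.|[^'\\])*'"
DOUBLE = r'"(?:\\.|[^"\\])*"'
STRING_LITERAL = re.compile(SINGLE + "|" + DOUBLE, re.DOTALL)


def find_untranslated_literals(php_source):
    """Return (line_number, literal) pairs for literals with non-ASCII text."""
    hits = []
    for match in STRING_LITERAL.finditer(php_source):
        literal = match.group(0)
        # Assumption: a character above code 127 hints at native-language text.
        if any(ord(ch) > 127 for ch in literal):
            line_number = php_source.count("\n", 0, match.start()) + 1
            hits.append((line_number, literal))
    return hits


# Hypothetical PHP snippet used only to exercise the function.
sample = "\n".join([
    "<?php",
    '$title = "Benvenuto, è un piacere";',
    "echo 'ok';",
    'echo "Città";',
])

for line_number, literal in find_untranslated_literals(sample):
    print(line_number, literal)
```

A report like this is only a starting point for deciding which strings to wrap. Reviewing the hits by hand is safer than rewriting them automatically.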

1

I’d suggest you start localising into two languages, the one you have and English, so you can see that everything is translated. Translating the source code first makes no sense; do that when you add the localisation step.

Not a very good source, I’ll see if I can find a better one. Until then, this is from the GNOME dev center:

Currently, gettext does not support non-ASCII characters (i.e. any characters with a code above 127) in source code.

As I live and work in Germany, our websites use translations all the time. We use Ruby on Rails, but from what I read, gettext() seems to work similarly to RoR’s translation system. It won’t matter if you have the translation already. Just create the file with your local texts and find some nice system to sort this file (mostly following your site’s structure).

We normally do something like this:

main.news: News
main.download: Download

portal.order.name: Name
portal.order.street: Street
portal.order.city: City

So the first two are texts on the main page and the others are labels for the order form in the portal section. You would then get them with gettext("main.news"). Do not just make one long list or you will never find stuff again. Allow for some stuff to appear several times (like we have addresses in several forms). You could make tags for them like address.street, address.city, but don’t try to avoid repetitions 100%.
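To make the key scheme concrete, here is a small Python sketch that loads a flat "key: text" file like the one above and looks entries up by key. This is not the gettext or Rails file format, just a hypothetical stand-in; `parse_catalog`, `translate` and `keys_in_section` are illustrative names. Returning the key itself for a missing entry is one possible design choice, so that gaps are easy to spot on the page.

```python
def parse_catalog(text):
    """Parse 'key: text' lines into a dict, skipping blank lines."""
    catalog = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        catalog[key.strip()] = value.strip()
    return catalog


def translate(catalog, key):
    """Return the text for key, or the key itself when no entry exists."""
    return catalog.get(key, key)


def keys_in_section(catalog, prefix):
    """List all keys under a section prefix such as 'portal.order'."""
    return sorted(k for k in catalog if k.startswith(prefix + "."))


catalog = parse_catalog("""
main.news: News
main.download: Download

portal.order.name: Name
portal.order.street: Street
portal.order.city: City
""")

print(translate(catalog, "main.news"))
print(keys_in_section(catalog, "portal.order"))
```

Grouping by prefix is what keeps the file navigable: everything for one form or page can be listed, reviewed and handed to a translator together.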

It may help to have the translations in Excel sheets or similar, from where you generate the text files. This may make handling easier. We don’t do this normally, a good editor is ok too.

Then comes the database, which is more complex. There are several ways to do this. Basically, either have separate tables for the translations, like Product with ProductTranslations, or just repeat fields like title_en, title_de, etc.

The table approach is better if you have many languages. The repeated-fields approach is simpler, and searching with SQL is easier. But we use Apache Solr for the text search, with one core per language, which works pretty well, so the database is mainly for storage.
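A minimal sketch of the separate-table approach, using Python’s built-in sqlite3 module with an in-memory database. The table names, column names, sample rows and the `product_title` helper are all assumptions made for illustration, and falling back to English for a missing locale is just one possible policy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, sku TEXT NOT NULL);
    CREATE TABLE product_translation (
        product_id INTEGER NOT NULL REFERENCES product(id),
        locale     TEXT NOT NULL,
        title      TEXT NOT NULL,
        PRIMARY KEY (product_id, locale)
    );
    INSERT INTO product (id, sku) VALUES (1, 'A-100');
    INSERT INTO product_translation VALUES (1, 'en', 'Garden chair');
    INSERT INTO product_translation VALUES (1, 'de', 'Gartenstuhl');
""")


def product_title(conn, product_id, locale, fallback_locale="en"):
    """Return the title in locale, else in fallback_locale, else None."""
    row = conn.execute(
        """
        SELECT title FROM product_translation
        WHERE product_id = ? AND locale IN (?, ?)
        ORDER BY locale = ? DESC  -- the requested locale sorts first
        LIMIT 1
        """,
        (product_id, locale, fallback_locale, locale),
    ).fetchone()
    return row[0] if row else None


print(product_title(conn, 1, "de"))
print(product_title(conn, 1, "fr"))  # no French row, so the fallback is used
```

Adding a language here means inserting rows, not altering the schema, which is the main reason this layout scales better than title_en, title_de columns.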

Depending on your language, be prepared for encoding problems. If at all possible, avoid having to copy data between different systems. The worst is Windows to Linux. If it is necessary, use the most direct way possible. We once exported MS SQL Server data to CSV and then read it back into Linux/MySQL. It was a constant source of pain. It worked far better when we skipped the CSV part and read the data directly through ODBC.
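If a CSV hop between systems cannot be avoided, being explicit about the encoding on both sides removes much of the guesswork. The Python sketch below assumes, purely for illustration, that the Windows side exported the file as cp1252; the helper name `read_csv_bytes` and the sample data are made up.

```python
import csv
import io


def read_csv_bytes(raw, encoding):
    """Decode raw bytes with an explicit encoding, then parse them as CSV."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text)))


# Bytes as a hypothetical Windows export might produce them (cp1252, CRLF).
windows_export = "id,city\r\n1,München\r\n".encode("cp1252")

rows = read_csv_bytes(windows_export, "cp1252")
print(rows)

# Decoding the same bytes as UTF-8 fails loudly instead of corrupting data.
try:
    read_csv_bytes(windows_export, "utf-8")
except UnicodeDecodeError as error:
    print("wrong encoding:", error.reason)
```

A loud failure is the lucky case. When the wrong codec happens to decode without error, the result is silently garbled text, which is why stating the encoding at every step matters.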

Should I first translate everything to English while wrapping every string in gettext() function, or I can use my native language as a base?

You have to translate to English, as the gettext maintainers explicitly refuse to support a non-English default language: https://savannah.gnu.org/bugs/index.php?49598

0
