AutomaticTranslation

From TDN

Note

If you are looking for information on langc, you're in the wrong place. You want TorqueLocalization instead.

Introduction

Since automatic translation tools such as Google and Babelfish are available on the web, it would be nice to use them to get a head start on translations. Unfortunately, automatic translations are usually very poor and so are not suitable for shipping games. However, they can be quite useful for testing purposes, or as a first draft for human translators to work from.

To this end, I decided to experiment with the Torque localization code and Google. As you will soon see, it was not particularly successful. Despite that, the code remains and this document was written to cover it in the (probably vain) hope that it may be useful some day in the future.

It is assumed that you have read the Torque localization docs and are familiar with the localization process.

The Idea

Google and Babelfish both support either translating text pasted into the browser or translating pages from the web. The fundamental problem is that a translation file cannot be pasted verbatim, as the IDs may end up translated along with the text.

The LangC code was designed to make it very easy to add file types with a minimum of extra code, so I added a new writer that writes translation files with numeric IDs instead of the textual IDs, and with no comments. The general idea was that this file would be translated, and then you would run LangC on it again to turn the translated file back into the normal commented translation file format.
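To make the round trip concrete, here is a minimal sketch in Python. It is not the LangC code: the function names, the `N=text` line layout and the example IDs are all assumptions made for illustration.

```python
def to_numeric_records(strings):
    """Replace textual IDs with their position in the string table.

    `strings` is an ordered list of (text_id, text) pairs. The `N=text`
    layout is an assumption, not the real LangC file format.
    """
    id_table = [text_id for text_id, _ in strings]
    lines = [f"{index}={text}" for index, (_, text) in enumerate(strings)]
    return id_table, lines


def restore_text_ids(id_table, translated_lines):
    """Map `N=translated text` lines back to (text_id, translated text)."""
    restored = []
    for line in translated_lines:
        # Split on the first '=' only, so the text itself may contain '='.
        number, _, text = line.partition("=")
        restored.append((id_table[int(number)], text))
    return restored
```

Only the numbers and the text go through the translator; the table of textual IDs stays on your machine and is used to rebuild the normal file afterwards.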

Unfortunately, whilst playing around with some test data, I discovered that both Google and Babelfish like to eat the newline characters and return the whole text as one long string. Undeterred, I modified the code to write a sentinel marker (##!!##) at the end of every line. I also proceeded to write a converter to turn the result of the translation back into a usable translation file. This didn't take long, and it worked great with the hand-written test data.
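The sentinel trick can be sketched roughly as follows. This is an illustration in Python, not the actual LangC converter; only the ##!!## marker comes from the description above, and the function names and the `N=text` sample lines are assumptions.

```python
SENTINEL = "##!!##"


def add_sentinels(lines):
    """Append the end-of-line marker to every line before pasting."""
    return "\n".join(line + SENTINEL for line in lines)


def split_on_sentinels(blob):
    """Recover the individual lines from text whose newlines were eaten."""
    records = [piece.strip() for piece in blob.split(SENTINEL)]
    # The text ends with a sentinel, so the final piece is empty; drop it.
    return [record for record in records if record]
```

For example, if the translator flattens `add_sentinels(["0=Hello", "1=Quit game"])` into a single line, `split_on_sentinels` still recovers the two records, provided the markers themselves come back untouched.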

The Problems

Up to this point I had been testing with translations from English to Chinese. I had noticed that one or two lines out of the 138 strings from starter.fps had been mangled a little, with the = sign being moved. Looking at the data, it didn't seem like much of a problem and could easily be sorted out in the converter.

I then proceeded to test the translation with a language I can actually speak: French. It was then that I discovered how much of a problem there was going to be. The translator moved around not only the ID numbers, but also the equals signs and the end-of-line markers, in such a way that it would require an artificial intelligence fluent in both English and the target language to sort out the mess.
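A converter cannot repair this kind of damage, but it can at least detect it. The sketch below is a hypothetical check in Python, not something LangC does: it assumes records of the form `N=text` and rejects any record where the number or the equals sign is not where it should be, where the number is out of range, or where the number has already been seen.

```python
import re

# A well-formed record: digits, then '=', then the translated text.
RECORD = re.compile(r"(\d+)=(.*)", re.DOTALL)


def check_records(records, expected_count):
    """Split records into usable ones and ones needing human attention."""
    good, bad = {}, []
    for record in records:
        match = RECORD.fullmatch(record)
        if match is None:
            bad.append(record)  # ID or '=' was moved or translated
            continue
        number = int(match.group(1))
        if number >= expected_count or number in good:
            bad.append(record)  # unknown or duplicated ID
            continue
        good[number] = match.group(2)
    return good, bad
```

Anything that lands in the second list has to be fixed by hand, which is exactly the work the automatic translation was supposed to save.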

I played around with a few more ideas to try to get some reasonably decent output, but decided that, considering how bad automatic translations are and that they are of no use in a shipping product, enough effort had been spent, and I dropped the idea.

One idea that I haven't tried is to write a sneakily formatted HTML file, put it on a web server and use the web page translation tools. I suspect that will work, but of course it requires access to a web server, which somewhat limits its usefulness. I may try it at some point in the future, but not for now.
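One possible shape for such an HTML file is sketched below in Python. This is untested against any real translation service: it assumes the service leaves element attributes alone, so the numeric ID is hidden in an `id` attribute and only the paragraph text is exposed for translation. The function and class names are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser


def build_page(lines):
    """Write each (number, text) pair as its own paragraph."""
    body = "\n".join(f'<p id="s{n}">{escape(text)}</p>' for n, text in lines)
    return f"<html><body>\n{body}\n</body></html>"


class ParagraphReader(HTMLParser):
    """Collect the text of every <p id="sN"> element, keyed by N."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.strings = {}

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            ident = dict(attrs).get("id") or ""
            if ident.startswith("s") and ident[1:].isdigit():
                self.current = int(ident[1:])
                self.strings[self.current] = ""

    def handle_endtag(self, tag):
        if tag == "p":
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.strings[self.current] += data


def read_page(html_text):
    """Turn a (translated) page back into a {number: text} mapping."""
    reader = ParagraphReader()
    reader.feed(html_text)
    reader.close()
    return reader.strings
```

If the assumption holds, there is no ID, equals sign or end-of-line marker in the visible text for the translator to move around.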

The modified LangC

As I mentioned earlier, the additional code I've just described has been left in LangC.

Arg   Description
-g    Create a file for pasting into Google or Babelfish.
-G    Convert the result of a Google or Babelfish translation back to a usable translation file.

The -g option works in exactly the same way as the -r option, but produces a .google file with numeric IDs instead of the text IDs, and with the English text included.

The -G option works similarly to the -t option, but can only produce a translation file. You do not need to specify the -r option along with -G.

For example:

> langc -g starterfps_english.lang test
Compiling ...
test.lso - 133 string(s), 0 error(s), 0 warning(s)
> langc -e starterfps_english.lang -G convtest.txt conv
Compiling ...

convtest.txt - 2 string(s), 0 error(s), 0 warning(s)