internationalisation - [SOLVED] | The BAsic CONverter Forum

Post by vovchik on Mar 9, 2011 11:47:21 GMT 1

Dear Barry and Peter,

I followed Peter's instructions last night and got everything to work, after making a translation, by invoking "hello" for testing as follows:

export LANG=ru_RU; ./hello | gxview.

This gave me the output in a utf8 screen, so I could view the terminal results properly (my terms are set up for English). It did show:

Привет мир

as it should. Barry's instructions look good, too. I will try them next. I don't know whether the LC_MESSAGES var is needed since LANG=ru_RU seems to work.

With kind regards,
vovchik

Post by l18l on Mar 9, 2011 23:28:10 GMT 1

Mar 9, 2011 11:22:38 GMT 1 @admn said:

I guess for Chinese or Hebrew or Cyrillic we probably need to define the charset as UTF-16.

I don't think so: in Quickset Wary, the multilingual edition of Puppy, it is UTF-8.

But what about implementing eval_gettext and ngettext in BaCon?

I'm hoping to have more time tomorrow.

Good night

Last Edit: Mar 9, 2011 23:31:01 GMT 1 by l18l

Post by Pjot on Mar 10, 2011 8:53:30 GMT 1

Hi l18l,

Hm, I don't have 'eval_gettext' on my system... As for 'ngettext', this is to determine plural forms of pieces of text; it doesn't really relate to other character sets...?

Regards
Peter

Post by l18l on Mar 10, 2011 12:09:34 GMT 1

Hi Peter,
now "back in village" I will
- try to review this thread (1st idea: rename the title of the thread to "getting started internationalisation" and mark it as SOLVED again?)
- try to answer LIFO

ngettext
My experience with this is limited to shell scripting:
www.gnu.org/software/hello/manual/gettext/sh.html#sh
As this forum is about the BaCon converter, C is relevant:
www.gnu.org/software/hello/manual/gettext/C.html#C
Cited from www.gnu.org/software/gettext/manual/gettext.html#Sources:

# For C, C++, and GCC-source: gettext, dgettext:2, dcgettext:2, ngettext:1,2, dngettext:2,3, dcngettext:2,3, gettext_noop, and pgettext:1c,2, dpgettext:2c,3, dcpgettext:2c,3, npgettext:1c,2,3, dnpgettext:2c,3,4, dcnpgettext:2c,3,4.
# For Objective C: Like for C, and also NSLocalizedString, _, NSLocalizedStaticString, __.
# For Shell scripts: gettext, ngettext:1,2, eval_gettext, eval_ngettext:1,2.

Example without ngettext:
message: dependent package(s) missing
The same using ngettext:
n="`echo $dependencies | wc -w`" # sh syntax
message: There are 2 dependent packages missing
Note that in some languages plural forms are more complex than in English (vovchik may confirm)
well documented here
Peter, it was you who made me put my nose into this.
You have presented your way of making a po file with msginit.
Now I have in proxy-setup.po the line
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
and vovchik must have different content in this line:
Plural-Forms: nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2;
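
Editor's sketch: a Plural-Forms expression is ordinary C-style arithmetic that maps a count n to the index of the msgstr[] entry to use, so it can be tried out in sh arithmetic. The function names below are made up for illustration. The first two functions evaluate the two rules quoted in this post. The third evaluates the modulo-based three-form rule that, as far as I know, the GNU gettext manual lists for Russian, while the simpler three-form rule quoted above is the one it lists for Czech and Slovak; please check the manual before relying on either.

```shell
#!/bin/sh
# Sketch only: evaluate Plural-Forms expressions with POSIX shell arithmetic.
# All function names are hypothetical helpers, not part of gettext or BaCon.

# "nplurals=2; plural=(n != 1);" -> index 0 for exactly one item, else 1
plural_two_forms() {
    n=$1
    echo $(( $n != 1 ))
}

# "nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2;" as quoted in the post
plural_three_forms() {
    n=$1
    echo $(( ($n == 1) ? 0 : (($n >= 2 && $n <= 4) ? 1 : 2) ))
}

# Modulo-based three-form rule (assumed here to be the Russian one)
plural_modulo_three_forms() {
    n=$1
    echo $(( ($n % 10 == 1 && $n % 100 != 11) ? 0 : (($n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20)) ? 1 : 2) ))
}

# Print the chosen index for a few counts so the rules can be compared
for count in 1 2 5 11 21 22; do
    echo "n=$count two=$(plural_two_forms "$count") three=$(plural_three_forms "$count") modulo=$(plural_modulo_three_forms "$count")"
done
```

The two three-form rules only start to differ for larger counts such as 21 or 22, which is why a translation tested with small numbers can look fine under either header.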
eval_gettext try

# sudo gettext.sh
GNU gettext shell script function library version 0.17
Usage: . gettext.sh
#

On my system I get the following:


# gettext
gettext: missing arguments
# eval_gettext
bash: eval_gettext: command not found
# gettext.sh
GNU gettext shell script function library version 0.17
Usage: . gettext.sh
# . gettext.sh
# eval_gettext
#
# ngettext
ngettext: missing arguments
#

We can learn from each other and most when making mistakes
Best regards

Last Edit: Mar 10, 2011 12:12:14 GMT 1 by l18l

Post by Pjot on Mar 10, 2011 13:17:46 GMT 1

Hi l18l,

When using ngettext programmatically, wouldn't that complicate things a lot more? At least the INTL function must be adapted (or add a new func), where for each string you need to present two translatable strings...

We can learn from each other and most when making mistakes

Fully agreed! I make mistakes myself all the time

Regards
Peter

Post by l18l on Mar 10, 2011 13:45:08 GMT 1

Mar 9, 2011 3:41:18 GMT 1 barryk said:

...feedback is welcome if that page needs improving in any way.

So here is my feedback.

locale, for example de for German, fr for French, etc

A Linux user sets their system up with a locale such as en_AU,

might confuse, as a locale is something more than just a language or dialect, see
www.gnu.org/s/libc/manual/html_node/Effects-of-Locale.html#Effects-of-Locale
The other things that are defined in a locale (currency, date format ...) are not used in the demo, just the language.

The idea of using English dialects is genius!
So English-only people can get the idea and/or, even better, produce their own version (maybe just for fun, but puppy is fun and shall stay fun for some of us, shouldn't it?)
Exercise for barryk: translate just 1 msgstr in en_AU. Result?
Conclusion: ..........................

Best regards and thanks

Post by l18l on Mar 10, 2011 14:30:57 GMT 1

Hi Peter,

At least the INTL function must be adapted (or add a new func), where for each string you need to present two translatable strings...

This could be
bacon: define N_INTL or INTL(n, string$)
converter: converting to C syntax
C: using functions of gettext

The 2 or more plural forms are made by the human translator.
I just know it from experience in sh using the gettext package.
I don't see any reason why it should not work in C.
Most of the gettext documentation seems to be made for C, but it works in sh and hopefully sooner or later in BaCon, too.
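
Editor's sketch of the sh experience mentioned above: the ngettext command takes a singular msgid, a plural msgid and a count, and returns the form that fits the count. The text domain name i18n-plural-demo and the function name missing_message are made up for illustration, and no catalog is assumed to exist for that domain, so the untranslated English strings come back. The fallback branch for systems without the ngettext command is my own addition and only covers the English rule.

```shell
#!/bin/sh
# Sketch only: plural-aware message selection from sh with the ngettext command.
# "i18n-plural-demo" is a hypothetical text domain; without an installed
# catalog for it, ngettext returns the untranslated English strings.
TEXTDOMAIN=i18n-plural-demo
export TEXTDOMAIN

missing_message() {
    count=$1
    if command -v ngettext >/dev/null 2>&1; then
        # ngettext MSGID MSGID-PLURAL COUNT: the catalog's Plural-Forms rule
        # (or the English rule when there is no catalog) selects the form
        format=$(ngettext 'There is %d dependent package missing' \
                          'There are %d dependent packages missing' "$count")
    elif [ "$count" -eq 1 ]; then
        # Fallback when gettext is not installed: English rule only
        format='There is %d dependent package missing'
    else
        format='There are %d dependent packages missing'
    fi
    # The selected string is used as a printf format so %d receives the count
    printf "$format\n" "$count"
}

missing_message 1
missing_message 2
```

The point for a BaCon counterpart is the same as in this sketch: the program passes both source strings plus the count, and the translator supplies as many msgstr[] forms as the language needs.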

Regards

edited later:
I have no idea of C, but on my system I have found everything C-relevant in /usr/share/gettext/intl/
try
# sudo find /usr -name ngettext
on your system, please.

Last Edit: Mar 10, 2011 17:56:40 GMT 1 by l18l

Post by barryk on Mar 10, 2011 16:43:11 GMT 1

I hit a problem when testing L18L's de translation.

If the de.po file has charset=UTF-8 but the LANG environment variable is set to de_DE (not UTF-8), the translation is not quite correct. However, if LANG is set to de_DE.utf8, the translation is correct.

If the de.po file has charset=ISO-8859-1 but the LANG environment variable is set to de_DE.utf8, the translation is not quite correct. However, if LANG is set to de_DE, the translation is correct.

I examined other .mo files and it seems the standard is to always have charset=UTF-8.

Therefore, I had to put a fix into my proxy-setup program:

REM .po/.mo files have 'charset=UTF-8', so LANG must also have .utf8 (or .UTF-8)...
REM however puppy users often set en_US not en_US.utf8, so append if missing...
lang$=EXEC$("echo -n $LANG | sed -e 's%\\.utf8%%' -e 's%\\.UTF-8%%' -e 's%$%.utf8%'")
SETENVIRON "LANG", lang$

...this will change LANG from de_DE to de_DE.utf8

It seems that this fix is going to be needed at the beginning of all BaCon programs!
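
Editor's sketch: the same normalisation can be done in plain sh with parameter expansion instead of sed. The function name normalise_lang is made up for illustration. Like the fix above, it assumes LANG carries no @modifier part and is not empty; those cases would need extra handling.

```shell
#!/bin/sh
# Sketch only: make sure a locale name ends in exactly one ".utf8" suffix.
# normalise_lang is a hypothetical helper name.
normalise_lang() {
    lang=$1
    lang=${lang%.utf8}      # drop an existing ".utf8" suffix, if any
    lang=${lang%.UTF-8}     # drop an existing ".UTF-8" suffix, if any
    printf '%s.utf8\n' "$lang"
}

# Show the result for the three spellings discussed in the thread
for value in de_DE de_DE.utf8 en_US.UTF-8; do
    echo "$value -> $(normalise_lang "$value")"
done
```

In a wrapper script this would be used as LANG=$(normalise_lang "$LANG") followed by export LANG before starting the program.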

Last Edit: Mar 10, 2011 16:45:05 GMT 1 by barryk

Post by l18l on Mar 10, 2011 19:43:15 GMT 1

LANG without utf8

Yes, developers and translators must use UTF-8. General users should live fine without it.
Another solution, tested with LANG=de: see image. Note that without 'export' it does not work.

Last Edit: Mar 10, 2011 19:54:16 GMT 1 by l18l

Post by Pjot on Mar 10, 2011 20:30:30 GMT 1

@barry: your code again is inspired by shell scripting

...we can also make a genuine BaCon version:


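lang$ = GETENVIRON$("LANG")
REM Append ".utf8" only when LANG does not already end in ".utf8" or ".UTF-8"
IF RIGHT$(lang$, 5) != ".utf8" AND RIGHT$(lang$, 6) != ".UTF-8" THEN SETENVIRON "LANG", CONCAT$(lang$, ".utf8")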

l18l: for performance reasons I would like to keep the names of the internationalization functions to 4 letters, so let's add NNTL for 'ngettext', if that is OK with you?

Thanks
Peter

Last Edit: Mar 10, 2011 20:40:17 GMT 1 by Pjot

Post by l18l on Mar 10, 2011 21:33:53 GMT 1

NNTL is okay for me, thanks, Peter.
Hey, that's not troubleshooting but features.
Will we continue the discussion in a new thread (maybe 'i18n_plurals'??)

Regards

Post by Pjot on Mar 10, 2011 22:01:10 GMT 1

OK. It is available in BaCon 1.0 build 22 beta.

Regards
Peter

Post by barryk on Mar 11, 2011 9:50:56 GMT 1

L18L,
Thank you for the info, I found that putting this at the beginning of my program fixes everything:

OPTION INTERNATIONAL TRUE
REM .po/.mo files have 'charset=UTF-8', so either set UTF-8 on in LANG variable, or do this...
SETENVIRON "OUTPUT_CHARSET", "UTF-8"

So, I have updated my internationalization HOWTO:

bkhome.org/bacon/international.htm

Post by l18l on Mar 11, 2011 14:50:36 GMT 1

Hi Barry, Peter, vovchik and hello world,

knowledge about internationalisation is growing slowly but growth continues.
We can really learn more from each other

I have done the exercise en, en_AU myself including a bit of Russian
made some changes in po/mo and running in urxvt:

# cd /root/bacon/bk_test1
# LANGUAGE=ru ./test1
?????? ???
some more text
# 
# LANGUAGE=en ./test1
Hi Guys
some more text but note, this is text from 'en.po'
# 
# LANGUAGE=en_AU ./test1
Hi Aussies
some more text but note, this is text from 'en.po'
#
# LANGUAGE=de ./test1
Hi Guys
yadda yadda äöüßÖÜÄ
#

That was the result.
My locale (# locale -a) is just "C de_DE de_DE.utf8 de-DE.utf8 en_US POSIX"
Note, there is no en_AU and no ru.
Conclusions:

No need to translate every msgstr in a dialect LANGUAGE, as gettext will search further (1st en_AU, then en, then source) for each msgstr.

That is what I had expected,
but what was new for me, and for all who have not RTFM of GNU gettext
(Locale-Environment-Variables,
the LANGUAGE variable),
was:

No need to use a locale if just another LANGUAGE is needed.

Russian: As already mentioned, I don't have a ru_* locale set at the moment, and I think urxvt would display it correctly if I had set one (gxview is not available here). My ru.po contains just msgstr "Привет мир".

I am hoping this will not confuse but give a better understanding and maybe simplify things.
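
Editor's sketch of the search order from the first conclusion above (1st en_AU, then en, then source). The function name lookup_order is made up, and this is a simplification: as far as I know, real gettext also tries variants with and without codeset and modifier, and ignores LANGUAGE altogether when the locale is "C".

```shell
#!/bin/sh
# Sketch only: list the catalogs tried for a LANGUAGE value, in order.
# lookup_order is a hypothetical helper; the body runs in a subshell so the
# changed IFS does not leak into the caller.
lookup_order() (
    IFS=:
    for entry in $1; do          # LANGUAGE is a colon-separated priority list
        echo "$entry"
        case $entry in
            *_*) echo "${entry%%_*}" ;;   # en_AU falls back to plain en
        esac
    done
    echo "(msgid)"               # last resort: the untranslated source string
)

lookup_order en_AU
```

Because the search is done per msgstr, a dialect catalog such as en_AU only needs the strings that differ from en.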

Last Edit: Mar 11, 2011 14:54:21 GMT 1 by l18l

Post by l18l on Mar 11, 2011 15:06:26 GMT 1

Mar 10, 2011 22:01:10 GMT 1 Pjot said:

OK. It is available in BaCon 1.0 build 22 beta. Regards Peter

Then I will try to build a small demo.

Thanks, Peter
