Requested: Good Website about Internationalization of HTML in JSP

I have a general question about presenting internationalized text in HTML in a Java environment.

My problem is this: my Windows installation uses Western European encoding as its default setting. I have to display Eastern European text (and also Greek letters) in HTML. The text is stored by a Swing application in a Hypersonic SQL database. This database encodes the special characters as Unicode escapes (\uXXXX).

Once the texts have been defined, a Java application reads them from the db and presents them in HTML via JSP and JSP tags.

The first problem I solved was that the Swing application (which also reads the text from the db) could show the Eastern European text, but the generated HTML could not. I found the solution by adding <%@ page contentType="text/html;charset=UTF-8" language="java" pageEncoding="UTF-8" %> at the beginning of the JSP.

The next problem was that JavaScript didn't work correctly. I solved it by adding the correct charset definition to the script tag.

The next problem I faced was that special characters entered via HTML forms were not stored correctly. I solved this by re-encoding the field value with a new String(_x.getBytes("ISO-8859-1"), "UTF-8") statement in the bean that keeps the entered value and stores it in the db.
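The re-encoding workaround can be tried outside a container. The following is only a minimal sketch: the class and method names are made up for illustration, and misdecode merely simulates what presumably happens when the server reads UTF-8 form bytes as ISO-8859-1. It uses StandardCharsets (Java 7 and later) instead of the charset name strings to avoid the checked exception.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the original application.
class ReencodeDemo {

    // Simulates a container that decodes UTF-8 form bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the post: recover the raw bytes, then decode them as UTF-8.
    static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Greek alpha, beta, gamma followed by a Polish word with special characters.
        String typed = "\u03b1\u03b2\u03b3 \u0142\u00f3d\u017a";
        String broken = misdecode(typed);
        System.out.println(typed.equals(broken));         // false
        System.out.println(typed.equals(repair(broken))); // true
    }
}
```

The round trip is lossless because ISO-8859-1 maps every byte value to exactly one character, so getBytes("ISO-8859-1") gives back the bytes the browser sent.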

As you can see, it is not easy to build an internationalized HTML application.

I have searched the web for a site that explains the problem well, but I only got anywhere by collecting information from different sites and by trial and error.

My questions are:

1.) Does anyone know of a web site with a good, compact and understandable description of what one has to consider to build a truly internationalized HTML application? Maybe I have still forgotten something.

2.) What would I have to consider when changing the db to, let's say, Postgres, MySQL and so on?

[1959 byte] By [_veilchena] at [2007-10-1 10:20:11]
# 1

I ran into the same stumbling blocks in the same way, by trial and error.

Form fields that arrive decoded as ISO-8859-1 should be fixable in a current version of J2EE 1.4 by calling request.setCharacterEncoding("UTF-8") before any request parameter is read.

(I haven't tried it.)
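To see why the request encoding matters, here is a small stand-alone sketch. It does not use the servlet API at all; it only assumes that the container decodes percent-encoded form values roughly the way java.net.URLDecoder does with a given charset. The class and method names are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it imitates form decoding, it is not container code.
class FormDecodingDemo {

    // Decodes one percent-encoded form value with the given charset name.
    static String decodeField(String rawValue, String charset) {
        try {
            return URLDecoder.decode(rawValue, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "%CE%B1" is the Greek letter alpha (\u03b1) as submitted from a UTF-8 page.
        String raw = "%CE%B1";
        System.out.println(decodeField(raw, "ISO-8859-1").length());    // 2
        System.out.println(decodeField(raw, "UTF-8").equals("\u03b1")); // true
    }
}
```

With ISO-8859-1 the two bytes become two wrong characters; with UTF-8 they become the single character that was typed, which is the effect setCharacterEncoding("UTF-8") is meant to have on request parameters.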

The other thing is that ISO-8859-1 treats \u0080 - \u009F as Unicode control characters, so they are discarded or shown as boxes. In practice, browsers treat ISO-8859-1 as Windows-1252 (even MacOS + Netscape), where these byte values are used for the em dash and similar characters. Those characters have Unicode code points above \u00FF, so converting them to ISO-8859-1 in J2EE will yield question marks '?'.
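The difference can be checked with the JDK's own charsets. This is just a sketch with made-up class and method names; it assumes the JDK in use ships the "windows-1252" charset, which standard JDKs normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing ISO-8859-1 with Windows-1252.
class Cp1252Demo {

    static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Decodes a single byte with the given charset.
    static String decode(byte b, Charset charset) {
        return new String(new byte[] { b }, charset);
    }

    // Encodes a string to ISO-8859-1 and decodes it again;
    // characters that ISO-8859-1 cannot represent come back as '?'.
    static String toLatin1AndBack(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte b = (byte) 0x97; // the em dash in Windows-1252
        System.out.println(Integer.toHexString(decode(b, StandardCharsets.ISO_8859_1).charAt(0))); // 97
        System.out.println(Integer.toHexString(decode(b, WINDOWS_1252).charAt(0)));                // 2014
        System.out.println(toLatin1AndBack("\u2014"));                                             // ?
    }
}
```

Byte 0x97 becomes the control character \u0097 under ISO-8859-1 but the em dash \u2014 under Windows-1252, and encoding \u2014 back to ISO-8859-1 produces the question mark described above.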

joop_eggena at 2007-7-10 2:47:09