How to handle macrons?

Hi

We need to display several characters with macrons, e.g. in the word 'shogun' there is a line above the 'o'. To get the 'o' with a line above it, you can specify: "o"+"\u0304"

We encounter these sorts of 'combining macrons' during data imports and were thinking of searching for this pattern and replacing it with the single precomposed Unicode character \u014D.
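To make the idea concrete, here is a minimal sketch of the search-and-replace described above, handling only the 'o' case. The class and method names are made up for illustration.

```java
class MacronReplaceDemo {
    // Replaces the two-character sequence 'o' + U+0304 (combining macron)
    // with the single precomposed character U+014D.
    static String replaceOMacron(String s) {
        return s.replace("o\u0304", "\u014D");
    }

    public static void main(String[] args) {
        String imported = "sho\u0304gun";          // 'o' followed by the combining macron
        String fixed = replaceOMacron(imported);

        System.out.println(imported.length());     // 7: the macron is a separate char
        System.out.println(fixed.length());        // 6: one char for the o with macron
        System.out.println(imported.equals(fixed)); // false: same look, different strings
    }
}
```

Every other base letter and combining mark would need its own replace call, which is where the approach gets unwieldy.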

But many other 'combining macrons' may exist - I guess we'd have to do a big search and replace on all the common ones? The trouble is that for X possible characters and Y macrons, there'll be X * Y possible combinations to look for...

Anyone with recommendations or similar experience?

Thanks

[680 byte] By [TimeLord2004a] at [2007-10-1 20:08:24]
# 1
Hi,

Can't you create a stream which converts the character while you read from it?

/Kaj
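A rough sketch of what such a converting read could look like, assuming the import arrives as text through a java.io.Reader. The class name, method name and the converter parameter are all hypothetical; the conversion itself is passed in, so it can be whatever mapping you settle on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;

class ConvertingReaderDemo {
    // Reads everything from 'in' line by line and applies 'converter' to each
    // line as it is read. Line terminators are rewritten as '\n'.
    static String readConverted(Reader in, UnaryOperator<String> converter) {
        StringBuilder out = new StringBuilder();
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(converter.apply(line)).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Reader source = new StringReader("sho\u0304gun\n");
        // Example conversion: 'o' + combining macron -> precomposed U+014D.
        String result = readConverted(source, s -> s.replace("o\u0304", "\u014D"));
        System.out.println(result.equals("sh\u014Dgun\n"));
    }
}
```

Working a line at a time avoids the case where a base letter and its combining mark end up in different read buffers.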
kajbja at 2007-7-13 1:29:19 > top of Java-index,Desktop,I18N...
# 2

Firefox does that: for example if I put ō in here then Firefox displays it as ō which I see as "o" with a line over it. Maybe your browser does the same thing, if you're using something else.

Now, I have no idea how Firefox does it. But Firefox is open-source so presumably their code for it is accessible. Could take a while to dig it out of there, and I don't think it's written in Java, but if all else fails you could try that.

DrClapa at 2007-7-13 1:29:19
# 3

> Can't you create a stream which converts the
> character while you read from it?

Yes, this is one approach. My question was more along the lines of: is there a good approach for handling x * y possible combinations, and has anyone had similar experiences?

thanks

TimeLord2004a at 2007-7-13 1:29:19
# 4

> Firefox does that: for example if I put ō
> in here then Firefox displays it as ō which I
> see as "o" with a line over it. Maybe your browser
> does the same thing, if you're using something else.

The browser does not come into it - we are writing a Java application and happen to be storing the data in Firebird! BTW, I use the Mozilla suite ;-)

thanks all the same

TimeLord2004a at 2007-7-13 1:29:19
# 5
Try Normalization Form C (NFC) on your string. http://www.unicode.org/reports/tr15/
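For illustration, a minimal sketch of NFC in Java, assuming Java 6 or later where java.text.Normalizer is available. The class and method names are made up. NFC composes every base letter plus combining mark pair that has a precomposed form, so there is no need to enumerate the X * Y combinations by hand.

```java
import java.text.Normalizer;

class NfcDemo {
    // Composes base letters and combining marks into precomposed
    // characters wherever Unicode defines one (Normalization Form C).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // 'o' + U+0304 becomes U+014D
        System.out.println(toNfc("sho\u0304gun").equals("sh\u014Dgun")); // true
        // The same call handles other letters, e.g. 'a' + U+0304 becomes U+0101
        System.out.println(toNfc("a\u0304").equals("\u0101"));           // true
        // Already-composed text is left unchanged
        System.out.println(toNfc("sh\u014Dgun").equals("sh\u014Dgun"));  // true
    }
}
```

Sequences with no precomposed equivalent (for example 'x' followed by U+0304) stay as base letter plus combining mark, so the display side still has to cope with those.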
nguyenq87a at 2007-7-13 1:29:20
# 6
> The browser does not come into it

Firebird displays the characters correctly. You want to know how to display the characters correctly. My suggestion was that you should look to see how Firebird does it. I didn't mean you should USE Firebird or anything like that.
DrClapa at 2007-7-13 1:29:20
# 7

> > The browser does not come into it
>
> Firebird displays the characters correctly. You want
> to know how to display the characters correctly. My
> suggestion was that you should look to see how
> Firebird does it. I didn't mean you should USE
> Firebird or anything like that.

Errr... you mentioned Firefox the first time and now you say Firebird... I assume you still mean Firefox (the browser) and not Firebird (the database)?

TimeLord2004a at 2007-7-13 1:29:20
# 8
Um, yes, you do assume correctly. Sorry for the confusion.
DrClapa at 2007-7-13 1:29:20
# 9
FYI, Firefox was indeed once called Firebird, with a phoenix logo: http://www.mozilla.org/products/firefox/releases/0.7.html

However, for legal reasons, they had to abandon the name Firebird and use Firefox.
horiniusa at 2007-7-13 1:29:20