Regarding getting specific font data from file

I am developing a font converter application in Java (Swing). The user uploads a .doc file, and if that file contains both Marathi and English text, I want to convert the Marathi text from its non-Unicode font to Unicode. My converter program is complete. The problem now is: how do I get from the file only the text that is in the Marathi font (or any other non-Unicode font) that I want to convert to Unicode?

Thanks in advance.

[367 byte] By [indictransa] at [2007-11-27 8:25:43]
# 1
Unicode is not a font. Do you mean charset instead of font?
robtafta at 2007-7-12 20:14:48 > top of Java-index,Java Essentials,Java Programming...
# 2
So after some very short wiki research, can I assume you are using Devanāgarī script? If so, can I assume the docs are using ISCII encoding? With a little Google help I found this: http://office.microsoft.com/en-us/word/HP030745551033.aspx
robtafta at 2007-7-12 20:14:48
# 3

I think I did not state my question clearly.

This is what I want to do:

1. Open a .doc file in my Java program (currently I am using text files).

2. Read that .doc file, which contains text in different fonts.

3. Get the text that is in the dvbttyogesh font and convert it to Unicode.

At this stage my program can convert a whole file that is in the dvbttyogesh font to Unicode.
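
The three steps above can be sketched as follows. This is only a minimal illustration, assuming some document parser (for .doc, possibly a library such as Apache POI) has already split the file into runs of text that each carry a font name. The `Run` class, the `convertRuns` method, the stand-in converter and the target font name "Mangal" are all hypothetical names chosen for the example; the real converter would be plugged in where the stand-in is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class FontRunConverter {

    /** One stretch of text in a single font, as a document parser might report it (hypothetical model). */
    static final class Run {
        final String fontName;
        final String text;

        Run(String fontName, String text) {
            this.fontName = fontName;
            this.text = text;
        }
    }

    /**
     * Returns a new list in which every run set in legacyFont has been passed
     * through the converter and re-tagged with unicodeFont. All other runs are
     * copied over unchanged, so the rest of the document stays intact.
     */
    static List<Run> convertRuns(List<Run> runs, String legacyFont,
                                 String unicodeFont, UnaryOperator<String> converter) {
        List<Run> result = new ArrayList<>();
        for (Run run : runs) {
            if (legacyFont.equalsIgnoreCase(run.fontName)) {
                result.add(new Run(unicodeFont, converter.apply(run.text)));
            } else {
                result.add(run);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("Times New Roman", "English text "),
                new Run("DVBTTYogesh", "legacy text"),
                new Run("Times New Roman", " more English"));

        // Stand-in for the real legacy-to-Unicode converter (assumption).
        UnaryOperator<String> standIn = s -> "[converted:" + s + "]";

        for (Run run : convertRuns(runs, "dvbttyogesh", "Mangal", standIn)) {
            System.out.println(run.fontName + " -> " + run.text);
        }
    }
}
```

The font name is compared case-insensitively because the name stored in a document may not match the spelling used in this thread.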

indictransa at 2007-7-12 20:14:48
# 4

A font is to Unicode as apples are to oranges.

Unicode assigns a number to a character.

A charset or encoding is what is used to represent the character as bytes.

A font is used to draw the character to the screen.

You are reading in bytes, and you want to assign Unicode values to those bytes. You keep using the word 'font' when the font really isn't related to converting the text to Unicode. What encoding are these docs in? UTF-8? UTF-16? ISCII?

If this were Chinese, you might have a Big5 charset, but you could have any number of fonts to draw those characters to the screen. You are missing an important step in this conversion.
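
To make the distinction concrete, here is a minimal Java sketch showing one character, its Unicode number, and the different bytes that two encodings produce for it. The class name and the `hex` helper are made up for the example; the character used is DEVANAGARI LETTER KA (U+0915). No font is involved anywhere in it.

```java
import java.nio.charset.StandardCharsets;

final class CodePointVsBytes {

    /** Formats bytes as space-separated, two-digit uppercase hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ka = "\u0915"; // DEVANAGARI LETTER KA

        // Unicode assigns the character a number (its code point).
        System.out.println(Integer.toHexString(ka.codePointAt(0)));     // 915

        // An encoding decides which bytes represent that number.
        System.out.println(hex(ka.getBytes(StandardCharsets.UTF_8)));    // E0 A4 95
        System.out.println(hex(ka.getBytes(StandardCharsets.UTF_16BE))); // 09 15

        // Going the other way: bytes only become characters once a charset is named.
        byte[] raw = {(byte) 0xE0, (byte) 0xA4, (byte) 0x95};
        String decoded = new String(raw, StandardCharsets.UTF_8);
        System.out.println(decoded.equals(ka));                          // true
    }
}
```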

robtafta at 2007-7-12 20:14:48
# 5
I am sorry. This is what I want to do: parse a doc/odt file, extract the text that has a particular font tag, convert that text using our converter, and then put the converted Unicode text back into the output file in place of the extracted text, so that the rest of the file remains intact.
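
For the odt case, the idea could look roughly like the sketch below. It rests on several assumptions: the `content.xml` has already been taken out of the .odt zip archive, the legacy font is applied through `text:span` elements whose style names a `style:font-name`, and the converter is a stand-in. The class and method names are hypothetical. Fonts set on whole paragraphs or inherited from parent styles are not handled, and a span with nested elements would be flattened by `setTextContent`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class OdtSpanConverter {
    static final String STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Converts the text of every span whose style selects legacyFont; everything else is left as parsed. */
    static String convert(String contentXml, String legacyFont, UnaryOperator<String> converter) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(contentXml)));

            // Step 1: collect the names of styles that use the legacy font.
            Set<String> legacyStyles = new HashSet<>();
            NodeList props = doc.getElementsByTagNameNS(STYLE_NS, "text-properties");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if (legacyFont.equalsIgnoreCase(prop.getAttributeNS(STYLE_NS, "font-name"))
                        && prop.getParentNode() instanceof Element) {
                    String name = ((Element) prop.getParentNode()).getAttributeNS(STYLE_NS, "name");
                    if (!name.isEmpty()) {
                        legacyStyles.add(name);
                    }
                }
            }

            // Step 2: replace the text of spans that refer to one of those styles.
            NodeList spans = doc.getElementsByTagNameNS(TEXT_NS, "span");
            for (int i = 0; i < spans.getLength(); i++) {
                Element span = (Element) spans.item(i);
                if (legacyStyles.contains(span.getAttributeNS(TEXT_NS, "style-name"))) {
                    span.setTextContent(converter.apply(span.getTextContent()));
                }
            }

            // Step 3: serialize the modified tree back to XML text.
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process content.xml", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<office:document-content"
                + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
                + " xmlns:style=\"" + STYLE_NS + "\" xmlns:text=\"" + TEXT_NS + "\">"
                + "<office:automatic-styles><style:style style:name=\"T1\" style:family=\"text\">"
                + "<style:text-properties style:font-name=\"DVBTTYogesh\"/></style:style>"
                + "</office:automatic-styles><office:body><office:text>"
                + "<text:p>English <text:span text:style-name=\"T1\">legacy</text:span> text</text:p>"
                + "</office:text></office:body></office:document-content>";
        // The lambda is a stand-in for the real converter (assumption).
        System.out.println(convert(xml, "dvbttyogesh", s -> "[" + s + "]"));
    }
}
```

The sketch leaves the span's style pointing at the legacy font. A real tool would also have to switch converted spans to a Unicode-capable font, otherwise the converted text would still be drawn with the old glyphs.
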
indictransa at 2007-7-12 20:14:48
# 6

I clearly understand what you are saying. Do you clearly understand what I am saying? You are just dismissing what I say as if I don't know what you are doing.

If you are talking about a Microsoft Word document, then it is probably encoded in ISCII, as stated here: http://office.microsoft.com/en-us/word/HP052584541033.aspx

The other link I posted is a conversion tool to do exactly what you are asking. What you are asking is like asking me to convert 'Times New Roman' to Unicode. They aren't the same type of thing, so there is no real conversion.

robtafta at 2007-7-12 20:14:48