Diacritized Output

I'm a relative beginner with just a semester of Java under my belt. Anyway, I've just written a program that requires output of strings containing characters with diacritics on them (i.e. "accent marks"). I can use either macrons or circumflexes ("caret tops"), but there has to be one or the other. The problem is that, although the program displays the output fine when compiled and run using SciTE, the diacritized characters are converted into what can best be described as cyber-gibberish when I run the program in DOS. I've tried saving the source code in Unicode and ASCII, but neither version works. Whenever I try to compile and run my program in a simple command prompt, one of three things happens depending on the encoding: 1) the compiler complains of "illegal characters," 2) the diacritized characters appear as cyber-gibberish, or 3) the diacritized characters are converted into regular characters, which hinders functionality.

[943 byte] By [ScriptorProgrammoruma] at [2007-11-26 18:03:58]
# 1

HackusCodusmusmusmus,

Java uses Unicode internally (UTF-16 for its chars and Strings), doesn't it, and you've tried saving your source code files (which I presume have the snazzy characters hardcoded in them) in a Unicode encoding... which all sounds good so far...

So, might you be using byte[] IO functions where you need the character equivalents? That'd explain the gibberish output, but not the "illegal characters" (which I guess you got when you saved the Unicode characters as 8-bit ASCII).

Other than that, try out your "Specto Petronum" on them.

keith.

corlettka at 2007-7-9 5:34:14 > top of Java-index,Java Essentials,Java Programming...
# 2

The problem is not in your program; it's in the "DOS prompt."

The default encoding of the operating system is used when you write text to files or the console in Java; this guarantees that the text will be readily usable to native applications such as SciTE. On Windows in western locales the default encoding is windows-1252.

http://en.wikipedia.org/wiki/Windows-1252

Unfortunately the DOS prompt in Windows uses an ancient character encoding or "code page", not the Windows default! Usually the DOS encoding is CP437 or CP850.

http://en.wikipedia.org/wiki/CP437

http://en.wikipedia.org/wiki/CP850
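To make the mismatch concrete, here is a minimal sketch of what probably happens to a single character. It assumes Java writes the text as windows-1252 and the console reads the bytes as CP850 (known to Java as "IBM850"); the class and method names are made up for illustration, and the sketch assumes your JDK ships both charsets.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; the names are illustrative, not from any library.
class CodePageMismatchDemo {

    // Encode text with one charset and decode the resulting bytes with another.
    // This mimics a program writing bytes in one encoding while the console
    // interprets them using a different code page.
    static String misread(String text, String writtenAs, String displayedAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(displayedAs));
    }

    public static void main(String[] args) {
        // a with circumflex, written as a Unicode escape so that the source
        // file's own encoding cannot interfere with the demo
        String original = "\u00E2";
        String shown = misread(original, "windows-1252", "IBM850");

        // Print code points instead of the characters themselves, because the
        // console running this demo may garble non-ASCII output too.
        System.out.printf("sent U+%04X, console shows U+%04X%n",
                (int) original.charAt(0), (int) shown.charAt(0));
    }
}
```

This prints "sent U+00E2, console shows U+00D4": the byte 0xE2 means a-circumflex in windows-1252 but capital O-circumflex in CP850, which is the sort of gibberish described in the question.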

To remedy the situation you can try to change the encoding of DOS using the command

    chcp 1252

If that doesn't work, there is a way to change the encoding Java uses from within your program, but that means that it will work correctly only in the DOS prompt.
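One way to do that, as a rough sketch: wrap System.out in a PrintStream that encodes text with the console's code page. The charset name "IBM850" (Java's name for CP850) is an assumption here; run chcp without arguments to see which code page your prompt actually uses. The class and helper names are invented for the example.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class; the names are illustrative only.
class ConsoleOutput {

    // Wrap a stream so that printed text is encoded with the named charset
    // instead of the platform default.
    static PrintStream withEncoding(OutputStream out, String charsetName) {
        try {
            return new PrintStream(out, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: the console uses code page 850; substitute your own.
        PrintStream console = withEncoding(System.out, "IBM850");

        // Vowels with circumflex, written as Unicode escapes.
        console.println("\u00E2\u00EA\u00EE\u00F4\u00FB");
    }
}
```

Note that this only helps for characters the code page contains. CP850 has the circumflex vowels but no vowels with macrons, so with this approach the circumflex option is the one to pick.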

> Whenever I try to compile and run my program in a simple command prompt

If you have saved the program source in a "Unicode" encoding, you have to tell the compiler; otherwise it will assume the default encoding (windows-1252). E.g. if you use UTF-8 you have to compile with

    javac -encoding utf-8 MyClass.java
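A different technique that sidesteps the source-file encoding altogether (not something suggested above, just an alternative) is to write the diacritized characters as Unicode escapes, so the source stays pure ASCII and compiles the same way whatever encoding the compiler assumes. A small sketch, with a made-up class name:

```java
// Hypothetical example class; the name is illustrative only.
class EscapedSource {

    // Lowercase vowels a, e, i, o, u with a circumflex, as Unicode escapes.
    static final String CIRCUMFLEX = "\u00E2\u00EA\u00EE\u00F4\u00FB";

    // The same five vowels with a macron, as Unicode escapes.
    static final String MACRON = "\u0101\u0113\u012B\u014D\u016B";

    public static void main(String[] args) {
        // Print the code points rather than the characters, since displaying
        // the characters correctly is a separate (console) problem.
        for (char c : (CIRCUMFLEX + MACRON).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

This only fixes the compile-time "illegal characters" complaint; what the console does with the characters at run time is still governed by its code page.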

jsalonena at 2007-7-9 5:34:14
# 3
jsalonen, thank you. keith.
corlettka at 2007-7-9 5:34:14