Adding tag support to HTMLEditorKit related questions.

I have been trying for a LONG time to add ruby support (used to write small characters above other characters; recommended by the W3C but not part of the HTML 4.01 DTD) to Swing. I have read more unhelpful API docs and obnoxiously tortuous code than I care to think about and have solved many, many problems, but I am absolutely stuck at a point I believe is about two stops away from finishing.

I need to figure out how to build the Content models for the RUBY, RB, and RT tags. Everything I have tried has thrown errors like mad (either class cast or stack overflow). If there were ANY description at all in the API I think I would be able to do it.

Ruby markup looks like this:

<ruby><rb>main</rb><rt>annotation above main text</rt></ruby>

I don't actually know SGML, but from what I am reading the ruby stuff would look like this:

<!ELEMENT RUBY - - (RB, RT)>
<!ELEMENT RB - - (#PCDATA)>
<!ELEMENT RT - - (#PCDATA)>

So you must have exactly one RB and one RT, in that order, each containing PCDATA only, and these are the only legal elements of the RUBY itself. All tags must be properly closed.

At present, I am registering these on the DTD like so:

dtd.defineElement("rt", DTDConstants.ANY, false, false, null, null, null, null);
dtd.defineElement("rb", DTDConstants.ANY, false, false, null, null, null, null);
dtd.defineElement("ruby", DTDConstants.ANY, false, false, null, null, null, null);

Obviously I will need to pass ContentModels in when I am finished. I am using a custom ParserCallback which spits out messages describing the operations performed upon it by the parser. Here is what I am putting in at the moment, and what I am getting from it on the console (courtesy of the custom ParserCallback):

<HTML>
<b>Some text</b>
<ruby><rb>1</rb><rt>2</rt></ruby>
</HTML>

Start: html
Start: head
End: head
Start: body
Start: p
Start: b
Text: Some text
End: b
End: p
Start: p
End: p
Start: ruby
Start: rb
Text: 1
End: rb
Start: rt
Text: 2
End: rt
End: ruby
End: body
End: html

So you can see that the tags are working right (when they are not registered, both start and end tags fire handleSimpleTag events on the ParserCallback). The problem is that because I am not defining the ContentModels, the parser is too generous. For example, you can throw in extra RB and RT tags or even unrelated junk. Before I move on to the next stage (turning the markup into a javax.swing.text.Element or whatever) I would like to solve this problem.
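A logging callback of the kind described above could be sketched roughly as follows. This is not the class from the test program: the names LoggingCallback and trace are made up for illustration, and the sketch runs against the stock ParserDelegator with its default DTD rather than a customized parser, so unregistered tags such as ruby would be reported through handleSimpleTag here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical name; records one line per parser event, in the style of the trace above.
class LoggingCallback extends HTMLEditorKit.ParserCallback {
    final List<String> events = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        events.add("Start: " + tag);
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        events.add("End: " + tag);
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        events.add("Simple: " + tag);
    }

    @Override
    public void handleText(char[] data, int pos) {
        events.add("Text: " + new String(data));
    }

    // Parses the markup with the stock parser and default DTD and returns the recorded events.
    static List<String> trace(String html) throws IOException {
        LoggingCallback callback = new LoggingCallback();
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return callback.events;
    }

    public static void main(String[] args) throws IOException {
        for (String event : trace("<html><b>Some text</b></html>")) {
            System.out.println(event);
        }
    }
}
```

Collecting the events in a list rather than printing them directly makes it easy to compare traces before and after a DTD change.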

The code which definitely did NOT work was like this:

javax.swing.text.html.parser.Element pcdata = dtd.getElement("#pcdata");
ContentModel rtModel = new ContentModel(pcdata);
ContentModel rbModel = new ContentModel(',', pcdata, rtModel);
ContentModel rubyModel = new ContentModel(0, rbModel, null);
// these then being passed in when calling DTD.defineElement()

I tried many variations on the type parameter, which is where I suspect the problem is, but none of them worked and I cannot figure out for the life of me what they are supposed to be. The code above, for example, throws this:

java.lang.ClassCastException: javax.swing.text.html.parser.ContentModel
    at javax.swing.text.html.parser.ContentModel.first(ContentModel.java:205)
    at javax.swing.text.html.parser.ContentModelState.first(ContentModelState.java:148)
    at javax.swing.text.html.parser.TagStack.first(TagStack.java:85)
    at javax.swing.text.html.parser.Parser.legalElementContext(Parser.java:612)
    at javax.swing.text.html.parser.Parser.legalTagContext(Parser.java:695)
    at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1910)
    at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1960)
    at javax.swing.text.html.parser.Parser.parse(Parser.java:2127)
    at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:105)
    at Test$InnerParser.parse(Test.java:101)
    at javax.swing.text.html.HTMLEditorKit.read(HTMLEditorKit.java:230)
    at javax.swing.JEditorPane.setText(JEditorPane.java:1314)
    at Test.<init>(Test.java:27)
    at Test.main(Test.java:19)

Anyway. I am thoroughly stuck. Any help would be much appreciated.

Drake Dun

ne'r do well

[4972 byte] By [Drake_Duna] at [2007-10-3 10:12:35]
# 1

Okay, the problem is definitely with the ContentModels.

I realize the way I posted the first time made the question a bit much to chew on. Let's narrow this down. How do I model this:

<!ELEMENT RUBY - - (RB, RT)>
<!ELEMENT RB - - (#PCDATA)>
<!ELEMENT RT - - (#PCDATA)>

...using ContentModel?

Drake

Drake_Duna at 2007-7-15 5:32:48 > top of Java-index,Desktop,Core GUI APIs...
# 2

Man, you weren't kidding about that API: badly designed, with totally useless docs! Here's my guess:

ContentModel rtModel = new ContentModel(pcdata);
ContentModel rbModel = new ContentModel(0, pcdata, rtModel);
ContentModel rubyModel = new ContentModel(',', rbModel);

uncle_alicea at 2007-7-15 5:32:48
# 3

I FINALLY got it. I had to track down an unfamiliar HTML tag (<dl>) which had a somewhat similar format, get the ContentModel generated for that, and write a routine to break it down bit by bit recursively before I could figure out how this idiot class is supposed to work. For anyone who is wondering, the answer is:

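// Element lookups assumed here; the original post did not show how these were obtained.
javax.swing.text.html.parser.Element pcdata = dtd.getElement("#pcdata");
javax.swing.text.html.parser.Element rbElement = dtd.getElement("rb");
javax.swing.text.html.parser.Element rtElement = dtd.getElement("rt");

// (#PCDATA)*: a type-0 node wrapping the element, itself wrapped in a '*' node.
ContentModel pcdataContainer = new ContentModel(pcdata);
ContentModel pcdataModel = new ContentModel('*', pcdataContainer, null);

// (RB, RT): a ',' node whose content is a chain of type-0 nodes linked through 'next'.
ContentModel rtContainer = new ContentModel(rtElement);
ContentModel rbContainer = new ContentModel(0, rbElement, rtContainer);
ContentModel rubyModel = new ContentModel(',', rbContainer, null);

dtd.defineElement("rt", DTDConstants.ANY, false, false, pcdataModel, null, null, null);
dtd.defineElement("rb", DTDConstants.ANY, false, false, pcdataModel, null, null, null);
dtd.defineElement("ruby", DTDConstants.ANY, false, false, rubyModel, null, null, null);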
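A recursive breakdown routine of the sort mentioned above might look something like the sketch below. It is not the routine actually used; ContentModelDump, describe, rubyModel and pcdataModel are hypothetical names, and the sketch assumes only the public type, content and next fields of ContentModel. It builds the models against an empty scratch DTD rather than the real HTML one.

```java
import javax.swing.text.html.parser.ContentModel;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Element;

// Hypothetical helper for inspecting how a ContentModel is put together.
class ContentModelDump {

    // Renders a ContentModel in DTD-like notation by walking its public fields.
    static String describe(ContentModel model) {
        switch (model.type) {
            case 0:
                // Leaf node: content is a single Element.
                return ((Element) model.content).getName();
            case '*':
            case '+':
            case '?':
                // Repetition: content is the nested model being repeated.
                return "(" + describe((ContentModel) model.content) + ")" + (char) model.type;
            case ',':
            case '|':
            case '&': {
                // Group: content is the head of a list chained through 'next'.
                StringBuilder text = new StringBuilder("(");
                for (ContentModel item = (ContentModel) model.content; item != null; item = item.next) {
                    if (text.length() > 1) {
                        text.append((char) model.type);
                    }
                    text.append(describe(item));
                }
                return text.append(')').toString();
            }
            default:
                throw new IllegalArgumentException("Unexpected ContentModel type: " + model.type);
        }
    }

    // Rebuilds the RUBY model from the post against the given DTD.
    static ContentModel rubyModel(DTD dtd) {
        ContentModel rtContainer = new ContentModel(dtd.getElement("rt"));
        ContentModel rbContainer = new ContentModel(0, dtd.getElement("rb"), rtContainer);
        return new ContentModel(',', rbContainer, null);
    }

    // Rebuilds the model shared by RB and RT in the post.
    static ContentModel pcdataModel(DTD dtd) {
        return new ContentModel('*', new ContentModel(dtd.getElement("#pcdata")), null);
    }

    public static void main(String[] args) {
        // Empty scratch DTD; the anonymous subclass reaches the protected constructor.
        DTD dtd = new DTD("ruby-demo") { };
        System.out.println(describe(rubyModel(dtd)));    // prints (rb,rt)
        System.out.println(describe(pcdataModel(dtd)));  // prints (#pcdata)*
    }
}
```

Running the same describe method on the content of an existing element such as dl is one way to see which node types and nesting the parser expects before building a new model by hand.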

Wheee. On to the next part, which will no doubt be just as bad or worse.

Drake

Drake_Duna at 2007-7-15 5:32:48
# 4
Oh, thanks for trying, BTW. Three dukes. :P

Drake

Drake_Duna at 2007-7-15 5:32:48
# 5

And then, of course, since the list of elements allowed to show up in a P element is not open, it was necessary to add RUBY:

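// Look up the existing P element and the RUBY element (assumed to be registered as shown earlier).
javax.swing.text.html.parser.Element p = dtd.getElement("p");
javax.swing.text.html.parser.Element rubyElement = dtd.getElement("ruby");
ContentModel pModel = p.content;
if (pModel.type == '*') {
    ContentModel pSubOne = (ContentModel) pModel.content;
    if (pSubOne.type == '|') {
        // Prepend RUBY to the linked list of alternatives inside the '|' group.
        ContentModel pSubTwo = (ContentModel) pSubOne.content;
        ContentModel rubyContainer = new ContentModel(0, rubyElement, pSubTwo);
        pSubOne.content = rubyContainer;
        pModel.content = pSubOne;
        // Note: defineElement() takes exclusions before inclusions.
        dtd.defineElement("p", p.type, p.oStart, p.oEnd, pModel, p.exclusions, p.inclusions, p.atts);
    }
}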

Drake

Drake_Duna at 2007-7-15 5:32:48