GapContent resetting offset to 0?

Hi,

I'm trying to parse lines in a Document using Segments and the getText(int, int, Segment) method. In my parser I read the offset, count, and array fields of the returned Segment.

I've noticed that in some cases the segment offset is reset back to 0 rather than pointing at the offset into the document where my segment was taken from. And sometimes the "array" variable contains the whole document, where other times it only contains the text for the line.

I looked at the GapContent.getChars() source and I can see that it resets the offset to 0 if the text spans the gap. What does that mean, and how can I stop it from resetting the offset?

I was relying on the offset to be stable and the array to be stable too. But if those two keep changing I can't reliably find the correct offset when finding words in my text.

I suppose I could pass in the offset to my method and the string text instead of the Segment, but I was hoping to be as efficient as possible.

Can anyone explain why this is happening?

Here is the getChars() method from GapContent in JDK1.4.2_11

/**
 * Retrieves a portion of the content. If the desired content spans
 * the gap, we copy the content. If the desired content does not
 * span the gap, the actual store is returned to avoid the copy since
 * it is contiguous.
 *
 * @param where the starting position >= 0, where + len <= length()
 * @param len the number of characters to retrieve >= 0
 * @param chars the Segment object to return the characters in
 * @exception BadLocationException if the specified position is invalid
 * @see AbstractDocument.Content#getChars
 */
public void getChars(int where, int len, Segment chars) throws BadLocationException {
    int end = where + len;
    if (where < 0 || end < 0) {
        throw new BadLocationException("Invalid location", -1);
    }
    if (end > length() || where > length()) {
        throw new BadLocationException("Invalid location", length() + 1);
    }
    int g0 = getGapStart();
    int g1 = getGapEnd();
    char[] array = (char[]) getArray();
    if ((where + len) <= g0) {
        // below gap
        chars.array = array;
        chars.offset = where;
    } else if (where >= g0) {
        // above gap
        chars.array = array;
        chars.offset = g1 + where - g0;
    } else {
        // spans the gap
        int before = g0 - where;
        if (chars.isPartialReturn()) {
            // partial return allowed, return amount before the gap
            chars.array = array;
            chars.offset = where;
            chars.count = before;
            return;
        }
        // partial return not allowed, must copy
        chars.array = new char[len];
        chars.offset = 0;
        System.arraycopy(array, where, chars.array, 0, before);
        System.arraycopy(array, g1, chars.array, before, len - before);
    }
    chars.count = len;
}

Thanks,

- Tim

[4435 byte] By [tmullea] at [2007-10-3 8:26:41]
# 1

If the range of characters you're retrieving doesn't overlap the gap, getChars() hands the Segment a reference to its own backing array to avoid making a copy of the data. The purpose of the offset and count fields is to tell you which part of the array you can use safely. If you look at anything outside that range, you'll either get an ArrayIndexOutOfBoundsException or read garbage data because you're looking inside the gap.

If the portion you're interested in happens to be before the gap, the offset value will be the same as the where value that you specified. But if the portion is after the gap, offset will be equal to where plus the length of the gap. If the portion spans the gap, getChars() has to create a new array, and that array is only as big as it needs to be to hold the designated portion. That's why offset is set to zero.
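The three cases can be looked at in isolation. The helper below is hypothetical (GapOffsetDemo and segmentOffset are not part of Swing); it only mirrors the branch logic of the getChars() source quoted in the question, with the gap boundaries passed in as plain ints and the partial-return path left out.

```java
public class GapOffsetDemo {

    /**
     * Hypothetical helper: returns the value getChars() would store in
     * Segment.offset for a request of len chars starting at where,
     * given a gap occupying [gapStart, gapEnd) in the backing array.
     */
    public static int segmentOffset(int where, int len, int gapStart, int gapEnd) {
        if (where + len <= gapStart) {
            return where;                      // below the gap: same as the document offset
        }
        if (where >= gapStart) {
            return gapEnd + where - gapStart;  // above the gap: shifted by the gap length
        }
        return 0;                              // spans the gap: a fresh copy starting at index 0
    }

    public static void main(String[] args) {
        // Assume the gap occupies array indices 10..19 (gap length 10).
        System.out.println(segmentOffset(2, 5, 10, 20));   // prints 2
        System.out.println(segmentOffset(12, 3, 10, 20));  // prints 22
        System.out.println(segmentOffset(8, 5, 10, 20));   // prints 0
    }
}
```

All three results are valid indices into whatever array the Segment holds; none of them is meant to be read as a position in the Document.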

In other words, while count will always be the same as len (unless you told the Segment you would accept partial returns, which you shouldn't do in this case), you always have to translate offset to bring it into alignment with the Document or Element you're working with.
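One way to do that translation is sketched below, assuming a reasonably recent Java and a PlainDocument for the demo. The class and method names (SegmentOffsets, wordStarts) are made up for illustration. The idea is to remember the where value passed to getText() and treat seg.offset purely as the starting index into seg.array, so a document position is always where + (i - seg.offset).

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;
import javax.swing.text.Segment;

public class SegmentOffsets {

    /** Returns the document offsets at which words start within [where, where + len). */
    public static List<Integer> wordStarts(Document doc, int where, int len)
            throws BadLocationException {
        Segment seg = new Segment();
        doc.getText(where, len, seg);
        List<Integer> starts = new ArrayList<>();
        boolean inWord = false;
        // Only indices seg.offset .. seg.offset + seg.count - 1 are safe to read.
        for (int i = seg.offset; i < seg.offset + seg.count; i++) {
            boolean letter = Character.isLetterOrDigit(seg.array[i]);
            if (letter && !inWord) {
                // Translate the array index back into a document offset.
                starts.add(where + (i - seg.offset));
            }
            inWord = letter;
        }
        return starts;
    }

    public static void main(String[] args) throws BadLocationException {
        PlainDocument doc = new PlainDocument();
        doc.insertString(0, "alpha beta gamma", null);
        doc.insertString(6, "new ", null); // an insert in the middle of the text
        // The document now reads "alpha new beta gamma".
        System.out.println(wordStarts(doc, 0, doc.getLength())); // prints [0, 6, 10, 15]
    }
}
```

Because the result never depends on the raw value of seg.offset or on which array the Segment happens to hold, it stays the same whether getChars() returned the backing store or a copy.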

Message was edited by: uncle_alice (who else?)

Words in bold were changed to reduce ambiguity.

uncle_alicea at 2007-7-15 3:33:06