Did the Java Writers Get a Little Happy with Memory Use in String class?

I mean that question half-jokingly.

Taking a look at the String class code, I saw the following (below).

Why is there an int for offset and an int for count?

String is immutable, why not just return value.length for the count?

And what is the offset for?

Thanks!

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence
{
    /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

PS: A second look kind of confirms it. In the constructor, count is simply set to value.length.

public String(char value[]) {
    int size = value.length;
    this.offset = 0;
    this.count = size;
    this.value = Arrays.copyOf(value, size);
}

[1904 byte] By [TuringPesta] at [2007-11-26 20:17:45]
# 1
String s1 = "abcdefghijkl";
String s2 = s1.substring(3,6);

s1 and s2 will share the same backing char array, I believe. s1 will use the whole thing, and s2 will use chars 3 through 5, that is, offset (inclusive) to offset + length (exclusive).
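To make the sharing concrete, here is a minimal sketch of the same layout. SharedSlice and its method names are made up for illustration and this is not the JDK source; it only mirrors the value/offset/count fields quoted in the question.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the String layout quoted in this thread; not the JDK source. */
public class SharedSlice {
    private final char[] value;  // backing storage, possibly shared with other slices
    private final int offset;    // first index of value that belongs to this slice
    private final int count;     // number of chars in this slice

    public SharedSlice(char[] chars) {
        // Defensive copy, like the public String(char[]) constructor quoted above.
        this.value = Arrays.copyOf(chars, chars.length);
        this.offset = 0;
        this.count = chars.length;
    }

    // Private constructor that shares the array instead of copying it.
    private SharedSlice(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    /** Returns a slice covering [begin, end) of this slice without copying any chars. */
    public SharedSlice substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedSlice(offset + begin, end - begin, value);
    }

    public int length() {
        return count;
    }

    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[offset + index];
    }

    /** True when both slices point at the very same backing array. */
    public boolean sharesStorageWith(SharedSlice other) {
        return value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

With s1 built from "abcdefghijkl", s1.substring(3, 6) yields a slice that reads as "def" while pointing at the same array as s1. This matches the source discussed in this thread; as far as I know, newer JDKs (from Java 7 update 6 on) copy the characters in substring and no longer keep offset and count fields.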
jverda at 2007-7-10 0:41:03 > top of Java-index,Java Essentials,Java Programming...
# 2
I'd take a look at the substring method if I were you.

Edit: Basically beaten to it by jverd.

Message was edited by: warnerja
warnerjaa at 2007-7-10 0:41:03
# 3

I believe you're right. substring eventually gets around to this:

// Package private constructor which shares value array for speed.
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

Am I silly for second-guessing this? How often are Strings "shared"?

I'd say I almost NEVER share strings. I'd much rather just have the 2 ints per String back, lol.

TuringPesta at 2007-7-10 0:41:03
# 4
> Am I silly for second guessing this? How often are
> Strings "shared".
> Id say I almost NEVER share strings.

You don't get the choice. The arrays are shared whenever you do substring.
jverda at 2007-7-10 0:41:04
# 5

> Id much rather

> just have the

> 2 ints per String back, lol.

That's 8 bytes.

If there were no sharing, an entire new array object (20 bytes Object overhead, I think) would have to be created, plus 2 bytes * the number of chars in the substring, plus the int for the array's length, plus the cycles to copy the characters.

jverda at 2007-7-10 0:41:04
# 6
> You don't get the choice. The arrays are shared
> whenever you do substring.

Yeah, that's my point. I think most Strings are unique. I'd be very surprised if even 2% of all String use was a result of a substring.
TuringPesta at 2007-7-10 0:41:04
# 7

> If there were no sharing, an entire new array object

> (20 bytes Object overhead, I think) would have to be

> created, plus 2 bytes * the number of chars in the

> substring, plus the int for the array's length, plus

> the cycles to copy the characters.

Interesting, I didn't know the overhead was 20 bytes.

Anyway thanks.

By the way, this thread was just curiosity. I'm not about to stop using String or anything, lol. I just happened to notice it when looking for something else.

TuringPesta at 2007-7-10 0:41:04
# 8

When I think of the code I have written recently, whenever I get a substring of some string, it's nearly always for some temporary purpose -- like comparing it to something else, or concatenating it with something else. I almost never assign it to a variable where it's going to be a long-lasting object.

And for that kind of usage, the non-copying implementation is ideal for saving time and space.

DrClapa at 2007-7-10 0:41:04
# 9

> > You don't get the choice. The arrays are shared

> > whenever you do substring.

>

> Yea, thats my point. I think most Strings are unique.

>

> Id be very surprised if even 2% of all String use was

> a result of a substring.

The wrapper classes use it for parsing.

File uses it for getName and getParent.

String uses it for trim.

Class uses it for resolving names.

Regex uses it.

etc. ...

Now, I don't know what percentage of usage all that amounts to, but it IS heavily used.

So your desire to get the 2 ints back would cost enough bytes every time substring is used that it might end up costing far more than the 8 bytes you save.
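A rough way to put numbers on that trade-off, as a sketch only. The 16-byte array overhead is an assumed figure (the real value is VM dependent), the class and method names are invented for this example, and copy time is ignored entirely.

```java
public class SubstringBreakEven {
    // Assumed figures for illustration; real values depend on the VM.
    static final int BYTES_SAVED_PER_STRING = 8;   // the two int fields, offset and count
    static final int ASSUMED_ARRAY_OVERHEAD = 16;  // assumed per-array overhead in bytes
    static final int BYTES_PER_CHAR = 2;

    /** Extra bytes a copying substring would allocate for a result of the given length. */
    static int copyCost(int substringLength) {
        return ASSUMED_ARRAY_OVERHEAD + BYTES_PER_CHAR * substringLength;
    }

    /**
     * Fraction of all Strings that would have to come from substring before
     * copying costs more memory than dropping the two ints saves.
     */
    static double breakEvenFraction(int substringLength) {
        return (double) BYTES_SAVED_PER_STRING / copyCost(substringLength);
    }

    public static void main(String[] args) {
        for (int len : new int[] {4, 10, 100}) {
            System.out.printf("length %d: copy cost %d bytes, break-even at %.1f%% of Strings%n",
                    len, copyCost(len), 100 * breakEvenFraction(len));
        }
    }
}
```

Under these assumptions a 10-char substring costs 36 bytes to copy, so which side wins on memory depends on how many Strings come from substring and how long they are. The cycles spent copying are an extra cost on top that this sketch does not count.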

jverda at 2007-7-10 0:41:04
# 10

> > If there were no sharing, an entire new array object
> > (20 bytes Object overhead, I think) would have to be
> > created, plus 2 bytes * the number of chars in the
> > substring, plus the int for the array's length, plus
> > the cycles to copy the characters.
>
> Interesting, I didnt know the overhead was 20 bytes.

I'm not sure if it is. That number's in my head for some reason though.

jverda at 2007-7-10 0:41:04
# 11

>> Interesting, I didnt know the overhead was 20 bytes.

> I'm not sure if it is. That number's in my head for some reason though.

It's VM dependent. Generally the figure I'd work with would be 16 bytes - 8 for general object overhead, 4 for size, 4 for worst-case word alignment.
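Those figures can be turned into a small estimate, sketched below. The 8-byte header, 4-byte length field and 8-byte allocation granularity are assumptions taken from the rule of thumb above, not measured values, and CharArrayFootprint is a made-up name.

```java
public class CharArrayFootprint {
    // Assumed figures for illustration; actual layout is VM dependent.
    static final int OBJECT_HEADER = 8;  // assumed general object overhead
    static final int LENGTH_FIELD = 4;   // the array's length
    static final int ALIGNMENT = 8;      // assumed allocation granularity in bytes

    /** Estimated heap bytes for a char[] holding the given number of chars. */
    static long estimate(int chars) {
        long raw = OBJECT_HEADER + LENGTH_FIELD + 2L * chars;
        // Round up to the next multiple of ALIGNMENT.
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 3, 10}) {
            System.out.println(n + " chars: about " + estimate(n) + " bytes");
        }
    }
}
```

With these assumptions an empty char[] comes out at 16 bytes, which is where the 16-byte working figure lands.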

YAT_Archivista at 2007-7-10 0:41:04