Faster split than String.split() and StringTokenizer?
First I improved the performance of split by replacing the String.split() call with a custom method using StringTokenizer:
final StringTokenizer st = new StringTokenizer(string, separator, true);
String token = null;
String lastToken = separator; // if first token is separator
while (st.hasMoreTokens()) {
    token = st.nextToken();
    if (token.equals(separator)) {
        if (lastToken.equals(separator)) { // no value between 2 separators?
            result.add(emptyStrings ? "" : null);
        }
    } else {
        result.add(token);
    }
    lastToken = token;
} // next token
But this is still not very fast (as it is one of the "hot spots" in my profiling sessions). I wonder if it can go still faster to split strings with ";" as the delimiter?
If anything can beat the tokenizer, then probably indexOf and substring.
> If anything can beat the tokenizer, then probably
> indexOf and substring.
Or a StringBuilder, filled while iterating over the String.
Yup, for simple splitting without escaping of separators, indexOf is more than twice as fast:
static private List<String> fastSplit(final String text, char separator, final boolean emptyStrings) {
    final List<String> result = new ArrayList<String>();
    if (text != null && text.length() > 0) {
        int index1 = 0;
        int index2 = text.indexOf(separator);
        while (index2 >= 0) {
            String token = text.substring(index1, index2);
            result.add(token);
            index1 = index2 + 1;
            index2 = text.indexOf(separator, index1);
        }
        if (index1 < text.length() - 1) {
            result.add(text.substring(index1));
        }
    } // else: input unavailable
    return result;
}
Faster? ;-)
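A note on the fastSplit above: the emptyStrings parameter is never read, and the final check index1 < text.length() - 1 skips a trailing token that is exactly one character long (so "a;b" yields only "a"). Below is a minimal sketch of a variant that handles both. It assumes that every separator delimits a field and that an empty field should become "" or null depending on emptyStrings; that may not be exactly what the original poster wanted. The class name SplitSketch and the helper field() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    // Splits text at every occurrence of separator using indexOf/substring.
    // Assumption: an empty field is reported as "" when emptyStrings is true,
    // otherwise as null.
    static List<String> split(String text, char separator, boolean emptyStrings) {
        List<String> result = new ArrayList<String>();
        if (text == null || text.length() == 0) {
            return result; // input unavailable
        }
        int start = 0;
        int end = text.indexOf(separator);
        while (end >= 0) {
            result.add(field(text, start, end, emptyStrings));
            start = end + 1;
            end = text.indexOf(separator, start);
        }
        // Always emit the remainder, even if it is one character long or empty.
        result.add(field(text, start, text.length(), emptyStrings));
        return result;
    }

    // Hypothetical helper: maps an empty range to "" or null.
    private static String field(String text, int start, int end, boolean emptyStrings) {
        if (start == end) {
            return emptyStrings ? "" : null;
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(split("a;b", ';', true));
        System.out.println(split("a;;b;", ';', true));
    }
}
```

If trailing empty fields are unwanted, the last add can be guarded with start < text.length() instead.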
It was rather obvious - regex is complicated, and Tokenizer is just as quick as indexOf; at best it's almost the same, plus one additional method call, and a one-iteration loop.
I think the fastest yet would be to abandon Strings and use char arrays, a for loop and System.arraycopy
Sometimes I wonder Martin why you use Java at all? Why not assembler? I am serious.
> I think the fastest yet would be too abandon Strings
> and use char arrays,
> a for loop and System.arraycopy
I'd have to look again to make sure, but I thought String is already using sub-arrays for substrings?
> > I think the fastest yet would be too abandon
> Strings
> > and use char arrays,
> > a for loop and System.arraycopy
>
> I'd have to look again to make sure, but I thought
> String is already using sub-arrays for substrings?
What I am thinking is to return a list (or array) of char arrays. Don't want
to waste time allocating whole new String objects that aren't needed
you know.
I dunno. All Martin's questions are of two varieties it seems:
- how big is x in memory? and how can I squeeze this down
- how can I hack up things to squeeze more performance out of
standard API features?
I mean these are both fine topics but to me they suggest that Java is not really
the way to go here...
> But this is still not very fast (as it is one of the
> "hot spots" in my profiling sessions). I wonder if it
> can go still faster to split strings with ";" as the
> delimiter?
Speeding up the code is only one approach.
Another approach is to not call the code in the first place. Look at the design and determine how to do the work without calling that piece of code, perhaps deferring it until needed.
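One way to read "deferring it until needed" is to keep the raw line around and split it only when a field is actually requested. The sketch below is an illustration of that idea only; the class LazyRecord, its methods and the splitCount counter are hypothetical and not taken from anyone's code in this thread. It is not thread-safe as written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: stores the raw line and splits it on first field access.
final class LazyRecord {
    static int splitCount = 0;   // only here to make the effect visible

    private final String line;
    private List<String> fields; // stays null until a field is requested

    LazyRecord(String line) {
        this.line = line;
    }

    // Returning the raw text never triggers a split.
    String raw() {
        return line;
    }

    // Splits at most once per record, then serves fields from the stored list.
    String field(int index) {
        if (fields == null) {
            fields = splitOnSemicolon(line);
            splitCount++;
        }
        return fields.get(index);
    }

    private static List<String> splitOnSemicolon(String text) {
        List<String> result = new ArrayList<String>();
        int start = 0;
        int end = text.indexOf(';');
        while (end >= 0) {
            result.add(text.substring(start, end));
            start = end + 1;
            end = text.indexOf(';', start);
        }
        result.add(text.substring(start));
        return result;
    }

    public static void main(String[] args) {
        LazyRecord record = new LazyRecord("id;name;amount");
        System.out.println(record.raw());      // no split has happened yet
        System.out.println(record.field(1));   // first access splits the line
        System.out.println(record.field(2));   // served from the stored list
        System.out.println("splits: " + splitCount);
    }
}
```

Whether this helps depends entirely on how many records are never looked at field by field; if every record is fully read, nothing is saved.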
I wrote my own, years ago, before regex was part of the SDK. I wrote it mostly to return an empty string when two delimiters were found consecutively. At the time it was about 30% faster than StringTokenizer (I made a few other assumptions as well). If you want to check it out then search the forum for "SimpleTokenizer" to find the download link.
I should mention that my largest test case was on a string of 200 characters that yielded 10 tokens. I did 200,000 iterations. Total time was half a second. So I don't see a big savings here. As suggested above looking at the design may be a better approach.
String.split() has to recompile the regex and create new Pattern and Matcher objects each time it's called. If you do that stuff ahead of time, I think you'll notice that the split() approach speeds up considerably. You would be using Pattern's split() method in that case, not String's. I'm also assuming that the regex is well written, but the regexes people use with split() tend to be simple, which makes it difficult to screw them up too badly. :D
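As a rough sketch of the "do that stuff ahead of time" suggestion: compile the Pattern once, keep it in a static field, and call Pattern.split() on it. The class name PrecompiledSplit is made up for illustration, and the ";" delimiter is taken from the original question. Newer JDKs also special-case simple single-character delimiters inside String.split() itself, so the size of the gain may vary by version; measuring is the only way to know.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class PrecompiledSplit {
    // Compiled once and reused; Pattern instances are immutable and can be
    // shared between threads.
    private static final Pattern SEMICOLON = Pattern.compile(";");

    static String[] split(String text) {
        // A negative limit keeps trailing empty strings;
        // the one-argument split(text) would discard them.
        return SEMICOLON.split(text, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a;;b;")));
    }
}
```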
> I dunno. All Martin's questions are of two varieties
> it seems:
>
> - how big is x in memory? and how can I squeeze this
> down
>
> - how can I hack up things to squeeze more
> performance out of
> standard API features?
>
> I mean this are both fine topics but to me suggest
> that Java is not really
> the way to go here...
"All" my questions? Hello? In my last 3200 posts I guess, the last 2 questions were of this type, the (latest) rest has nothing to do with it. It's that fast that people get squeezed into a tiny little drawer ... :-(
> "All" my questions? Hello? In my last 3200 posts I
> guess, the last 2 questions were of this type, the
> (latest) rest has nothing to do with it. It's that
> fast that people get squeezed into a tiny little
> drawer ... :-(
You say you posted 3200 questions? :) I do recall some performance questions, but mostly related to BigInt or BigDecimal. I also remember a lot about 1.5 backwards incompatibility. But of course, I never remember much.
For example
http://forum.java.sun.com/thread.jspa?threadID=618476
http://forum.java.sun.com/thread.jspa?threadID=623619&start=0&tstart=0
Basically you never seem to like what Java does or how it works etc. So I do wonder why you bother at all.
> Basically you never seem to like what Java does or
> how it works etc. So I do wonder why you bother at
> all.
Well, there's a lot of stuff I dislike, too. Still I wouldn't want to touch C++ again if I can avoid it.
> > Basically you never seem to like what Java does or
> > how it works etc. So I do wonder why you bother at
> > all.
>
> Well, there's a lot of stuff I dislike, too. Still I
> wouldn't want to touch C++ again if I can avoid it.
I am not seeing your long list of concerns about how common API's are too slow, how much memory is taken up by X and how common parts of Java are borked.
If these indeed were your concerns (and I am pretty sure they are not) then I would advise the same to you. Java is not for you.
> Java is not for you.
Never claimed it is. :)
To use an analogy: if someone consistently complained that the handle of their screwdriver was shattering when they used it to drive in nails, I would tell them to consider using a hammer instead.
@cotton.m: you seem to be somebody who just argues against other people. You posted 2 links above as "examples" and I wonder why you didn't post there. You just browsed my posts (3200 posts ... there you can definitely find some 'usable' example for your claims) to find something. You also seem to be somebody who just denies the problems in Java. I love Java because it is so easy and powerful compared to all the other languages I have used so far (Basic, Pascal, Oberon, Modula-2, Lisp, Ada, C, C++). But that doesn't mean that everything is shiny golden perfect. I also don't see _any_ post of mine where I say that Java is bad. I just wrote some posts about known issues of Java (as every other language also has) - whether it is a bug, bad design (in my eyes and those of some others), or simply an opinion of mine that most others don't agree with. This is called freedom of speech and can be answered with arguments against it if they are valid.
This thread is just a thread about possible performance improvements. Some people (like you) seem to be ignorant of the requirements other people have. They say "use another language" (like you) or other stupid things that don't help here. Just for your information: my requirement is to make a Java application faster (if possible) without sacrificing the environment (i.e. it should still be pure Java). It's not about "I want to split text as fast as possible" - this is a Java forum here.
It's the same for the double/BigDecimal issues I posted about. Many people just argue in the technical clean room way and don't think about the real requirements of an application or that an existing application is there to be fixed/improved.
> my requirement is to make a Java
> application faster (if possible) without sacrificing
> the environment (i.e. it should still be pure Java).
> It's not about "I want to split text as fast as
> possible" - this is a Java forum here.
But are you really doing it in the correct way? Profiling an application is good, and that should be done, but what jschell posts in reply #9 is more important.
"Speeding up the code is only one approach.
Another approach is to not call the code in the first place. Look at the design and determine how to do the work without calling that piece of code, perhaps deferring it until needed."
My professor at the university did always say:
"The fastest code is the one which never executes"
Kaj
kajbja at 2007-7-21 10:35:29 >

Out of interest I created a test harness to compare the performance of your 'fast split' with a Regex split.
import java.util.*;
import java.util.regex.*;
import java.text.*;
public class Fred904
{
    static private List<String> fastSplit(final String text, char separator, final boolean emptyStrings)
    {
        final List<String> result = new ArrayList<String>();
        if (text != null && text.length() > 0)
        {
            int index1 = 0;
            int index2 = text.indexOf(separator);
            while (index2 >= 0)
            {
                String token = text.substring(index1, index2);
                result.add(token);
                index1 = index2 + 1;
                index2 = text.indexOf(separator, index1);
            }
            if (index1 < text.length() - 1)
            {
                result.add(text.substring(index1));
            }
        } // else: input unavailable
        return result;
    }
    public static void main(String[] args)
    {
        try
        {
            DecimalFormat formatter = new DecimalFormat("0.00");
            int count = 1000000;
            String text = "The quick brown fox jumps over the lazy dog";
            {
                long start = System.currentTimeMillis();
                for (int index = 0; index < count; index++)
                {
                    List<String> split = fastSplit(text, ' ', false);
                }
                double delta = (System.currentTimeMillis() - start) * 1000.0 / count;
                System.out.println("Fast split delta = " + formatter.format(delta) + " uS");
            }
            {
                Pattern p = Pattern.compile(" ");
                long start = System.currentTimeMillis();
                for (int index = 0; index < count; index++)
                {
                    String[] split = p.split(text);
                }
                double delta = (System.currentTimeMillis() - start) * 1000.0 / count;
                System.out.println("Regex split delta = " + formatter.format(delta) + " uS");
            }
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
On my FC5 using 1.6.0 beta2 I get
Fast split delta = 1.44 uS
Regex split delta = 6.22 uS
which indicates that your 'fast split' is about 4 times as fast as using Regex.
BUT, would I use your 'fast split'? Not unless profiling showed it to be critical!
I have worked on some performance critical systems serving thousands of concurrent users and have never found this sort of feature critical. If you are using a database then it is almost certain that the limiting factor will be the database. If you are using EJB then object lookup using your EJBHome will be much slower than this. If you are using RMI then serialization will be much slower than this.
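A caveat on the harness above: it times a single pass with currentTimeMillis() and throws each result away, so JIT warm-up and dead-code elimination can skew the numbers. The sketch below is one hedged way to reduce that, using System.nanoTime(), a warm-up pass and a result sink. The names SplitTiming, Splitter, sink and nanosPerCall are invented for illustration, and the indexOf splitter here always keeps the trailing token. A dedicated harness such as JMH would likely be more trustworthy than any hand-rolled loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SplitTiming {
    // Hypothetical callback: splits the text and reports how many tokens it got.
    interface Splitter {
        int countTokens(String text);
    }

    private static final Pattern SPACE = Pattern.compile(" ");

    // Accumulates results so the calls cannot be dropped as dead code.
    static long sink;

    static List<String> indexOfSplit(String text, char separator) {
        List<String> result = new ArrayList<String>();
        int start = 0;
        int end = text.indexOf(separator);
        while (end >= 0) {
            result.add(text.substring(start, end));
            start = end + 1;
            end = text.indexOf(separator, start);
        }
        result.add(text.substring(start));
        return result;
    }

    // Runs the splitter count times and returns the average nanoseconds per call.
    static double nanosPerCall(Splitter splitter, String text, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            sink += splitter.countTokens(text);
        }
        return (System.nanoTime() - start) / (double) count;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        int count = 1000000;
        Splitter fast = t -> indexOfSplit(t, ' ').size();
        Splitter regex = t -> SPACE.split(t).length;

        // Warm-up pass so both variants are compiled before they are measured.
        nanosPerCall(fast, text, count);
        nanosPerCall(regex, text, count);

        System.out.println("indexOf split: " + nanosPerCall(fast, text, count) + " ns/call");
        System.out.println("regex split:   " + nanosPerCall(regex, text, count) + " ns/call");
        System.out.println("(sink = " + sink + ")");
    }
}
```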
> It's the same for the double/BigDecimal issues I
> posted about. Many people just argue in the technical
> clean room way and don't think about the real
> requirements of an application or that an existing
> application is there to be fixed/improved.
Speaking of which, I didn't notice any responses to that post after you dismissed the description of what ""+d gets translated into (it's correct, by the way). At some point in there, I explained how ""+d also relied on undocumented rounding within Sun internal classes. As you said that this application is used for financial services, I very strongly recommend that you revisit that thread, and give some more consideration to exactly what your program is doing.
> > my requirement is to make a Java
> > application faster (if possible) without
> sacrificing
> > the environment (i.e. it should still be pure
> Java).
> > It's not about "I want to split text as fast as
> > possible" - this is a Java forum here.
>
> But are you really doing it in the correct way?
> Profiling an application is good, and that should be
> done, but what jschell posts in reply #9 is more
> important.
>
> "Speeding up the code is only one approach.
>
> Another approach is to not call the code in the first
> place. Look at the design and determine how to do the
> work without calling that piece of code, perhaps
> deferring it until needed."
>
> My professor at the university did always say:
> "The fastest code is the one which never executes"
>
> Kaj
This is soooooo helpful! :-(
> BUT, would I use your 'fast split' ? Not unless
> profiling showed it to be critcal!
>
> I have worked on some performance critical systems
> serving thousands of concurrent users and have never
> found this sort of feature critical. If you are using
> a database then it is almost certain that the
> limiting factor will be the database. If you are
> using EJB then object lookup using your EJBHome will
> be much slower than this. If you are using RMI then
> serialization will be much slower than this.
Why do people not just only post if they have a constructive answer? I post a question about how I can split a string faster and I get an answer that I first should make other things faster. Can't you imagine that there are requirements that need to have a faster split? I profiled my app and the split method is one of the "hot spots" performance wise. (=> of course, the profiler showed other hot spots that I already "fixed" but for a multithreaded app that is performance critical you also need to optimize code fragments that "just" gain some seconds - through iterations of thousands and millions of calculations this could even lead to minutes or hours)
> > BUT, would I use your 'fast split' ? Not unless
> > profiling showed it to be critcal!
> >
> > I have worked on some performance critical systems
> > serving thousands of concurrent users and have
> never
> > found this sort of feature critical. If you are
> using
> > a database then it is almost certain that the
> > limiting factor will be the database. If you are
> > using EJB then object lookup using your EJBHome
> will
> > be much slower than this. If you are using RMI
> then
> > serialization will be much slower than this.
>
>
> Why do people not just only post if they have a
> constructive answer?
Constructive answers are not just the ones you expect (or want) Martin. This is a point that I really think you should know by now.
> I post a question about how I
> can split a string faster and I get an answer that I
> first should make other things faster.
It's a good, constructive answer.
> Can't you
> imagine that there are requirements that need to have
> a faster split?
No. I can however well imagine that you misunderstand your requirements.
> I profiled my app and the split
> method is one of the "hot spots" performance wise.
> (=> of course, the profiler showed other hot spots
> that I already "fixed" but for a multithreaded app
> that is performance critical you also need to
> optimize code fragments that "just" gain some seconds
> - through iterations of thousands and millions of
> calculations this could even lead to minutes or hours)
If this is really so overly critical, and based on your history it isn't because you don't know what end is up, then write some native methods and call them. That will be faster.
PS I love how when I first pointed out that Java is not really for you in this thread you got pissy and complained that I was typecasting you but when I posted links to similarly minded threads of yours that was being argumentative.
> Why do people not just only post if they have a
> constructive answer?
I think that my post is constructive. What can you do when you have optimized so that all methods are as fast as they can be? The only thing you can do after that is to reduce the calls to the methods, and change algorithms and/or design of the application. You have to move from the micro level to the macro level.
Kaj
kajbja at 2007-7-21 10:35:29 >

@Op. You can try to create lookup tables for common data.
Kaj
kajbja at 2007-7-21 10:35:29 >
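A minimal sketch of what a lookup table for common data could look like, assuming the same input strings recur often enough for caching to pay off (an assumption, nothing in the thread confirms it). The class SplitCache is hypothetical, the cache is unbounded, and null input is not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SplitCache {
    // Maps an input line to its already-split fields; safe for concurrent use.
    private static final ConcurrentMap<String, List<String>> CACHE =
            new ConcurrentHashMap<String, List<String>>();

    // Returns the cached fields, splitting only the first time a line is seen.
    static List<String> split(String text) {
        return CACHE.computeIfAbsent(text, SplitCache::splitOnSemicolon);
    }

    private static List<String> splitOnSemicolon(String text) {
        List<String> result = new ArrayList<String>();
        int start = 0;
        int end = text.indexOf(';');
        while (end >= 0) {
            result.add(text.substring(start, end));
            start = end + 1;
            end = text.indexOf(';', start);
        }
        result.add(text.substring(start));
        // Shared between callers, so hand out a read-only view.
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        List<String> first = split("EUR;USD;CHF");
        List<String> second = split("EUR;USD;CHF");
        System.out.println(first);
        System.out.println("same instance: " + (first == second));
    }
}
```

If the inputs are mostly unique, the map lookup and the retained memory would make this slower, not faster.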

At any rate Martin I come back to my first point. If micro-optimizations in your code are truly where it's at for you then Java is not a suitable choice for you. THAT is a constructive comment. Even though you will continue to bemoan otherwise. As you always do.
@cotton: I ignore you now because you always repeat yourself with your non-constructive answers and just point out that I am "always" so or so ... instead of just contributing to this thread. But I guess we're now in a dead end situation. You just have different opinions on how to "convince" someone not to use Java instead of simply making suggestions on how to possibly make a routine faster. Seems you never wrote any performance critical application in Java. Oh yes, of course: you don't use Java for performance critical applications ... I forgot.
@Martin
> @cotton: I ignore you now because you always repeat
> yourself with your non-constructive answers
In your mind only Martin
> and just
> point out that I am "always " so or so ... instead of
> just contributing to this thread.
My contribution to this thread is to point out that many of your questions are in the same vein of "how can I get micro-optimizations and/or Java is broken because it doesn't work the way I think it should".
>But I guess we're
> now in a dead end situation. You just have different
> opinions how to "convince" someone to not use Java
> instead of simply make suggestions how to possible
> make a routine faster.
The other advice given, for example to call it less, is good. But you don't want to consider even for a moment that there could be any problem with the design of your application.
>Seems you never wrote any
> performance critical application in Java.
You are an idiot. Flat out idiot. You seem to think performance begins and ends with your micro-optimizations. That is just beyond stupid.
> Oh yes, of
> course: you don't use Java for performance critical
> applications ... I forgot.
I never said that and you know it.
I am not wasting any more time on you. You are an utter fool a liar and a whiny suckwad.
@Martin this thread has been happily dead for a week, why did you bring it back?
mlka at 2007-7-21 10:35:34 >

Folks, are you done playing? If Martin says "that part needs to be faster" and we say "look at other parts first": he should cut us some slack because in 99% of the posts here it's the correct answer, and we can't guess he used a profiler.
Now that he told us, his question is valid IMO, and got lots of replies. Plus "best not to run it at all", too, which is just as valid and worth a consideration.
As for performance-critical code in Java: at least I never did. All performance fixes I ever came across included DB caching and removing String concatenations. The crude stuff. If you really want to optimize at instruction level (what performance optimization usually boils down to), then using assembler via JNI would indeed give you greater control.
> @Martin this thread has been happily dead for a week,
> why did you bring it back?
There might be a bug in Java. *rolleyes*
> Folks, are you done playing? If Martin says "that
> part needs to be faster" and we say "look at other
> parts first": he should cut us some slack because in
> 99% of the posts here it's the correct answer, and we
> can't guess he used a profiler.
>
> Now that he told us, his question is valid IMO, and
> got lots of replies. Plus "best not to run it at
> all", too, which is just as valid and worth a
> consideration.
>
So you didn't read reply 5 either?
Let me repeat what I said there.
"I think the fastest yet would be too abandon Strings and use char arrays, a for loop and System.arraycopy"
This is VERY applicable. There are all sorts of new String objects being created. Get rid of that overhead and use arraycopy directly. This will get rid of one object allocation and multiple statements to execute in each run.
This will be the fastest that one can do before moving the code to C++ via JNI.
And again it was suggested by me in reply 5.
import java.util.*;
public class Test{
    public static void main(String args[]){
        Test t = new Test();
        List l = t.fastSplit("The speedy road-runner runs away from the slow coyote",' ');
        for(int i=0;i<l.size();i++){
            String s = new String(((char[])l.get(i)));
            System.out.println("'"+s+"'");
        }
        l = t.fastSplit("The speedyroad-runner runsaway from theslow coyote ",' ');
        for(int i=0;i<l.size();i++){
            String s = new String(((char[])l.get(i)));
            System.out.println("'"+s+"'");
        }
    }
    public List fastSplit(String toSplit, char delim){
        List l = new ArrayList();
        char[] chars = toSplit.toCharArray();
        int begin = 0;
        int end = 0 ;
        boolean inToken = false;
        for(int i=0;i<chars.length;i++){
            if(!inToken){
                if(chars[i]!=delim){
                    inToken = true;
                    begin = i;
                    continue;
                }
            }else{
                if(chars[i]==delim){
                    end = i;
                    int len = end-begin;
                    if(len>0){
                        char[] temp = new char[len];
                        System.arraycopy(chars,begin,temp,0,len);
                        l.add(temp);
                    }
                    inToken = false;
                }
            }
        }
        if(inToken){
            end = chars.length;
            int len = end-begin;
            if(len>0){
                char[] temp = new char[len];
                System.arraycopy(chars,begin,temp,0,len);
                l.add(temp);
            }
        }
        return l;
    }
}
> So you didn't read reply 5 either?
>
> Let me repeat what I said there.
>
> "I think the fastest yet would be too abandon Strings
> and use char arrays, a for loop and
> System.arraycopy"
Yes. Did I say it's incorrect? Let me quote myself again, too:
>> Now that he told us, his question is valid IMO, and
>> got lots of replies. Plus "best not to run it at
>> all", too, which is just as valid and worth a
>> consideration.
I don't feel like this discussion is centered around specific Java-related suggestions. Especially not since Martin argued your point about moving to another language.
> "I think the fastest yet would be too abandon Strings and use char arrays, a for loop and System.arraycopy"
The only way to know for sure is to run a test. Following are the results for a test string containing 10 tokens, each 20 characters long.
(ie. "00000000000000000000,11111111111111111111, .....")
StringTokenizer: 656
SimpleTokenizer (see reply 10): 344
String.split(): 1687
compiled Pattern( see Note below): 1891
fastSplit (see reply 35): 1094
Note: I know next to nothing about regex's. I based my code on the following example:
String text = "a b c \"one two three\" d e";
Pattern p = Pattern.compile("\".*?\"|\\S+");
Matcher m = p.matcher(text);
while (m.find())
{
    String nextToken = m.group();
    System.out.println(nextToken);
}
Since our example is only using a "," as the delimiter, I changed the Pattern compiled to be:
Pattern p = Pattern.compile(".,");
I know the expression isn't correct, but it matches a smaller string, so you would think it would be faster than matching the whole string. The end result is that precompiling the expression seems to be slower than using String.split(), so I assume I am still doing something wrong. The Pattern was declared as a static variable when compiled, but assigned to a local variable before being used.
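One possible explanation for the odd result: a find() loop over ".," does not do the same work as a split. It matches one arbitrary character followed by a comma, so it returns fragments such as "0," and never sees the last token. The sketch below contrasts that with a token-matching pattern ("[^,]+") and with Pattern.split(","). The class name CommaTokens and the short test string are made up for illustration, and no timing claim is implied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommaTokens {
    // One or more characters that are not a comma: matches a whole token.
    static final Pattern TOKEN = Pattern.compile("[^,]+");
    // The delimiter itself, for use with Pattern.split().
    static final Pattern COMMA = Pattern.compile(",");

    // Collects every match of the pattern in the text, in order.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<String>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "0000,1111,2222";
        System.out.println(findAll(Pattern.compile(".,"), text)); // fragments, not tokens
        System.out.println(findAll(TOKEN, text));                 // whole tokens
        System.out.println(Arrays.toString(COMMA.split(text)));   // whole tokens via split
    }
}
```

Note that "[^,]+" skips empty fields, whereas COMMA.split() keeps empty fields in the middle of the input.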
> Seems you never wrote any
> performance critical application in Java. Oh yes, of
> course: you don't use Java for performance critical
> applications ... I forgot.
I have in java. I have in C++ as well.
And I have used profilers in both.
The standard idiom .....
1. Find the hotspot(s)
2. Find how many times it is called.
3. Find how long it takes.
Reducing the number of calls, when multiple calls exist will produce the greatest speed up.
You dismissed that idea with no qualification.
And if I had an application where the primary and overriding requirement was speed I would write it in C++. Then profile it. And for selected hotspots it is at least possible I would rewrite them in assembler. I have certainly done that in the past.
> > Seems you never wrote any
> > performance critical application in Java. Oh yes,
> of
> > course: you don't use Java for performance
> critical
> > applications ... I forgot.
>
> I have in java. I have in C++ as well.
>
> And I have used profilers in both.
>
> The standard idiom .....
> 1. Find the hotspot(s)
> 2. Find how many times it is called.
> 3. Find how long it takes.
>
> Reducing the number of calls, when multiple calls
> exist will produce the greatest speed up.
>
> You dismissed that idea with no qualification.
>
> And if I had an application where the primary and
> overriding requirement was speed I would write it in
> C++. Then profile it. And for selected hotspots it
> at least possible I would re-write in assembler. I
> have certainly done that in the past.
That proves that you don't have much experience in optimizing Java:
> Reducing the number of calls, when multiple calls
> exist will produce the greatest speed up.
because this is still plain wrong (It _can_ be right for specific situations, but it doesn't guarantee the greatest speed up). It always depends: sometimes routines that are just called a few times (but take long) are better to be optimized. Sometimes, methods that are called several 10 thousand times can't be further optimized or wouldn't gain much (compared to other hot spots). There is no "do this and it gets faster" - that's the long and detailed work of profiling big applications in detail.
For the last 15 months I concentrated on optimizing our existing Java applications that basically do heavy number crunching, imports, exports (text, csv, xml, soap, sql), and also massive database access. As a result of our optimizations we have already rewritten 3 server applications that were written in VisualBasic and C. The result was a speedup of factors 10 - 50 (!). During reimplementation we had a phase where we had parts in Java and parts still as DLLs called from Java, and each replacement of a DLL gained more speed. Of course, it also has to do with application design, and those apps might have been written badly, but the most gain in performance was achieved by optimizing the Java code. Of course, we started by optimizing database access, minimizing data movement between resources (files, database, soap), and using heavy caching. This was by far the greatest gain. Example: instead of a quarterly number crunching run of 32 hours we are now at 6 hours (... so far!) with the new Java application - and it not only computes the same numbers with the same amount of data in the database but also computes additional data and writes a lot more processing log messages.
We found most hot spots during our profiling sessions. So, with this deep knowledge of months of profiling and testing, we don't want to stop after this quite good success. Why stop optimizing if you still see hot spots that you don't really accept? Even with these "micro optimizations" (using final, wrapping logger messages, using StringBuilders, appending chars instead of strings, using unsynchronized collections, etc.) we are still gaining more time, and every minute we gain is in the end cash money. So in this situation, suggestions like "don't use Java" are just senseless. It's just a question of how to get even further. Most of what I've learned about Java was during profiling and debugging: building test cases and comparing different algorithms and methods. I don't care about optimizing our client apps because they're interactive apps that don't need to be some milliseconds or even seconds faster - the cost/benefit relation is too bad. Optimizing a server application for 2 weeks to just gain a few minutes is more valuable than others might imagine.
So in the end, I just wanted to discuss here how to improve speed in Java and not how to improve speed in general for writing applications. It's sad that you always have to justify yourself _why_ you want to achieve something. If I had written all this in the first post, nobody would have read it, because I also don't want to read posts with so much text.
Just another example: who cares optimizing BigDecimal scaling? With my custom scaling method, we gained minutes (!) in our number crunching app - and of course, in this thread there also were people convincing me to not use Java or that I'm plain "stupid" being so "narrow minded" to only want to speed up standard Java APIs.
I think you have way too much free time.
What about scaling in hardware?
If you're down to squeezing minutes off your application, wouldn't a greater speedup be achieved with upgraded hardware?
That would naturally cost some money, but it would free your time to work on other parts of the system(s) that may not have been optimized as much as that.
I'm all for optimization, but when it comes down to nano-optimization, I would move on.
You've pissed off a lot of people and you're pissed off at them (or rather us) as well, but you can't just say "that's useless info", considering that most of the answers in this thread have been constructive, even if they criticize your way of seeing things.
Cheers,
Jussi
The original question was
Faster split than String.split() and StringTokenizer?
I can see I've been wasting my time actually providing an answer to the question.
> Just another example: who cares optimizing BigDecimal
> scaling? With my custom scaling method, we gained
> minutes (!) in our number crunching app - and of
> course, in this thread there also were people
> convincing me to not use Java or that I'm plain
> "stupid" being so "narrow minded" to only want to
> speed up standard Java APIs.
And in that thread as well were people who pointed out to you that your use of BigDecimal was incorrect.
If you're going to micro-optimize, fine. Perhaps the runtime you save offsets the programmer time invested. But if you dismiss the people who do, in fact, know more than you about the inner workings of Java, just because you don't want to hear what they say ... well, as long as you're not working with me, that's fine too.
And as a side note: you keep reviving this thread. Do you really expect people to say, "yes, Martin, you were right all along?"
So why do you respond to this thread? You keep insisting on your points and accusing me of insisting on my points. I just explained _why_ I need to optimize Java code and therefore why I find the suggestions useless that try to convince me to do something different like "don't call the method at all" or "don't use Java" - such answers are plain useless - at least after I tried to make my point clear. But I could have saved the time for it anyway as you don't care about the requirements and circumstances of other people. It's just your way that is the best. Of course, you probably would gain more speed if you had written it in assembly in the first place, but that was not the requirement.
Your suggested answer to my question of how to save gas when driving my car boils down to: don't drive, use your feet. Which is plain useless if you have 25 Km to drive all day long.
Thanks for leading this thread to boring arguing ... is there a "close" feature?
> So why do you respond to this thread?
I'm responding to this thread because you've ignored the responses in the BigDecimal thread where it was pointed out that you're constructing your BigDecimals improperly. That is not a matter of your view versus others, it's a matter of you doing the incorrect thing and refusing to admit it. Given that you've described your company as being in financial services, they (and you) have a fiduciary obligation to fix such errors.
> Thanks for leading this thread to boring arguing ... is there a "close" feature?
Yeah, stop reviving it.
> Thanks for leading this thread to boring arguing ...
> is there a "close" feature?
Looking at the timestamp for replies 22 & 23 and 28 & 29, I would suggest that you don't drive and use your feet. If you don't get useful answers, you shouldn't be resurfacing this thread every week.
> at least after I tried to make my point clear
I've tried to make my point clear as well. Too bad you waste all your time responding to postings that didn't attempt to answer the question. So much for acknowledging my attempt at helping. I guess I won't bother the next time you have a question.
> >
> > And I have used profilers in both.
> >
> > The standard idiom .....
> > 1. Find the hotspot(s)
> > 2. Find how many times it is called.
> > 3. Find how long it takes.
> >
> > Reducing the number of calls, when multiple calls
> > exist will produce the greatest speed up.
> >
> > You dismissed that idea with no qualification.
> >
> > And if I had an application where the primary and
> > overriding requirement was speed I would write it in
> > C++. Then profile it. And for selected hotspots it is
> > at least possible I would re-write in assembler. I
> > have certainly done that in the past.
>
> That proves that you don't have much experience in
> optimizing Java:
>
> > Reducing the number of calls, when multiple calls
> > exist will produce the greatest speed up.
>
> because this is still plain wrong (It _can_ be right
> for specific situations, but it doesn't guarantee the
> greatest speed up). It always depends: sometimes
> routines that are just called a few times (but take
> long) are better to be optimized. Sometimes, methods
> that are called tens of thousands of times can't be
> further optimized or wouldn't gain much (compared to
> other hot spots). There is no "do this and it gets
> faster" - that's the long and detailed work of
> profiling big applications in detail.
>
Perhaps you didn't understand what I meant by 'hotspot'. That term refers to the place where the code actually spends the most time.
Or perhaps you thought that I didn't understand that some things can't be changed. So to clear it up, yes I understood that. I thought that was a given.
And I thought this would be obvious but apparently it isn't. Take a routine that is called only once. One can optimize it by doing one and only one of the following.
1. Do not call it at all.
2. Optimize it to make it faster.
I would think that it is obvious to most people that regardless of what you do in 2 that 1 will always be faster.
The fact that a routine might be called 10,000 times should only make that more obvious.
Could you provide an example of code that would be faster if you execute it than if you don't execute it?
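To make the "reduce the number of calls" point concrete for the split case in this thread, here is a minimal sketch. It assumes that the same input strings recur often enough for a cache to pay off, which nobody has confirmed for the original poster's data. The class SplitCache and its method names are hypothetical and only for illustration; the inner loop is the indexOf/substring idea posted earlier, simplified to always keep empty tokens.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from any post in this thread): remembers split
// results so that a repeated input skips the split work entirely.
// Assumes non-null input and single-threaded use.
final class SplitCache {
    private final Map<String, List<String>> cache = new HashMap<>();
    private final char separator;
    private int splitCalls = 0; // how often the real split actually ran

    SplitCache(char separator) {
        this.separator = separator;
    }

    List<String> split(String text) {
        List<String> cached = cache.get(text);
        if (cached != null) {
            return cached; // option 1: the split routine is not called at all
        }
        List<String> result = Collections.unmodifiableList(doSplit(text));
        cache.put(text, result);
        return result;
    }

    int getSplitCalls() {
        return splitCalls;
    }

    // option 2: the routine itself, based on indexOf and substring
    private List<String> doSplit(String text) {
        splitCalls++;
        List<String> result = new ArrayList<>();
        int start = 0;
        int end = text.indexOf(separator);
        while (end >= 0) {
            result.add(text.substring(start, end));
            start = end + 1;
            end = text.indexOf(separator, start);
        }
        result.add(text.substring(start)); // trailing token, possibly empty
        return result;
    }
}
```

Whether this helps depends entirely on the data: if nearly every line is unique, the map only adds memory and hashing overhead, and the profiler has to decide.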
> For the last 15 months I concentrated in optimizing
> our existing Java applications that basically do
> heavy number crunching, imports, exports (text, csv,
> xml, soap, sql), and also massive database access.
> The result of our optimizations was that we already
> rewrote 3 server applications that were written in
> VisualBasic and C. The result was a speedup of
> factors 10 - 50 (!).
Redesigns are often capable of great speed ups. That is because traffic patterns are known rather than guessed at.
That doesn't surprise me at all.
> During reimplementation we had a
> phase where we had parts in Java and parts still as
> the DLLs called from Java and each replacement of a
> DLL gained more speed. Of course, it has also to do
> with application design and those apps might have
> been written badly, but the most gain in performance
> was achieved in optimizing the Java code. Of course,
> we started optimizing database access, minimizing
> data movement between resources (files, database,
> soap), usage of heavy caching. This was by far the
> greatest gain. Example: instead of a quarterly number
> crunching of 32 hours we now are at 6 hours (... so
> far!) with the new Java application - and it not only
> computes the same numbers with the same amount of
> data in the database but also computes additional
> data, has a lot more processing log messages.
>
Good for you. In one case I looked at requirements and by changing those, rather than the design or implementation, I went from a 12 hour run and 3 months of development time down to a 2 minute run and 3 days of development.
> We found most hot spots during our profiling
> sessions. So, with this deep knowledge of months of
> profiling and testing, we don't want to stop after
> this quite good success. Why stop optimizing if you
> still see hot spots that you don't really accept?
Because of decreasing gains. At some point the increasing long term maintenance costs along with the decreasing real gains are no longer economically worthwhile.
> Optimizing a server
> application for 2 weeks to just gain a few minutes is
> more valuable than others might imagine.
>
Stated like that - no it isn't. If you qualified where those minutes were gained and what impact it actually has then it might be.
For example if the server now starts 2 minutes faster it is meaningless. On the other hand if each server request used to take 121 seconds and now takes only 1 second then it might be. I say might be because the type of transactions matters as well.
> So in the end, I just wanted to discuss here how to
> improve speed in Java and not how to improve speed in
> general for writing applications. It's sad that you
> always have to justify yourself _why_ you want to
> achieve something. If I wrote all this in the first
> post, nobody would have read it, because I also don't
> want to read posts with so much text.
If you only want to get the answers that you want then I suggest you start your own board and pay people to post there. Then if you don't like the answers you can fire them.
>
> Just another example: who cares about optimizing BigDecimal
> scaling? With my custom scaling method, we gained
> minutes (!) in our number crunching app - and of
> course, in this thread there also were people
> convincing me to not use Java or that I'm plain
> "stupid" being so "narrow minded" to only want to
> speed up standard Java APIs.
There are many, many posters here that post a question and are absolutely convinced that there can only be one possible answer or one possible type of answer.
They are mistaken.
