Virtual Database? Load Tables into Memory
Hi All,
because of slow performance, I'm currently considering taking the tables from my database and loading them into memory for faster access. My application isvery query intensive and some coworkers have suggested to me to try loading the database tables into memory (using collections) and 'query' the collections. Because of the database structure I'm using, I don't think doing this will be that difficult. Time consuming, but not difficult.
Has anyone done this before? If so, how was the performance of your application affected?
I'm maintaining a single connection to the database. So openning and closing connections is not the issue. I'm 99% sure it's the massive amount of queries that's causing the problem.
I'm working on an Insurance application that uses a database to store deductibles, limits, base premiums, and a whole bunch of factors used to produce coverage premiums. To calculate a premium I must go to the database to get the factors that should be applied to each coverage. This results in MANY queries.
Here is an example of a trimmed down version of a Liability coverage calculation:
protectedvoid applyCoreFactors(int mode){
//Retrieve the rate engine
AutoRateEngine rateEng = this.vehicle.getAutoRateEngine();
//Driving Record
this.applyFactor("Driving Record",
rateEng.getLiabDRFactor(this.vehicle.getVehicleDrivingRecord()),
String.valueOf(this.vehicle.getVehicleDrivingRecord() ) );
//Driving Class
this.applyFactor("Driving Class",
rateEng.getLiabCLFactor(this.vehicle.getDriverClass()),
String.valueOf(this.vehicle.getDriverClass()));
//Limit Factor
int rateLimit = this.getLimit();
if(mode != RATE_PAGE_MODE){
//Check for new business discounts
rateLimit = rateEng.getLiabLimitDiscount(this.getLimit(),
this.vehicle.getAutoPolicy().getPolicyTerm());
}
//Apply the limit factor
this.applyFactor("Limit",
rateEng.getLiabLimitFactor(rateLimit),
String.valueOf(rateLimit));
}
The AutoRateEngine handles the database queries.
It I've left anything out that could help better inform you of my problem, please let me know. And thanks for any help!
Mike
# 1
Depends on the data volume and the queries being done.Caching is fairly standard in database layers.Note that a 64 bit VM will greatly increase the memory limit.
# 2
Hey,
thanks for the reply.
The queries being performed by a call such as rateEng.getLiabLimitFactor(rateLimit)
are very simple SELECT statements that return a single String or double value.
The problem is, there may be up to 10 queries like this for a single coverage (such as Liability and Collision) and there may be up to 16 coverages on a single vehicle. So that's in the worst case senario, say, about 80 queries since some converages only have 1 or 2 factors from the database applied.
I should also note that performance isn't a problem when processing/rerating a single vehicle's coverages. That takes less that a second. It's when doing mass processing of tens or hundreds or thousands of vehicles, which can take several hours.
Hope that provides some more info about my problem.
# 3
> Hey,
>
> thanks for the reply.
>
> The queries being performed by a call such as
> rateEng.getLiabLimitFactor(rateLimit)
> are very simple SELECT statements that return a
> single String or double value.
Then for caching a simple has works.
>
> The problem is, there may be up to 10 queries
> like this for a single coverage (such as Liability
> and Collision) and there may be up to 16 coverages on
> a single vehicle. So that's in the worst case
> senario, say, about 80 queries since some converages
> only have 1 or 2 factors from the database applied.
>
Another solution is to optimize those to a single call.
Create a proc with 10 parameters. The proc deals with the fact that some of those might be null.
1 call will be significantly faster than 10.
> I should also note that performance isn't a problem
> when processing/rerating a single vehicle's
> coverages. That takes less that a second. It's when
> doing mass processing of tens or hundreds or
> thousands of vehicles, which can take several hours.
>
I can only suppose that you are doing this via java. And then the question would be why? Why not just do the entire task in the database? Databases are very well suited for this. Moving such tasks like that into the database can easily achieve several orders of magnitude in performance.
# 4
Hi jschell,
first to answer your question about why not just move this rating to a database. Well, there are two main reasons why right now.
1 - Some of the Factors applied to a coverage must be determined based on vehicle(s), driver(s) and policy information. This would require some major query coding to simulate the algorithms programmed using Java, and may be more then a simple database like MS Access could handle. I may be wrong about that. Though I do know that it is much easier (to me at least) to test, debug and code the rating of coverages using Java than SQL.
2 - Forecasting... we make changes to the rates, coverage factors, base premiums and rerate entire books of business to see how they will be impacted (i.e. forecasting). Some of these changes can be complicated and require a significant amount of logic to be programmed. For me, it's just easier to do in Java.
So I guess what I'm saying is simply that I prefer to use Java for these calculations. It may not be the best solution, but it does work and is easier for me. If I have to go the database way, then I will, but not yet...
I just found out about Apache's Java Caching Service (http://jakarta.apache.org/jcs/) and it looks like exactly what I need! Has anyone ever used this before?
Basically, it will mange the caching of my database values for me. As you may imagine its a lot more technical than that, but that's all that I care about right now. I'm going to test it out on monday and will post my results.
jschell, I have considered doing a single query for several factors at once as you have suggested and it's a bit more difficult than it sounds. The order and method that rating factors must be applied in cause the problem. That is a good idea, no doubt, but will cause several headaches right now and may make it more difficult for others to pick up and understand how my code is working. But is another good suggestion that I will consider BEFORE the "move it to the database" suggestion.
Thanks jschell and anyone else watching this Thread.
# 5
>
> 1 - Some of the Factors applied to a coverage must be
> determined based on vehicle(s), driver(s) and policy
> information. This would require some major query
> coding to simulate the algorithms programmed using
> Java, and may be more then a simple database like MS
> Access could handle. I may be wrong about that.
> Though I do know that it is much easier (to me at
> least) to test, debug and code the rating of
> coverages using Java than SQL.
>
Yes that is probably going to be more that MS Access SQL can handle.
> 2 - Forecasting... we make changes to the rates,
> coverage factors, base premiums and rerate entire
> books of business to see how they will be impacted
> (i.e. forecasting). Some of these changes can be
> complicated and require a significant amount of logic
> to be programmed. For me, it's just easier to do in
> Java.
From experience I can only suggest that you be careful with that model. If you create a report 'now' using values (say a rate) and then those values change then you will not be able to reproduce a report in the future for 'now'. And it can be rather hard even to explain in the future why the report that was generated now has the values that it has.
Consequentely one should always maintain such values as historical data. In the past I have even maintained intermediate values.
>
> Basically, it will mange the caching of my database
> values for me. As you may imagine its a lot more
> technical than that, but that's all that I care about
> right now. I'm going to test it out on monday and
> will post my results.
>
Keep in mind that only helps if you reuse the values. It doesn't help if you just process the entire table sequentially.
> jschell, I have considered doing a single query for
> several factors at once as you have suggested and
> it's a bit more difficult than it sounds. The order
> and method that rating factors must be applied in
> cause the problem. That is a good idea, no doubt,
> but will cause several headaches right now and may
> make it more difficult for others to pick up and
> understand how my code is working. But is another
> good suggestion that I will consider BEFORE the "move
> it to the database" suggestion.
Performance optimizations in general will always make the code harder to understand and maintain.
Adding more comments is my solution to that.
# 6
Hey readers,
sorry for the delayed reply... busy trying to get my cache working properly.
I created a simply cache for my read-only database using a single HashMap. Performance was increase by about 1100%
Run without cache: 23hrs 51mins 37 secs
Run with 95% of database calls using cache: 2hrs 1mins 20secs
I just added another main database query call to my cache and I expect that to give my reports a run time of about 1hr 45-55mins.
Thanks for the help jschell. And to anyone out there that suspects their database is a bottleneck, definately test out caching. I didn't realize exactly how big of a bottleneck my database was.
Now I'm going to also move my database to either MSSQL or MySQL, index it, and expect to see yet another increase.
Good luck to anyone else with this same problem.
# 7
The best way to increase performance is by profiling the application. Using an automated tool for that makes it easy.