System scalability issue
Hi,
We are in the process of developing an application which will be used to upload several feed files (Excel and CSV format). These files will be processed on the application server using POI (to read the Excel format), and then, after certain processing, the data will be stored in staging tables in the database.
The same system will be used to download Excel reports based on the data present in the staging tables. These reports will be generated by Java components written using the POI API.
Now the problem is POI. It consumes a lot of memory, so if a large number of users request report downloads, the system crashes because of the limited memory size.
Could anyone provide me some good design strategy to resolve this serious problem?
Thanks in advance,
Amit
Solutions - one or both of the following:
1. Increase the amount of memory available.
2. Decrease the amount of memory required.
Thanks for the early reply.
But this is a constraint I am facing. I cannot increase the memory size. And I feel that this is not a feasible solution: if, say, 100 new users are added to the system tomorrow, scalability will be at risk again.
How can I reduce the memory consumption?
If possible, kindly suggest some design pattern.
Thanks,
Amit
1. Restrict the number of users that can access that facility at one time.
2. Process the reports separately (like at night) in a batch process. When users request a file you give them the file that has already been created.
3. Use something besides POI. I know Crystal Reports will extract data from a database and output a variety of formats including Excel. They also have a report server which allows for all sorts of options. There are probably other solutions as well.
4. A user requests that a report is created. That request is stored. A batch process (runs all the time) processes each request sequentially. The user comes back later and retrieves the generated report.
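Option 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `ReportQueue`, the ticket numbers, and the `Callable<byte[]>` standing in for the POI report builder are all hypothetical and not part of the original application. The one design point it shows is that a single worker thread means only one report is being built at any moment, no matter how many users ask.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: store report requests and build them one at a time. */
final class ReportQueue {
    // A single worker thread: requests queue up and run sequentially.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<Long, Future<byte[]>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextTicket = new AtomicLong();

    /** Stores the request and returns a ticket the user can come back with. */
    long submit(Callable<byte[]> reportBuilder) {
        long ticket = nextTicket.incrementAndGet();
        jobs.put(ticket, worker.submit(reportBuilder));
        return ticket;
    }

    /** True once the report for this ticket has been generated (or has failed). */
    boolean isReady(long ticket) {
        Future<byte[]> job = jobs.get(ticket);
        return job != null && job.isDone();
    }

    /** Waits for the report if necessary, then hands it over and forgets it. */
    byte[] fetch(long ticket) {
        Future<byte[]> job = jobs.remove(ticket);
        if (job == null) {
            throw new IllegalArgumentException("unknown ticket " + ticket);
        }
        try {
            return job.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("report generation failed", e.getCause());
        }
    }

    /** Lets queued jobs finish, then stops the worker thread. */
    void shutdown() {
        worker.shutdown();
    }
}
```

A real version would probably write each finished report to disk and keep only the file path, since holding finished reports in memory works against the goal.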
> Thanks for the early reply.
>
> But this is a constraint I am facing. I cannot
> increase the memory size. And I feel that this is not
> a feasible solution: if, say, 100 new users are added
> to the system tomorrow, scalability will be at risk
> again.
>
> How can I reduce the memory consumption?
> If possible, kindly suggest some design pattern.
For one, make sure you aren't holding onto objects longer than you need to. Share any resources you can.
The only way to deal with this without decreasing memory requirements (by using another package perhaps) is to detect that you are exceeding requirements and block until there is enough room for the request to execute. This can be tricky but you can use a queue and catch the OOME (there is probably a more direct approach) to make sure each request will succeed. How this will affect the user experience is another matter.
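One possible "more direct approach" to blocking until there is room is a counting semaphore used as a memory budget, so requests wait up front instead of running into an OOME. The sketch below is an assumption-laden illustration: `MemoryGate` is a hypothetical name, and both the total budget and the per-report estimate in megabytes are numbers you would have to measure yourself. The JVM does not enforce them.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: admit a job only while an estimated memory budget allows it. */
final class MemoryGate {
    private final int totalMb;
    // One permit stands for 1 MB of heap that report jobs are assumed to need.
    private final Semaphore budget;

    MemoryGate(int totalMb) {
        this.totalMb = totalMb;
        this.budget = new Semaphore(totalMb, true); // fair, so waiters are served in order
    }

    /** Blocks until estimatedMb of budget is free, runs the job, then gives the budget back. */
    <T> T run(int estimatedMb, Supplier<T> job) {
        if (estimatedMb > totalMb) {
            // Such a request could never be admitted, so fail fast instead of waiting forever.
            throw new IllegalArgumentException("estimate exceeds total budget");
        }
        budget.acquireUninterruptibly(estimatedMb);
        try {
            return job.get();
        } finally {
            budget.release(estimatedMb);
        }
    }

    /** Budget not currently claimed by running jobs. */
    int availableMb() {
        return budget.availablePermits();
    }
}
```

With a gate created as `new MemoryGate(64)`, a call such as `gate.run(16, () -> buildReport())` (where `buildReport` is a placeholder for the POI code) would wait whenever fewer than 16 MB of the budget are free. The user-experience cost mentioned above still applies: waiting requests hold their HTTP threads unless this is combined with a request queue.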
A profiler is needed. Use JVMStats to watch what the GC is doing while you run.
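As a rough complement to a profiler (not a replacement for one), the standard `java.lang.management` API can report heap usage and GC activity from inside the application, for example logged before and after each report is generated. The class name `HeapSnapshot` and the output format are made up for this sketch.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: one-line summary of heap usage and GC activity. */
final class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();
        sb.append("heap used=").append(heap.getUsed() / MB).append("MB");
        // getMax() returns -1 when the JVM does not define a maximum.
        sb.append(" max=");
        if (heap.getMax() < 0) {
            sb.append("undefined");
        } else {
            sb.append(heap.getMax() / MB).append("MB");
        }
        // One entry per collector; counts and times are cumulative since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(" | ").append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }
}
```

Comparing the output of `HeapSnapshot.describe()` before and after a single report gives a crude per-report memory estimate, though a GC running in between will distort it.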
You could just make the creation of the Excel document (with POI) a distributed task, which can then be done on a separate server. You can call this distributed process from the webserver and it will run on a separate machine, which relieves your webserver.
By also putting a load balancer in front to decide which POI server each request goes to, you can add as many of them as needed.
I built a similar system for converting audio / image files recently, and it works like a charm.
Let me know if you need more info on how to make this work.
Mark
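The load-balancing step in the post above could be as simple as rotating through a list of worker addresses. This is a minimal sketch, not the system described in the post: `RoundRobinPicker` and the host names in the usage note are hypothetical, and the actual remote call to a POI server is left out.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: choose the next POI worker in round-robin order. */
final class RoundRobinPicker {
    private final List<String> workers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPicker(List<String> workers) {
        if (workers.isEmpty()) {
            throw new IllegalArgumentException("need at least one worker");
        }
        this.workers = List.copyOf(workers); // defensive, immutable copy
    }

    /** Thread-safe; floorMod keeps the index valid even after the counter overflows. */
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), workers.size());
        return workers.get(index);
    }
}
```

For example, `new RoundRobinPicker(List.of("poi-1", "poi-2"))` would hand out `poi-1`, `poi-2`, `poi-1`, and so on. Round-robin ignores how busy each worker is, so a real setup might prefer a least-loaded policy.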
> Could anyone provide me some good design strategy to resolve this serious problem?
You're assuming your 'serious problem' is a weak design strategy, though you don't highlight what the design strategy was. (Though I can guess it was something like "We need this yesterday, so get coding now and don't waste time capturing the requirements or designing" aka some variation of XP).
Your problem is a weak or nonexistent development process. Issues such as machine memory load and performance are non-functional requirements. These have only a tangential relationship to your software design strategy. If the development process you're using does not take these non-functional requirements into account then it is a deeply flawed process.
That is your company's or manager's fault, not yours. You are not going to change your development process overnight but you NEED to start changing it today.
No design strategy is going to save you from incompetent management or absence of process. Only you can do that; you need to learn how to handle your manager or find a job where it is not necessary.
Go and talk to the [project] manager. If necessary, take a quarter-pint cup and a 2-pint jug of water with you. Tell him you will squeeze another 100 users onto the system when he gets all the water into the cup. Leave him to stew while you go to lunch. Next time he comes to see you, unless he has a solution, give him another 7 cups.
> You could just make the creation of the Excel
> document (with POI) a distributed task, which can
> then be done on a separate server. You can call this
> distributed process from the webserver and it will
> run on a separate machine, which relieves your
> webserver.
This is assuming that the OP can acquire more servers. I'm going to wager that if the OP can't get more memory, he can't get more servers.
> The only way to deal with this without decreasing
> memory requirements (by using another package
> perhaps) is to detect that you are exceeding
> requirements and block until there is enough room for
> the request to execute. This can be tricky but you
> can use a queue and catch the OOME (there is probably
> a more direct approach) to make sure each request
> will succeed. How this will affect the user
> experience is another matter.
Unfortunately, catching OOME in your web app is not necessarily going to cut it. The problem is that the OOME will likely affect a number of threads almost simultaneously and could cause the container's own threads or methods to fail in ways that you cannot control. The whole container could become unstable, user sessions could become broken or other such things could happen.
Yes, a very robust container would try to anticipate these types of problems and handle them in a reasonable way. However, it's best to avoid ever getting in this position.
> Unfortunately, catching OOME in your web app is not
> necessarily going to cut it. The problem is that the
> OOME will likely affect a number of threads almost
> simultaneously and could cause the container's own
> threads or methods to fail in ways that you cannot
> control. The whole container could become unstable,
> user sessions could become broken or other such
> things could happen.
Good point. But isn't there a way to set a max allocation size per thread or something like that?
> Good point. But isn't there a way to set a max
> allocation size per thread or something like that?
Yes, but I think this has changed with Java SE 5.0 (it is more dynamic as described here: http://java.sun.com/docs/hotspot/gc5.0/ergo5.html#0.0.0.Thread-local%20allocation%20buffers%7Coutline) and of course the behavior is very JVM-specific. It might reduce the chance of two threads suffering from the problem concurrently, but won't address the problem that there is no way to control what code or what thread will hit the OOME.
> > Good point. But isn't there a way to set a max
> > allocation size per thread or something like that?
>
> Yes, but I think this has changed with Java SE 5.0
> (it is more dynamic as described here:
> http://java.sun.com/docs/hotspot/gc5.0/ergo5.html#0.0.
> 0.Thread-local%20allocation%20buffers%7Coutline) and
> of course the behavior is very JVM-specific. It
> might reduce the chance of two threads suffering from
> the problem concurrently, but won't address the
> problem that there is no way to control what code or
> what thread will hit the OOME.
If the sum of all the threads' max heaps is <= the total max heap, it seems to me that one thread could not cause an OOME in another thread. I'm not saying that's necessarily optimal but it is feasible, no?
> If the sum of all the threads' max heaps is <= the
> total max heap, it seems to me that one thread could
> not cause an OOME in another thread. I'm not saying
> that's necessarily optimal but it is feasible, no?
Basically, no. The allocation size is the unit of allocation, not the total amount allowed. Having threads get large blocks of memory from the heap and then suballocate from that before going back to the well is a tried and true method to avoid thread contention on the heap itself to improve performance. It doesn't directly address this issue.
>
> Basically, no. The allocation size is the unit of
> allocation, not the total amount allowed. Having
> threads get large blocks of memory from the heap and
> then suballocate from that before going back to the
> well is a tried and true method to avoid thread
> contention on the heap itself to improve performance.
> It doesn't directly address this issue.
The issue of the OP or a sub issue?
For the OP, if this were possible to implement in Java and for the particular application, then that would be one solution to the problem.
> The issue of the OP or a sub issue?
For the sub issue...