How to store and organize images uploaded by users
Hi,
I'm developing a web application based on Struts and Hibernate.
I'm looking for patterns, best practices, frameworks, or examples of how to save or, more importantly, how to organize images or media files uploaded by users in general.
How do I resolve the mismatch between Hibernate POJOs and the images on the file system?
Thx in advance for any suggestion!
[384 byte] By [artofWara] at [2007-11-26 14:21:35]

# 1
I guess you wouldn't want the image itself in the database, so give each file a filename and save that in the database. The filename can be the id of the row in which it was saved, with the file type suffix added.
Your POJOs then have a string property for the filename, or an ID you can convert to a name, and you can build <img> HTML elements with it.
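A minimal sketch of that naming scheme, assuming a numeric row id and a small whitelist of image types. The class name `ImageNames`, the method names, and the `/userimages` base URL are made up for illustration and are not part of Struts or Hibernate:

```java
import java.util.Locale;

/** Sketch: derive the stored file name from the database row id. */
final class ImageNames {
    private ImageNames() {}

    /** Builds e.g. "42.gif" from row id 42 and file type "gif". */
    static String fileNameFor(long rowId, String extension) {
        if (rowId <= 0) {
            throw new IllegalArgumentException("rowId must be positive");
        }
        String ext = extension.toLowerCase(Locale.ROOT);
        // Whitelist the suffix so the uploaded name never decides what is stored.
        if (!ext.matches("gif|jpg|jpeg|png")) {
            throw new IllegalArgumentException("unsupported file type: " + extension);
        }
        return rowId + "." + ext;
    }

    /** Builds the <img> element; baseUrl is a hypothetical mapping such as "/userimages". */
    static String imgTag(String baseUrl, long rowId, String extension) {
        return "<img src=\"" + baseUrl + "/" + fileNameFor(rowId, extension) + "\">";
    }
}
```

With this, `ImageNames.imgTag("/userimages", 7, "png")` produces `<img src="/userimages/7.png">`, and the name the user uploaded never reaches the file system.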
Does it work for you?
# 2
hi gewitter,
thanks for your answer!
This is one part of the approach I would choose too.
My question is aimed more at the organisation of the file system.
More like:
Store all images/files of the users to one folder?
Create a folder for each user to store images?
How to protect those folders from unauthorized access? (I'm already using Tomcat with a realm.)
Are there any patterns for java-based online-communities with multi-media upload? Any open-source solutions?
thx!
# 3
The filename of the image could contain the id of the user that the image/file is associated with, e.g. 5_0987983_blackbart.gif
Images should be stored in a single directory. You could name the directory 'userimages'. Depending upon the number of images of each user, you could create subdirectories for each user.
Is this a browser-based application? You could design it so the application does not expose the filenames or directory names, so unauthorized users would not know where to look.
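As a rough sketch of the 'userimages' layout with one subdirectory per user: the class `UserImagePaths` and its method are hypothetical names, and the only assumption is that each user has a numeric id. The check at the end rejects names such as "../6/x.gif" that would escape the user's own folder:

```java
import java.nio.file.Path;

/** Sketch: map (user id, file name) to a path below a single root directory. */
final class UserImagePaths {
    private final Path root;

    UserImagePaths(Path root) {
        // Work with an absolute, normalized root so prefix checks are reliable.
        this.root = root.toAbsolutePath().normalize();
    }

    /** Returns <root>/<userId>/<fileName>, refusing anything outside the user's folder. */
    Path resolve(long userId, String fileName) {
        Path userDir = root.resolve(Long.toString(userId));
        Path target = userDir.resolve(fileName).normalize();
        if (!target.startsWith(userDir) || target.equals(userDir)) {
            throw new IllegalArgumentException("illegal file name: " + fileName);
        }
        return target;
    }
}
```

For example, `new UserImagePaths(Paths.get("userimages")).resolve(5, "blackbart.gif")` points at `userimages/5/blackbart.gif` under the working directory.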
# 4
Hi GhostRadioTwo,
OK, I will create an images subdirectory for each user.
It's a web application. Any ideas how to make the security more flexible than the Tomcat realm, e.g. so it can change at runtime? Do I have to solve it programmatically?
THX!
# 5
How many users? Is this going to be clustered? How many images? What OS?
I had a problem once where a bad script was creating lots of little files. At some point the OS (SunOS) ran out of handles (on the order of 900K files) and said the disk was full. You need to consider these kinds of things before you choose your approach.
# 6
>>How many users? Is this going to be clustered? How many images?
>>What OS?
I think these questions are the right ones, because I don't have much experience in storing data on the filesystem, only RDBMS.
OS: Windows 2003 Server ;-)
Clustered: Maybe one day, if there is a need.
User/Images: Maybe a lot of users one day * about 5 images/user with max 200 kB each
My main questions are:
1. How to secure the user-folders with Tomcat-Realm/Struts?
2. How to organize the user-folder/filesystem.
3. Common pitfalls?
Message was edited by:
artofWar
# 7
> >>How many users? Is this going to be clustered? How
> many images?
> >>What OS?
> I think these questions are the right ones, because
> i'dont have much experience in storing data on the
> filesystem, only RDBMS.
>
> OS: Windows 2003 Server ;-)
> Clustered: Maybe one day, if there is a need.
> User/Images: Maybe a lot users one day * about 5
> Images/User with max 200kB
I know that some Unix systems have a limit around 1 million files on a file system so you might want to see what the limit is for your OS and whether you will ever approach this.
> My main questions are:
> 1. How to secure the user-folders with
> Tomcat-Realm/Struts?
Security isn't my area of expertise but I will tell what I think. I think you will have a problem doing this on your own. The best thing I can tell you is that you should use sessions for identifying members and never allow pages to accept just a username, even once the user is authenticated.
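To make the session point concrete, here is a minimal sketch of the check, assuming each image row records its owner's user id. `ImageAccess` and `mayView` are made-up names. In a servlet the first argument would come from the session, e.g. an attribute stored at login and read with `request.getSession(false).getAttribute(...)` (the attribute name is up to you), never from a request parameter:

```java
/** Sketch: decide access from the session's user id, not from anything the client sends. */
final class ImageAccess {
    private ImageAccess() {}

    /**
     * @param sessionUserId user id stored in the session at login, or null if not logged in
     * @param imageOwnerId  owner id recorded for the requested image
     */
    static boolean mayView(Long sessionUserId, long imageOwnerId) {
        return sessionUserId != null && sessionUserId.longValue() == imageOwnerId;
    }
}
```

A rule like this lives in your own code, so it can be changed or extended (friends lists, public albums) without touching the realm configuration.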
I'm also not sure that using a database is such a bad idea. Do you have a DBA you can talk to? Ask them whether they think storing that volume of data will be a problem.
> 2. How to organize the user-folder/filesystem.
Personally, I would create a folder per user for maintenance reasons but could do it based on user name.
If you use a database, you could create a user account on the db for each user (with very limited authority!) and use that as part of the security.
The reason that I asked about clustering is that a file-based solution is going to be more difficult to scale (I think). Databases often support clustering and can support many webserver instances at once.
> 3. Common pitfalls?
I would say that with any system the biggest pitfall is underestimating the effort and the amount of time needed for testing and refinement. Avoid adding things that are not needed right away. If you see a future need, try not to paint yourself into a corner but don't write it.
Also make sure you are using tools because they solve a problem. Why do you need Hibernate? Make sure you can answer that question before you use it. If you can't answer it, you might want to reevaluate your approach.
# 8
Hi dubwai,
i found this article:
http://databases.aspfaq.com/database/should-i-store-images-in-the-database-or-the-filesystem.html
I think most of the cons are related to SQL Server. I use Oracle 10g and I have had good experience storing binary data in Oracle via Hibernate.
That is one reason to use Hibernate; another is to have a good OO design built with an O/R mapper, with important features like lazy loading, eager fetching, etc.
Even if there is no need to cluster the Tomcat, some independent nodes behind a load balancer are possible in the future.
So I would have to keep the images redundantly on each node if I use the file system, or use a dedicated file server, maybe on an Apache, right?
The other approach is to save all images to the database and create them temporarily on the file system when needed. Then the clustering option for Oracle is available and it's much easier to implement the security.
I think saving the images in the database and creating them temporarily on the file system is the cleanest approach from an architecture point of view.
So I will implement this and do some performance tests.
Thanks for helping!
# 9
> i found this article:
> http://databases.aspfaq.com/database/should-i-store-images-in-the-database-or-the-filesystem.html
"if there is any chance that you will migrate to a different database platform, your current BLOB format might be incompatible with, or at least a pain to convert to, the new format -- since, like web browsers, each vendor has implemented things with their own slant. "
I guess. This won't be the only migration issue. It's not like there is no solution either.
"when your database really goes south, to the point where even the backup is useless, you still have the files on the filesystem (though their usefulness is questionable, depending on how much related data was kept in the database). Which is arguably better than losing all of your data *and* all of your files. "
Um. This is pretty much nonsense if you ask me. If the hard drive crashes, it crashes. What difference does it make if the data was in separate files or not? What kind of database error makes the backups useless? Databases (Oracle, I know) can easily be mirrored to another instance.
"having the images in the file system allows you to access the images from many different standard applications (FTP, web browser, etc) without having to write application code to pull the data out of the database, since you can't just 'SELECT image FROM table' and have the image appear in Enterprise Manager or Query Analyzer. "
If you don't need these things, I don't see what value they provide.
"with some databases, e.g. Access and MSDE, the data inside is limited to 2 GB (SQL Server Express 2005, as of July 2004, is planned to support 4 GB), whereas the file system is only restricted by the size of the volume. Also, most hosts charge a premium for SQL Server storage space, so in that case it would be cheaper to store them in the file system. "
I think you are maintaining your own DB and Oracle can hold a lot more than this. I still think you should talk to an Oracle expert before continuing. In particular, how will deletes affect performance? I've seen an Oracle DB become useless for over a day because it was blocking while reclaiming memory after deletes. This was (I think) caused by a misconfiguration of the database.
"in SQL Server 7.0,.."
Who cares? That just makes me think SQL Server blows.
"performance wise, including an <IMG SRC> tag generated by the database and pointing to a file that already exists is going to be faster than pulling the file out of the database, generating a temp file on the web server, and streaming that to the user. Also, table scans take more resources when there is an image datatype as opposed to a varchar that simply holds a 'pointer' to the file's location. "
You don't need to write to a temp file. Your Java code can stream the data directly from the database. I question the assertion that the file storage will be faster, especially as the number of files grows.
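A minimal sketch of streaming without a temp file, using plain JDBC. The table and column names (`user_image`, `data`) and the class `BlobStreamer` are assumptions for illustration, and how well `getBinaryStream` performs on a BLOB column depends on the driver, so test it against your Oracle setup. In a servlet, `out` would be the response's output stream:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch: copy image bytes from a result set straight to an output stream. */
final class BlobStreamer {
    private BlobStreamer() {}

    /** Copies everything from in to out in 8 KB chunks; returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Returns false if no image with that id exists (or its data is NULL). */
    static boolean streamImage(Connection con, long imageId, OutputStream out)
            throws SQLException, IOException {
        // Hypothetical schema: table user_image with columns id and data.
        String sql = "SELECT data FROM user_image WHERE id = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, imageId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return false;
                }
                try (InputStream in = rs.getBinaryStream(1)) {
                    if (in == null) {
                        return false;
                    }
                    copy(in, out);
                    return true;
                }
            }
        }
    }
}
```

Only one buffer's worth of the image is held in memory at a time, which matters more than the temp file question once many clients request images concurrently.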
"can be quite complicated extracting images, say from an Access database, since it adds OLE header info to the file (see KB #175261). With SQL Server it's not so bad; see KB #173308 for an example that works right out of the box, KB #258038 for a VB example using ADODB.Stream, and KB #194975 for samples that use ADO's GetChunk and AppendChunk methods. "
It's not hard in Java, although Oracle does make it a little tricky.
"The number of KB articles involving BLOB/IMAGE columns in SQL Server is astounding. Here is a brief subset:"
Why do people choose SQL Server again?
# 10
> One reason to use Hibernate, one other reason is to
> have a good OO-design made with a OR-Mapper with
> important features like lazy-loading, eager-fetching
> etc.
I'm not sure it makes sense for this part of your application. It may make sense for the rest (e.g. user management.) When someone requests a file, you retrieve it. What are the performance implications of having hibernate manage the blobs? If you use the normal Oracle approach, you can stream it to the client browser from the DB. I doubt caching will be feasible with this amount of data.
# 11
Again, talk to an Oracle expert about doing this. You won't know how it will handle large amounts of data until you get there or unless you test it, which I also strongly recommend. See what happens when you get 10 Gigs of data in the DB. Then see what happens when you take it to 100 GB. What's the limit for Oracle? I'm guessing more than that. Take it to about 10 times what you expect to have to deal with in the foreseeable future. And make them good tests. Use different sizes of files. Retrieve different files in rapid succession in a random order with enough clients to max-out the number of connections to the DB.
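A rough sketch of such a test driver, with the actual retrieval left as a callback so it can wrap a database read, a file read, or an HTTP request. `RandomReadLoadTest` and its parameters are made-up names. Each simulated client fetches every image id once, in its own random order:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongConsumer;

/** Sketch: many concurrent clients retrieving images in random order. */
final class RandomReadLoadTest {
    private RandomReadLoadTest() {}

    /** Runs the given number of clients in parallel; returns the total number of fetches completed. */
    static long run(List<Long> imageIds, int clients, LongConsumer fetch) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                final long seed = c; // a different, reproducible order per client
                Callable<Long> client = () -> {
                    List<Long> order = new ArrayList<>(imageIds);
                    Collections.shuffle(order, new Random(seed));
                    long done = 0;
                    for (long id : order) {
                        fetch.accept(id);
                        done++;
                    }
                    return done;
                };
                results.add(pool.submit(client));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // rethrows any failure from a client
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Set `clients` to at least the size of the connection pool and wrap the call with `System.nanoTime()` to compare runs at 10 GB, 100 GB, and so on.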
# 12
I have experience with Oracle and terabytes of data in it, so I think handling a few hundred GB of data shouldn't be a problem. But I will ask some Oracle DBAs for their opinion on storing binary data in Oracle.
I think the missing part of my design concept was to stream the images directly from the database via a servlet, like here:
http://www.devx.com/getHelpOn/Article/11698/1954?pf=true
I will directly call the servlet in the <img>-tags in the JSPs.
Many thanks again for your help and food for thought!
# 13
Good luck. Sounds like fun.