Image storage algorithm for photo album sites

Hi there,

I am coding an image storage website (photo albums etc.), but storing the images in a folder system is getting complex for me. What is the best way of storing images in folders?

What I have done so far: every member has their own main folder named with a hex number, and I rename the uploaded images with hex numbers as well.

The result looks like this (the first part is the user hex, the second is image-hex.jpeg):

4324AFC3BC2390DE / 45F83AECB43B2A.jpeg

It's an unprofessional but simple way that works. The image hex is hard to predict if the image is published as a private picture.

If a user has 2000 photos, it may be hard for a single folder to index them.

When I look at professional photo album sites such as webshots.com or flickr, the image folders look like this:

67/434/4543/345/4534/5443F432ACD342.jpeg

So there are a lot of subfolders. But what is the logic and algorithm behind it? How do they name these subfolders? How do they group images into folders?

What can I do in Java to get the best performance under heavy traffic? The archiving mechanism must be fast at indexing and easy to maintain.

Please give me ideas or point me to some web links.

Thank you,

BuraK

[1217 byte] By [netsonicca] at [2007-11-27 2:03:29]
# 1

Get a hash of the picture's filename with a timestamp attached to it, [ MD5(filename + timestamp).jpg ]. If I were to do this, my directory structure would be:

/images/userid/photoid.jpg

OR, if you want it to be organized by albums:

/images/userid/albumid/photoid.jpg

/images/154/2102/fc0586aca6e42cffade83252446d0613.jpg

There, you can have 4 subdirectories. I sure hope you are also using a database, such as MySQL, with this project too?
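
A minimal Java sketch of the naming scheme above, assuming numeric user and album IDs and a millisecond timestamp. The class name PhotoNames and both method names are made up for illustration. MD5 is used here only to produce a file name, not as a security measure, and two uploads with the same file name in the same millisecond would get the same ID.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: builds photo IDs and relative paths as described in the post.
final class PhotoNames {

    // Returns the MD5 of (originalName + timestamp) as 32 lowercase hex characters.
    static String photoId(String originalName, long timestampMillis) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                    (originalName + timestampMillis).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so each byte prints as exactly two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Builds /images/userid/albumid/photoid.jpg
    static String relativePath(long userId, long albumId, String photoId) {
        return "/images/" + userId + "/" + albumId + "/" + photoId + ".jpg";
    }
}
```

For example, PhotoNames.relativePath(154, 2102, PhotoNames.photoId("holiday.jpg", System.currentTimeMillis())) gives a path of the same shape as the one shown above.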

-Kevin

kmangolda at 2007-7-12 1:45:50 > top of Java-index,Other Topics,Algorithms...
# 2

Yes, I use MySQL for this project. I get the hex numbers from the MySQL database.

If you group the pictures by album ID, then moving a picture from one album to another means moving the file from one folder to another, which is a lot of unnecessary I/O.

How about creating sectors, say folders of about 64 MB each (just an example)?

/userhex/sector1/imagehex.jpg

/4EA7C3D33/23452A/2C1AEEF42.jpeg

When sector1 reaches 64 MB, I will create a sector2 folder and continue with that.

I don't know what this gains me, but it's just an idea.
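
The sector idea could be sketched in Java roughly as follows. This assumes the running byte count per user is tracked by the application (for instance loaded from and saved to the database) instead of being measured from disk on every upload. SectorAllocator and its methods are hypothetical names, and the "sector" + number folder naming is just one possible choice.

```java
// Hypothetical helper: decides which sector folder the next upload goes into.
final class SectorAllocator {
    private final long maxSectorBytes;
    private int currentSector = 1;
    private long bytesInCurrentSector = 0;

    SectorAllocator(long maxSectorBytes) {
        if (maxSectorBytes <= 0) {
            throw new IllegalArgumentException("maxSectorBytes must be positive");
        }
        this.maxSectorBytes = maxSectorBytes;
    }

    // Returns the sector number for a file of the given size.
    // A new sector is opened when the file would push a non-empty sector over the limit;
    // a file larger than the limit still gets a sector of its own.
    synchronized int assign(long fileBytes) {
        if (bytesInCurrentSector > 0 && bytesInCurrentSector + fileBytes > maxSectorBytes) {
            currentSector++;
            bytesInCurrentSector = 0;
        }
        bytesInCurrentSector += fileBytes;
        return currentSector;
    }

    // Builds /userhex/sectorN/imagehex.jpeg
    static String relativePath(String userHex, int sector, String imageHex) {
        return "/" + userHex + "/sector" + sector + "/" + imageHex + ".jpeg";
    }
}
```

With new SectorAllocator(64L * 1024 * 1024), uploads keep landing in sector1 until the next file would exceed 64 MB, then move on to sector2.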

netsonicca at 2007-7-12 1:45:50
# 3

You have a point there, but how often do you expect users to be moving photos between albums? The user sectors would work as well; they are similar to albums, but still different. Once the file is uploaded, moving it to another folder on the same drive only updates the file system's directory entries; the file data itself is not rewritten.
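
In Java (7 or later) such a move can be written with java.nio.file.Files.move. The sketch below assumes both folders are on the same file system; PhotoMover and moveToAlbum are hypothetical names. The ATOMIC_MOVE option makes the call fail with an exception instead of quietly falling back to copy-and-delete when a plain rename is not possible.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: moves a photo file into another album folder.
final class PhotoMover {

    // Moves the photo into targetAlbumDir, keeping its file name, and returns the new path.
    static Path moveToAlbum(Path photo, Path targetAlbumDir) {
        try {
            Files.createDirectories(targetAlbumDir);
            Path target = targetAlbumDir.resolve(photo.getFileName());
            // ATOMIC_MOVE requests a rename; it throws if the move cannot be done atomically.
            return Files.move(photo, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```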

kmangolda at 2007-7-12 1:45:50
# 4
In some cases, their crazy sub-directory structure is also a way to discourage people from randomly guessing paths to actual pictures.
kmangolda at 2007-7-12 1:45:50
# 5

I haven't tested the performance, but I don't think the crazy subdirectory system is a good one.

It is hard to guess, though :)

I think serverip/userhex/sectorhex/imagehex is enough, but then the server's hard drive may run out of space before the sectors reach their maximum size.

It needs to be controlled from several directions.

I will keep looking into it. Any help will always be appreciated.

netsonicca at 2007-7-12 1:45:50
# 6

You really don't need to spread files over so many directories. Take a look at how a hard drive works. Files are written as they are received; just because there is a directory structure doesn't mean the data within a directory is stored next to each other. As long as you have an absolute or relative path to the file you need, your seek time is pretty much however long it takes the drive to get to that sector. Subdirectories are more a way to section off files for the user's convenience. It's easier to look at 100 directories than at 10000 files.

kmangolda at 2007-7-12 1:45:50
# 7

> In some cases, their crazy sub-directory structure is also a way to discourage people from randomly guessing paths to actual pictures.

And what you are looking at is a URL. It is not necessarily the case that the "path" structure in a URL corresponds in any way with a directory structure on the server. The web application can interpret that URL in whatever way it likes.
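
To illustrate the point, here is a minimal Java sketch in which only the last segment of the URL path matters and the real storage location is looked up separately. UrlResolver is a hypothetical class, and the in-memory map stands in for whatever lookup (for example a database query) a real application would use.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: maps a public URL path to wherever the file is really stored.
final class UrlResolver {
    private final Map<String, String> storagePathByImageId;

    UrlResolver(Map<String, String> storagePathByImageId) {
        this.storagePathByImageId = storagePathByImageId;
    }

    // Ignores every "directory" in the URL and uses only the file name (minus extension) as the key.
    Optional<String> resolve(String urlPath) {
        String fileName = urlPath.substring(urlPath.lastIndexOf('/') + 1);
        int dot = fileName.lastIndexOf('.');
        String imageId = dot >= 0 ? fileName.substring(0, dot) : fileName;
        return Optional.ofNullable(storagePathByImageId.get(imageId));
    }
}
```

With such a resolver, 67/434/4543/345/4534/5443F432ACD342.jpeg and x/5443F432ACD342.jpeg would both lead to the same stored file, whatever folder it actually lives in.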

DrClapa at 2007-7-12 1:45:50
# 8

If you are using a database to store information about the image, then you can use the primary key (or an obfuscated version of it) as the name of the file.

That way when you move images between folders/groups you don't have to move the physical file as its primary key stays the same. All that is needed is a DB update to change the folder/group relationship in the table.

A side note: Many file systems have severe performance problems if you store too many files in a single directory. Subdirectories can really help. Use part of the primary key as a directory name and the rest as the file name.
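
One possible way to split a key into directories and a file name, sketched in Java. It assumes a numeric (long) primary key and takes the directory names from the low-order hex digits, since those vary the most when keys are assigned sequentially. ShardedPath, forId, and the two-level layout with 256 folders per level are illustrative choices, not a recommendation from this thread.

```java
// Hypothetical helper: turns a numeric primary key into dir/dir/file.ext
final class ShardedPath {

    // Formats the id as 16 hex digits, uses the last two digits as the first directory,
    // the two digits before those as the second directory, and the remaining 12 as the file name.
    static String forId(long id, String extension) {
        String hex = String.format("%016x", id);
        String level1 = hex.substring(14, 16);
        String level2 = hex.substring(12, 14);
        String name = hex.substring(0, 12);
        return level1 + "/" + level2 + "/" + name + "." + extension;
    }
}
```

For example, ShardedPath.forId(0x1234L, "jpeg") returns 34/12/000000000000.jpeg. Because the path depends only on the key, moving an image between albums never changes it.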

matfud

matfuda at 2007-7-12 1:45:50