Rotation Invariant Image Matching Algorithm
Can anyone give me a clear idea as to how do I perform an Image Matching between a source Image and the target Image(source Image rotated to right or left ) . I mean if i have the exact same Image i get the best match with my current algorithm .But if the Image is rotated how do I get the comparision.Is there a specific algorithm someone could refer me to?
[365 byte] By [
Superprada] at [2007-10-1 10:42:03]

I personally don't know of any nice algorythems, but it seems to me that when an image is rotated, its edges will retain many of their properties. Specifically the angles between segments of edges should remain the same, that may or may not be useful to you.
Assuming that you already know the center of rotation the traditional way of doing something like this is to construct a measure that selects an orientation.
It is just a little easier to understand if you imagine that you have a vector diagram. i.e. a bunch of lines drawings between a bunch of points. If you think of each point in the picture as having a mass, that mass is some distance from the origin, the product of the two is a torque (torque is a vector, in this case it is mass times vector (x,y) since we are assuming a mass of 1 the torque is just the vector)
Now if you add the torques from all the masses you get the total torque (this is just the vector sum of the individual moments). It is unlikely that this sum will be exactly zero. This means that if you tried to balance the picture at the origin it would tip in the direction of the torque because the mass is not perfectly distributed.
This is a distinguished direction for the picture. You can think of it as a vector from the origin pointing toward where most of the mass lies.
Now to compare two images that are indentical except for rotation, you can use the distinguished distance of the torque to orient the pictures in the same way. For example you could just rotate the two images that you are going to compare so that their torque vectors are basically pointing upward.
Now in the case of images, were you don't have vectors, you can do filtering to turn the image into the vectors that you are looking for. For example you could ignore all the pixels that are NOT bright red and compute the torque only of the locations of the red pixels. What if there are no red pixels because it is a black and white image. Well, then pick some other measure that makes sense for your image. All you need is a filter that will select some small subset of pixels which you then treat as the points that have mass.
Note: this is a very general technique and has considerable application. For instance, given two images you can computer the center of mass of the green components, the center of mass of the red components and the center of mass of the blue components. This gives you a little triangle of one red, one green and one blue dot for each picture. A single affine transform will orient and map one triangle into another. This affine transform will similtaneously, scale, sheer, rotate, and translate the two images into allignment so that they are equally centered on their red, green, and blue coordinates.
So, if this tenchnique is so excellent, why have we not solved the problems of machine vision long ago? because it is rare that you are ever comparing two things that really are exactly alike. You may have a picture of your face against a green background and then that same identical face, cut and pasted (and rotated first in photoshop) onto a blue background. Because the backgrounds are different the blue moments and the green moments of these pictures have nothing to do with one another.
Thus finding parts of pictures that are similar to parts of other pictures requires a little more judgment (or rather judicious tuning of algorithm parameters) This is of course the fundamental problem in doing motion estimation between adjacent scenes in a movie for the purposes of extreme video compression.
So there is an outline of a method that may or may not be appropriate in your case, and a pointer to where you hit the literature (video compression, in particular MPEG4) for sources of way more complicated algorithm that are even more specialized and even less likely to be appropriate to your purposes.
Hope that helps
Enjoy!
Thanks, marlin314, for the tutorial. However, I see a problem with the method you describe.
When rotating a bitmap the pixels do not lign up with the orientation of the image. For instance, a vertical black line, one pixel wide, hundred pixels long, when rotated 45 degrees will fit in a square 71 by 71 pixels. There are 71 pixels along the diagonal of that square, which correspond to the rotated line. The rasterizer may also color some of the neighboring pixels gray for anti-aliasing or smoothing purposes. Note that there are no gray pixels in the original. So the two pictures are definitly not identical; there is loss of information when rotating a bitmap. (Windows evens warns me that I loose information when I rotate a jpeg image by 90 deg, which I do not understand why is necessary.)
How similar do the two images need to be in order to assert that they are "the same"? Is there anything that is invariant under rotations (e.g., the total amount of red, blue and green), that could be used by an algorithm?
Welcome to the real world. In most classification algorithms you must have the notion of "close enough"
An exact square pixel on one image rotates into an exact square (but one that is not oriented vertically and horizontally) so the inverse image of a single pixel under a rotation map becomes a mixture of several pixels on the image that you compare to. So you must do a blending or filtering operation before you compare. (purists will calculate the areas of overlap of that rotated square to grid aligned squares - lazy folks will just look at where the 4 corners land and flat average the 4 values)
When you have the blended pixel value you take the square of the difference between that and the target value and that is your error term.
You sum your error terms across a somewhat uniform sample of pixels and if it isn't too big you declare it a prefect match and hope that the plane does not fly into the mountain that it though was clear sky. :)
I know this sounds rather gross, but it is really exactly the same problem when you try to compare two floating point numbers. 3 * 1/3 does not necessarily equal 1.0000000000000000. In real number problems you never check for one number equal to another, you always test if it is "close" to some target number.
How close you have to be to declare two images to be "equal" is very dependent upon your situation. What you do is accept the fact that there must be thresholds in any recognition problem and you determine the appropriate thresholds by taking images that are the same, rotating them, grinding through your algorithm and seeing how big the error terms are. Then you take two images that are certainly NOT the same and see how big the error terms are. Then you take two that are a near miss and test.
This testing will force you to think about what sorts of images you really are going to work with which can help you tune your algorithms. For example in all this discussion, I still have no idea whether what you wish to rotate consists of photo images, Black and white line drawings, cartoons with large patches of uniform color. You would probably do different things with those different kinds of images.
While it would be nice if total mass of green were conserved in a rotation, and it should be nearly conserved, it would be unwise to assume that that will be the case. Testing is required.
While it will not really help you in the design of an algorithm, it might be useful for you to think along these lines. If you take an nice rasterized image and rotate it, the result is NOT equal to the first. I mean obviously it is rotated, but in another sense it is also not equal because you can't get back home. If you perform the inverse rotation you do NOT get back the thing you started with. You now have a different slightly degraded image. (just like dividing 1 by 3 gave you a real number that was not really one third, and taking the inverse operation does NOT in general get you back where you started)
Rotation in a raster world is a lossy operation so equality is no longer an appropriate concept. "Close enough" is all you can ever hope to get so it will have to be close enough.
One other point since you did mention invariants. All this image processing to rotate the images and compare things and filter them can take a lot of time. It might be very useful to have a quick rejection routine. Something that tells you that one image is NOT the rotation of another. Things that ought to be invarient like total mass of red, while not exaclty invarient, should be real close. So if you average the amoung of red in one image and get 17.2345 and in the other image you got 156.982 you can be fairly safe in concluding that one image is NOT a rotation of the other. For those images you can skip the whole rotation and comparison jazz.
Interesting! marlin what you described makes sense.But, not completely convincing. Ok let me explain. If we have a case where we have an Image with trees and mountains and a man standing under a tree (For example).Now if I want my comparision routine to identify a templet from that Image say try to find the man alone then how would this averaging work. So the tempelet is just a man and the Image is lot of other stuff along with the man. Will the algorithm you described give me a perfect match( or say close match) for the templet in the main Image . I guess now calculating the Red content will not be applicable as we have lot more of unnecessary stuff in the main Image.
I can think of a simple approach where we correlate the tempelet with the original Image and find the bright stop of the best correlation. but this fails when the Image is rotated or shifted. Then how would the algorithm work? Hope I was clear.
I understand most of what you say, marlin. What I do not understand is why the torque method would work. For instance, if you are filtering al lthe pure red pixels to compute the compound torque you may get very different results because some of the pure red pixels become pinkish or blueish in the rotated image, and there is not a one-one mapping between the pure red pixels in the two images. If it works it is not inuitive why it works. Do you have some extra contraints such as:
1) all weights are equal
2) the sum of the weights = 1
?
I was assuming that the weights are all equal, but I don't know that that is required. You could make the weight vary based on how much red you found in each pixel. Thus if the red got blended across two pixels you would still pick up all the mass. I think either method would work.
how well it works depends on what your pictures are like. Most photo pictures do not have one single red pixel in a field of green. There will be areas that are all mostly the same color. These areas will contribute a bunch of vectors to the sum that are all pretty much the same. This will make the torque tend to point to that massive blob of red (assuming that there is one. - if not you try another color)
What you are doing is trying to find any measure that is mostly rotation invarient like mass of a color, and are using any lopsidedness in that measure to reorient the two pictures so that they are lopsided in the same direction. Once you have rotated the two images. You must then just compare the pixels one on one and collect an error term and see how bad it looks.
In the example I gave I talked about red pixels but of course you can do just about anything and get away with it. You could for example declare red anything that has an R value > 200 (typically RGB all range between 0 and 255) . Just about anything that you do with color, should in theory be rotation invarient and could work as a filter.
However you may run into picture that has no red at all. But since it does not matter what color you choose, when you get a diagram that does not have enough of the color you wanted, or it has too much you can choose another color and refilter with that.
As I said, I was assuming that you take the mass of any pixel that passes the filter as 1 and any that do not pass as zero. However if that is too harsh you can always soften the measure. For example you could declare the mass of a pixel to be 0 if the red value is <200 and (Red Value - 200)/55 for any value over. This is assigning a full mass of 1 to any completely red pixel with RedValue = 255 and ramping down to zero as you move away from full on.
The trouble is, that until you hack out some code and see how slow it is and how well or how poorly it works, it is hard to determine what you need to do and what you can ignore. Generally the less you do the faster it runs. It is hard to beat just hacking out some measure and looking at what it does.
BTW the measure that is used to find scene changes in film and video is pretty cute. The images on the screen is changing all the time. You got people running on and off screen. You have the camera panning and following someone, you have your explosions. Where are the scene changes? The measure that works very well considering how simple it is, is to just choose a couple hundred colors across the full spectrum, and create a historgram of how much of each color is in the picture. Then you compare histograms from frame to frame.
As the camera zooms in on a face, the pink level goes up, When the bomb goes off the red level and the brightness increase. The historgrams are ALWAYS changing, BUT they don't change ALL that much from one fraction of a second to the next until you get a scene change. So if you just look at the sum of the squares of the differences in the histogram, you find that if the number is big you declare it a scene change and if it is small it is not a scene change.
So what is the magic threshold number? Who knows? You got to encode some video and look. Depends on too many factors to tell from the outside.
So take a picture, rotate it. write some code to look at all the pixels in some circular region around the center of rotation (ignore pixesl where x*x + y*y > some threshold - you did think of limiting your computation to a circle, didn't you) calculate some torque vector using any method you like, both on the original picture and on the rotated one. If the vectors don't differ by the amount that you rotated the picture, then that measure clearly won't work and you need to look at why. Time to stop thinking and start hacking.
Enjoy!
This would work for any manipulation (rotation, flipping, anything else) of the image as long as you dont change color's.
0. (optional) Perform color space conversion and then perform color segmentation.
1. Divide the image into several partitions
2. Perform a general histogram on the image, and a local histogram on the partition.
3. Perform a distance measure (eg chi-square) for the general histogram. This gives you a general idea how close the 1st image compared with the 2nd image.
4. if the general histogram distance is not sufficient perform a bipartite matching algorithm (bma) using the distance measure as the distance in between nodes. This should give you a value on how different the image layouts are.
weaknesses:
How many partitions to divide the images with.
Histograms should be normalized.
I suggest reading more on CBIR techniques, I remember using Ornella Mich's and James Z. Wang's paper as guides in my undergrad days. Although I have read lots of other papers in those days.
I know that somewhere there is a rotation/scale invariant shape filter, this would be useful for you if you're processing monochromatic images.
Most CBIR systems would use a multistep filter system, using a fast-to-best ordering. This way the slower algorithms that gives better results only processes a few selected images.
It would also help if you could filter from which domains your pictures are coming from.
HiDo you have any sample code or API to compare source & target image(say Biometric Finger Print).Thanks in advance
i want some information to "compare two images"+"algorithm " related to that
> i want some information to "compare two> images"+"algorithm " related to thatHow does the information that is already in this thread meet or not meet your needs? Discuss.Did you mean to put this query to Google?