This post is about how you can process a photo (a spy photo) of a document using rectification and moving average thresholding in Matlab.
We have all been in the situation where we want to take a photo of a classmate’s lecture notes. Because we know that poor lighting will push up the ISO level and create a noisy image, we find a way to introduce some light. Then we get home, review the photo and realise that it is too dark and harshly lit, and thus a bad idea to waste toner on. The lighting is also spotty, which doesn’t make for a good document to print out: we can’t simply brighten the image, because that would just make the already bright part unreadable. Also, I should have taken the photo straight on.
These problems can’t be unique. After all, the spies in the movies with their small spy cameras should have the same problem. So, I had to look up what kind of camera they use. Yes, in the world of fiction, I know… The camera is called Minox, and it’s the real deal. While searching on Youtube I found a channel dedicated to spotting the Minox in the movies. It seems to have been in television and movies since the sixties. For more information, see the Wikipedia article about Minox. For the Youtube channel, search for “minox in the movies” on Youtube and you will find countless examples. Anyway, that camera certainly has these problems.
The photo shows a page from the book I’m reading, “Digital Image Processing” by Gonzalez and Woods, third edition. It was taken using my old (but trusty) Canon Powershot SX130 IS.
Changing the perspective
A very useful function in Matlab is imwarp. It warps a 2-D image using a geometric transformation. In our case, we want the warp to be projective and defined by the four corners of the document in our photo. To create the projective transformation object, called tform, we call fitgeotrans:
tform = fitgeotrans([pts_x; pts_y]', [x_; y_]', 'projective');
pts_x and pts_y are the coordinates of the four corners, picked with ginput. The x_ and y_ vectors are the coordinates where those corners should end up after the transformation. For simplicity, I assume that we pick the points pts_* starting with the upper left corner and then pick the remaining three points clockwise.
x_ = [0, 400, 400, 0]; y_ = [0, 0, 150, 150];
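Putting the pieces together, the rectification step could look roughly like the sketch below. The checkerboard test image and the corner coordinates are made-up stand-ins for the photo and the ginput clicks, and cropping the result with an imref2d output view is an assumption on my part rather than something the steps above require. It needs the Image Processing Toolbox.

```matlab
% Sketch: rectify a synthetic image using hypothetical corner points.
% Requires the Image Processing Toolbox.
im = checkerboard(20);                 % 160x160 stand-in for the photo

% Hypothetical corners, upper left first, then clockwise.
% With ginput the clicks come back as column vectors, so they would
% need transposing to match the row-vector layout used here.
pts_x = [30, 130, 150, 10];
pts_y = [20,  25, 140, 130];

% Target rectangle, same corner order.
x_ = [0, 400, 400, 0];
y_ = [0, 0, 150, 150];

tform = fitgeotrans([pts_x; pts_y]', [x_; y_]', 'projective');

% Assumption: limit the output to the target rectangle (150 rows, 400 columns).
out_view = imref2d([150, 400], [0, 400], [0, 150]);
rect = imwarp(im, tform, 'OutputView', out_view);

% Sanity check: the picked corners should map onto the target corners.
[cx, cy] = transformPointsForward(tform, pts_x', pts_y');
corner_err = max(abs([cx' - x_, cy' - y_]));
```

Without an output view, imwarp picks an output size large enough to hold the whole warped image, which can become very large when the perspective is strong.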
An example of the perspective correction, also called rectification, on a photo of Marilyn Monroe.
The photo of Marilyn is more interesting to study than a document because of the artefacts introduced, see below. Read more about her USO tour at historybyzim.
The artefact is due to the lower resolution at the lower right corner and it seems that there is not much we can do about it.
Now, in the document version we can suddenly see that the upper part of the document has become wavy. The wave is due to the page curving away from the spine of the book; the rectification simply exaggerates the effect. It is therefore important to hold the page down while taking the photo.
Now on to the shading issues.
Destroying the shadows
A problem we have now is that the image has a bright spot in the center, but also shadows and darker parts around the edges. Generally we cannot eliminate this shading by simply brightening the image and increasing its contrast, and not even a global threshold can eliminate all shading in all cases. What actually does very well is using the moving average as a threshold. This is an adaptive method which is ‘aware’ of the local shading: each pixel is compared against the average of the pixels just before it rather than against one fixed value.
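To see why one global threshold struggles with uneven lighting, here is a minimal sketch on a single synthetic scanline. The illumination ramp, the ink reflecting 40% of the local light, and the parameter values (ws = 20, thresh = 0.8, global cut at 0.5) are all made up for illustration and are not taken from the book or the photo.

```matlab
% One scanline of a synthetic page: the light ramps from dim to bright.
n = 200;
light = linspace(0.3, 1.0, n);

% Three dark strokes of "ink" at hypothetical positions.
ink = false(1, n);
ink([20:22, 100:102, 180:182]) = true;

row = light;
row(ink) = 0.4 * light(ink);          % ink reflects 40% of the local light

% Global threshold: a single cut for the whole line.
global_bw = row > 0.5;

% Moving average threshold: compare each pixel to a scaled running mean.
ws = 20;                              % window size
thresh = 0.8;                         % fraction of the local mean
avg = filter(ones(1, ws)/ws, 1, row); % mean of the last ws pixels
local_bw = row > thresh * avg;

% Count pixels where each result disagrees with the true paper/ink layout.
global_errors = nnz(global_bw ~= ~ink);
local_errors  = nnz(local_bw  ~= ~ink);
```

In this toy case the global threshold turns the dim left part of the paper black, while the moving average version separates ink from paper along the whole line.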
Here is a Matlab version of the moving average thresholding described in “Digital Image Processing” by Gonzalez and Woods, third edition, section 10.3.8. The arguments to the function are ws, the window size, and thresh, the threshold value.
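function im = movavg_filter(im, ws, thresh)
% im: grayscale image, ws: window size, thresh: fraction of the local mean
[u, v] = size(im);
im(2:2:end, :) = fliplr( im(2:2:end, :) );  % reverse every other row (zig-zag scan)
im = im';
im = im(:)';                                % unroll the rows into one long line
k = ones(1, ws)/ws;                         % moving average kernel
filt = filter(k, 1, im);                    % mean of the last ws pixels
im = im > thresh*filt;                      % white if brighter than the scaled mean
im = reshape(im, v, u)';                    % back to the original u-by-v shape
im(2:2:end, :) = fliplr( im(2:2:end, :) );  % undo the row reversal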
The filter is interesting because it uses a kind of “sliding window” along each horizontal line and then thresholds the original image against the resulting moving average. The method also calls for the moving average to be carried out line by line in a zig-zag pattern, to reduce illumination bias. It seems that this is the standard way of applying an efficient thresholding filter.
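The zig-zag ordering is easy to check on a tiny matrix. The sketch below uses a made-up 3-by-4 matrix and the same flip, transpose and reshape steps as the function, and then reverses them.

```matlab
% Small made-up matrix; the values double as the normal reading order.
A = [1  2  3  4;
     5  6  7  8;
     9 10 11 12];
[u, v] = size(A);

% Reverse every other row, then unroll row by row.
Z = A;
Z(2:2:end, :) = fliplr(Z(2:2:end, :));
scan = reshape(Z', 1, []);   % 1 2 3 4 8 7 6 5 9 10 11 12

% Undo: reshape back and reverse the same rows again.
B = reshape(scan, v, u)';
B(2:2:end, :) = fliplr(B(2:2:end, :));
```

Because the second row is read right to left, the moving average window stays on neighbouring pixels when it crosses from one line to the next instead of jumping back to the opposite edge of the image.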
After the threshold we have a black and white version of the photo. The resolution is rather low because the perspective correction required more memory than I had available on my laptop when I did this experiment. I then stretched the image and corrected the bend of the text in Photoshop using Warp, which is not the same as the Matlab function. It basically lets you move larger parts of the image in a seamless way.