Posts Tagged ‘image processing’

Convert two-page color scan of book into monochrome single pdf

Friday, November 25th, 2016

Two days in a row now I’ve had to visit the physical library to retrieve an old paper. Makes me feel very authentic as an academic. Our library has free scanning facilities, but the resulting PDF will have a couple of problems. If I’m scanning a book then each page of the pdf actually contains 2 pages of the book. Depending on the scanner settings, I might also accidentally have my 2 pages running vertically instead of horizontally. Finally, if I forgot to set the color settings on the scanner, then I get a low-contrast color image instead of a high-contrast monochrome scan.

Here’s a preview of a pdf of an article from a book I scanned that has all these problems: scanned low contrast color pdf

If this pdf is in input.pdf then I call the following commands to create output.pdf:

pdfimages input.pdf .scan
mogrify -format png -monochrome -rotate 90 -crop 50%x100% .scan*
convert +repage .scan*png output.pdf
rm .scan*

output monochrome pdf

I’m pretty happy with the output. There are some speckles, but the simple -monochrome flag does a fairly good job.

I use Adobe Acrobat Pro to run OCR so that the text is selectable (haven’t found a good command line solution for that, yet).

Note: I think the -rotate 90 is needed because the images are stored rotated by -90 degrees but the input.pdf is compositing them after rotation. This hints that this script won’t generalize to complicated pdfs. But we’re safe here because a scanner will probably apply the same transformation to each page.

Closed mesh of piece-wise constant height field surface from an image

Wednesday, May 6th, 2015

I pushed a little function box_height_field.m to gptoolbox. This function creates a height field from an image where each pixel becomes an extruded square (rather than just a point/vertex). There are two modes: a fast, vectorized version which doesn’t add the vertices needed to make the mesh closed (though the underlying surface will still be “water-tight”), and a slower version which really creates a perfectly closed height field. Here’s the result of the second one on the red-channel of the Hans Hass image:

im = im2double(imresize(rgb2gray(imread('hans-hass.jpg')),0.1));
[V,F] = box_height_field(im);

box height field

Here’s the same result but computed after quantizing the colors:

im = round(im/0.25)*0.25;
[V,F] = box_height_field(im);

box height field

You can clearly see the piecewise-constant regions. Using remesh_planar_patches you can reduce the number of vertices without losing any information:

[W,G] = remesh_planar_patches(V,F);

box height field

VHS filter matlab script

Wednesday, April 23rd, 2014

A while ago I found this killer tutorial for applying a VHS look to an image. I was able to whip up a little matlab filter to do it procedurally. Here’s the vhs function:

function V = vhs(im,varargin)
  % VHS Apply a VHS filter to an image
  % V = vhs(im)
  % V = vhs(im,'ParameterName',ParameterValue,...)
  % Inputs:
  %   im  h by w by c image
  %   Optional:
  %     'VerticalLoop' followed by whether to loop the bent strip vertically
  %     over time and output a sequence of images.
  % Output:
  %   V  h by w by c by f image of (f long sequence of images)

  looping = false;
  % Map of parameter names to variable names
  params_to_variables = containers.Map( {'VerticalLoop'},{'looping'});
  v = 1;
  iter = 1;
  while v <= numel(varargin)
    param_name = varargin{v};
    if isKey(params_to_variables,param_name)
      v = v+1;
      % Trick: use feval on anonymous function to use assignin to this workspace
      feval(@()assignin('caller',params_to_variables(param_name),varargin{v}));
    else
      error('Unsupported parameter: %s',varargin{v});
    end
    v = v+1;
  end

  if looping
    % start the bent strip at the top and let it travel down over time
    strip_top = 1;
  else
    strip_top = ceil(0.4*size(im,1));
  end

  first = 1;
  while true

    A = im;
    B = im;
    C = im;
    D = im;
    % exclusion blend like photoshop
    ex = @(T,B) 0.5 - 2.*(T-0.5).*(B-0.5);
    % kill color channels
    A(:,:,1) = 0;
    B(:,:,2) = 0;
    C(:,:,3) = 0;
    % Shift color layers
    nudge = @(f) ceil(f*rand(1)*size(im,2));
    A = A(:,mod(nudge(0.02)+(1:end)-1,end)+1,:);
    B = B(:,mod(nudge(0.02)+(1:end)-1,end)+1,:);
    C = C(:,mod(nudge(0.02)+(1:end)-1,end)+1,:);
    A = A(mod(nudge(0.005)+(1:end)-1,end)+1,:,:);
    B = B(mod(nudge(0.005)+(1:end)-1,end)+1,:,:);
    C = C(mod(nudge(0.005)+(1:end)-1,end)+1,:,:);
    % exclusion blend colored layers and alpha blend with original

    F = D+0.3*(ex(ex(C,B),A)-D);
    N = rand(size(im));

    % inverse mapping function
    bend_w = (2*rand(1)-1)*5;
    bend = @(x,u) [mod(x(:,1)+(1-x(:,2)/max(x(:,2))).^2*bend_w,max(x(:,1))) x(:,2)];
    % maketform arguments
    ndims_in = 2;
    ndims_out = 2;
    tdata = [];
    tform = maketform('custom', 2,2, [], bend, tdata);

    % Bend strip
    strip_h = ceil(1/4*size(im,1));
    strip = mod(strip_top+(1:strip_h)-1,size(im,1))+1;
    F(strip,:,:) = imtransform(F(strip,:,:), tform);

    % overlay gray line
    ol= @(T,B) (T>0.5).*(1-(1-2.*(T-0.5)).*(1-B))+(T<=0.5).*((2.*T).*B);
    G = repmat(0.75,[numel(-1:1) size(F,2) size(F,3)]);
    F(mod(strip_top+(-1:1)-1,size(F,1))+1,:,:) = ...
      ol( F(mod(strip_top+(-1:1)-1,size(F,1))+1,:,:),G);

    % overlay horizontal lines
    L = zeros(size(F));
    L(1:4:end,:,:) = 1;
    L(2:4:end,:,:) = 1;
    L = imfilter(L,fspecial('gaussian',[5 5],1.5),'replicate');
    F = ol(F,L);

    % Fade in random color gradient
    m = rand(1,2)*2-1;
    R = bsxfun(@plus,m(1)*(1:size(F,2)),m(2)*(1:size(F,1))');
    R = (R-min(R(:)))./(max(R(:))-min(R(:)));
    R = gray2rgb(R,colormap(jet(255)));
    F = F+0.05*(ol(F,R)-F);

    % soft light
    sl = @(T,B) (B>0.5).*(1-(1-T).*(1-(B-0.5))) + (B<=0.5).*(T.*(B+0.5));
    F = F + 0.15*(sl(F,N)-F);

    % sharpen
    V(:,:,:,iter) = imsharpen(F);


    strip_top = strip_top+round(0.057*size(im,1));
    iter = iter+1;
    if ~looping || strip_top > size(im,1)
      break;
    end
  end
end


Doing it procedurally is cool because you can generate a whole animation of them. Like this:

kavinsky vhs animation

or this

rioux vhs animation

or this

dylan vhs animation

“Exposing Photo Manipulation with Inconsistent Shadows” on Tram 13

Thursday, October 17th, 2013

In our seminar today we discussed “Exposing Photo Manipulation with Inconsistent Shadows” by Eric Kee, James F. O’Brien, and Hany Farid. After the presentation I showed an extra example I’d witnessed in an advertisement on the Zurich Tram.

zurich tram mond original

Our drivers make this trip over 90 times a year.

The crux of “Exposing Photo Manipulation with Inconsistent Shadows” is that if you identify enough correspondences between points on shadow silhouettes and the objects that may have cast them, then you can determine a feasible region where the light source is. Any correspondences that are inconsistent with the others are evidence of image forgery.

So I walked through the algorithm on this image, marking plausible shadow correspondences as wedges or half-planes. The intersection of the consistent ones forms a non-empty feasible region for the light source. The suspicious shadow of the tram sign is inconsistent as it does not intersect with that region.

zurich tram add exposed
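The consistency test can be sketched in a few lines of matlab. This is only a rough illustration under my own assumptions, not the paper’s implementation: the constraint values in A and b are made up, the helper name feasible is hypothetical, and I simply sample candidate light positions on a grid (a real implementation would more likely solve a linear feasibility problem). A wedge is just the intersection of two half-planes, so it contributes two rows.

```matlab
% Each row i encodes one half-plane constraint A(i,:)*x <= b(i) on the
% 2D (image plane) position x of the light. All values are made up.
A = [ 1  0; -1  0;  0  1;  0 -1;  1  1];
b = [ 4;  0;  4;  0; -1];  % rows 1-4 bound the box [0,4]x[0,4], row 5 is x+y <= -1
% Sample candidate light positions on a regular grid
[X,Y] = meshgrid(linspace(-10,10,201));
P = [X(:) Y(:)];
% A set of constraints is consistent if some sample satisfies all of them
feasible = @(rows) any(all(P*A(rows,:)' <= repmat(b(rows)',size(P,1),1),2));
consistent   = feasible(1:4);  % true: the first four share a region
with_suspect = feasible(1:5);  % false: the fifth never meets that region
```

Here the fifth constraint plays the role of the tram sign’s shadow: on its own it is perfectly satisfiable, but it has no point in common with the region the others agree on.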

In my searching for the original ad, I also found this album cover.

tuba on the moon "Tears of the tuba"

It uses the same image of the moon! Looks like the tuba’s shadows are a bit more realistic than the tram sign’s. Maybe there’s a tuba on the moon?

Charade dataset, collection of clean shots of Audrey Hepburn, Cary Grant, and Jacques Marin

Monday, September 30th, 2013

After remembering that the 1963 version of the film Charade is in the public domain, I thought it could make a useful dataset for image processing and computer vision techniques. I watched the movie again and took screen grabs any time there was a clean shot of a main actor’s face.

charade still hepburn

charade still grant

charade still grant

charade still Marin

Download all 226 images

1920×1040 png (417 MB zip)

1920×1040 jpg (290 MB zip)

960×520 png (141 MB zip)

480×260 png (40 MB zip)

Transfusive Image Manipulation project page

Thursday, October 18th, 2012

transfusive image manipulation transfers edits on one image to other images of the same object
My colleagues, Kaan Yücer, Alexander Hornung, Olga Sorkine, and I have just submitted the camera-ready version of our paper “Transfusive Image Manipulation” to be presented at ACM SIGGRAPH Asia 2012. We’ve put up a Transfusive image manipulation project page where you can find the preprint version of the article, videos and implementation details and hopefully more to come.


We present a method for consistent automatic transfer of edits applied to one image to many other images of the same object or scene. By introducing novel, content-adaptive weight functions we enhance the non-rigid alignment framework of Lucas-Kanade to robustly handle changes of view point, illumination and non rigid deformations of the subjects. Our weight functions are content-aware and possess high-order smoothness, enabling to define high-quality image warping with a low number of parameters using spatially-varying weighted combinations of affine deformations. Optimizing the warp parameters leads to subpixel-accurate alignment while maintaining computation efficiency. Our method allows users to perform precise, localized edits such as simultaneous painting on multiple images in real-time, relieving them from tedious and repetitive manual reapplication to each individual image.

Accompanying video

Technical report: A Cotangent Laplacian for Images as Surfaces

Monday, April 23rd, 2012

A Cotangent Laplacian for Images as Surfaces
We decided to write up a quick, 2-page technical report about some ideas we’ve had in the last year.

By embedding images as surfaces in a high dimensional coordinate space defined by each pixel’s Cartesian coordinates and color values, we directly define and employ cotangent-based, discrete differential-geometry operators. These operators define discrete energies useful for image segmentation and colorization.

Getting raw data from an image of a chart

Wednesday, April 18th, 2012

I found this chart of CHF to USD conversion rates over the last 730 days online, but couldn’t easily find the data behind it:
chf to usd from april 2010 to april 2012

So I grabbed a screenshot and applied some thresholding in photoshop to get a 730-pixel-wide image of the data line:
chf to usd from april 2010 to april 2012 thresholded data line

Then using the following matlab calls I extracted the data:

im = imread('chfusd-04-2010-04-2012-thresh.png');
[I,J] = find(any(~im,3));
DM = sparse(I,J,size(im,1)-I);
D = max(DM);
minD = 0.96;
maxD = 1.39;
D = D/max(D(:))*(maxD-minD)+minD;

Which I can then replot any way I like:
chf to usd from april 2010 to april 2012 matlab plot
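As a sanity check on this kind of extraction, here’s a small round trip on synthetic data. Everything in it is made up for illustration (the sine-shaped series truth, the image size, the names rows and err); only the extraction lines mirror the calls above. It assumes, like the code above, that the image is cropped tightly so the lowest data point sits on the bottom row.

```matlab
% Hypothetical "rates": a sine wave scaled into the range [0.96,1.39]
w = 50; h = 40;
truth = 0.96 + (1.39-0.96)*(0.5+0.5*sin(linspace(0,2*pi,w)));
minD = min(truth);
maxD = max(truth);
% Rasterize one black pixel per column (row 1 is the top of the image)
rows = h - round((truth-minD)/(maxD-minD)*(h-1));
im = true(h,w);
im(sub2ind([h w],rows,1:w)) = false;
% Same extraction steps as above
[I,J] = find(any(~im,3));
DM = sparse(I,J,size(im,1)-I);
D = full(max(DM));
D = D/max(D(:))*(maxD-minD)+minD;
% Recovered values should match up to the pixel quantization
err = max(abs(D-truth));
```

The error is bounded by about half a pixel’s worth of the value range, so a taller thresholded image gives a more precise recovery.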

Map grayscale to color using colormap

Saturday, February 19th, 2011

In matlab you can view a grayscale image with:

imshow(im)

Which for my image im shows:
matlab imshow grayscale
And you can also view this grayscale image using pseudocolors from a given colormap with something like:

imshow(im,'Colormap',jet(255))

Which shows:
matlab imshow colormap jet 255
But it’s not obvious how to use the colormap to actually retrieve the RGB values we see in the plot. Here’s a simple way to convert a grayscale image to a red, green, blue color image using a given colormap:

rgb = ind2rgb(gray2ind(im,255),jet(255));

Replace the 255 with the number of colors in your grayscale image. If you don’t know the number of colors in your grayscale image you can easily find out with:

n = size(unique(reshape(im,size(im,1)*size(im,2),size(im,3))),1);

It’s a little overly complicated so that it can handle the case where im is already an RGB image.

If you don’t mind if the rgb image comes out as a uint8 rather than double you can use the following which is an order of magnitude faster:

rgb = label2rgb(gray2ind(im,255),jet(255));

Then with your colormapped image stored in rgb you can do anything you normally would with an rgb color image, like view it:

imshow(rgb)

which shows the same as above:
matlab imshow colormap jet 255
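To see roughly what the colormap lookup is doing, here’s a by-hand sketch that doesn’t need ind2rgb or gray2ind. It assumes im is a double image with values in [0,1] and rounds to the nearest colormap row, which may differ from gray2ind right at bin boundaries. The synthetic gradient and the name idx are just for illustration.

```matlab
n = 255;
cmap = jet(n);                        % n by 3 colormap with values in [0,1]
im = repmat(linspace(0,1,100),50,1);  % synthetic 50 by 100 gradient (stand-in image)
idx = round(im*(n-1))+1;              % 1-based row index into cmap for every pixel
% Look up one colormap row per pixel, then fold back into h by w by 3
rgb = reshape(cmap(idx(:),:),[size(im) 3]);
```

The reshape works because matlab stores arrays column by column: the first h*w entries of the looked-up list are the red channel, then green, then blue.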

Possible function names include real2rgb, gray2rgb.

Dithering in MATLAB

Tuesday, February 8th, 2011

Although I originally set out to find/implement a halftone algorithm in MATLAB, I got side-tracked having fun with dithering. MATLAB is unfortunately a poor fit for most of the standard dithering algorithms since they use a process called error diffusion: iterate from one pixel to the next, round it to the nearest color in your palette and pass on the error to neighboring pixels that have not yet been processed. Being iterative, this can run very slowly in MATLAB for large images. I briefly tried to imagine parallelizing this, but quickly decided to use random dithering instead. Here, every pixel is rounded up or down to the nearest color based on how its value compares against a random number. The big disadvantage of this method is that it tends to introduce so much static that the original detail is lost. Playing around with it I found a couple of simple ways to retrieve some of the detail and experimented with some methods of adding expression.
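For reference, the error diffusion process described above can be sketched with plain loops. This is a minimal version of one common scheme, Floyd-Steinberg, which I’m picking purely for illustration (the weights 7/16, 3/16, 5/16, 1/16 belong to that scheme). The synthetic gradient input and the name fs are made up so the sketch runs without an image file.

```matlab
im = repmat(linspace(0,1,64),32,1);  % synthetic 32 by 64 gradient (stand-in image)
fs = im;                             % working copy that accumulates diffused error
[h,w] = size(fs);
for y = 1:h
  for x = 1:w
    old = fs(y,x);
    new = double(old > 0.5);         % round to the nearest palette color, 0 or 1
    fs(y,x) = new;
    err = old - new;                 % rounding error to pass on
    % Push the error onto neighbors that have not been processed yet
    if x < w,          fs(y,  x+1) = fs(y,  x+1) + err*7/16; end
    if y < h && x > 1, fs(y+1,x-1) = fs(y+1,x-1) + err*3/16; end
    if y < h,          fs(y+1,x  ) = fs(y+1,x  ) + err*5/16; end
    if y < h && x < w, fs(y+1,x+1) = fs(y+1,x+1) + err*1/16; end
  end
end
fs = logical(fs);                    % binary dithered result
```

Each pixel depends on the errors pushed from pixels before it, which is exactly why this resists vectorization.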

So here are some results in dithering (turning a grayscale image into a binary image, where each pixel is either white or black):

Original image

original dithering image

im = im2double(imread('max-schmeling.jpg'));


threshold dithering

th = im > 0.5;

Thresholding is usually what dithering is trying to improve upon. Details in large grey areas are lost and sharp edges appear where there may have been smooth gradients in the original image. In terms of random dithering we can think of the threshold result as “random” dithering where the “random” number is always 0.5.


random dithering

ra = im > rand(size(im));

The simplest true, random dithering. Pick a truly random number for each pixel and compare the value against it.

High contrast, random

random high contrast dithering

sp = -(im).*(im).*(im-1.5)*2>rand(size(im));

Here, I crank up the contrast of the original image before computing the random dithering as above. This sharpens the detail a little, though there is still a lot of static.
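For what it’s worth, the contrast curve in that line expands to the cubic 3x^2 - 2x^3, often called smoothstep: it fixes 0, 0.5 and 1 and pushes everything else away from 0.5. A quick check (the names x, c1 and c2 are just for this snippet):

```matlab
x = linspace(0,1,11);
c1 = -(x).*(x).*(x-1.5)*2;   % the form used in the dithering line above
c2 = 3*x.^2 - 2*x.^3;        % the same polynomial, expanded
d = max(abs(c1-c2));         % zero up to floating point error
```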

Random dithering and threshold, randomly mixed

random dithering and threshold, randomly mixed

rr = round(rand(size(im)));
rara = rr.*round(im)+(1-rr).*(im > rand(size(im)));

Here, I compute the threshold image and the random dithering image then randomly mix them. This has a similar effect as the high contrast random dithering, but with a very high contrast, almost cartoony, look.

Random dithering and threshold, randomly mixed except near edges

random dithering and threshold, randomly mixed except near edges

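e = edge(im,0.05);
blur_width = 5; % assumed value, the original setting is not shown here
h = fspecial('gaussian',round(blur_width),round(blur_width));
blurred_e = imfilter(double(e),h,'replicate') > 0; % assumed step: 1 wherever a blurred edge reaches
rre = (1-((1-rr).*(1-blurred_e)));
rarae = rre.*round(im)+(1-rre).*(im > rand(size(im)));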

Here, I first compute an image that tells me where strong edges occur in the original image. I blur this a bit so that the image tells me if I’m near a strong edge. Then when I mix the threshold image and the random dithering image, as I did above, I only mix them randomly if I’m not near an edge. If I’m near an edge I use the threshold image. This has a very nice cartoony effect.