Posts Tagged ‘for’

Round images to width and heights divisible by 2, by cropping

Saturday, November 10th, 2012

To make an h264 movie from a bunch of images using ffmpeg I need all the images to have size dimensions divisible by two. Here’s a little one-liner to crop a bunch of images to the nearest size divisible by 2:


for file in *.png; do convert -crop `identify -format "(%[fx:w]/2)*2" $file | bc`x`identify -format "(%[fx:h]/2)*2" $file | bc`+0+0 $file cropped_$file; done

I suppose using mogrify would potentially be faster. But I’m not sure how to introduce the rounding.

All pairs distances, matlab

Thursday, November 10th, 2011

Often I like to get list of distances between all pairs of points in two sets. That is, if I have a set of points V and another set of points U, I’d like to build D, a #V by #U matrix of distance, where D(i,j) gives me the distance from point V(i) to point U(i).

For the sake of timings, construct some V and U values using:


V = rand(5000,3);
U = rand(5000,3);

Two for loops

You can achieve the all pairs distance computation between V and U in the most naive way using two for loops (like you would probably do it in C++):


  % Super-slow for loops method
  D = zeros(size(V,1),size(U,1));
  for i = 1:size(V,1)
    for j = 1:size(U,1)
      D(i,j) = sqrt(sum((V(i,:)-U(j,:)).^2,2));
    end
  end

Two repmats

You can also do this using two repmats in a one-liner, but this gets slow if you hit a memory wall:


  % Slow, especially if hitting memory paging
  D = ...
    squeeze(sqrt(sum((repmat(V,[1 1 size(U,1)])- ...
    permute(repmat(U,[1 1 size(V,1)]),[3 2 1])).^2,2)));

Single for loop, single repmat

You can try to alleviate this by using a single for loop and inside the loop a single repmat. This is faster, but will probably hit a memory wall later on, too:


  % slow single for loop, single repmat method, handles memory paging better
  D = zeros(size(V,1),size(U,1));
  for i = 1:size(V,1)
    D(i,:) = ...
      sqrt(sum((repmat(V(i,:),[size(U,1) 1]) - U).^2,2))';
  end

Single for loop, single bsxfun

The matlab function bsxfun will do the repeat virtually, so we can replace the minus operation inside the loop, giving us a pretty good speed up:


  % Fast single for loop, single bsxfun method, but matlab can handle both for
  % loops in bsxfun
  D = zeros(size(V,1),size(U,1));
  for i = 1:size(V,1)
    D(i,:) = ...
      sqrt(sum(bsxfun(@minus,V(i,:),U).^2,2));
  end

Single bsxfun

But there’s no reason for the outer for loop either, bsxfun is powerful enough to virtually repeat both list of points, resulting in a very fast, very concise, one-liner:


  % Fastest
  D = squeeze(sqrt(sum(bsxfun(@minus,V,permute(U,[3 2 1])).^2,2)));
Method 5000 points each 50000 points each
Two for loops 116.730218 seconds NA
Two repmats 2.294543 seconds > 900 seconds
Single for loop, single repmat 1.064348 seconds 20.843250 seconds
Single for loop, single bsxfun 0.801842 seconds 15.608364 seconds
Single bsxfun 0.617075 seconds 131.936023 seconds

Note: Contrary to what I thought would happen, the single bsxfun does hit a memory wall for very large input, so the single for loop, single bsxfun/repmat succeed here (see second column). It would be nice to have an automatic way of determining which code to run based on the machine’s memory.

Replace string in file names, bash one-liner

Saturday, December 5th, 2009

After running lame on a bunch of wav files I ended up with tons of files named:


song1.wav.mp3
song2.wav.mp3
song3.wav.mp3
song4.wav.mp3
...

Now, obviously I could have avoided this in this case with the proper options and arguments to lame, but it’s all in hindsight.

Here’s the bash line I use to strip out .wav from the middle of all these files in the current directory:


for filename in *.mp3; do newname=`echo $filename | sed 's/\.wav\.mp3$/.mp3/g'`; mv $filename $newname; done

It’s long for one line but I think it’s still understandable and manageable. The result is:


song1.mp3
song2.mp3
song3.mp3
song4.mp3
...

Note: I notice that I’m being overly carefully with my regexes and wildcards in the example above but it should make it more clear how to adapt it to your case.