Posts Tagged ‘matlab’

Setting row of sparse matrix to zero in matlab

Friday, May 24th, 2013

I wanted to set a whole row of a sparse matrix A to zero:


A = sparse(1e8,1e8);
tic;
A(r,:) = 0;
toc

But this was rather slow. About 1.7 seconds.

Turns out all rows and columns are not created equal in matlab. Because of how matlab stores sparse matrices it’s much faster to set a whole column to zero:


tic;
A(:,c) = 0;
toc

This is about 0.0002 seconds.

So if you can refactor your code you’re better off messing with columns. If you can’t refactor, you might think it would be fast to transpose before and after setting a row to zero:


tic;
A = A';
A(:,r) = 0;
A = A';
toc

But this helps not at all. It seems transposes are just stored as flags, which is usually a good thing. But here it means we don’t gain anything, since after the transpose, now rows the cheap route. This also implies that refactoring could be nontrivial.

One might also think you could first find all the non-zero entries and then set them each to zero. But just the rowwise find is expensive:


tic;find(A(10,:));toc

~1 second

In fact just the row-wise access is expensive


tic;A(10,:);toc

~1 second

Seems like there’s no happy answer.

Normalize rows a sparse matrix to sum to one in matlab

Monday, May 20th, 2013

I surprisingly found a bottleneck in my matlab program to be how I was normalizing rows of a sparse adjacency matrix. I was doing this using bsxfun:


A = bsxfun(@rdivide,A,sum(A,2));

This matlab forum lists a few other ways. I compared them with this script using the timeit function:


a1 = @() bsxfun(@rdivide,A,sum(A,2));
a2 = @() spdiags (sum (A,2), 0, size(A,1), size(A,1)) \ A ;
a3 = @() spdiags (1./sum (A,2), 0, size(A,1), size(A,1)) * A ;
% A is 10000x10000 with roughly 6 nnz per row
timeit(a1)
% 0.32 seconds
timeit(a2)
% 0.0023 seconds
timeit(a3)
% 0.0016 seconds

So I’m going with:


A = spdiags (1./sum (A,2), 0, size(A,1), size(A,1)) * A ;

Update: This observation is not valid if A is dense. The same methods above perform differently:


% A = rand(10000,10000);
timeit(a1)
% 0.49 seconds
timeit(a2)
% 1.02 seconds
timeit(a3)
% 0.72 seconds

Looks like bsxfun is still the best bet for dense matrices.

Deriving the probability density function of a non-uniform sampling of the circle

Sunday, May 19th, 2013

I recently posted about non-uniformly sampling a sphere and visualizing the result. As I was examining the result, I though it would be more convenient just to derive the probability density function analytically. This is an exercise in transforming the probability density function of one random variable into that of another.

As a warmup I followed this systematic approach on math.stackexchange for a 2d problem of non-uniformly sampling a circle, actually just the first quarter of the circle. The scheme under consideration is to first sample x and y from the unit square uniformly, then normalize (x,y) to unit length and read off the angle θ. This transformation of normalizing then converting to polar angle, takes these two uniformly samples to a non-uniform sampling of the quarter circle, θ in [0,π/2).

In other words the probability density functions of the first variables x and y are both 1 for x,y in [0,1) and we'd like to know what the probability density function of θ is.

We begin by acknowledging that for any reasonable function u we have the equality:

E(u(θ)) = u(θ) 10≤x<110≤y<1 dxdy.

Now let's define θ = atan2(y/sqrt(x2+y2),x/sqrt(x2+y2)) = atan2(y,x).

Here we'll perform a change of variables. Let y = t and let x = t/tan θ.

This means that dxdy = dx/dθ dθ dy/dt dt = -t/sin2(θ) dθdt, hence:

E(u(θ)) = u(θ) 10≤t/tan θ<110≤t<1 -t/sin2(θ) dθdt

We also know that for every reasonable function u that:

E(u(θ)) = u(θ) fΘ(θ) dθ,

where fΘ(θ) is the (so far unknown) probability density function of our random variable Θ. Setting this equal to the equation above we have:

fΘ(θ) = 10≤t/tan θ<110≤t<1 -t/sin2(θ) dt.

So far we've been dealing with indefinite integrals, but we can convert to a definite one by evaluating the 1 functions, leaving:

fΘ(θ) = 0min(1,tan θ) -t/sin2(θ) dt
        = -1/sin2(θ) 0min(1,tan θ) t dt
        = -1/sin2(θ) · min(1,tan θ)2/2

Finally we can also write this as a piecewise function:

fΘ(θ) = { 1/cos2θ/2 θ≤π/4
1/sin2θ/2 θ>π/4

We can verify this with the following small matlab program:


sam = 4000000;
method = 'naive';
%method = 'uniform';
switch method
case 'naive'
  N = normalizerow(rand(sam,2));
case 'uniform'
  S = rand(sam,1)*pi/2;
  N = [cos(S(:,1)) sin(S(:,1))];
end
S = atan2(N(:,2),N(:,1));

% analytic solution
f = @(th) (th<pi/4).*(1/cos(th).^2) + (th>=pi/4).*(1/sin(th).^2);

[h,b] = hist(S,100);
% normalize so integral is 1
h = h./sum(h*(pi/2/numel(b)))
bar(b,h);
hold on;
plot(th,-f(th),’LineWidth’,2);
hold off

which produces:
sample circle empirical vs analytic

Actually, it's also pretty straight forward to derive the probability density function geometrically rather than analytically.

Start with our uniform sampling of the unit square:
sample circle geometric 01

When we project this sampling to the quarter circle (green), we're really integrating along each ray (blue) of angle θ (grey):
sample circle geometric 02

If we had only considered the points inside the circle then we would indeed have a uniform sampling as this integral is then independent of θ. This invites a scheme called rejection sampling: choose x,y in the unit cube but only keep those inside the circle.
sample circle geometric 03

It's the points that lie outside the circle that cause this kind of sampling to be biased, non-uniform.
sample circle geometric 04

We can compute the integral along this ray geometrically. Since the sampling is uniform, it amounts to just computing the length of the line segment. We just need to know for a given angle θ where the ray intersects the unit square.

First let's consider θ≤π/4. We know that x = 1 and by the polar coordinations we have that y = r cosθ, thus the radius and the length of these line segments are 1/cos θ. Likewise for θ>π/4 we have line segments of length 1/sin θ. These values are the length of the line segments, and since we're integrating a uniform density function, they're also the mass accumulated along the line segment. Finally we must account for the change in units, exchanging each small change in angle with a change in area. Namely by exactly this length divided by two (two because the thin space between two close rays is a triangular, rather than rectangular). Thus, by multiplying our mass by this change we have:

fΘ(θ) = { (1/cos2θ)/2
(1/sin²θ)/2
θ≤π/4
θ>π/4,

which matches the analytic result above.

Update: I'm not so confident that this derivation geometric derivation tells the whole story anymore.

Lightness vs luma vs grayscale vs value

Friday, May 17th, 2013

In image processing it’s typical to convert an RGB input image into a more meaningful colorspace. For example it’s often argued that distances in CIE L*a*b* colorspace are better correlated with perceived differences. Other times people simply use luma (aka luminance). I want to see how different these are. Here’s a small MATLAB script to generate and compare different grayscale images:


im = imread('hans-hass.jpg');
imd = im2double(im);
% NTSC luminance == Matlab RGB2GRAY gray 
% [Levin et al 2004]
ntsc = rgb2ntsc(im);
% L*a*b*  lightness
% http://stackoverflow.com/a/6013156/148668
% ~[Lischinski et al 2006]
lab = im2double(applycform(im,makecform('srgb2lab')));
% HSV value
hsv = rgb2hsv(imd);
% simple rgb3gray
simple = imd(:,:,1)*0.3 + imd(:,:,2)*0.6 + imd(:,:,3)*0.1;
% Very naive rgb3gray
ignorant = sum(imd,3)/3;
% matrix of differences
row = [hsv(:,:,3) naive simple ntsc(:,:,1) lab(:,:,1)];
col = [hsv(:,:,3);naive;simple;ntsc(:,:,1);lab(:,:,1)];
diff = 0.5*kron(ones(5,1),row)+0.5*(1-kron(ones(1,5),col));
comp = [imd repmat(row,[1 1 3]);repmat([col diff],[1 1 3])];
imshow(comp);

This produces an array of comparisons:
lightness comparison of grayscale images

Some interesting things I noted were that the different implementations of luma are quite similar and they’re all not so far from lightness. HSV value (the maximum of RGB) not surprisingly is very different.

Mixing fscanf for ascii data and fread binary data bug

Sunday, May 12th, 2013

I’ve been using a simple format for dense matrices. The original idea was just to have a dead simple ASCII file format for reading and writing column-major matrices. The advantage over MATLABs simple dlmread/dlmwrite files is that my file format contained the size of matrix in a header at the beginning. This made reading the files much, much faster. Later I realized this would be even faster if I also supported binary data. I still used the ascii header to reveal the size of the binary data. So before the binary data I would have a line that read:


[cols] [rows]

Then the binary data immediately follows in 8 byte chunks (each dense matrix entry as a double).

This turned out to be a bit of a mistake, since a line starts to become badly defined for binary data. On my system I end lines with a ‘\n’ line feed character. But if I try to parse out that line using:


fscanf(fp,"%d %d\n",&cols,&rows);

Then fscanf might grab more than just the ‘\n’ character, depending on what’s sitting in the file next: i.e. what the first byte of the first double precision float is.

I found out this the hard way when I was suddenly reading junk for any matrix whose first entry was “-6.343889″. The first byte (in my little endian system) is in hex 0B which is apparently a vertical tab, which for some reason gets eaten up as part of the \n in the fscanf above.

My current, not-so-satisfying solution is to change the header reading to:


fscanf(fp,"%d %d",&cols,&rows);

without reading the ‘\n’ character and then explicitly read exactly one-character (byte). After verifying that this character is indeed the ‘\n’ then I know I’m safe to read in the binary data. Unfortunately this means I can no longer really say the header should be just a line reading as above, since some systems use multiple characters to denote then end of a line. For example, With my current description I have no way of knowing whether I should read ‘\n\r’ as the line ending or if I should read just ‘\n’ because the ‘\r’ is really just the value of the first byte of the first double interpreted as a char.

It seems like one right way to solve this would be to use a standard format. Maybe using xml’s Base64 element

Source code and Demo for Robust Inside-Outside Segmentation using Generalized Winding Numbers

Monday, April 29th, 2013

2d demo:

winding number demo 2d

3d demo:

winding number demo 3d
I’ve release a 2D and 3D matlab demo (with mex functions) of our SIGGRAPH 2013 paper “Robust Inside-Outside Segmentation using Generalized Winding Numbers”. You can find it on our project page.

public alpha release of libigl – a c++ geometry processing library

Monday, April 22nd, 2013


libigl a c++ geometry processing library

We’ve finally released our in-house C++ library for geometry processing called libigl.

libigl is a simple c++ geometry processing library. We have a wide functionality including construction of sparse discrete differential geometry operators and finite-elements matrices such as the contangent Laplacian and diagonalized mass matrix, simple facet and edge-based topology data structures, mesh-viewing utilities for opengl and glsl, and many core functions for matrix manipulation which make Eigen feel a lot more like MATLAB.

It is first and foremost a header library. Each header file contains a single function. Most are tailored to operate on a generic triangle mesh stored in an n-by-3 matrix of vertex positions V and an m-by-3 matrix of triangle indices F.

The library may also be compiled into a statically linked library, for faster compile times with your projects.

We use the Eigen library heavily in our code. Our group prototypes a lot in MATLAB, and we have a useful conversion table from MATLAB to libigl/Eigen.

Visit the project page and download our code!

Mismatching axis size of subplots due to colorbar

Sunday, April 21st, 2013

When using subplots and axis commands the colorbar command in matlab can mess up alignments:


subplot(1,2,1);
plot([0 100],[0 100]);
axis equal;
subplot(1,2,2);
plot([0 100],[0 100]);
axis equal;
colorbar

Produces:
mismatching axis in matlab due to colorbar

A quick fix to get the fix axis to be the same size as the second is to use:


set(colorbar,'Visible','off');

So that


subplot(1,2,1);
plot([0 100],[0 100]);
axis equal;
set(colorbar,'Visible','off');
subplot(1,2,2);
plot([0 100],[0 100]);
axis equal;
colorbar

Produces:
mismatching axis in matlab due to colorbar

Sparse matrix equality in MATLAB accidentally dense

Thursday, March 28th, 2013

I had in my code:


all(A(:) == B(:))

where A and B where both large, sparse matrices. Unfortunately the == operator in MATLAB is not conducted in a sparse way. Try this:


n=20000;
tic;
A = sparse(n,n);
B= sparse(n,n);
all(A(:)==B(:));
toc

There are zero non-zeros so this should cost no time at all. But the all() function is of course not smart enough to unravel the == operator. So this is the same as doing:


C = A(:) == B(:);

And of course all the zeros that match will turn into 1s in C, so if A and B are the same C will be completely dense.

Instead I should use the built-in


isequal(A,B);

Note that in the case where is A mostly the same as B then


~any(A(:)~=B(:))

will be fast. But if they’re different, this is effectively dense as above.

Close figure 1 in matlab only if it exists

Tuesday, March 5th, 2013

It was annoyingly difficult to find an elegant way to close a certain figure without causing an error if that figure doesn’t exist. Using gcf or figure(1) are bad ideas because if the figure doesn’t exist then it’s created. So calling something like:


close(figure(1));

when figure 1 doesn’t exist causes it to open then close. I came up with this instead:


close(intersect(findall(0,'type','figure'),1))