Drop-in replacement for MATLAB's graphconncomp from the bioinformatics toolbox

Alec Jacobson

January 11, 2015

weblog/

One of my colleagues could not run my matlab script because it called graphconncomp. This function computes the connected components of a graph represented by an adjacency matrix. Strangely it is not a built-in function of the standard MATLAB installation. Rather it's included in the bioinformatics toolbox, which my colleague's university did not purchase for him. This was frustrating so I wrote my own drop in replacement. Find this in my gptoolbox:

function [S,C] = conncomp(G)
  % CONNCOMP Drop in replacement for graphconncomp.m from the bioinformatics
  % toobox. G is an n by n adjacency matrix, then this identifies the S
  % connected components C. This is also an order of magnitude faster.
  %
  % [S,C] = conncomp(G)
  %
  % Inputs:
  %   G  n by n adjacency matrix
  % Outputs:
  %   S  scalar number of connected components
  %   C  
  [p,q,r] = dmperm(G+speye(size(G)));
  S = numel(r)-1;
  C = cumsum(full(sparse(1,r(1:end-1),1,1,size(G,1))));
  C(p) = C;
end

As a pleasant surprise, my implementation is also an order of magnitude faster than graphconncomp. On a 4,000,000 by 4,000,000 sparse adjacency matrix, graphconncomp takes 12 seconds and my conncomp using dmperm takes 2 seconds.