Posts Tagged ‘sort’

Make the most recent tex document in the current directory and open it

Wednesday, January 13th, 2016

Here’s a little bash script to compile (pdflatex, bibtex, 2*pdflatex, etc.) the most recent .tex file in your current directory that contains \begin{document} (i.e. the main document):

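#!/bin/bash
# LMAKEFILE must point at the makefile that knows how to build a .tex document
if [ -z "$LMAKEFILE" ]; then
  echo "Error: didn't find LMAKEFILE environment variable"
  exit 1
fi
# List .tex files containing begin{document}, prefix each with its modification
# time, sort newest first, keep the first one and strip the timestamp again
TEX=$( \
  grep -Il1dskip "begin{document}" *.tex | \
  xargs stat -f "%m %N" | \
  sort -r | \
  head -n 1 | \
  sed -e "s/^[^ ]* //")
if [ -z "$TEX" ]; then
  echo "Error: Didn't find begin{document} in any .tex files"
  exit 1
fi
# Name of the main document without its .tex extension
BASE=$(basename "$TEX" .tex)
make -f $LMAKEFILE $BASE && open $BASE.pdf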

Simply use it:


Quick and dirty Eigen Matrix/Vector as std::map key

Tuesday, April 28th, 2015

I tried to use an Eigen matrix type as the key to a std::map with something like:

std::map<Eigen::VectorXd,bool> m;

But got compile-time errors of the sort:

error: no viable conversion from 'const CwiseBinaryOp<std::less<Scalar>, const Eigen::Array<double, -1, 1, 0, -1, 1>, const Eigen::Array<double, -1, 1, 0, -1, 1> >' to 'bool'
    {return __x < __y;}

Seems that std::map wants a proper less-than operator, and Eigen has overloaded < with a coefficient-wise operator. A reasonable way to order vectors is lexicographically. Fortunately the STL has a function for that, so I define my map like this:

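std::map<Eigen::VectorXd,bool,
  std::function<bool(const Eigen::VectorXd&,const Eigen::VectorXd&)> >
  m([](const Eigen::VectorXd & a, const Eigen::VectorXd & b)->bool
  {
    return std::lexicographical_compare(a.data(),a.data()+a.size(),b.data(),b.data()+b.size());
  });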

This will ignore the internal ordering of the matrix elements (i.e. ColMajor vs RowMajor) but if you’re just using the map for a uniqueness check this is good enough.
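
The same trick can be packaged as a reusable comparator instead of a std::function. Below is a minimal sketch, not something from the original post: LexLess and insert_if_new are names I made up, and std::vector<double> stands in for Eigen::VectorXd so that the snippet compiles without Eigen. Since the comparator only touches data() and size(), it should work unchanged for Eigen vectors, but treat that as an assumption to verify.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Lexicographic "less than" for any contiguous container exposing data() and
// size() (std::vector here; Eigen::VectorXd offers the same two members).
struct LexLess
{
  template <typename Vec>
  bool operator()(const Vec & a, const Vec & b) const
  {
    return std::lexicographical_compare(
      a.data(), a.data() + a.size(),
      b.data(), b.data() + b.size());
  }
};

// Hypothetical helper for the uniqueness check: returns true only the first
// time a given vector is inserted into the map.
template <typename Vec>
bool insert_if_new(std::map<Vec, bool, LexLess> & seen, const Vec & v)
{
  return seen.insert(std::make_pair(v, true)).second;
}
```

With a std::map<std::vector<double>, bool, LexLess> called seen, the first insert_if_new(seen, v) returns true and a second call with an equal vector returns false.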

Corresponding list of number of occurrences for each vector element

Wednesday, January 29th, 2014

If I have an n-long vector x and I’d like a new n-long vector counts that contains, for each element in x, the number of times the same value occurs in x (aka the value frequency), then the following MATLAB code will compute counts.

[u,~,ic] = unique(x);
counts = histc(x,u);
counts = counts(ic);

The last step is essential: it distributes the counts so that they match the order and length of x.
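
For comparison, here is a rough C++ sketch of the same two-pass idea: first tally each distinct value, then look the tally up once per element so the result lines up with x. The function name value_counts is made up for illustration, and a hash map replaces the unique/histc pair.

```cpp
#include <unordered_map>
#include <vector>

// For each element of x, the number of times that value occurs in x.
// The result has the same length and order as x.
std::vector<int> value_counts(const std::vector<int> & x)
{
  // First pass: tally every distinct value (the role of unique + histc)
  std::unordered_map<int, int> tally;
  for (int v : x)
  {
    ++tally[v];
  }
  // Second pass: distribute the tallies back onto x (the role of counts(ic))
  std::vector<int> counts;
  counts.reserve(x.size());
  for (int v : x)
  {
    counts.push_back(tally[v]);
  }
  return counts;
}
```

For x = {3, 1, 3, 2, 1, 3} this gives {3, 2, 3, 1, 2, 3}.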

List text files sorted by line count

Sunday, January 5th, 2014

Here’s a bash snippet to list the text files (only text files, no binaries) in the current directory sorted by line count:

find . -depth 1 -type f -exec grep -Il "" {} \; | xargs wc -l | sort

This outputs something like:

    6 ./alexa.h
   35 ./Astrid.h
   70 ./
   74 ./dagmar.h
  133 ./Adrea.cpp
  168 ./max.cxx
  216 ./Sabina.cpp
 2339 ./Thorsten.cpp
39991 ./

Update: For the particular case of wc (and maybe others) it’s faster to do this:

wc -l `grep -Il "" *` | sort

See who’s been checking in frequently to mercurial repository

Friday, December 13th, 2013

Here’s a bash snippet to show who’s been checking in code to a mercurial repository:

hg log | sed -n "s/user:  *//p" | sort | uniq -c | sort -rn

This prints something like:

 262 Pablo
 108 juanita
  23 carlos
  23 Maria Castano <>
  21 Juan Hernandez (
  17 psalamanca
  13 chico
   7 Maria Castano <>
   1 paco


Update: For an alternative measure try the churn extension.

Sort in bash with capital words at the end

Thursday, September 12th, 2013

I wanted to do a sort of some lines in bash but instead of capitals being treated as coming before minuscules I wanted them after. Normally the sort command:

echo -e "Foo\nfoo\nBar\nbang" | sort

will produce:

Bar
Foo
bang
foo

Instead I wanted Bar and Foo to come after bang and foo. To do this I used:

echo -e "Foo\nfoo\nBar\nbang" | \
  ruby -ne 'puts $_.split("").map{|e| (e>="a"?e.upcase():e.downcase())}.join' | \
  sort | \
  ruby -ne 'puts $_.split("").map{|e| (e>="a"?e.upcase():e.downcase())}.join'

which produces

bang
foo
Bar
Foo
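
The swap-case, sort, swap-case-back pipeline amounts to sorting with a comparator that looks at case-swapped copies of the lines. Here is a minimal C++ sketch of that reading. The names swap_case and sort_capitals_last are made up, and it only handles ASCII letters.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Swap the case of every ASCII letter; other characters are left alone.
std::string swap_case(std::string s)
{
  for (char & c : s)
  {
    const unsigned char u = static_cast<unsigned char>(c);
    if (std::islower(u))
    {
      c = static_cast<char>(std::toupper(u));
    }
    else if (std::isupper(u))
    {
      c = static_cast<char>(std::tolower(u));
    }
  }
  return s;
}

// Sort lines so that lowercase letters order before capitals, by comparing
// the case-swapped versions byte-wise.
std::vector<std::string> sort_capitals_last(std::vector<std::string> lines)
{
  std::sort(lines.begin(), lines.end(),
    [](const std::string & a, const std::string & b)
    {
      return swap_case(a) < swap_case(b);
    });
  return lines;
}
```

Calling sort_capitals_last on {"Foo", "foo", "Bar", "bang"} returns {"bang", "foo", "Bar", "Foo"}.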

ls output sorted by “date added”

Sunday, August 11th, 2013

Mac keeps track of a useful bit of metadata: the date a file is added to its parent folder. Finder lets you sort a folder’s contents by “date added”. I wanted to do this with my ls output in the Terminal, too.

find . -depth 1 -exec mdls -name kMDItemFSName -name kMDItemDateAdded "{}" \; | sed 'N;s/\n//' | grep -v '(null)' | awk '{print $3 " " $4 " " substr($0,index($0,$7))}'

which lists files in the current directory prefixed by their date added timestamps:

2013-07-21 17:07:26 DGPDEC.pdf
2013-08-11 10:10:49 erlenmeyer-flask-difficult-inside-outside.pdf
2013-07-21 15:10:05 first-run
2013-03-13 23:10:43 new
2013-03-16 20:48:32 NPR Music Austin 100 ZIP (71 Tracks)
2013-04-25 09:04:26 particle.gif
2013-04-25 09:05:34 particle.html

I saved this as an alias by appending this to my .profile:

alias lsadded='find . -depth 1 -exec mdls -name kMDItemFSName -name kMDItemDateAdded "{}" \; | sed -e "s/^kMDItemFSName    = \"\(.*\)\"/ \1/g" | sed "N;s/\n//" | sed -e "s/(null)/0000-00-00 00:00:00 +0000/g" | '"awk '{print \$3 \" \" \$4 \" \" substr(\$0,index(\$0,\$6))}'"

This is especially useful when combined with sort and sort -r:

lsadded | sort


Unsorted (sort according to citation order) acmsiggraph.bst bibtex bibliography

Tuesday, August 21st, 2012

Open acmsiggraph.bst and simply comment out the sort line. So that




Find minimum non-zero entry in sparse matrix

Wednesday, April 4th, 2012

Using a sparse matrix to store a weighted adjacency matrix, I found myself trying to grab the minimal non-zero entry per row. This is a little tricky since taking min will just give the first zero (and obviously taking max of the inverse also returns 0).

Let your sparse matrix be:

A = sprand(10,10,0.5)-sprand(10,10,0.1);

This site proposes filling the zero entries with inf.

A(~A) = inf;
[mA,mI] = min(A);

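But this turns our sparse matrix into a dense one. As A gets big this becomes a disaster. Instead, sort each column in descending order and pick out its last non-zero entry, which keeps everything sparse: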

[sA,sI] = sort(A,'descend');
[I,J,V] = find(sA);
msI = max(sparse(I,J,I));
mA = sA(sub2ind(size(A),msI,1:size(A,2)));
mI = sI(sub2ind(size(A),msI,1:size(A,2)));

Now, try this for a 10000 by 10000 A with an average of 6 non-zeros per column. The replace-with-infs method crashed my MATLAB. So now try this for a 2500 by 2500 A with an average of 6 non-zeros per row. The replace-with-infs method takes 40.5 seconds. First converting the matrix to full format and then replacing with infs does a lot better at 0.08 seconds, but the truly sparse method takes only 0.035 seconds.

Even another way to do it, which avoids the sort:

  [AI,AJ,AV] = find(A);
  A_i = sparse(AI,AJ,AV.^-1,size(A,1),size(A,2));
  [maxA_i,maxI_i] = max(A_i);
  [minA,minI] = min(A);
  minA(minA==0) = inf;
  mA = [maxA_i.^-1;minA];
  [mA,J] = min(mA);
  mI = [maxI_i;minI];
  mI = mI(sub2ind(size(mI),J,1:size(mI,2)));
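
What the sparse variants have in common is that they only ever touch stored entries. Outside MATLAB that idea is a single pass over the non-zeros. The following C++ sketch assumes a plain triplet list: Entry, ColumnMin and min_nonzero_per_column are hypothetical names, not part of any library. It works per column like MATLAB’s min(A), and indices are 0-based.

```cpp
#include <limits>
#include <vector>

// One stored entry of a sparse matrix (hypothetical triplet layout).
struct Entry { int row; int col; double val; };

// Minimum non-zero value of a column and the row where it occurs.
struct ColumnMin { double val; int row; };

// For each column, find the smallest stored non-zero value and its row.
// Columns without non-zeros report val = +inf and row = -1.
std::vector<ColumnMin> min_nonzero_per_column(
  const std::vector<Entry> & entries, int num_cols)
{
  const double inf = std::numeric_limits<double>::infinity();
  std::vector<ColumnMin> result(num_cols, ColumnMin{inf, -1});
  for (const Entry & e : entries)
  {
    if (e.val == 0.0)
    {
      continue; // skip explicitly stored zeros
    }
    if (e.val < result[e.col].val)
    {
      result[e.col].val = e.val;
      result[e.col].row = e.row;
    }
  }
  return result;
}
```

The cost is linear in the number of stored entries, and nothing is ever converted to a dense matrix.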

STL unique != MATLAB unique

Monday, March 26th, 2012

Got burned on the STL unique algorithm not actually implementing what I inferred its name to mean. In MATLAB, unique finds the unique elements of an unordered set.

unique([1 2 3 3 2 1]) --> [1 2 3]

In the STL, unique assumes an ordered set: it only replaces consecutive repeated equal entries with a single entry:

unique([1 2 3 3 2 1]) --> [1 2 3 2 1]

To get the behavior of the MATLAB unique you’ll need to sort first.

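#!/bin/bash
/*/../bin/ls > /dev/null
# Prepend "//" so the shebang becomes a C++ comment, then compile and run this very file
printf "//" | cat - $0 | g++ -std=c++11 -I/opt/local/include/eigen3 -I$IGL_LIB/include -o .main -x c++ - && ./.main
rm -f .main
exit
*/
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main(int argc,char * argv[])
{
  using namespace std;
  vector<int> V = {1,2,3,3,2,1};
  // Sorted copy, so that equal entries become neighbors
  vector<int> SV = V;
  sort(SV.begin(), SV.end());

  cout<<"V=";
  copy(V.begin(), V.end(), ostream_iterator<int>(cout, " "));
  cout<<endl;

  // unique on the unsorted vector only collapses the adjacent 3 3
  V.erase(unique(V.begin(), V.end()), V.end());
  cout<<"unique(V)=";
  copy(V.begin(), V.end(), ostream_iterator<int>(cout, " "));
  cout<<endl;

  // unique on the sorted copy matches MATLAB's unique
  SV.erase(unique(SV.begin(), SV.end()), SV.end());
  cout<<"unique(sort(V))=";
  copy(SV.begin(), SV.end(), ostream_iterator<int>(cout, " "));
  cout<<endl;

  return 0;
}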

which produces:

V=1 2 3 3 2 1 
unique(V)=1 2 3 2 1 
unique(sort(V))=1 2 3