Remove longest common prefix from all file names in directory

Alec Jacobson

August 11, 2009

When I'm organizing music or, especially, audiobook folders, I often run into a tedious problem: a folder full of files or subfolders whose names all start with the same prefix, usually the name of the parent folder. Here's an example:
audiobooks/
  ernest hemingway/
    ernest hemingway - old man and the sea/
    ernest hemingway - for whom the bell tolls/
    ernest hemingway - sun also rises/
    ernest hemingway - a farewell to arms/
[and so forth]
Stripping "ernest hemingway - " from each name is easy to script, and it wouldn't take much longer to write a more general script that accepts the prefix as an argument, something like strip "ernest hemingway - " ernest\ hemingway/, taking a string and a directory path. But I wanted to generalize even further. Since for me this usually happens when the directory contains nothing but prefixed files, I automated the task of determining the longest common prefix. The script then removes that prefix from each name, with a polite prompt. Here's the code; save it in a file called smartcrop.sh:
#!/bin/bash
path=$1
if [ "$path" != "" ]
then
  # strip one trailing slash if present, then append exactly one
  path="${path%/}/"
  if [ ! -d "$path" ]; then
    echo "Usage: ./smartcrop.sh [directory]"
    exit 1
  fi
fi
# list the given directory, or the current one when no argument was given
LS=`ls "${path:-.}"`
PREFIX=""
MINLEN=""
# enclose to keep IFS change from affecting other code
(
  IFS=$'\n'
  for x in $LS
  do
    if [[ $MINLEN -gt ${#x} ]] || [[ $MINLEN == "" ]]
    then
      MINLEN=${#x}
      PREFIX="$x"
    fi
  done
 
  
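  # decrement MINLEN in advance because can't `mv "prefix" ""`
  let "MINLEN -= 1"
  while (( 0 < MINLEN ))
  do
    # grep -v succeeds only while some name does NOT start with PREFIX;
    # note PREFIX is treated as a regular expression, not a literal string
    if ! echo "$LS" | grep -v "^$PREFIX" > /dev/null
    then
      break
    fi
    # the first pass tests the whole shortest name (one longer than MINLEN),
    # so only start decrementing once PREFIX has been cut down to MINLEN
    if [ ${#PREFIX} == $MINLEN ]
    then
      let "MINLEN -= 1"
    fi
    PREFIX=${PREFIX:0:MINLEN}
  done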
  # MINLEN goes negative when the directory is empty
  if (( MINLEN <= 0 ))
  then
    echo "No common prefix found."
    exit
  fi
  echo "Longest common prefix: \"$PREFIX\""
  if [ ${#PREFIX} != $MINLEN ]
  then
    echo "  File $path$PREFIX exists."
    echo "  Next longest common prefix: \"${PREFIX:0:MINLEN}\""
  fi
  r=""
  for x in $LS
  do
    if [ "$r" != "A" ]  
    then
      r=""
      while [ "$r" != "y" -a "$r" != "N" -a "$r" != "n" -a "$r" != "A" ]
      do
        read -n 1 -p "rename $path$x to $path${x:MINLEN}? \
[y]es, [n]o, [A]ll, [N]one: " r
        echo
      done
    fi
    if [ "$r" == "A" -o "$r" == "y" -o "$r" == "Y" ]
    then
      mv "$path$x" "$path${x:MINLEN}"
      echo "  moving: $path$x --> $path${x:MINLEN}"
    elif [ "$r" == "N" ]
    then
      exit
    fi
  done
)
To run, make the script executable (chmod +x smartcrop.sh) and execute it on the command line in a terminal: ./smartcrop.sh [dir]. So in the above example I would use the command ./smartcrop.sh ernest\ hemingway/

Note: All the mumbo jumbo at the beginning of the script is there to handle the directory path given as a command-line argument in the same manner as ls.

Note: most of this code splits cleanly into two tasks, finding the longest common prefix and batch renaming. Splitting these into two programs might make more sense for you.
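
As a rough sketch of that split, the two tasks could be written as separate bash functions. The names longest_common_prefix and strip_prefix are hypothetical and not part of smartcrop.sh. This version compares names with bash pattern matching instead of grep, so characters like [ or . in a name are treated literally. It also renames without prompting, so treat it as a starting point rather than a drop-in replacement.

```bash
#!/bin/bash

# Print the longest common prefix of all arguments (empty if there is none).
# Hypothetical helper, not taken from smartcrop.sh.
longest_common_prefix() {
  # with no names there is nothing to compare
  if (( $# == 0 )); then
    return 0
  fi
  local prefix=$1
  local name
  shift
  for name in "$@"; do
    # shorten the candidate one character at a time until this name
    # starts with it; an empty candidate matches everything, so this ends
    while [[ "$name" != "$prefix"* ]]; do
      prefix=${prefix%?}
    done
  done
  printf '%s\n' "$prefix"
}

# Rename every entry in directory $2 (default: current directory) whose
# name starts with prefix $1, dropping that prefix. Entries that would end
# up with an empty name, or whose new name is already taken, are skipped.
# Hypothetical helper, not taken from smartcrop.sh.
strip_prefix() {
  local prefix=$1 dir=${2:-.}
  local path name new
  for path in "$dir"/*; do
    # an empty directory leaves the glob unexpanded; skip that case
    [[ -e "$path" ]] || continue
    name=${path##*/}
    [[ "$name" == "$prefix"* ]] || continue
    new=${name#"$prefix"}
    if [[ -z "$new" || -e "$dir/$new" ]]; then
      echo "skipping: $path" >&2
      continue
    fi
    mv -- "$path" "$dir/$new"
    echo "  moving: $path --> $dir/$new"
  done
}
```

With both functions loaded, the example above could be handled from inside the folder with something like: cd ernest\ hemingway/ && strip_prefix "$(longest_common_prefix *)" .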