Archive for June, 2013

boxer at rest

Sunday, June 30th, 2013

Boxer at rest

Compress PDFs for camera ready publications using Adobe Acrobat Pro

Tuesday, June 18th, 2013

Here are the settings I use to optimize the size of PDF files for camera-ready publications. I have used them for all my recent publications, and they usually reduce my PDF files from ~100MB to ~20MB without any noticeable loss in quality:

  1. Open input.pdf in Acrobat Pro
  2. Make compatible with Acrobat 5.0 and later
  3. Check Images
  4. Color Images: Downsample Off
  5. Color Images: Compression JPEG
  6. Color Images: Quality Maximum
  7. Grayscale Images: Downsample Off
  8. Grayscale Images: Compression JPEG
  9. Grayscale Images: Quality Maximum
  10. Monochromatic Images: Downsample Off
  11. Monochromatic Images: Compression JPEG
  12. Monochromatic Images: Quality Maximum
  13. Uncheck Fonts
  14. Check Discard Objects
  15. Uncheck all except Discard all alternate images and Discard embedded print settings
  16. Check Discard User Data
  17. Uncheck all except Discard private data of other applications and Discard hidden layer content and flatten visible layers
  18. Check Clean Up
  19. Object compression options: Compress document structure
  20. Uncheck all
  21. Save these settings as “Camera Ready”
  22. Hit OK and save as opt.pdf

Flushing php output on Safari with php scripts (hosted by Bluehost)

Thursday, June 13th, 2013

There are many, many posts about getting php to flush its output while a script is executing. No one solution worked for me, but I finally found a combination of things that did.

First of all my default php.ini shows these two relevant lines:

output_buffering = On
zlib.output_compression = Off

Then I put an .htaccess file containing the following line in the directory that holds my buffering script:

SetEnv no-gzip dont-vary

Finally here was my small test php file:

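<?php  // not in table tags for IE
    @ini_set('zlib.output_compression', 0);
    @ini_set('implicit_flush', 1);
    //for ($i = 0; $i < ob_get_level(); $i++) { ob_end_flush(); }
echo str_pad('',1024);  // minimum start for Safari
for ($i=10; $i>0; $i--) {
  echo str_pad("$i<br>\n",8);
  // tag after text for Safari & Firefox
  // 8 char minimum for Firefox
  // The rest of the loop is cut off in this copy. The lines below are an
  // assumed completion: flush both buffer layers, then pause.
  if (ob_get_level() > 0) { ob_flush(); }
  flush();
  sleep(1);
}
?>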
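
The padding-and-flush pattern in the test file can be wrapped in a small helper. This is only a minimal sketch, not part of my original script: the name emit_chunk and its $minBytes parameter are hypothetical, and the 8-byte and 1024-byte minimums are simply the values that worked for me in Firefox and Safari.

```php
<?php
// Hypothetical helper (an assumption, not from the original test file): pad a
// piece of output to a minimum size, echo it, and push it towards the client.
function emit_chunk(string $text, int $minBytes = 8): string
{
    // Right-pad with spaces so the browser receives at least $minBytes bytes.
    $chunk = str_pad($text, $minBytes);
    echo $chunk;
    // With output_buffering = On there is a user-level buffer to flush first.
    if (ob_get_level() > 0) {
        @ob_flush();
    }
    // Then ask PHP to flush its system write buffer.
    flush();
    return $chunk;
}

// Usage sketch: one large initial chunk, then a short countdown.
emit_chunk('', 1024);
for ($i = 3; $i > 0; $i--) {
    emit_chunk("$i<br>\n");
    usleep(200000); // short pause so the streaming is visible
}
```

Whether the chunks actually arrive one at a time still depends on the server settings above (no gzip, no zlib compression), so treat the helper as a convenience rather than a fix.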

High resolution images from rijksmuseum

Monday, June 3rd, 2013

Here’s a PHP script to download and stitch together high resolution images from the Rijksmuseum (it shells out to ImageMagick’s montage to compose the tiles):


<?php
function prepareJSON($input) {
    //This will convert ASCII/ISO-8859-1 to UTF-8.
    //Be careful with the third parameter (encoding detect list), because
    //if set wrong, some input encodings will get garbled (including UTF-8!)
    $input = mb_convert_encoding($input, 'UTF-8', 'ASCII,UTF-8,ISO-8859-1');
    //Remove UTF-8 BOM if present, json_decode() does not like it.
    if(substr($input, 0, 3) == pack("CCC", 0xEF, 0xBB, 0xBF)) $input = substr($input, 3);
    return $input;
}

$url = $argv[1];
$url = preg_replace("/^https/","http",$url);

echo "Getting title...";
  $contents = file_get_contents($url);
  preg_match('/objectNumber : "([^"]*)"/',$contents,$matches);
  $id = $matches[1];
  preg_match('/objectTitle : "([^"]*)"/',$contents,$matches);
  $offset = preg_replace("/^.*,([0-9]*)$/","\\1",$url);
  # extract id
  $id = preg_replace("/^.*\//","",$url);
  $id = preg_replace("/,.*$/","",$id);
  $title_url = preg_replace("/search\/objecten\?/",
  $title_url = preg_replace("/#\//", "&objectNumber=",$title_url);
  $title_url = preg_replace("/,[0-9]*$/", "",$title_url);
  $contents = file_get_contents($title_url);
  #$contents = file_get_contents("objecten.js");
  $items = json_decode(prepareJSON($contents), true);
  $title = $items["setItems"][0]["ObjectTitle"];
  $title = preg_replace("/^.*f.principalMaker.sort=([^#]*)#.*$/","\\1",$url).
$title = html_entity_decode($matches[1], ENT_COMPAT, 'utf-8');
$title = iconv("utf-8","ascii//TRANSLIT",$title);
$title = preg_replace("/[^A-Za-z0-9]+/","-",$title);
$final = strtolower($title);
echo "\n";

echo "Getting images...";
$contents = file_get_contents(
#$contents = file_get_contents("levels.js");

$levels = json_decode(prepareJSON($contents), true);
$levels = $levels["levels"];

$list = "";
foreach( $levels as $level)
{
  if($level["name"] == "z0")
  {
    $tiles = $level["tiles"];
    // Obtain a list of columns
    $xs = array();
    $ys = array();
    foreach ($tiles as $key => $row) {
      $xs[$key]  = $row['x'];
      $ys[$key] =  $row['y'];
    }

    // Sort the tiles by y ascending, then x ascending
    // $tiles is passed as the last parameter, to sort it by the common key
    array_multisort($ys, SORT_ASC, $xs, SORT_ASC, $tiles);

    $tile_x = 0;
    $tile_y = 0;
    foreach( $tiles as $tile)
    {
      $x = $tile["x"];
      $y = $tile["y"];
      $tile_x = max($tile_x,intval($x)+1);
      $tile_y = max($tile_y,intval($y)+1);
      $img = "z0-$x-$y.jpg";
      $url = $tile["url"];
      echo "(".$x.",".$y.") ";
      file_put_contents($img, file_get_contents($url));
      $list .= " ".$img;
    }
  }
}
echo "\n";
echo "Composing images...";
`montage $list -tile {$tile_x}x{$tile_y} -geometry +0+0 -quality 100 $final.jpg`;
echo "\n";
echo $final.".jpg\n";

echo "Clean up...";
`rm -f $list`;
echo "\n";

Then you can call the script from the command line with something like:

php rijksmuseum.php ""
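
To see what the tile loop in the script is doing, here is a minimal, self-contained sketch of just the ordering and grid-size logic. The tile data is made up for illustration (the real script reads it from the museum's JSON), the function name plan_montage and the output name out.jpg are hypothetical, and nothing is downloaded or composed: the montage command is only built and printed.

```php
<?php
// Hypothetical helper: given tiles with 'x' and 'y' grid indices, return the
// tile file names in row-major order plus the grid dimensions.
function plan_montage(array $tiles): array
{
    $xs = [];
    $ys = [];
    foreach ($tiles as $key => $tile) {
        $xs[$key] = intval($tile['x']);
        $ys[$key] = intval($tile['y']);
    }
    // Sort by row (y) first, then by column (x); $tiles is reordered to match.
    array_multisort($ys, SORT_ASC, $xs, SORT_ASC, $tiles);

    $cols = 0;
    $rows = 0;
    $files = [];
    foreach ($tiles as $tile) {
        // Indices start at 0, so the grid size is the largest index plus one.
        $cols = max($cols, intval($tile['x']) + 1);
        $rows = max($rows, intval($tile['y']) + 1);
        $files[] = "z0-{$tile['x']}-{$tile['y']}.jpg";
    }
    return ['files' => $files, 'cols' => $cols, 'rows' => $rows];
}

// Made-up 2x2 grid, deliberately out of order.
$plan = plan_montage([
    ['x' => 1, 'y' => 1],
    ['x' => 0, 'y' => 0],
    ['x' => 1, 'y' => 0],
    ['x' => 0, 'y' => 1],
]);

// Build (but do not run) the montage command line.
$cmd = 'montage ' . implode(' ', array_map('escapeshellarg', $plan['files']))
     . " -tile {$plan['cols']}x{$plan['rows']} -geometry +0+0 -quality 100 out.jpg";
echo $cmd, "\n";
```

The row-major order matters because montage fills its grid left to right, top to bottom, so the tiles have to arrive sorted by y and then by x.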

Buried inside of that script is also a nice way to clean up strings for use as filenames:

$title = html_entity_decode($matches[1], ENT_COMPAT, 'utf-8');
$title = iconv("utf-8","ascii//TRANSLIT",$title);
$title = preg_replace("/[^A-Za-z0-9]+/","-",$title);
$final = strtolower($title);
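
Those lines can be packaged as a reusable function. This is a sketch under a few assumptions: the name slugify is hypothetical, the character class is written as A-Za-z0-9 so that characters such as brackets and underscores are also replaced, and trimming leading and trailing dashes is an addition that the script itself does not do. How iconv transliterates non-ASCII characters depends on the platform, so the example sticks to ASCII input.

```php
<?php
// Hypothetical wrapper around the filename clean-up described above.
function slugify(string $title): string
{
    // Turn HTML entities such as &amp; back into plain characters.
    $title = html_entity_decode($title, ENT_COMPAT, 'UTF-8');
    // Transliterate to ASCII; keep the original string if iconv fails.
    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $title);
    if ($ascii !== false) {
        $title = $ascii;
    }
    // Collapse every run of non-alphanumeric characters into a single dash.
    $title = preg_replace('/[^A-Za-z0-9]+/', '-', $title);
    // Addition to the original: drop dashes left at either end.
    return strtolower(trim($title, '-'));
}

echo slugify('The Night Watch &amp; Other Works'), "\n"; // prints "the-night-watch-other-works"
```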