Get largest image from webpage using PHP, wget and ImageMagick

Alec Jacobson

March 10, 2011


Here's a script I wrote to make autoblog a little more interesting. Now, any time a spammer leaves a comment, it checks the URL they provide for a large image and appends that image to the comment when the comment becomes a post. A little PHP script downloads the images linked from the URL with wget, measures each one with ImageMagick, and returns an image tag for the largest one, as long as its area is at least 80,000 pixels. Here it is:
<?php

function isValidURL($url)
{
  // The dot between host labels is escaped so it only matches a literal "."
  return preg_match('|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}

function image_from_URL($URL)
{
  $tempdir = "temp_images";
  if(isValidURL($URL))
  {
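    // Escape the shell arguments: the regex above accepts arbitrary
    // characters in the path, so the raw URL is not safe to pass to a shell
    $cmd = "wget -r -l1 -H -t1 -nd -N -np -A jpg,gif,png -P " .
      escapeshellarg($tempdir) . " -erobots=off " . escapeshellarg($URL);
    shell_exec($cmd);
    $handle = opendir($tempdir);
    if($handle === false)
    {
      // wget downloaded nothing, or the directory is not readable
      return "";
    }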
    $max_size = 0;
    $biggest = "";
    while (false !== ($file = readdir($handle)))
    {
      $extension = strtolower(substr(strrchr($file, '.'), 1)); 
      if($extension == 'jpg' || $extension == 'gif' || $extension == 'png')
      {
        // identify from imagemagick can return the w and h as a string "w*h"
        // which then bc can compute as a multiplication giving the area in pixels
        // file names come from the remote site, so escape them for the shell
        $size = (int) exec(
          "identify -format \"%[fx:w]*%[fx:h]\" " .
          escapeshellarg("$tempdir/$file") . " | bc");
        if($size > $max_size)
        {
          $max_size = $size;
          $biggest = $file;
        }
      } 
    } 

    closedir($handle);

    // HERE YOU CAN ADD CODE TO DELETE THE TEMP FILES ETC

    if($max_size >= 80000)
    {
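      // Escape the path so a quote in a downloaded file name cannot break the tag
      $src = htmlspecialchars("$tempdir/$biggest", ENT_QUOTES);
      return "<img src='$src' class='center'>";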
    }
  }
  return "";
}
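
The script leaves a placeholder comment where the temp files could be deleted. One way that cleanup might look is sketched below. The helper name clean_temp_images and its two parameters are my own invention, not part of the original script, and it assumes the largest image has to stay on disk because the returned image tag points at it.

```php
<?php
// Hypothetical helper, not part of the original script: delete every regular
// file in $dir except the one named $keep, and return how many were removed.
function clean_temp_images($dir, $keep)
{
  if (!is_dir($dir))
  {
    return 0;
  }
  $files = scandir($dir);
  if ($files === false)
  {
    return 0;
  }
  $removed = 0;
  foreach ($files as $file)
  {
    $path = "$dir/$file";
    // Skip ".", "..", subdirectories, and the image we still need to serve
    if ($file === $keep || !is_file($path))
    {
      continue;
    }
    if (unlink($path))
    {
      $removed++;
    }
  }
  return $removed;
}
```

Inside image_from_URL this could be called at the placeholder comment as clean_temp_images($tempdir, $biggest). Passing an empty string as the second argument removes every file, which may be what you want when no image passed the size threshold.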
Sources: isValidURL source, wget one-liner source
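
If ImageMagick or bc is not available on the server, the pixel area can probably be computed without leaving PHP. The sketch below swaps the identify | bc pipeline for PHP's built-in getimagesize(). This is an alternative, not what the script above does, and the function name image_area is made up for illustration.

```php
<?php
// Hypothetical replacement for the identify | bc pipeline: return the pixel
// area of an image file, or 0 if PHP cannot read it as an image.
function image_area($path)
{
  // getimagesize() returns false for unreadable or non-image files;
  // the @ suppresses the warning or notice it emits in those cases
  $info = @getimagesize($path);
  if ($info === false)
  {
    return 0;
  }
  // Index 0 is the width and index 1 is the height, both in pixels
  return $info[0] * $info[1];
}
```

In the loop above, the exec call could then become $size = image_area("$tempdir/$file"). That also avoids passing remote file names to a shell at all.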