Frames Modems Help Home Page Chipsets Search No Frames
Diary Entries See also Site Info & Diary.
26 April 2004 MySQL optimising + caching case-study: chips.php...
 

This little page is part 1 of how the preparation time of a page of html was reduced from 0.6 to 0.06 secs. It will be of interest to anyone that uses the PHP scripting language. Part 2 concerns the caching maintenance work.

Whilst beefing up security on the modem-help.com php-scripts, I came across yet another page that was a candidate for caching: namely the Chips search pages. This proved to be a very successful exercise, and I even managed to optimise some of the MySQL a little further, so the sketch of the details is presented as a case-study for your further illumination, particularly as I find that I learn far quicker myself with actual, practical examples.

 

The following paragraphs will concentrate on this specific page:

  • chips.php?search=S

...and will mention these previous diary entries:

  • php Pear Cache_Lite Class: 28 Mar 2004
  • php html Forms Controls Class: 18 Dec 2001

The base target for pages on modem-help.com is that the page-processing completes in less than 0.1 second. The target is more than met for most pages, but some of the search-pages are somewhat of a challenge. On Sun, 25 Apr 2004 at 19:23:15 the chips page took 0.59 secs to produce - clearly off-target.

 

Caching had already been introduced to the General Names Search pages, and had been successful in reducing even longer prep times down to 0.1 second, so it was an easy job to do the same for the Chips pages. This is the (pseudo-) code before caching:

   $ShowRows = 50;
   $select   = "select fields";
   $from     = "table + table-joins";
   $where    = "is-a-chip";
   $search   = "Name < '$entry'";
   $group    = "chip-name, chipset-name";
   $count    = "SELECT COUNT(*) FROM $from";
   $searchQ  = mysql_num_rows( mysql_query( "$count WHERE $where && $search GROUP BY $group" ));
   $totalQ   = mysql_num_rows( mysql_query( "$count WHERE $where GROUP BY $group" ));
   $rowsQ    = "SELECT $select FROM $from WHERE $where GROUP BY $group";

   $class    = new Search( $searchQ, $totalQ, $ShowRows );
   $navigate = $class -> showNav();

   $r        = mysql_query( "$rowsQ LIMIT $start,$ShowRows" );
   $body     = "";
   while( list( $var1, $var2, ... ) = mysql_fetch_row( $r )) {
      ...
      $body .= a-line-of-html-code;
      ...
   }

‘$entry’ comes from the page text-box, and ‘$start’ is a Hidden on the page, carried across sessions and altered by the Class Search according to $entry and/or the various navigation buttons. Adding caching affected only the last section:

   $id       = fQueryToID( $rowsQ );
   $options  = array(
      'cacheDir' => directory-name,
      'lifeTime' => one-week
   );
   $cache    = new Cache_Lite( $options );

   if( !( $body = $cache->get( $id, chips-group ))) { // cache miss
      $r        = mysql_query( "$rowsQ LIMIT $start,$ShowRows" );
      $body     = "";
      while( list( $var1, $var2, ... ) = mysql_fetch_row( $r )) {
         ...
         $body .= a-line-of-html-code;
         ...
      }
      $cache -> save( $body, $id, chips-group );
   } else $body .= caching-notice;

This worked well, but the timings dropped only to about 0.29 second. Checking the code-timings within phpMyAdmin showed that $searchQ & $totalQ were the issue. Both queries were fully optimised, but having to return every single row simply to count the number of them is foolish. However, how to count the number of rows in SQL on a multi-table join? Some chipset-names are empty (NULL) which complicated matters, but since it is $group which yields unique rows, the following fixed the delays:

   $count    = "SELECT COUNT( DISTINCT CONCAT_WS( ' ', $group )) FROM $from";
   $searchQ  = (int) ( mysql_query( "$count WHERE $where && $search" ));
   $totalQ   = (int) ( mysql_query( "$count WHERE $where" ));

...which took less than 0.001 sec for $totalQ compared to about 0.18 sec for the old coding.

 

In the course of the above, it occurred to me, why not cache the class itself? The key to caching with Cache_Lite is to find a unique $id. This was achieved in the old coding with fQueryToID(), which simply strips out all white space and quotation and lower-cases the query-string to yield a unique ID from the sql-query:

   function fQueryToID( $str ) { // make a SQL query string suitable for a cache id
      $s=array("&&","||","\t","\n","\r","\0",0x0B," ","\"","\'","\`");
      $r=array('and','or');
      return str_replace( $s, $r, strtolower( $str ));
   }

Simple, quick and very effective. If, however, the Class is to be cached, then the $home, $end, $next, $prev navigation buttons plus the $start & $entry values will also have to be concatenated into the ID. Here is one solution:

   $ShowRows   = 50;
   $select     = "select fields";
   $from       = "table + table-joins";
   $where      = "is-a-chip";
   $group      = "chip-name, chipset-name";
   $rowsQ      = "SELECT $select FROM $from WHERE $where GROUP BY $group";

   $id         = fQueryToID( $rowsQ ).
      (( $end or $home )?
         (( $end ) ? 1 : 0 ):
         (( $next )? 1 : 0 ).(( $prev )? 1 : 0 ).$start.$entry
      );
   $options    = array(
      'cacheDir'	=> directory-name,
      'lifeTime'	=> one-week
   );
   $cache      = new Cache_Lite( $options );
   $options[ 'automaticSerialization' ] = TRUE;
   $cacheClass = new Cache_Lite( $options );
   if( !( $body = $cache->get( $id, chips-group ))) { // cache miss
      $search   = "Name < '$entry'";
      $count    = "SELECT COUNT( DISTINCT CONCAT_WS( ' ', $group )) FROM $from";
      $searchQ  = (int) ( mysql_query( "$count WHERE $where && $search" ));
      $totalQ   = (int) ( mysql_query( "$count WHERE $where" ));

      $class    = new Search( $searchQ, $totalQ, $ShowRows );

      $r        = mysql_query( "$rowsQ LIMIT $start,$ShowRows" );
      $body     = "";
      while( list( $var1, $var2, ... ) = mysql_fetch_row( $r )) {
         ...
         $body .= a-line-of-html-code;
         ...
      }
      $cache      -> save( $body,  $id, chips-group );
      $cacheClass -> save( $class, $id, chips-class-group );
   } else {
      $body    .= caching-notice;
      $class    = $cacheClass -> get( $id, chips-class-group );
   }
   $navigate   = $class -> showNav();

This now is 0.06 secs for an entire cached page, which essentially is how long all the non-static elements of the page take to prepare. A total time reduction of 90% - pretty good, huh?.