404s and WordPress Server Load

A few months back we launched a site redesign/redevelopment project for a client, and made a simple mistake that had some interesting ramifications. It’s worth posting here so others don’t make the same mistake.

What Happened

When redeveloping the site, we moved the WordPress instance from the web root:

/public_html/wp-config.php
/public_html/wp-includes/
/public_html/…

to a subdirectory within the web root:

/public_html/wordpress/wp-config.php
/public_html/wordpress/wp-includes/
/public_html/…

I consider this to be a best practice for WordPress-powered sites, as it makes version upgrades a bit easier (among other things). [1] Some plugins don’t work well with symlinked “plugins” directories, but those generally have easy fixes (use ABSPATH, not dirname(), in your plugin).

These changes were made and thoroughly tested on our dev server with historical data. We then made the changes in a staging environment under a beta.example.com style domain name before pushing the changes live. In short, we were pretty careful and tested pretty well.

The Result

When we finally pushed everything live, the production server was brought to its knees. Yikes!

After a little poking around and a few bug reports, we figured out the problem. We had failed to set up a rewrite rule to map content from:

/wp-content/uploads/

to:

/wordpress/wp-content/uploads/

where it was now located.

An omission like this wouldn’t have had much effect on server load in some site configurations, but one of the wonderful features of WordPress – its lovely permalinks – has a hidden caveat that can be exposed if someone makes a configuration mistake (like we did).

WordPress’s permalink system basically works like this:

  1. See if a file exists in the location that was requested; if the file exists, serve it. This is how images, media, non-WordPress files, etc. are served without conflicting with WordPress.
  2. If no file is found at the location that was requested, then pass the URL to WordPress and see if WordPress can figure out what to show.
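
For reference, on Apache these two steps correspond to the rewrite block that WordPress normally writes to .htaccess. This is a sketch of the commonly generated block, assuming Apache with mod_rewrite enabled and WordPress's index.php reachable at the web root; the exact paths depend on how the install is laid out.

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Requests for index.php itself pass through untouched.
RewriteRule ^index\.php$ - [L]
# Step 1: if the request maps to a real file or directory,
# these conditions fail and Apache serves it directly.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Step 2: everything else is handed to WordPress.
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

The last rule is the catch-all: any URL that does not match a file on disk, including a broken image URL, ends up in WordPress.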

It’s a really elegant system and works very well. However, it also means that 404s – HTTP requests that result in a “file not found” response – have a much higher impact on the server than a 404 for a plain static file does. For every 404, the server instantiates WordPress, does some database work to try to figure out what to serve, etc.

When we didn’t set up proper redirects for the content in “uploads”, we increased the number of requests hitting WordPress by a factor of roughly 20. Instead of a single request going to WordPress and 20 requests for static content being served directly by Apache, all 21 requests were being sent to WordPress.

The increased load on the server had some… adverse effects on performance. Yeah.
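
One way to blunt this kind of load, separate from the fix we actually used (described below), is to have Apache answer missing image requests itself instead of handing them to WordPress. A minimal sketch, assuming Apache with mod_rewrite, that the rule sits above the standard WordPress block, and that no ErrorDocument directive routes 404s back to index.php; the extension list is only an illustration:

```apache
RewriteEngine On
# If the requested path is not a real file on disk...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and it looks like an image, return a plain 404 right away.
# A non-3xx R= status drops the substitution and stops rewriting.
RewriteRule \.(?:jpe?g|png|gif)$ - [R=404,L,NC]
```

The trade-off is that WordPress never sees these requests, so anything that generates images dynamically at such URLs would stop working. Treat it as a safety net rather than a substitute for correct URLs.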

Why We Missed It

So this should be a pretty easy thing to catch, right? If the URLs to the images are broken, we’d all be seeing a bunch of broken images all over the place in our testing – right?

Not exactly, and for two different reasons:

  1. When we tested on the production server using the beta.example.com hostname, the content was still pointing to example.com, and the live production site was dutifully serving the images. It wasn’t until we pushed the changes live that the images were no longer at the previous URLs.
  2. Even when the images were no longer in the proper place, the browsers we were testing in still showed them properly. This was due to the longer Expires headers we had just implemented for media on the server in order to reduce overall server load. In casual testing, everything looked like it was working properly.

Once we tested in browsers that didn’t have the images cached, we immediately saw the problem.
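
For context, long-lived expiry headers of the kind mentioned in the second point are typically set with mod_expires. The following is a sketch rather than our actual configuration; the MIME types and the one-month lifetime are assumptions chosen for illustration:

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Let browsers reuse cached images without asking the server again.
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType image/png  "access plus 1 month"
    ExpiresByType image/gif  "access plus 1 month"
</IfModule>
```

With headers like these, a browser that has already fetched an image will keep showing its cached copy without contacting the server, which is exactly what hid the broken URLs from us. Testing with an empty cache, or in a private browsing window, avoids the trap.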

Easy Fix

Luckily, this is a really easy problem to fix. A simple mod_rewrite rule (placed before the standard WordPress rewrite block) fixed everything right up:

RewriteRule ^wp-content/uploads/(.*)$ /wordpress/wp-content/uploads/$1 [L,R=301]

(alternate code, improvements to syntax welcome)
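
As one possible variation, the same mapping can be done as an internal rewrite instead of a 301 redirect, so old URLs are served from the new location without an extra round trip. This is a sketch under the same assumptions as the rule above (.htaccess in the web root, mod_rewrite enabled, placed before the WordPress block); the file-exists guard is our own addition for illustration:

```apache
RewriteEngine On
RewriteBase /
# Only rewrite when the file actually exists at the new location.
# $1 is the part of the path captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/wordpress/wp-content/uploads/$1 -f
# No R flag: Apache serves the file internally and the URL stays the same.
RewriteRule ^wp-content/uploads/(.*)$ /wordpress/wp-content/uploads/$1 [L]
```

The 301 version has the advantage of telling browsers and search engines where the files now live. The internal version keeps old links working silently. Which one fits better depends on whether you want the old URLs to fade away.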

With this in place, the old URLs were redirected to the new location and were no longer spinning up WordPress on every image request.

Hopefully this will be useful to people who might be doing something similar, or who have made the same mistake and are looking to recover from it. 🙂

  1. Corey has a nice writeup on the file structure he uses.