wetpixel.com Migration Project

A few weeks ago I helped Eric convert wetpixel.com from a PostNuke/phpBB driven site to an Expression Engine (EE)/Invision Power Board (IPB) driven site. Wetpixel is a community web site for underwater digital photographers. The new site is now live; feel free to go poke around, then come back to read the rest. :)

Welcome back. Eric got in touch with me last fall to discuss options for changing the CMS and forums he was using for Wetpixel – things just weren’t going well with PostNuke in particular. Eric had also liked IPB when he’d used it in the past, so he wanted to move everything and start fresh. Besides the new CMS and forum systems, he also wanted to give the site a new look and have a single sign-on to both systems for his users.

Overview

We talked over what needed to happen, then I did a quick code review of the EE and IPB login code and wrote up a development outline and time estimate. It was a fairly informal process since Eric and I were colleagues and have been friends for a few years. He was definitely capable of doing this himself, but he knew it would be faster to bring in some additional help and I was happy to oblige.

Eric came out to Colorado for a couple of days and we banged out all the major functionality. Over the next couple of weeks as he got the servers ready to go, we cleaned up the final details. On a personal note, since Eric is normally either on a boat somewhere or flying over my head across the country, it was nice to get a chance to catch up with him too.

What we did

We started off working on the single sign-on process, because that had the most potential to get complicated. Since the only reason Eric wanted users authenticated into EE was to restrict commenting to registered users, we decided to make the IPB users the master user data.

I compared the users table in EE with the users table in IPB and wrote a script to populate the EE users table from the IPB user data and a function to update the EE data when an IPB user was updated. While I was writing this code, Eric found a script to convert phpBB data to IPB and did the conversion so we had real data to work with. This actually went quite smoothly and we had it done and tested in a few hours.
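To make the idea concrete, here is a minimal sketch of that kind of one-way sync. It is not the code we wrote (both EE and IPB are PHP applications); it uses Python and an in-memory SQLite database, and the table and column names (`ipb_members`, `ee_members`, and so on) are made up for illustration rather than taken from either product's real schema.

```python
import sqlite3


def create_demo_tables(db):
    """Create simplified, hypothetical stand-ins for the two member tables."""
    db.execute(
        "CREATE TABLE ipb_members (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    db.execute(
        "CREATE TABLE ee_members "
        "(member_id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
    )


def sync_member(db, ipb_id):
    """Copy one forum member into the CMS table, inserting or overwriting.

    Returns False if the forum member does not exist.
    """
    row = db.execute(
        "SELECT id, name, email FROM ipb_members WHERE id = ?", (ipb_id,)
    ).fetchone()
    if row is None:
        return False
    db.execute(
        "INSERT OR REPLACE INTO ee_members (member_id, username, email) "
        "VALUES (?, ?, ?)",
        row,
    )
    return True


def populate_all(db):
    """Initial bulk load: sync every forum member and return how many."""
    ids = [r[0] for r in db.execute("SELECT id FROM ipb_members").fetchall()]
    for member_id in ids:
        sync_member(db, member_id)
    return len(ids)
```

The same `sync_member` function serves both jobs described above: `populate_all` calls it for the initial load, and the forum's "user updated" code path would call it for a single member afterward. Keeping the forum as the only writer means the copy can simply overwrite whatever is in the CMS table.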

One of the goals (and challenges) when making changes to systems like EE and IPB is to keep the changes as small and isolated as possible. Upgrades happen, and re-applying changes is a pain in the arse. I was very pleased that we only touched a couple lines of code in IPB to enable the user data update to write to both systems.

The next step was to enable a single sign-on for Wetpixel users. The right way to do a single sign-on is to use LDAP, but neither EE nor IPB supports LDAP, and we didn’t have an LDAP server anyway, so we needed to hack this in as well.

The forums are used more than the article comments, so we decided to keep the IPB login and not expose the EE login. Once a user is logged into IPB, we set a special wetpixel cookie for that user. If they go into an EE page in Wetpixel, we then know to automatically log them in to EE as well. Using this technique, we were again able to touch only a few lines of code in each system – maintaining upgradeability.
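A rough sketch of the cookie handoff might look like the following. This is an illustration in Python, not the PHP we actually used, and it makes an assumption the description above does not spell out: that the cookie value is signed with a secret shared by both systems, so the CMS side can trust the member id it carries. The names `SECRET_KEY`, `make_sso_cookie` and `read_sso_cookie` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; both systems would read it from their config.
SECRET_KEY = b"change-me"


def _sign(payload):
    """Return a hex HMAC-SHA256 signature for the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_sso_cookie(member_id):
    """Value the forum side would set after a successful login."""
    payload = str(member_id)
    return f"{payload}:{_sign(payload)}"


def read_sso_cookie(value):
    """Return the member id if the cookie is intact, otherwise None.

    The CMS side would call this on each request and, on success,
    start its own session for that member.
    """
    payload, sep, signature = value.partition(":")
    if not sep or not (payload.isascii() and payload.isdigit()):
        return None
    # Compare as bytes in constant time to avoid leaking signature details.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None
    return int(payload)
```

Because the member ids match in both systems after the user sync, the id alone is enough for the CMS to find the right account. Logging out of the forum would also need to clear this cookie.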

The last step was the messiest – write a script to convert all the PostNuke articles and comments to posts and comments in EE. Eric went through the PostNuke tables to create a field map of old data to new data, creating a bunch of custom fields in EE as he went. Then I started writing the data conversion script. While I worked on the script, Eric put the final touches on his new site design and shell.

The article data was stored in three different tables (a table for each type of article), each of which had different columns and column names. I wrote a few hundred lines of code to do the conversion for the first table, plugging in the field map data and (after cleaning up 2 or 3 typos), it actually ran as intended! Then I added in conversion code for the comments for those posts (gotta match up the old article IDs to the new post IDs). Once we tested that and had it working, I decided to sacrifice elegance for practicality and copy-pasted the code a couple times to handle the other tables. :)
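The shape of that conversion, stripped down to its essentials, is sketched below in Python. The column names in `FIELD_MAP` and the `pn_sid` / `pn_comment` keys are placeholders rather than the real field map Eric built, and plain dictionaries stand in for database rows.

```python
# Hypothetical field map: old article column -> new CMS field.
FIELD_MAP = {"pn_title": "title", "pn_hometext": "summary", "pn_bodytext": "body"}


def convert_articles(old_rows, field_map, id_column="pn_sid", first_new_id=1):
    """Convert old article rows to new posts and record old id -> new id."""
    new_posts = []
    id_map = {}
    for new_id, row in enumerate(old_rows, start=first_new_id):
        post = {"entry_id": new_id}
        for old_col, new_field in field_map.items():
            # Missing source columns become empty strings in the new post.
            post[new_field] = row.get(old_col, "")
        new_posts.append(post)
        id_map[row[id_column]] = new_id
    return new_posts, id_map


def convert_comments(old_comments, id_map, article_column="pn_sid"):
    """Re-attach comments to their new posts; collect any that have no match."""
    converted, orphans = [], []
    for comment in old_comments:
        new_id = id_map.get(comment[article_column])
        if new_id is None:
            orphans.append(comment)
            continue
        converted.append({"entry_id": new_id, "comment": comment["pn_comment"]})
    return converted, orphans
```

Passing the field map in as data is what makes a function like this reusable across several source tables: each table gets its own map and its own `id_map`, since old ids from different tables can collide.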

At this point the conversion was pretty much complete, but of course we didn’t want to let the URLs to the old articles turn into 404 errors. I already had the code in the data migration script to map the old article IDs to the new post IDs, so I added some code to the end of the conversion script to generate some code we could use for the redirect.
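As an illustration of how an ID map can drive redirects, here is a small Python sketch. The old URL shape (an article id in a `sid` query parameter) and the new URL template (`/article/{new_id}`) are assumptions made for the example, not the site's actual URL schemes.

```python
from urllib.parse import parse_qs, urlsplit


def build_redirect_table(id_map, new_url_template="/article/{new_id}"):
    """Turn an old-id -> new-id map into an old-id -> new-URL lookup table."""
    return {
        str(old_id): new_url_template.format(new_id=new_id)
        for old_id, new_id in id_map.items()
    }


def resolve_old_url(old_url, table, id_param="sid"):
    """Return the new URL for an old article URL, or None if it is unknown."""
    query = parse_qs(urlsplit(old_url).query)
    values = query.get(id_param)
    if not values:
        return None
    return table.get(values[0])
```

A redirect script built on a lookup like this would answer known ids with a permanent (301) redirect to the new URL and fall back to a normal 404 page when `resolve_old_url` returns `None`.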

Then it was time to drive Eric to the airport.

Wrap up

It was a pretty busy couple of days, and we got quite a bit more done than I expected. A few things that worked really well:

  1. Using Tasks Pro™ (of course) to project manage on the fly. We created high-level tasks like ‘Single Sign On’ and ‘Data Conversion’ at the beginning, then created sub-tasks for the individual things we needed to do for each of these areas when we were ready to get started on that high-level task. Capturing closing notes was very useful as well – for example, if you note which lines of which file(s) you edited when working on a feature, it makes it much easier to quickly get back there to make changes when needed. I also made sure we created a task whenever we thought of anything we needed to do, so nothing fell through the cracks.
  2. Reviewing the code for EE and IPB as part of the initial project definition gave me much needed information about systems I’d never used before and gave me a good idea of where I needed to cut into each system once we started. This made the time we had working together much more productive instead of needing to do the research at that time. It also made the time estimates a lot more accurate.
  3. EE and IPB both have code that is clean enough to allow strategic code insertions rather than mad hacking. Eric’s choice of these products worked out very well. It was interesting being able to compare the code and database table and column names in IPB (which has evolved over time) with EE, which was created much more recently. Both codebases are pretty clean, but EE is much more consistent with naming conventions.

There were also a few things that we missed during our coding frenzy that we had to fix afterward:

  1. My initial plan for handling the redirects was flawed (it expected 404 requests to get passed to index.php – which only sort of happened) and I had to redo it. The new solution actually works better because it adds no extra overhead in the PHP code at all. Instead, I check for elements of the old URL scheme in the Apache rules (.htaccess) and use mod_rewrite to send requests for old URLs to a separate redirect script.
  2. Some of the PostNuke article data had relative URLs that all broke with the new system. I had to write some additional code to go through each article and convert all the relative URLs to absolute URLs.
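For the relative URL cleanup in the second item, a minimal Python sketch of the idea follows. It is an illustration only: it handles just quoted `href` and `src` attributes with a regular expression, which is fine for a one-off pass over known content but is not a general HTML parser, and the base URL is supplied by the caller.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src='...' with either quote style.
ATTR_RE = re.compile(
    r'''(?P<attr>\b(?:href|src))=(?P<quote>["'])(?P<url>.*?)(?P=quote)''',
    re.IGNORECASE,
)


def absolutize_urls(html, base_url):
    """Rewrite relative href/src values in an HTML fragment as absolute URLs."""

    def fix(match):
        url = match.group("url")
        # Leave in-page anchors and non-HTTP schemes untouched.
        if url.startswith(("#", "mailto:", "javascript:")):
            return match.group(0)
        quote = match.group("quote")
        # urljoin returns already-absolute URLs unchanged.
        return f'{match.group("attr")}={quote}{urljoin(base_url, url)}{quote}'

    return ATTR_RE.sub(fix, html)
```

Run over each article body with the old site's address as `base_url`, this turns something like `src="images/fish.jpg"` into a full URL while leaving links that were already absolute alone.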

For me this was a very interesting project (besides the opportunity to do a little work with Eric again). This was a project I understood, but had no experience with. Though I’d never written a conversion script or a single sign-on script, or worked with EE or IPB before, I was pretty sure about what I needed to do. I did it, and it worked. This is a common situation for today’s developers, especially consultants. Your job is to get something done, often using systems and technology you aren’t familiar with.

The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn.

Alvin Toffler

Eric’s write-ups of the changes are here, here and here.