Converting to UTF-8

One of the things I still plan to do in the upcoming Tasks Pro™ and Tasks releases is change the default character set encoding (for English) from ISO-8859-1 to UTF-8. The trick is how best to handle this for existing customers. Here are the options I’m looking at:

  1. I’ll still ship an ISO-8859-1 version of the English language file, so I can change existing prefs to use the (renamed) english_iso88591 option instead of the english option. The default language in the server settings should probably be changed as well. Users can then choose to switch to UTF-8 if they like, but special characters in their existing data will not be converted. New users (for whom UTF-8 is the default) get the most benefit. This is probably the lowest-impact option.
  2. Do option 1 above, and also provide a separate utility script that converts existing data to UTF-8.
  3. Have an option in the upgrade script to change users’ language prefs and data to UTF-8.

#2 and #3 seem like nicer options to me; however, I haven’t found a good solution for converting the data yet. Here are the requirements:

  • Must be a PHP solution (no Perl or executables)
  • Must be compatible with MySQL 3.2x – current
  • UPDATE: Must be PHP 4.1+ compatible and included in the standard installation
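
For what it’s worth, the string conversion on its own can be sketched in plain PHP, because every ISO-8859-1 byte has the same value as its Unicode code point: bytes below 0x80 stay as they are, and each byte from 0x80 up becomes a two-byte UTF-8 sequence. The function name latin1_to_utf8 is made up for illustration, and this is only a sketch of the conversion step. It does not cover reading and updating the database rows, and it assumes the stored data really is ISO-8859-1 and has not already been converted.

```php
<?php
// Hypothetical helper (the name is not part of Tasks or PHP): converts an
// ISO-8859-1 string to UTF-8 using only core string functions, so it does
// not depend on any optional extension.
function latin1_to_utf8($text)
{
    $out = '';
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $char = substr($text, $i, 1);
        $byte = ord($char);
        if ($byte < 0x80) {
            // ASCII is encoded identically in ISO-8859-1 and UTF-8.
            $out .= $char;
        } else {
            // 0x80-0xFF becomes two bytes: 110000xx 10xxxxxx.
            $out .= chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
        }
    }
    return $out;
}
```

As a check, bin2hex(latin1_to_utf8("caf\xE9")) gives 636166c3a9: the single byte E9 (é) becomes the pair C3 A9. PHP’s built-in utf8_encode() performs the same mapping but belongs to the XML extension, so it is only an option where that extension is present. One caveat: running this twice on the same data double-encodes it, so a real script would need some way to record or detect which rows have already been converted.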

I’ve looked at the MySQL CONVERT() docs, but I haven’t found an example of how to use it the way I need to. My test queries didn’t succeed and I’m tired… I’m hoping enlightenment will come to me in the form of a comment. 🙂
