Table Migrator

Saturday, 18 July 2009

This is my solution to dealing with large table migrations in MySQL. This method easily applies to tables managed by ActiveRecord since it maintains created_at and updated_at for you.

The good bits are on GitHub. Read on for some back story.

For any database-driven application having a well optimized table schema and indexes is crucial for good performance. Usually this is not a problem in development. However, once your app goes live and you are dealing with millions upon millions of rows of data, the cost of revising your table schema quickly becomes very large.

With MySQL, an ALTER TABLE statement is extremely expensive. MySQL is fairly conservative (or unintelligent) in how ALTER TABLE works: Within a table-level write lock, MySQL creates a new table with the revised schema and inserts every row from the original in primary key order (building the indexes along the way).

Sound slow? It sure is, as this process is essentially unoptimized. An ALTER TABLE on a decently sized table can take on the order of hours to complete, and you have no access to it in the meantime. Although MySQL isn't helping us much, by digging a little deeper it is possible to take matters into our own hands to get the job done.

The key to improving the situation is to take advantage of what we know about our data. While locking the table during the migration is the most straightforward way to get a consistent copy, if you can determine when a row has been created or updated, you can copy most rows in parallel to normal application traffic, and then within a table lock copy over only the difference before swapping old for new, based on row creation or update info.

Additionally, by copying this difference asynchronously multiple times, you can shrink the size of the difference down to a only a few rows, such that the final lock-copy-swap is almost instantaneous.

Since almost the entirety of the operation happens outside of a table lock, this allows for zero downtime migrations of very large tables. At this point, the bottleneck is not the copy operation, but determining the changes between the old and new tables between passes. If you can take advantage of an index on the relevant row timestamp, you're set. If not, although the time in the table lock can be minimized with this method, it cannot be eliminated. I recommend adding an index when you can.

Table Migrator is a fairly simple script we wrote at Serious Business that encapsulates this logic in order to make it easier to use. Feel free to fork and improve!