Scott Weston on March 13, 2014
One of our clients is in the process of migrating from Drupal 6 to 7. One of the requirements for this migration is to retain the same IDs from the previous version of Drupal. In Part 3 of this series I will outline the Keys to Success I learned during the migration. Did you miss post #1 or #2? Click back to set the scene.
While I wasn't perfect and I had to re-import a few times while developing the migration process, I did learn some key takeaways to succeed during this type of data migration.
Keys to success
(Since PHP array keys are zero-indexed, I thought I'd make my list of keys the same way.)
Key 0: Start with a clean slate
If performing a migration where the IDs need to be the same from Site A to Site B, it's imperative that the new site has no existing content / users (except for UID 1, of course).
Obviously, you need to build out the content model (content types, vocabulary, and user fields) to have something in which to place your new data.
If there is existing content in the destination site, you will likely run into problems when inserting rows into database tables where those primary keys already exist. In MySQL, these errors appear as "duplicate key" or "integrity constraint violation" errors.
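Before importing, a quick pre-flight check can confirm that none of the incoming IDs already exist in the destination. The sketch below is plain PHP. The helper name find_id_collisions() and the sample ID lists are hypothetical; in practice the two lists would come from queries against the old and new sites.

```php
<?php
/**
 * Returns the IDs that appear in both the source and destination lists.
 * Hypothetical helper for illustration; not part of Drupal.
 */
function find_id_collisions(array $source_ids, array $destination_ids) {
  // array_intersect() keeps the values of the first array that also occur
  // in the second; array_values() re-indexes the result from zero.
  return array_values(array_intersect($source_ids, $destination_ids));
}

// Made-up sample data: only UID 1 exists on the new site.
$source_uids = array(1, 2, 3, 57);
$destination_uids = array(1);

$collisions = find_id_collisions($source_uids, $destination_uids);

// Any collision other than UID 1 means the destination is not a clean slate.
$unexpected = array_values(array_diff($collisions, array(1)));
```

If $unexpected is not empty, it is probably safer to clear out the destination content before starting the import than to work around individual conflicts.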
Key 1: Know thy data.
One of the factors that made this type of data migration possible was my familiarity with the client's site and their data. I originally built the site out in Drupal 6 a few years ago, and imported their original data from their SharePoint site. This background knowledge was critical to tease out any oddities in the import process to ensure that everything made a smooth transition from Drupal 6 to Drupal 7.
Key 2: "is_new" does amazing things.
In previous data migrations where the node ID or user ID could change, I didn't pay much attention to this little attribute while building the node or user object. But when you also add the nid/uid to the object, you need to make sure that Drupal knows for sure that this is a new node or user. Therefore, setting $node->is_new = TRUE or $user->is_new = TRUE is a must.
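As a minimal sketch of what such an object might look like, the snippet below builds a node in plain PHP with both the old nid and the is_new flag set. The helper name build_preserved_node() and the sample values are hypothetical, and the actual save step is left out because it depends on the import approach.

```php
<?php
/**
 * Builds a node object that keeps the nid from the old site.
 * Hypothetical helper; the property names mirror Drupal 7 node objects.
 */
function build_preserved_node($old_nid, $type, $title) {
  $node = new stdClass();
  // Reuse the Drupal 6 node ID instead of letting a new one be assigned.
  $node->nid = (int) $old_nid;
  $node->type = $type;
  $node->title = $title;
  // A node that already has a nid normally looks like an existing node,
  // so flag it explicitly as new.
  $node->is_new = TRUE;
  return $node;
}

// Made-up sample values.
$node = build_preserved_node('42', 'article', 'Imported from D6');
```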
Key 3: Don't enable every module up front
There were a few contrib modules that didn't like seeing $node->is_new = TRUE and $node->nid both set in their hook_node_presave function and this caused problems.
I'm not saying this is an issue with these modules at all!
Having both of these set on a node during hook_node_presave is not normal. For me, the affected modules included pathauto and xmlsitemap. In the case of the former, I already had URLs I wanted to map, so the import phase didn't need pathauto enabled. In the latter case, I had to enable the module afterward, then rebuild the site's XML sitemap to get the data populated for this module. All in all, this was a small trade-off.
If you approach a data migration such as this, don't be afraid to dig into the code to find out why a module is throwing errors when trying to save a node, and to evaluate whether temporarily disabling the module is a simple workaround.
Key 4: Know convention changes from Drupal 6 to Drupal 7
There were a few gotchas that I encountered in the course of this migration, including:
The convention for field naming and structure within objects changed slightly. For example, a basic field mapping from D6 to D7 could look something like:
$new_node->field_foo[LANGUAGE_NONE][0]['value'] = $old_node->field_foo[0]['value'];
In the case of a D6 Node Reference to D7 Entity Reference, it looks something like:
$new_node->field_foo[LANGUAGE_NONE][0]['target_id'] = $old_node->field_foo[0]['nid'];
Date fields changed by dropping the "T" separating the date segment and the time segment:
"2012-06-13T18:30:00" changed to "2012-06-13 18:30:00"
Key 5: Error reporting is your friend. Seriously.
When I was coding, testing, and performing the migration, I turned up the sensitivity of PHP error reporting to the max, so that any error resulted in (at least) a warning to the screen. This was invaluable for two reasons:
1. It quickly told me if I did something wrong in my code (like not checking for the existence of an imported value before trying to set it in the destination object).
2. It let me know if there was bad data coming in from D6, so I could adjust my import tactics as necessary.
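A minimal sketch of both points, in plain PHP: the first two calls turn reporting all the way up for the current script, and the isset() guard is the kind of existence check mentioned in point 1. The node objects and field name are made up for illustration, and a Drupal site's own logging settings may also affect what is shown on screen.

```php
<?php
// Report every class of PHP error and print it to the screen.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Made-up source node where the optional field is missing.
$old_node = new stdClass();
$old_node->title = 'Imported from D6';

$new_node = new stdClass();
$new_node->title = $old_node->title;

// Check that the imported value exists before using it; reading
// $old_node->field_foo[0]['value'] directly would trigger a PHP error
// message for this node.
if (isset($old_node->field_foo[0]['value'])) {
  $new_node->field_foo = array(
    'und' => array(array('value' => $old_node->field_foo[0]['value'])),
  );
}
```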
Key 6: Devel module is your BFF.
If error reporting is my friend, then the Devel module is my best friend! For each of my import scripts, I made extensive use of the Devel module's dd() function. This little gem works much like dpm(), but the output goes to a file named drupal_debug.txt in your site's temp directory rather than to the screen. To see what dd() was reporting in real time, I used the following command:
tail -f drupal_debug.txt
The -f option tells tail to wait for data to be appended to the file and then output it to the screen as well.
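Inside an import script, the logging calls might look something like the sketch below. The wrapper migration_debug() is hypothetical: it hands off to dd() when the Devel module is loaded and otherwise appends to a drupal_debug.txt file in the system temp directory, so the snippet also runs outside of Drupal. The logged values are made up.

```php
<?php
/**
 * Logs a value with dd() when Devel is available; otherwise appends it to
 * drupal_debug.txt in the system temp directory.
 * Hypothetical wrapper for illustration.
 */
function migration_debug($data, $label = NULL) {
  if (function_exists('dd')) {
    dd($data, $label);
    return;
  }
  // Fallback for running the sketch without Drupal and Devel.
  $prefix = ($label !== NULL) ? $label . ': ' : '';
  $file = sys_get_temp_dir() . '/drupal_debug.txt';
  file_put_contents($file, $prefix . print_r($data, TRUE) . "\n", FILE_APPEND);
}

// Made-up example: record what was just imported.
migration_debug(array('nid' => 42, 'status' => 'saved'), 'import');
```

With tail -f running in another terminal, each call shows up as soon as it is written.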
As a naturally curious and problem-solving person, I loved the challenge of performing this data migration where the node and user IDs needed to stay the same between Drupal 6 and Drupal 7.
It definitely took me out of the mindset of "The Drupal way is the only way", but only a little. I did leverage Drupal APIs and my knowledge of the database structure to keep the migration clean and smooth.
While not every upgrade would require this approach, I am happy that we were able to do this for the client and deliver a site that met their requirements. In the future, we will look at this approach as one strategy we could employ when performing upgrades from Drupal 6 to 7.