Tuesday 6 May 2008

Rails gotchas: Data Migrations conflict with validations

So the problem is that we have some basic data that we like to preload into the database - e.g. an initial admin user. We created a data migration to do this and it ran fine: the data validated correctly and was loaded into the database for us.

Over time we added new migrations and new validations, and at every step the db and code were in sync.

Now we need to set up a new testing server. We ran the migrations from VERSION=0 and suddenly we got an error on the data migration along the lines of:

rake aborted!
undefined local variable or method `group_id' for #<User:0x3367bc8>

Obviously this column wasn't needed when we wrote that migration, and by now we already have the admin user in our system, so the migration hasn't been run on our latest code... until now.

The problem is that the code *now* requires a group_id for the user to be valid; but at this point we haven't reached the migration that adds the groups table to the database, along with the corresponding group_id column on the users table.

However, all the latest code is in our model, including the validation that requires a valid user to have a group_id. *BUT* the group_id method simply doesn't exist yet, because ActiveRecord will only find that method by reflection once the table has the appropriate column.
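To make the failure concrete, here is a minimal plain-Ruby sketch of the situation. It does not use ActiveRecord at all: FakeUser and its columns list are hypothetical stand-ins that only imitate the way attribute readers exist for columns the table actually has.

```ruby
# Plain-Ruby stand-in for a model. FakeUser and its column list are
# hypothetical; they only imitate column-derived attribute readers.
class FakeUser
  # Columns the "users table" has at this point in the migration history.
  def self.columns
    @columns ||= ['login']
  end

  def initialize(attrs = {})
    @attrs = attrs
  end

  # Attribute readers exist only for names present in the column list.
  def method_missing(name, *args)
    return @attrs[name.to_s] if self.class.columns.include?(name.to_s)
    super
  end

  def respond_to_missing?(name, include_private = false)
    self.class.columns.include?(name.to_s) || super
  end

  # Today's validation, written against today's schema.
  def valid?
    !group_id.nil?
  end
end

# Early in the migration history: no group_id column yet, so the
# validation blows up instead of returning true or false.
begin
  FakeUser.new('login' => 'admin').valid?
rescue NameError => e
  puts "validation raised #{e.class}"
end

# After the later migration has "added the column", the same validation
# runs normally.
FakeUser.columns << 'group_id'
puts FakeUser.new('login' => 'admin', 'group_id' => 1).valid?
```

The point of the sketch is that the validation doesn't fail politely by marking the record invalid; it raises, because the method it calls isn't there yet.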

We have a conundrum. We don't want to remove that validation, but we also need that data to be in the database.

So what can we do?

There are several nasty options we considered, such as adding a rescue-block around the validations and catching "just that error". Unfortunately this has the problem of hiding any future errors that we may need to fix. Allowing errors to get artificially caught like this can lead to them ending up in production code without us having noticed their existence. :P

We could also go back and delete the data-migration from the original file and generate a new migration for it. The new migration will be called after all the current ones and so will match the current code. This just delays the inevitable, though. We will simply come up against the same problem at some future time when new validations are brought in. There needs to be some solution that will fix this for all time.

There were two reasonable solutions that we considered.

The first is to put the required data into a fixture. The fixture can be kept up-to-date with the code and therefore will always pass the validations of the current code. It could then be loaded on-demand (via the console), or we could alter the original data migration to load the fixture instead of creating an ActiveRecord object. The latter has the same problem as the original issue - it will try to load in things like the group_id column at a point in the migrations where the column doesn't yet exist. The former would work - but we'd need to remember to do it every time - which means it's prone to user error.

The alternative involves a simple one-word fix... which is why it won out.

The Validations module wraps the standard ActiveRecord::Base#save method with the save_with_validation method. The latter takes a single, optional parameter that can tell ActiveRecord to ignore validations, i.e. you call save(false) and it will save the record without checking that the data is valid.

If we can assume that our own data in the data migration is valid for our purposes, then we can use this to solve the problem, as the validations will simply be ignored.
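As a rough illustration of that optional parameter, here is another plain-Ruby sketch. FakeRecord and STORE are hypothetical names and this is not the ActiveRecord implementation; it only imitates a save method whose single argument defaults to running validations.

```ruby
# Plain-Ruby stand-in. FakeRecord and STORE are hypothetical and only
# imitate a save(perform_validation = true) style signature.
STORE = []

class FakeRecord
  attr_reader :login

  def initialize(login)
    @login = login
  end

  # Stands in for a validation that cannot run yet because the column
  # it checks has not been added at this point in the migrations.
  def valid?
    raise NameError, "undefined local variable or method `group_id'"
  end

  # Validations run by default; passing false skips them entirely,
  # so valid? is never called.
  def save(perform_validation = true)
    return false if perform_validation && !valid?
    STORE << self
    true
  end
end

# What the data migration effectively does for the admin user.
admin = FakeRecord.new('admin')
admin.save(false)
puts STORE.map(&:login).inspect
```

In the real data migration the change is just that one argument: the existing call to save becomes save(false). Later Rails versions spell the same option as save(:validate => false).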

2 comments:

Anonymous said...

There are other options. One is to build and execute the INSERT statement yourself instead of using ActiveRecord. This is the most future-proof solution, and you can easily write a helper to do this.

Another option is to define the model class in the migration without the validations, so that it overrides the definition in the models directory. A simple "class Foo < ActiveRecord::Base; end" will do the trick.

Taryn East said...

Cool. These are some good options for the future.

In this particular case I wanted to change past migrations so they "kept working" with as little pain... er change as possible ;)

Which is why save(false) was good in this instance.

Hopefully we won't need to do this ever again... but life being what it's like, it's great to hear there are even more options open that will keep pace with future changes.

Thanks heaps!