Showing posts with label tweaks. Show all posts
Showing posts with label tweaks. Show all posts

Tuesday, 11 March 2008

Performance tips > indexing matters

As I've mentioned in a previous post, I've been doing a lot of performance tweaking recently. By far and away the biggest gains I've garnered have been by tweaking the SQL/DB in various ways. One of the biggest things I learned was to use indexes appropriately. This means two things:

  1. Put an index on anything important
  2. Get the field order right

So what's important?

From the perspective of an index, immportant means anything that you will need to fetch repeatedly - especially if that fetch is in a tight loop. This is one place where RubyProf is your friend. You can also take a look at the SQL queries scrolling up the screen while watching the console output to get a rough idea of the most important queries.

What about the field order?

Indexes work better if you put the "most discerning" fields before the less-discerning ones. As an example, suppose you have a (completely spurious) table as follows:

owner_name pet_name toy_name
Bob Fido Bone
Bob Fido Rubber chicken
Bob Whiskers Jingly ball
Bob Whiskers paper on string
Jane Woofy chew toy
Jane Woofy bouncy ball
Jane Kitty paper bag
Jane Kitty laser pointer

Assume for some reason it's important to have an index on all three fields above. The field that is most unique is the "toy_name" - doing a lookup on that field will get the best bang for your buck, so it should come first in your index, followed by the next-most discerning field. In the example data above it's the pet_name, but maybe you know your intended audience a bit better. If you know that a lot of people are calling their dogs 'fido' and their cats 'kitty', you might index on the owner_name. The most likely circumstance is that you'll have the owners listed in a users list and there's actually an owner_id - in which case you'd definitely set that as the second key for the index.

Thus you'd end up with:
add_index :pet_toys, [:toy_name, :owner_id, :pet_name], :name => :idx_pettoys_by_name_oid_pet

Is it really that important?

The contract I'm currently working on requires us to display reports with tens of thousands of orders and trades in them, cross-referenced against multiple calculated benchmark values. It's imperative that the indexes are set up to pull that data out of the db in the most efficient way possible.

I acheived an order of Magnitude speedup just by changing around the ordering of the indexes! This was worth it for us - YMMV