As I've mentioned in a previous post, I've been doing a lot of performance tweaking recently. By far and away the biggest gains I've garnered have been by tweaking the SQL/DB in various ways. One of the biggest things I learned was to use indexes appropriately. This means two things:
- Put an index on anything important
- Get the field order right
So what's important?
From the perspective of an index, immportant means anything that you will need to fetch repeatedly - especially if that fetch is in a tight loop. This is one place where RubyProf is your friend. You can also take a look at the SQL queries scrolling up the screen while watching the console output to get a rough idea of the most important queries.
What about the field order?
Indexes work better if you put the "most discerning" fields before the less-discerning ones. As an example, suppose you have a (completely spurious) table as follows:
owner_name | pet_name | toy_name |
---|---|---|
Bob | Fido | Bone |
Bob | Fido | Rubber chicken |
Bob | Whiskers | Jingly ball |
Bob | Whiskers | paper on string |
Jane | Woofy | chew toy |
Jane | Woofy | bouncy ball |
Jane | Kitty | paper bag |
Jane | Kitty | laser pointer |
Assume for some reason it's important to have an index on all three fields above. The field that is most unique is the "toy_name" - doing a lookup on that field will get the best bang for your buck, so it should come first in your index, followed by the next-most discerning field. In the example data above it's the pet_name, but maybe you know your intended audience a bit better. If you know that a lot of people are calling their dogs 'fido' and their cats 'kitty', you might index on the owner_name. The most likely circumstance is that you'll have the owners listed in a users list and there's actually an owner_id - in which case you'd definitely set that as the second key for the index.
Thus you'd end up with:
add_index :pet_toys, [:toy_name, :owner_id, :pet_name], :name => :idx_pettoys_by_name_oid_pet
Is it really that important?
The contract I'm currently working on requires us to display reports with tens of thousands of orders and trades in them, cross-referenced against multiple calculated benchmark values. It's imperative that the indexes are set up to pull that data out of the db in the most efficient way possible.
I acheived an order of Magnitude speedup just by changing around the ordering of the indexes! This was worth it for us - YMMV