
Wednesday, 20 May 2015

Rails Active Record .median

Rails neatly provides an .average method to easily find the average value of a column in your db, but it doesn't provide one for finding the median. Thankfully it's really easy to implement.

If you just want to add it to one class, put this into the model:

def self.median(column_name)
  median_index = (count / 2)
  # order by the given column and pluck out the value exactly halfway
  order(column_name).offset(median_index).limit(1).pluck(column_name)[0]
end

If you want it to work across all Active Record models in your db... you could add it as a concern (then include it in every class you want it in) or you could just extend ActiveRecord::Base (so it's available to every Active Record, just like .average).
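
The concern route might look something like this sketch (the module name Medianable is my own invention; in a newer Rails app you'd likely use ActiveSupport::Concern and put it under app/models/concerns, but plain mixin mechanics are shown here):

```ruby
# Sketch of the concern approach - the module name Medianable is
# invented for illustration.
module Medianable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Same quick-and-dirty median as above: order by the column and
    # pluck the value exactly halfway through the relation.
    def median(column_name)
      median_index = (count / 2)
      order(column_name).offset(median_index).limit(1).pluck(column_name)[0]
    end
  end
end

# Then in any model you want it in:
# class Product < ActiveRecord::Base
#   include Medianable
# end
```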

If you want the latter, add the following to /config/initializers/active_record.rb:

class ActiveRecord::Base
  def self.median(column_name)
    median_index = (count / 2)
    # order by the given column and pluck out the value exactly halfway
    order(column_name).offset(median_index).limit(1).pluck(column_name)[0]
  end
end

Note: I've tested it even with chained scopes and it seems to work ok on our project (it acts as a terminal method - ie you can't chain further scopes after it)

Note: as pointed out to me, the median of an even number of items is the average of the middle two values... and this version doesn't do that - it's a quick-and-dirty version. Feel free to adapt it to be mathematically correct if you need that level of precision.
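
For the record, the mathematically-correct rule is easy to sketch in plain Ruby (illustrative only - adapting it back to the Active Record version would mean plucking the middle *two* values with offset(median_index - 1).limit(2) and averaging them when the count is even):

```ruby
# Correct median: the middle value for an odd count,
# the average of the two middle values for an even count.
def median_of(values)
  sorted = values.sort
  mid = sorted.length / 2
  if sorted.length.odd?
    sorted[mid]
  else
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

median_of([1, 2, 3, 4, 5]) # => 3
median_of([1, 2, 3, 4])    # => 2.5
```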

Saturday, 6 September 2014

gotchas: require/permitting an array of ids

This one got me the other day... it seems that sometimes the following won't work:

params.require(:my_thing).permit(:widget_ids)

When the set of ids is expected to be an array, instead use

params.require(:my_thing).permit(:widget_ids => [])

Sunday, 26 August 2012

Rails: five year plan?

My workplace is putting a tender out to a government body, hoping to secure some work for a big development happening soon.

As part of the tender process, they asked us what is the five-year plan for the development of the Rails platform.

Keeping in mind that 37Signals' approach to planning is "don't - it's all fiction anyway", we still had to provide them with an answer... and this is what I came up with.


Asking for a five year plan is a question with a hidden assumption. What you *really* want to know is:

am I going to get stuck with a dead duck?

I think the nightmare scenario you're trying to avoid is that you'll get maybe 1-2 years down the line, have spent a lot of money on changing your system over to the new-fangled software, only for the Software Vendor to go bankrupt and leave you hanging in the air.

Suddenly you lose any support if things go wrong. Bugs are found, but you've got nobody to call to fix them, and you certainly won't be seeing any new features. Hackers might find a new security vulnerability, and there'll be nobody on the front-line to patch that hole and stop them getting to your protected data.

You're in a bad, bad place.

Thankfully, this scenario can't happen with Ruby on Rails (RoR), due to its Open Source Nature.

RoR was originally developed by a company called 37signals - but unlike a proprietary product, 37signals doesn't own the codebase, because they gave it away to the community.

It takes a little while to wrap your head around how that changes things. Most people instinctively think it means less accountability, less power and less security - when actually the opposite is true... so bear with me as I compare this situation against what you might be used to with a proprietary vendor, such as Microsoft's .Net.

Microsoft owns .Net. They also keep all the code locked away, so you can't see it and don't really know what goes on under the hood. They make all the design decisions, but they also make all the decisions about what gets fixed and when.

So, if you spot a bug in the code - you have to ask Microsoft (nicely, and generally after waiting a long time on hold) to please fix it for you... If they say yes - it goes onto the end of a very long support queue, for their team of developers to prioritise against all the other features and bugs that have poured in over the last several years. And if they say no - you have no recourse. There's nobody else but Microsoft that can fix it for you.

You can complain about it - which will also get ignored.

If you have an SLA with them, you can call and tell them to do it now... which *might* change their mind... but if things go sour - nothing on earth will persuade them to do anything, and your only recourse would be to take them to court...
And of course we all know that if it gets to court, you've both already lost - you're just hoping to recoup some of the damages. Better to avoid the whole thing in the first place.

The short answer is that all the power lies in the hands of Microsoft, and you are reduced to a supplicant throwing yourself upon their mercy.

By comparison, RoR presents a very different picture.

As I mentioned before, 37signals doesn't own the code. If you have a copy of Rails, you do, and so do we, and so does everybody else that is using Rails.

This changes everything.

If 37signals goes bankrupt... there is still a huge community of people using (and contributing) to the development of Rails... and even if every single one of them goes bankrupt too - you still have a copy of it yourself. So if you need to fix a bug or patch a security hole - you have the code in your own hands and can do it yourself or get somebody like us to do it for you. Same thing for new features or even just customisations to make it work more like you'd wish it would.

The power-balance is radically different - you are the one with all the control.

Luckily, I don't see 37signals going under any time soon. Rails was released in July 2004 and went 1.0 in 2005, and since then it has moved on from being an obscure platform used only by web-geeks and enthusiasts like me, to being enthusiastically embraced by a huge variety of businesses worldwide.

Rails has recently undergone a Major Upgrade with the release of Version 3.0 - totally overhauling and improving the structure of the framework to meet the ever-changing demands of the web world, along with security and performance improvements.

Meanwhile, Rails 2 is still strongly supported by the community - as many businesses are still using it, and it will continue to be supported until there is no-one left using it anymore. This stands in stark contrast to the dire fate of obsolete Microsoft platforms - which Microsoft rigorously cuts off after a small number of months, stranding anybody left behind without support - or forcing them to pay huge re-licensing fees to upgrade to the newer version.

Rails is free, including all upgrades to new versions, so you can afford to move to the bleeding edge as soon as you'd like.

Even if 37signals drops support for an older version of Rails - the fact that the codebase is out there in the community means it remains alive and supported as long as the community remains. And that community is very big, and very active.

There are roughly 2000 active contributors to the rails-core codebase, including myself, and all of them are herded along by the rails-core team - a set of 12 developers, most of whom are currently employed full-time by 37signals to coordinate contributions.

What this means is that even if all of 37signals were suddenly wiped off the planet by a stray meteorite, there are still 1988 very active developers who are willing and able to continue developing the codebase... and not just minor enhancements and customisations of small side-features, either: even my own contribution (while small) was a change to one of the fundamental core classes of Rails, while developers such as Yehuda Katz have overhauled the basic mechanics of the whole system (for the better, of course).
And there's nothing stopping you from making changes like this yourself, at any time, for the foreseeable future - if you ever see a need.
Open Source means it's all available to you, all the time... forever.

Imagine trying to change something fundamental in one of Microsoft's products all by yourself... you wouldn't even get the chance because you don't have access to the code to even look at what's inside.

One final aspect that I'm sure is on your mind is security.

You're probably used to a big vendor taking responsibility for checking that the system is secure from hackers and fixing any security holes as they come up, and are wondering what happens with a scattered base of developers and code.

Again, I'll compare with Microsoft.
They're probably one of the biggest vendors on the planet, and they make a big noise about how secure their code is... and yet every week or two we hear about a new virus or worm exploiting a vulnerability in Windows.
Big Vendors make a big noise - but they actually can't guarantee security, however much they'd like you to think so.

So what can you get?

You can get an active community of developers continually testing the software for vulnerabilities and being responsive and timely in fixing any newfound holes.

And RoR has that in spades.

In fact, it's much better at it than an in-house development team like Microsoft. The reason being that shift in power-balance I mentioned earlier, coupled with the wider community base.

If a security hole is pointed out to Microsoft, it goes into the queue and *eventually* one of their developers gets a chance to look at it. Maybe it's fixed, or maybe Microsoft decides it's not important enough to fix because it's only affecting one customer so it's not worth their while... of course if that customer is you, then this decision is catastrophic, but given that Microsoft has all the power - there's nothing you can do about it.

With RoR, however, the power is all in your hands. You submit the bug-report to the community and chances are, somebody else is either experiencing the problem or also concerned about it affecting them. Given the size of the community, this means there's ten times the number of people available to fix the problem as with the in-house Microsoft developers team - and many of them are personally motivated to get the bug fixed because it affects their business too.

But even if nobody else in the world is affected by this but you - you can still get it fixed yourself - because you have the codebase, and you have access to developers that can get it done.
Something you simply can't do with proprietary code.

Far more likely, though, is that if you have a problem - then everybody else does too, so you announce the problem to the community, who picks it up and gets it solved and submits it back to the community for everybody to share... including you.

Sure, you don't have one company to smack with a stick if something doesn't get done... but you will actually find that you'll never be in that position in the first place - because there's always a better alternative.

And don't forget that a bigger community means more people finding and fixing bugs and security holes - before you even got to Rails in the first place. The Rails community has more collective eyeballs on the code than Microsoft can employ, and more eyeballs means more bugs spotted - and therefore fixed.

So as a quick comparison:

Vendor-owned proprietary system:

- Lock-in
  - they choose when you upgrade
  - restricted backwards compatibility
  - you *must* upgrade or lose the software
- They keep all the power
  - bugs and holes are fixed when they want (or not at all), according to their own profitability
  - features are only added if they think they can make a profit out of it
- Security-by-obscurity
  - only a few people get to look at the code, which means only a few people get to check that it works correctly; holes are more likely to be found by exploiters than by developers, which means the first you hear of it is when somebody hacks your system
  - holes are fixed if the vendor has time, and only if they think it's worth the money they spend on it
- Conclusion: the software lives and dies with the vendor.

RoR and Open Source approach:

- Freedom
  - you are free to upgrade when you want, *if* you want
  - it costs you nothing to upgrade - you can afford to be on the bleeding edge
  - support continues as long as people are using it
  - or you can pay somebody to fix it for you, whenever
- You have all the power
  - bugs and holes are fixed by the community when they need fixing
  - features are added on the fly by anybody that wants/needs them - you can add them yourself if you like
- Security through openness
  - a hundred thousand eyeballs on the code means security holes are more likely to be spotted by developers and fixed before they even become a problem
  - holes are fixed as soon as they become a problem - because the people fixing them are the people affected by the problem, who actually care about security rather than just profitability
- Conclusion: even if the vendor dies, the community lives on - and therefore so does the code. You never get stuck with a dead duck.

Wednesday, 7 December 2011

Acts-as-taggable-on

Tagging is pretty popular these days, and it was time to add it to our site. Unfortunately, lots of the gems are old, and it's hard to know if that means "good and has stuck around" or "buggy, obsolete and no longer supported".

The rubytoolbox page on rails tagging has several gems - most of which are marked as inactive now. The only one that looked like it had any recent activity is: acts-as-taggable-on. It's also the top-most-downloaded, so that looked good.

Best news: it is rails-3 compatible AND supports a recent build-version that is still Rails-2 compatible. As I've mentioned before, my current client is still on Rails-2 - because the upgrade pain is not currently outweighed by the new features.

The only annoyance is that the rdoc only has rails-3 post-install instructions: rails generate acts_as_taggable_on:migration. These don't work for rails-2, and if you try just substituting "script/" for "rails ", it'll give you an error saying: Couldn't find 'acts_as_taggable_on:migration' generator

I had to hack about a bit to find the new migration name, but what you need is: script/generate acts_as_taggable_on_migration

After that I used this extremely good tutorial on tagging with acts-as-taggable-on.

I don't like the tag-cloud style of tag-selection, and instead prefer something much more like Stack Overflow. So I created my own tag-list as per the code below.

It lists all current tags for the class (assuming a similar code setup to the tutorial above), and filters based on the chosen keyword - incorporating any existing search or pagination conditions you already have. It will highlight the current keyword, and change that link to a "deselect if you click" link.


    # code in index page
    <% @tags.sort_by(&:count).reverse.each do |k| %>
      <% url_opts = {:action => "index", :controller => "posts"}
         link_name = "#{k.name} (#{k.count})"
      %>
      <% if @keyword == k.name %>
        <%= link_to link_name, url_opts.merge(:keyword => nil), :class => "tag current_tag", :title => "Click again to see all" %>
      <% else %>
        <%= link_to link_name,  url_opts.merge(:keyword => k.name), :class => "tag", :title => "Click to filter by #{k.name}" %>
      <% end %>
    <% end %>


   # code in controller
   options = {} # any search/pagination conditions go here
   @tags = Post.tag_counts_on(:keywords)
   klass = Post
   klass = klass.tagged_with(@keyword) if (@keyword = params[:keyword]).present?
   @posts = klass.paginate( options )




  /**** and associated tag-cloud styles ****/
  /* basic tag-box */
  .tag {
    background-color: #eee;
    border: 2px solid #ccc;
    color: orange;
    border-radius: 7px;
    -moz-border-radius: 7px;
    padding: 2px 15px;
    text-decoration: none;
  }
  .current_tag {
    background-color: #ddd;
    color: orange;
    border: 2px solid orange;
    border-radius: 7px;
    -moz-border-radius: 7px;
    font-weight: bold;
  }
  .tag:hover, .current_tag:hover {
    background-color: #bbb;
    color: red;
    border: 2px solid red;
    border-radius: 7px;
    -moz-border-radius: 7px;
  }

Next up is to figure out ye olde ajax auto-suggest when I add them.

Friday, 14 October 2011

rubygems upgrade killed rails

Ack! I just tried to upgrade rubygems... and it destroyed my working version of rails 2.3.5 (yes, we use it for a client that has not yet upgraded due to the quite reasonable "it ain't broke" assumption).

Now I can't run script/server without one of the following errors: /usr/lib/ruby/gems/1.8/gems/rails-2.3.5/lib/rails/gem_dependency.rb:268:in `==': undefined method `name' for "Ascii85":String (NoMethodError) or /usr/lib/ruby/gems/1.8/gems/rails-2.3.5/lib/rails/gem_dependency.rb:119:in `requirement': undefined local variable or method `version_requirements' for # (NameError)

Chris Oliver provides a quick fix for getting back to a previously-working version in his description of the undefined local variable or method `version_requirements` and there's an active bug-report for the undefined method `name' for "Ascii85":String

For now, though, it looks like the only solution is to downgrade and hope for a fix... which seems to be coming only for edge rails. I really don't like it that I have to maintain an old version of rubygems just so that I can run my client's perfectly functional rails stack. :(

Monday, 20 June 2011

What the hell is happening with Rails?

A whole slew of controversy has been stirred up with the post: What the hell is happening to rails - which managed to top the hackernews charts for quite a while last week.

It basically gave vent to a lot of concern in the rails community that rails is becoming too difficult to learn, and that may be scaring off newbies.

The new rails has certainly been in a great state of flux - and pushes the whole framework in a new direction. Whether or not you like that direction is one thing - but the fact that it is such a big change makes it difficult to know where to start if you're coming into the community for the first time.

I recently had an eye-opening experience at a hack-day where I suggested rails as a platform-of-choice for us to use... then spent the entire day helping one guy just get a basic rails stack up and running on his laptop - and then I still had to explain how to actually use it. This isn't as easy as I remember it being when I first began.

One of the big benefits of original plain-ole-C was that anybody could learn the entirety of the language and keep it in their head all at one time. Contrast that with any of the big enterprise languages, which require a long ramp-up time even just to learn everything that is available in the basic suites. Not that having less is better - but it does make the early-learning stages much quicker... so I can see the point being made here.

In any case - I think the gist of the post/discussion is worry over the potential dilution of the framework. If we want to be all things to all people - it means a lot of work to genericise the platform, and that means big changes.

On one hand, I can see why it's being done. There are good reasons for all of the alternatives to the core-defaults... and being able to support them therefore opens up our audience to a greater market. But I can also see the point that it makes Rails *feel* a lot more bloaty than before... even if it isn't actually degrading performance, and it gives newbies so much more to learn just to get out the starting gates.

In any case - Yehuda Katz has now done a follow-up post explaining just What’s Up With All These Changes in Rails? - and it's worth a read to see why they've made some of their recent decisions.

And I plan to watch the continuing discussion closely...

Wednesday, 19 January 2011

Rails namespacing test gotchas

We have some controllers that are namespaced as Admin controllers eg: Admin::ProductController.

...and this is a legacy system, and never had any functional tests created for those...

So I was trying to get started on adding some functional tests. I created a test/functional/admin folder, then added a product_controller_test.rb under that, with the usual:

class ProductControllerTest < ActionController::TestCase

At first, the tests wouldn't even load properly, and I was getting the following error:

uninitialized constant ActionController (NameError)

That, apparently, is solved by explicitly putting:

require 'test_helper'

at the top of the file. After which it loads fine, and the null-test runs. But after that, I kept getting the following error:

RuntimeError: @controller is nil: make sure you set it in your test's setup method

It seems that you need to namespace your tests too, so instead of the class declaration I have above, you need:

class Admin::ProductControllerTest < ActionController::TestCase

The admin folder isn't enough.
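
The reason (roughly - this is my simplified illustration, not the actual Rails source) is that ActionController::TestCase infers the controller under test from the test class's *full* name, so the directory alone tells it nothing:

```ruby
# Simplified sketch of the inference: strip the trailing "Test" from
# the test class name to get the controller class name.
def inferred_controller_class(test_class_name)
  test_class_name.sub(/Test$/, '')
end

inferred_controller_class('Admin::ProductControllerTest') # => "Admin::ProductController"
inferred_controller_class('ProductControllerTest')        # => "ProductController"
# Without the Admin:: namespace in the class declaration, Rails goes
# looking for a top-level ProductController - not the namespaced one
# you meant - and @controller ends up nil.
```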

Thursday, 26 November 2009

gem bundle with capistrano

[Update: There's a capistrano recipe in bundler now, if you just require "bundler/capistrano" apparently it automatically runs bundler install and installs the gems in shared/vendor_bundle]

Ok, so now you've converted your existing rails project to using gem bundle, and it runs fine on your development box, how do you update your capistrano recipes to do it automatically?

Current wisdom is varied. You can:

  1. check *everything* into your source control (very heavy!)
  2. check in only your Gemfile and the vendor/bundler_gems/cache directory (which holds all the *.gem files) into source control (still pretty heavy, but makes sure you don't need internet access every time you deploy - use 'gem bundle --cached' in your recipes)
  3. symlink your vendor/bundler_gems/cache directory to a shared location, and then all you need to checkin is the Gemfile (quicker, still needs to unpack/bundle gems each time)
  4. symlink all your vendor/bundler_gems/ directories to a shared location, and then all you need to checkin is the Gemfile (quick and not too heavy)
  5. symlink the entire vendor/bundler_gems directory to a shared location (much quicker).

5: Symlink the lot!

This would be my preference. For a single project deployed on production, there should be no reason why we can't just symlink the whole bundled-gem directory, and let the Gemfile keep that directory up-to-date. This feels no different to using a shared file-attachments directory.

Sadly, this doesn't work, due to this:
No such file or directory - MyProjectRailsRoot/vendor/bundler_gems/../../Gemfile
which will resolve to two directories above the *shared symlinked* directory :P

This is because the bundled_gems generates an 'environment.rb' that points back up the directory at the Gemfile that created it... by using a relative path that unfortunately hits the symlink as described above. It'd be nice to be able to tell gem_bundler to make that link absolute...
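
Here's a small plain-Ruby demonstration of why the relative path misbehaves (all paths invented for illustration): '..' gets resolved against the symlink's *target*, not against the place the symlink sits.

```ruby
require 'tmpdir'
require 'fileutils'

# Resolve a relative path the way the filesystem does: through the
# symlink's real location.
def resolve_through_symlink(link, relative)
  File.expand_path(relative, File.realpath(link))
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'shared', 'bundler_gems'))
  FileUtils.mkdir_p(File.join(root, 'app', 'vendor'))
  link = File.join(root, 'app', 'vendor', 'bundler_gems')
  File.symlink(File.join(root, 'shared', 'bundler_gems'), link)

  # environment.rb's relative lookup ('../../Gemfile') lands at the
  # tmpdir root - NOT at app/Gemfile, where the Gemfile actually lives.
  puts resolve_through_symlink(link, '../../Gemfile')
end
```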

If anyone knows a way around this, please do let me know!

So we fall back on 3 or 4. My first attempt was to use 3:

3: ci the Gemfile, symlink the cache directory

This seems to be a reasonable compromise. Our deployments are getting pretty damn bloated as it is - with all those plugins and vendored rails etc... we don't want to add anything more if we can avoid it. Even gems. We have had no problem with downloading gems, so there's no need to check them in, as long as the deployed directory has access to them when needed. Thus: check in the Gemfile, symlink the cache directory... and maybe even symlink the 'gems' directory some time too.

So, how to do it?

First - prepare the bundler_gems directory. For the first deploy, log in to your deployment server and create the shared directories, eg:

mkdir ../shared/bundler_gems 
mkdir ../shared/bundler_gems/cache

Next add the symlink and gem bundle commands to your capistrano deployment recipe:

# symlink to the shared gem-cache path then bundle our gems
namespace :gems do
  task :bundle, :roles => :app do
    run "cd #{release_path} && ln -nfs #{shared_path}/bundler_gems/cache #{release_path}/vendor/bundler_gems/cache && gem bundle"
  end
end

after 'deploy:update_code', 'gems:bundle'

And you *should* be able to deploy fine from there. My experience wasn't completely smooth, and I had to set up on the server manually the first time - but from then on the capistrano-task worked fine. Would love to hear your own experiences...

Hudson

If you're using hudson for Continuous integration, you can achieve the same effect by updating the Hudson build script adding:

# prepare gems and shared gem-cache path
ln -nfs <hudson config path>/bundler_gems/cache  vendor/bundler_gems/cache
gem bundle

Also - if you have test/development-environment-only gems... make sure you add your integration-server's environment to the Gemfile environment list, or Hudson won't be able to find, say, rcov or rspec - and then the build will break.

4: ci the Gemfile, symlink the sub-directories

I started with the solution described above, but it still takes a while unpacking all the gems. I'd much prefer to use the solution #5, but that fails due to the relative links in environment.rb. So the ugly compromise is to symlink everything *except* the environment.rb

It works and we can deploy ok with capistrano... but I'm looking for a better solution.

...and after a few deploys...

Well, it was a nice try, but after a few deploys suddenly we started getting breakages... the application couldn't find shoulda for some reason. Now we use a hacked version of shoulda and have vendored it as that's a quickie way to monkey patch a gem.

We told gem bundle where we'd vendored it, and it all seemed to work fine. Unfortunately it broke, and a quick look at the symlinked 'gems' directory tells us why:

rails-2.3.4 -> /var/www/apps/my_app/releases/20091201120443/vendor/rails/railties/
shoulda-2.10.2 -> /var/www/apps/my_app/releases/20091201120443/vendor/bundler_gems/dirs/shoulda

What you can't see are these lines blinking red. What you can see is that these gems are pointing at a specific release's vendor directory... and not the latest one! Coupled with a :keep_releases => 4... that release's directory is quite likely to disappear very quickly - and in this case, it already has. This makes these symlinks (set up by gem bundle during release 20091201120443) point to a vendor directory that no longer exists. So it's really not much of a surprise that our app can't find shoulda anymore.

I think the problem comes along because of gem bundle's rather useful feature of not reloading a gem if it's already installed. It looks to see if the specification exists - if so, it assumes that it's been installed correctly - it doesn't check that the vendored location still exists. That unfortunately spells our doom when capistrano deploys: because gem bundle runs from the newly-created release-directory, that's where the symlink is initially set up. gem bundle doesn't then check it later - even though later its been swept away in a release-cleanup.

So - we're currently working on a fix. Two options present themselves:

  1. find a way for gem bundle to symlink from 'releases/current'. This means it has to exist before we do a gem bundle... and that's dangerous, because it lets users through into a not-yet-working release. Or
  2. we could not vendor any gems - but set up our own gem-server instead. This will work, but it's a bit more of a hassle than we'd prefer for vendored gems.

Troubleshooting:

Git repository 'git://github.com/taryneast/shoulda.git' has not been cloned yet

The recipes online all tell you to use gem bundle --cached, and that's a great idea if network connectivity is scarce, as it uses gems that are already in the cache (without going to fetch them remotely)... but it will fail if you don't already have the gems in the cache. So it relies on you applying solution 2 above.

There are two solutions:
The first is to just use gem bundle always in your recipes.
The other is to use gem bundle on your development directory - then check the cache into your source control instead of symlinking it to a shared directory (ie use solution 2 above). This will work just fine; if you add a new gem, you'll have to make sure you run gem bundle on your working directory before checking in.

I'm not sure of a nice way to get around this if you're using solution 3. It might be worth setting up a rake task that checks if the Gemfile changed and opportunistically runs gem bundle instead of the --cached version (in fact, it'd be nice if gem bundle --cached had a --fetch-if-not-present option!). If you have a solution that worked for you, let me know.
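
That check could be as simple as comparing a recorded checksum of the Gemfile. Here's a sketch (the stamp-file path and task name are my own invention, not part of bundler):

```ruby
require 'digest'

# True when the Gemfile's checksum differs from the one we recorded
# after the last successful bundle (or when no record exists yet).
def gemfile_changed?(gemfile_path, stamp_path)
  current = Digest::SHA1.file(gemfile_path).hexdigest
  return true unless File.exist?(stamp_path)
  File.read(stamp_path).strip != current
end

# Record the current checksum after bundling.
def record_bundle(gemfile_path, stamp_path)
  File.open(stamp_path, 'w') do |f|
    f.write(Digest::SHA1.file(gemfile_path).hexdigest)
  end
end

# A Rakefile might then do something like (sketch):
# task :smart_bundle do
#   if gemfile_changed?('Gemfile', 'tmp/gemfile.sha1')
#     sh 'gem bundle'
#     record_bundle('Gemfile', 'tmp/gemfile.sha1')
#   else
#     sh 'gem bundle --cached'
#   end
# end
```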

rake aborted! no such file to load -- spec/rake/spectask

I got this when running rcov. It just means you need to add a few extra gems to your Gemfile. We needed all of these; YMMV:

  only [:development, :test, :integration] do
    gem 'rcov'  # for test coverage reports
    gem 'rspec' # rcov needs this...
    gem 'ci_reporter' # used somehow by rake+rcov
    gem 'ruby-prof' # used by Hudson for profiling
  end

:bundle => false with disable_system_gems

Looks like these two don't play nice. Ie, if you choose for a single gem to be unbundled - but have disable_system_gems set - it isn't smart enough to realise that this one gem is meant to be an exception to the 'disable' rule. If you have any unbundled gems, make sure you remove disable_system_gems - or your application simply won't find them.

Wednesday, 25 November 2009

Convert from config.gem to gem bundler

Why gem bundler?

Our sysadmin hates rake gems:install

It seems to work for me, but all sorts of mayhem ensues when he tries to run it on production... of course I have a sneaking suspicion it's really our fault. After all - we tend to forget that we already happened to globally-install some gem while we were just playing around with it... which means we didn't bother putting it into the config.gem section of environment.rb... oops!

However, there's a new option on the horizon that looks pretty interesting, and is built to sort out some of the uglier issues involved in gem-herding: gem bundler

Yehuda has written a very thorough tutorial on how to set up gem bundler. But I find it kinda mixes rails and non-rails instructions, and it's not so clear on where some things go. I found it a little fiddly to figure out. So here are the step-by-step instructions that I used to convert an existing Rails 2.3 project.

Step-by-step instructions

1: Install bundler

sudo gem install bundler

2: create the preinitializer.rb file

Yehuda gave some code in his tutorial that will load everything in the right order. You only need to copy/paste it once and it will then Just Work.

Go grab the code from his tutorial (about halfway down the page) and save it in a file: config/preinitializer.rb

Don't forget to add that file to your source control!

Update: If you're using Passenger, update the code to use:

module Rails
  class Boot
  #...
  end
end

Instead of:

class Rails::Boot
  #...
end

3. create the new gems directory

mkdir vendor/bundler_gems

Add this directory to your source control now - while it's still empty!

While you're at it, open up your source-control's ignore file (eg .gitignore) and add the following:

vendor/bundler_gems/gems
vendor/bundler_gems/specifications
vendor/bundler_gems/doc
vendor/bundler_gems/environment.rb

4. Create the Gemfile

Edit a file called Gemfile in the rails-root of your application (ie right next to your Rakefile)

At the top, add these lines (comments optional):

# because rails is precious about vendor/gems
bundle_path "vendor/bundler_gems"
# this line forces us to use only the bundled gems - making it safer to
# deploy knowing that we won't accidentally assume a gem in existence
# somewhere in the wider world.
disable_system_gems

Again: don't forget to add the Gemfile to your source control!

5: Add your gems to the Gemfile

Now comes the big step - you must copy all your config.gem settings from environment.rb to Gemfile. You can do this almost completely wholesale. For each line, remove the config. from the beginning and then, if they have a version number, remove the :version => and just put the number as the second param. I think an example is in order, so the following line:
config.gem 'rubyist-aasm', :version => '2.1.1', :lib => 'aasm'
becomes:
gem 'rubyist-aasm', '2.1.1', :require => 'aasm'

For most simple gem config lines, this should be enough so that they Just Work. For more complicated config.gem dependencies, refer to the Gemfile documentation in Yehuda's tutorial.

If you already have gems in vendor/gems, you can specify that bundler uses them - but you have to be specific about the directory, eg:
gem 'my_cool_gem', '2.1.1', :path => 'vendor/gems/my_cool_gem-2.1.1'

Extra bonus: if you have gems that are *only* important for, say, your test environments, you can add special 'only' and 'except' instructions (or whole sections!) that are environment-specific and keep your staging/production environments gem-free eg:

gem "sqlite3-ruby", :require => "sqlite3" , :only => :development
only :cucumber do
  gem 'cucumber'
  gem 'selenium-client', :require => 'selenium/client'
end
except [:staging, :production] do
  gem 'mocha'          # http://mocha.rubyforge.org/
  gem "redgreen"       # makes your test output pretty!
  gem "rcov"
  gem 'rspec'
end

5a: Add rails

Now... at the top of the Gemfile, add:
gem 'rails', '2.3.4'
(or whatever version you currently use)... otherwise your rails app won't do much! :)

Obviously - if you've vendored rails you will need to specify that in the Gemfile way eg:
gem 'rails', '2.3.4', :path => 'vendor/rails/railties'

If you've opted *not* to disable_system_gems, you won't need this line at all. Alternatively, you could tell the Gemfile to use the system-version anyway thus:
gem 'rails', '2.3.4', :bundle => false

Also, I'd recommend adding the following too:

 gem 'ruby-debug', :except => 'production'  # used by active-support!

6: Let Rails/Rake find your gems

Edit 'config/environment.rb' and at the bottom (immediately after the Rails::Initializer.run block), add:

# This is for gem-bundler to find all our gems
require 'vendor/bundler_gems/environment.rb' # add dependencies to load paths
Bundler.require_env :optional_environment    # actually require the files

7: Give it a whirl

From the rails root directory run gem bundle

The bundler should tell you that it is resolving dependencies, then downloading and installing the gems. You can watch them appearing in the bundler_gems/cache directory :)

and you're done!

...unless something fails and it can't find one - which means you probably forgot to add it to your config.gem settings in the first place! ;)

PS: ok... so I've also noticed you sometimes have to specify gems that your plugins use too - so it may not be entirely your fault... ;)

PPS: if Rake can't find your bundled gems - check that config/preinitializer.rb is set up correctly!

Friday, 4 September 2009

UUID over the wire

Need to use Active Resource on a remote object that has a UUID?

Last time I checked, Active Resource still overwrites the id with a to_i version of the uuid... this makes "123ABCDE456" turn into 123... not what we want.

But Hyperactive Resource (HyRes) works just fine.

UUID created by the remote system

Does the remote API create the UUID for you? If so - you're laughing. Just add: self.primary_key = :uuid to your HyRes object and you're set. HyRes doesn't do a to_i conversion, so the UUID will still be stored as a string.

Need to set UUID locally?

Install the 'uuidtools' gem then add a before_create callback (in your resource) thus:

# for uuid-generation
require 'uuidtools'
class Widget < HyperactiveResource
  self.site = 'http://localhost:3001' # etc...
  self.columns = [:uuid, :widget_type, :name, :description] # etc...

  self.primary_key = :uuid
  before_create :generate_uuid

  # helper to generate a uuid for each asset
  def generate_uuid
    self.uuid = UUIDTools::UUID.timestamp_create.to_s
  end
end
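As an aside: if you're on Ruby 1.9.2 or later and don't specifically need a timestamp-based UUID, the standard library's SecureRandom can generate a random (version 4) UUID without any gem dependency:

```ruby
require 'securerandom'

# Random v4 UUID from the stdlib - no uuidtools needed.
uuid = SecureRandom.uuid
uuid.length   # => 36 (8-4-4-4-12 hex digits plus four dashes)
```

Either style stores fine as a string primary key - the timestamp-based one just sorts chronologically.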

Thursday, 6 August 2009

HyRes : ActiveResource that actually works!

It's three months on from my original post whinging about the lack of Validations in Active Resource.

At that time I put my money where my mouth was and forked a copy of the HyperactiveResource plugin, which provided a very crude, basic improvement over vanilla Active Resource.

So What have I achieved?

I've actually done a lot since then. I have implemented a lot of the TODOs that I wrote down as essential for us to get a good, solid basis for a workable system (see the done list below).

I'd class this plugin as working and functionally useful. Right now it's almost as useful as Active Record was a few years back... which is a vast improvement on the basic Active Resource interface.

I think there's still a lot of room for improvement, but the basics are there, and it *feels* like Active Record now - which before it definitely did not. Before it felt like you could play around with it for toy systems, or implement minor bits of functionality - but now you can really implement a fully-functional system on top of it. I know this - because we have done[1].

What's missing?

HyRes still has some niggling issues, and some stuff that would make it a much smoother migration from Active Record. I'm working on these and there is a perpetually-renewed TODO list on the HyRes README file

The funkier aspects of Active Record

I haven't yet re-implemented proper AR reflections - so the associations need some work... you can't do: widget.wodgets.find(...) and you can't use named_scope - which means that many basic plugins (eg restful_authentication) won't work out-of-the-box. But there's enough functionality there that pulling together a system that is totally non-Active Record is feasible.

API assumptions

Currently HyRes makes some assumptions about your API - it assumes a RESTful interface as the optimal configuration. If your API does not match the assumptions, there are some workarounds available, but it might not be as useful.

An example is the validates_uniqueness_of method. This currently assumes that your API takes an array of conditions on the query-string, and that this will filter your returned set of findable objects.

If your API doesn't do this... the method currently defaults to fetching *all* objects and filtering on the rails-end... which is likely to be extremely slow (and may lead to timeouts). But it's there in case it's necessary. You may have to simply re-write that method with your own API-requirements in mind. I welcome alternative solutions to the problem...
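The fallback amounts to something like this - a self-contained sketch with made-up data, not the HyRes source:

```ruby
# Records already fetched from the remote system (made-up data).
existing = [
  { :id => 1, :login => 'alice' },
  { :id => 2, :login => 'bob'   },
]

# With no server-side filtering, checking uniqueness means scanning
# the whole fetched collection on the rails-end.
def taken?(records, attribute, value)
  records.any? { |record| record[attribute] == value }
end

taken?(existing, :login, 'alice')  # => true
taken?(existing, :login, 'carol')  # => false
```

Fine for a handful of records; painful (or timeout-inducing) when `existing` is every object the remote API knows about.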

testing...

Probably one of the nastier downsides atm is that HyRes doesn't have its *own* set of tests... We currently test it through the full suite of tests on our own system. This is mainly due to the fact that it's really quite hard to set up tests that don't rely on an existing API that's up and running. In our system we use a combination of mocks in our functional tests, backed by a fully-functional, running mock API for our integration tests... this is near-impossible for the HyRes plugin itself... so I'm still thinking about how I'll test it independently.

So what *does* it do?

The current list of implemented features (along with how to set up your API to use them) is also available on the HyRes README file but a quick snapshot of the list as of today is below. Note - in some cases, I still find it hard to believe that Active Resource didn't already implement these...

Columns

Because Active Resource isn't backed by a db... you can't use the table columns to determine the known set of attributes of a resource. ARes currently works by accepting any incoming data as an attribute, and using MethodMissing as an accessor for those attributes. This is fine for situations where you don't know what attributes will be returned by the remote system.

The downside is that if no value is returned for an attribute and you try to access it... it throws a MethodMissing exception (uuugly!).

If, however, you know what attributes to expect (because it's an agreed API-interface), it'd be nice to be able to tell Active Resource which attributes we expect - and have it return any missing attributes as nil (rather than exploding).

Thus was born the columns= method.

Currently this method also works a little like attr_accessible - any attributes listed like this will be passed through on create/update... but nothing else will. This allows you to set temporary accessors (eg password and password_confirmation) without them being passed over the wire.
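A self-contained sketch (not the HyRes source - class and attribute names are made up) of the behaviour this buys you: declared attributes read as nil when the API omits them, while genuinely unknown names still raise:

```ruby
class MiniResource
  # Declaring columns defines plain accessors backed by a hash,
  # so a missing attribute reads as nil instead of MethodMissing.
  def self.columns=(names)
    names.each do |name|
      define_method(name)       { attributes[name] }
      define_method("#{name}=") { |value| attributes[name] = value }
    end
  end

  def attributes
    @attributes ||= {}
  end
end

class Patient < MiniResource
  self.columns = [:name, :dob]
end

patient = Patient.new
patient.name              # => nil - declared, just not returned by the API
patient.dob = '1970-01-01'
patient.dob               # => "1970-01-01"
# patient.shoe_size       # would raise NoMethodError - not a declared column
```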

Validations

Validations are pulled in directly from ActiveRecord::Validations. For Rails 3.0 we can use the ActiveModel::Validations component.

Note: validates_uniqueness_of is still experimental - as it requires a network-hit to actually determine whether you have any existing objects with the given field-value. You can't rely on uniqueness-of as there is no way of locking the remote-system's db - and thus somebody could have added a resource with that value in the meantime.

Callbacks

There are now callback hooks for before_validate, before_save etc - all the standard Active Record callbacks we've come to know and love. You can use them for callback chaining a la Active Record (still experimental - may have missed some, but you can definitely use validate :my_validation_method)
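A self-contained sketch (not HyRes internals - names are illustrative) of the callback pattern being mirrored: registered hook names run, in order, when save is called:

```ruby
class MiniModel
  # Each subclass accumulates its own list of before_save hook names.
  def self.before_save_hooks
    @before_save_hooks ||= []
  end

  def self.before_save(method_name)
    before_save_hooks << method_name
  end

  def save
    self.class.before_save_hooks.each { |hook| send(hook) }
    true  # pretend the remote POST succeeded
  end
end

class Widget < MiniModel
  attr_accessor :name
  before_save :normalize_name

  def normalize_name
    self.name = name.to_s.strip
  end
end

w = Widget.new
w.name = '  sprocket  '
w.save
w.name  # => "sprocket" - the hook ran before the (pretend) save
```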

Finders

conditional finders

eg Widget.find(:all, :conditions => {:colour => 'blue'}, :limit => 5). This functionality working depends on your API accepting the above filter fields and returning something sensible. There's a lot of doco on how to set up your API (if you have that luxury) in the HyRes README

Dynamic finders and instantiators

eg find_by_X, find_all_by_X, find_last_by_X. These rely on your API accepting filter fields as they are really just a bit of a convenient alias for find(:conditions => X). Dynamic finders take any number of arguments using _and_: find_by_X_and_Y_and_Z

We also have dynamic instantiators: find_or_create_by_X OR find_and_instantiate_by_X - both of which also take any number of args, just like the dynamic finders.

Dynamic finders/instantiators also take ! eg: find_or_create_by_X! will throw a ResourceNotValid exception if create fails
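Under the hood, this style of finder is typically just method_missing name-parsing. A self-contained sketch of the idea (made-up data and class name, not the HyRes source):

```ruby
class FakeFinder
  RECORDS = [
    { :colour => 'blue', :size => 'large' },
    { :colour => 'red',  :size => 'small' },
  ]

  # find_by_colour_and_size('blue', 'large') decomposes the method
  # name into condition keys, then zips them up with the arguments.
  def self.method_missing(name, *args)
    if name.to_s =~ /\Afind_by_(.+)\z/
      keys = $1.split('_and_').map { |key| key.to_sym }
      conditions = Hash[keys.zip(args)]
      RECORDS.find { |r| conditions.all? { |k, v| r[k] == v } }
    else
      super
    end
  end
end

FakeFinder.find_by_colour('red')
# => { :colour => 'red', :size => 'small' }
FakeFinder.find_by_colour_and_size('blue', 'large')
# => { :colour => 'blue', :size => 'large' }
```

The real implementation then builds the matching API request from those conditions instead of scanning a local array.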

Other ActiveRecord::Base-like functions

  • update_attribute / update_attributes - actually exist now!
  • save! / update_attributes! / update_attribute! / create! - actually exist, now that we can do validation. They raise HyperactiveResource::ResourceNotSaved on failure to save/validate
  • ModelName.count (still experimental) - with optional finder-args. This works by first trying a /count request on your model object - and if the route fails it just pulls out all the objects and returns the length (ie - a nasty - but functional fallback).
  • updated collection_path that allows suffix_options as well as prefix_options - allowing us to generate Rails-style named routes for collection-actions
  • no explosion for find/delete_all/destroy_all when none are to be found. (Active Record just returns nil - so should we)
  • ActiveRecord-like attributes= (updates rather than replaces)
  • ActiveRecord-like load that doesn't dup attributes (stores direct reference)
  • reload that does a full clear/fetch to reload (also clears associations cache, now that we have one)

Associations

A lot of the basic associations were in HyRes before I arrived - I've polished a few up (eg adding the belongs_to/has_many functions) and fixed a few bugs

  • Resources can be associated with records
  • Records can be associated with records
  • Awareness of associations between resources: belongs_to, has_many, has_one (note - no HABTM yet)
    • Patient.new.name returns nil instead of MethodMissing
    • Patient.new.races returns [] instead of MethodMissing
    • pat = Patient.new; pat.address_id = 1; pat.address # returns the address object
  • Can fetch associations even with a nested route by using the ":nested" option on the nested resource's class. This command automatically adds a prefix-path, and will pre-populate the parent's id when you do an association collection_fetch.
  • Supports saving resources that :include other resources via:
    • Nested resource saving (creating a patient will create their associated addresses)
    • Mapping associations ([:address].id will serialize as :address_id)

What's next?

Well, one big thing that's next is that I'm planning on starting to work this stuff back into Rails itself. The brilliant component-architecture and upcoming Active Model changes for Rails 3 are just made for this. What I've been doing on HyRes can feed into that process to make Active Resource what we've always wanted - a real replacement-candidate for Active Record. For this purpose I have a fork of rails - but work will progress slowly (as I'm working on it in my spare time).

Until then, I'll keep on upgrading HyRes as needed for our own corporate requirements. Below, I've listed some features that Active Record provides and that I'd *personally* like to see implemented next... but this is certainly not an exhaustive list. (and I'm very open to other ideas...)

One thing I'd love to see is other people using HyRes for their own projects. I'd love to hear feedback on how it works (or not) as that will feed the project and feed into the work that eventually goes back into Rails. Better-still, if you're willing to work on implementing some of the still-missing features... you would be most welcome. Feel free to check out the Hyperactiveresource project on github and have a go.

Still TODO

  1. Testing - inside the plugin! (currently we test via our actual web-app's set of exhaustive tests)
  2. MyModel.with_scope/named_scope/default_scope
  3. MyModel.find(:include => ...)
  4. attr_protected/attr_accessible
  5. MyModel.calculate/average/minimum/maximum etc
  6. Reflections. There should be no reason why we can't re-use ActiveRecord-style reflections for our associations. They are not SQL-specific. This will also allow a lot more code to automatically Just Work (eg an Active Record could use has_many :through a HyRes)
  7. has_and_belongs_to_many
  8. Split HyRes into Base and other grouped functions as per AR
  9. validates_associated
  10. write_attribute that actually hits the remote API ???
  11. a default format for when it doesn't understand how to deal with given mime-formats? One which will just pass back the raw data and let you play with it?
  12. cache the raw (un-decoded) data onto the object so we don't have to do a second fetch? Or at least allow a universal attribute to be set that turns on caching

Notes

[1] The subject of an upcoming-blogpost will be on the work we've done: "Rewriting monolithic legacy systems in Rails"

Wednesday, 29 July 2009

CSV views with FasterCSV and csv_builder

We were asked to add a "Download as CSV" link to the top of each of our reporting actions. We already had a "revenue_report" controller-action with its own html view template... and just wanted to add something similar in CSV.

FasterCSV seemed to provide some great CSV-generation functionality... but it builds the csv and spits it out right there in the controller.

uuuuugly!

We shouldn't put view-generation code into the controllers - that splits up our nicely-decoupled MVC stack!

I've also seen some ideas about putting a to_csv method into an individual model. Now, if I'm leery about putting view-generation in the controller... I'm even less impressed by putting it into the model! But I can see the benefits of simplicity.

However - I'm still not sure it fits with our requirements. A @widget.to_csv call works just fine for the WidgetsController#show action... and probably even the WidgetsController#index action (where you can run @widgets.map &:to_csv or similar)... but most of our reports span multiple associations and the data is grouped, filtered, and summarised. Each of these sets needs headings explaining what it is and laying it out nicely in a way that makes sense to the user reading the CSV file. Putting that kind of visual-only functionality in your model is where it really starts to get ugly again. I am left with one big thought:

Why can't I just put my CSV-template into the views directory?

enter: csv_builder plugin

csv_builder comes with extremely minimal documentation... but it's not that hard to use. Here's a step-by-step with examples to follow:

  1. install fastercsv and csv_builder
  2. add a .csv option to the respond_to part of your action
  3. add your <actionname>.csv.csvbuilder template into your views directory
  4. Add a "Download as CSV" link for your user

Step 1: Install FasterCSV & csv_builder

sudo gem install fastercsv
./script/plugin install git://github.com/econsultancy/csv_builder.git

Don't forget to restart your server!

csv_builder has already set up a format-extension handler for csv so you don't need to add it to the mime-types yourself, just start using it in your actions.

Step 2: In your controller action:

csv_builder will work out of the box even if you just add the simplest possible respond_to statement:

  respond_to do |format|
    format.html # revenue_report.html.erb
    format.xml  { render :xml => @payments }
    format.csv  # revenue_report.csv.csvbuilder
  end

With nothing more than the above, csv_builder will grab out your template and render it in csv-format to the user.

You can tweak the FasterCSV options by setting some @-params in your action. The csv_builder gem README explains how to use those, but the most useful would be to set @filename to give your users a meaningful filename for their csv-file. You can also specify things like the file-encoding, but go look at the gem README to find out the details

Here's an example of a full controller action:

# GET /sites/1/revenue_report
# GET /sites/1/revenue_report.xml
# GET /sites/1/revenue_report.csv
def revenue_report
  @site = Site.find(params[:id])
  @payments = @site.revenue_report_payments
  respond_to do |format|
    format.html # revenue_report.html.erb
    format.xml  { render :xml => @payments }
    format.csv do
      # timestamping your files is a nice idea if the user runs this action more than once...
      timestamp = Time.now.strftime('%Y-%m-%d_%H:%M:%S')
      # passing a meaningful filename is a nice touch
      @filename = "revenue_report_for_site_#{@site.to_param}_#{timestamp}.csv"
    end # revenue_report.csv.csvbuilder
  end
end

Step 3: add a template into your views directory

The final step is to add a .csv.csvbuilder template to your views directory. This is just like adding an html.erb template for your action, except that the file will have the extension .csv.csvbuilder. AFAIK csv_builder can only support templates that have the same name as your action, so in my example the template would be called revenue_report.csv.csvbuilder

Your view template is where you can stash away all the FasterCSV guff that will build your csv file. This works the same way as any other FasterCSV-generated content - but just make sure you generate it into a magic array that is named 'csv'. Here's an example:

  # header row
  csv << ["Revenue report for site: #{@site.name}"]
  csv << [] # gap between header and data for visibility

  # header-row for the data
  csv << ["Date", "", "Amt (ex tax)", "Amt (inc tax)"]

  row_data = [] # for scoping
  @payments.each do |payment|
    row_data = [payment.created_at.to_s(:short)]     
    row_data << "" # column gap for visibility
    # note you can use the usual view-helpers
    row_data << number_to_currency(payment.amount_ex_tax)
    row_data << number_to_currency(payment.amount_inc_tax)

    # and don't forget to add the row to the csv
    csv << row_data.dup # it breaks if you don't dup
  end # each date in date-range
  csv << [] # gap for visibility

  # and the totals-row at the very bottom
  totals_data = ["Totals", ""] # don't forget the column-gap
  totals_data << @payments.sum &:amount_ex_tax
  totals_data << @payments.sum &:amount_inc_tax
  csv << totals_data

Step 4: add a CSV link

Something like the above code will generate a nice plain csv file and spit it out at the user in the correct encoding... but now we need something to actually let the user know that they can do this.

This is pretty simple - it just requires adding the format to your usual path-links. eg:

<%= link_to 'Download Report as CSV', site_revenue_report_path(:id => @site.to_param, :format => :csv) -%>

Testing

Testing for non-html formats still seems pretty crude. You have to manually insert the "accepts" header in the request before firing off the actual request - then manually check the response content_type. It'd be nice if we could just add :format => :csv to the request... anyway, here's a sample shoulda-style test case:

  context "CSV revenue_report" do
    setup do
      @request.accept = 'text/csv'
      get :revenue_report, :id => @site.to_param
    end

    # variables common to all formats of revenue-report
    should_assign_to :site
    should_assign_to :payments

    should_respond_with :success
    should "respond with csv" do
      assert_equal 'text/csv', @response.content_type
    end
  end # context - CSV for revenue report

Gotchas:

Don't forget to add config.gem!

I found I needed config.gem 'fastercsv' and a require 'csv_builder' in my environment.rb. You may need a gem-listing for both (esp if you're using bundler)

Rails 3 compatibility?

The original csv_builder plugin is not rails 3 compatible and is in fact no longer being supported. But it has officially changed hands, to the newer csv_builder gem. This is Rails 3 compatible and not backwards compatible - though the maintainer keeps a 2.3.X branch for people to submit bugs.

csvbuilder or csv_builder

While the plugin is named csv_builder, be careful to name your files with the extension: csv.csvbuilder or you'll spend hours pulling your hair out about a Missing template error while it fruitlessly searches for your csv.csvbuilder file!

All your report data must be generated in the controller!

This makes you more MVC-compliant anyway, but if you had any collections being fetched or formatted in the view... now's the time to move them into the controller action as your CSV-view will need them too.

duplicate inserted arrays

You'll notice that the example template has a set of data being added to the csv array... you'll also notice that each row is generated on the fly - then I save a *duplicated* version into the csv array. If you don't duplicate it... you may do funky things with overwriting the row each time. In my case I also needed to declare the temporary array outside the loop in order to preserve scope. YMMV - I'd appreciate any Ruby-guru answer as to why this doesn't work without that.
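For what it's worth, here's a plausible mechanism, demonstrated in plain Ruby: if the same array object ends up being pushed more than once (say, via clear-and-refill rather than fresh assignment) and is then mutated, every row in the collection aliases that one object - and dup snapshots it:

```ruby
# Pushing the SAME array object each iteration: both entries alias it,
# so the final mutation shows up in every row.
rows = []
row_data = []
["one", "two"].each do |value|
  row_data.clear
  row_data << value
  rows << row_data
end
rows            # => [["two"], ["two"]] - "one" is gone

# Pushing a duplicate each iteration: later mutation can't touch it.
rows_with_dup = []
row_data = []
["one", "two"].each do |value|
  row_data.clear
  row_data << value
  rows_with_dup << row_data.dup
end
rows_with_dup   # => [["one"], ["two"]]
```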

don't reset csv

Don't do this: csv = [] it breaks everything!
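The likely reason, sketched in plain Ruby: csv_builder hands your template a reference to its own array, so << mutates the object it will later read, while csv = [] merely rebinds the local name to a brand-new array the plugin never sees (variable names below are illustrative):

```ruby
shared = []        # stands in for the array csv_builder passes to the template
csv = shared

csv << ["row 1"]   # mutates the shared array - csv_builder sees this row
csv = []           # rebinds the local name only - csv_builder can't see this
csv << ["row 2"]   # lands in the new, orphaned array

shared             # => [["row 1"]] - "row 2" never reaches the output
```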

Rendering as html

I've noticed there's a pathological condition where everything is going well - it's even getting to the template... but it's *still* rendering html.

I eventually figured out that it's actually rendering the layout around the csv... and it occurs when you've explicitly stated a layout (eg layout 'admin') somewhere at the top of your controller. To fix it, you just need an explicit render with :layout => false, eg

  respond_to do |format|
    format.csv do
      timestamp = Time.now.strftime('%Y-%m-%d_%H:%M:%S')
      @filename = "revenue_report_for_site_#{@site.to_param}_#{timestamp}.csv"
      render :layout => false  # add this to override a specifically-stated layout
    end # revenue_report.csv.csvbuilder
  end

Tuesday, 28 July 2009

rails gotchas: undefined method `expects'

If you've just installed rails edge and run the tests, they may fail with: NoMethodError: undefined method `expects' for ... etc etc over and over again (along with a few other errors).

Install the mocha gem via rubygems to get rid of a lot of these errors.

It was the line NoMethodError: undefined method `stub' for ... that clued me in.

It seems that rails requires a short list of gems/packages (beyond rails itself) simply to run its own tests... yet there is no rake gem:install task available to help you figure out which ones... and they aren't in the "contributing to rails" guide. I'll be submitting a patch shortly...

Following on from this I got:

MissingSourceFile: no such file to load -- openssl

and farther down the list:

LoadError: no such file to load -- net/https

Unfortunately, these errors occur when Ruby hasn't been installed with openssl-support. If you're running ubuntu, you can just run apt-get install libopenssl-ruby.

Monday, 27 July 2009

40% speedup using Nokogiri!

Cut to the chase...

To cut your XML-processing time dramatically, sudo gem install nokogiri then add the following to config/environment.rb inside the Rails initializer.

config.gem "nokogiri"
ActiveSupport::XmlMini.backend='Nokogiri'

Back-story and pretty pics

The problem

So, our site makes heavy use of ActiveResource [1], meaning that most of our data is located remotely.

Not surprisingly, some of our pages are *really* slow, so I landed the task of speeding them up. Apart from page-caching (not possible), fragment caching (only helps on the *second* hit), or some complicated messy idea of data-caching locally (tedious and likely to be evil); my first thought was to reduce the number of network hits. Clearly that's a high pain point, especially on our heavy pages that have many resource fetches.

Before I dove into performance hacks and updating the business logic into twisty little data reuse-patterns for network-hit reduction... I decided to actually try profiling.

I've been setting up a ruby-prof and kcachegrind recently[2]... and figured I should at least give that a look-at to see if my assumptions are correct.

I'm really glad I did, because when I ran it over our heaviest action, I saw that all the highest-weight method-calls led back to some form of ReXML parsing.

Searching on the ReXML components showed that the heaviest ReXML method took up a whopping 1 million process cycles. Given that our total process-cycles came to 5.8 million, that's a significant chunk of time spent in that one library.

As I mentioned - our site makes heavy use of ActiveResource, and one *big* problem with ActiveResource is that all your objects are parsed and re-parsed as xml for every fetch of data... so, in hindsight, it's fairly obvious that our site would spend a *lot* of time in the XML-parsing library. Any speedup in that department would help us immensely.

The solution?

We've recently been to Rails underground, and one of the lectures[3] had a slide comparing the speed of ReXML to several other ruby XML-parsing libraries[4]. Nokogiri came out as a clear winner in the speed department. The loser was equally clear... that being the Rails-default: ReXML

So, switching out the library would be an obvious speed win.

As it turns out - it's really easy to do this. Just install the gem, and require it in your Rails initializer using the instructions at the top of this post

But did it really help?

It seemed faster... but can we prove it?

From ReXML to Nokogiri - 40% speedup

Yup.


Notes

[1] through the HyperactiveResource plugin
[2]I'll be giving a talk at LRUG on 12/08/2009 on how to use and interpret ruby-prof and kcachegrind
[3]During the talk by Maik Schmidt on Sneaking Ruby & Rails Into Big Companies
[4] I'm not sure, but it's possibly the one from this page comparing Ruby-XML performance benchmarks

Friday, 17 July 2009

Adding prompt to select_tag

Rails's select method is pretty advanced. You can pass in all sorts of funky options to format your selector nicely, including :include_blank => true and the related :prompt => 'Pick one or else!'

select_tag, however, seems to have fallen behind in the usefulness-stakes and implements neither of the above... even though it does an extremely similar thing.

So here's a quick-hack function you can drop into ApplicationHelper to implement that functionality for you.

  # override select_tag to allow the ":include_blank => true" and ":prompt => 'whatever'" options
  include ActionView::Helpers::FormTagHelper
  alias_method :orig_select_tag, :select_tag
  def select_tag(name, select_options, options = {}, html_options = {})
    # remove the options that select_tag doesn't currently recognise
    include_blank = options.has_key?(:include_blank) && options.delete(:include_blank)
    prompt = options.has_key?(:prompt) && options.delete(:prompt)
    # if we didn't pass either - continue on as before
    return orig_select_tag(name, select_options, options.merge(html_options)) unless include_blank || prompt

    # otherwise, add them in ourselves
    prompt_option  = "<option value=\"\">" # to make sure it shows up as nil
    prompt_option += (prompt ? prompt.to_s : "") + "</option>"
    new_select_options = prompt_option + select_options
    orig_select_tag(name, new_select_options, options.merge(html_options))
  end
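The interesting part of the helper is just string concatenation - a blank-valued option tag gets prepended to whatever options markup you pass in. Plain Ruby shows the result (the option strings here are hypothetical markup, not generated by Rails):

```ruby
# Hypothetical options markup, as you might get from options_for_select.
select_options = %(<option value="1">Red</option><option value="2">Blue</option>)
prompt         = 'Pick one or else!'

# Mirrors the helper above: blank value so the prompt submits as nil.
prompt_option  = "<option value=\"\">"
prompt_option += prompt.to_s + "</option>"

result = prompt_option + select_options
result.start_with?(%(<option value="">Pick one or else!</option>))  # => true
```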

Wednesday, 15 July 2009

Gotcha: UTC vs Local TimeZones with ActiveResource

So... your database is filled with datetime data and it's all configured to localtime, not UTC... We also have this you-beaut nifty ability to set all our datetime-handling functionality to a given timezone by setting, say: config.time_zone = 'London' in config/environment.rb... or do we?

If you also use ActiveResource (or the new, actually-working HyperactiveResource), you'll find that suddenly you're getting a UTC-local timezone issue once more.

The problem is that the xml that comes back from a remote API is converted into a Date/DateTime using the core-extension to the Hash.from_xml method... which has the following LOC:

"datetime"     => Proc.new  { |time|    ::Time.parse(time).utc rescue ::DateTime.parse(time).utc }

The fix

You need to do two things. Firstly. Hack that line[1] and replace it with:

"datetime"     => Proc.new  { |time|    ::Time.zone.parse(time) rescue ::DateTime.zone.parse(time) },

Secondly... somehow it doesn't pick up the timezone even though it's been helpfully added in via the config... so you need to open up config/environment.rb (or create a rails initializer) and put:
Time.zone = 'London'[2]
in there (outside the rails initialization block).

Notes:
[1]To hack rails, you can either
a) hack on your own rails gem = risky... will be overwritten the next time you sudo gem update or
b) rake rails:freeze:edge - which means you have your rails in your own vendor/rails directory... but means you have to rake rails:update manually... up to you which you hate more.

[2]Obviously substituting your own timeZone as appropriate here. See the TimeZone doc for what you can pass in.

Tuesday, 9 June 2009

acts_as_good_style

Originally published in 2007, this article is updated on a semi-regular basis and republished when I add something new...

Rails Idiom

There are a number of small pieces of rails idiom that contribute to better rails code. I've been collecting them at work as they come to me (often by reading other people's code). I know that a lot of this can be gleaned simply by reading the "Agile rails" book... but apparently this would actually take effort, and thus doesn't occur as frequently as I'd like. A lot more is just basic good common sense... but we all know how common that is. In any case, I'll probably put this up on my website soon and keep adding to it.

Note - I don't always have a justification for these things... as with everything - it's up to you to pick out the bits that work for you.

Basic good sense

Clarify your code

This is something that I learned from Joel Spolsky's article on making wrong code look wrong. He suggests that code smells should stand out a mile off. Doing so will make it much less likely that we'll miss one and leave a hole for a bug to form in.

He further suggests that we should change the way we code to systematically remove all habits (even benign ones) that muddy the waters. If we make it really obvious what is right and what is potentially wrong - then it'll be easier for us to discriminate between the right and wrong bits of the code. This follows from basic psychology. By analogy: it's easier to notice black when all you can see is black or white - but adding many shades of grey makes it harder to detect. If we make a habit of writing only white (or pale-grey)... it's harder for us to miss a black spot.

Joel goes on to discuss a technique for using Hungarian Notation (in the way it was originally intended) to increase visibility of safe/unsafe strings - which is meant to be just an example of one technique. (So don't take it as the main essence of his message.)

Many of the things I bring up in this Idiom-Guide are techniques that draw from this philosophy.

Don't comment-out dead code

Don't comment out code that's no longer needed - just delete it. It'll still be there in the SVN repository if we ever need it again.
Dead code just clutters up the project and slowly makes it less readable/maintainable. :(

Avoid Magic global variables where at all possible

The "don't use magic globals" argument has been around for a while, but here's the rails take on it.

The Rails equivalent of a magic global variable is the "@variable" (though I've also seen some nasty uses of the params hash). I know that instance variables are essential for passing data from the controller to the view. But they should not be relied upon for anything else! eg do not set them and assume they exist when calling methods on a model - or in a library.

From a given piece of code that uses said variable, you cannot tell where the variable originates. If it were simply passed as a parameter to the method, there would be an obvious line of command. Instead, you have to trawl through all the earlier functionality to figure it out or resort to a recursive grep on the rails directory.

This becomes a problem if you have to maintain the code. Given a magic global variable: Can I delete the value? Can I change it without affecting later code (that relies magically on its value)? If I run across a bug whereby it comes in with a value I didn't expect - how do I find which method set the value incorrectly?

On balance - it's much better to simply pass an explicit parameter through the usual method-call interface, so at least you have a chance of tracing the line of command in future.
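To make the contrast concrete, here's a tiny sketch - discounted_price and its rate are made-up names for illustration, not from any real project:

```ruby
# Hypothetical example: the rate arrives as an explicit argument, so its
# origin is obvious at every call site - no hunting for whoever set @rate.
def discounted_price(price, rate)
  price - (price * rate)
end

# The magic-global version hides the data flow:
#   def discounted_price(price)
#     price - (price * @rate)   # who set @rate? when? good luck tracing it...
#   end

discounted_price(100, 0.25)  # → 75.0
```

With the explicit version, a grep for discounted_price shows you every caller and exactly where the rate came from.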

Don't add ticket/bug comments into the code

I don't wanna have to wade through hideous piles of comments referencing some obscure (fixed) ticket/bug such as:

# DRN: ticket ARC256 -- remove currency symbol
# end ticket ARC256

It wastes time and eyeball-space. This stuff can conveniently go into the SVN log message - where it'll be picked up by the appropriate ticket in the "subversion comments" section. This way it'll only get read by people who actually care about the ticket/bug in question.

Alternatively - what we get is a buildup of comments referencing random tickets that are probably no longer relevant. Especially because the ticket numbers mean nothing when taken outside the context of our ticketing system.
Think about it this way: how long would we keep that comment in the code? When does it get removed? Whose job is it to remove the ticket comment - or do they all just build up until we have more ticket-comments than code? How unmaintainable!

Please just don't do it.


Ruby idiom

Don't use .nil? unless you need to

There are very few times when you are specifically checking for an actual nil result. Mostly you are just checking whether an object has been populated, in which case use "if foo" - don't bother with "if !foo.nil?"
This also brings me to:

Use unless where appropriate

Don't check if !foo instead use: unless foo

This is just the ruby-ish way to do things, and it reads better in many circumstances... given that code is meant to be read by humans, this is a Good Thing.
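A quick sketch of both points together (widget_status is a made-up method, purely for illustration):

```ruby
# Truthiness means `unless widget` already covers both nil and false,
# so there's no need for the noisier `if !widget.nil?`
def widget_status(widget)
  return "no widget loaded" unless widget
  "widget present"
end

widget_status(nil)          # → "no widget loaded"
widget_status(Object.new)   # → "widget present"
```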

Do not use and/or - instead use &&/||

The words "and" and "or" mean something quite different to the symbols: &&/||
Unless you understand *exactly* what you are doing - you should use the latter (&& / ||) instead.

Any expression that contains and/or is almost certainly inelegant and should probably be rewritten.
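The difference comes down to precedence. These throwaway methods (invented for this example) show the parse difference:

```ruby
def strict_ops
  a = true && false   # && binds tighter than =, so this is a = (true && false)
  a                   # → false
end

def loose_ops
  a = true and false  # `and` binds looser than =, parsed as (a = true) and false
  a                   # → true (!)
end
```

Same-looking line, opposite result - which is why mixing `and`/`or` into assignments is asking for trouble.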

do-end vs {} blocks

{} should be used for single-line blocks and do-end for multi-line blocks... this is a "taste" thing - but it seems to be fairly common practice now.
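For example (a throwaway doubling method written both ways):

```ruby
def doubled_short(list)
  list.map { |n| n * 2 }    # single line: braces
end

def doubled_long(list)
  list.map do |n|           # multiple lines: do-end
    m = n * 2
    m
  end
end

doubled_short([1, 2, 3])  # → [2, 4, 6]
```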

Use single-line conditionals where possible

if you're only doing one thing based on a conditional, use: foo_something if x
rather than:

if x
   foo_something
end

Obviously this doesn't work if you're doing more than one thing or when you have an else-clause, but it's useful for reducing space eg:
return render(:action => :edit) unless @successful
which also brings me to:

Nested function calls need parentheses

In a nested function call eg :
return render :action => :edit
Ruby can make a good guess at whether :action => :edit belongs to the outer "return" expression or is the first parameter of the "render" call... but it's not wise to rely on this and it will generate warnings. It's much better to parenthesise the parameters to any nested function call eg:
return render(:action => :edit)

However:

Don't parenthesise where unnecessary

When parentheses are not required for the sake of clarity, leave them out.
do_something foo, :bar => :baz


Basic Rails security

h( x-site-scripting )

In your view templates, surround all user-input data with the "h" function. This html-encodes anything within it - which will prevent cross-site scripting attacks.

Do this even if the source is trusted!

If you get into the habit of making it part of the <%=h opening, it will just become natural. An example:

  <p> Name: <%=h @object.name -%>
  <br />Description: <%=h @object.descr -%>
  <br /><%= image_tag image_url_for(h(@object.image.file_name)) -%> 
  <br /><%= link_to "Edit object", edit_object_path(@object) -%>
  </p>
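Under the hood, h is just html-escaping. You can see the effect with plain Ruby's CGI.escapeHTML, which behaves much the same way:

```ruby
require "cgi"

# A malicious value rendered without escaping would execute in the browser;
# escaped, it displays as harmless text instead.
CGI.escapeHTML("<script>alert(1)</script>")
# → "&lt;script&gt;alert(1)&lt;/script&gt;"
```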

:conditions => ['protect_from = ?', sql_injection]

Most SQL queries are built up by Rails automatically. But adding conditions to your finds or your paginators brings in a slight, but real possibility of introducing sql-injection attacks.
To combat this, always pass everything in as variables. eg:

:conditions => ['name like ? AND manager_id = ? OR is_admin = TRUE', 
@user.name, @user.id]

Generators, Plugins and Gems... oh my!

Generators

use scaffold_resource

When you create a new model, generate a scaffold_resource, even if you aren't going to use the view pages.
It generates wonderful tests as well as your model and migration for free.

It's worth it!

It's also worth it to go through the hassle of entering all of the attributes on the command-line. Again because it generates tests, fixtures, migrations, views etc for you.
For usage: script/generate scaffold_resource --help

-x and -c

If you pass the "-c" flag to a generator (script/generate whatever), it will automatically svn add the generated files. This isn't necessary, but it's a really useful thing so you don't have to hand-add every file individually.

There is a similar flag (-x) for when you install plugins. This will automatically add the plugin to the project via svn:externals.

Plugins

Don't use a plugin unless it does exactly what you want

It's so easy to write functionality in Rails that there's no point in using somebody else's plugin unless it does *exactly* what you want it to do.
Especially don't bother using a plugin if you have to hack it to pieces and pull out half the useful functionality. You are just creating a maintenance nightmare in your future.

That being said, if a plugin has half the functionality you need... it's a great place to learn the bits you need to use.


DB and migrations

Use migrations

But you weren't going to hand-hack the db anyway were you?
If you don't use migrations, then capistrano can't deploy the changes automatically. That only means more work for you - and means we can't auto roll-back if something horrible goes wrong.

Don't go back and edit old migrations!

I now have a policy that states "migrations only go forwards". This comes from several bad experiences where a developer (sometimes me) has gone back to edit an old migration and totally locked up the system for everybody else.

You may know that a deployment requires you to "migrate down then back up again". But sometimes that either gets forgotten, or is actually harder to do in practice than in theory eg if you're using capistrano to auto-deploy your site, it's harder to get it to do twisty migrations than just to "move forwards one".

Besides which, it's a hassle for you to tell everyone on the dev team "before you do an svn up, migrate down to VERSION X, then svn up, then migrate back up again"... what if somebody doesn't want to lose their existing data? What if they're an svn-junkie and have already done an svn up before you get to them? What if somebody is not in today and you have to catch them before they come in tomorrow?

It's way too easy to get your bodgied-up back-hacked migrations into a wedged state where a new person cannot back up or go forward, and often cannot even start from scratch (eg from a fresh copy of a database). This becomes a serious problem for you if you ever have to change servers (or set up a new testing server or whatever) and have to run all the migrations from 0 to get the db ready.

Even if it all runs as you expect, it's three extra steps to remember instead of just "rake db:migrate" - which raises the spectre of Human Error.

It's annoying to make it work, prone to error and bloody awful to fix when it breaks. All in all much easier just to write a new migration to change the existing stuff.

Extra migrations don't actually hurt you - they really don't take up that much more space in your SVN repository - and they're worth keeping around for the knowledge that anyone, anywhere will have a completely up-to-date system even if they start from complete scratch.

Add messages to data migrations

All of the basic ActiveRecord migrations tend to spit out their own status messages. Data migrations don't, and it's often useful to know what your migration is doing - especially if you've got a long data migration that would otherwise sit there in silence, looking like it's hung your machine.
So drop in a few status messages eg say "migrated #{orders.size} orders, moved widgets over to new wodget style"

Don't add excessive status messages to migrations

Obviously the above messaging can go too far. Don't write out a line for every piece of data you migrate or you'll get heaps of scrolling text rather than an idea of which migration we're up to - that useful data can get lost in the noise. Just a status message showing how much has been done or what stage we're up to in the migration will do.

Use "say" in migrations

Instead of using "p" use "say" - it gets formatted nicely.


MVC-specific idiom

If you are unsure about what MVC is all about - I suggest you do a bit of basic research on the subject. It's really important to keeping the code clean in a Rails site. Rails is completely founded on the principle of keeping the MVC layers separate - if you begin tying them together, you run the risk of causing the code to become brittle - losing the flexibility that is the reason why you're using Rails in the first place.

Basic MVC principles

Keep view code out of the model

Your model should be independent of your views. A widget doesn't need to know whether it's being displayed as html, XML or via a screen-reader. It is an independently-addressable object that contains "data and some knowledge of how to manipulate that data".

Despite this I consistently see things like methods named "show_edit_link" which apparently tell the html whether this model's edit link should be visible on the html list page. That sort of logic should be in the view only. The model can have methods that, say, determine if they should be "editable" by the current user... and the code that determines if the edit link is displayed could check that method, but should remain inside the view.

If there is common code in your views - then it can be pulled into the helper file for that model. If it's used across multiple models, then it should go in application_helper.rb. It should never appear in the model.

The other common booboo I see is "display" functions that include html (eg "format field X to display nicely"). This is also very wrong! It means that, later, when you want to change html styling across the entire site you have to dig around inside every file in the site to change it... rather than simply changing the views - which is How It Should Be. From having to do just such a nightmarish site-upgrade, I can tell you that the fewer files you need to change - the easier your life will be in the long run.

All view code should remain in the views - with commonality pulled out into helpers.

Data manipulation code should go in the models - not views or controllers

Ok, so you have a report that you want to display that pulls data out of your widget... and your report view pulls it out bit by bit and does some calculations on it - then displays it.

So what happens when your client asks you to now do that report in PDF format... or CSV as well?

If all your data-manipulation methods are on the model itself, then you only need to pass the model object into the view and the view can then pull the data out in the same way for each view style. The alternative is to have a messy set of views/controllers where the same code is duplicated, violating the DRY principle and adding to your future maintenance headaches.

All data-manipulation code should remain on the model that owns the data.
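As a plain-Ruby sketch - the Widget class and its sales figures here are invented for illustration:

```ruby
# The calculations live on the model, so the HTML, PDF and CSV views can
# all call the same methods instead of each re-implementing them.
class Widget
  attr_reader :monthly_sales

  def initialize(monthly_sales)
    @monthly_sales = monthly_sales
  end

  def total_sales
    monthly_sales.inject(0) { |sum, n| sum + n }
  end

  def best_month_sales
    monthly_sales.max
  end
end

widget = Widget.new([10, 40, 25])
widget.total_sales       # → 75
widget.best_month_sales  # → 40
```

When the client asks for the CSV version of the report, the view just calls total_sales again - no duplicated arithmetic.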

Model idiom

Group relationship commands at top of model

One of the most important and useful things to know about a model is what it is related to - eg the "belongs_to"/"has_many" (etc) relationships.
If these calls get scattered throughout the model object, they will get lost amongst the noise and it will be hard to tell whether a given relationship has been declared...
Thus it's Good Practice to group them all together - and to put them right at the "top" of the class. eg:

class Widget < ActiveRecord::Base
  belongs_to :widget_class
  has_many :widget_parts
  has_one :pricing_plan

  #... rest of the model goes here
end

IMO this goes also for composition objects, "acts_as_<whatever>" declarations and auto_scope declarations - all of which declare the "shape" of the model and its relationships with other objects.

Group validations

The second-most important group of information about a model is what consists of valid data for the model. This will determine whether or not the data is accepted or rejected by ActiveRecord - so it's useful to know where to find it. So again, do not scatter it all about the model - put it near the top... preferably just below the "relationship" data. EG:

class Widget < ActiveRecord::Base
  belongs_to :widget_class
  has_many :widget_parts
  has_one :pricing_plan

  validates_presence_of :widget_class, :pricing_plan
  validates_numericality_of :catalogue_number, :only_integer => true
  def validate
     errors.add_to_base "should have at least one part" if widget_parts.blank?
  end

  #... rest of the model goes here
end

Validations should obey the DRY principle...

Please save my eyeballs from having to read the same thing over and over, and obey the DRY principle. You can write:

# please use:
validates_presence_of :field_one, :field_two, :field_three, :field_four
# instead of:
validates_presence_of :field_one
validates_presence_of :field_two
validates_presence_of :field_three
validates_presence_of :field_four

You can also use "validates_each" if you have several things that perform ostensibly the same validation... but it isn't an included one. eg:

# please use:
validates_each [:field_one, :field_two, 
                :field_three, :field_four]  do |model, attr, val|
  model.errors.add(attr, "should be a positive number") unless val.nil? || 0 <= val
end
# instead of:
def validate
  errors.add(:field_one, " should be a positive number")  unless field_one.nil? || 0 <= field_one
  errors.add(:field_two, " should be a positive number")  unless field_two.nil? || 0 <= field_two
  errors.add(:field_three, " should be a positive number")  unless field_three.nil? || 0 <= field_three
  errors.add(:field_four, " should be a positive number")  unless field_four.nil? || 0 <= field_four
end

Read the section on validations in the book!!!

There are lots of really useful standard validations available in Rails... PLEASE check for whether one exists before writing your own hack into "def validate"... you might be surprised at how many are provided absolutely free of charge!
One of the most useful being "validates_each"... so you don't have to repeat the same thing over and over and over... remember that DRY principle?[1]

Another pair worth knowing is "validates_inclusion_of" and "validates_format_of" - both can validate several fields at once with a consistent error message, eg checking that a percentage falls within the range (0-100), with the format version allowing an optional percentage sign:[2]

  validates_inclusion_of [:my_percentage, :your_percentage], :in => 0..100, 
                       :message => "should be a percentage (0-100)"
  validates_format_of [:my_percentage, :your_percentage], :with => /^(\d{1,3})%?/, 
                       :message => "should be a percentage (0-100)"

If a validation doesn't refer to a single field... put it on base.

When it refers to more than one field, feel free to use

errors.add_to_base <whatever>
don't do something silly like: errors.add(nil, ...)
That being said - if you can squeeze it onto a field, please do - as it really helps the user to quickly find where the problem is in their form.

Give the poor users some useful feedback when validating

Don't tell them "Your weights should add up to 100"... tell them "your weights add up to 82%, your weights need to add up to 100% to be valid". It actually helps them see exactly what's missing.
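For instance, a hypothetical message-builder in that spirit (weights_error_message is not a real Rails API - just an illustration of computing the specific feedback):

```ruby
# Report what the weights actually add up to, not just the rule they broke.
def weights_error_message(weights)
  total = weights.inject(0) { |sum, w| sum + w }
  return nil if total == 100
  "your weights add up to #{total}%; they need to add up to 100% to be valid"
end

weights_error_message([50, 32])  # → "your weights add up to 82%; ..."
weights_error_message([60, 40])  # → nil (valid)
```

The user immediately sees they're 18% short, rather than re-adding the numbers themselves.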

When you update your validations, update your fixture data

This one has got me a couple of times. The problem is that fixtures are loaded into the database without going through validations - so your test data could be wrong - even when rake test passes flawlessly :(

Name your callbacks and declare them as handlers

If you have a number of callbacks (eg before_save/after_create hooks), it's better if you name them as a method, and just refer to them near the top of the model. eg:

# instead of:
class Widget < ActiveRecord::Base
  belongs_to :widget_class
  has_many :widget_parts
  has_one :pricing_plan

  validates_presence_of :widget_class, :pricing_plan
  def before_save
     # adds default pricing if none was specified
     self.pricing_plan ||= PricingPlan.find_default
  end
  #rest of model here
end
# try using:
class Widget < ActiveRecord::Base
  belongs_to :widget_class
  has_many :widget_parts
  has_one :pricing_plan

  validates_presence_of :widget_class, :pricing_plan
  before_save :add_default_pricing

  # handler to add default pricing if none specified
  def add_default_pricing
     self.pricing_plan ||= PricingPlan.find_default
  end
  #rest of model here
end

This example is too simple to easily show it, but there are a number of reasons why this is a good idea.

  • Firstly, you've named the method, which gives you an idea of what it's for = easier to maintain.
  • Secondly, if you need to add multiple of the same hook, you don't need to mash them all into the same callback method - which also aids readability.
  • Thirdly, by keeping each call as a single line, you can list them all near the top of the model class without taking up heaps of room (especially annoying in the case that the methods start getting really long). This helps you to quickly see what is and is not being done, without having to read through the details of the methods... or, even worse, seeing the first "after_create" hook and thinking it is the only one, when another is hidden later in the class.
  • Fourthly, if the method ever gets more complicated, or has to change, your method is encapsulated - the call to the method is independent of any other callback handlers of the same type.

Use Rails helpers for simple find calls

Rails comes with lots of helpful convenience functions. Read the book and learn about them! The find helpers are a good example. Something like the following is an excellent candidate.

  # instead of
  thing = Widget.find(:first, :conditions => "name='wodget'")
  # use the find_by_<whatever> method
  thing = Widget.find_by_name('wodget')

It shortens the code and makes it clear exactly what you're trying to find.

View idiom

Use -%>

Make sure you put a minus sign before closing your %>. This reduces whitespace in the resulting html - which makes it easier for us to read when we need to diagnose a display problem. eg use

<%= render :partial => :foo -%>
<% if some_condition -%>
   <div id="bar">
       <%= bar_fields -%>
  </div>
<% end -%>

h( x-site-scripting )

See this topic in the RailsSecurity section under XSiteScripting

Apply the DRY principle

Templates can have common code too - even across resource-boundaries.
Eg pagination links, or a common header/footer. Common view code should be factored out into a partial template and can be shared by a single resource-type, or across multiple resources.

Shared partials go in /app/views/shared

If a partial is to be used in a common layout, or across multiple resources (eg a common header for all resources reached after login), then the partial should be stored under the "/app/views/shared" directory.

Suppose you have a partial named "_common_footer.rhtml" in the shared directory. You can access it by referring to the partial thus:

<%= render :partial => 'shared/common_footer' -%>

Don't write a string inside a code snippet

This is silly: <%= " | " -%>
In a template, you can just type the pipe (or whatever) without the extra FrouFrou around it.

Controller idiom

Never, never, never use @params or @flash

These were deprecated some time ago. Instead use the similarly named methods: params or flash
Incidentally - you should be reading and taking heed of the warnings that the script/server is giving you about these things...

redirect when moving on

When you are moving on to another stage in the process, you should use redirect. eg you have successfully updated a page and are now moving to the index page.
When you are coming back to the same page, you can use render (eg the user entered an error and you are redisplaying the 'edit' page).

The reason we do this is that a redirect will change the URL in the address bar to the new page. If you just render a different page-action, the user will still have the URL of the previous page, even though they are now looking at a different page. This is Bad because if the previous page was, say, a form-POST... and the user clicks "refresh"... Bad Things will happen. Whereas if you redirect, then the user will have the URL of the new page and it's perfectly fine for them to hit refresh to get the page they're on.

Make private methods obvious!

Private controller methods aren't visible as actions. It can be a problem if somebody doesn't notice your little word "private" and starts to add methods below it without realising - and then wonders why Rails can't find their shiny new action.

So make sure that it's obvious and difficult to make the mistake.

Firstly, put private methods only at the bottom of the file!
Why make people wade through spaghetti code with continual calls to private then public then private again... having to figure out exactly what level of visibility applies right now? Sort them and put the private ones all together at the bottom.

Secondly, use a line like this:

    private ###################################################################################

To make it painfully obvious to all who follow that below this line thar be dragons...


Testing idiom

All testing

Learn your basic assertions

Another "please read the book" request - there are lots of useful assertions already written for you. Using the appropriate ones makes your life easier as the error messages are more appropriate/meaningful for your test case. I especially find it irritating to see things like assert foo == bar when assert_equal not only has a better error message, but also means you're unlikely ever to accidentally use only one '=' - causing the test to silently pass when it should have failed.

Put the expected value first in assert_equal

The error message reads better that way.

Don't cram multiple tests into a single test case

Do not write generic tests with a dozen things to test all in a single test case.
Example shown below. Note - the real code example I took this from actually had another 6 tests crammed into it...

There are (at least) two reasons for not doing this:

Firstly - if the first test here breaks... it doesn't keep running the other tests. If it breaks, you can only tell that the first test isn't working... if you split them out into individual tests, rake will keep running the other tests - this is extremely helpful as you will then know whether all the validations are failing - or just that one.

Secondly - naming tests to mean what they're testing actually helps you find out what's going on when it breaks. A generic "test all sorts of stuff" name doesn't help you much, but a "test that widgets can't be created if the size is negative" does. You won't be thinking about this when you're writing that test case (and are fully immersed in the code you're testing), but six months later when something breaks your test... you will want to know what it's actually breaking and every bit of help (no matter how small) counts.

# don't use:
  def test_create_should_not_accept_invalid_parameters
    assert_no_difference Widget, :count do
        wid = create_widget(:vat_percentage => -1)
        assert wid.errors.on(:vat_percentage)
    end

    # invalid price range
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_min => -1)
        assert wid.errors.on(:price_range_min)
    end
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_max => -1)
        assert wid.errors.on(:price_range_max)
    end
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_max => 100, :price_range_min=>200)
        assert wid.errors.on(:price_range_max)
    end

    # invalid size range
    assert_no_difference Widget, :count do
        wid = create_widget(:size_min => -1)
        assert wid.errors.on(:size_min)
    end
    assert_no_difference Widget, :count do
        wid = create_widget(:size_max => -1)
        assert wid.errors.on(:size_max)
    end
    assert_no_difference Widget, :count do
        wid = create_widget(:size_max => 100, :size_min=>200)
        assert wid.errors.on(:size_max)
    end
  end
# instead try:
  def test_create_should_fail_on_negative_percentage
    assert_no_difference Widget, :count do
        wid = create_widget(:vat_percentage => -1)
        assert wid.errors.on(:vat_percentage)
    end
  end
  def test_create_should_fail_on_negative_min_price
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_min => -1)
        assert wid.errors.on(:price_range_min)
    end
  end
  def test_create_should_fail_on_negative_max_price
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_max => -1)
        assert wid.errors.on(:price_range_max)
    end
  end
  def test_create_should_fail_on_backwards_price_range
    assert_no_difference Widget, :count do
        wid = create_widget(:price_range_max => 100, :price_range_min=>200)
        assert wid.errors.on(:price_range_max)
    end
  end
  def test_create_should_fail_on_negative_min_size
    assert_no_difference Widget, :count do
        wid = create_widget(:size_min => -1)
        assert wid.errors.on(:size_min)
    end
  end
  def test_create_should_fail_on_negative_max_size
    assert_no_difference Widget, :count do
        wid = create_widget(:size_max => -1)
        assert wid.errors.on(:size_max)
    end
  end
  def test_create_should_fail_on_backwards_size_range
    assert_no_difference Widget, :count do
        wid = create_widget(:size_max => 100, :size_min=>200)
        assert wid.errors.on(:size_max)
    end
  end

Group tests in a meaningful way

Test files start out with only a handful of tests... but this rapidly changes. Very soon we get vast numbers of tests covering all sorts of actions.
To reduce the potential maintenance nightmare, group related tests together and put in a comment to indicate why they are grouped together. Any new test should be either put with related groupings or a new grouping created for it.

As an example: the most common set of tests revolves around resource CRUD - in which case it makes sense to group tests with related CRUD actions (eg "edit" and "update") near one another.
Another example is the "user" object. Users go through stages (the process of creating a new user, the activation process, CRUD for an ordinary/active user, then destroying/reviving a user), so tests for these stages are grouped together.

If you're using rspec or shoulda, you are provided with a natural grouping mechanism (describe and context respectively). Use that to your advantage and make your future life easier.

Use named fixture records

Don't use bare fixture ids in your tests (eg get :edit, :id => 1), instead use named references (eg @quentin.id).

This is a bit of a gotcha as the automatically-generated tests often use bare ids. However, these should be avoided where at all possible.

There are three main benefits:
Firstly, it makes the code more readable, with fewer magic numbers and more meaningful variable names.
Secondly, if we must reorder the fixture records (and thus change the ids) - at least we know which record we really wanted and we don't need to scour the tests looking for the ids to change.
Thirdly - if you're using auto-generated ids (rather than specifically passing them in) you won't know the id - and assuming that it's 1 sets you up to fail.

Similarly, don't use magic strings - especially for things like testing login/authentication. This will eventually bite you. It's much better to specify @quentin.login, @quentin.email (or whatever) so you don't have to go through your tests updating at a later time.

Don't forget to reload

The number of times I've been caught out by this... when you change something and are testing that it got saved into the database correctly, make sure you use @foo.reload. It's especially important if you're checking that you get the right validation errors on your ActiveRecord... you then want to test that it *didn't* change in the database too.

Common functions can be made into methods

If you find yourself performing the same action over and over, don't forget that you can DRY it up by putting it into a common function at the bottom of the test file.

  def setup
    @default_widget_hash = {:name => 'My widget', :size => 42, :wodget => wodgets(:my_wodget)}
  end
  def test_should_create_widget
    assert_difference Widget, :count do
      w = create_widget
      assert !w.new_record?, "should not have errors: #{w.errors.full_messages.to_sentence}"
    end
  end
  def test_widget_should_require_name_on_create
    assert_no_difference Widget, :count do
      w = create_widget(:name => nil)
      assert w.errors.on(:name), "Should have required name"
    end
  end
  def create_widget(options = {})
    Widget.create(@default_widget_hash.merge(options))
  end

Really common tests can be made into assertions

If you find yourself testing the same thing in multiple unit/functional tests, you can DRY that up too by putting it into the test helper as a new assertion. Don't forget to name it "assert_X" (where X is descriptive).

Make sure you name it starting with "assert_". This convention allows you to see which methods will have assertions in them (and thus possibly cause your test to fail), as opposed to general methods that simply set-up/tear-down information.

  # Assert that the handed object (thing) has no errors on it.
  def assert_no_errors(thing, msg = "")
    fail("should have passed an object") unless thing
    e = thing.errors.full_messages
    assert e.empty?, "Should not have error: #{e.to_sentence} #{msg}"
  end

Unit tests

Test your CRUD actions

It may seem like overkill to test CRUD here *and* in functional tests, but this is the place to test validations and anything else that affects the basic CRUD actions (eg if you have a method that stops your Widget from being deleted after it's been submitted for approval... then you need to try deleting a submitted one and make sure it fails!).

If you only leave it up to your controller tests to check this stuff, then you are getting your data back filtered through one more layer of abstraction and will have to sort out whether it's a model-bug or a controller-bug. It's better to be sure of your model before adding an extra layer on top. It really makes your life easier in the long run!
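As a sketch of that "can't delete after approval" example - this `Widget` is a hypothetical plain-Ruby stand-in, no database involved - the unit test's job is to poke at both sides of the guard:

```ruby
# Hypothetical stand-in for a model that refuses to be destroyed
# once it has been submitted for approval
class Widget
  attr_accessor :submitted

  def destroy
    return false if submitted # business rule: submitted widgets are locked
    @destroyed = true
  end

  def destroyed?
    !!@destroyed
  end
end

# happy path: an unsubmitted widget can be deleted
w = Widget.new
w.destroy
w.destroyed? # => true

# unhappy path: a submitted widget must refuse
locked = Widget.new
locked.submitted = true
locked.destroy
locked.destroyed? # => false
```

If you only test the happy path, the guard could silently stop working and every test would still pass.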

Sadly scaffold_resource no longer writes these tests for you. :P

Test your validations

Each validation should be tested by trying to create your model object with that one set to nil. Trust me, it's worth it in the long run!
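The pattern boils down to: start from a known-valid attribute hash, nil out one required attribute at a time, and assert each variant fails. A plain-Ruby sketch (the hand-rolled `valid_widget?` stands in for real ActiveRecord validations):

```ruby
# Hand-rolled stand-in for validates_presence_of - one check per attribute
REQUIRED_ATTRS = [:name, :size]

def valid_widget?(attrs)
  REQUIRED_ATTRS.all? { |key| !attrs[key].nil? }
end

default = { :name => 'My widget', :size => 42 }

valid_widget?(default) # the baseline hash must be valid to start with

# nil out one required attribute at a time; each variant should fail
REQUIRED_ATTRS.each do |attr|
  broken = default.merge(attr => nil)
  valid_widget?(broken) # => false for every required attribute
end
```

The `create_widget(:name => nil)` helper pattern shown earlier is exactly this, with the loop unrolled into one named test per attribute.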

Functional tests

Test your validations

Create a functional test for each validation - again, try to create an object, nulling out the attribute-under-test. I know this seems to double up on the unit tests - but what you're testing here is that the model's error on the required attribute correctly makes it back through the controller. This is what will be displayed on the user's screen - so they can fix the error before moving on. You also need to test that the response still renders the "edit" template. If it doesn't, then all the ActiveRecord Errors will be in vain, as your user will never get to see them.
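The shape of the thing under test, sketched without Rails (this hash-returning `WidgetsController#create` is a hypothetical stand-in for a real controller action):

```ruby
# Hypothetical controller-ish sketch: on validation failure, stay on the
# edit template and hand the errors back so the user can fix them
class WidgetsController
  def create(params)
    errors = {}
    errors[:name] = "can't be blank" if params[:name].to_s.empty?

    if errors.empty?
      { :template => :show, :errors => {} }      # success: move on
    else
      { :template => :edit, :errors => errors }  # failure: back to the form
    end
  end
end

response = WidgetsController.new.create(:name => nil)
response[:template]      # => :edit - the user is still on the form
response[:errors][:name] # => "can't be blank"
```

The functional test asserts both halves: the errors came back, *and* the user is still somewhere they can see them.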

Test authentication-levels

If you have different levels of authentication (eg administrators get to see everyone's stuff, but you can only see your own stuff) then you must test both ways... test that you can see what you should see, but also test that you can't see stuff that you shouldn't see! You don't want your users peeking into your admin accounts just because you missed the "unhappy path" tests.
...and don't forget to test what happens when you're not logged on too.
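A sketch of the "both ways" idea, in plain Ruby (the `visible_records` helper and the admin-sees-everything rule are hypothetical - substitute your own authorisation logic):

```ruby
# Hypothetical visibility rule: admins see everything,
# ordinary users see only their own records
Record = Struct.new(:owner, :body)

def visible_records(records, user, admin: false)
  return records if admin
  records.select { |record| record.owner == user }
end

records = [Record.new("alice", "a's stuff"), Record.new("bob", "b's stuff")]

# test what you CAN see...
visible_records(records, "alice").map(&:owner) # => ["alice"]

# ...but also test what you CAN'T see
visible_records(records, "alice").none? { |r| r.owner == "bob" } # => true

# ...the admin path
visible_records(records, "alice", admin: true).size # => 2

# ...and the not-logged-in path
visible_records(records, nil).empty? # => true
```

The second assertion is the one people forget - and it's the one that catches the leak.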

Tests need more than just "assert_response :success"

Unfortunately, "assert_response :success" only checks that *something* happened - not that it was the right something.

To be more precise, "assert_response :success" asserts that Rails successfully returned an "HTTP 200" response (as opposed to a 404 or a redirect). It does not check that you received the response you were expecting (though I wish it were that simple) - you have to check not only that a response came back, but that it was the correct response with the correct variables etc.

In the case of a "get" request - you should check that we are rendering the correct template, and that the "assigns" variables have been correctly populated.
In the case of a request that does something - check the state afterwards to make sure it changed in the way that you expected (eg that the record's attributes actually were updated in the database).
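In Rails terms this means pairing "assert_response :success" with checks like assert_template and inspecting the assigns - but the underlying point can be sketched without Rails at all (these fake response hashes are illustrative only):

```ruby
# Two fake "responses" - a 200 status alone can't tell them apart
right_page = { :status => 200, :template => :index, :assigns => { :widgets => [1, 2, 3] } }
wrong_page = { :status => 200, :template => :error, :assigns => {} }

# what assert_response :success effectively checks
def success?(response)
  response[:status] == 200
end

success?(right_page) # => true
success?(wrong_page) # => true - passes, yet it's the wrong page!

# the checks that actually catch the bug
def correct_index?(response)
  response[:template] == :index && !response[:assigns][:widgets].nil?
end

correct_index?(right_page) # => true
correct_index?(wrong_page) # => false
```

A status code is necessary but nowhere near sufficient - the template and the assigned variables are what tell you the *right* something happened.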


Notes

[1] Why is it that the DRY principle is the one I seem to have to harp on the most? Repeating it endlessly reaches a pinnacle of irony.

[2] check my earlier post on "validates_percentage_of" for this example in context.