Now I'm not 'new new' to git anymore; I've been using it off and on for at least a couple of years, but I've been using cvs/svn for many more years than that. git has a steep learning curve[1], and I have found the git-svn crash course invaluable. But there was one thing missing:
The #1 thing I wish I had learned when I was new to git:
...is that you must commit to git *far* more frequently than you do to svn.
From my thrashing about, I have discovered there are many more 'dangerous' commands in git than in svn, and it's really easy to get yourself into a 'stuck'[2] state.
git actually provides a whole set of tools that will help you get back out of whatever hole you've dug yourself into... but you're likely to end up having lost your working-copy changes[3] along the way. So the best practice now is to commit often - so that everything is in the repository: there is never a big working copy to lose.
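As a rough illustration of why committed work is hard to lose, here is a minimal sketch run in a throwaway repository. The directory, file name and commit messages are made up for the example; the point is only that a commit thrown away by a 'dangerous' command can still be found through git reflog, whereas uncommitted working-copy changes cannot.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.email "you@example.com"
git config user.name "You"

# Two small commits instead of one big working copy.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first small commit"
echo "v2" >> notes.txt
git commit -q -am "second small commit"

# A 'dangerous' command: discard the latest commit entirely.
git reset -q --hard HEAD~1
after_reset=$(cat notes.txt)

# The discarded commit is still recorded in the reflog...
git reflog | head -n 3

# ...so we can move back to where HEAD was before the reset.
git reset -q --hard 'HEAD@{1}'
after_recover=$(cat notes.txt)
```

After the last command notes.txt contains both lines again. Had the second line never been committed, the first reset --hard would have destroyed it for good.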
I had a lot of trouble with that as it considerably changes the workflow I'd built up over the last decade or so. From cvs to svn and then adding agile on top, my workflow is now roughly:
- Checkout a fresh, clean copy of the repository (or svn up to achieve same effect)
- add your tests and make some changes
- check the rest of the tests still run - make changes until they do
- do an svn st to see if there are any files I've forgotten to svn add - add them
- do an svn diff and see all my changes - eyeball them to make sure there's nothing obviously wrong
- do an svn up to pull in latest changes by others and make sure the tests *still* pass
- svn commit
So at any given time, nothing would be checked in - all of it just sitting in the local working copy until I was sure it was ready.
The problem is that, on the surface, git appears to support this workflow perfectly. All the svn commands described above have git equivalents (they're even called the same thing) and so you can (supposedly) transition smoothly over to git with only minimal effort. Even adding a branch, rather than hacking on master, is not too far a departure from the svn-style workflow, as branching is familiar in svn, and git just gives you a breezier, easier interface.
So where does it break down?
Well, in my case, usually at the second-last point. A git pull can completely mess you up if you get to a merge-conflicted state. You can't commit your working copy because of the merged state, and you often can't even properly diff because you've got a mush of the git pull's changes plus your own changes and no way to tell which is which. And there seems to be no obvious way to 'just commit the merge-conflict changes' or update the files that are conflicted and just *tell* git that they're not conflicted anymore... the way you can in svn. So at this point you're screwed.
What makes it worse is that at this point you often don't know exactly what commands you ran to get here - if you're anything like me, you've probably tried a whole bunch of stuff only partly understanding exactly what it does. Each command simply tells you in its own way that you can't do that. You can look up what you're supposed to do to fix it - but generally find that's just another command that tells you that you can't do it either. So you feel like a truck that's stuck sideways in a narrow alley and can't even understand how it got here, let alone how to get itself back out.
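For what it's worth, the closest git equivalent of svn resolved is git add: once you have edited a conflicted file, adding it marks it as resolved, and a plain git commit then concludes the merge. The sketch below reproduces a conflict in a throwaway repository with git merge (git pull is a fetch followed by a merge, so the conflicted state is the same kind). The branch and file names are invented for the example, and it assumes your own changes were committed before merging.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.email "you@example.com"
git config user.name "You"

echo "original" > file.txt
git add file.txt
git commit -q -m "base"

# Somebody else's change, on another branch.
git checkout -q -b theirs
echo "their change" > file.txt
git commit -q -am "their change"

# My own change to the same line, committed on master.
git checkout -q master
echo "my change" > file.txt
git commit -q -am "my change"

# The merge conflicts and exits non-zero; '|| true' keeps 'set -e' from aborting.
git merge theirs >/dev/null 2>&1 || true
status_during=$(git status --short)       # the conflicted file is listed as "UU"

# Resolve by editing the file, then tell git it is resolved.
echo "merged by hand" > file.txt
git add file.txt                          # roughly 'svn resolved'
git commit -q -m "Merge branch 'theirs'"  # concludes the merge

conflicts_left=$(git diff --name-only --diff-filter=U)
```

Because both sides were committed before the merge, nothing here was at risk: either version of file.txt could have been recovered from its commit at any point.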
Frustrating!
Underlying that is the, quite reasonable, fear that you may lose all your work[3] since your last commit...
and of course that's because we're used to the underlying 'don't commit until' mentality that we may not even be aware we are sporting.
don't commit until (perfect)
The workflow I described above is a perfect example of this mentality. It makes sense to hold back on committing anything until it all works. After all, you know that the moment you commit, the CI server will pull all your changes and let everybody on the team know that you just broke the build (again). So eventually you adopt a "don't commit until the tests pass" workflow, and keep everything in your working copy until everything's green before committing to the svn repository. Fostering this "don't commit until it's right" mentality is a natural consequence of not wanting to look like an idiot to your colleagues, and it fits in perfectly well with the svn-based workflow.
but git doesn't work that way!
Or should I say that git doesn't *need* to work that way. After all, you still need to make sure that your tests pass and don't break on the CI server... but what I've found is that you need to get over the whole git commit thing. It may be named the same thing as svn commit - but it doesn't carry the same consequences (e.g. that your colleagues will all see that the feature's only half-complete and the tests are all spazzing out).
Instead, change the way you think: the command that *matters* now is actually git push. You can commit whatever you like to your local repository, even if it breaks the build; it's only when you push up to the remote repo that it must be working perfectly.
Any other problems with this?
Unfortunately, there are some other consequences to this small change in workflow. One of them being the fact that a plain 'git diff' doesn't cover all your changes since the last push. Used without arguments, git status and git diff are *just* like svn status and svn diff - they only show what has changed since the latest commit, not the latest 'push to remote', which means it's hard to do a complete check of all your changes before going 'live'... you have to just trust that all your smaller commits add up to the Right Thing.
That makes me feel uncomfortable as I like to be sure. I know about human error - and I know that I'm as prone to it as the next guy...
Having to make a patch against master and then read through *that* (which is far less clear to read than a diff) is not a good substitute, IMO. If anybody has a good way to adapt the workflow to accommodate this I'd love to know.
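One possible answer, sketched here in a throwaway setup: git diff accepts two commits to compare, so diffing the remote-tracking branch against HEAD gives a single combined diff of everything committed since the last push. The repository layout, file name and commit messages are hypothetical, and the remote is assumed to be called origin with a master branch.

```shell
#!/bin/sh
set -e

# A throwaway 'remote' and a clone of it; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git clone -q remote.git clone 2>/dev/null   # hides the 'empty repository' warning
cd clone
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "You"

# Work that has already been pushed.
echo "line 1" > app.txt
git add app.txt
git commit -q -m "pushed work"
git push -q origin master

# Two small local commits that have NOT been pushed.
echo "line 2" >> app.txt
git commit -q -am "small commit 1"
echo "line 3" >> app.txt
git commit -q -am "small commit 2"

# Plain 'git diff' shows nothing: there are no uncommitted changes.
plain_diff=$(git diff)

# Which commits have not been pushed yet?
git log --oneline origin/master..HEAD

# One combined diff of everything since the last push.
git diff origin/master HEAD
added=$(git diff origin/master HEAD | grep -c '^+line')
```

Note that origin/master is only as fresh as your last fetch, pull or push, so run git fetch first if colleagues may have pushed in the meantime. In that case git diff origin/master...HEAD (three dots) compares against the point where your work branched off and leaves their changes out of the picture.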
a new workflow?
I'm still working on this but so far I've got:
- Clone a fresh, clean copy of the repository (if I don't already have one)
- git checkout -b my_new_branch
- add tests and/or make some changes
- do a git diff and check this change - eyeball it to make sure there's nothing obviously wrong
- git commit (then repeatedly return to step 3 until 'done')
- check the rest of the tests run - make changes (and git commit) until 'done done'
- do a git pull origin master to pull in latest changes by others and make sure the tests *still* pass
- fix any merge-conflicts and commit the merge
- git push
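The steps above can be sketched end to end as a script. This is only an illustration against a throwaway 'remote': the repository names, files and the colleague's commit are invented, 'run the tests' is left as a comment, and the --no-rebase, --no-edit and explicit push target are additions I've assumed so that it runs unattended on recent git versions.

```shell
#!/bin/sh
set -e

# Throwaway 'remote' with one initial commit; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git clone -q remote.git colleague 2>/dev/null
cd colleague
git symbolic-ref HEAD refs/heads/master
git config user.email "colleague@example.com"
git config user.name "Colleague"
echo "hello" > README
git add README
git commit -q -m "initial"
git push -q origin master
cd ..

# Clone a fresh, clean copy and start a branch.
git clone -q remote.git mine
cd mine
git config user.email "you@example.com"
git config user.name "You"
git checkout -q -b my_new_branch

# Make a change, eyeball it, commit; repeat until 'done'.
echo "feature" > feature.txt
git add feature.txt
git diff --cached            # new files only show up in the staged diff
git commit -q -m "add feature"
echo "more" >> feature.txt
git diff                     # tracked files show up in a plain diff
git commit -q -am "extend feature"
# (run the tests here; fix and commit until 'done done')

# Meanwhile, a colleague pushes to master.
cd ../colleague
echo "colleague" >> README
git commit -q -am "colleague's change"
git push -q origin master
cd ../mine

# Pull in their changes as a merge, then re-run the tests.
git pull -q --no-rebase --no-edit origin master

# Everything is committed and merged: now it goes public.
git push -q origin my_new_branch
```

Because every change was committed before the pull, a conflict at that step could be resolved (or the merge abandoned) without putting any of the work at risk.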
This is still a work-in-progress, and I would appreciate informed opinions, advice or your own war stories.
Notes:
[1] IMO the git developers could learn a thing or two from Kathy Sierra... but that's another topic.[4]
[2] You may know the state: you can't run git commit because you're in a 'failed merge', and you can't git pull because you get 'fatal: merging of trees' or 'Automatic merge failed; fix conflicts and then commit the result.'. You edit the files to un-conflict them and try to reapply your stash, and you suddenly get 'CONFLICT (content): Merge conflict in ...' again... After thrashing around for a while between git stash, git pull, updating merged files and then trying to re-apply your stash before git committing... I can tell you where I wanted to stick git.
[3] If you're anything like me, you look on the words git reset --hard HEAD with some trepidation. You just can't quite believe that blowing everything away in your working copy is the only way out of a simple merge-conflict.
[4]...and please don't just tell me that git is open-source and I should just go hack on git myself if I hate it so much. In theory I absolutely agree with you, but in practice I can only work on one thing at a time - and right now I'm still working on Active Resource, some projects of my own, a novel...