365Git

Month

April 2010

17 posts

Quick commit editing with vim

On Mac OS X and Unix systems Git defaults to using vim as the editor for commits. (I don’t know what the default is on Windows.) You can change this, if you like, but for now I’ll just give a couple of tips for using vim, just enough to get by.

When Git needs a message for an action it will bring up vim in normal (command) mode. You can move around using h (move left), l (move right), k (line up) and j (line down).

But for quick editing, the i key will put you in insert mode, where you can type and move around as you would in most editors. When you’ve finished, press the esc key to get out of insert mode.

The normal command to save the message and quit is :wq which is a bit of a finger twister. An alternative is ZZ which does the same thing, but it’s a much easier key combination to use.

A final thing. An empty commit message cancels the commit, (or rebase or tag, or whatever action the message is being added for). So if you change your mind just delete the message (the lines starting with the comment character # don’t count). A useful way of doing this is to make sure you’re not in insert mode by pressing the esc key and then using dd which deletes the whole line.
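To see the empty-message rule in action without opening vim at all, here is a minimal sketch. It assumes a throwaway repository in a temporary directory, and it uses the true command as a stand-in editor (a choice made for this demonstration, not something the tip itself relies on), which leaves the message untouched in the same way as quitting vim without typing anything. The file name and identity are hypothetical.

```shell
#!/bin/sh
# Throwaway repository in a temporary directory (all names are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt

# GIT_EDITOR=true "edits" the message without changing it, so only the
# commented template lines remain and Git treats the message as empty.
if GIT_EDITOR=true git commit; then
    result="committed"
else
    result="aborted"
fi
echo "The commit was $result"
```

The staged change is still in the index afterwards, so nothing is lost by backing out this way.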

Apr 17, 2010, 6 notes
#editor #git #save #settings #vim #day25
The “Hole Hawg” of version control

I use Git. I use Mercurial, sometimes, but I use Git. I used to find it hard going (actually, I still find it hard going, but it’s a lot easier now). I used to find it frustrating that you pull from origin/master, but push to origin master; and how the documentation talks of things like refspecs, HEAD, and ORIG_HEAD.

Really; you don’t want to wade through stuff like that when you’re trying to fix a merge, or push to a colleague’s repository. You just want to get stuff done. I imagine Matt Gemmell pulling his hair out when using it.

A while ago I read a piece by Neal Stephenson where he described Unix as the “Hole Hawg” of operating systems. (It’s a short piece, go read it.) I think of Git as the Unix, and Mercurial as the Mac, of version control. Git is complicated and can really mess you up, but there is a lot of power there, and for the times that you need it you’ll be glad of it. Most of the time a quick Google search is enough to answer questions.

There is more than one way of using Git. If you look at some of the comments on here you’ll see that my approach isn’t necessarily shared. I welcome these comments. I’ve learned from them and I’m grateful that someone took the time to share their views. If I don’t reply it isn’t out of churlishness; rather that I have nothing to add. I’m sure that there will be many more over the year.

My intention with this site is to set up a resource of quick answers along with longer explanatory posts and add to the list of Googleable resources. It’s also a way of pushing myself to learn more about Git as I write them. It isn’t to push a “One True Way”. But with a little thought and understanding using Git isn’t that difficult. Yes, it can be dangerous to rewrite history. Yes you can lose your work if you delete a branch. Yes you can mess up XML files with a merge. But these risks can be minimised with a few simple rules. If you can figure out the memory management rules in Cocoa, you can work with Git (and even if you can’t you can). And with a little knowledge and confidence you can develop your own, consistent, Git workflow.

But, for the love of whatever you hold holy, don’t just use it as you would Subversion.

Apr 16, 2010, 4 notes
#git #day24 #background #why #mercurial #danger Will Robinson #workflow #learning
Git manpages

Depending on how you installed Git, there may or may not be man pages on your system.

Using manpages

Manpages are the standard way to get help with command line programs. To see if they are installed, type man git into the terminal and you should get the man page for Git.

The space bar will move the page forwards and q will get you back to the terminal. If you want to know more about man, type man man into the terminal.

There are manpages for the Git commands, but you need to hyphenate the command, e.g. to get help on commits, type man git-commit. Actually the full command names are hyphenated, but the git command wraps these up for you.

Updating manpages

It is easy to download the latest source code for Git (using a Git repository, of course) and build the program yourself. But on the Mac it takes more work to build the man pages. However, you can keep up to date with the manpages by downloading prebuilt ones from http://kernel.org/pub/software/scm/git/

Downloading and installing them is easy. I look on the page to see what the latest version is, and then I run this script in a temporary directory. It downloads and installs the latest manpages and tidies up after itself.

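#!/bin/sh
# Stop at the first failure so a failed download is never unpacked.
set -e

# Replace with the version currently listed on the download page.
VERSION=1.7.1.rc1

# Download the prebuilt manpages archive (-f fails on HTTP errors, -L follows redirects).
curl -fLO "http://kernel.org/pub/software/scm/git/git-manpages-$VERSION.tar.bz2"

# Unpack the archive into the local man directory (needs admin rights).
sudo tar xjv -C /usr/local/share/man -f "git-manpages-$VERSION.tar.bz2"

# Tidy up the downloaded archive.
rm "git-manpages-$VERSION.tar.bz2"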

Replace the version with the current version rather than 1.7.1.rc1 as I have in this example.

If you’re using the current source, then you already have a full set of documentation, and I’ll get around to that when I cover independent branches.

Apr 15, 2010, 4 notes
#git #day23 #manpages #script
Forcing yourself on Git

Consider this scenario: You’re working on a branch and you don’t like where it’s going. You want to switch to another branch and delete the branch you were on. However, Git won’t let you change branches, because you have uncommitted changes to files (i.e. changes that exist only in your working directory), and by switching branches the changes to the files would be overwritten and, because they have not been committed, lost.

Consider another scenario: You’ve made some major changes to your local repository. This is a private project that you don’t share with anyone, but you have a remote repository set up so that you can share changes between more than one machine. You try to push the changes to the remote repository but Git will not let you because you’ve made changes to the history.

These protections are there to prevent problems. But if you know what you’re doing and are making such changes intentionally, such stumbling blocks are just annoying because the issues can be worked around with a little extra work. In the first case, commit or stash the changes before switching branches, and in the second case, just delete the remote and create a new one with the new state of the project.

But why bother? There are Git commands that have a -f or a --force flag that can be used in these situations to force the changes. In the first case:

git checkout -f someBranch

And for the second case:

git push -f someRemote someBranch

There are other commands that accept this flag, but it’s the sort of thing that you’re better off learning about as you need to use it.

Edit

The original point of this article was to highlight the use and availability of the -f flag. Although it can be used with the git push command, it might be better to use an alternative: git push someRemote +someBranch. Have a look at the comments to see why. This is still a nuclear option which changes the history of the branch, so it needs to be used with care.
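The first scenario can be reproduced in a throwaway repository. This is only a sketch under stated assumptions: the repository lives in a temporary directory, and the branch name experiment, the file notes.txt and the identity are all hypothetical. It shows the plain checkout being refused and the forced one discarding the local edit.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "version one" > notes.txt
git add notes.txt
git commit -q -m "First version"

# A second branch where notes.txt has different contents.
git checkout -q -b experiment
echo "version two" > notes.txt
git commit -q -a -m "Experimental version"

# An uncommitted edit that switching branches would overwrite.
echo "half-finished idea" > notes.txt

if git checkout master 2>/dev/null; then
    plain="allowed"
else
    plain="refused"
fi

# -f throws the local edit away and switches anyway.
git checkout -q -f master
echo "plain checkout: $plain, notes.txt now contains: $(cat notes.txt)"
```

The half-finished edit is gone for good after the forced checkout, which is exactly why the flag needs care.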

Apr 14, 2010, 5 notes
#day22 #git #force #branch #push
Three ways of excluding files.

Almost every git user knows about adding a .gitignore file to the root of the project to control the visibility of files and folders. Because this file is in the repository, this configuration also applies to remote repositories of the project. But it’s not the only way. I’m going to tell you how to get Git to ignore files on a per computer and per repository basis. These could be better choices in some circumstances.

There are three types of exclude files; from highest to lowest order of precedence they are:

Per Project: .gitignore file in the repository

This is the usual way of adding an ignore file. Call it .gitignore and save it to the root of your project to apply to all the files (you can add different .gitignore files in subdirectories where they have lesser scope). It is a part of the repository, so it will need to be git-added and committed for each change. This is useful for repositories that are passed around with others who may not have a per computer exclude file, or when there are project specific files that need to be taken into account.

Even easier: if you have a per computer file, you can copy it straight into your project with a simple cp ~/.gitignore .gitignore and edit it to handle your specific requirements.

Per Repository: in .git/info/exclude

You can exclude files on a per repository basis by editing the .git/info/exclude file in your repository. (Why it takes it from this location rather than .git/config I don’t know: add it to the list of git annoyances.) These exclusions (or inclusions: you can override the higher level exclusions by prepending ! to lines that you want to include) are not part of the working directory, so they only apply to that particular repository, and are not shared with any remotes. This is useful when you have particular requirements because of your workflow or machine setup.

Per Computer: through settings in ~/.gitconfig

There should already be a .gitconfig file in your home directory. This is where the global settings for your git installation are stored, such as the user’s name and email address. Within this you can set a path to an excludes file that will apply to all git repositories on the computer in the same way as the name and email defined in this file apply to all repositories.

For example: Most of what I do is in Xcode so I have the following ~/.gitignore file

# Mac OS X
*.DS_Store

# Xcode
*.pbxuser
*.mode1v3
*.mode2v3
*.perspectivev3
*.xcuserstate
project.xcworkspace/
xcuserdata/

# Generated files
build/
*.[oa]
*.pyc

# Backup files
*~.nib

And in my .gitconfig file under the [core] section I’ve added the path to this file for the excludesfile key.

[core]
    excludesfile = ~/.gitignore

Now I have a standard set of ignores that apply to all my git repositories on this machine without me having to add a specific .gitignore file to each one. This is probably most useful if you create a lot of repositories for yourself, but I recommend it to everyone. It’s lowest on the precedence scale and provides a neat catch-all.
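As a rough illustration of the three levels working together, here is a sketch in a throwaway repository. The file names are hypothetical, and the per computer setting is deliberately made in the repository’s own config rather than with --global, purely so that the demonstration leaves your real ~/.gitconfig alone.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Per project: committed along with the code.
echo "build/" > .gitignore

# Per repository: lives inside .git, never committed or pushed.
mkdir -p .git/info
echo "scratch.txt" >> .git/info/exclude

# Per computer: normally set once with --global; set locally here only
# for the sake of the demonstration.
echo "*.pyc" > "$repo.global-ignore"
git config core.excludesfile "$repo.global-ignore"

mkdir build
touch build/app.o scratch.txt cache.pyc main.c

# Only files that get past all three lists show up as untracked.
untracked=$(git status --porcelain)
echo "$untracked"
```

In this sketch main.c and the .gitignore file itself are reported as untracked, while the build directory, scratch.txt and cache.pyc are each hidden by a different list.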

Summary

Most of the time the first solution is quite adequate; having exclusions within a repository that is likely to have a public face is probably the most effective way of managing file visibility. But, as with most of Git, there are ways of handling edge cases. You just need to know that they are there.

REFERENCE

The gitignore man page.

Note

This may seem familiar, I’ve taken it from a post on my personal site.

Apr 13, 2010, 36 notes
#git #ignore #exclude #global #local #remote #repository
Checking the code in the index.

I was at Barcamp Bournemouth over the weekend where I gave a talk on Git.

I was asked a question and the Barcamp combination of sleep deprivation and overnight indulgence led me to give an incorrect answer. This is an attempt to clear things up (sorry Adrian)!

Say you have changes staged in the index and unstaged changes in the working directory. Before checking this code in, you want to run some tests, so you want to set aside the changes in the working directory and just leave the changes that are in the index. This is one way to do things correctly.

  1. Stash the changes in the working directory, but leave the changes in the index as they are with git stash --keep-index
  2. Do what you need with the code and then check it in as usual git commit -m "message"
  3. Bring back the stashed changes to the working tree with git stash pop

It’s simple enough, and described in the documentation for git-stash.
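The three steps can be walked through in a throwaway repository. This is a minimal sketch with hypothetical file names (ready.txt holds the staged change, unfinished.txt the unstaged one); the two changes are kept in separate files to keep the example simple.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > ready.txt
echo "one" > unfinished.txt
git add .
git commit -q -m "Initial commit"

echo "two" > ready.txt        # this change gets staged
echo "two" > unfinished.txt   # this change stays unstaged
git add ready.txt

# 1. Park the unstaged work; the index is left as it is.
git stash --keep-index >/dev/null
during="$(cat ready.txt) / $(cat unfinished.txt)"   # what your tests would see

# 2. Run whatever checks you need, then commit the staged change.
git commit -q -m "Update ready.txt"

# 3. Bring the parked work back.
git stash pop >/dev/null
after="$(cat ready.txt) / $(cat unfinished.txt)"

echo "during: $during"
echo "after:  $after"
```

While the stash is in place only the staged change is visible in the working directory; after the pop the unfinished edit is back.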

Apr 12, 2010, 3 notes
#git #stash #apply #pop #index #commit #partial #day20
Adding an empty directory to a commit.

I believe there are cases where a directory needs to exist in a repository but it’s empty. Either because it will be written into at a later stage, or because the contents have been ignored.

If you remember, Git tracks content, not files. Because there is no content, the directory is not tracked, even if you add it. This might cause problems. The work-around is simple. Just create an empty file in the directory (call it something like .empty, use a dotfile so that it won’t necessarily show up in file lists) and git add this to the repository. Because there is something in the directory, it will now be tracked.

A little inelegant, maybe. But not as inelegant as having code that depends on the existence of a directory.
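A quick sketch of the work-around, assuming a throwaway repository and a hypothetical directory called logs:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# An empty directory has no content, so Git has nothing to report.
mkdir logs
before=$(git status --porcelain)

# A placeholder dotfile gives the directory something to track.
touch logs/.empty
git add logs/.empty
tracked=$(git ls-files)

echo "status with the empty directory: '$before'"
echo "tracked after adding the placeholder: $tracked"
```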

Apr 11, 2010, 5 notes
#day19 #directory #empty #git #repository #track
Getting files from another branch

Say you are on your working branch, let’s call it feature. You’ve made some changes to your files and you decide that you didn’t really want to change (say) the main.c file. You don’t need to check out the master branch, copy the file, change back to feature and paste it in. What you can do is get the state of the file from the master branch directly.

git checkout master main.c

This will replace the current main.c file with the one at the tip of the master branch. It will be added to the index, so you just need to check the changes in. If you want to replace more than one file, just list all of their names.

You don’t need to get it from the tip of the branch. If you want to get the state of the file from a tagged commit, use the name of the tag instead. Or, if you know the sha of the commit from which you want the file, you can use that directly. This is not the only way to do it, but it’s a handy way of getting a previous state of a file.
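Here is the same idea as a small sketch in a throwaway repository. The file contents and identity are hypothetical, and the command is written with a -- separator between the branch and the path, which is an addition made here to keep the two from being confused; the shorter form in the post does the same job.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > main.c
git add main.c
git commit -q -m "Stable version"

git checkout -q -b feature
echo "experimental" > main.c
git commit -q -a -m "Experimental change"

# Take main.c as it is at the tip of master, without leaving feature.
git checkout master -- main.c

# The restored file is already in the index, ready to be committed.
staged=$(git diff --cached --name-only)
echo "main.c contains: $(cat main.c); staged: $staged"
```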

Apr 10, 2010, 3 notes
#day18 #git #checkout #file #branch
Deleting branches on a remote.

There comes a time when you need to delete a branch on a remote. Say you have a repository set up only for your own use which you use to keep half-baked development branches in sync between machines. Once you’ve done with the branch, you don’t need to have it exposed, not least because the branch is unlikely to exist on your local repository anyway. Say the remote is named backup and the branch to delete is called feature. The command to use is:

git push backup :feature

This is one of the confusing commands: you push a branch prefixed with a colon. If that’s too weird for you, you can use this instead:

git push --delete backup feature

which makes it clearer, if a little longer to type.
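To try this without touching a real server, the sketch below uses a bare repository on the local disk as the backup remote. The directory layout, file name and identity are assumptions made for the demonstration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A bare repository on disk stands in for the remote.
git init -q --bare "$work/backup.git"

git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "some work" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch feature

git remote add backup "$work/backup.git"
git push -q backup master feature
before=$(git ls-remote --heads backup feature)

# Push "nothing" to feature on the remote, which deletes it there.
git push -q backup :feature
after=$(git ls-remote --heads backup feature)

echo "before: $before"
echo "after:  $after"
```

The local feature branch is untouched; only the copy on the remote goes away.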

Apr 9, 2010, 4 notes
#branch #delete #git #push #remote #day17
Simple aliases

All this typing, all these commands. It doesn’t have to be this hard.

You can set up aliases in Git’s configuration that let you concatenate these commands and flags or even run scripts without as much keyboard action. But sometimes for the simpler commands I find it easier to use shell aliases. Another advantage of this is that you don’t have to type git in front of the aliases.

Here’s the way I’ve set up my aliases.

alias gst='git status'
alias gl='git pull'
alias gpu='git push'
alias gc='git commit'
alias gca='git commit -a'
alias gb='git branch'
alias gco='git checkout'
alias gba='git branch -a'
alias gsb='git show-branch'
alias glo='git log --oneline'

There aren’t many, and the most destructive command is git pull. But I use these a lot.

Apr 8, 2010, 3 notes
#git #alias #shell #shortcut #day16
Fast forward merge

Say we have a repository with a master branch and a feature branch. Since the feature branch has come off master, there have been no other changes added. Also, the master branch is the currently checked out branch and we can see that because HEAD points to master.

So, we want to bring in the changes from the feature branch to the master branch, so we issue a simple git merge feature. Even if you use the -m flag to add a message to this merge, a terse message appears saying that there has been a fast-forward. What has happened is that since there are no other changes, there is no merge and the master branch is just moved to point to the tip of the feature branch. Since HEAD points at master, it follows along. So the position is this.

Nice and simple, and makes sense. But it can annoy some users who like to work on a feature in a branch but want that loop to exist after merging it back into the main line. It can make it easier to see the changeset in one glance and more convenient to back out later if needed. Git has a way of fixing that. If, instead of git merge feature, we change the command to git merge --no-ff feature (note the --no-ff flag), a merge commit is created and the loop is kept.

Personally, I’m comfortable with a rebase based workflow, so I don’t use this flag, but it’s just another example of the annoying flexibility of Git.
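The difference between the two merges can be checked by looking at whether the commit at the tip of master has a second parent. This is a sketch in a throwaway repository; the helper function is_merge, the file name and the identity are hypothetical.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature work" >> file.txt
git commit -q -a -m "Feature commit"
git checkout -q master

# Reports whether the commit at HEAD has a second parent, i.e. is a merge.
is_merge() {
    if git rev-parse -q --verify "HEAD^2" >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}

# Plain merge: master is simply moved forward, no merge commit.
git merge -q feature
fast_forward=$(is_merge)

# Undo that, then merge again while insisting on a merge commit.
git reset -q --hard HEAD~1
git merge -q --no-ff -m "Merge feature" feature
no_ff=$(is_merge)

echo "merge commit after plain merge: $fast_forward"
echo "merge commit after --no-ff:     $no_ff"
```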

Apr 7, 2010, 2 notes
#day15 #git #merge #fast-forward
What’s in the commit?

I’m as reluctant as the next person to read a slab of text when I’m just looking for information. So just a short one today.

Say you want to see what exactly you are adding to a repository with a commit. Here’s a handy command:

git commit -v

The important thing here is the -v (shorthand for --verbose) flag. When the editor opens up with the commit message, the diff between the current HEAD and what is about to be committed is added at the bottom of the message template. This is useful if you want to see, at a glance, all the changes that the commit will introduce.

There are a couple of points to be aware of, though. Firstly, the extra diff does not have the preceding # character which marks a line as a comment, but Git cuts it off anyway when it saves the message, so the diff does not end up in the commit. That is no loss: the diff is immediately available between any two commits with git diff anyway. If you want to highlight a particular change in a commit message, you will need to copy the relevant lines up into the message yourself.

Secondly, why get the diff at the commit stage when you can get one before committing using git diff --cached as explained in my previous post? I can’t speak for everyone, but sometimes it occurs to me to look at the diff as I am typing the git commit command. Rather than change my line of thought I just pass the -v flag and edit the commit message. Of course it helps if you are competent with your editor. Git defaults to a flavour of vi. But if you would rather use your preferred editor, have a look at this post from my personal site which should give you some pointers until I write a fuller post here.

This technique is particularly useful if, like me, you make lots of small commits on a working branch which you then rebase and squash down to larger chunks of commits on a main branch. That way, the changes are small and focussed and easy to look over in the commit editor.

Apr 6, 2010, 2 notes
#git #commit #git-diff #verbose #day14 #editor
Git objects: the tag

The final object left to cover is the tag. Strangely (or not, depending on your view of Git) there are two types of tags. There are lightweight tags which are almost like branches, but here I’ll introduce the tag object.

Tagging

A tag is a shorthand way of naming an object, usually a commit, so it can be referred to easily. A tag object differs from a lightweight tag in that it is a complete object with an author, message, and date. It can also be cryptographically signed with GPG. Very much like a commit.

The obvious use for tags is to mark releases. v1.0, v1.7, etc. It’s easy to check out these commits by using the memorable name rather than the sha. Using the object tag is useful here if you need to sign these milestones.

Another use for tags is to mark points in code that you might want to go back to; such as before you embark on implementing a mad idea and just creating a separate branch isn’t enough. A tag object here is useful because you create the tag with a message explaining why you are creating the tag.

I sometimes use tags to park dead branches. If I’ve followed a line of development that I want to shelve, I create a tag at the tip of the branch and delete the branch. That way, when I look at the state of my repository, I don’t see these dead branches listed, but I can always go back to it by creating a branch from that tag.

Graphically:

You can create a branch based off a tag by using the normal git checkout or git branch commands:

git branch newBranch PreCrazy will create a new branch based on the commit pointed to by PreCrazy.

You can checkout a tag directly without creating a branch off it. But this creates a “detached head”. Usually HEAD points to a branch rather than to a commit directly, but when you check out a commit without creating a branch then HEAD points directly to a commit, and you will see a helpful message on the console to tell you what you have done. If you wish, you can always move back to a branch with git checkout because those references still exist even though HEAD has moved. You can still add commits to the repository and they will form a chain, but since they are not on a branch, they are liable to be garbage collected away at some point and deleted.

Creating a tag object.

The simplest way to create a tag object is to create one with an inline message:

git tag v1.0 -m"Tagging version 1.0"

If a longer message is required, use the -a flag (shorthand for --annotate, which creates an annotated tag) instead and the default editor will come up, making it easier to create a message.

git tag -a v1.0

These two examples create tags at the current head commit. But you can tag previous commits simply by passing in the sha of the commit

git tag -a v2.0 e34f77c

You can create tag objects for more than just commits; you can tag blobs and trees as well, just by passing in the sha of the blob or the tree. Why anyone would want to do this is beyond me. It isn’t as if you can check out a single file based on a tagged blob, although you could reference its contents. Maybe it can be used to cryptographically sign a file or a tree as coming from a particular source so as to verify its integrity.

Listing tag objects

A simple git tag will list all the tags in the repository. If you mostly use version tagging, then the names will be self explanatory. You can further examine tags using git cat-file -p <tagname> which I prefer to git show <tagname> which also shows a diff. Graphical tools such as Gitk show tagged commits, but not tagged files or trees, so it’s worth using the command line options sometimes to see if there is anything else lurking.
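The difference between the two kinds of tag shows up when you ask Git what type of object each name refers to. A minimal sketch in a throwaway repository, with hypothetical tag names and identity:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "release notes" > README
git add README
git commit -q -m "First release"

# A lightweight tag is just a name for the commit.
git tag lightweight-v1

# A tag object carries its own tagger, date and message.
git tag -a v1.0 -m "Tagging version 1.0"

light_type=$(git cat-file -t lightweight-v1)
annotated_type=$(git cat-file -t v1.0)
echo "lightweight-v1 names a $light_type, v1.0 names a $annotated_type"

# Pretty print the tag object itself.
git cat-file -p v1.0
```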

Summary

A tag is a convenient way of assigning a shorthand name to a commit and marking significant points. A tag object allows for extra information to be added to this shorthand name, such as committer information, a message, or an electronic signature. Tag objects can also be assigned to git objects other than commits, such as blobs and trees.

There is more that I can write about tags and the other objects. It helps me a great deal when using git to know about how these building blocks combine to create a repository. Even though there are hundreds of ways of sending commands to git, they mostly just work on these objects. Keeping that in mind makes using Git less complex.

Apr 5, 2010, 3 notes
#Fundamentals #cat-file #day13 #git #object #tag #git-show
Git objects: the commit

The third Git object is the commit, and this is what holds everything together. The top level object in the repository is the commit object. It references the top level tree and also keeps other information such as the committer, the date and time of the commit, references to the previous commit (or commits in the case of a merge) and the commit message. It can also have an author and can be cryptographically signed with GPG in projects where such verification is important.

There isn’t much to this because all the heavy lifting is being done by trees and blobs. The commit has the top level tree which in turn references all the other trees and objects. The beauty of them is that they can be strung in a line like a string of pearls to build up a timeline that maintains a history of a project.

Let’s take a look at the Todo project from yesterday’s post:

There is a commit object that points to a top level tree which in turn points to other trees and objects. Say we make a change to the shopping.txt file. Because the contents have changed there is a new blob for this file. Because there is a different blob in the home tree for shopping.txt, a new home tree is created. And because there is a different sha for the home tree in the top level tree, there is a new top level tree object. And a new commit object points to this. Since the diy.txt and the work tree are unchanged, those objects are reused. The old objects are not used in this commit, but they still exist in the object database because they are referenced by the previous commit. So this is what the new state looks like:

Note that there is a reference to the previous commit from the new commit, but not the other way around. Commit objects only reference their past, not their future.

A final example is renaming. Suppose I change the name of diy.txt to diny.txt (do it next year). Since the contents haven’t changed, the blob stays the same. However, as the name is kept in the home tree, a new tree object is created for that and because of that, there is a new top level tree object for a new commit to point to and all the unchanged objects are still connected. So we have:

It’s easy to see how a complex graph of changes can build up over time. But because the objects are made up of their contents, the same tree on my system will expand to the same tree on your system. However, if I commit a tree and someone else commits the identical tree on their system, the commit objects will not be the same. Although the top level tree may be the same, because the user name, email and time are added to the commit, it is highly unlikely that they will match with anyone else making the same commit.

Viewing the contents of a commit is quite easy with git cat-file -p and passing it the commit sha, or you can use any of the git log methods to see the history of commits.
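The renaming example can be checked from the command line. This sketch rebuilds a cut-down version of the Todo project in a throwaway repository (the file contents and identity are made up) and compares the blob and tree shas before and after the rename.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir home
echo "paint the fence" > home/diy.txt
git add .
git commit -q -m "Add the DIY list"

git mv home/diy.txt home/diny.txt
git commit -q -m "Do it next year"

# Same contents, so the same blob under both names.
old_blob=$(git rev-parse HEAD~1:home/diy.txt)
new_blob=$(git rev-parse HEAD:home/diny.txt)

# The name lives in the tree, so the top level tree has changed.
old_tree=$(git rev-parse "HEAD~1^{tree}")
new_tree=$(git rev-parse "HEAD^{tree}")

echo "blob: $old_blob -> $new_blob"
echo "tree: $old_tree -> $new_tree"

# The new commit: its tree, its parent, the author and committer, the message.
git cat-file -p HEAD
```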

Apr 4, 2010, 3 notes
#git #day12 #commit #fundamentals #cat-file #tree #object #history
Git objects: the tree

Yesterday I wrote about the blob, the basic building block of the git repository. Today, I’ll talk about the tree.

The tree is a nice recursive object that holds blobs and other trees to make up a graph. It’s the tree object that contains the names of the files that the content in blobs expands into. It also holds the names for the tree objects that it contains. Just like other git objects, the name of the object is not stored with the object. So the name of a tree object (which represents a directory) is stored in the next tree object upwards.

Diagrams always make these things clearer. Imagine I’ve got a simple Todo list which I keep in a git repository so that I can access it from work as well as from home. This is what it looks like:

You can see that I’ve added the shas to the diagram. Don’t ask me how I got them.

The tree

The tree is a list of the trees and the blobs that it contains. It’s easier to see by examining the sha itself using the handy git cat-file command that I introduced yesterday.

And on the diagram:

So you can see that the contents of the tree are the file mode, the type of the object, the sha of the object and the name of the object. You can see that the top level tree holds the sub-objects and their names, and each subtree holds their own objects and names. But what is holding on to the top level tree? That will be the commit object that will be covered tomorrow.

git ls-tree

Okay, I will show you how I got the shas. There is another handy command, git ls-tree, which shows the contents of a tree object. Two flags are useful: -r recurses into subtrees and -t shows trees when recursing. So to get the list of all the shas in the repository this is what I did:

  1. Get the commit sha. The easiest way is just to call git cat-file -p HEAD which pretty prints the latest commit object. From this I can see the tree object, which is the top level tree id.
  2. Call git ls-tree -r -t on this tree object and I can see all the trees and objects in the repository that can be reached by this commit.

This is what I get:

Go and find a repository on your system and try and get all the objects. You can use git cat-file -p on any of the object shas to see what is in them.
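If you don’t have a repository to hand, the two steps can be tried on a throwaway one. This is a sketch only: the Todo-style files and the identity are made up, and the sed expression that pulls the tree id out of the first line of the commit is one possible way of doing it.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir home work
echo "milk" > home/shopping.txt
echo "fix the shelf" > home/diy.txt
echo "write the report" > work/report.txt
git add .
git commit -q -m "Todo lists"

# 1. Pretty print the latest commit and pick the tree id off its first line.
tree=$(git cat-file -p HEAD | sed -n '1s/^tree //p')

# 2. List every tree and blob that can be reached from that tree.
listing=$(git ls-tree -r -t "$tree")
echo "$listing"
```

Each line of the listing has the mode, the type, the sha and the name, the same four pieces described for a single tree above.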

Apr 3, 2010, 4 notes
#git #tree #Fundamentals #day11 #cat-file #object #ls-tree
Git objects: the blob

I’m going to step back from tips over the next few days and cover some fundamentals over the Easter weekend.

There are 4 kinds of objects that Git works with:

  1. The blob
  2. The tree
  3. The commit
  4. The tag

Although I’ll cover the blob today, there are common characteristics for each of these objects:

  • These objects just store contents. For a file, it’s just the contents of the file, not the name or other properties. For trees, the contents are just a list of the objects it contains, not the name of the tree.
  • These objects are referenced by an id. This is made by taking the contents of the file (or tree, or whatever), adding a header with its type and size, and taking the SHA-1 hash of the result. (The object is then compressed with zlib for storage.) For convenience, I refer to these as the sha.
  • You usually see objects referred to by their 40 character string. Objects are immutable, and for our purposes, unique. Generally, using the first 7 or so characters of the sha is enough to completely reference them. By immutable, I mean that if you edit a file and check it in again, a whole new object is created with a new sha, but the old sha still exists so you can usually go back to it. This is what makes it easy in Git to go backwards and forwards in time.
  • If the contents of two files are the same, then their shas are the same. This is important. It means that a file I create on my computer has the same sha as a file on your computer with the same contents, regardless of the name of the file.
  • These objects live in your git repository, which is in the .git/objects/ directory, but don’t try and look at them directly. If you’re a software developer you know better than to use an implementation over an interface. There are plenty of commands that let you look at these objects from the command line.

With this in mind, let’s look at the quantum of Git — the blob

The Blob

This is the representation of the contents of a file within git. I cannot say this enough: it represents the contents of the file, not the name, not the permissions, not the location within your project tree. Nothing but the contents of the file.

This is the fundamental building block of the repository. Blobs are collected into trees, trees are collected into a top level tree which represents the top level of your project, and this tree is represented in a commit object which completely represents the stored state of your project at that point. We’ll cover this recursive object graph over the next few days.

But for now, let me prove to you that objects are unique to the contents of the file independently of the machine.

  1. Create a folder for your project. In my example I’ve used the name blob, but you can call it what you like.
  2. Go into this directory.
  3. Create a git repository.
  4. Create a text file within this directory. I’ve called mine blob.txt but you can call it what you want. In fact, you’re better off calling it something else.
  5. Use echo "I'm a blob" > blob.txt to put content into your file. You can put your own file name after the >. But the part between quotes should be the same because we want to have the same content in the file.
  6. Do a cat to see that the contents are the same. This is what it looks like in my Terminal

  7. Add the file to the repository
  8. Commit the changes with a short commit message.

git cat-file

I need to make a slight diversion to introduce this handy command. It isn’t part of the generally used command set, but it’s useful for spelunking your repositories. In the same way that you can see the contents of files from the command line using the cat command (on Mac OS and Linux at least) the git cat-file command can show the contents of git objects. For now, we just need to know about the -p flag which “pretty prints” the output.

Just to show that the same sha is generated on my machine as on yours, enter this in the terminal:

git cat-file -p e80e4f2

I know that the sha for a file with the contents “I’m a blob” is e80e4f2 on my machine, so a file with the same contents in your repo will have the same sha.
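
If you would rather not take the sha on trust, here is a small sketch that looks it up instead of typing it in. It assumes a fresh throwaway repository. The :blob.txt syntax (the staged copy of the file), the -t flag (object type) and git rev-parse --short are standard git, but they go beyond the -p flag covered in this post.

```shell
set -e
cd "$(mktemp -d)"                   # throwaway repository, an assumption of this sketch
git init -q
echo "I'm a blob" > blob.txt
git add blob.txt                    # adding is enough to store the blob

sha=$(git rev-parse :blob.txt)      # id of the staged blob
short=$(git rev-parse --short "$sha")
kind=$(git cat-file -t "$sha")      # the object's type
content=$(git cat-file -p "$sha")   # pretty-printed contents

echo "$short is a $kind containing: $content"
```

The abbreviated id it prints is the one to compare with the sha quoted above.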

I’ve had to cheat a little, of course. The contents of the file are the same across machines, but all the other details will be different, so our commit shas won’t match, nor the trees. I’ll explain this over the next few days.

Apr 2, 2010 · 3 notes
#Fundamentals #blob #object #day10 #content #cat-file
Fixing merge conflicts

Branching is great. Being able to work on code that you don’t have to check in, and that doesn’t risk polluting the main branch until it’s finished, is great; but you have to be able to bring the changes back in at some stage. One way is to use git merge. Here’s a nice contrived example of a simple repository containing two files: Greetings.txt and Partings.txt. There are three branches: master, es and fr. The currently checked out branch is es so the HEAD reference points to that. Can you spot the deliberate mistakes in the contents of the files?

I want to merge the changes from the fr branch into the es branch and since I am on the es branch already, I use the command git merge fr, and that’s when the problems start. It looks like the merge didn’t work and has left me with conflicts.

Conflicts are an unavoidable fact of working with merges in any version control system. If there has been a change to the same line in both branches then git does not know which one to apply. Although it’s very good at interleaving changes to different lines in files, there is nothing magical about git: it isn’t going to risk corrupting my files in cases where there is a conflict. So, that means that they need to be fixed by hand. Before we look at ways of fixing these conflicts, let’s take a look at the state of our repository now.

Examining the conflicts

Run git status to see the state of your working directory.

Here you can see that the two files have been changed in the working directory but are listed as unmerged, so nothing is staged for commit yet. Git has modified these files with the familiar conflict markers.

Notice that it has marked the changes to the HEAD and fr branches in each case. What this doesn’t show you are the hidden references to the state of the files at each branch. This is what your repository looks like now:
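
To poke at this state yourself, something like the following sketch should reproduce it. The file contents (Hello, Hola, Bonjour and so on) are invented for illustration and are not the ones from the example above. During a conflicted merge git keeps the competing versions of each file in numbered index stages (1 for the common ancestor, 2 for the current branch, 3 for the branch being merged), which may be what the hidden references amount to.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

# Hypothetical contents: the same line changes on both branches
echo "Hello" > Greetings.txt
echo "Goodbye" > Partings.txt
git add .
git commit -q -m "Add greetings and partings"
git checkout -q -b es
echo "Hola" > Greetings.txt
echo "Adios" > Partings.txt
git commit -q -am "Spanish versions"
git checkout -q -b fr master
echo "Bonjour" > Greetings.txt
echo "Au revoir" > Partings.txt
git commit -q -am "French versions"
git checkout -q es

git merge fr || true                 # exits non-zero because of the conflicts

git status --short                   # conflicted files are flagged UU
cat Greetings.txt                    # shows the conflict markers
git ls-files -u                      # one index entry per stage per file
ours=$(git show :2:Greetings.txt)    # stage 2: the version on es
theirs=$(git show :3:Greetings.txt)  # stage 3: the version on fr
echo "ours: $ours / theirs: $theirs"
```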

Quick and dirty resolution

For small files like this the fixes are quite simple:

  1. Search for the conflict markers in each file.
  2. Edit the file to the state that you want, and remember to remove the conflict markers.
  3. Add the changed files to the index with git add.
  4. Once all the conflicted files are fixed, commit them with git commit.
  5. Edit the commit message to show that I’ve merged the files and fixed the conflicts.

Not so dirty resolution

In this case, I can see that files in one or the other branch should take priority. That is, changes in those files are the ones to be applied in conflicts. So here, I am on the es branch and I can see that I want my version of the Greetings.txt file to be in the merge, and the fr branch version of Partings.txt is the correct one. What I can do is to use the references to check out the version of the file that I want. So what I do here is:

  1. Get the es version of Greetings.txt with git checkout --ours Greetings.txt. Note: I am using --ours because it is the one from the branch I am currently on.
  2. Get the fr version of Partings.txt with git checkout --theirs Partings.txt. Note: I am using --theirs because it is the one from the branch I am merging with.
  3. Do a quick sanity check of the files. I can see that the correct versions exist, and without the conflict markers.

  4. Add the files to the index with git add .
  5. Commit the changes with git commit
  6. Edit the commit message to show that I’ve merged the files and fixed the conflicts.
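
The six steps above, run end to end, might look like the sketch below. The repository is rebuilt from scratch with made-up file contents, and the commit message is passed with -m rather than typed in the editor. Both are assumptions made to keep the script self-contained.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

# Hypothetical contents that conflict between es and fr
echo "Hello" > Greetings.txt
echo "Goodbye" > Partings.txt
git add .
git commit -q -m "Add greetings and partings"
git checkout -q -b es
echo "Hola" > Greetings.txt
echo "Adios" > Partings.txt
git commit -q -am "Spanish versions"
git checkout -q -b fr master
echo "Bonjour" > Greetings.txt
echo "Au revoir" > Partings.txt
git commit -q -am "French versions"
git checkout -q es
git merge fr || true                  # stops with conflicts in both files

git checkout --ours Greetings.txt     # step 1: keep the es version
git checkout --theirs Partings.txt    # step 2: take the fr version
cat Greetings.txt Partings.txt        # step 3: sanity check, no markers left
git add .                             # step 4
git commit -q -m "Merge fr into es, fixing conflicts"   # steps 5 and 6
```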

Resolved Merge

After resolving the conflicts and checking in the changes this is what the repository looks like.

You can see that there is a new commit with the correct versions of the files. Also, since this was a merge commit, this new commit has two parents which are the commits that it merged.
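
The two parents can be checked from the command line as well. This sketch uses a simpler, hypothetical setup in which es and fr change different files, so the merge completes without conflicts. The merge commit has the same shape either way.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

echo "Hello" > Greetings.txt
echo "Goodbye" > Partings.txt
git add .
git commit -q -m "Add greetings and partings"

git checkout -q -b es
echo "Hola" > Greetings.txt
git commit -q -am "Spanish greeting"
es_tip=$(git rev-parse HEAD)

git checkout -q -b fr master
echo "Au revoir" > Partings.txt
git commit -q -am "French parting"
fr_tip=$(git rev-parse HEAD)

git checkout -q es
git merge -q --no-edit fr              # no conflicts here, so git commits the merge itself

git cat-file -p HEAD                   # the commit object lists two parent lines
first_parent=$(git rev-parse HEAD^1)   # the branch I was on (es)
second_parent=$(git rev-parse HEAD^2)  # the branch I merged in (fr)
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents: $parent_count"
```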

Summary

This is a contrived example but merge conflicts are a fact of life. Git doesn’t replace communication between team members and cannot read minds when deciding which changes to apply if there are conflicts. It’s also generally safe to merge branches because you will be asked to resolve conflicts rather than have changes applied under you that might mess up the project.

There are other ways of joining branches and dealing with the conflicts. If you can’t remember them all, the quick and dirty method is the one that you can always use.

Of course, if things become too much of a mess, and there are too many conflicts to resolve all at once, you can always abort a merge (before resolving and committing, of course) with git reset --merge which will restore the repository to the state it was before you started the merge. You can then use other methods to get your project into shape before trying to merge changes.
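
A minimal sketch of backing out, again with invented file contents: start a merge that conflicts, then abandon it. Later versions of git also offer git merge --abort for the same job.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

echo "Hello" > Greetings.txt
git add .
git commit -q -m "Add greetings"
git checkout -q -b es
echo "Hola" > Greetings.txt
git commit -q -am "Spanish version"
git checkout -q -b fr master
echo "Bonjour" > Greetings.txt
git commit -q -am "French version"
git checkout -q es

before=$(git rev-parse HEAD)    # where es pointed before the merge
git merge fr || true            # conflicts, so the merge is left half done

git reset --merge               # give up on the merge

after=$(git rev-parse HEAD)
git status --short              # prints nothing: the working directory is clean again
cat Greetings.txt               # back to the es version, no conflict markers
```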

Apr 1, 2010 · 40 notes
#conflict #day9 #git #merge #checkout