| Version 8 (modified by cdavid, 4 years ago) |
|---|
* This is work in progress *
Installation
Please do not use any version of git below 1.5.3.
Linux
Git is included in most Linux distributions (the package is called git-core on Ubuntu).
Mac OS X
Reasonably up to date binary installers can be found here: http://code.google.com/p/git-osx-installer/.
Installing git itself from source is easy (Xcode should give you everything you need), but installing the documentation (man, html and info) is painful, with many dependencies (asciidoc, etc.). Avoid it if you don't want to go through the hassle.
Windows
There are two easy ways to install git: the native installer or the Cygwin installer. Unless you are a regular user of Cygwin, the native installer is the best choice. It can be found here: http://code.google.com/p/msysgit/
GUI
Git has a basic Tk-based GUI, called gitk. It works well for navigating the history. There are native UIs for git on most platforms, including Windows and Mac OS X:
- TortoiseGit: http://code.google.com/p/tortoisegit/
- gitx (native Mac OS X client): http://gitx.frim.nl
Before starting
Basic configuration
At minimum, set up your name and email, so that they appear correctly for commits:
git config --global user.name "Your Name Comes Here"
git config --global user.email name@domain.example.com
You can add some aliases so that some git commands spell like the svn ones. The following are useful:
git config --global alias.co checkout
git config --global alias.ci commit
git config --global alias.st status
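To check what git has recorded, the values can be read back with git config --get. The following is only a sketch: it uses a throwaway repository and repository-local settings (no --global), so that your real configuration is left untouched.

```shell
# Throwaway repository, so nothing outside of it is modified
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Same settings as above, but local to this repository only
git config user.name "Your Name Comes Here"
git config user.email name@domain.example.com
git config alias.st status

# Read the values back
configured_name=$(git config --get user.name)
configured_alias=$(git config --get alias.st)
echo "name:  $configured_name"
echo "alias: st = $configured_alias"
```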
Getting help from the command line
The git documentation is massive, and it can be difficult to digest, as it is meant to be exhaustive and reference-like. This page is intended to help you through the first steps. The suggested way is as follows:
- Read this page.
- Once you have a good grasp of the basic scenario, you could either go to [git for svn users], or to the [git tutorial] for a more "git-oriented" introduction.
- Then the git manpages should be less daunting.
Note: just as using svn did not require you to read the svn book, using git does not require reading all that material. It is hoped that this page is more than enough for development in numpy.
Common scenario
These scenarios are the basics. They are written to minimize disruption of the common svn workflows as much as possible. They are not necessarily the best ways to do a specific task under git, but they are the least surprising for someone used to svn. They are summarized on the following cheat sheet:
Scenario 1: getting the numpy source code
Getting the sources from the NumPy repository, just to look at them, or to build from the latest version instead of a release:
git clone http://git.scipy.org/git/numpy numpy-git
Do NOT use checkout - checkout has a different meaning than in svn. Clone is what you want. An up-to-date tarball is also made available for each new commit in the git repo menu in trac (snapshot link).
After a clone, to get the latest changes, you need to go into your repository, and then use git pull:
cd numpy-git
git pull
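The clone and pull cycle can be tried without network access. In the sketch below, a local repository named upstream (a hypothetical stand-in for the numpy repository) plays the role of the remote.

```shell
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the remote repository (hypothetical name)
git init -q upstream
cd upstream
git config user.name "Upstream Dev"
git config user.email upstream@example.com
echo "first" > README
git add README
git commit -q -m "First commit"
cd ..

# Clone it, as in the scenario above
git clone -q upstream numpy-git

# A new commit appears upstream after the clone...
cd upstream
echo "second" >> README
git commit -q -am "Second commit"
cd ..

# ...and git pull brings it into the clone
cd numpy-git
before=$(git rev-list --count HEAD)
git pull -q
after=$(git rev-list --count HEAD)
echo "commits before pull: $before, after pull: $after"
```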
Scenario 2: prepare a simple patch ala svn, don't bother me with git
I have found a bug, and I want to submit a patch. I want to do it like in svn, I don't care about git:
# This will list the changed files
git status
# This will put the changes into a patch
git diff
Scenario 3: reverting changes
I have made some changes, but I am confused, I just want to restart from last revision and throw everything away.
There are several solutions - do NOT use revert, git revert is totally different from svn revert. The safe and easy one:
git stash
This will put your changes aside (in a 'stash'), and your working tree will be exactly as if you checked out from the last revision of your repository. It is safe because your changes are not lost - you can reapply them:
git stash apply
If you really don't care about the changes, and are ready to throw them away with no chance of recovering them, use the checkout command:
git co myfile
This will have the same semantics as svn revert.
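Both approaches can be tried safely in a scratch repository. This is a minimal sketch; the file name myfile and its content are only for illustration, and git checkout is spelled out in full in case the co alias is not configured.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "committed" > myfile
git add myfile
git commit -q -m "Initial commit"

# Make a change, then put it aside
echo "experiment" >> myfile
git stash > /dev/null
after_stash=$(cat myfile)      # the file is back to its committed content

# Reapply the change which was put aside
git stash apply > /dev/null
after_apply=$(cat myfile)      # the experiment is back

# Throw the change away for good (same as "git co myfile" with the alias)
git checkout -- myfile
after_checkout=$(cat myfile)
```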
Q: I thought that git reset was the option to use ?
No, don't use git reset. Git reset can be used to revert changes, but can be dangerous to use, as it can also remove *commits*, not just changes. git reset is only useful for advanced usage of git. Use git stash or git co, not git reset.
Scenario 4: simple commits, no branching
To do a commit, use the commit command:
git commit -am "My commit."
The -m option has the same meaning as in the svn commit command. By default (without the "-a"), git only commits the changes you explicitly told it about with the add command. Although extremely useful, this can be a bit confusing at first when you come from an svn background, hence the -a option.
A big difference of git compared to svn which cannot be skipped even at this level: git clone gives you a working tree (a snapshot of the sources at one revision) AND the repository with the full history. It means in particular that committing a change will NOT propagate it to the original repository you cloned from. For this, you need to use push:
git push git.scipy.org/git/numpy
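The fact that a commit stays local until it is pushed can be checked with a local stand-in for the remote. In this sketch, remote.git is a hypothetical bare repository created only for the demonstration.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote one (hypothetical name)
git init -q --bare remote.git

# Clone it (git warns that the repository is empty, which is expected here)
git clone -q remote.git numpy-git 2>/dev/null
cd numpy-git
git config user.name "Your Name"
git config user.email you@example.com

echo "hello" > README
git add README
git commit -q -m "My commit."
branch=$(git rev-parse --abbrev-ref HEAD)

# The commit only exists in the clone: the remote has no branch yet
remote_before=$(git ls-remote --heads origin)

# Propagate it explicitly
git push -q origin "$branch"
remote_after=$(git ls-remote --heads origin | cut -f1)
local_head=$(git rev-parse HEAD)
echo "remote now at $remote_after"
```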
A first few git specific workflows
Before showing a few simple but powerful git-specific workflows, we need to talk about two features of git. One, branches, is not specific to git, but the index concept is, and a basic understanding is necessary for most git-specific workflows.
The branch concept
Git, like other DVCS, is strongly designed around the notion of branches. Instead of everyone committing directly to the trunk, most development happens in branches, which are then merged into the mainline. What's the point, you may ask ?
- commit is fast: it is instantaneous.
- branches are isolated: if you work on a non trivial feature, having a separate branch means you can commit regularly on it, without pushing things into the "trunk". In particular, you can break things without disturbing anyone else.
- branches are a useful unit of decomposition. Although it still certainly makes sense to commit things directly into the main line of development, regularly using separate branches is a good way to split tasks. This is especially useful for reviews: having a separate branch means everyone can easily look at those changes only. The examples will obviously make this clearer.
The index and content-oriented tracking
In the simple scenarios, we mentioned the '-a' option as necessary to commit all changes. That's because in git, you have to explicitly say which changes you want in a commit. Although a minor inconvenience in simple cases, this is extremely useful in advanced cases, especially for complex merges (to deal with conflicts). This is linked to the fundamental idea that git tracks content, and not files. When you do:
git add foo.c
You are not really adding the file foo.c to the repository: you are adding its current content to the index. By default, and without the '-a' option, a commit will NOT include content which has not been added with git add. The '-a' option is equivalent to doing an add on every changed file which git already tracks, and then a commit. The index is thus a staging area between your working tree (the files as they are on your filesystem) and the repository (the set of commits).
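The role of the index is easy to see when two files are modified but only one is added. A minimal sketch, with foo.c and bar.c as illustrative file names:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "int foo;" > foo.c
echo "int bar;" > bar.c
git add foo.c bar.c
git commit -q -m "Initial commit"

# Modify both files
echo "/* change */" >> foo.c
echo "/* change */" >> bar.c

# Put only the content of foo.c in the index, then commit without -a
git add foo.c
git commit -q -m "Change foo.c only"

# Files touched by the commit, and files still modified in the working tree
in_commit=$(git diff --name-only HEAD^ HEAD)
still_modified=$(git diff --name-only)
echo "committed: $in_commit / not committed: $still_modified"
```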
Commits, sha1 and versions
One thing which does not seem right at first in git is the notion of sha1 for "versions". Each commit has a unique identifier, which is a sha1 (a checksum), and not a plain integer as in svn. The lack of simple integers is inherent to DVCS, and to branches in general. Indeed, simple integers only make sense when there is only one line of changes (the trunk); once you have several branches, the numbers are intermixed and their ordering does not make sense. In git, by design, each commit sha1 depends on the commit content as well as on its parents. The sha1 thus acts as a security mechanism: if one commit is changed, every commit which descends from it in the graph will have its sha1 changed as well, which prevents malicious content changes from going unnoticed.
Now, this hardly matters from a user POV, because you usually do not use sha1 to refer to a commit. You use either branch names, tags (which are just a "pretty" name to a commit in git), dates, etc...:
# Look at the last commit:
git show HEAD
# Look at the last commit parent:
git show HEAD^
# Look at the log of changes made the last 2 days
git log --since="2 days"
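To see how these names relate to sha1s, the sketch below builds two commits in a scratch repository and checks that HEAD^ resolves to the first one, which is also the parent recorded inside the second commit.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
first=$(git rev-parse HEAD)        # sha1 of the first commit

echo "two" >> file.txt
git commit -q -am "Second commit"

# HEAD^ is a name for the parent of the last commit
parent_of_head=$(git rev-parse HEAD^)

# The raw commit object records its parent's sha1
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "first commit:    $first"
echo "HEAD^:           $parent_of_head"
echo "recorded parent: $recorded_parent"
```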
Scenario 1: creating a new branch
Creating a new branch to make my changes.
# Create a branch from an old branch named oldbranch
git branch newbranch oldbranch
# Switch to the new branch
git checkout newbranch
This can be done in one command:
git co -b newbranch oldbranch
Now, every commit will be put in newbranch. Again, as for commits, the branch is only created in your repository, and not propagated to the remote repository, unless you explicitly push for it:
# This pushes all the changes in newbranch to the remote repository
git push url_repo newbranch
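The isolation of a new branch can be observed in a scratch repository. In this sketch, oldbranch is simply whatever branch git init created, and git checkout -b is spelled out in full rather than using the co alias.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
oldbranch=$(git rev-parse --abbrev-ref HEAD)

# Create newbranch from oldbranch and switch to it, in one command
git checkout -q -b newbranch "$oldbranch"
echo "work" >> file.txt
git commit -q -am "Work on newbranch"

# The new commit is on newbranch only
on_old=$(git rev-list --count "$oldbranch")
on_new=$(git rev-list --count newbranch)
echo "$oldbranch has $on_old commit(s), newbranch has $on_new"
```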
Scenario 2: comparing branches
This is one example where git is much more powerful AND easier than svn :) To compare the HEAD of two branches (that is, the last revision of each branch), you simply use the branch1..branch2 syntax:
# Get the diff "between" two branches
git diff oldbranch..newbranch
# Get the log of commits which are in newbranch but not in oldbranch
git log oldbranch..newbranch
We use "between" very loosely. For the following simple scenario:
- o -- o -- o oldbranch
   \ -- o -- o newbranch
where o is one commit, git diff oldbranch..newbranch compares the two tips directly, so it contains the changes specific to newbranch AND (reversed) the changes made in oldbranch since newbranch was started. To get only the changes specific to newbranch, use the ... syntax instead:
git diff oldbranch...newbranch
Note the difference between '..' and '...'. For git diff, '..' (2 dots) between the two branches to compare is the same as a space:
git diff oldbranch newbranch
git diff oldbranch..newbranch
are exactly the same.
For git log, the dots have a different meaning: git log oldbranch..newbranch already lists only the commits specific to newbranch, whereas git log oldbranch...newbranch lists the commits which are in either branch but not in both.
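The difference between the two diff forms shows up as soon as both branches have moved. The sketch below recreates the diagram in a scratch repository, with one extra file per branch (old.txt and new.txt are illustrative names).

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "base" > shared.txt
git add shared.txt
git commit -q -m "Common ancestor"

# Both branches start from the same commit
git branch oldbranch
git branch newbranch

git checkout -q newbranch
echo "new" > new.txt
git add new.txt
git commit -q -m "Add new.txt on newbranch"

git checkout -q oldbranch
echo "old" > old.txt
git add old.txt
git commit -q -m "Add old.txt on oldbranch"

# Two dots: tip against tip, so files from both branches show up
two_dot=$(git diff --name-only oldbranch..newbranch)
# Three dots: common ancestor against newbranch, so only new.txt shows up
three_dot=$(git diff --name-only oldbranch...newbranch)
echo "two dots:   $two_dot"
echo "three dots: $three_dot"
```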
Scenario 3: Merging branches
Merging branches is easy:
# Will merge branch1 into the current branch
git merge branch1
In case of conflicts, you solve the conflicts as in svn, and then use add to mark the content as resolved.
# Edit the file foo.c which has a conflict
# Mark the conflict as resolved
git add foo.c
git ci -m "Merge branch to make foo from bar."
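A full conflict and resolution cycle can be rehearsed in a scratch repository. In this sketch the "edit" step is replaced by overwriting foo.c, and git commit is spelled out in full in case the ci alias is not configured.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "original" > foo.c
git add foo.c
git commit -q -m "Initial commit"
mainline=$(git rev-parse --abbrev-ref HEAD)

# Change the same line on two branches
git checkout -q -b branch1
echo "from branch1" > foo.c
git commit -q -am "Change foo.c on branch1"
git checkout -q "$mainline"
echo "from the main line" > foo.c
git commit -q -am "Change foo.c on the main line"

# The merge stops on the conflict (non-zero exit status)
git merge branch1 > /dev/null 2>&1 || echo "conflict in foo.c"

# Resolve the conflict (here by overwriting the file), then mark it resolved
echo "resolved" > foo.c
git add foo.c
git commit -q -m "Merge branch1, resolving the conflict in foo.c"

# A merge commit has two parents
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "the merge commit has $parents parents"
```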
Throwing away branches and commits
Note: obviously, this can potentially lead to data loss. A good way to avoid unintended consequences is to make those changes in another branch, so that if you make a mistake, you can go back to the initial state by checking out the original branch.
The first, and safest, way to roll back a change is to use revert. Revert undoes the changes made in a commit by creating a new commit with the inverse changes:
git revert <commit>
where <commit> is a sha1, or any other name of this commit. After this command, the new state of the repository will be as if the original <commit> was not done, but the history will have both the original commit and the revert commit.
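The following sketch shows this in a scratch repository: after the revert, the file content is back to what it was, but the history is one commit longer. The --no-edit option only skips the editor for the commit message.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "two" >> file.txt
git commit -q -am "Second commit"

# Undo the second commit by adding a new commit on top of it
git revert --no-edit HEAD > /dev/null

content=$(cat file.txt)              # back to the content of the first commit
count=$(git rev-list --count HEAD)   # original commits plus the revert commit
echo "content: $content, commits in history: $count"
```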
To remove a commit completely, use the reset command:
# Remove the last commit, and set up the working tree exactly as it was at the previous commit
git reset --hard HEAD^
# Remove the last commit, but do not touch the working tree (files untouched)
git reset --soft HEAD^
In this case, the removed commit(s) will be removed completely, with (almost) no chance to be recovered. USE RESET WITH CARE !
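The difference between the two reset modes can be checked in a scratch repository before trying it on real work. Note that the commit to remove is designated by naming its parent, HEAD^, as the reset target. A minimal sketch:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "two" >> file.txt
git commit -q -am "Second commit"

# --soft: the last commit is removed, the files are left untouched
git reset -q --soft HEAD^
soft_count=$(git rev-list --count HEAD)
soft_content=$(cat file.txt)

# Redo the commit, then remove it again with --hard:
# this time the files go back to the previous commit as well
git commit -q -am "Second commit again"
git reset -q --hard HEAD^
hard_count=$(git rev-list --count HEAD)
hard_content=$(cat file.txt)
echo "soft: $soft_count commit(s); hard: $hard_count commit(s)"
```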
Misc
tags
git has a real tag concept. Tags are created, listed and modified using the git tag command:
# List tags
git tag -l
# Create tag
git tag mytag
Tags cannot have spaces in them.
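As a quick check, the sketch below creates a tag in a scratch repository, verifies that it names the current commit, and shows that a name containing a space is rejected.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Your Name"
git config user.email you@example.com
echo "content" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Create a tag on the current commit, then list the tags
git tag mytag
tags=$(git tag -l)

# A tag is just another name for a commit
tag_commit=$(git rev-parse mytag)
head_commit=$(git rev-parse HEAD)

# A tag name containing a space is refused by git
if git tag "my tag" 2>/dev/null; then
    space_allowed=yes
else
    space_allowed=no
fi
echo "tags: $tags, tag with a space allowed: $space_allowed"
```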
Attachments
- simple.png (59.6 KB) - added by cdavid 4 years ago.

