- 1 Installation instructions
- 2 Unix Shell
- 3 Why are we here?
- 4 Setup
- 5 Creating a repository
- 6 Tracking changes
- 7 Exploring history
- 8 Moving through time
- 9 Branching and merging
- 10 Local conflicts
- 11 Ignoring Things
- 12 (Optional) GitHub
- 13 (Optional) Collaborating
- 14 (Optional) Collaboration conflicts
- 15 Version control with Python source vs. IPython notebooks
- 16 Git command summary
- 17 Graphical User Interfaces
- 18 Next steps (intermediate Git)
- 19 Credits
- 20 References
- Up-to-date installation instructions for Git and Bash are available here: https://libguides.ucmerced.edu/software-carpentry/git/install
- Create a GitHub account here: https://github.com/
- Download GitHub Desktop: https://desktop.github.com
- Broadly speaking, there is a tension between making computer systems fast and making them easy to use.
- A common solution is to create a 2-layer architecture: A fast, somewhat opaque core surrounded by a more friendly scriptable interface (also referred to as "hooks" or an "API"). Examples of this include video games, Emacs and other highly customizable code editors, and high-level special-purpose languages like Stata and Mathematica.
- The Unix shell is the scriptable interface around the operating system. It provides a simple way to make the operating system do work, without having to know exactly how it accomplishes that work.
- Minimal example: https://swcarpentry.github.io/shell-novice/02-filedir/index.html
- "directory" == folder
- Your files are in "/home/<your login>" or "/Users/<your login>"
- Trees are upside-down in computer science
whoami
pwd # Print Working Directory
Command flags modify what a command does.
ls # List directory contents
ls -a # ... and include hidden files
man ls # Manual for "ls"
ls --help # In-line help info; should work in Windows
- You can navigate through the man page using the space bar and arrow keys
- Quit man with "q"
- Online references are available for Windows users who don't have man pages: https://linux.die.net/
When a command is followed by an argument, it acts on that argument.
cd Desktop
ls *.pdf # List all files ending in ".pdf"
cd .. # go up one directory
The terminal saves your command history (typically 500 or 1000 commands)
- You can see previous commands using the up/down arrows
- You can edit the command that's currently visible and run it
Once your command history gets big, you might want to search it:
history
history | grep ls # pipe the output of history into search
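The wildcard and pipe patterns above can be tried safely in a scratch directory. This is a minimal sketch; the directory and the file names are made up for illustration and are not part of the lesson's garden project.

```shell
# Work in a throwaway directory so nothing on your Desktop is touched
scratch=$(mktemp -d)
cd "$scratch"

# Hypothetical files, created only for this demonstration
touch notes.txt report.pdf slides.pdf

# The shell expands *.pdf before ls runs, so ls receives two file names
ls *.pdf

# A pipe sends the output of one command to the input of the next:
# grep -c counts the lines of the listing that end in ".pdf"
pdf_count=$(ls | grep -c '\.pdf$')
echo "PDF files found: $pdf_count"
```

The same pipe works with history in place of ls, which is how "history | grep ls" searches your past commands.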
- Move backwards and forwards in time using save points in your code history.
- Control what goes into a save point
- Collaborate
- Explore alternative versions of your project without destroying prior work
- Useful for text files, less useful for binary files (most of the useful features are text-oriented)
git config --list # or -l
git config --list --show-origin # where is this setting coming from?
All Git commands are two-part verbs ("git" plus a subcommand such as "config"), followed by flags and arguments. Use quotes if an argument contains spaces (e.g. a user name):
git config --global user.name "Gilgamesh"
git config --global user.email gilgamesh@uruk.gov
git config --global core.autocrlf input # Unix and MacOS
git config --global core.autocrlf true # Windows
You can use any text editor, but you want a sensible default in case Git opens one for you:
git config --global core.editor nano
- Only push the current branch (more about this later):
git config --global push.default simple
- Merge, don't rebase (more about this later):
git config --global pull.rebase false
git config --global init.defaultBranch main
We are going to create and track plans for our garden.
cd ~/Desktop
mkdir garden
cd garden
git init
ls
ls -a
- Git uses the special `.git` subdirectory to store all the information about the project, including the history of all files and sub-directories located within the project's directory. If we ever delete the `.git` subdirectory, we will lose the project's history.
- Only one version of a file is visible; the rest are available in the database
git status
You can edit with nano or with the text editor of your choice. We'll try to show the editor and the command line side-by-side.
touch shopping_list.txt
nano shopping_list.txt
##--- text file ---##
1. Cherry tomatoes
Save and quit. You can verify that you've saved your changes in Bash:
ls
cat shopping_list.txt
Manually assemble your next save point in the Staging area ("Index"). When you're happy with it, commit it to the repository to create a new version of your project.
git status
git add shopping_list.txt
git status
git commit -m "Start shopping list for garden"
git status
- Commit messages should be useful; eventually there will be a lot of them (we'll come back to this)
- There are multiple synonyms for each of these locations:
- Workspace or Working Tree
- Staging Area, Index, or Cache
- Repository or Commit History
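The three locations can be watched from the command line with "git status --short". The sketch below assumes a throwaway repository in a temporary directory, with the identity set locally so that it does not touch your global configuration.

```shell
# Throwaway repository; the identity is set for this repository only
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"

echo "1. Cherry tomatoes" > shopping_list.txt

# Working tree only: Git reports the file as untracked ("??")
in_worktree=$(git status --short)
echo "$in_worktree"

# Staging area: the file is queued for the next commit ("A ")
git add shopping_list.txt
in_staging=$(git status --short)
echo "$in_staging"

# Repository: after the commit there is nothing left to report
git commit -q -m "Start shopping list for garden"
after_commit=$(git status --short)
echo "Commits in history: $(git rev-list --count HEAD)"
```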
# Concise help
git add -h
# Verbose help
man git-add
- Edit the file
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
git status
git diff
- If you try to commit the file before you add it to the Staging area, nothing happens
git commit -m "Add basil"
git status
- You have to add the file to the Staging area, then commit
git add shopping_list.txt
git commit -m "Add basil"
git log
git log --oneline
- You can identify a commit by unique ID or by HEAD offset (HEAD, HEAD~1, HEAD~2, …)
- HEAD is a pointer to the most recent commit (of the active branch)
git log --oneline --graph # Useful if you have many branches
git log --author=Gilgamesh
git log --since=5.days # or weeks, months, years
- Edit the file
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
- By default, diff shows changes to the Workspace
git status
git diff
- Once the file is added to Staging, diff no longer shows changes
git add shopping_list.txt
git status
git diff
- You can examine Staging instead
git diff --staged # or "--cached"
git commit -m "Add peppers"
git status
- Staging area is for creating sensible commits. You can edit multiple files and only add a subset of them to a given commit. This makes it easier to look back at your work.
- A commit should be a coherent functional chunk (whatever that means). One way to think about it: If you wanted to cleanly undo your work, what would that look like?
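The difference between "git diff" and "git diff --staged" can be checked without reading any diff output, because the --quiet flag turns the result into an exit status (0 for no differences, 1 for differences). A minimal sketch, assuming a throwaway repository:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
printf '1. Cherry tomatoes\n2. Italian basil\n' > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"

# Edit the tracked file without staging it
echo "3. Jalapenos" >> shopping_list.txt

# Before "git add": the change is visible to plain diff only
git diff --quiet;          unstaged_before=$?
git diff --staged --quiet; staged_before=$?

git add shopping_list.txt

# After "git add": the change has moved to the staged comparison
git diff --quiet;          unstaged_after=$?
git diff --staged --quiet; staged_after=$?

echo "before add: unstaged=$unstaged_before staged=$staged_before"
echo "after add:  unstaged=$unstaged_after staged=$staged_after"
```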
- Try to commit an empty directory
mkdir flowers
git status
git add flowers
git status
- Now add files and try again
touch flowers/roses flowers/tulips
git status
git add flowers
git commit -m "Initial thoughts on flowers"
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
4. Cayenne peppers
# NB: With nothing staged, this is identical to "git diff" with no commit argument
# git diff HEAD shopping_list.txt
# Show all changes back to this point
# HEAD~1 doesn't have text changes - added directory
git diff HEAD~1 shopping_list.txt
git diff HEAD~3 shopping_list.txt
# Show changes for just HEAD~3
git show HEAD~3 shopping_list.txt
# Show changes in range of commits
git diff HEAD~3..HEAD~1 shopping_list.txt
git log HEAD~3..HEAD~1
# Theoretically you can do this
# git diff f22b25e3233b4645dabd0d81e651fe074bd8e73b shopping_list.txt
# Use reduced 7-character ID from "git log --oneline"
git diff f22b25e shopping_list.txt
# We have unstaged changes
git status
# Revert the working tree to the most recent commit
git restore shopping_list.txt
# Check whether your editor is automatically updating!
cat shopping_list.txt
# The old way of doing it:
# git checkout HEAD shopping_list.txt
git checkout f22b25e shopping_list.txt
# Alternatively, you can use the HEAD offset:
git checkout HEAD~3 shopping_list.txt
# View the changed file in the Working Tree
cat shopping_list.txt
# These changes are also in the Staging area; you can create a new commit
# that includes the older file version.
git status
git diff
git diff --staged
# Go back to the most recent version
git checkout HEAD shopping_list.txt
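The round trip above (fetch an old version of one file, inspect it, return to the latest version) can be replayed in a scratch repository. This is a sketch under the assumption of a two-commit history; it uses "git restore" with explicit flags for the return trip, which is the newer equivalent of "git checkout HEAD <file>".

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"

echo "1. Cherry tomatoes" > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"
echo "2. Italian basil" >> shopping_list.txt
git add shopping_list.txt
git commit -q -m "Add basil"

# Bring back the file as it was one commit before HEAD
git checkout HEAD~1 shopping_list.txt
old_version=$(cat shopping_list.txt)

# The old version is staged as well, so the staged diff is not empty
git diff --staged --quiet; staged_changes=$?

# Return both the staging area and the working tree to the latest commit
git restore --source=HEAD --staged --worktree shopping_list.txt
latest_version=$(cat shopping_list.txt)

echo "old:    $old_version"
echo "latest: $latest_version"
```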
Instructor's note: Update drawing with files moving in and out of working tree/staging area
What if you want to see a previous version of the whole project?
# Checking out a commit detaches HEAD: the HEAD pointer moves back to an earlier version of the whole project
git checkout HEAD~2
git status
git log --oneline
# Move HEAD back to latest commit by checking out the branch name
git checkout main
Instructor's note: Update drawing with moving HEAD pointer
- You can also check out a tag.
- Unfortunately some of these terms, like "checkout", are overloaded. Think about what you want to do to your history, then look up the appropriate command.
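One way to tell whether HEAD is detached is "git symbolic-ref", which succeeds only while HEAD points at a branch. The sketch below assumes a throwaway repository with three commits and reads the default branch name from Git rather than hard-coding it.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
branch=$(git symbolic-ref --short HEAD)   # name of the default branch

echo "1. Cherry tomatoes" > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"
echo "2. Italian basil" >> shopping_list.txt
git add shopping_list.txt
git commit -q -m "Add basil"
mkdir flowers
touch flowers/roses
git add flowers
git commit -q -m "Initial thoughts on flowers"

# View the whole project as it was two commits ago
git checkout -q HEAD~2
if git symbolic-ref -q HEAD > /dev/null; then state_then="on a branch"; else state_then="detached"; fi
files_then=$(ls)

# Return to the latest commit by checking out the branch name
git checkout -q "$branch"
if git symbolic-ref -q HEAD > /dev/null; then state_now="on a branch"; else state_now="detached"; fi

echo "two commits ago: $state_then, files: $files_then"
echo "now: $state_now"
```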
# Create a new branch
git branch feature
# Show all branches
git branch
# Switch to new branch
git switch feature
git branch
git status
touch feature.txt
nano feature.txt
##--- text file ---##
This is a new feature we're trying out
git status
git add feature.txt
git commit -m "Added a trial feature"
# File doesn't exist on the main branch
git switch main
ls
# Merging the feature branch adds your changes
git merge feature
ls
- This is the simplest possible case: All of the new changes were in one branch (a Fast-Forward merge just moves the branch tag)
- A branch history with competing changes is shown in the Conflicts section below (Recursive merge, which resembles the octopus diagram)
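A fast-forward merge can be confirmed by comparing commit IDs: afterwards both branch names point at the same commit. In this sketch the --ff-only flag is an addition to the lesson's command; it makes Git refuse anything that is not a fast-forward. The repository is a throwaway one.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
main_branch=$(git symbolic-ref --short HEAD)   # name of the default branch

echo "1. Cherry tomatoes" > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"

# All new work happens on the feature branch
git branch feature
git switch -q feature
echo "This is a new feature we're trying out" > feature.txt
git add feature.txt
git commit -q -m "Added a trial feature"

# The default branch has not moved, so Git can simply slide its tag forward
git switch -q "$main_branch"
git merge -q --ff-only feature

main_tip=$(git rev-parse "$main_branch")
feature_tip=$(git rev-parse feature)
echo "$main_branch: $main_tip"
echo "feature: $feature_tip"
```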
git branch pepper
git switch pepper
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
4. Cayenne peppers
git add shopping_list.txt
git commit -m "Added peppers to pepper branch"
git switch main
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
4. Garlic
git add shopping_list.txt
git commit -m "Added garlic to main branch"
git merge pepper
Edit the file to resolve the conflict. You can delete one of the two lines, combine them, or make any other changes. Delete the conflict markers (the lines beginning with "<", "=", and ">") before staging the file.
##--- text file ---##
<<<<<<< HEAD
4. Garlic
=======
4. Cayenne peppers
>>>>>>> dabb4c8c450e8475aee9b14b4383acc99f42af1d
git add shopping_list.txt
git commit -m "Merge pepper branch and resolve conflict"
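The whole conflict cycle can be rehearsed in a scratch repository. This sketch resolves the conflict by overwriting the file with a combined list, which is only one of the possible resolutions; the final commit message is made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
main_branch=$(git symbolic-ref --short HEAD)   # name of the default branch

printf '1. Cherry tomatoes\n2. Italian basil\n3. Jalapenos\n' > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Add peppers"

# Competing change number one, on the pepper branch
git branch pepper
git switch -q pepper
echo "4. Cayenne peppers" >> shopping_list.txt
git add shopping_list.txt
git commit -q -m "Added peppers to pepper branch"

# Competing change number two, on the default branch
git switch -q "$main_branch"
echo "4. Garlic" >> shopping_list.txt
git add shopping_list.txt
git commit -q -m "Added garlic to main branch"

# The merge stops with a non-zero exit status when it hits a conflict
if git merge pepper > /dev/null 2>&1; then merge_result="clean"; else merge_result="conflict"; fi

# List the files that still have unresolved conflicts
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge: $merge_result, conflicted files: $conflicted"

# Resolve by writing the version we want (here: keep both items), then stage and commit
printf '1. Cherry tomatoes\n2. Italian basil\n3. Jalapenos\n4. Garlic\n5. Cayenne peppers\n' > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Merge pepper branch, keeping garlic and peppers"
remaining=$(git diff --name-only --diff-filter=U)
```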
mkdir results
touch a.dat b.dat c.dat results/a.out results/b.out
ls
git status
touch .gitignore
ls -a
##--- text file ---##
*.dat
results/
# We are ignoring .dat files and tracking .gitignore
git status
git add .gitignore
git commit -m "Ignore output files"
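When an ignore rule does not behave as expected, "git check-ignore" reports which pattern matches a path. The sketch below reuses the two patterns above in a throwaway repository; notes.txt is a hypothetical file added to show a path that is not ignored.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir results
touch a.dat results/a.out notes.txt
printf '*.dat\nresults/\n' > .gitignore

# -v prints the ignore file, the line number, and the pattern that matched
git check-ignore -v a.dat results/a.out

# With -q only the exit status is set: 0 means ignored, 1 means not ignored
git check-ignore -q a.dat;         dat_status=$?
git check-ignore -q results/a.out; out_status=$?
git check-ignore -q notes.txt;     notes_status=$?

echo "a.dat=$dat_status results/a.out=$out_status notes.txt=$notes_status"
```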
- Ignoring complicated directory structures can be tricky; come talk to me
- You should generally ignore archives (zip, tar), images (png, jpg), binaries (dmg, iso, exe), compiler output, log files, and .DS_Store (Mac)
- git pull merges the origin/main branch into the local main branch
- git push updates the origin/main branch with commits from the local main branch
- Under the hood, pull is fetch + merge
- fetch gets updates from the remote
- The local repository has two branches: origin/main (the remote-tracking branch) and main
- merge merges origin/main into main
- push is not compound: it sends your new commits and moves the remote branch in a single step
- easy collaboration
- sync between machines
- off-site backup
- peer review
- GitHub Desktop authenticates with a browser token on all platforms. Unix users can use SSH keys instead.
- Two-factor authentication options
- Github Mobile
- Personal authenticator (e.g. Microsoft, 1Password, etc)
- SMS (not preferred)
- Create new repository (visual instructions here: https://swcarpentry.github.io/git-novice/07-github/index.html)
- Call it "garden"
- Find HTTPS string that identifies repository
git remote add origin https://github.com/devnich/garden.git
git remote -v
git push origin main # you should get a password prompt
If you configure origin as the upstream for your branch (for example with "git push -u origin main"), you can just do:
git push
pull is a shortcut for fetch + merge
git pull
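The two halves of a pull can be run separately to see what each one does. In this sketch a bare repository on the local disk stands in for GitHub, so no network or account is needed; the directory names follow the lesson, but the setup itself is an assumption made for the demonstration.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the remote
git init -q --bare remote.git

# First copy: create the project and publish it
git init -q garden
cd garden
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
git remote add origin "$work/remote.git"
branch=$(git symbolic-ref --short HEAD)   # name of the default branch
echo "1. Cherry tomatoes" > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"
git push -q -u origin "$branch"

# Second copy: clone, add a file, and push it
cd "$work"
git clone -q remote.git garden-clone
cd garden-clone
git config user.name "Enkidu"
git config user.email "enkidu@uruk.gov"
echo "1. Plum" > trees.txt
git add trees.txt
git commit -q -m "I like plums"
git push -q origin "$branch"

# Back in the first copy, do by hand what "git pull" does
cd "$work/garden"

# Step 1: fetch updates origin/<branch> but leaves our files alone
git fetch -q origin
if [ -e trees.txt ]; then after_fetch="present"; else after_fetch="absent"; fi

# Step 2: merge brings the fetched commits into the local branch
git merge -q "origin/$branch"
if [ -e trees.txt ]; then after_merge="present"; else after_merge="absent"; fi

echo "trees.txt after fetch: $after_fetch, after merge: $after_merge"
```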
Instructor's note: Demo this section with two terminal windows, one for "garden" and one for "garden-clone"
git clone https://github.com/devnich/garden.git ~/Desktop/garden-clone
cd garden-clone
touch trees.txt
##--- text file ---##
1. Plum
2. Pluot
3. Aprium
pwd # we are in ~/Desktop/garden-clone
git status
git add trees.txt
git commit -m "I like plums"
git push
cd ../garden # now we are in ~/Desktop/garden
ls
git pull
ls
- Clone repository
- Create new branch
- Push branch to shared repository
- Request merge
- Fork repository
- Clone forked repository
- Create branch (optional)
- Push changes to forked repository
- Create pull request for original repository
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
4. Scotch bonnet peppers
git add shopping_list.txt
git commit -m "Added more peppers to our copy"
git push origin main
##--- text file ---##
1. Cherry tomatoes
2. Italian basil
3. Jalapenos
4. Garlic
git add shopping_list.txt
git commit -m "Added garlic to rival copy"
# Rejected because the remote has a commit that our local branch lacks
git push origin main
# Pulling results in a local conflict
git pull origin main
Edit the file to resolve the conflict. You can delete one of the two lines, combine them, or make any other changes. Delete the conflict markers (the lines beginning with "<", "=", and ">") before staging the file.
##--- text file ---##
<<<<<<< HEAD
4. Garlic
=======
4. Scotch bonnet peppers
>>>>>>> dabb4c8c450e8475aee9b14b4383acc99f42af1d
You may want to enable a default merge tool:
git config --global merge.tool meld
- Open source merge tools include Vimdiff, Meld, KDiff3, GitFiend, Git Cola, etc. There are many other options!
- Always pull before you push
- To minimize conflicts, do your work on a separate branch
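The "pull before you push" rule can be seen in action with a local bare repository standing in for GitHub. In this sketch the two copies edit different files, so the pull merges cleanly; the second identity and the --no-edit flag (which accepts the default merge message) are assumptions made to keep the script non-interactive.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git

# First copy: create the project and publish it
git init -q garden
cd garden
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"
git remote add origin "$work/remote.git"
branch=$(git symbolic-ref --short HEAD)   # name of the default branch
echo "1. Cherry tomatoes" > shopping_list.txt
git add shopping_list.txt
git commit -q -m "Start shopping list for garden"
git push -q -u origin "$branch"

# Second copy: push a new commit to the shared repository
cd "$work"
git clone -q remote.git garden-clone
cd garden-clone
git config user.name "Enkidu"
git config user.email "enkidu@uruk.gov"
echo "1. Plum" > trees.txt
git add trees.txt
git commit -q -m "I like plums"
git push -q origin "$branch"

# First copy again: commit without pulling first
cd "$work/garden"
echo "2. Italian basil" >> shopping_list.txt
git add shopping_list.txt
git commit -q -m "Add basil"

# The push is rejected because the remote has a commit we have not seen
if git push -q origin "$branch" 2> /dev/null; then first_push="accepted"; else first_push="rejected"; fi

# Pull (fetch + merge), then push again
git pull -q --no-rebase --no-edit origin "$branch"
if git push -q origin "$branch" 2> /dev/null; then second_push="accepted"; else second_push="rejected"; fi

echo "first push: $first_push, second push: $second_push"
```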
.ipynb files contain a lot of JSON boilerplate that isn't code
Git commands are about moving stuff between trees: https://ndpsoftware.com/git-cheatsheet.html
- Viewing history is a much better experience
- Not fully functional (missing commands and command options)
- Git is still complicated. Menus and buttons don’t change that.
- Accidental button presses are scary
- git blame: See who changed each line of a file
- git bisect: Find out when a change was introduced (good man page)
- git add --patch: Stage a part of a file ("hunk") instead of the entire file
- git <command> -i: Run a command interactively, confirming each step (e.g. "git add -i")
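As an illustration of "git bisect", the sketch below builds a five-commit history in a throwaway repository and lets "git bisect run" find the commit that introduced an unwanted line. The word "weeds" stands in for a bug and is purely hypothetical; the test command passed to "bisect run" must exit with 0 for a good commit and 1 for a bad one.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"

# Five commits; the fourth one introduces the unwanted line
for item in tomatoes basil jalapenos weeds garlic; do
    echo "$item" >> shopping_list.txt
    git add shopping_list.txt
    git commit -q -m "Add $item"
done

# Tell bisect that HEAD is bad and the very first commit is good
first_commit=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first_commit" > /dev/null

# Let Git check out commits and run the test until the culprit is found
git bisect run sh -c '! grep -q weeds shopping_list.txt' > /dev/null

# refs/bisect/bad now points at the first bad commit
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "First bad commit: $culprit"

# Leave bisect mode and return to the original branch
git bisect reset > /dev/null 2>&1
```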
Each of these is a different answer to the question, "How do I get back to where I was?" They are listed from least dangerous to most dangerous.
- git-restore: Restore files in the working tree from the index or from another commit. This command does not update your branch.
- git-revert: Make a new commit that reverts the changes made by other commits (good man page)
- git-reset: Update your branch, moving the tip in order to add or remove commits from the branch (i.e. it moves the HEAD pointer around and then takes additional actions based on the options you provide). This operation changes the commit history.
These commands are potentially dangerous because they rewrite history. You should never change or delete history that you have shared with other people.
- git reset: Unstage changes, or move the branch tip to an earlier commit while leaving your files as they are
- git reset --hard: Delete uncommitted changes and, optionally, some of your commits to get back to an earlier project state. Deleted uncommitted changes cannot be recovered!
- git rebase: Rewrite the history of branch A to include branch B. This is different from merging branch B into branch A; merging retains your project history, whereas rebasing rewrites that history.
- squash (e.g. with git rebase -i): Convert multiple commits into a single commit. This also rewrites your project history.
- git cherry-pick: Copy a single commit from a different branch. The copy is a new commit with its own ID, which can make it difficult to merge branches in the future.
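The difference between reverting and resetting shows up in the length of the history. A minimal sketch in a throwaway repository: revert adds a commit that undoes a change, while "reset --hard" moves the branch tip back and drops commits from the branch.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Gilgamesh"
git config user.email "gilgamesh@uruk.gov"

for item in "Cherry tomatoes" "Italian basil" "Jalapenos"; do
    echo "$item" >> shopping_list.txt
    git add shopping_list.txt
    git commit -q -m "Add $item"
done
count_start=$(git rev-list --count HEAD)

# revert: history grows by one commit that undoes the latest change
git revert --no-edit HEAD > /dev/null
count_after_revert=$(git rev-list --count HEAD)
lines_after_revert=$(grep -c '' shopping_list.txt)

# reset --hard: the branch tip moves back two commits, which leave the branch
git reset -q --hard HEAD~2
count_after_reset=$(git rev-list --count HEAD)
lines_after_reset=$(grep -c '' shopping_list.txt)

echo "commits: start=$count_start revert=$count_after_revert reset=$count_after_reset"
echo "lines:   revert=$lines_after_revert reset=$lines_after_reset"
```

Both operations leave the file with the same two lines here, but only the revert keeps a record of how it got there.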
- https://dlstrong.github.io/git-novice/
- https://git-scm.com/book/en/v2
- https://gitlab.com/liibre/curso/-/wikis/material
- https://swcarpentry.github.io/git-novice/reference
- https://swcarpentry.github.io/shell-novice/reference/
- https://twitter.com/jay_gee
- The Pro Git book: https://git-scm.com/book/en/v2
- Graphical user interfaces for Git (useful for visualizing diffs and merges): https://git-scm.com/book/en/v2/Appendix-A%3A-Git-in-Other-Environments-Graphical-Interfaces
- Git for Advanced Beginners: http://think-like-a-git.net
- "Git is built on a graph. Almost every Git command manipulates this graph. To understand Git deeply, focus on the properties of this graph, not workflows or commands.": https://codewords.recurse.com/issues/two/git-from-the-inside-out
- A Visual Git Reference: https://marklodato.github.io/visual-git-guide/index-en.html
- Visual cheat sheet: https://ndpsoftware.com/git-cheatsheet.html