Git
We will use git to manage lab and program submissions in this course. At minimum, this allows our TAs to quickly access your assignments that you “submit” on the server, and allows you to get some basic practice with an extremely important tool in the software development world. You can also choose to use git more extensively to manage your work in this class.
Git basics
Git is a version control system, meaning that it is meant to keep track of changes to code. It’s especially useful for collaborating, but it can also be used when working by yourself so that you can keep a history of your changes and why you made them.
Git is not GitHub. GitHub is a service that hosts many git repositories. Other similar services include GitLab and BitBucket. Other version control systems (so, these are comparable to git) include Mercurial and Subversion.
Minimum setup for assignment submission
For this course, we’re going to create a git repository and use it to tag the version of our programs that we would like to be graded.
Creating a git repository
To start, create a directory. For this course, your directory should be called csci112_spring2023
, so we will use that as an example.
Create the directory using:
mkdir csci112_spring2023
Navigate into the directory:
cd csci112_spring2023
Initialize a git repository inside the directory:
git init
You will see the following message. Don’t worry about it:
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Your csci112_spring2023
directory is now a git repository!
Try running git status
to see the status of your repository. You should see this:
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)
Now create a file with some text in it. You could do this by opening a new file with vim
, putting some text in, and saving it. Or you can copy the following:
echo "Hello, world!" > example.txt
(The echo
command prints whatever you specify to the screen; the >
redirects that output to a file instead of the screen.) You may want to check that the file is there by running ls
.
Now, run git status
again. You should get this:
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
example.txt
nothing added to commit but untracked files present (use "git add" to track)
Committing changes
In git, the basic idea for keeping track of changes to your respository is to save snapshots of the respository, which are called commits. You choose when to save the snapshots, and write a message (called a commit message) describing the changes that were made from the previous commit to this one.
Git allows you to choose which of the changes that you have made should be included in a commit. This is helpful when you are a more advanced git user, but for our purposes, we can just include all changes whenever we are ready to create a new snapshot (commit).
To specify that all changes should be included, we can run
git add -A
The -A
is for “all”. Now, if you run git status
, you should see:
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: example.txt
This means that the file example.txt
has been added (or “staged”, in the terminology git uses). Multiple files may be staged, and they would all be listed under “Changes to be committed” when you run git status
.
Now, we are finally ready to commit. We also need to provide a description of this commit, called a commit message, which we specify with the -m
option followed by our message in quotes. For example, we might do:
git commit -m "Add example.txt file"
And then we see a message like this:
[master (root-commit) 2c82f20] Add example.txt file
1 file changed, 1 insertion(+)
create mode 100644 example.txt
Running git status
again shows that we have no unsaved changes:
On branch master
nothing to commit, working tree clean
We can run
git log
to see a history of all of our commits. In this case, it looks something like
commit 2c82f20e9b41b4fe09b6c142d381dce32f95abe3 (HEAD -> master)
Author: lgw2 <lgw2@uw.edu>
Date: Thu Jan 26 19:55:31 2023 -0700
Add example.txt file
Tagging a commit
If we want to label a commit because it is important or we want to be able to access it easily, we can give it a tag. In this class, you will use git tags to indicate which version of your code we should grade.
To create a tag, run git tag <tagname>
after you have committed your changes. You can run git log
to check that your tag was associated with the right commit. For example, we might run
git tag example_tag
and then
git log
and see
commit 2c82f20e9b41b4fe09b6c142d381dce32f95abe3 (HEAD -> master, tag: example_tag)
Author: lgw2 <lgw2@uw.edu>
Date: Thu Jan 26 19:55:31 2023 -0700
Add example.txt file
Note the tag at the end of the first line!
You’ll notice that if you make a new commit and try to tag that commit with a tag tagname
that has already been used, you’ll get this message:
fatal: tag '<tagname>' already exists
To solve this, first delete the tag
git tag -d <tagname>
and then tag as normal
git tag <tagname>
You can tag a previous commit by referencing its commit hash. For example, if I run git log
and see something like this:
[p19t655@csci112 csci112_spring2023]$ git log
commit 71203455057227331ac8384f3f2c2a1910ef5094 (HEAD -> master)
Author: lgw2 <lgw2@uw.edu>
Date: Fri Feb 10 09:08:12 2023 -0700
Add a example file
commit aa08bd749ec35221abd90ce0908fa7279714e3c1 (origin/master)
Author: lgw2 <lgw2@uw.edu>
Date: Wed Feb 8 16:38:34 2023 -0700
Update
commit 3601b167efb9a1c6ed3f384c55a720c740f4ab07
Author: lgw2 <lgw2@uw.edu>
Date: Thu Jan 26 20:19:37 2023 -0700
lab1
commit 5170dff5f9795001a837407645ed55cae24e43d9
Author: lgw2 <lgw2@uw.edu>
Date: Thu Jan 26 20:18:53 2023 -0700
Update
and I want to tag the commit with the comment “lab1” with the tag lab1
, I cound run
git tag lab1 3601b167efb9a1c6ed3f384c55a720c740f4ab07
tl;dr
Once you have a version of your lab or program that you would like to “submit”, (i.e., be the one we grade if you don’t update it), do this:
git add -A
git commit -m "<description of your changes, or just name of assignment>"
git tag <specified tag for assignment>
If the tag already exists, do git tag -d <specified tag for assignment>
before the third command above.
Common issues
Nested repositories
In general, you do not want to have nested git respositories. This is what happens if you run git init
in one directory, and then run it again in one of the subdirectories of that directory later. You’ll know if you have nested repositories if git log
looks different in different directories of your repository.
Let’s get a better understanding of what this means with an example. Suppose that you have one directory called directory
, and inside it you had two directories called subdir1
and subdir2
, and maybe those have a few files in them. Visually, that looks like this:
├── directory
│ ├── subdir1
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
If at some point you had run git init
when you were in directory
, then directory
(and everything in it, including the subdirectories) is now being tracked as a git repository. This means that there is a .git
directory inside directory
, so the file structure actually looks like this:
├── directory
| ├── .git
│ ├── subdir1
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
(Note that we don’t see .git
unless we look for hidden files, for example by running ls -a
when we’re in the directory
directory.)
Now suppose we navigate to subdir1
and run git init
. This creates another (nested) repository inside subdir1
, so there is another .git
directory under subdir1
, and subdir1
is a new git repository with its own commit history.
├── directory
| ├── .git
│ ├── subdir1
│ │ ├── .git
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
But the whole idea of a repository is that it tracks everything inside of it. So now the directory
repository is tracking file1.txt
and so is the subdir1
repository, but there’s just the one file and so it really only makes sense to track it once.
If you realize you have created a nested repository, make sure you understand the level that you really want your repository to exist, and then delete the unneeded one by deleting the associated .git
repository. The command to do this will be different depending on where you currently are. If you’re in directory
, it would be rm -rf subdir1/.git
. -r
says delete all subdirectories recursively, and -f
says force the deletions without checking whether you want to delete each file. (There are a lot of them in there.) Note that this deletes the whole commit history and can’t be undone, so make sure you delete the correct .git
directory.
Optional: more advanced git
Adding a remote to your repository
If you would like to connect your csci112_spring2023
repository to GitHub, do the following steps. You must choose whether to use HTTPS or SSH to communicate with GitHub. Either is fine, but HTTPS is simpler, so I recommend that.
- Create a new repository on GitHub.com. Do not initialize the new repository with README, license, or gitignore files. Please make this repository private so that I can reuse assignments for this course without the solutions being available online! Copy the remote url with https in it.
- On the server, navigate to your
csci112_spring2023
repository and rungit remote add origin <REMOTE_URL>
- Now run
git push --set-upstream origin master
You may need to generate and enter a personal access token here.
Everything in your repository is now stored on the connected GitHub repository as well! But updates are not automatic. After any new commits, if you would like to send them to the GitHub repository, run
git push
Cloning that repo to your local computer
To copy over the files in your csci112_spring2023
directory (and now on GitHub as well) to a local computer, clone the GitHub repository.
In a terminal window, enter
git clone <REMOTE_URL>
This will create a directory called csci112_spring2023
that holds everything that was in the GitHub repo. If you make an update there (by pushing from the server) and want to get that update on your computer, navigate into the csci112_spring2023
directory and run git pull
.
Additional resources
You can watch the lecture and/or read the text from the MIT Missing Semester lecture on git.
You can look at the git tutorial from the software carpentries program, which aims to teach researchers basic computing skills.