A.2 Tutorial: Version Control with Git

 

Introduction

Version control is an important part of software engineering. With a version control system (VCS), we can track changes to the code, collaborate with others, and work with deployment technologies such as continuous integration.

In this tutorial, we will go through the Git version control system.


Objective

The goal of this tutorial is to understand the most basic functionality of Git and how it fits into a development scenario.

We will develop a dummy project in which we create code, commit it, create branches, merge changes, and resolve conflicts.


Example

In this example, we mimic a development workflow by developing a project with a number of features.

Creating a Repository

Assuming we are developing a project, we first create a directory for it and enter that directory. Since this is a fresh project, we then create a new Git repository.

$ mkdir project
$ cd project
$ git init
Initialized empty Git repository in /tmp/proj/.git/

We now have a Git repository.
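
As a small sketch, a script can confirm that it is inside a Git repository before running other commands. The scratch directory created with mktemp is an assumption, used so that the example does not touch a real project.

```shell
set -e
# Work in a throwaway directory (assumption: any empty directory will do).
cd "$(mktemp -d)"
git init -q

# Prints "true" when the current directory is inside a Git working tree.
inside=$(git rev-parse --is-inside-work-tree)
# Prints the location of the repository data, relative to the current directory.
git_dir=$(git rev-parse --git-dir)
echo "inside work tree: $inside, git dir: $git_dir"
```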

Adding Files

We start working by creating a folder for the source code and writing our first source file.

$ mkdir src
$ vim src/main.c
...work...
$

At this point, we have one file in our project. We can check the status of our project with the status command.

$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
    src/

nothing added to commit but untracked files present (use "git add" to track)

Staging files and committing

Git tells us that there are no commits yet and that there is an untracked folder. To start tracking the file in it, we need to stage it.

$ git add src/main.c
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
    new file:   src/main.c
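
The two states above (untracked, then staged) can also be read in compact form. The following is a minimal sketch using git status --short in a scratch repository; the placeholder content of main.c is an assumption.

```shell
set -e
cd "$(mktemp -d)"          # scratch directory, so no real project is touched
git init -q
mkdir src
echo 'int main(void) { return 0; }' > src/main.c   # placeholder source file

# --short prints one line per path; "??" marks untracked content.
status_before=$(git status --short)
echo "$status_before"

git add src/main.c
# After staging, the first column shows "A" (added to the staging area).
status_after=$(git status --short)
echo "$status_after"
```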

Now we have a file in the staging area. We can now commit the changes.

$ git commit

We are now directed to a text editor, where we can enter a commit message. A clear and concise message is important for future reference and for other collaborators. By convention, the first line is a short summary (if you push to GitHub, this is the line displayed in the commit list). After that, leave one blank line, and then you can enter a longer description.

first commit

- to print hello world

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch master
#
# Initial commit
#
# Changes to be committed:
#       new file:   src/main.c
#

Save and exit. Git then reports that the changes are committed.

$ git commit
[master (root-commit) 1094701] first commit
 1 file changed, 7 insertions(+)
 create mode 100644 src/main.c
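
When no editor is wanted, for example in a script, the same summary-plus-description convention can be followed on the command line. This is a sketch under assumptions: the author identity and the content of main.c are placeholders for a scratch repository.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # placeholder identity (assumption)
git config user.email "author@example.com"
mkdir src
cat > src/main.c <<'EOF'
#include <stdio.h>

int main(void)
{
  printf("hello world\n");
  return 0;
}
EOF
git add src/main.c

# The first -m is the summary line; the second becomes the description,
# separated from the summary by a blank line.
git commit -q -m "first commit" -m "- to print hello world"

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "summary:     $subject"
echo "description: $body"
```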

Track changes

To see the history of the project, we use the log command.

$ git log --all --decorate --oneline --graph
* 1094701 (HEAD -> master) first commit

Here, we use the log command with options that list the commits of all branches, one line each, drawn as a graph. This becomes useful when we have multiple branches.

Branching

Say that we start to develop a new feature. We want to keep the existing, well-understood code stable while we work, and we want the development of the feature to be easy to track. This is the idea behind the Feature Branch workflow: every feature is developed on its own branch, not on the master branch (the default branch). To start, we create the new branch and then check it out. Suppose we want a branch for developing a "sum" feature in our project. We can do the following:

$ git branch sum-feature
$ git checkout sum-feature
Switched to branch 'sum-feature'

To check which branches exist and which branch we are on, we can use the branch command.

$ git branch
  master
* sum-feature
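
The two steps of creating a branch and checking it out can be combined. A minimal sketch, assuming a scratch repository with one placeholder commit:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # placeholder identity (assumption)
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"   # empty commit, only to have a starting point

# -b creates the branch and switches to it in one step.
git checkout -q -b sum-feature

current=$(git rev-parse --abbrev-ref HEAD)      # name of the branch we are on
echo "$current"
```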

We can develop the feature now. Suppose we add a source file sum.c in the source folder, and a header file sum.h in the include folder.

$ tree
.
├── include
│   └── sum.h
└── src
    ├── main.c
    └── sum.c

2 directories, 3 files

We now add the two files to Git.

$ git add include
$ git add src/sum.c
$ git status
On branch sum-feature
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
    new file:   include/sum.h
    new file:   src/sum.c

Finally, we commit them.

$ git commit
[sum-feature b1b0ff8] implement sum feature
 2 files changed, 10 insertions(+)
 create mode 100644 include/sum.h
 create mode 100644 src/sum.c

If we take a look at the log now, we will see that we progressed.

$ git log --all --decorate --oneline --graph
* b1b0ff8 (HEAD -> sum-feature) implement sum feature
* 1094701 (master) first commit

Merging

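Suppose that we feel our sum feature is ready to be merged into master. We go back to master, perform the merge, and delete the branch. Because master has not changed since the branch was created, Git can simply move master forward to the tip of the branch, which it reports as a fast-forward.

$ git checkout master
Switched to branch 'master'
$ git merge sum-feature
Updating 1094701..b1b0ff8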
Fast-forward
 include/sum.h | 4 ++++
 src/sum.c     | 6 ++++++
 2 files changed, 10 insertions(+)
 create mode 100644 include/sum.h
 create mode 100644 src/sum.c
$ git branch -d sum-feature
Deleted branch sum-feature (was b1b0ff8).

We now have our code in the master branch.

$ git log --all --decorate --oneline --graph
* b1b0ff8 (HEAD -> master) implement sum feature
* 1094701 first commit
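
The merge above was a fast-forward: master had no commits that the feature branch lacked, so Git only moved the master pointer. The sketch below checks this in a scratch repository. The empty commits and the identity are placeholders, and pinning the branch name to master is an assumption made to match this tutorial.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master      # use "master" regardless of local defaults
git config user.name "Example Author"        # placeholder identity (assumption)
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"
git checkout -q -b sum-feature
git commit -q --allow-empty -m "implement sum feature"

# Commits on master that sum-feature lacks; 0 means a fast-forward is possible.
behind=$(git rev-list --count sum-feature..master)

git checkout -q master
# --ff-only refuses to merge if a fast-forward is not possible.
git merge -q --ff-only sum-feature

master_tip=$(git rev-parse master)
feature_tip=$(git rev-parse sum-feature)
total=$(git rev-list --count master)         # no extra merge commit was created
echo "behind: $behind, commits on master: $total"
```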

Merge without conflicts

Our previous case is so simple that you will rarely encounter it, even in a small one-person project. The reason is that multiple features are often developed in parallel, frequently by multiple people. This means that some parts of the codebase may be changed simultaneously. In many cases, Git is smart enough to combine the changes automatically, but in some cases it is not. Assume that we simultaneously develop two features, minus and product. The log will show that our work has diverged.

$ git log --all --decorate --oneline --graph
* 37e5937 (minus-feature) implement minus feature
| * 49db9d4 (product-feature) implement product feature
|/  
* b1b0ff8 (HEAD -> master) implement sum feature
* 1094701 first commit

We merge the features one after the other. First, we merge the master branch into the minus-feature branch so that any conflicts are resolved on the feature branch. After that, we merge minus-feature into master.

$ git branch
  master
* minus-feature
  product-feature
$ git status
On branch minus-feature
nothing to commit, working tree clean
$ git merge master
Already up to date.
$ git checkout master
Switched to branch 'master'
$ git merge minus-feature
Updating b1b0ff8..37e5937
Fast-forward
 include/minus.h | 4 ++++
 src/minus.c     | 6 ++++++
 2 files changed, 10 insertions(+)
 create mode 100644 include/minus.h
 create mode 100644 src/minus.c

We have now merged the branch into master, and the log shows that master and the minus-feature branch point to the same commit.

$ git log --all --decorate --oneline --graph
* 37e5937 (HEAD -> master, minus-feature) implement minus feature
| * 49db9d4 (product-feature) implement product feature
|/  
* b1b0ff8 implement sum feature
* 1094701 first commit

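We can now go to the product-feature branch and do the same. Because product-feature and master have diverged, merging master creates a merge commit instead of fast-forwarding.

$ git checkout product-feature
Switched to branch 'product-feature'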
$ git merge master 
Merge made by the 'recursive' strategy.
 include/minus.h | 4 ++++
 src/minus.c     | 6 ++++++
 2 files changed, 10 insertions(+)
 create mode 100644 include/minus.h
 create mode 100644 src/minus.c
$ git checkout master
Switched to branch 'master'
$ git merge product-feature
Updating 37e5937..d5223a9
Fast-forward
 include/product.h | 4 ++++
 src/product.c     | 6 ++++++
 2 files changed, 10 insertions(+)
 create mode 100644 include/product.h
 create mode 100644 src/product.c

$ git branch -d product-feature minus-feature
Deleted branch product-feature (was d5223a9).
Deleted branch minus-feature (was 37e5937).

We have now demonstrated how to develop features separately and merge them one at a time.
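
The whole sequence can be replayed in a scratch repository to see where a real merge commit appears. This is a sketch under assumptions: add_feature is a hypothetical helper written for this example, the file contents are placeholders, and the identity is made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master      # use "master" regardless of local defaults
git config user.name "Example Author"        # placeholder identity (assumption)
git config user.email "author@example.com"
mkdir src

# Hypothetical helper: add one placeholder source file and commit it.
add_feature() {
  echo "/* $1 */" > "src/$1.c"
  git add "src/$1.c"
  git commit -q -m "implement $1 feature"
}

add_feature sum
git checkout -q -b minus-feature master;   add_feature minus
git checkout -q -b product-feature master; add_feature product

# First feature: master has not moved, so this is a fast-forward.
git checkout -q master
git merge -q --ff-only minus-feature

# Second feature: the branches have diverged, so merging master creates a merge commit.
git checkout -q product-feature
git merge -q --no-edit master
minus_tip=$(git rev-parse minus-feature)
second_parent=$(git rev-parse HEAD^2)        # second parent of the merge commit

# master can now be fast-forwarded to the merged result.
git checkout -q master
git merge -q --ff-only product-feature
files=$(git ls-files)
echo "$files"
```

The merge commit has two parents: the previous tip of product-feature and the tip of master, which at that point is the minus feature commit.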

Merge with conflict

Suppose now that we (for whatever reason) want to develop the sum function in parallel. We create a branch for one of the two changes; the other will be made directly on master.

$ git branch sum-feature-1
$ git branch
* master
  sum-feature-1

We switch to the sum-feature-1 branch and edit the sum function from:

#include "sum.h"

double sum(double a, double b)
{
  return a + b;
}

to:

#include "sum.h"

float sum(float a, float b)
{
  float c = a + b;
  return c;
}

Now, we commit this change and go back to master.

$ git add src/sum.c 
$ git commit
[sum-feature-1 ebe4937] change to support float only
 1 file changed, 3 insertions(+), 2 deletions(-)
$ git checkout master
Switched to branch 'master'

We are now on master. Suppose that, for some reason, we modify the sum.c file directly on master, changing the original to:

#include "sum.h"

int sum(int a, int b)
{
  return a + b;
}

We commit this change.

$ git add src/sum.c 
$ git commit
[master 815b5dd] modify sum to support int
 1 file changed, 1 insertion(+), 1 deletion(-)

We now want to merge our feature from the branch.

$ git merge sum-feature-1 
Auto-merging src/sum.c
CONFLICT (content): Merge conflict in src/sum.c
Automatic merge failed; fix conflicts and then commit the result.

Oops! We have a conflict. Because both branches changed the same part of the file, Git is unable to determine how to perform the merge. The message says that the sum.c file contains a conflict. To see the status of the project, we can use the status command.

$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
    both modified:   src/sum.c

no changes added to commit (use "git add" and/or "git commit -a")

Here, Git paused the merge and wants us to manually inspect and resolve the conflict in sum.c, and to perform a final merge commit. The sum.c code now looks like this.

#include "sum.h"

<<<<<<< HEAD
int sum(int a, int b)
=======
float sum(float a, float b)
>>>>>>> sum-feature-1
{
  float c = a + b;
  return c;
}

This explains the conflict. Both branches changed the function signature, which is on the same line, so Git cannot decide which version to keep. Git has marked the differing region for us. The upper part, between <<<<<<< HEAD and =======, shows the line as it is at HEAD, which is our current branch. The lower part, between ======= and >>>>>>> sum-feature-1, shows the line as it is in the sum-feature-1 branch. We can now decide how to resolve the conflict. For example, we change the code to:

#include "sum.h"
int sum(int a, int b)
{
  int c = a + b;
  return c;
}

We can now add and commit the file to finish the merge. When we run git add and git commit, Git already provides a prepared message:

Merge branch 'sum-feature-1'

# Conflicts:
#       src/sum.c
#
# It looks like you may be committing a merge.
# If this is not correct, please remove the file
#       .git/MERGE_HEAD
# and try again.


# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch master
# All conflicts fixed but you are still merging.
#
# Changes to be committed:
#       modified:   src/sum.c
#
#

We can now see the log and how our project has progressed!

$ git log --all --decorate --oneline --graph
*   ffffe56 (HEAD -> master) Merge branch 'sum-feature-1'
|\  
| * ebe4937 (sum-feature-1) change to support float only
* | 815b5dd modify sum to support int
|/  
*   d5223a9 Merge branch 'master' into product-feature
|\  
| * 37e5937 implement minus feature
* | 49db9d4 implement product feature
|/  
* b1b0ff8 implement sum feature
* 1094701 first commit
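
The conflict scenario can be reproduced in a scratch repository to try both ways out of a paused merge: aborting it, or resolving and committing. This is a sketch under assumptions: write_sum is a hypothetical helper for this example, the function bodies are simplified compared to the listings above, and the identity is a placeholder.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master      # use "master" regardless of local defaults
git config user.name "Example Author"        # placeholder identity (assumption)
git config user.email "author@example.com"

# Hypothetical helper: write src/sum.c using the given C type.
write_sum() {
  mkdir -p src
  printf '#include "sum.h"\n\n%s sum(%s a, %s b)\n{\n  return a + b;\n}\n' "$1" "$1" "$1" > src/sum.c
}

write_sum double; git add src/sum.c; git commit -q -m "implement sum feature"
git checkout -q -b sum-feature-1
write_sum float;  git commit -q -a -m "change to support float only"
git checkout -q master
write_sum int;    git commit -q -a -m "modify sum to support int"

# git merge exits with a non-zero status on conflict, so guard it under set -e.
if git merge sum-feature-1 >/dev/null 2>&1; then merged=yes; else merged=no; fi
conflicted=$(git diff --name-only --diff-filter=U)   # paths that are still unmerged
echo "merged cleanly: $merged, conflicted: $conflicted"

# Option 1: give up and return to the state before the merge.
git merge --abort
after_abort=$(git status --porcelain)        # empty when the working tree is clean

# Option 2: merge again, write the resolved file, stage it, and commit.
git merge sum-feature-1 >/dev/null 2>&1 || true
write_sum int                                # our chosen resolution: keep the int version
git add src/sum.c
git commit -q --no-edit                      # accept the prepared merge message
subject=$(git log -1 --format=%s)
second_parent=$(git rev-parse HEAD^2)
feature_tip=$(git rev-parse sum-feature-1)
echo "$subject"
```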

Working with remote

One major advantage of Git is that it is distributed: every clone is a complete, independent repository. This means that you can commit while working on a plane without a WiFi connection, and push and merge the changes once you have landed and are back online. Typically, a Git project is backed by a remote repository that hosts the entire project, on a service such as GitHub, GitLab, or Bitbucket. To work with GitHub, we need to learn about the concept of a remote. A remote is simply another repository, and the central one is conventionally called origin. When we create a Git repository on GitHub, we get a list of instructions on what to do. When working with a remote, there are a few important commands to know.

$ git push (remote name) (branch name)

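This pushes the local commits of the given branch to the remote repository. The opposite direction is fetch, which downloads new commits from the remote without changing your local branches.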

$ git fetch

Finally, to fetch the changes of a particular branch and merge them into the current branch, we can do:

$ git pull (remote name) (branch name)
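
These commands can be tried without a hosting service. In the sketch below, a local bare repository stands in for the hosted one; the directory names alice and bob, the identity, and the use of a local path as the remote address are all assumptions made for illustration.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"        # stands in for a hosted repository

# First developer: commit locally, register the remote, and push.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master      # use "master" regardless of local defaults
git config user.name "Alice Example"         # placeholder identity (assumption)
git config user.email "alice@example.com"
git commit -q --allow-empty -m "first commit"
git remote add origin "$work/origin.git"
git push -q origin master
pushed=$(git rev-parse master)

# Second developer: fetch only downloads; the local branch does not exist yet.
git init -q "$work/bob"
cd "$work/bob"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
git fetch -q origin
fetched=$(git rev-parse origin/master)       # remote-tracking branch, updated by fetch

# pull fetches and then brings the changes into the current branch.
git pull -q origin master
pulled=$(git rev-parse master)
echo "pushed $pushed, fetched $fetched, pulled $pulled"
```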

Ignore Files

One important point that is easy to forget is not to commit build or temporary files. For example, when using CMake, one should commit the CMakeLists.txt file, but not the build folder or the object files. This is easily done by adding a .gitignore file at the root of the project. Git reads the patterns listed in the ignore file and ignores every file whose name matches one of them. A repository on GitHub provides a collection of ignore files for different kinds of projects. For example, in a C project, we can use its C ignore file to ignore all the intermediate build files.
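
As a minimal sketch, the following creates a small .gitignore and checks which paths Git ignores. The folder name build and the two patterns are assumptions chosen for illustration; a real project would typically start from a more complete ignore file.

```shell
set -e
cd "$(mktemp -d)"
git init -q
mkdir -p src build
echo 'int main(void) { return 0; }' > src/main.c   # placeholder source file
touch build/main.o src/main.o                      # stand-ins for build output

cat > .gitignore <<'EOF'
# Build directory (the name "build" is an assumption)
build/
# Object files produced by the compiler
*.o
EOF

# check-ignore prints each given path that matches an ignore pattern.
ignored=$(git check-ignore build/main.o src/main.o)
echo "$ignored"

# Ignored files no longer show up as untracked.
untracked=$(git status --short)
echo "$untracked"
```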


Hands-on Exercise

We recommend that you complete the following exercise.

https://learngitbranching.js.org

It is an interactive Git exercise, similar to the above, where you are given a scenario and a set of tasks to finish. The tutorial guides you step by step, and your commits are visualized along the way.


References