Git Introduction



Introduction

Git is a free, open source, distributed version control system. It allows individuals or teams to check in and track text-based documents of all kinds, and to share them through a repository available over the web.

Git is capable of working with binary files, but it is best at storing text-based documents such as source files, markdown, HTML, JSON, XML, or standard ASCII documents.

With Git, you can run the clock backwards, finding old versions of your code, and you can compare one version with another version.
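As a minimal sketch of what "running the clock backwards" looks like at the command line, the following creates a throwaway repository, records two versions of a file, and then lists, compares, and retrieves them. The file name notes.txt, the commit messages, and the demo identity are hypothetical values chosen only for this illustration.

```shell
set -e

# Work in a throwaway directory so no real project is touched.
demo=$(mktemp -d)
cd "$demo"
git init -q

# Identity used only inside this demo repository (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# Record a first version of the file.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft"

# Change the file and record a second version.
echo "second draft" > notes.txt
git commit -q -am "Revise notes"

# List the recorded versions, newest first.
git log --oneline

# Compare the previous version with the current one.
git diff HEAD~1 HEAD -- notes.txt

# Read the old version without changing the working copy.
old_version=$(git show HEAD~1:notes.txt)
echo "$old_version"
```

Here HEAD names the most recent commit and HEAD~1 the one before it, so the same pattern reaches further back by increasing the number.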

Git is a sophisticated tool, and it is not always easy to learn. It is, however, the dominant tool of its kind, and with good reason. It is powerful, flexible, and robust. Ultimately, Git is worth mastering. In fact, much of the basic functionality of Git is relatively easy to master once you get over a few low hurdles.


When to use Git

As mentioned earlier, Git is designed for, and works best with, text documents. This means it is probably not the best way to track Word documents, zip files, or graphics files. It can be used for that purpose, but there are probably better tools for those kinds of tasks.

You don't need to be deeply technical to use Git or to read this book, but you must be willing to learn. Technology has some surprising twists and turns to it.

There is little doubt that heavyweight tools such as Word are the best way to create text when you first start using computers. But if you want to take one step further into the technology, it is much simpler to step away from complex formats such as those Word uses and start working with simple text documents. As you learn how to use them, you will find that they offer most of the capabilities of a tool like Word, but provide much more flexibility.

Centers of Activity

Version control has been around a long time. What makes Git different is the way it handles repositories. Git has a distributed model that allows you to clone repositories onto your local machine. In older systems, there was only one copy of the repository, typically on a server somewhere on your local network. With Git, you can have multiple copies of your repository, including one on your local network or even on your local machine. You can check in and check out from there before deciding that you want to push to the main repository in the cloud. One advantage of this system is that it allows you to share your work with a restricted group of people before pushing it up to a more public location.

In a common scenario, members of a team might each have a copy of the repository on their local machine. They can check their work into a repository on the local network that all team members can access. Once the code passes tests, and everyone on the team feels it is ready to go public, it can be pushed up to a repository on the Internet that can be accessed by the general public.
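The scenario above can be sketched with nothing but local directories. In this illustration, which is an assumption-laden stand-in rather than a real deployment, two bare repositories named team.git and public.git play the roles of the team server and the public host, and a clone named alice plays the role of one team member's machine. All of these names, the branch name main, and the demo identity are hypothetical.

```shell
set -e

# Everything lives in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Bare repositories stand in for the team server and the public host.
git init -q --bare team.git
git init -q --bare public.git

# One team member clones the (still empty) team repository.
git clone -q team.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"

# This commit exists only in Alice's local copy.
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"
local_head=$(git rev-parse HEAD)

# Step 1: share the work with the team.
git push -q origin HEAD:refs/heads/main

# Step 2: once the team agrees it is ready, publish it.
git remote add public ../public.git
git push -q public HEAD:refs/heads/main

# Both shared repositories now hold the same commit as the local copy.
team_head=$(git --git-dir=../team.git rev-parse refs/heads/main)
public_head=$(git --git-dir=../public.git rev-parse refs/heads/main)
echo "$local_head $team_head $public_head"
```

With a real team, the two remotes would be URLs on a local server and on a hosting service rather than sibling directories, but the sequence of commit, push to the team, then push to the public remote is the same.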

There are many Git servers in the cloud. Here are two that everyone should know about:

You probably want to have accounts on both GitHub and Bitbucket. GitHub is by far the most widely used, and Bitbucket competes by allowing you to have free private repositories.

Git Website

Below you will find links to the main site for Git, along with its download page and related links:

Git Guides Found on the Web

I've put together various resources, including a video. I've also put together a walkthrough/assignment that can help you get started with Git on Linux. I also have a slide deck on Git.

There is an excellent free online book:

More useful links:

Is Git Reliable?

Topics like "is Git reliable?", "is it me, or is it Git?" come up from time to time. My concern is that students will spin their wheels unnecessarily, focusing on imaginary problems with Git, rather than trying to problem solve.

The following stats were assembled in February 2016:

As of August 2017, GitHub has about 23 million users and 64 million repositories. Those numbers are growing quickly. That's just GitHub, not total Git users. You can use Git without any cloud repository, or use Git with a different cloud repository such as Bitbucket.

Related links which probably contain updates to total GitHub users:

GitHub and Bitbucket are websites, and websites are sometimes down due to operational errors or denial of service attacks. But GitHub's record is usually good:

It is always possible that one of us will hit a bug in Git. Note, however, that the odds of hitting a Git bug when performing basic operations are very, very low.

Git is not perfect. No piece of software is perfect. Yet Git is, in my opinion, one of the most reliable, tested, and proven software programs in the world.

NOTE: By far the most common problem my students have with Git occurs because the student does not shut down their VM properly. Don't close the lid of your laptop with your VM running. Don't close a VM by clicking the X (close) icon at the top right or left of the VM window. Instead, properly shut down your VM. The best way to do this is by choosing the shutdown button at the right of the interface, using the Start button (bottom left of the Lubuntu interface), or typing sudo shutdown -h now in the bash shell.

Let me pull on your coat for a moment longer regarding this subject. I occasionally have a student for multiple quarters, as many as four in a row. I have found that some of these students persist in shutting down their VM improperly despite the fact that they damage their repositories repeatedly with their bad habits. It is as if there is nothing I can say that will inspire these students to take the extra few seconds required to shut down properly, thereby saving themselves much time and trouble in the long run.

Git vs Dropbox

There are some tools such as Dropbox, Google Drive, and OneDrive that will automatically propagate changes to any machine that subscribes to updates. Git, on the other hand, is designed to let developers, or authors, work on a set of documents and then share them when and as needed. The advantages to the second scenario (Git), for a certain type of activity, are twofold:

I think all that goes at least some of the way to explaining why Git works the way it does.

Sharing documents is a good and common use of a tool like Git. When working with documents rather than code, the consequences of automatically pushing updates are less severe, but still real. Suppose I'm putting together a list of steps that must be followed. If I accidentally leave out a few steps in my first draft, then people who are following my changes live are likely to be miffed that my guide does not work. Or perhaps I begin working on a description of a seven step process. I get to step 5, then get called to dinner, or head off to bed. People reading the document would then begin following my steps, only to find that the description is incomplete.

There are, of course, strong arguments for exactly the opposite kind of functionality. If I'm taking notes on a lecture in Evernote, at the end of class I want them to be propagated automatically and immediately so I can access them as needed from home. No other system offers an advantage in that scenario: a copy of my notes left at school, which I cannot access from home, does me no good. I want to see all changes immediately on all machines when I log into Evernote.

The bottom line is that there are many different ways to share information in the cloud. The only way to learn which one is best for which task is to learn the tools, and begin experimenting with them. Everyone is working these things out for themselves just now. Unfortunately, students in the future will probably be told to use tool X for purpose Y, and tool W for Z. Neat and clean, but not nearly as fun as exploring this world on your own. We live in interesting times.

Copyright © 2017 by Charles Calvert