The What, Why, and How of Git and GitHub
Git and GitHub are widely used tools in technical projects. Despite their prevalence, many people memorize Git commands by rote for straightforward tasks without fully understanding what these tools are, how they differ, and what roles they play in a project. In this section, I will provide an overview of Git and GitHub and explain why it is worthwhile to learn and use them over other available options.
What exactly are Git and GitHub?
Updated February 23rd, 2020. Accessed October 2025.
There are two types of version control that facilitate project sharing and collaboration: centralized and distributed. Git is a Distributed Version Control System (DVCS), which is depicted in the clip on the left 1–3. In a DVCS, a remote server stores the integrated copy of a project, while mirrors containing the full change history are distributed to individual contributors, who act as clients of the server 4,5. Local changes are communicated in what is called a peer-to-peer manner, synchronizing committed changes through patches that are sent over the server 1,6,7.
Downloaded October 10th, 2024.
Git was first released in 2005 and is the most popular version control system, even compared to centralized systems 2,3,8. However, Git is primarily a command-line tool and, on its own, lacks a hosting platform for project sharing or collaboration. Users typically execute Git commands through a CLI like Terminal, which runs a shell such as Bash; this can be a steep learning curve for new users 9,10.
GitHub was designed in 2007 to improve upon Git and act as a centralized library of repositories where projects can be shared and collaborated on 10,11. Essentially, GitHub is a developer platform built upon Git repository management structures and commands. While GitHub does not provide version control management independently of Git, it offers additional features designed to fit Git’s paradigm for version control, such as forks and pull requests. GitHub also provides tools and settings that help development teams set milestones, assign responsibilities, and track issues or alternative suggestions 11.
Together, Git and GitHub enable robust code-based project management for both individual and collaborative efforts. Both Git and GitHub are language-agnostic, meaning they support projects written in any programming language. They integrate seamlessly with many Integrated Development Environments (IDEs) that support them natively, or they can be used alongside IDEs that do not have built-in integration features 10.
OK, but why choose them over other options?
Git is by far the most popular version control system among both learners and professionals. According to a survey conducted by Stack Overflow in 2022, more than 95% of professionals reported using Git in their work, far more than reported using other systems such as Mercurial or SVN 8. Git is also widely adopted by those learning to code, although version control systems remain underutilized by learners. Approximately 17% of learners indicated they do not use any version control, compared to just about 1% of professionals.
2022 report.
GitHub is a leading platform for hosting free and open-source code, serving over 100 million developers and hosting more than 420 million repositories as of 2024 10,11. It has outpaced its main competitor, SourceForge, since 2015, and by 2022/2023 it had become one of the world’s largest hosts of source code 12.
GitHub is a highly sought-after resource for maintainers and contributors worldwide, including developers working for companies and nonprofits. The pie chart on the right highlights 20 communities outside the US with the largest user growth in 2022 compared to the previous year. In 2013, most users were based in the US, but with the year-over-year growth of users from India observed thus far, it is expected that Indian programmers will match the US developer population in 2025 13.
Essential Takeaways
With this introduction, we’re ready to start exploring how these tools handle version control and code distribution. While the contextual overview presented thus far is valuable for new or beginner users of Git and GitHub, let’s take a step back and review the core principles that you actually need to know before diving into the how.
The Motivations: In a coding project, it is often advantageous to track changes made at each step. When something goes wrong, version control allows you to easily diagnose which changes caused the bug and revert to older, working versions while you isolate and fix the issue. Think of it like “going back to the last save” in a game. Robustly managing a code-based project is only part of the challenge; you also need to provide means for accessing the work for publishing, reporting, or collaboration.
The Short and Sweet Descriptions: Git is a system that allows you to save, manage, and track code version histories, enabling you to revert to previously saved versions when necessary. GitHub builds on the Git framework, offering remote storage, distribution, and collaboration for Git-controlled projects 2,11. Some refer to GitHub as more than just a place to distribute code, but as a “social networking site for programmers” 10.
Unpacking How: Git and GitHub in Action
In the following pages, we’ll guide you through installing Git, setting up a GitHub account, and configuring both. The rest of the chapter will explore the intricacies of working with Git and GitHub through a realistic example and three common scenarios for managing project codebases. You’ll also have the opportunity to practice these skills using pre-prepared problems.
Before we jump into these, however, I want to address some common complaints about Git and GitHub introductions. Many find these resources present too much as need-to-know or give commands out of context, making it hard to see the big picture.
These approaches make it challenging to grasp Git and GitHub basics and often send users down endless rabbit holes trying to piece together a cohesive understanding. They also complicate extending beyond basic usage, as it’s not clear how these concepts apply to or can be implemented in your own work.
To mitigate these sources of confusion, the following section offers high-level key takeaways about how Git and GitHub interact with your project materials and with each other. We can divide this interaction into three main categories:
- Version control on your local device.
- Remote storage and distribution of your code.
- Secure information transfer between Git and GitHub.
Version Control on Your Device
Recall that Git handles version control entirely, while GitHub is designed based on Git and provides limited extended features for collaboration and code sharing. Therefore, all necessary version control features for your projects are accessible on your device, often referred to as “local.”
Projects are typically stored in a folder, also called a directory, containing all the files and code relevant to that work. A Git-initialized project will include a .git folder in the root directory of your project, which contains all the information necessary for version controlling its contents.
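As a minimal sketch of what “Git-initialized” means in practice, the commands below create a throwaway project folder and initialize it. The folder name my-project and the use of a temporary directory are assumptions made only for this illustration.

```shell
# Create a throwaway project folder (hypothetical name) in a temporary location.
project_dir="$(mktemp -d)/my-project"
mkdir -p "$project_dir"
cd "$project_dir"

# Initialize version control; this creates the hidden .git folder.
git init --quiet

# List everything in the root directory, including hidden entries such as .git.
ls -a

# Ask Git where it keeps its version control data.
git rev-parse --git-dir   # prints ".git"
```

Deleting the .git folder would remove the version history while leaving your project files untouched, which is why it is usually best left alone.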
In a Git-initialized directory, Git interprets your project files to be in one of three domains at any given time: the Working Tree, Staged Edits, and Committed Edits 14. These domains and common transitional commands are demonstrated in the figure below, based on a diagram given in “What is git commit, push, pull, log, aliases, fetch, config & clone” 14.
This workshop will not cover all the different ways to move tracked files through version control domains, but will focus on core commands necessary for all projects. Just know that you can perform various types of changes not described here that may better suit your coding needs.
The Working Tree is the area where developers make changes to files in the project. In this domain, Git doesn’t actively track changes; instead, each save overwrites the file in place without recording interim revisions.
When you are ready for Git to follow changes, you need to add them to the staged environment with git add 15. This prompts Git to identify differences between Staged Edits and already committed versions that are stored in the .git directory. Files can be unstaged and restored to the working tree with the git reset command 16,17.
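The staging and unstaging steps described above can be sketched as follows. This is an illustrative run in a temporary repository; the file name analysis.txt and the demo identity are assumptions, not part of the workshop materials.

```shell
# Set up a throwaway repository with one committed file.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Demo User"          # hypothetical identity for this demo only
git config user.email "demo@example.com"
echo "version 1" > analysis.txt
git add analysis.txt
git commit --quiet -m "Add analysis.txt."

# Edit the file: the change lives only in the Working Tree.
echo "version 2" >> analysis.txt
git status --short                  # " M analysis.txt" (modified, not staged)

# Stage the change and inspect how it differs from the committed version.
git add analysis.txt
git status --short                  # "M  analysis.txt" (staged)
git diff --staged

# Unstage it again; the edited file itself is left as it was.
git reset --quiet -- analysis.txt
git status --short                  # " M analysis.txt"
cat analysis.txt
```

Note that the position of the M in the short status output indicates where the change currently sits: the second column for the Working Tree and the first column for Staged Edits.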
When prompted for tracking, Git uses a heuristic to compare files in your project directory to determine if a file has changed or is new. This method of change detection is not always perfect and may sometimes mistakenly identify significantly altered files as new ones 18.
Depending on the version of Git used, this may be observed when renaming or moving a tracked file to a new folder in the project directory. Certain Git versions will detect this as both the creation of a new file under the new name and the deletion of the file under the old name 18,19. This can get confusing when reviewing the project’s version history!
If you see this, it’s best practice to track a file rename or move action with the git mv command and to stage the modification for its own commit with an appropriate message 19,20.
Command-Line Application

```shell
# Track moving a file (git mv stages the move itself; the git add is a harmless extra step)
git mv <source>/filename.txt <destination>/filename.txt
git add <destination>/filename.txt
git commit -m "Move filename.txt from <source> to <destination>."

# Track renaming a file
git mv oldfilename.txt newfilename.txt
git add newfilename.txt
git commit -m "Rename oldfilename.txt to newfilename.txt."
```

Once you have reviewed the detected changes and are ready to save or share them, use git commit to promote the staged version to the latest copy recorded in the .git directory 15. The most recent Committed Edits are what get synced to the GitHub server as a “peer-to-peer” patch when you push.
You will use the same commands even if there was no previous version of a file stored in the .git directory. When you use git status to check the current state of your tracked project files, Git will automatically identify untracked files and prompt you to add them for tracking. If your project is connected to a GitHub repository, the local copy of the project does NOT continuously monitor the remote repository’s contents. Instead, it keeps a copy of the last-checked version of the remote repository’s contents 14.
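To illustrate how a brand-new file moves through the same commands, here is a small sketch in a temporary repository. The file name notes.txt and the demo identity are assumptions chosen for the example.

```shell
# Throwaway repository for the demonstration.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Demo User"          # hypothetical identity for this demo only
git config user.email "demo@example.com"

# A new file has no previous version in the .git directory.
echo "first draft" > notes.txt
git status --short        # "?? notes.txt" (untracked)

# The same add and commit commands bring it under version control.
git add notes.txt
git commit --quiet -m "Add notes.txt."

git status --short        # prints nothing: the Working Tree matches the latest commit
git log --oneline         # one entry: an abbreviated hash followed by the message
```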
Remote Storage and Distribution
When you use Git to interact with a remote repository stored in GitHub, there are three core commands: clone, push, and pull. In the second part of this workshop, I cover these operations in greater detail, so do not worry too much about the exact purpose or definition of these commands right now.
Cloning is a one-time command used to download and store a complete copy of a repository distributed on GitHub, such as a public one, to your local device. Cloning a repository does not necessarily mean you have permission to contribute changes to it.
Pushing is what you do when you are ready to share your local copy of the codebase through GitHub. When you push, the most recent committed edits in the .git directory are sent to the remote version and checked for whether they can be integrated with it.
Pulling is what you do when you want to update your local copy of the codebase with changes that may have been made in the remote location. Pulling combines two operations: fetch and either merge or rebase. A diagram describing these is shown on the right.
Notice that pulling is not the same thing as cloning. Instead, it is used throughout the lifetime of the project to synchronize changes between your local and remote project copies.
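The three operations can be tried without a GitHub account. The sketch below is a simplification: a local bare repository stands in for the GitHub remote, and the contributor names alice and bob and the file report.txt are hypothetical.

```shell
# A local bare repository stands in for a repository hosted on GitHub.
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Clone: a one-time download (Git warns that the repository is still empty).
git clone --quiet "$work/remote.git" "$work/alice"
cd "$work/alice"
git config user.name "Alice Example"
git config user.email "alice@example.com"

# Commit locally, then push the committed edits to the remote.
echo "line one" > report.txt
git add report.txt
git commit --quiet -m "Add report.txt."
git push --quiet origin HEAD

# A second contributor clones the now-populated remote.
git clone --quiet "$work/remote.git" "$work/bob"

# Alice shares another change.
echo "line two" >> report.txt
git commit --quiet -am "Add a second line to report.txt."
git push --quiet origin HEAD

# Bob's copy does not update on its own: fetch downloads, merge integrates.
cd "$work/bob"
git fetch --quiet origin
git merge --quiet --ff-only "@{upstream}"   # `git pull --ff-only` would do both steps
cat report.txt                              # prints both lines
```

With a real GitHub repository, the only difference in these commands is the address passed to git clone; the push, fetch, and merge steps stay the same.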
Safeguarding Information Transfer Processes
When your local device interacts with GitHub, several security checks are performed. For instance:
- You must verify that the communication is indeed with GitHub and not an impostor site.
- You must confirm that you have the necessary permissions to access specific content on GitHub.
Additionally, encrypting your data ensures that it is securely transferred, safeguarding your content from potential intrusions.
GitHub supports two authentication and communication protocols: Secure Shell (SSH) and Hypertext Transfer Protocol Secure (HTTPS). NOTE: Some firewalls and proxies might not allow SSH connections, but this should not be an issue with HTTPS. If you run into this problem, you can reference GitHub’s SSH Troubleshooting page for some guidance.
SSH authenticates you with a pair of keys: a private key that stays on your device and a public key that you register with GitHub. Information is encrypted while in transit, and access is granted only when the key pair matches.
HTTPS facilitates the secure transmission of information over an encrypted internet connection. Access is granted when the correct username and Personal Access Token (PAT) are provided.
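Which protocol Git uses is determined by the form of the remote address. As a rough sketch, the commands below switch a remote between the two address styles in a temporary repository; example-user and example-repo are placeholder names, and nothing is sent over the network.

```shell
# Throwaway repository; no connection is made by any command below.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# HTTPS-style address: Git would ask for a username and Personal Access Token.
git remote add origin https://github.com/example-user/example-repo.git
git remote get-url origin

# SSH-style address: Git would authenticate with your SSH key pair instead.
git remote set-url origin git@github.com:example-user/example-repo.git
git remote get-url origin

# With a key registered on GitHub, the SSH connection can be checked with:
#   ssh -T git@github.com
```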
Section Glossary
| Term | Definition |
| --- | --- |
| Distributed Version Control System (DVCS) | The project codebase is copied as a mirror to each contributor’s local computer. Local changes get synced via patches sent peer-to-peer through the server 1,2. |
| git add | Prompts Git to track changes made to specified files and transition them from the Working Tree to the Staged Edits domain. Git compares the added files to any previously saved versions available in the .git directory 15. |
| git clone | Copies an existing repository stored in a remote server to your own device, including all files, the version history, and branches 21. |
| git commit | Promotes the staged version of specified files to become the latest copy reflected in the .git directory. The Committed Edits version is what gets synced to the remote server as a “peer-to-peer” patch 15. |
| git merge | One method to reconcile different commit histories in divergent branches. Creates a new commit integrating the heads of the two branches in a three-way merge 22,23. |
| git mv | Moves or renames a specified file, directory, or symlink that is already tracked by Git 20. |
| git fetch | Downloads changes from a remote repository without coalescing them with your local copy. Part of the pull action 24. |
| git pull | The combined action of fetch and merge or rebase. Downloads changes from a remote repository and coalesces them with your local copy 23. |
| git push | Uploads your recent Committed Edits to a remote repository, synchronizing changes for others to see 15. |
| git rebase | An alternative to merge. The branch commit histories are realigned so that the leading one defines the commit parent history of the following branch, thus rebasing its commits 23. |
| git reset | Unstages changes to specified files, moving them from Staged Edits back to the Working Tree without altering the files’ versions in the Working Tree 17. |
| git status | Summarizes the state of files in the Working Tree and Staged Edits, comparing changes to the latest committed version in the .git directory. |
| Mirror | An exact copy of a project from a server, including all files, the version history, and branches 5. |
| Patch | Snippets of code or data used to update existing software 7. |
| Peer-to-peer | Participants in a network act as both client and server by trading resources and services with one another 6. |
| Root directory | The top-most directory in a branched hierarchy, containing all other files and directories. Your project’s root directory holds all the relevant files and code used in your project. |
| Server and Client | Servers are computers or systems that provide resources (i.e. data or programs) to other computers, known as clients, over a network 4. |
| Shell | A program used by the CLI to mediate communication between the user and computer by interpreting commands and outputs. Examples: Bash, PowerShell, etc. 9. |
