Introduction
This post is part of a series about Git and GitHub.
In all our previous posts we worked with Git locally.
In this post we’re going to look at using Git to work with remote repositories.
What is a remote repository?
According to the “Git Basics - Working with Remotes” section from Pro Git by Scott Chacon and Ben Straub:
Remote repositories are versions of your project that are hosted on the Internet or network somewhere.
That’s it: remote repositories are just versions of your Git repository that aren’t stored locally. There isn’t anything unique about remote repositories. A remote repository holds the same kind of data as the .git directory on your machine, but it’s stored somewhere else.
Why use a remote repository?
There are several reasons to work with a remote repository:
- A remote repository can serve as a backup in case your local copy is lost, damaged, or deleted.
- You can have a copy of your repository on separate computers so you can work on your files on different machines (for instance, a computer at work and another at home).
- You can have a repository on a production server to deploy files you’ve developed locally.
- Other people can have copies of the repository on their computers so that they can collaborate to work on project files with you.
- A hosted common repository provides a convenient and straightforward way to enable collaboration between multiple contributors.
What’s special about using Git remote repositories?
There are several things to keep in mind about the way Git works with repositories, both locally and remotely.
From Wikipedia’s entry for Git:
[E]very Git directory on every computer is a full-fledged repository with complete history and full version-tracking abilities, independent of network access or a central server.
From the “Getting Started - About Version Control” section from Pro Git:
In a [Distributed Version Control System]… clients don’t just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history. Thus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it. Every clone is really a full backup of all the data.
From the “Getting Started - What is Git?” section from Pro Git:
Most operations in Git need only local files and resources to operate — generally no information is needed from another computer on your network. … Because you have the entire history of the project right there on your local disk, most operations seem almost instantaneous.
For example, to browse the history of the project, Git doesn’t need to go out to the server to get the history and display it for you — it simply reads it directly from your local database. This means you see the project history almost instantly. If you want to see the changes introduced between the current version of a file and the file a month ago, Git can look up the file a month ago and do a local difference calculation, instead of having to either ask a remote server to do it or pull an older version of the file from the remote server to do it locally.
This also means that there is very little you can’t do if you’re offline or off VPN. If you get on an airplane or a train and want to do a little work, you can commit happily (to your local copy, remember?) until you get to a network connection to upload. If you go home and can’t get your VPN client working properly, you can still work. In many other systems, doing so is either impossible or painful.
Because Git is built as a Distributed Version Control System, it works really well both locally and remotely. You don’t lose functionality or efficiency by incorporating remote repositories into your workflow. You get the benefits of having distributed (i.e. remote) repositories without sacrificing the benefits of decentralization that Git is built on.
Furthermore, as the “Getting Started - What is Git?” section from Pro Git points out:
Everything in Git is checksummed before it is stored and is then referred to by that checksum. This means it’s impossible to change the contents of any file or directory without Git knowing about it. This functionality is built into Git at the lowest levels and is integral to its philosophy. You can’t lose information in transit or get file corruption without Git being able to detect it.
So even when working with remote repositories, Git can guarantee the integrity of files in the repository.
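As a small illustration of the checksumming described above, the git hash-object command prints the ID that Git would assign to a piece of content. This is only a sketch, and the sample strings are arbitrary placeholders.

```shell
# Identical content always produces the same object ID.
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello\n' | git hash-object --stdin)

# Changing a single character produces a different ID.
id_c=$(printf 'hellp\n' | git hash-object --stdin)

echo "$id_a"
echo "$id_b"
echo "$id_c"
```

Because the ID is derived from the content itself, a copy whose content was altered in transit would no longer match the ID it is stored under.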
And Git not only tracks what changes were made to files, but also the order of those changes and how they relate to each other.
That means that even when changes are made on separate machines, Git can often automatically merge those changes together. You don’t have to manually track and compare what changed on each machine - Git does that and uses that information to logically merge changes to files from different sources.
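The following sandbox script sketches that kind of automatic merge. It assumes two copies of one repository (the hypothetical directories work and home stand in for two machines), each adding a different file. The directory names, file names, and commit identity are placeholders, and everything happens in a temporary directory.

```shell
set -e
# Placeholder identity so the sandbox commits succeed on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# A starting repository with one commit on a branch named "main".
git init -q work
cd work
git symbolic-ref HEAD refs/heads/main
printf 'line one\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# A second copy, standing in for another machine.
cd "$sandbox"
git clone -q work home

# Each copy makes its own, independent change.
cd "$sandbox/work"
printf 'from work\n' > work.txt
git add work.txt
git commit -q -m "Add work.txt"

cd "$sandbox/home"
printf 'from home\n' > home.txt
git add home.txt
git commit -q -m "Add home.txt"

# Fetch the other copy's history, then let Git combine the two lines of work.
git fetch -q origin
git merge -q --no-edit origin/main

ls
```

The two changes touch different files, so Git can combine them without help. Changes to the same lines of the same file would instead produce a conflict for you to resolve.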
It’s also important to note that while you can establish a relationship between repositories, they remain separate and distinct. By default, your local repositories and their data remain private and viewable only by you unless you choose to share them. You have to explicitly command that your local repositories exchange information with remote repositories. This means that you remain in control of what, if anything, is shared between your local repositories and remote repositories.
In summary, Git remote repositories extend the capabilities of Git with virtually no tradeoffs.
Working with Git remote repositories
Now, let’s look at how to work with Git remote repositories.
Establishing remote-tracking repositories
As mentioned above, Git won’t share the data in your local repositories with any remote repository unless you explicitly command it to.
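So, before you exchange data with a remote repository, you will typically want to register it with your local repository. Git calls this saved connection a remote. (A remote-tracking branch is a related but different thing: a local, read-only record of where a branch on the remote was the last time you communicated with it.) There are two commands that set up a remote: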
- git clone
- git remote add
The video Sharing Data on a Remote Repository by Tower outlines these concepts.
Git clone
If there is an existing project that you want to work with, the git clone <url> command allows you to download a copy of the Git repository. This will download the contents of the .git directory, which includes the entire history of the project, and check out a working copy of the files.
Navigate to the directory where you want the project on your local machine and then run the command with the URL for your project. For example, to download the Git repository for the list of Hacker Laws maintained by Dave Kerr, you would use the following:
git clone https://github.com/dwmkerr/hacker-laws
The video Cloning an Existing Repository by Tower illustrates this command.
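To try cloning without touching the network, the sketch below builds a small throwaway repository and clones it by local path. The directory names source and copy, the file name, and the commit identity are all placeholders chosen for illustration.

```shell
set -e
# Placeholder identity so the sandbox commits succeed on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# Build a small source repository with two commits to clone from.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/main
printf 'first\n' > file.txt
git add file.txt
git commit -q -m "First commit"
printf 'second\n' >> file.txt
git commit -q -am "Second commit"

# Clone it. A local path stands in for the URL, so no network is needed.
cd "$sandbox"
git clone -q source copy
cd copy

# The clone already holds the full history...
git log --oneline

# ...and Git has registered the source as a remote named "origin".
git remote -v

commit_count=$(git rev-list --count HEAD)
remote_name=$(git remote)
```

Note that git clone sets up the origin remote for you, so there is no need to run git remote add afterwards.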
Git remote add
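If you have started a project locally and want to upload it to a remote host, the git remote add <shortname> <url> command will connect your local repository to a remote repository. The command does not create anything on the host; it only records the remote’s shortname and URL in your local repository. The remote repository itself must already exist (for example, an empty repository you created with your hosting service).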
Navigate to the directory of your local Git repository and then run the command with a chosen shortname (“origin” is the convention) and the URL for the remote repository. For example, if your GitHub username was exampleuser and your remote repository was named exampleproject, you would use the following:
git remote add origin https://github.com/exampleuser/exampleproject
The video Connecting a Remote Repository by Tower illustrates this command.
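A minimal sandbox sketch of this command follows. It assumes a bare repository in a temporary directory (the hypothetical hosted.git) as a stand-in for an empty repository on a hosting service, and project is a placeholder directory name.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for an empty repository that already exists on a host.
git init -q --bare hosted.git

# A project that was started locally.
git init -q project
cd project

# Register the hosted repository under the shortname "origin".
git remote add origin "$sandbox/hosted.git"

# Nothing has been uploaded yet; Git has only recorded the name and URL.
git remote -v
origin_url=$(git remote get-url origin)
```

The hosted repository stays empty until you push to it.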
Exchanging data with remote-tracking repositories
Because remote repositories are separate from your local repository, they can evolve independently. Changes can take place in any one repository that the others are unaware of. To communicate about these changes you have to explicitly tell Git to either request or send updated information.
Once you have established remote-tracking repositories, communicating this information is simple. There are three commands for exchanging data with remote repositories:
- git fetch
- git pull
- git push
The first two commands are for downloading data from a remote repository to your local computer. The third command is for uploading data from your local repository to a remote repository.
Git fetch
The git fetch <remotename> command downloads all the data from the remote repository that your local repository doesn’t have. This command works with the shortname of one of your existing remotes. For example, if you previously set up a remote repository with the shortname origin, you would use the following:
git fetch origin
Git fetch leaves your working copy untouched. As the “Git Basics - Working with Remotes” section from Pro Git highlights:
It’s important to note that the git fetch command only downloads the data to your local repository — it doesn’t automatically merge it with any of your work or modify what you’re currently working on. You have to merge it manually into your work when you’re ready.
You can now use the git branch -vva command to list your local and remote-tracking branches and see whether each local branch is ahead of or behind the remote:
git branch -vva
If there are any changes you want to incorporate you can use the git merge command (outlined in a previous post) to merge those changes.
The video Pulling & Fetching Changes from a Remote by Tower illustrates this command.
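The fetch, review, and merge sequence can be sketched in a sandbox as follows. The repositories source and copy, the file contents, and the commit identity are placeholders, and a local path stands in for a real remote URL.

```shell
set -e
# Placeholder identity so the sandbox commits succeed on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# A source repository and a clone of it.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/main
printf 'v1\n' > file.txt
git add file.txt
git commit -q -m "First commit"
cd "$sandbox"
git clone -q source copy

# New work appears in the source after the clone was made.
cd "$sandbox/source"
printf 'v2\n' > file.txt
git commit -q -am "Second commit"

# Fetch downloads the new commit but leaves the working copy alone.
cd "$sandbox/copy"
git fetch -q origin
content_after_fetch=$(cat file.txt)

# Review what the remote has that the local branch lacks.
git log --oneline main..origin/main
git branch -vva

# Merge once you are ready.
git merge -q origin/main
content_after_merge=$(cat file.txt)
```

The file in the working copy only changes at the merge step, which is what makes fetch safe to run at any time.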
Git pull
Similarly, the git pull command can fetch and automatically merge changes from your remote repository into your local repository.
For example, if your active branch already tracks a branch on a remote (as it does after git clone or git push -u), you can simply use the following:
git pull
Git will fetch and attempt to automatically merge any changes.
According to the “Git Branching - Remote Branches” section from Pro Git:
Generally it’s better to simply use the fetch and merge commands explicitly as the magic of git pull can often be confusing.
The video Pulling & Fetching Changes from a Remote by Tower illustrates this command.
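For comparison, here is a sandbox sketch of git pull doing the download and the merge in one step. As before, source and copy are placeholder repositories in a temporary directory, and the example covers only the simple case where the local branch has no new commits of its own.

```shell
set -e
# Placeholder identity so the sandbox commits succeed on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# A source repository and a clone of it.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/main
printf 'v1\n' > file.txt
git add file.txt
git commit -q -m "First commit"
cd "$sandbox"
git clone -q source copy

# New work appears in the source after the clone was made.
cd "$sandbox/source"
printf 'v2\n' > file.txt
git commit -q -am "Second commit"

# The cloned branch already tracks origin/main, so no arguments are needed.
cd "$sandbox/copy"
git pull -q
content_after_pull=$(cat file.txt)
git log --oneline
```

When both sides have new commits, recent versions of Git may ask you to choose how to reconcile them, which is one reason the explicit fetch and merge steps can be easier to follow.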
Git push
The previous commands were about downloading changes from remote repositories. To share changes you have made, you can use the git push command.
When you first use the command for a given branch you have to use the longer form of the command, git push -u <shortname> <branchname>. To begin, make sure you have checked out the branch that you want to share, then you can use the command. For example, if you previously set up a remote repository with the shortname origin and you want to publish a branch called newfeature, you would use the following:
git push -u origin newfeature
This will create the new branch on the remote repository, share the repository data for that branch, and establish a tracking connection with that branch on your local repository.
Because we established a tracking connection, the next time you want to push changes from that branch, you can use a simpler command:
git push
This will push the latest changes from your current branch on your local repository to the tracked branch on the remote repository.
The video Pushing Changes to a Remote by Tower illustrates this command.
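Both forms of the command can be tried in a sandbox. This sketch assumes a bare repository in a temporary directory (the hypothetical hosted.git) in place of a real host, and the project directory, file contents, and commit identity are placeholders.

```shell
set -e
# Placeholder identity so the sandbox commits succeed on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for an empty repository on a hosting service.
git init -q --bare hosted.git

# A local project with one commit, connected to the hosted repository.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
printf 'base\n' > file.txt
git add file.txt
git commit -q -m "First commit"
git remote add origin "$sandbox/hosted.git"

# Create the branch to share and commit to it.
git checkout -q -b newfeature
printf 'feature\n' >> file.txt
git commit -q -am "Start new feature"

# First push: name the remote and the branch, and set up tracking with -u.
git push -q -u origin newfeature

# Later pushes from this branch can use the short form.
printf 'more\n' >> file.txt
git commit -q -am "Continue new feature"
git push -q

local_tip=$(git rev-parse newfeature)
remote_tip=$(git -C "$sandbox/hosted.git" rev-parse refs/heads/newfeature)
upstream=$(git rev-parse --abbrev-ref 'newfeature@{upstream}')
echo "newfeature tracks $upstream"
```

After the second push, the branch on the hosted repository points at the same commit as the local branch.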
Reviewing remote repository connections
To see the remote repositories that have tracking connections with your local repository, use the following:
git remote -v
To see which remote branch, if any, each of your local branches tracks (the active branch is marked with an asterisk), use the following:
git branch -vva
To see more information about a remote, use the command git remote show <shortname>. For example, for a remote with the shortname origin, use the following:
git remote show origin
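To see what these three commands report, you could run them in a throwaway clone. The sketch below assumes placeholder repositories named source and copy in a temporary directory, with a local path standing in for a real URL.

```shell
set -e
# Placeholder identity so the sandbox commit succeeds on any machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# A source repository with one commit, and a clone of it.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/main
printf 'v1\n' > file.txt
git add file.txt
git commit -q -m "First commit"
cd "$sandbox"
git clone -q source copy
cd copy

# Shortnames and URLs of all configured remotes.
git remote -v

# Local and remote-tracking branches, with upstream information.
git branch -vva

# Details about one remote, including which branches it has.
git remote show origin

upstream=$(git rev-parse --abbrev-ref '@{upstream}')
```

Note that git remote show contacts the remote, so against a real host it needs a network connection, while the other two commands read only local data.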
More useful commands for working with remote repositories
To rename a remote’s shortname use the git remote rename <oldname> <newname> command. For example, for a remote called temp that you want to rename to offsite, use the following:
git remote rename temp offsite
To remove a remote from your local repository use the git remote remove <remotename> command. This deletes only your local record of the connection, not the remote repository itself. For example, for a remote called legacy, use the following:
git remote remove legacy
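Both commands can be tried safely in a sandbox. In this sketch, temp.git and legacy.git are hypothetical bare repositories in a temporary directory that stand in for real remotes.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Two stand-in remote repositories.
git init -q --bare temp.git
git init -q --bare legacy.git

# A local project connected to both.
git init -q project
cd project
git remote add temp "$sandbox/temp.git"
git remote add legacy "$sandbox/legacy.git"

# Rename one remote and remove the other.
git remote rename temp offsite
git remote remove legacy

# Only "offsite" is left in the local configuration.
git remote -v
remotes=$(git remote)
```

The legacy.git directory still exists afterwards, since removing a remote changes only the local repository’s configuration.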
Additional resources
Both the “Git Basics - Working with Remotes” and “Git Branching - Remote Branches” sections from Pro Git are extremely helpful for understanding this topic. They contain examples and diagrams that illustrate the concepts more thoroughly. If you have any questions or confusion about the topic, those two sections will likely help clarify the subject.
Using remote repositories
There are many options for using remote repositories.
You can set up your own remote Git server or use one offered by a service provider. The “Git on the Server” section of Pro Git offers thorough step-by-step instructions for setting up your own Git server.
A simpler option (especially when you’re getting started) is to use a remote Git server hosted by a service provider. GitHub is the most popular and well-known option, but it’s far from the only one. (If you decide to use GitHub, the “GitHub” section of Pro Git has instructions to help you get started.) As pointed out in the “Git on the Server - Third Party Hosted Options” section from Pro Git:
These days, you have a huge number of hosting options to choose from, each with different advantages and disadvantages. To see an up-to-date list, check out the GitHosting page on the main Git wiki at https://git.wiki.kernel.org/index.php/GitHosting.
Many of the services on that list include options for free private repositories.
Next steps in learning Git
As previously mentioned, there is much more to learn in Git. We have touched on only the introductory features of this powerful tool.
To find a curated list of many more resources for learning about Git you can review our Git and GitHub Topic post.
We also highly recommend reading Pro Git by Scott Chacon and Ben Straub. It is a thorough and organized guide for using Git, from basic features to advanced topics. It covers several aspects of GitHub as well. And it is available for free on Git’s official website.