The main way of contributing to an open-source project that is hosted on GitHub is via a pull request. A pull request says, “Here are some changes that I have made in my copy. Please incorporate them into the main version of the project.”
(Also see Version control concepts and best practices.)
A .github/workflows/foo.yml file in your repository enables CI. If the upstream repository has such a file, then so does your fork.
git clone git@github.com:USERNAME/REPONAME.git
git clone https://github.com/USERNAME/REPONAME.git
where USERNAME is your GitHub username. (In any example command, you need to replace any text in ITALIC CAPS.)
cd REPONAME
git remote add upstream https://github.com/OTHERUSER/REPONAME.git
git fetch upstream
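The remote setup can be tried without network access. The following is a minimal sketch, assuming that two local bare repositories stand in for the upstream project and for your fork; the directory names upstream.git and fork.git are hypothetical, and with GitHub you would use the URLs shown in the commands above instead.

```shell
# Sketch: local bare repositories stand in for GitHub (hypothetical paths).
work=$(mktemp -d)
git init -q --bare "$work/upstream.git"   # plays the role of OTHERUSER/REPONAME
git init -q --bare "$work/fork.git"       # plays the role of USERNAME/REPONAME

# Clone the fork; git names this remote "origin" automatically.
# (git warns that the clone is empty; that is harmless in this sketch.)
git clone -q "$work/fork.git" "$work/REPONAME"
cd "$work/REPONAME"

# Register the upstream repository and download its history.
git remote add upstream "$work/upstream.git"
git fetch -q upstream

# List the configured remotes with their URLs.
git remote -v
```

Both origin (your fork) and upstream (the main project) should appear in the listing.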
README file.
When you are ready to start on a unit of work, such as fixing a bug or implementing a feature, create a branch. A branch is a parallel thread of development. You can create as many branches as you want in your repository, which is like having multiple independent repositories. All the work on every branch is saved on GitHub once you push it.
Each working copy has a single current branch. When you commit changes (with git commit) or push commits to GitHub (with git push), they are saved to the current branch. You must always remember which branch you are working on; making changes to the wrong branch is a common mistake.
Here are two ways to change which branch is the current branch. (Before changing the current branch, update your repository from upstream and commit any changes.)
To create a branch named MYBRANCHNAME and make it the current branch, run
git checkout -b MYBRANCHNAME
or run (these two commands are equivalent to the one above):
git branch MYBRANCHNAME
git checkout MYBRANCHNAME
Use a descriptive, readable name for your branch, such as unicode-support or fix-issue-22.
Switch to an existing branch (that is, make an existing branch the current branch) by running
git checkout MYBRANCHNAME
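The branch commands can be tried safely in a throwaway repository. This is a minimal sketch; the temporary directory, the user identity, and the empty initial commit are assumptions made only so that the example is self-contained.

```shell
# Sketch in a throwaway repository (path and identity are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

# Create a branch and make it the current branch.
git checkout -q -b fix-issue-22

# Switch to an existing branch, then return to the new one.
git checkout -q main
git checkout -q fix-issue-22

# "git branch" marks the current branch with an asterisk.
git branch
current=$(git symbolic-ref --short HEAD)
echo "current branch: $current"
```

Checking the current branch this way before committing helps avoid making changes on the wrong branch.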
Each branch should represent a logical unit of work. If you are doing two different tasks like fixing a bug and performing a refactoring (or if while doing a task you discover a second, distinct task, like the need to refactor or to improve unrelated documentation), then create two different branches for them. This is a bit of a hassle for you, but it makes reviewing your changes much easier, and the maintainers will be more likely to accept your changes.
Do not work on the main branch in your fork. (It is often named main or master.) If you do so, future pull requests will be cluttered by unnecessary merge commits. The rest of this section explains why; you can skip it unless you want to learn more details.
When a developer merges your work into the main repository, that usually creates a new commit (it contains the same code changes, but has a different identity from the one or more commits that you made in your branch). Whenever your branch contains commits that upstream lacks, pulling from upstream creates a new merge commit. Once a branch has diverged from upstream in this way, each pull accumulates more changes (differing commits) from upstream.
Therefore, it is better to keep your main branch identical to upstream, and create a new branch for each pull request. You will delete the branch when your pull request is merged into the upstream repository. (GitHub will do the deletion automatically if you enable “Automatically delete head branches” in the repository settings of your fork.)
If you do create a pull request from your fork's main branch, leading to spurious merge commits on your fork's main branch, then after all your pull requests are merged, you are probably best off deleting your GitHub fork (do this at GitHub.com) and all clones of it, and then re-creating the fork (do this at GitHub.com) and re-cloning. There are other ways to fix the problem, but they use advanced git commands and are more error-prone.
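One way to confirm that your main branch is still identical to upstream is sketched below. It assumes a remote named upstream and a main branch named main, and it simulates the upstream project with a local repository. The --ff-only option is a suggestion that goes beyond the text above: it makes git pull fail instead of creating a merge commit.

```shell
# Sketch: a local repository stands in for the upstream project.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git checkout -q -b main
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
git commit -q --allow-empty -m "Initial commit"

# Your clone (standing in for a clone of your fork), with an "upstream" remote.
git clone -q "$work/upstream" "$work/mine"
cd "$work/mine"
git remote add upstream "$work/upstream"

# Upstream gains a commit after you cloned.
git -C "$work/upstream" commit -q --allow-empty -m "Upstream change"

# Update main by fast-forward only; no merge commit can result.
git checkout -q main
git pull -q --ff-only upstream main

# Count the commits on main that upstream lacks; 0 means main has not diverged.
git fetch -q upstream
ahead=$(git rev-list --count upstream/main..main)
echo "commits on main missing from upstream: $ahead"
```

If the count is ever nonzero, your main branch contains work that belongs on a separate branch.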
Before you start to implement your changes, write tests that currently fail but will pass once you have fixed the bug or implemented the feature. Run the tests locally to confirm that they currently fail. (The project's developer documentation will tell you how to do this.) Now, commit the tests and push them. Check that continuous integration has run the project's tests on your fork and that they failed.
Now, do your work, testing locally and committing logical chunks of work as you go. You can push these commits to GitHub by running git push whenever you like. Eventually, you will be done and ready for a code review.
Being done requires at least the following:
Periodically pull upstream into your branch; that is, incorporate into your branch any work that other maintainers have done since you created your branch. It's easier to do this frequently than all at once. Here are two ways to do so:
git pull upstream BRANCHNAME
or
git pull https://github.com/OTHERUSER/REPONAME.git
If this command had any effect, then:
Oftentimes, when you are working to add a feature, you will also fix a bug, or add documentation, or perform a refactoring. It is great to make these improvements. However, each pull request should be a single, logical unit. If you have made multiple different changes, create a new branch and a separate pull request for each one.
Your repository might start out having only a branch named all-my-changes (the actual name should be more descriptive!). After you make new branches for the logically distinct changes, you might have branches all-my-changes, add-documentation, and refactoring. While you develop, periodically pull the main, add-documentation, and refactoring branches into all-my-changes.
Create a pull request for each branch when it is ready. Don't create a pull request for all-my-changes until the pull requests for the ancillary branches have been merged into the main branch, and you have merged the main branch into all-my-changes.
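The branch layout described above might be set up as in the following sketch. It runs in a throwaway repository, the file names are hypothetical, and it uses git merge, which is the purely local counterpart of pulling one branch into another.

```shell
# Sketch in a throwaway repository (path, identity, and files are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

# The catch-all branch where the work started.
git checkout -q -b all-my-changes
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start the feature"

# Split logically distinct changes into their own branches, created from main.
git checkout -q -b add-documentation main
echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add documentation"

git checkout -q -b refactoring main
echo "refactored" > refactor.txt
git add refactor.txt
git commit -q -m "Refactor"

# Periodically bring main and the ancillary branches into all-my-changes.
git checkout -q all-my-changes
git merge -q --no-edit main
git merge -q --no-edit add-documentation
git merge -q --no-edit refactoring
ls
```

After the merges, all-my-changes contains the work from every branch, while add-documentation and refactoring each remain small enough to review on their own.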
It's a bit more work to separate different changes into different branches, but it makes each pull request much easier to understand. Each pull request can be reviewed more quickly. A change history with more, smaller commits is more helpful to future developers. Test failures are easier to understand. Any interactions between changes are easy to see.
Once you are happy with your work and you believe it is ready to be incorporated into the project's main repository, you can create a pull request.
Make your code self-explanatory. You should not write pull request comments on lines of code, and you should write very little in the introductory comment to your pull request. Comments in a pull request will never be seen by a programmer reading the source code. If there is information that is needed by a programmer reading the source code, you should put it in a code comment. This also applies to answering questions from reviewers: it is better to clarify the code or add documentation, rather than answering a question in the pull request comment thread.
Sometimes you want feedback on your code before you are ready to merge it into a different fork. In this case, you can create a pull request between two branches of your fork.
Sometimes, you want a review of code that you have already pushed to GitHub. Or, you want a holistic code review to critique the design of an entire component of your code, rather than incremental code reviews of bits and pieces of it.
GitHub's pull request mechanism does not support this workflow well, but here are two ways to make it work.
main into it. You will never merge that pull request, but will merely address feedback in main and eventually close the pull request without merging it.
main (that is, do git checkout main; git checkout -b review). Now, the reviewer reads and edits the review branch in their normal editor, adding TODO code comments. The author also edits the review branch, until there are no more TODO code comments in the diff. Then, merge the branch into main.
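For the review-branch approach, the remaining TODO comments can be counted from the diff between main and review. The sketch below assumes a throwaway repository; the file app.py and its contents are hypothetical, and the grep pattern is only one possible way to spot added TODO lines.

```shell
# Sketch in a throwaway repository (path, identity, and file are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"
echo 'print("hello")' > app.py
git add app.py
git commit -q -m "Initial code"

# Create the review branch from main.
git checkout -q main
git checkout -q -b review

# The reviewer leaves feedback as a TODO code comment.
echo '# TODO: explain why this greeting is printed' >> app.py
git commit -q -am "Review comments"

# Count the TODO lines that the review branch adds relative to main.
todos=$(git diff main review | grep -c '^+.*TODO')
echo "open review comments: $todos"
```

When the author has addressed and removed every TODO comment, the count drops to zero and the branch is ready to merge into main.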
As soon as you receive feedback, you can start working on it. The reviewer should send you a message and/or assign the code review back to you, but the reviewer might forget, so don't wait for those events.
Make sure you are working on the right branch; use git branch to check.
Never force a push with git push -f. Forcing a push is bad practice, will cause loss of code review comments that GitHub attached to that commit (you can't control which commit GitHub uses), and will cause extra merges or merge conflicts for people who have cloned your branch (such as the people doing the review).
Go through each piece of feedback.
When you push commits to GitHub, the pull request will be automatically updated. If you change a line of code on which you received feedback, GitHub marks that feedback as outdated and no longer shows it by default. That is, GitHub assumes that if a line near a review comment has been changed, then the review comment has been resolved. This means that you should try not to push changes (such as a change to indentation) that change a line without addressing all the comments related to that line.
Periodically run git remote prune origin to remove from your working copy the remote-tracking branches whose branch on GitHub has been deleted, so that you don't accidentally use them.
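The effect of pruning can be observed locally. In this sketch a local repository stands in for your GitHub fork, and the branch name old-feature is hypothetical.

```shell
# Sketch: a local repository stands in for your fork on GitHub.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch old-feature

# Your working copy sees both branches as origin/main and origin/old-feature.
git clone -q "$work/origin" "$work/clone"
cd "$work/clone"

# The branch is deleted on the remote, as happens after a pull request is merged.
git -C "$work/origin" branch -q -D old-feature

# The stale reference origin/old-feature remains in the clone until it is pruned.
git remote prune origin

# List the remote-tracking branches that remain.
git branch -r
```

Pruning removes only the remote-tracking reference; a local branch of the same name, if you created one, must be deleted separately.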
Some Git documentation recommends rebasing, amending commits, or other changes to existing version control history. Don't do any of these things. They are confusing and error-prone, they can corrupt your pull request, and they are not necessary. All of your changes will be squashed and merged into a single commit when your pull request is accepted, so don't worry about what the version control history of your branch looks like. Just focus on its differences from the upstream's main branch, which you can see in your pull request.
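As an illustration of leaving history alone, the sketch below addresses feedback with a new commit and then checks that the commit the reviewers saw is still part of the branch. The repository, the branch name, and the file fix.txt are hypothetical.

```shell
# Sketch in a throwaway repository (path, identity, and file are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b fix-issue-22
echo "first attempt" > fix.txt
git add fix.txt
git commit -q -m "Fix issue 22"
reviewed=$(git rev-parse HEAD)   # the commit that reviewers commented on

# Address the feedback with a new commit rather than "git commit --amend".
echo "revised after review" > fix.txt
git commit -q -am "Address review feedback"

# The reviewed commit is still an ancestor of the branch tip.
if git merge-base --is-ancestor "$reviewed" HEAD; then
  preserved=yes
else
  preserved=no
fi
echo "reviewed commit preserved: $preserved"
```

Because the earlier commit is untouched, an ordinary git push updates the pull request and no forced push is needed.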
You will receive email about comments to your pull requests. Don't reply by email. Instead, reply on the GitHub webpage that is referenced by the email. One reason is that if you reply by email, you may needlessly bloat your response with all the quoted text from the email you received. Another reason is that if you reply by email, GitHub may not associate your comment with the right thread in the code review.
This section is for maintainers who are reviewing and merging a pull request.
This section is currently incomplete, but contains a few tips.
It is desirable to keep the version control history clean: in the main branch, each logical change should be in one commit, even if the pull request entailed work over a period of time and multiple commits on a feature branch. To achieve this, select “Squash and merge” when you merge a pull request. “Squash and merge” results in a single commit that contains all the changes in the pull request. Consistently using “Squash and merge” results in a linear commit history on the main branch.
A single commit is desirable because a pull request represents a single conceptual change that has been tested and reviewed as a logical unit. When a pull request is ready to be merged, it may consist of many commits. Future maintainers will not be interested in each individual commit, such as those that fix bugs within the logical change or respond to comments during the pull request review. A git history that is littered with lots of little commits is much harder to read and understand.
A side benefit of squash-and-merge is that every commit on the main branch passes tests.
The repository owner can prevent incorrect pull request merges. In the repository settings, in the “Merge button” section, disable “Allow merge commits” and “Allow rebase merging”. You might also want to enable “Automatically delete head branches”.
When you squash-and-merge a GitHub pull request, the default first line of the commit message is the pull request's title, and the remainder (which GitHub calls the “extended description”) is the concatenation of the messages for all the commits in the pull request. This latter information is not useful to future developers. Therefore, edit the detail text to remove all the commit messages. Use the pull request's description (the very first comment that was written when the pull request was created), if any.
Another problem with not editing the commit message is that it may leave “[ci skip]” in the commit message, so the resulting commit on the main branch may not be processed by continuous integration such as Azure Pipelines, CircleCI, GitHub Actions, or Travis CI. (CI may perform some action on every (successful) commit to main.)
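A leftover marker can be detected by searching the full commit message. This is a minimal sketch in a throwaway repository; the commit message is a made-up example of what an unedited squash message might look like.

```shell
# Sketch in a throwaway repository (path, identity, and message are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Example User"
git config user.email "user@example.com"

# Simulate a squashed commit whose extended description was not edited.
git commit -q --allow-empty -m "Add unicode support" -m "* Fix typo [ci skip]"

# Search the full message of the most recent commit for the marker.
if git log -1 --format=%B | grep -qF '[ci skip]'; then
  marker=present
else
  marker=absent
fi
echo "[ci skip] marker: $marker"
```

The same check, run on the text in the merge dialog before confirming, would catch the marker before it reaches the main branch.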
Michael Ernst