The main way of contributing to an open-source project that is hosted on GitHub is via a pull request. A pull request says, “Here are some changes that I have made in my copy. Please incorporate them into the main version of the program.”
(Also see Version control concepts and best practices.)
A .github/workflows/foo.yml file in your repository enables CI.
git clone git@github.com:USERNAME/REPONAME.git
where USERNAME is your GitHub username. (In any example command, you need to replace any text in ITALIC CAPS.)
cd REPONAME
git remote add upstream https://github.com/OTHERUSER/REPONAME.git
git fetch upstream
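The commands above need network access and a real fork. As a rough rehearsal, the sketch below runs the same steps in a throwaway sandbox, with two local bare repositories standing in for OTHERUSER's repository and for your fork. The directory names (seed, upstream.git, fork.git) and the demo identity are assumptions made for illustration only.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for https://github.com/OTHERUSER/REPONAME.git, seeded with one commit.
git -c init.defaultBranch=master init -q seed
cd seed
echo "hello" > README
git add README
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed upstream.git

# Stand-in for your fork on GitHub (git@github.com:USERNAME/REPONAME.git).
git clone -q --bare upstream.git fork.git

# The steps from the text: clone your fork, then register and fetch upstream.
git clone -q fork.git REPONAME
cd REPONAME
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream

# Show both remotes: origin (your fork) and upstream (the main repository).
git remote -v
remotes=$(git remote | sort | tr '\n' ' ')
```

After this, origin refers to your fork and upstream refers to the main repository, which is the arrangement the rest of this advice assumes.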
README file.
When you are ready to start on a unit of work, such as fixing a bug or implementing a feature, create a branch. A branch is a parallel thread of development — you can create as many branches as you want in your repository, which is like having multiple independent repositories.
To create a branch named MYFEATURE and switch to it, first update your branch from upstream, then run
git checkout -b MYFEATURE
(Use a descriptive, readable name for your branch, such as unicode-support or fix-issue-22.)
You can switch to an existing branch by executing a command such as
git checkout MYBUGFIX
When you commit changes (with git commit) or push commits to GitHub (with git push), they are saved to the current branch.
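To see that commits land on the current branch only, here is a small sandbox sketch. The repository, file name, and commit messages are hypothetical; the branch name fix-issue-22 is the example name used above.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q REPONAME
cd REPONAME
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Create a branch and switch to it in one step.
git checkout -q -b fix-issue-22

# A commit made now is recorded on fix-issue-22 only.
echo "v2" > notes.txt
git commit -q -a -m "Fix issue 22"

current=$(git rev-parse --abbrev-ref HEAD)
only_on_branch=$(git rev-list --count master..fix-issue-22)
echo "current branch: $current"
echo "commits on fix-issue-22 that master lacks: $only_on_branch"

# Switching back shows that master is untouched.
git checkout -q master
cat notes.txt
```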
Each branch should represent a logical unit of work. If you are doing two different tasks like fixing a bug and performing a refactoring (or if while doing a task you discover a second, distinct task, like the need to refactor or to improve unrelated documentation), then create two different branches for them. This is a bit of a hassle for you, but it makes reviewing your changes much easier, and the maintainers will be more likely to accept your changes.
Do not work on the master branch in your fork. If you do so, future pull requests will be cluttered by unnecessary merge commits. The rest of this section explains why; you can skip it unless you want to learn more details.
When a developer merges your work into the main repository, that usually creates a new commit (it contains the same code changes, but has a different identity than the one or more commits that you made in your branch). Whenever a branch isn't identical to upstream, pulling from upstream will create a new merge commit. Once a branch is different from upstream, each pull will accumulate more changes (differing commits) from upstream.
Therefore, it is better to keep your master branch identical to upstream, and create a new branch for each pull request. You will delete the branch when your pull request is merged into the upstream repository.
If you do create a pull request on master, then after it is merged, you are probably best off deleting your GitHub fork and all clones of it, and then re-creating it. Only do this if all its work has been merged upstream! There are other ways to fix the problem, but they are more error-prone.
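The effect described in this section can be reproduced locally. The sketch below is an approximation under stated assumptions: two sandbox directories (upstream and fork, both hypothetical) play the main repository and your clone, and git merge --squash stands in for a maintainer merging your work as a new commit.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# "upstream" plays the main repository; "fork" plays your clone.
git -c init.defaultBranch=master init -q upstream
cd upstream
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
cd ..
git clone -q upstream fork

# The mistake: do the work directly on master in the fork.
cd fork
echo "feature" >> file.txt
git commit -q -a -m "Add feature (on master)"

# A maintainer incorporates the work: same change, but a brand-new commit.
cd ../upstream
git fetch -q ../fork master
git merge -q --squash FETCH_HEAD
git commit -q -m "Squashed: add feature"

# Back in the fork, pulling from upstream now has to create a merge commit.
cd ../fork
git pull -q --no-rebase --no-edit

merges=$(git rev-list --merges --count HEAD)
extra=$(git rev-list --count origin/master..master)
if git diff --quiet origin/master master; then same_content=yes; else same_content=no; fi
echo "merge commits: $merges, commits upstream lacks: $extra, same content: $same_content"
```

The fork's master ends up with the same file contents as upstream, yet it carries commits that upstream lacks, and those would show up in any later pull request made from master.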
Before you start to implement your changes, write tests that currently fail but will pass once you have fixed the bug or implemented the feature. Run the tests locally to confirm that they currently fail. (The project's developer documentation will tell you how to do this.) Now, commit the tests and push them. Check that continuous integration has run the project's tests on your fork and that they failed.
Now, do your work, testing locally and committing logical chunks of work as you go. You can push these commits to GitHub by running git push whenever you like. Eventually, you will be done and ready for a code review.
Being done requires at least the following:
Periodically pull upstream into your branch; that is, incorporate into your branch any work that other maintainers have done since you created your branch. It's easier to do this frequently than all at once. Here are two ways to do so:
git pull upstream BRANCHNAME
or
git pull https://github.com/OTHERUSER/REPONAME.git
If this command had any effect, then:
Oftentimes, when you are working to add a feature, you will also fix a bug, or add documentation, or perform a refactoring. It is great to make these improvements. However, each pull request should be a single, logical unit. If you have made multiple different changes, create a new branch and a separate pull request for each one.
Your repository might start out having only a branch named all-my-changes (the actual name should be more descriptive!). After you make new branches for the logically distinct changes, you might have branches all-my-changes, add-documentation, and refactoring. While you develop, periodically pull the master, add-documentation, and refactoring branches into all-my-changes.
Create a pull request for each branch when it is ready. Don't create a pull request for all-my-changes until the pull requests for the ancillary branches have been merged, and you have merged master into it.
It's a bit more work to separate different changes into different branches, but it makes each pull request much easier to understand. Each pull request can be reviewed more quickly. A change history with more, smaller commits is more helpful to future developers. Test failures are easier to understand. Any interactions between changes are easy to see.
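The branch layout described above might be set up roughly as follows. This is a sandbox sketch: the branch names are the ones used in the text, while the files and commit messages are hypothetical, and local merges stand in for pulling the branches.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q REPONAME
cd REPONAME
echo "code" > code.txt
git add code.txt
git commit -q -m "Initial commit"

# The umbrella branch that holds the work in progress.
git checkout -q -b all-my-changes
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start the feature"

# Each ancillary change gets its own branch, started from master.
git checkout -q -b add-documentation master
echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add documentation"

git checkout -q -b refactoring master
echo "refactored code" > code.txt
git commit -q -a -m "Refactor"

# Periodically bring master and the ancillary branches into all-my-changes.
git checkout -q all-my-changes
for b in master add-documentation refactoring; do
  git merge -q --no-edit "$b"
done

files=$(ls | tr '\n' ' ')
echo "files on all-my-changes: $files"
```

Each of add-documentation and refactoring can become its own small pull request, while all-my-changes keeps building on top of both.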
Once you are happy with your work and you believe it is ready to be incorporated into the project's main repository, you can create a pull request.
Make your code self-explanatory. You should not write pull request comments on lines of code, and you should write very little in the introductory comment to your pull request. Comments in a pull request will never be seen by a programmer reading the source code. If there is information that is needed by a programmer reading the source code, you should put it in a code comment. This also applies to answering questions from reviewers: it is better to clarify the code or add documentation, rather than answering a question in the pull request comment thread.
Sometimes you want feedback on your code before you are ready to merge it into a different fork. In this case, you can create a pull request between two branches of your fork.
Sometimes, you want a review of code that you have already pushed to GitHub. Or, you want a holistic code review to critique the design of an entire component of your code, rather than incremental code reviews of bits and pieces of it.
GitHub's pull request mechanism does not support this workflow well, but here are two ways to make it work.
master into it. You will never merge that pull request, but will merely address feedback in master and eventually close the pull request without merging it.
master (that is, do git checkout master; git checkout -b review). Now, the reviewer reads and edits the review branch in their normal editor, adding TODO code comments. The author also edits the review branch, until there are no more TODO code comments in the diff. Then, merge the branch into master.
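One possible way to tell when the review branch is finished is to count the TODO lines that it adds relative to master. The sketch below assumes a sandbox repository with a hypothetical build.sh file; the helper count_todos is an illustrative name, not part of git.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q REPONAME
cd REPONAME
printf '#!/bin/sh\necho "building"\n' > build.sh
git add build.sh
git commit -q -m "Initial commit"

# Hypothetical helper: count added lines in the diff that mention TODO.
count_todos() {
  git diff master...review | grep -c '^+.*TODO' || true
}

# Create the review branch from master.
git checkout -q master
git checkout -q -b review

# Reviewer: leave feedback as a TODO code comment.
echo '# TODO: fail if the compiler is missing' >> build.sh
git commit -q -a -m "Review comments"
todos_before=$(count_todos)

# Author: address the feedback and remove the TODO.
printf '#!/bin/sh\ncommand -v cc >/dev/null || exit 1\necho "building"\n' > build.sh
git commit -q -a -m "Address review comments"
todos_after=$(count_todos)
echo "TODOs before: $todos_before, after: $todos_after"

# No TODOs remain in the diff, so merge the branch into master.
git checkout -q master
git merge -q --no-edit review
```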
As soon as you receive feedback, you can start working on it. The reviewer should send you a message and/or assign the code review back to you, but the reviewer might forget, so don't wait for those events.
Make sure you are working on the right branch; use git branch to check.
Never force a push with git push -f. Forcing a push is bad practice, will cause loss of code review comments that GitHub attached to that commit (you can't control which commit GitHub uses), and will cause extra merges or merge conflicts for people who have cloned your branch (such as the people doing the review).
Go through each piece of feedback.
When you push commits to GitHub, the pull request will be automatically updated. If you change a line of code on which you received feedback, that feedback is no longer shown by default. That is, GitHub assumes that if a line near a review comment has been changed, then the review comment has been resolved. This means that you should try not to push changes (such as a change to indentation) that change a line without addressing all the comments related to that line.
Periodically run git remote prune origin to remove deleted branches from your working copy, so that you don't accidentally use them.
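The effect of pruning can be tried out in a sandbox. In this sketch, a local bare repository (origin.git, a hypothetical name) stands in for your fork on GitHub, and old-feature is a made-up branch that gets deleted there.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Seed a repository with master and one extra branch.
git -c init.defaultBranch=master init -q seed
cd seed
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git branch old-feature
cd ..

# origin.git stands in for your fork on GitHub; REPONAME is your working copy.
git clone -q --bare seed origin.git
git clone -q origin.git REPONAME
cd REPONAME

# The branch is deleted on the server, for example after its pull request is merged.
git --git-dir="$sandbox/origin.git" branch -q -D old-feature

# The working copy still lists the deleted branch until it is pruned.
before=$(git branch -r | grep -c 'origin/old-feature' || true)
git remote prune origin
after=$(git branch -r | grep -c 'origin/old-feature' || true)
echo "stale references before: $before, after: $after"
```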
Some Git documentation recommends rebasing, amending commits, or other changes to existing version control history. Don't do any of these things. They are confusing and error-prone, they can corrupt your pull request, and they are not necessary. All of your changes will be squashed and merged into a single commit when your pull request is accepted, so don't worry about what the version control history of your branch looks like. Just focus on its differences from the upstream's master, which you can see in your pull request.
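To look at a branch's differences from upstream's master on the command line, a three-dot diff is one option; it compares your branch with the point where it diverged from upstream. The sketch below is a sandbox illustration with hypothetical file names and deliberately messy commit messages.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q upstream
cd upstream
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
cd ..

# Clone with the remote named "upstream" rather than the default "origin".
git clone -q -o upstream upstream REPONAME
cd REPONAME
git checkout -q -b unicode-support

# A messy history on the branch: three small commits to one file.
for msg in "wip" "fix typo" "more wip"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# Meanwhile, upstream moves on with an unrelated change.
(cd ../upstream && echo "other" > other.txt && git add other.txt && git commit -q -m "Unrelated work")
git fetch -q upstream

# What matters for the pull request: the branch's changes since it diverged.
changed=$(git diff --name-only upstream/master...HEAD)
commits=$(git rev-list --count upstream/master..HEAD)
echo "files changed relative to upstream: $changed (across $commits commits)"
```

However many commits the branch contains, the diff shows only the net change, which is also what a squashed merge will record.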
You will receive email about comments to your pull requests. Don't reply by email. Instead, reply on the GitHub webpage that is referenced by the email. One reason is that if you reply by email, you may needlessly bloat your response with all the quoted text from the email you received. Another reason is that if you reply by email, GitHub may not associate your comment with the right thread in the code review.
This section is for maintainers who are reviewing and merging a pull request.
This section is currently incomplete, but contains a few tips.
To keep the version control history clean, select “Squash and merge” when you merge a pull request. “Squash and merge” results in a single commit that contains all the changes in the pull request.
A single commit is desirable because a pull request represents a single conceptual change that has been tested and reviewed as a logical unit. When a pull request is ready to be merged, it may consist of many commits. Future maintainers will not be interested in each individual commit, such as one that fixes a bug within the logical change or one that responds to feedback during the pull request review. A git history that is littered with lots of little commits is much harder to read and understand.
A side benefit of squash-and-merge is that every commit on the master branch passes tests.
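The result of squashing can be approximated locally with git merge --squash. This is only an analogy to GitHub's "Squash and merge" button, not a description of how GitHub implements it; the repository, file, and commit messages below are hypothetical.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q REPONAME
cd REPONAME
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A pull request branch with several small commits.
git checkout -q -b fix-issue-22
for step in one two three; do
  echo "$step" >> app.txt
  git commit -q -a -m "Step $step"
done

# Squash: stage the combined change on master, then record it as one commit.
git checkout -q master
git merge -q --squash fix-issue-22
git commit -q -m "Fix issue 22"

on_master=$(git rev-list --count master)
merges=$(git rev-list --merges --count master)
echo "commits on master: $on_master (merge commits: $merges)"
```

Master gains exactly one ordinary commit whose contents match the tip of the branch, so the three intermediate steps never appear in its history.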
The repository owner can prevent incorrect pull request merges. In the repository settings, in the “Merge button” section, disable “Allow merge commits” and “Allow rebase merging”. You might also want to enable “Automatically delete head branches”.
When you squash-and-merge a GitHub pull request, the default first line of the commit message is the pull request's title, and the remainder (which GitHub calls the “extended description”) is the concatenation of the messages for all the commits in the pull request. This latter information is not useful to future developers. Therefore, edit the detail text to remove all the commit messages. Use the pull request's description (the very first comment that was written when the pull request was created), if any.
Another problem with not editing the commit message is that it may leave “[ci skip]” in the commit message, so the merge commit may not be processed by continuous integration such as Azure Pipelines, CircleCI, GitHub Workflows, or Travis CI. (CI may perform some action on every (successful) commit to master.)
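Before merging, a maintainer could scan the branch's commit messages for "[ci skip]" as a reminder to edit the squashed message. The sketch below is a purely textual check in a sandbox repository with hypothetical commits; how a given CI service treats the marker is outside its scope.

```shell
set -eu
# Demo identity, so the script does not depend on your global git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q REPONAME
cd REPONAME
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A pull request branch in which one commit asked CI to skip it.
git checkout -q -b fix-issue-22
echo "partial" >> app.txt
git commit -q -a -m "WIP, not ready for CI [ci skip]"
echo "done" >> app.txt
git commit -q -a -m "Finish the fix"

# List the subjects of the branch's commits that carry the marker.
git log --format='%h %s' master..fix-issue-22 | grep -F '[ci skip]' || true

# Count message lines on the branch that contain the marker.
flagged=$(git log --format=%B master..fix-issue-22 | grep -F -c '[ci skip]' || true)
echo "commit message lines containing [ci skip]: $flagged"
```

A nonzero count means the default squashed message would include the marker unless it is edited out.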
Michael Ernst