The municipal scraping effort we’re working on is extremely friendly to contributors of all backgrounds, and we accept code contributions via GitHub Pull Requests.
Before you begin, you should create a GitHub account if you don’t already have one, and learn the basics of using Git (slightly out of scope for this document). This guide assumes basic proficiency with Git.
Please test your scraper locally. If you have questions, we’re all quite happy to go back and forth in the GitHub comments to work on any changes that are needed.
In general, keep the Pull Request short and self-contained. If you’ll be modifying multiple jurisdictions, please submit separate Pull Requests. It will make life much easier for the reviewers!
First, navigate to the repo you want to contribute to and create a fork. If you’re contributing a municipal scraper within the United States, for example, view that repo’s page on GitHub and click the fork button.
This creates a repo (called scrapers-us-municipal, if you’ve followed the link just up above) on your personal account that you can commit to.
You can clone this repo down to your local machine by using git clone (do read up on the GitHub guide if you’re having trouble with this step - or git!)
After pulling the repo down, we’ll set a new remote called upstream to help interact with the opencivicdata repo later on.
$ git remote add upstream git@github.com:opencivicdata/municipal-scrapers-us.git
If you cloned the opencivicdata repo before you forked the repo on GitHub, don’t worry - you can adjust this fairly quickly!
$ git remote rm origin
$ git remote add origin git@github.com:yourbadself/municipal-scrapers-us.git
$ git fetch origin
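To confirm the remotes ended up where you expect, `git remote -v` lists each one with its URL. The sketch below tries this in a throwaway repository, so it is safe to run anywhere. It assumes the usual GitHub SSH URL form (`git@github.com:<owner>/<repo>.git`), uses `yourbadself` as a placeholder username, and swaps in `git remote set-url` as a one-step alternative to removing and re-adding origin.

```shell
# Scratch repository so nothing here touches a real checkout.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .

# Simulate having cloned the opencivicdata repo directly:
# origin points at upstream instead of at your fork.
git remote add origin git@github.com:opencivicdata/municipal-scrapers-us.git

# Re-point origin at your fork ("yourbadself" is a placeholder username)...
git remote set-url origin git@github.com:yourbadself/municipal-scrapers-us.git
# ...and register the opencivicdata repo as upstream.
git remote add upstream git@github.com:opencivicdata/municipal-scrapers-us.git

# List every remote with its fetch and push URLs.
git remote -v
```

Neither `git remote add` nor `git remote set-url` contacts the server, so a typo in the URL only shows up on the next fetch or push.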
Before you submit your Pull Request, it’s quite handy to run through a quick checklist of common (and easy to catch) gotchas:
- Have you added yourself to the AUTHORS file? If not, please do.
- Is your Pull Request up-to-date with the opencivicdata repo? If it’s not, it might be helpful to jump down to the Keeping your branch up to date section below.
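Both checklist items can be checked from the command line. What follows is a rough sketch and not part of the project’s tooling: the helper names `authors_lists` and `is_up_to_date` are made up for illustration, and it assumes the AUTHORS file sits in the directory you run it from and that you have already run `git fetch upstream`.

```shell
# Hypothetical helper: succeed if the given name appears in AUTHORS.
# Assumes AUTHORS is in the current directory (the repo root).
authors_lists() {
    grep -qiF -- "$1" AUTHORS
}

# Hypothetical helper: succeed if the current branch already contains
# every commit on the given ref (default: upstream/master).
# Run "git fetch upstream" first so that ref is current.
is_up_to_date() {
    git merge-base --is-ancestor "${1:-upstream/master}" HEAD
}
```

From the root of your clone, something like `authors_lists "Your Name" || echo "Add yourself to AUTHORS"` and `is_up_to_date || echo "Merge upstream/master first"` will flag either problem before a reviewer has to.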
Finally, navigate to the commit you made to your forked repo, and click the button to submit a pull request.
This guide won’t get into a generic git tutorial, and assumes basic proficiency with git and some knowledge of GitHub.
It’s good practice to use a branch when working on the scrapers. A branch makes it easier to keep integrating upstream changes into your work, and lets you compare changes without much effort. With many people working on the codebase at the same time, it’s likely that some changes will affect others. By using a branch, it’s much easier to fix these conflicts.
Please do make sure you always create a branch off the master branch, unless you’ve got lots of git experience and are doing this for a very specific reason.
To create a branch, you can check out a new branch (this operation creates the branch, so don’t worry about using git branch just yet).
$ git checkout -b bugfix/fix-this-broken-jurisdiction
It’s common to prefix a branch name with bugfix or feature (or anything else that’s short and descriptive). After the prefix, add a descriptive slug related to the change, so that it’s easy to remember which branch is which. These are sometimes called “topic branches”.
After this, you can check which branch you’re working on by running git branch, and looking for the marked branch.
$ git branch
* bugfix/fix-this-broken-jurisdiction
  master
To switch back to the master branch (for any reason), you can check out that branch again.
$ git checkout master
$ git branch
  bugfix/fix-this-broken-jurisdiction
* master
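If your Git is version 2.23 or newer, `git switch` covers the same ground as the `git checkout` commands above, and `git branch --show-current` prints just the name of the branch you are on. The sketch below runs in a throwaway repository; the user name and email are placeholders that exist only so the sandbox can make a commit.

```shell
# Scratch repository with a single commit on master.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Sandbox User"
git config user.email sandbox@example.invalid
git commit -q --allow-empty -m "Initial commit"

# Create and switch to a topic branch (same effect as "git checkout -b").
git switch -q -c bugfix/fix-this-broken-jurisdiction
git branch --show-current

# Switch back to master (same effect as "git checkout master").
git switch -q master
git branch --show-current
```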
It saves quite a bit of time if all upstream changes have been incorporated into your branch when you send in a Pull Request. This is often not an issue for short-lived branches. Sometimes, however, people have changed code in the opencivicdata repo in the meantime, and you need to merge code from “upstream” into your working branch.
Let’s go over how to do this.
The following assumes you have a setup similar to above. Make sure that you have the upstream remote set up, and are working on a topic branch.
Firstly, be sure that you’ve committed all your code, and you’re up to date.
$ git branch
* bugfix/fix-this-broken-jurisdiction
  master
$ git checkout master
$ git pull upstream master
$ git checkout bugfix/fix-this-broken-jurisdiction
$ git merge master
Please do remember to replace bugfix/fix-this-broken-jurisdiction with the name of the topic branch that you’re working on (as seen in the output of the first command run).
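If you’d like to watch this flow happen before trying it on real work, here is a sketch that plays it out in a throwaway directory. A local bare repository stands in for the opencivicdata repo, a second clone plays the part of another contributor, and every file name, user name and email is a placeholder. The `--no-edit` flag only tells the merge to accept the default commit message.

```shell
# Throwaway setup: a local bare repository stands in for upstream.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"
git --git-dir="$sandbox/upstream.git" symbolic-ref HEAD refs/heads/master

# Your working clone, with the stand-in registered as "upstream".
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Sandbox User"
git config user.email sandbox@example.invalid
git remote add upstream "$sandbox/upstream.git"
git commit -q --allow-empty -m "Initial commit"
git push -q upstream master

# Start a topic branch and commit some work on it.
git checkout -q -b bugfix/fix-this-broken-jurisdiction
echo "fix" > scraper.txt
git add scraper.txt
git commit -q -m "Fix the scraper"

# Meanwhile, someone else lands a change upstream.
git clone -q "$sandbox/upstream.git" "$sandbox/other"
(
    cd "$sandbox/other"
    git config user.name "Other Contributor"
    git config user.email other@example.invalid
    echo "docs" > README.txt
    git add README.txt
    git commit -q -m "Update docs"
    git push -q origin master
)

# The update flow: refresh master, then merge it into the topic branch.
git checkout -q master
git pull -q upstream master
git checkout -q bugfix/fix-this-broken-jurisdiction
git merge -q --no-edit master

# The topic branch now holds both your work and the upstream change.
ls
```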
You can check how much has changed at any point by using git diff. Something like the following can come in quite handy when reviewing changes before sending in a Pull Request:
$ git diff master --color
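Two variations on git diff are also worth knowing: `--stat` gives a per-file summary instead of the full patch, and the three-dot form `master...HEAD` shows only what your branch changed since it split from master. The sketch below tries both in a throwaway repository; the file name and user details are placeholders.

```shell
# Scratch repository with one commit on master and one on a topic branch.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Sandbox User"
git config user.email sandbox@example.invalid
echo "one" > scraper.txt
git add scraper.txt
git commit -q -m "Initial commit"

git checkout -q -b bugfix/fix-this-broken-jurisdiction
echo "two" >> scraper.txt
git commit -q -am "Fix the scraper"

# Which files differ from master, and by how many lines.
git diff master --stat

# Only the changes made on this branch since it split from master.
git diff master...HEAD
```

The three-dot form is the one to reach for when master has moved on since you branched, because it leaves out changes that other people made.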