Github Actions for a Python Project
2020/04/19
Any substantial software engineering project needs this trifecta of CI (continuous integration):
build
test
code quality/test coverage
For a long time I used CircleCI to do CI for my projects. It has served me well, but it wasn't always perfect. Ever since Github Actions officially came out I've been wanting to try it; this weekend I finally did, and I migrated to Github Actions right away with minimal effort.
The pain points of CircleCI
First, a quick rundown of how CircleCI handles the three stages outlined above. For a project on Github, CircleCI runs four steps:
1. pulls a prebuilt Docker image of the project for the CI job
2. updates dependencies as needed for the latest commit
3. runs the test command
4. uploads the test coverage result
Note that code quality analysis is done outside of CircleCI, and I use CodeClimate for static code analysis. Once the tests are finished, step 4 uploads the test coverage results from CircleCI to CodeClimate.
These standard steps don't change. However, the problem with CircleCI is how difficult and time-consuming these steps can be.
For step 1, a project like SLM Lab has a large Docker image (> 3GB), which takes time to build and push to Docker Hub. You also have to take care of OS-specific dependencies that are often missing and difficult to install from a bare OS image. Just look at the OS-level dependencies needed for SLM Lab:
This image has to be tested before being pushed and used by CI to run its jobs. In step 2, we also have to install the project dependencies. Python does not have a dependency distribution system as nice as those of Node.js or Ruby, so installing packages can be really painful and slow. We use Conda for dependency management. An important thing to do at this step is to cache the project dependencies, which was impossible before CircleCI V2, and I wanted to try out Github Actions before updating to V2.
Step 3 can be slow because the machines available on the free tier of CircleCI are slow. It takes 20 minutes to run the entire CI pipeline on CircleCI; as we will see later, it takes 5-10 minutes on Github Actions.
Step 4 is painful because CircleCI had no such thing as community-built Actions for tasks like uploading test coverage to CodeClimate. The result was a complicated block of YAML to do 2 simple things, run the tests and upload the results:
Migrating to Github Actions
By the end of my quick experimentation with Github Actions, I already had a working CI pipeline that was much cleaner and faster than my existing CircleCI pipeline, so I just turned off CircleCI and the migration was complete.
What makes Github Actions good:
Great design for composing workflows
Fast machines with prebuilt common OS-dependencies
Community-driven Github Actions
I spent a short time going through the intro sections of its documentation and its differences from CircleCI before jumping directly into writing a new CI pipeline. A few design details to note:
the main component of the pipeline is the Workflow, which is triggered by Github events and runs a number of jobs, each in its own separate environment.
workflows run asynchronously and cannot be chained; jobs can be chained by adding a needs keyword.
the Ubuntu environment comes with many useful dependencies preinstalled, which means a complex library like SLM Lab doesn't need to build its own custom Docker image.
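To make the workflow and job structure concrete, here is a minimal sketch of a workflow with two chained jobs. The workflow name, job names, and echo commands are placeholders for illustration and are not taken from the SLM Lab repository.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/example.yml
name: example

# Github events that trigger this workflow
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build step goes here"

  test:
    # `needs` makes this job wait until `build` has succeeded;
    # without it, the two jobs would run in parallel
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "test step goes here"
```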
Now let's get to building the CI pipeline by replicating the 4 steps in the CircleCI pipeline we're replacing.
Step 1 of building and using a custom Docker image with these OS-dependencies is no longer necessary since we can use the Github ubuntu image directly.
So, we can jump directly to step 2 and install the Conda dependencies. There's a community-built action to set up Miniconda that you can use immediately. Now my Github Actions CI file looks like this:
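As a rough sketch of such a workflow, assuming the conda-incubator/setup-miniconda action, an environment file named environment.yml at the repository root, and a Conda environment called lab (all three are assumptions for illustration, not details confirmed by this post):

```yaml
name: CI

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Conda dependencies
        # assumed action; pin whichever release you have verified
        uses: conda-incubator/setup-miniconda@v3
        with:
          # assumed environment name and file path
          activate-environment: lab
          environment-file: environment.yml
          auto-activate-base: false

      - name: Conda info
        # a login shell is needed so the activated environment is picked up
        shell: bash -el {0}
        run: |
          conda info
          conda list
```

The "Conda info" step is optional, but it is a cheap way to confirm in the logs that the expected environment was created and activated.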
Next, for steps 3 and 4 we need to run the tests and upload the test coverage result. Again, we use an action to upload to CodeClimate. A small detail is that the required token, CC_TEST_REPORTER_ID, can be stored in the repository secrets and made available to the workflow. This is a really nice first-class integration with Github. Altogether, we add the following block:
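A block along these lines could look like the sketch below. It assumes the paambaati/codeclimate-action community action, pytest with the pytest-cov plugin as the test command, and a coverage.xml report written to the workspace root; none of these is confirmed by this post, so treat them as placeholders for your own setup.

```yaml
      # These steps go under the job's existing `steps:` list.
      - name: Run tests
        shell: bash -el {0}
        # assumed test command; writes coverage.xml to the working directory
        run: pytest --cov --cov-report=xml

      - name: Upload coverage to CodeClimate
        # assumed action; pin whichever release you have verified
        uses: paambaati/codeclimate-action@v5.0.0
        env:
          # read from the repository secrets, never hard-coded
          CC_TEST_REPORTER_ID: ${{ secrets.CC_TEST_REPORTER_ID }}
        with:
          coverageLocations: ${{ github.workspace }}/coverage.xml:coverage.py
```

The secret itself is added once in the repository settings; the workflow only refers to it by name.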
Now all of the CI steps have been replicated, and the CI workflow runs fast, in about 6 minutes. This is already a huge improvement in speed and simplicity over CircleCI.
As a bonus, I added a flake8 annotation action, which runs flake8 on the commit and then nicely annotates the pull request with the messages.
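One way to wire this up is sketched below, assuming the rbialon/flake8-annotations community action, which registers a matcher that turns flake8 output into annotations. The action name and the flake8 options are assumptions rather than the exact configuration used here.

```yaml
      # These steps go under the job's existing `steps:` list.
      - name: Set up flake8 annotations
        # assumed action; it only registers the annotation matcher
        uses: rbialon/flake8-annotations@v1

      - name: Lint with flake8
        run: |
          pip install flake8
          # --exit-zero reports issues without failing the build
          flake8 . --count --exit-zero --statistics
```

Dropping --exit-zero would make lint errors fail the job, which is a stricter policy than annotating only.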
That's all for the Github CI pipeline! I saved the file to .github/workflows/ci.yml in my repo, added a status badge to README.md, and pushed the commit. See the CI build here.
Lastly, I removed my old CircleCI pipeline. So long, old friend. Github Actions is just first-class when used with Github, and that is certainly expected given Github is entering the CI/CD space.