Quality assurance

Unit test execution (part 2): Coverage


February 23, 2017 by Miguel Ángel Moreno

This post is the second part of the Unit test execution (part 1) blog post published in October 2016. This time we will explain how to run Python tests with coverage using a distributed architecture, so that things don’t take too long! We will describe, step by step, what we set up to achieve this. But first of all, let’s quickly define what we mean by test coverage.

What is test coverage?

Test coverage can be defined as a metric that measures the amount of testing performed by a set of tests (1). There is plenty of documentation about its benefits and drawbacks. In our case, we are interested in statement coverage (2).

We want to measure which code statements are covered by which tests. This metric will point us to zones in our source code where testing is low or missing, for example in legacy areas of the code base.

Test coverage in Python

In Python, we usually work with the coverage library. It can be integrated with the py.test testing library through pytest-cov, with nose, with Python web frameworks such as Django (via django-nose or pytest-django), or with any Python script using the coverage API. In any case, you will need to configure coverage settings using command-line parameters or a .coveragerc configuration file.

See an example file below:

[run]
omit = 
    */migrations/*
    */tests/*
    */management/*

[report]
ignore_errors = True
show_missing = True

[html]
directory = ./coverage/html

[xml]
output = ./coverage/xml/coverage.xml

We can set parameters to control how coverage runs and generates reports: for example, omitting irrelevant source code directories, ignoring errors, or setting output file paths and names.
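The coverage API mentioned above can also be driven directly from a Python script. Below is a minimal sketch, where the measured function is just a stand-in:

```python
import coverage

# Start measuring, run some code, then persist the results.
# Coverage() picks up a .coveragerc in the working directory if one exists.
cov = coverage.Coverage()
cov.start()

def add(a, b):
    return a + b

result = add(2, 3)

cov.stop()
cov.save()  # writes the .coverage data file to disk
```

From here, the same object can produce reports with cov.report(), cov.xml_report() or cov.html_report().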

After running tests with the coverage option, the process generates an output file named .coverage:

!coverage.py: This is a private format, don't read it directly!{"lines": {"/mypath/__init__.py": [], "/mypath/file1.py": [3, 5], "/mypath/file2.py": [], "/mypath/file3.py": [1, 2, 3, 4, 5, 6, 7]}}

Coverage results can be presented in multiple report formats. The simplest is a console output, generated by executing the command:

coverage report
Name                     Stmts   Miss  Cover   Missing
------------------------------------------------------
__init__                     1      1     0%   1
file1                        5      3    40%   1-2, 4
file2                        7      7     0%   1-7
file3                        8      0   100%
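The Cover column is simply the share of statements that were executed at least once. A quick sanity check of the numbers in the report above:

```python
def cover_percent(stmts, miss):
    """Percentage of statements executed at least once."""
    return round(100 * (stmts - miss) / stmts)

print(cover_percent(5, 3))   # file1: 40
print(cover_percent(7, 7))   # file2: 0
print(cover_percent(8, 0))   # file3: 100
```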

However, we can also generate reports in XML or HTML format.

    • By generating an XML report, the results can be consumed by third-party applications such as Jenkins or SonarQube. The command to generate this report is:
coverage xml
    • With an HTML report, instead, we can visually check coverage line by line while reviewing the source code:
coverage html

In our case, these are the basic uses of coverage when we run tests in a single process. As the number of tests grows, we need to look for more advanced ways of executing tests at scale.

Parallelising test coverage

One way of reducing test execution time is parallelisation. For example, using the pytest-xdist plugin for py.test, we can execute Python tests on multiple CPUs with the -n option. Below is an example of a test run with 6 CPUs:

(_env)user@machine:~/projectpath$ py.test -n 6

We can also run different groups of tests on multiple machines at the same time. For example, if our project has three modules – trades, rating and clients – we could execute the following commands:

(_env)user@machine1:~/projectpath$ pytest -n 6 trades
(_env)user@machine2:~/projectpath$ pytest -n 6 rating
(_env)user@machine3:~/projectpath$ pytest -n 6 clients

Either way, executions with more than one process may cause problems with the coverage library if we do not take precautions. The coverage library collects coverage data in memory while the process runs and saves it to a .coverage file on disk when the process completes. If we execute more than one process over the same period of time against the same source code, each process will overwrite the coverage data file generated by the previous one. To avoid this, we need to add another option to the coverage configuration:

[run]
parallel=True

With this option active, every process and execution will create its own coverage data file.

The resulting coverage files follow this naming pattern:

.coverage.<machine-name>.<pid>.<random-number>

In the previous example, we could have generated the following coverage data files:

  • Coverage file generated on machine 1:
.coverage.machine1.14109.808191
  • Coverage file generated on machine 2:
.coverage.machine2.10545.230403
  • Coverage file generated on machine 3:
.coverage.machine3.10753.002771

Thus, we can extract test coverage results from distinct processes without losing information. To get a unified view over all these results, we can combine all the coverage data files into a single .coverage file using the combine command:

coverage combine

In our example, we only execute one process per machine, thus generating one .coverage.* file per machine. If we execute more than one process per machine, we will have several coverage files per machine; by executing the combine command on each machine, we get one .coverage file per machine. Since we want a unified coverage file for the overall test execution, we can upload the coverage file from each machine to a repository (such as Amazon S3), download all the files to a single machine, and then combine them all together. The final file will contain the test coverage merged from all parallel executions.
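Conceptually, combining amounts to a per-file union of covered line numbers. The sketch below is only a mental model with made-up data, not a parser for the .coverage file, whose format is private:

```python
def combine(data_files):
    """Merge per-file sets of covered line numbers from several runs."""
    merged = {}
    for data in data_files:
        for path, lines in data.items():
            merged.setdefault(path, set()).update(lines)
    return {path: sorted(lines) for path, lines in merged.items()}

# Hypothetical coverage data from two parallel runs:
machine1 = {"/mypath/file1.py": [3, 5], "/mypath/file3.py": [1, 2, 3]}
machine2 = {"/mypath/file1.py": [1, 2], "/mypath/file3.py": [4, 5, 6, 7]}

print(combine([machine1, machine2]))
```

A line counts as covered in the combined result if any of the parallel runs executed it.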

Distributed workflow

At Ebury, we execute tests in Amazon Web Services using several Docker instances triggered by Jenkins, and generate unified coverage results associated with each branch and build.

To achieve this, we run several multi-jobs from Jenkins as follows:

  1. Build a Docker application image with the source code and the requirements to run the tests.
  2. Create N instances using the AWS EC2 plugin, where N is the number of executors. We set N to 6 to execute tests using 6 processes, one per instance.
  3. Every instance executes one process with coverage, generating files such as .coverage.<machine-name>.<pid>.<random-number>.
  4. Subsequently, every instance uploads its coverage results to Amazon S3 using s3cmd, under an S3 folder named after the source code commit id.
  5. When all instances are done, download all .coverage.* files from the S3 bucket to the project path.
  6. The combiner instance combines all .coverage.* files to generate a single .coverage file.
  7. The combiner instance uploads .coverage to the S3 bucket, in a folder named after the commit id.
  8. The combiner instance generates XML and HTML reports and publishes them, for example as a website or as results in SonarQube.
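Steps 3 and 6 can be reproduced locally with the coverage API. The sketch below simulates two parallel runs on one machine and then performs the combiner step; the measured code is just a stand-in:

```python
import coverage

# Simulate two "parallel" runs: data_suffix=True makes coverage write
# .coverage.<machine-name>.<pid>.<random-number> instead of plain .coverage.
for _ in range(2):
    cov = coverage.Coverage(data_suffix=True)
    cov.start()
    sum(range(10))  # stand-in for a real test run
    cov.stop()
    cov.save()

# The combiner step: merge all .coverage.* files found alongside the
# data file into a single, unified .coverage file.
combined = coverage.Coverage()
combined.combine()
combined.save()
```

In the distributed setup, the loop corresponds to separate machines and the combiner part runs on a single instance after downloading the files from S3.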

In this way, we generate one unified .coverage data file for every branch after running tests on multiple machines. We can then publish HTML reports and send XML results to third-party applications, all in an automated fashion.

Findings

We can extract the test coverage metric faster by combining parallelisation with Docker and the features provided by the Python coverage library. This workflow has costs in Amazon resources, orchestration with Jenkins, scripting on the machines and configuration in the project files; however, it is worth it. Coverage is not a silver bullet, but it orients the efforts of a development team towards improving test coverage on legacy code where TDD was not adopted. Coverage also allows us to establish thresholds for testing quality.

Generating coverage results is the last step in test execution. Running tests in parallel and adapting coverage accordingly may be the difference between having coverage results several times a day or several times an hour, and likewise the difference between using coverage intensively during the development workflow and having coverage as an anecdotal resource.

Thank you!
