Quality assurance

Unit test execution (part 2): Coverage


February 23, 2017 by Miguel Ángel Moreno

This post is the second part of the Unit test execution (part 1) blog post published in October 2016. This time we will explain how to run Python tests with coverage using a distributed architecture, so that things don’t take too long! We will describe, step by step, what we set up to achieve this. But first of all, let’s quickly define what we mean by test coverage.

What is test coverage?

Test coverage can be defined as a metric that measures the amount of testing performed by a set of tests (1). There is plenty of documentation about its benefits and drawbacks. In our case, we are interested in statement coverage (2).

We want to measure which code statements are covered by which tests. This metric will point us to zones in our source code where testing is low or missing, for example in legacy areas of the code base.

Test coverage in Python

In Python, we usually work with the coverage library. It can be integrated with the py.test testing library through pytest-cov, with nose, with Python web frameworks such as Django (via django-nose or pytest-django), or with any Python script using the coverage API. In any case, you will need to configure coverage settings using command-line parameters or a .coveragerc configuration file.

See an example file below:

[run]
omit = 
    */migrations/*
    */tests/*
    */management/*

[report]
ignore_errors = True
show_missing = True

[html]
directory = ./coverage/html

[xml]
output = ./coverage/xml/coverage.xml

We can set parameters to control how coverage runs and generates reports: for example, omitting irrelevant source code directories, ignoring errors, or setting output file paths and names.
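The coverage API mentioned above can also be driven directly from a Python script. Below is a minimal sketch, where the measured function is just a stand-in:

```python
import coverage

# Start measuring, run some code, then persist the results.
# Coverage() picks up a .coveragerc in the working directory if one exists.
cov = coverage.Coverage()
cov.start()

def add(a, b):
    return a + b

result = add(2, 3)

cov.stop()
cov.save()  # writes the .coverage data file to disk
```

From here, the same object can produce reports with cov.report(), cov.xml_report() or cov.html_report().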

After running tests with the coverage option, the process generates an output file named .coverage:

!coverage.py: This is a private format, don't read it directly!{"lines": {"/mypath/__init__.py": [], "/mypath/file1.py": [3, 5], "/mypath/file2.py": [], "/mypath/file3.py": [1, 2, 3, 4, 5, 6, 7]}}

Coverage results can be presented in multiple report formats. The simplest is a console output, generated by executing the command:

coverage report
Name                     Stmts   Miss  Cover   Missing
------------------------------------------------------
__init__                     1      1     0%   1
file1                        5      3    40%   1-2, 4
file2                        7      7     0%   1-7
file3                        8      0   100%
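The Cover column is simply the share of statements that were executed at least once. A quick sanity check of the numbers in the report above:

```python
def cover_percent(stmts, miss):
    """Percentage of statements executed at least once."""
    return round(100 * (stmts - miss) / stmts)

print(cover_percent(5, 3))   # file1: 40
print(cover_percent(7, 7))   # file2: 0
print(cover_percent(8, 0))   # file3: 100
```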

However, we can also generate reports in XML or HTML format.

    • By generating an XML report, the results can be consumed by third-party applications such as Jenkins or SonarQube. The command to generate this report is:
coverage xml
    • With an HTML report, instead, we can visually check coverage line by line while reviewing the source code:
coverage html

In our case, these are the basic uses of coverage when we run tests in a single process. As the number of tests grows, we need to look for more advanced ways of executing tests at scale.

Parallelising test coverage

One way of reducing test execution time is parallelisation. For example, using the pytest-xdist plugin for py.test, we can execute Python tests on multiple CPUs with the -n option. Below is an example of a test run with 6 CPUs:

(_env)user@machine:~/projectpath$ py.test -n 6

We can also run different groups of tests on multiple machines at the same time. For example, if our project has three modules – trades, rating and clients – we could execute the following commands:

(_env)user@machine1:~/projectpath$ pytest -n 6 trades
(_env)user@machine2:~/projectpath$ pytest -n 6 rating
(_env)user@machine3:~/projectpath$ pytest -n 6 clients

Either way, executions with more than one process may cause problems with the coverage library if we do not take precautions. The coverage library collects coverage data in memory while the process runs and saves it to a .coverage file on disk when the process completes. If we execute more than one process over the same period of time against the same source code, each process will overwrite the coverage data file generated by the previous one. To avoid this, we need to add another option to the coverage configuration:

[run]
parallel=True

With this option active, every process and execution will create its own coverage data file.

The resulting coverage files follow this naming pattern:

.coverage.<machine-name>.<pid>.<random-number>

In the previous example, we could have generated the following coverage data files:

  • Coverage file generated on machine 1:
.coverage.machine1.14109.808191
  • Coverage file generated on machine 2:
.coverage.machine2.10545.230403
  • Coverage file generated on machine 3:
.coverage.machine3.10753.002771

Thus, we can extract test coverage results from distinct processes without losing information. To get a unified view over all these results, we can combine all the coverage data files into a single .coverage file using the combine command:

coverage combine

In our example, we only execute one process per machine, thus generating one .coverage.* file per machine. If we execute more than one process per machine, we will have several coverage files per machine; by executing the combine command on each machine, we get one .coverage file per machine. Since we want a unified coverage file for the overall test execution, we can upload the coverage file from each machine to a repository (such as Amazon S3), download all the files to a single machine, and then combine them all together. The final file will contain the test coverage merged from all parallel executions.
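Conceptually, combining amounts to a per-file union of covered line numbers. The sketch below is only a mental model with made-up data, not a parser for the .coverage file, whose format is private:

```python
def combine(data_files):
    """Merge per-file sets of covered line numbers from several runs."""
    merged = {}
    for data in data_files:
        for path, lines in data.items():
            merged.setdefault(path, set()).update(lines)
    return {path: sorted(lines) for path, lines in merged.items()}

# Hypothetical coverage data from two parallel runs:
machine1 = {"/mypath/file1.py": [3, 5], "/mypath/file3.py": [1, 2, 3]}
machine2 = {"/mypath/file1.py": [1, 2], "/mypath/file3.py": [4, 5, 6, 7]}

print(combine([machine1, machine2]))
```

A line counts as covered in the combined result if any of the parallel runs executed it.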

Distributed workflow

At Ebury, we execute tests in Amazon Web Services using several Docker instances triggered by Jenkins, and generate unified coverage results associated with each branch and build.

To achieve this, we run several multi-jobs from Jenkins as follows:

  1. Build a Docker application image with the source code and the requirements to run the tests.
  2. Create N instances using the AWS EC2 plugin, where N is the number of executors. We set N to 6 to execute tests using 6 processes, one per instance.
  3. Every instance executes one process with coverage, generating files such as .coverage.<machine-name>.<pid>.<random-number>.
  4. Subsequently, every instance uploads its coverage results to Amazon S3 using s3cmd, under an S3 folder named after the source code commit id.
  5. When all instances are done, download all .coverage.* files from the S3 bucket to the project path.
  6. The combiner instance combines all .coverage.* files to generate a single .coverage file.
  7. The combiner instance uploads .coverage to the S3 bucket, in a folder named after the commit id.
  8. The combiner instance generates XML and HTML reports and publishes them, for example as a website or as results in SonarQube.
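Steps 3 and 6 can be reproduced locally with the coverage API. The sketch below simulates two parallel runs on one machine and then performs the combiner step; the measured code is just a stand-in:

```python
import coverage

# Simulate two "parallel" runs: data_suffix=True makes coverage write
# .coverage.<machine-name>.<pid>.<random-number> instead of plain .coverage.
for _ in range(2):
    cov = coverage.Coverage(data_suffix=True)
    cov.start()
    sum(range(10))  # stand-in for a real test run
    cov.stop()
    cov.save()

# The combiner step: merge all .coverage.* files found alongside the
# data file into a single, unified .coverage file.
combined = coverage.Coverage()
combined.combine()
combined.save()
```

In the distributed setup, the loop corresponds to separate machines and the combiner part runs on a single instance after downloading the files from S3.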

In this way, we generate one unified .coverage data file for every branch after running tests on multiple machines. We can then publish HTML reports and send XML results to third-party applications, all in an automated fashion.

Findings

We can extract the test coverage metric faster by combining parallelisation with Docker and the features provided by the Python coverage library. This workflow has costs in Amazon resources, orchestration with Jenkins, scripting on the machines and configuration in the project files; however, it is worth it. Coverage is not a silver bullet, but it orients the efforts of a development team towards improving test coverage on legacy code where TDD was not adopted. Coverage also allows us to establish thresholds for testing quality.

Generating coverage results is the last step in test execution. Running tests in parallel and adapting coverage accordingly may be the difference between having coverage results several times a day or several times an hour, and likewise the difference between using coverage intensively during the development workflow and having coverage as an anecdotal resource.

Thank you!
