B.1 Tutorial: Writing Unit Tests with Pytest
Testing is essential to ensuring that our software works as expected and does not contain critical bugs. However, manual testing is severely limited because bugs may be introduced by any change to the code.
Nowadays, automated testing is a standard practice in software engineering and HPC software development.
In this tutorial, we briefly describe important aspects of automated testing:
- unit tests
- test-driven development
- test coverage
- continuous integration
Following these practices is fundamental to producing good software for HPC.
We are going to use pytest for unit testing.
Pytest
Python has a built-in unit testing module (unittest) that you can use out of the box. Several third-party unit testing packages also exist.
In this tutorial, we will use pytest. We can install it with:
pip install pytest
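We can verify the installation by checking the installed version:
pytest --version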
Pytest Syntax
- File names should start or end with “test”, as in test_example.py or example_test.py.
- If tests are defined as methods on a class, the class name should start with “Test”, as in TestExample. The class should not have an __init__ method.
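For example, a test class following these conventions might look like this (a minimal sketch with placeholder assertions):
class TestExample:
    # pytest creates an instance per test, so no __init__ is needed.
    def test_upper(self):
        assert 'hpc'.upper() == 'HPC'

    def test_isupper(self):
        assert 'HPC'.isupper()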
Write a Basic Test with Pytest
In the file test_example.py, let's write a simple function sum that takes two arguments num1 and num2 and returns their sum. (Note that this shadows Python's built-in sum function, which is acceptable for this tutorial.)
def sum(num1, num2):
    """Return the sum of two numbers."""
    return num1 + num2
Now we will write tests for our sum function. To test an expected result, we use an assertion. Tests typically involve many assertions. With pytest, we can simply use the built-in assert keyword. Further convenient assertion functions are provided by NumPy (see http://docs.scipy.org/doc/numpy/reference/routines.testing.html). They are especially useful when working with arrays. For example, np.testing.assert_allclose(x, y) asserts that the x and y arrays are almost equal, up to a given precision that can be specified.
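As a minimal sketch of such an array assertion (assuming NumPy is installed):
import numpy as np

def test_arrays_almost_equal():
    x = np.array([0.1, 0.2, 0.3])
    y = np.array([0.1, 0.2, 0.30000001])
    # Passes: the arrays agree within the given relative tolerance.
    np.testing.assert_allclose(x, y, rtol=1e-6)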
Now let's write a test for our sum function:
import pytest

# Make sure to start the test function name with "test".
def test_sum():
    assert sum(1, 2) == 3
To run this test, run the command:
# It will find all tests inside the file test_example.py and run them
pytest test_example.py
or
python -m pytest test_example.py
To get more information about the test run, use the above command with the -v (verbose) option:
python -m pytest test_example.py -v
When we run this, pytest reports the collected test and shows that it passed.
Now add one more test in the file test_example.py, which checks the type of the output that sum returns, i.e., an integer.
def test_sum_output_type():
    assert type(sum(1, 2)) is int
Now, if we run the tests again, we will see the output of two tests, test_sum and test_sum_output_type.
Let's change the assertion of test_sum to make it fail:
def test_sum():
    assert sum(1, 2) == 4
This will report a failure, along with the reason: we expected sum(1, 2), which is 3, to equal 4.
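The failure report will look roughly like this (the exact formatting depends on your pytest version):
    def test_sum():
>       assert sum(1, 2) == 4
E       assert 3 == 4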
Parametrizing Tests
If you look at our test_sum function, it tests our sum function with only one set of inputs (1, 2), and the test is hard-coded with these values.
A better approach to cover more scenarios would be to pass test data as parameters to our test function and then assert the expected outcome.
Let’s make changes to our test_sum function to use parameters.
import pytest

@pytest.mark.parametrize('num1, num2, expected', [(3,5,8)])
def test_sum(num1, num2, expected):
    assert sum(num1, num2) == expected
The built-in pytest.mark.parametrize decorator enables parametrization of arguments for a test function. We have passed the following parameters to it:
- argnames: a comma-separated string denoting one or more argument names, or a list/tuple of argument strings. Here, we have passed num1, num2, and expected as the first input, the second input, and the expected sum, respectively.
- argvalues: the list of argvalues determines how often a test is invoked with different argument values. If only one argname was specified, argvalues is a list of values. If N argnames were specified, argvalues must be a list of N-tuples, where each tuple element specifies a value for its respective argname. Here, we have passed one tuple (3,5,8) inside a list, where 3 is num1, 5 is num2, and 8 is the expected sum.
Here, the parametrize decorator defines a single (num1, num2, expected) tuple, so the test_sum function will run one time.
We can add several (num1, num2, expected) tuples to the list passed as the second argument:
import pytest

@pytest.mark.parametrize('num1, num2, expected', [(3,5,8),
    (-2,-2,-4), (-1,5,4), (3,-5,-2), (0,5,5)])
def test_sum(num1, num2, expected):
    assert sum(num1, num2) == expected
This test_sum test will run 5 times, once for each of the above parameter tuples.
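With the -v option, pytest lists each parametrized case separately; the test IDs are generated from the parameter values, roughly like this:
test_example.py::test_sum[3-5-8] PASSED
test_example.py::test_sum[-2--2--4] PASSED
test_example.py::test_sum[-1-5-4] PASSED
test_example.py::test_sum[3--5--2] PASSED
test_example.py::test_sum[0-5-5] PASSED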
In the above code, we passed the values of the second argument (the actual test data) directly. We can also call a function to produce those values:
def get_sum_test_data():
    return [(3,5,8), (-2,-2,-4), (-1,5,4), (3,-5,-2), (0,5,5)]

@pytest.mark.parametrize('num1, num2, expected', get_sum_test_data())
def test_sum(num1, num2, expected):
    assert sum(num1, num2) == expected
Usage of Fixtures
Sometimes, your module's functions require preliminary work to run (for example, setting up the environment, creating data files, or setting up a web server). The unit testing framework can handle this via fixtures. The state of the system environment should be exactly the same before and after a testing module runs. If your tests affect the filesystem, they should do so in a temporary directory that is automatically deleted at the end of the tests. Testing frameworks such as pytest provide convenient facilities for this use case.
The purpose of test fixtures is to provide a fixed baseline upon which tests can reliably and repeatedly execute.
Fixtures can be used to share test data between tests and to execute setup and teardown code before and after test runs, respectively.
To understand fixtures, we will rewrite the above test_sum function to get its test data from a fixture:
import pytest

@pytest.fixture
def get_sum_test_data():
    return [(3,5,8), (-2,-2,-4), (-1,5,4), (3,-5,-2), (0,5,5)]

def test_sum(get_sum_test_data):
    for data in get_sum_test_data:
        num1 = data[0]
        num2 = data[1]
        expected = data[2]
        assert sum(num1, num2) == expected
Running the tests again shows that test_sum passes.
Scope controls how often a fixture gets called. The default is function. Here are the options for scope:
- function: Run once per test
- class: Run once per class of tests
- module: Run once per module
- session: Run once per session
Possible scopes, from lowest to highest, are: function < class < module < session.
A fixture can be marked as autouse=True, which will make every test in your suite use it by default.
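As a minimal sketch of scope in action (db_connection here is a hypothetical stand-in for an expensive resource):
import pytest

# With scope='module', the fixture body runs once for the whole file;
# both tests below receive the same object.
@pytest.fixture(scope='module')
def db_connection():
    connection = {'open': True}  # stand-in for a real connection
    yield connection
    connection['open'] = False   # teardown runs once, after the last test

def test_first_query(db_connection):
    assert db_connection['open']

def test_second_query(db_connection):
    assert db_connection['open']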
Now, we will use a fixture to write a setup_and_teardown function. Imagine that the setup part reads some data from a database before each test starts, and the teardown part writes the test run data back to the database after each test ends. For simplicity, our setup_and_teardown function will simply print a message in each phase.
Notice the yield in setup_and_teardown. Anything written after yield is executed after the tests finish.
import pytest

@pytest.fixture(scope='session')
def get_sum_test_data():
    return [(3,5,8), (-2,-2,-4), (-1,5,4), (3,-5,-2), (0,5,5)]

@pytest.fixture(autouse=True)
def setup_and_teardown():
    print('\nFetching data from db')
    yield
    print('\nSaving test run data in db')

def test_sum(get_sum_test_data):
    for data in get_sum_test_data:
        num1 = data[0]
        num2 = data[1]
        expected = data[2]
        assert sum(num1, num2) == expected
Also notice that we have not passed setup_and_teardown as a parameter to the test_sum function, because autouse=True is already set in the fixture definition; it is therefore invoked automatically before and after each test run. To see the print output, we need to run the tests with the -s option, which disables output capturing:
pytest test_example.py -v -s

How it works...
By definition, a unit test must focus on one specific functionality. All unit tests should be completely independent.
Writing a program as a collection of well-tested, mostly decoupled units forces you to write modular code that is more easily maintainable.
In a Python package, a test_xxx.py module should accompany every Python module named xxx.py. This testing module contains unit tests for the functionality implemented in the xxx.py module.
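For example, a hypothetical package might be laid out as follows (the module names are placeholders):
mypackage/
    arithmetic.py           # implementation
    test_arithmetic.py      # unit tests for arithmetic.py
    io_utils.py             # implementation
    test_io_utils.py        # unit tests for io_utils.py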
Writing a full testing suite takes time. It imposes strong (but good) constraints on your code's architecture. It is a real investment, but it is always profitable in the long run. Also, knowing that your project is backed by a full testing suite is a real load off your mind.
First, thinking about unit tests from the beginning forces you to think about a modular architecture. It is really difficult to write unit tests for a monolithic program full of interdependencies.
Second, unit tests make it easier for you to find and fix bugs. If a unit test fails after a change is introduced in the program, isolating and reproducing the bug becomes trivial.
Third, unit tests help you avoid regressions—that is, fixed bugs that silently reappear in a later version. When you discover a new bug, you should write a specific failing unit test for it. To fix it, make this test pass. Now, if the bug reappears later, this unit test will fail and you will immediately be able to address it.
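As a sketch, a regression test for a hypothetical, previously fixed bug in our sum function might look like this:
def test_sum_regression_large_inputs():
    # Guards a hypothetical, previously fixed bug with large inputs:
    # if the bug silently reappears, this test fails immediately.
    assert sum(2**31, 1) == 2**31 + 1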
When you write a complex program based on interdependent APIs, having a good test coverage for one module means that you can safely rely on it in other modules, without worrying about its behavior not conforming to its specification.
Unit tests are just one type of automated test. Other important types of tests include integration tests (making sure that different parts of the program work together) and functional tests (testing typical use cases).
Test Coverage
Using unit tests is good. However, measuring test coverage is even better: it quantifies how much of our code is covered by our testing suite. The coverage.py module (https://coverage.readthedocs.io/) does precisely this. It integrates well with pytest.
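One common way to measure coverage from pytest is the pytest-cov plugin, which wraps coverage.py (a sketch; the paths here are placeholders):
pip install pytest-cov
# Measure coverage of the code under test and list the uncovered lines
pytest --cov=. --cov-report=term-missing test_example.py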
The coveralls.io service brings test-coverage features to a continuous integration server (refer to the Unit testing and continuous integration section). It works seamlessly with GitHub.
Workflows with unit testing
Note the particular workflow we used in this example. After writing our function, we wrote a first and a second unit test, both of which passed. Then we deliberately changed an assertion so that a test failed, examined the failure report, and restored the correct assertion so the test passed again. We could continue writing more and more complex unit tests until we are confident that the function works as expected in most situations.
We could even write the tests before the function itself. This is test-driven development (TDD), which consists of writing unit tests before writing the actual code. This workflow forces us to think about what our code does and how it is used, instead of how it is implemented.
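A minimal sketch of the TDD rhythm, using a hypothetical multiply function:
# Step 1: write the test first. Running pytest now fails,
# because multiply does not exist yet.
def test_multiply():
    assert multiply(3, 4) == 12

# Step 2: write the simplest implementation that makes the test pass.
def multiply(num1, num2):
    return num1 * num2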
Unit testing and continuous integration
A good habit to get into is running the full testing suite of our project at every commit. In fact, it is even possible to do this completely transparently and automatically through continuous integration. We can set up a server that automatically runs our testing suite in the cloud at every commit. If a test fails, we get an automatic email telling us what the problem is so that we can fix it.
There are many continuous integration systems and services: Jenkins/Hudson, Travis CI (https://travis-ci.org), Codeship (http://codeship.com/), and others.
Some of them integrate well with GitHub. For example, to use Travis CI with a GitHub project, create an account on Travis CI, link your GitHub project to this account, and then add a .travis.yml file with various settings to your repository (see the additional details in the references below).
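A minimal .travis.yml for a pytest-based project might look roughly like this (an illustrative sketch, not an official template):
language: python
python:
  - "3.8"
install:
  - pip install pytest
script:
  - pytest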
In conclusion, unit testing, code coverage, and continuous integration are standard practices that should be used in all significant projects.