Unit testing in Python using the unittest module

The aim of this blog post is to capture some simple “recipes” on testing code in Python that I can return to in the future. I thought it would also be worth sharing some of my thinking around testing more widely. The code in this GitHub gist illustrates the testing features I mention below.

#!/usr/bin/env python
# encoding: utf-8
"""
Some exercising of Python test functionality based on:
https://docs.python.org/3/library/doctest.html
https://docs.python.org/3/library/unittest.html

Generating tests dynamically with unittest:
https://eli.thegreenplace.net/2014/04/02/dynamically-generating-python-test-cases

Suppressing log output to console:
https://stackoverflow.com/questions/2266646/how-to-disable-and-re-enable-console-logging-in-python

The tests in this file are run using:
./tests.py -v

Ian Hopkinson 2020-11-18
"""
import unittest
import logging


def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000
    >>> factorial(-1)
    Traceback (most recent call last):
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
    OverflowError: n too large
    """
    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n + 1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        result *= factor
        factor += 1
    return result


class TestFactorial(unittest.TestCase):
    test_cases = [(0, 1, "zero"),
                  (1, 1, "one"),
                  (2, 2, "two"),
                  (3, 5, "three"),  # deliberately wrong, should be 6
                  (4, 24, "four"),
                  (5, 120, "five")]

    def test_that_factorial_30(self):
        self.assertEqual(factorial(30), 265252859812191058636308480000000)

    def test_that_factorial_argument_is_positive(self):
        with self.assertRaises(ValueError):
            factorial(-1)

    def test_that_a_list_of_factorials_is_calculated_correctly(self):
        # nosetests does not run subTests correctly:
        # it does not report which case fails, and stops on failure
        for test_case in self.test_cases:
            with self.subTest(msg=test_case[2]):
                print("Running test {}".format(test_case[2]), flush=True)
                logging.info("Running test {}".format(test_case[2]))
                self.assertEqual(factorial(test_case[0]), test_case[1],
                                 "Failure on {}".format(test_case[2]))

    @unittest.skip("demonstrating skipping")
    def test_nothing(self):
        self.fail("shouldn't happen")


if __name__ == '__main__':
    # Doctests will run if they are invoked before unittest but not vice versa;
    # nosetests will only invoke the unittests by default
    import doctest
    doctest.testmod()

    # If you want your generated tests to be separate named tests then do this.
    # This is from https://eli.thegreenplace.net/2014/04/02/dynamically-generating-python-test-cases
    def make_test_function(description, a, b):
        def test(self):
            self.assertEqual(factorial(a), b, description)
        return test

    testsmap = {
        'test_one_factorial': [1, 1],
        'test_two_factorial': [2, 3],  # 2! is 2, so this generated test will fail
        'test_three_factorial': [3, 6]}

    for name, params in testsmap.items():
        test_func = make_test_function(name, params[0], params[1])
        setattr(TestFactorial, 'test_{0}'.format(name), test_func)

    # This suppresses logging output to console, like the --nologcapture flag in nosetests
    logging.getLogger().addHandler(logging.NullHandler())
    logging.getLogger().propagate = False

    # Finally we run the tests
    unittest.main(buffer=True)  # suppresses print output, like --nocapture in nosetests, or use -b

My journey with more formal code testing started about 10 years ago when I was programming in Matlab. It only really picked up a couple of years later when I moved to work at a software startup, coding in Python. I’ve read a couple of books on testing (BDD in Action by John Ferguson Smart and Test-Driven Development with Python by Harry J.W. Percival), as well as Working Effectively with Legacy Code by Michael C. Feathers, which talks quite a lot about testing. I wrote a blog post about testing in Python a number of years ago, when I had just embarked on the testing journey.

As it stands I now use unit testing fairly regularly, although the test coverage of my code is not great.

Python has two built-in mechanisms for carrying out tests: the doctest and the unittest modules. Doctests are written into the docstring at the top of a function definition (see the docstring of the factorial function in the gist above). They are run either by calling doctest.testmod() in a script, or by invoking the doctest module on the command line as shown below.

python -m doctest -v tests.py

Personally I’ve never used doctest: I don’t like the way the tests are scattered around the code rather than being in one place, and “replicating the REPL” seems a fragile process, but I include them here for completeness.

That leaves us with the unittest module. In Python it is not unusual to use a 3rd party testing library which runs on top of unittest; popular choices include nosetests and, more recently, pytest. These typically offer syntactic sugar that makes tests slightly easier to write and to read, along with additional functionality for writing and running test suites. unittest is modelled on the Java testing framework JUnit; as such it inherits an object-oriented approach that demands tests be methods of a class derived from unittest.TestCase. This is not particularly Pythonic, hence the popularity of 3rd party libraries.
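
As an illustration of that syntactic sugar, here is a minimal sketch of the same check written both ways; the example function and names are invented for illustration. The pytest version is just a plain function with a bare assert:

import unittest

# unittest style: the test must be a method of a TestCase subclass
class TestSquare(unittest.TestCase):
    def test_square_of_three(self):
        self.assertEqual(3 ** 2, 9)

# pytest style: a plain function and a bare assert statement are enough
def test_square_of_three_pytest():
    assert 3 ** 2 == 9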

I’ve used nosetests for a while now, but its use is no longer recommended since it is no longer being developed; pytest is the favoured 3rd party library these days. Personally, as a result of writing this blog post I will probably stop using nosetests as a test runner and revert to writing and running tests with pure unittest.

The core of unittest is calling the function under test with a set of parameters and checking that it returns the correct response. This is done using one of the assert* methods of the unittest.TestCase class. I nearly always end up using assertEqual. This is shown in minimal form in the first two test methods of the TestFactorial class in the gist above.
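
For reference, a few of the other commonly used assert* methods of unittest.TestCase, shown as a minimal sketch:

import unittest

class TestAssertions(unittest.TestCase):
    def test_various_assertions(self):
        self.assertEqual(2 + 2, 4)                         # exact equality
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)   # floating point comparison
        self.assertTrue(isinstance("abc", str))            # truthiness
        self.assertIn(3, [1, 2, 3])                        # membership
        with self.assertRaises(ZeroDivisionError):         # expected exceptions
            1 / 0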

With data science work we often have a list of quite similar tests to run, calling the same function with a list of arguments and checking the response against the expected value. Writing a separate function for each test case is a bit laborious, so unittest has a couple of features to help with this:

  • subTest puts all the test cases into a single test function and executes them all, reporting only those that fail (see the test_that_a_list_of_factorials_is_calculated_correctly method in the gist above, and the standalone sketch after this list). This is compact, although the verbose output lists only the enclosing test rather than each individual case. Note that nosetests does not run subTest correctly; it is a feature of unittest only introduced in Python 3.4 (2014);
  • alternatively, we can use a functional programming trick to programmatically generate test functions and add them to the unittest.TestCase class we have derived; this is shown in the make_test_function section at the bottom of the gist.
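
To make the subTest pattern clearer in isolation, here is a minimal sketch; the data and test names are invented for illustration rather than taken from the gist:

import unittest

class TestSquares(unittest.TestCase):
    # each (input, expected) pair becomes a subtest; failures are reported
    # individually and do not stop the remaining cases from running
    cases = [(1, 1), (2, 4), (3, 9)]

    def test_squares(self):
        for n, expected in self.cases:
            with self.subTest(n=n):
                self.assertEqual(n ** 2, expected)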

Sometimes you write tests that you don’t always want to run, either because they are slow, or because you used them in addressing a particular problem and now want to keep them for documentation purposes rather than to run. Decorators in unittest are used to skip such tests; @unittest.skip() is the simplest of these, an unconditional “opt-out”.
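
The related decorators look like this as a sketch; the conditions and test bodies are invented for illustration, but the decorators themselves are standard unittest:

import sys
import unittest

class TestSkipping(unittest.TestCase):
    @unittest.skip("demonstrating unconditional skipping")
    def test_always_skipped(self):
        self.fail("should never run")

    @unittest.skipIf(sys.version_info < (3, 4), "subTest needs Python 3.4+")
    def test_skipped_on_old_pythons(self):
        self.assertTrue(hasattr(unittest.TestCase, "subTest"))

    @unittest.expectedFailure
    def test_known_bug(self):
        self.assertEqual(1, 2)  # recorded as an expected failure, not an error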

Once you’ve written your tests you need to run them. I liked using nosetests for this: if you ran it in a directory it would trundle off, find any files that looked like they contained tests, run them, and report back on the results.

unittest has some test discovery functionality of its own which I haven’t yet explored (there is a sketch of it further down); the simplest way to run the tests in a single file is to execute the script directly:

python tests.py -v

The -v flag indicates that output should be “verbose”: the name of each test is shown with a pass/fail status, and debug output is shown if a test fails. By default unittest shows print messages from the test functions and from the code under test on the console, as well as logging messages, which can confuse the test output. Print output can be suppressed by running tests with the -b flag at the command line or by setting the buffer argument to True in the call to unittest.main(). Logging messages can be suppressed by adding a NullHandler, as shown at the bottom of the gist above.
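
For reference, the built-in test discovery I mention above is invoked with the discover subcommand; by default it looks for files matching test*.py starting from the current directory. The start directory and pattern in the second line are illustrative:

python -m unittest discover -v
python -m unittest discover -s tests -p "*_tests.py" -v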

The only nosetests functionality I’ve used that I can’t replicate using pure unittest is re-running only those tests that failed. This limitation could be worked around using the -k command line flag together with a naming convention to track the tests that are still failing.
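
For reference, -k (available from Python 3.7) does substring or wildcard matching on test names; the pattern here is illustrative:

python tests.py -v -k factorial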

Not covered in this blog post are the setUp and tearDown methods, which are run before and after each test method respectively.
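
For completeness, a minimal sketch of how they fit together; the temporary-directory fixture is invented for illustration:

import os
import shutil
import tempfile
import unittest

class TestWithFixture(unittest.TestCase):
    def setUp(self):
        # runs before each test method
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # runs after each test method, even if the test fails
        shutil.rmtree(self.workdir)

    def test_workdir_exists(self):
        self.assertTrue(os.path.isdir(self.workdir))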

I hope you found this blog post useful; I found writing it helpful in clarifying my thoughts, and I now have a single point of reference for the future.