Testing and Debugging

Introduction to Software Engineering (CSSE 1001)

Author

Paul Vrbik

Published

March 1, 2026

Docstring Testing

We have been diligently including Docstring tests in our code like the following.

def factorial(k: int) -> int:
    """Returns k! where k! = k*(k-1)! and 0! = 1.
    Assumes k >= 0.
    >>> factorial(3)
    6
    >>> factorial(0)
    1
    """

Today we will learn to use doctest.testmod() to run our tests.

def factorial(k: int) -> int:
    """
    >>> factorial(3)
    6
    >>> factorial(0)
    1
    """
    ans = 1
    for ell in range(k):
        ans *= ell
    return ans

Note the error! The loop multiplies by ell, which starts at 0, so the product is 0 whenever k > 0. Doctests can help you catch your own coding mistakes.

import doctest
doctest.testmod(verbose=True)  # verbose=True gives more information.
Trying:
    factorial(3)
Expecting:
    6
**********************************************************************
File "__main__", line 3, in __main__.factorial
Failed example:
    factorial(3)
Expected:
    6
Got:
    0
Trying:
    factorial(0)
Expecting:
    1
ok
2 items had no tests:
    __main__
    __main__.ojs_define
**********************************************************************
1 items had failures:
   1 of   2 in __main__.factorial
2 tests in 3 items.
1 passed and 1 failed.
***Test Failed*** 1 failures.
TestResults(failed=1, attempted=2)

Factorial: Corrected Version

def factorial(k: int) -> int:
    """
    >>> factorial(3)
    6
    >>> factorial(0)
    1
    """
    ans = 1
    for ell in range(k):
        ans *= k-ell
    return ans
doctest.testmod()
TestResults(failed=0, attempted=2)
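In a standalone script, a common way to wire this up is to call doctest.testmod() under a main guard, so the examples run when the file is executed directly but not when it is imported. The following is a minimal sketch of that layout, not the only way to do it; the filename factorial.py mentioned afterwards is hypothetical.

```python
import doctest


def factorial(k: int) -> int:
    """Return k! for k >= 0.

    >>> factorial(3)
    6
    >>> factorial(0)
    1
    """
    ans = 1
    for ell in range(k):
        ans *= k - ell
    return ans


if __name__ == "__main__":
    # Prints nothing when every example passes; failures are reported in full.
    doctest.testmod()
```

If this were saved as factorial.py, running python factorial.py would stay silent on success. The same examples can also be run from the command line with python -m doctest -v factorial.py.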

Writing Good Doctests

A comprehensive set of doctests would

  1. Test typical cases and edge cases.
  2. Test the zero of the data-type. E.g. 0, [], "".
  3. Test the singleton of the data-type. E.g. 1, [1], "a".
  4. Test for correctness, not for violations of the contract (inputs that break the stated preconditions).
  5. Contain no redundant tests.
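As an illustration of these guidelines, here is a small sketch. The function total is hypothetical and exists only to show one test per category: the zero of the type, a singleton, a typical case, and an edge case.

```python
def total(xs: list[int]) -> int:
    """Return the sum of the integers in xs.

    >>> total([])          # Zero: the empty list
    0
    >>> total([7])         # Singleton
    7
    >>> total([1, 2, 3])   # Typical
    6
    >>> total([-4, 4])     # Edge: values that cancel
    0
    """
    ans = 0
    for x in xs:
        ans += x
    return ans
```

Each example exercises a different situation, so none of them is redundant. Adding total([4, 5, 6]) would repeat the typical case without testing anything new.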

White Space

Be careful with spacing! The first test below will fail because of the space inside the square brackets at Line 4.

def identity(x):
   """
   >>> identity([])
   [ ]
   >>> identity([1,2,3])
   [1, 2, 3]
   """

This one will also fail for two reasons.

def identity(x):
   """
   >>> identity([])
   [] 
   >>> identity([1,2,3])
   [1,2,3]
   """

The trailing space at Line 4 and the lack of spaces after commas in Line 6 are the cause of the errors.

Warning

We are doing string testing (comparing printed output) and not unit testing (comparing values). The output must match what Python outputs exactly.

def identity(x):
   """
   >>> identity([])
   []
   >>> identity([1,2,3])
   [1, 2, 3]
   """
   return x
doctest.testmod()
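TestResults(failed=0, attempted=2)

Although 1.0 == 1 is True in Python, the printed forms 1.0 and 1 differ, so the following test fails.

def identity(x):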
   """
   >>> identity(1.0)
   1
   """
   return x
doctest.testmod()
**********************************************************************
File "__main__", line 3, in __main__.identity
Failed example:
    identity(1.0)
Expected:
    1
Got:
    1.0
**********************************************************************
1 items had failures:
   1 of   1 in __main__.identity
***Test Failed*** 1 failures.
TestResults(failed=1, attempted=1)
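When the value matters and its printed form does not, one option is to put the comparison itself in the example, so that the expected output is simply True. The following is a sketch of that approach using the same identity function.

```python
def identity(x):
    """Return x unchanged.

    Comparing values sidesteps the printed form: 1.0 == 1 is True
    even though 1.0 prints as 1.0.

    >>> identity(1.0) == 1
    True
    >>> identity([1,2,3]) == [1, 2, 3]
    True
    """
    return x
```

The trade-off is that a failing example of this kind only reports False rather than showing the value that was actually returned.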

Black-Box Approach

Suppose we are given the following function. How could we gain confidence in its correctness through black-box testing? That is, we can evaluate the function as much as we like, but its code is hidden.

def pow(x: int, y: int) -> int:
    """Returns x**y. Precondition: y >= 0.
    """
    # You do not have to understand this.
    return x*pow(x, y-1) if y else 1

Black-box doctests for pow might look as follows.

def pow(x: int, y: int) -> int:
    """
    >>> pow(0, 0)  # Zero
    1
    >>> pow(1, 0)  # Unit and zero
    1
    >>> pow(0, 1)  # Zero and unit
    0
    >>> pow(3, 1)  # Typical and unit
    3
    >>> pow(1, 3)  # Unit and typical
    1
    >>> pow(6, 10)  # Typical
    60466176
    """

Write doctests for the following function then implement it.

def ourmax(x:int, y:int) -> int:
    """Return the larger of x and y.
    """

Sets

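Take care with sets. A set is unordered, so its printed order is not guaranteed. Sets of small non-negative integers usually print in sorted order, but this is an implementation detail, and sets of strings can print in a different order from one run to the next.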

>>> {3, 2, 1}
{1, 2, 3}
>>> {2, 1, 3}
{1, 2, 3}
>>> {"a", "b", "c", "d", "e", "f"}
{'a', 'b', 'c', 'd', 'e', 'f'}
>>> {"b", "c", "a", "f", "e", "d"}
{'b', 'd', 'e', 'c', 'f', 'a'}

Testing Unordered Types

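To test a function that returns an unordered type, compare the result against the expected value with == rather than relying on the printed order.

def identity(x):
    """
    >>> {3,1,2} == identity({1,2,3})
    True
    >>> {1: "A", 2: "B"} == identity({1: "A", 2: "B"})
    True
    """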

Multiline Docstring

A doctest may span several lines: names assigned on earlier >>> lines remain available to later lines in the same docstring.

def identity(x: int) -> int:
    """
    >>> a = 2
    >>> b = 1
    >>> identity(a + b)
    3
    """

Write doctests for the following function. Then implement the function.

def poly_min(a: int, b: int, c: int) -> float:
    """Return the (approximate) minimum value of
    f(x) = a*x**2 + b*x + c
    over all floats x. Precondition: a > 0.
    """

def poly_min(a: int, b: int, c: int) -> float:
    """
    >>> tolerance = 10**-3
    >>> abs(poly_min(1, 0, 0) - 0) < tolerance
    True
    >>> abs(poly_min(3, -5, 10) - 7.916666666666666) < tolerance
    True
    """

Float testing is complicated by the fact that floating-point arithmetic is inexact. This is why we usually insist only that answers be close enough rather than exactly equal.
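To make the inexactness concrete, the sketch below contrasts an exact comparison with two tolerance-based ones. The function mean is hypothetical; math.isclose is the standard-library helper for approximate comparison and is offered here as one alternative to an explicit tolerance.

```python
import math


def mean(xs: list[float]) -> float:
    """Return the arithmetic mean of xs. Precondition: xs is non-empty.

    >>> 0.1 + 0.2 == 0.3                      # Exact comparison fails
    False
    >>> math.isclose(mean([0.1, 0.2, 0.3]), 0.2)
    True
    >>> abs(mean([1.0, 2.0]) - 1.5) < 10**-3  # Explicit tolerance
    True
    """
    return sum(xs) / len(xs)
```

Because doctest.testmod() runs examples with the module's global names, the import of math at the top of the file is visible inside the docstring.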

Our docstring examples are not intended to fully test a module. Rather, they show users how to call the function. A full suite of tests is better written outside our functions, in separate files such as the following.

testing.txt
sandbox.py should be in the same directory as this file and contain fact. This 
entire file will be treated as a docstring. For instance, this paragraph is 
considered a comment despite not having quotes around it.

>>> from sandbox import fact
>>> fact(3)
6

>>> fact(0)
1

Then, in the REPL, we run the following.

>>> import doctest
>>> doctest.testfile("testing.txt", verbose=True)
Trying:
.
.
.
.
1 items passed all tests:
   3 tests in testing.txt
3 tests in 1 items.
3 passed and 0 failed.
Test passed.
TestResults(failed=0, attempted=3)
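The same call also works from a script. The sketch below is self-contained under two assumptions that differ from the example above: the test file is written to a temporary folder, and it uses only built-ins so that sandbox.py is not needed. The filename builtin_tests.txt is hypothetical.

```python
import doctest
import os
import tempfile

# Contents of a hypothetical test file. Only built-ins are used so that
# the sketch runs without sandbox.py.
TEXT = """\
Plain prose like this line is ignored by doctest.

>>> xs = [3, 1, 2]
>>> sorted(xs)
[1, 2, 3]

>>> len(xs)
3
"""

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "builtin_tests.txt")
    with open(path, "w") as handle:
        handle.write(TEXT)
    # module_relative=False lets us pass an ordinary file-system path.
    results = doctest.testfile(path, module_relative=False)

print(results)
```

The returned value has the fields failed and attempted, so a script can check results.failed == 0 instead of reading the printed report.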

Exercises

Exercise 1 Write doctests for the following function then implement it.

def indices(cs: str, subcs: str) -> list[int]:
    """Return the indices in cs at which non-overlapping
    copies of subcs start. subcs is non-empty.

    >>> indices('A Coool pool look', 'oo')
    [3, 9, 14]
    """

Exercise 2 Write doctests for the following function then implement it.

def insert_after(xs: list[int], a: int, b: int) -> list[int]:
    """
    Insert <a> after each occurrence of <b> in list <xs>.
    """

Exercise 3 Write doctests for the following function. An implementation is provided.

def increment_count(hash: dict[str, int], key: str) -> None:
    """Increment the value associated with key in hash in place.
    If key is not a key in hash, add key with value 1.
    """
    if key in hash:
        hash[key] += 1
    else:
        hash[key] = 1
    return None

Exercise 4 Write doctests for the following function then implement it.

def average_grade(grades: list[list[object]]) -> float:
    """Return the average grade for all the students in grades 
    where the inner lists contain a student ID and a grade.
    >>> grades = [['998765', 70], ['111234', 90],
    ...           ['444567', 83]]
    >>> average_grade(grades)
    81.0
    """

Exercise 5 Write doctests for the following function then implement it.

def choose_chars(xs: str, ys: str, mask: str) -> str:
    """Return a string where index i is xs[i] if mask[i] 
    is 0 and ys[i] if mask[i] is 1.
    
    Precondition: 
        1/ xs, ys, and mask are all of the same length.
        2/ mask consists only of characters 0 and 1.
    """

Summary

We can verify our docstring examples using doctest. Tests should have sufficient coverage and not be redundant. Tests cannot guarantee that a function works in general; rather, they give confidence that it works and help catch coding errors.

Thus concludes the module on imperative programming.