Introduction to Software Engineering (CSSE 1001)
A classical problem of computer science is that of sorting a list of items.
This raises a couple of questions: is there more than one way to sort a list and, if so, which way is best?
Let us investigate five different ways to solve this problem with two goals in mind: to see how each algorithm works, and to develop a way of comparing algorithms.
This topic is part of the transition from software engineering into computer science.
An algorithm is a strategy for solving a problem that can be described independently of Python (or indeed any programming language). Strictly speaking, an algorithm is an ordered set of unambiguous instructions that eventually terminates. Algorithms are an important part of computer science because it is not always obvious how to solve a problem using a computer, and even when we know how, we may not know how to do it quickly.
We now turn our focus to algorithms that sort lists in, say, ascending order. Python has two built-in ways (requiring no library imports) to accomplish this:
>>> sorted([3, 5, 1, 2, 9])
[1, 2, 3, 5, 9]
and
>>> xs = [3, 5, 1, 2, 9]
>>> xs.sort()
>>> xs
[1, 2, 3, 5, 9]
Notice that sorted is a function that, given a list, returns a new sorted list, whereas sort is a method that sorts the list in-place. Generally speaking, it is better to sort the list in-place because this avoids building a second copy of the list, roughly halving the memory required.
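The following interaction (a minimal sketch using the same example list) illustrates this difference: sorted leaves the original list untouched, while sort mutates it and returns None.
>>> xs = [3, 5, 1, 2, 9]
>>> ys = sorted(xs)   # builds a brand new sorted list
>>> xs                # the original is unchanged
[3, 5, 1, 2, 9]
>>> ys
[1, 2, 3, 5, 9]
>>> xs.sort()         # rearranges xs itself; returns None
>>> xs
[1, 2, 3, 5, 9]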
For each sorting algorithm we will restrict ourselves to a single list action and then use that action to sort the list. The sorting algorithms are named after this action.
def select(xs: list[int]) -> int | None:
    """
    Return the _index_ of the largest element of <xs>
    (or None if <xs> is empty).
    """
    current_largest_element = -float('inf')
    current_largest_index = None
    for k, x in enumerate(xs):
        if x > current_largest_element:
            current_largest_element = x
            current_largest_index = k
    return current_largest_index
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> select(xs)
3
>>> select([1])
0
>>> select([0, 1, 2])
2
>>> select([2, 1, 0])
0
>>> select([1, 2])
1
As illustrated below, selection sort works by selecting the largest element of the list and moving it to the back of the list, then decrementing the index that marks the back of the unsorted part.

And the following function realizes this algorithm.
def selection(xs: list[int]) -> None:
    """
    Sort the list <xs> in ascending order _in-place_.
    """
    for k in range(len(xs), 0, -1):  # decrement from back of list
        # find largest element in first k elements
        pos_largest_element = select(xs[:k])
        # swap this largest element with the back of the list
        xs[k-1], xs[pos_largest_element] = \
            xs[pos_largest_element], xs[k-1]
    return
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> selection(xs)
>>> xs
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Now, despite the fact that we have faithfully sorted a list, our implementation has some issues. When we slice the list, as we do when invoking select(xs[:k]), Python creates a new list and passes that new list to select. This means sorting a list of length 100 requires storing \(100 + 99 + \cdots + 2 + 1 = 5050\)
elements (proportional to the square of the original list size). This is not fit for purpose because we may want to sort lists that are so big they nearly fill our memory. This is the primary advantage of doing in-place sorts: they are much more space efficient, albeit trickier to implement.
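As a quick sanity check of that count (a throwaway interpreter snippet, not part of the sorting code), the slices passed to select have lengths 100, 99, ..., 1, and summing them matches the closed form \(\tfrac{n(n+1)}{2}\):
>>> sum(range(1, 101))   # slice lengths 1 + 2 + ... + 100
5050
>>> 100 * 101 // 2       # the closed form n*(n+1)//2 with n = 100
5050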
We can modify our implementation of selection sort to avoid list-slicing by moving the code for select into selection.
def selection(xs: list[int]) -> None:
    """
    Sort the list <xs> in ascending order _in-place_.
    """
    for j in range(len(xs), 0, -1):  # decrement from back of list
        # START SELECT (position of largest element in xs[:j])
        current_largest_element = -float('inf')
        current_largest_index = None
        for k in range(j):
            if xs[k] > current_largest_element:
                current_largest_element = xs[k]
                current_largest_index = k
        # FINISH SELECT
        # swap
        xs[j-1], xs[current_largest_index] = \
            xs[current_largest_index], xs[j-1]
    return
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> selection(xs)
>>> xs
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Now our function sorts the list without having to duplicate any elements. The next algorithm, insertion sort, is built on an insert action that places a new element into an already-sorted list so that it stays sorted.
def insert(y: int, xs: list[int]) -> list[int]:
    """
    Insert <y> into <xs> so that <xs> remains sorted.
    Precondition:
        <xs> is sorted in ascending order.
    """
    for k, x in enumerate(xs):
        if y <= x:
            return xs[:k] + [y] + xs[k:]
    return xs + [y]
>>> insert(3, [1, 2, 4])
[1, 2, 3, 4]
>>> insert(3, [1, 2])
[1, 2, 3]
>>> insert(0, [1, 2])
[0, 1, 2]
>>> insert(0, [])
[0]
Using insert to sort a list is fairly straightforward. If we want to sort the list xs we simply insert every element of xs, one at a time, starting with an empty list (which is sorted by definition). The list we are inserting into always remains sorted and thereby always satisfies the precondition for insert.
def insertion(xs: list[int]) -> list[int]:
    """
    Return (a copy of) <xs> but in ascending order.
    """
    ys = []  # start with empty (sorted) list
    for x in xs:
        ys = insert(x, ys)  # repeatedly insert into
                            # (guaranteed sorted) list
    return ys
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> insertion(xs)
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Astute readers will have noticed that we have again used list slicing to implement this particular sorting algorithm. This can be rectified by inserting each element into the already-sorted front portion of the list itself rather than into a new list (as illustrated below).

We omit the in-place insertion algorithm because it is fairly nuanced and would distract from our ultimate goal. These sorting algorithms are covered in much greater detail in COMP3506 (Algorithms & Data Structures).
Bubble sort is among the easiest of the sorting algorithms to implement. We simply look at adjacent elements in the list and swap them when they are out of order. Looking at all adjacent pairs of a list is called a pass and after each pass the largest out-of-place element “bubbles” to the top.
The following illustrates one pass of the list where the element 93 bubbles to the top.

If we make a pass and execute no swaps, then the list must be sorted. The following is an implementation of this idea.
def bubble(xs: list[int]) -> None:
    """
    Sort the list <xs> in ascending order _in-place_.
    """
    has_swapped = True  # for detecting if swaps occurred
    while has_swapped:
        has_swapped = False
        for k in range(len(xs) - 1):
            if xs[k] > xs[k+1]:
                xs[k], xs[k+1] = xs[k+1], xs[k]  # swap
                has_swapped = True
    return None
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> bubble(xs)
>>> xs
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Because we know that after \(n\) bubble passes of a list, the last \(n\) elements are necessarily in order, we can improve our implementation by stopping each pass when we hit the sorted part of the list. This eliminates the need to test whether swaps were made, because eventually the sorted part of the list is the entire list.
def bubble(xs: list[int]) -> None:
    """
    Sort the list <xs> in ascending order _in-place_.
    """
    if len(xs) < 2:
        return
    for j in range(1, len(xs)):       # counts bubble passes
        for k in range(len(xs) - j):  # decrements back of list
            if xs[k] > xs[k+1]:
                xs[k], xs[k+1] = xs[k+1], xs[k]  # swap
    return
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> bubble(xs)
>>> xs
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Merge sort is built on a merge action that combines two already-sorted lists into one sorted list.
def merge(xs: list[int], ys: list[int]) -> list[int]:
    """
    Return sorted(xs + ys).
    Precondition:
        <xs> is sorted.
        <ys> is sorted.
    """
    zs = []
    while xs and ys:  # both lists are non-empty
        if xs[0] < ys[0]:
            zs.append(xs.pop(0))
        else:
            zs.append(ys.pop(0))
    # xs or ys may still have elements left
    zs.extend(xs)
    zs.extend(ys)
    return zs
>>> merge([1, 3, 5], [2, 4])
[1, 2, 3, 4, 5]
>>> merge([1], [2])
[1, 2]
>>> merge([1], [])
[1]
>>> merge([], [2])
[2]
>>> merge([], [])
[]
This is the first recursive sorting algorithm we consider. To merge sort a list, we cut the list in half, sort the halves (recursively), and then merge the results. Lists of length zero or one are sorted by definition and constitute the smallest sorting problem (the base case). This process is illustrated below.


def merge_sort(xs: list[int]) -> list[int]:
    """
    Return (a copy of) <xs> but in ascending order.
    """
    # base case
    if len(xs) < 2:
        return xs
    # cut list in two pieces
    n = len(xs)
    ys, zs = xs[:n//2], xs[n//2:]
    return merge(merge_sort(ys), merge_sort(zs))
>>> xs = [5, 2, 9, 10, 3, 2, 4, 8, 1, 3]
>>> merge_sort(xs)
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Merge-sort can (and should) also be made in-place (but we omit this).
Quick sort is another recursive sorting algorithm that, like merge sort, sorts the list by breaking it into smaller pieces which are recursively sorted. Two groups are formed by choosing a pivot and collecting the elements less than (or equal to) the pivot into a “left” group and the rest into a “right” group. Every element of the (sorted) left group is no larger than every element of the (sorted) right group, so we can simply concatenate them, placing the pivot itself in the middle. The resulting list is sorted.
Choosing a clever pivot can make quick-sort very fast, but for our purposes we simply choose the head of the list as our pivot.
def quick_sort(xs: list[int]) -> list[int]:
    """
    Return (a copy of) <xs> but in ascending order.
    """
    if len(xs) < 2:
        return xs
    pivot, tail = xs[0], xs[1:]
    left = [x for x in tail if x <= pivot]
    right = [x for x in tail if x > pivot]
    return quick_sort(left) + [pivot] + quick_sort(right)
>>> quick_sort([5, 2, 9, 10, 3, 2, 4, 8, 1, 3])
[1, 2, 2, 3, 3, 4, 5, 8, 9, 10]
Quick-sort can (and should) also be made in-place (but we omit this).
We have written five functions that all accomplish the same task faithfully.
Which one is best?
To see whether there is even any difference, we can run some experiments by timing how long each method takes to sort random lists of various sizes. To be fair, let us have each method sort 10 random lists and average the times.
import random
import timeit

def random_list(n: int) -> list[int]:
    """
    Return a random list of <n> integers between 0 and 1000 (inclusive).
    """
    return [random.randint(0, 1000) for _ in range(n)]

def experiment(n: int):
    """
    Print the average number of seconds each algorithm needs to
    sort a random list of <n> integers.
    """
    print(f"Sorting size |xs| = {n}")
    for algorithm in [selection, insertion, bubble, merge_sort, quick_sort]:
        time = 0
        for _ in range(10):
            xs = random_list(n)
            time += timeit.timeit(lambda: algorithm(xs), number=1)
        # we are using fancy print formatting here
        print(f"{algorithm.__name__:>11} \t {time/10:>5.3f}s")
>>> experiment(10**3)
Sorting size |xs| = 1000
selection 0.007s
insertion 0.005s
bubble 0.020s
merge_sort 0.001s
quick_sort 0.001s
>>> experiment(10**4)
Sorting size |xs| = 10000
selection 0.731s
insertion 0.704s
bubble 2.112s
merge_sort 0.014s
quick_sort 0.009s
>>> experiment(10**5)
Sorting size |xs| = 100000
selection 74.872s
insertion 74.866s
bubble 214.060s
merge_sort 0.732s
quick_sort 0.237s
So, clearly, some of the sorting algorithms are better than others.
We summarize the timing results of the previous section in the following table.
| Algorithm | \(n=10^3\) | \(n=10^4\) | \(n=10^5\) |
|---|---|---|---|
| Selection | 0.007s | 0.731s | 74.87s |
| Insertion | 0.005s | 0.704s | 74.87s |
| Bubble | 0.020s | 2.112s | 214.1s |
| Merge | 0.001s | 0.014s | 0.732s |
| Quick | 0.001s | 0.009s | 0.237s |
Notice that for selection, insertion, and bubble, a ten-fold increase in list size corresponds to roughly a hundred-fold increase in processing time, whereas for merge and quick sort the increase in processing time is only a little more than ten-fold.
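We can read this scaling straight off the selection column of the table: each ten-fold jump in \(n\) multiplies the running time by roughly one hundred,
\[
\frac{0.731\,\text{s}}{0.007\,\text{s}} \approx 104,
\qquad
\frac{74.87\,\text{s}}{0.731\,\text{s}} \approx 102.
\]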
It is this relationship between input size and run-time that we are interested in, more so than the raw timings, because it enables us to extrapolate the wait times of larger inputs from smaller inputs, giving us some expectation of how long a function will take to execute before we run it. In computer science this is called the algorithm’s complexity.
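For example, extrapolating from the table above (a rough back-of-the-envelope estimate, not a measurement), a list of \(n = 10^6\) random integers should take selection sort about
\[
100 \times 74.9\,\text{s} \approx 7500\,\text{s} \approx 2\ \text{hours},
\]
while quick sort, whose time grows by a little more than ten-fold per step, should need only around \(12 \times 0.24\,\text{s} \approx 3\,\text{s}\).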
We measure the complexity of an algorithm by bounding the number of relevant operations it performs rather than giving a precise accounting of operations (which is only possible for a given input). Our sorting algorithms are dominated by comparisons and swaps so this is what we will count. Let us start with selection.
def selection(xs: list[int]) -> None:
    # let n = len(xs)
    for j in range(len(xs), 0, -1):  # repeats n many times
        current_largest_element = -float('inf')
        current_largest_index = None
        for k in range(j):  # repeats (at most) n many times,
                            # for each of the n outer iterations,
                            # so (at most) n**2 many times in total
            if xs[k] > current_largest_element:  # (at most) n**2 many comparisons
                current_largest_element = xs[k]
                current_largest_index = k
        # n many swaps
        xs[j-1], xs[current_largest_index] = \
            xs[current_largest_index], xs[j-1]
    return
    # The number of swaps and comparisons is (at most) n**2 + n
So the number of swaps and comparisons that selection does is (at most) \(n^2 + n\). However, when \(n\) becomes large this measurement becomes totally dominated by the \(n^2\) term. Because of this, we say the complexity of selection sort is \(\mathcal{O}(n^2)\) rather than reporting the more precise count.
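In fact, we can count the comparisons exactly (a small aside using the same convention as above): the inner loop performs \(j\) comparisons for \(j = n, n-1, \ldots, 1\), so there are
\[
\sum_{j=1}^{n} j = \frac{n(n+1)}{2} \leq n^2
\]
comparisons in total, plus the \(n\) swaps.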
This aligns with our experiments. We have determined it takes about \(n^2\) many operations to selection sort a list of length \(n\). Therefore to sort a list of length \(10 n\) requires \((10n)^2 = 100 n^2\) operations — a one-hundred-times increase in cost.
A similar analysis can be done on insertion and bubble to determine that their complexities are also \(\mathcal{O}(n^2)\). Even though bubble runs far slower in practice, we consider it to be as good (or bad) as the other two in its complexity analysis.
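For instance, counting the comparisons of the improved bubble above: pass \(j\) compares \(n - j\) adjacent pairs, so across all passes there are
\[
\sum_{j=1}^{n-1}(n-j) = (n-1) + (n-2) + \cdots + 1 = \frac{n(n-1)}{2} = \mathcal{O}(n^2)
\]
comparisons (and at most that many swaps).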
Determining the complexity of merge and quick sort is trickier and requires the Master Theorem, a result characterizing the complexity of recursive functions. Using that theorem, the complexities of both of these algorithms can be determined to be \(\mathcal{O}(n \log n)\) (for quick sort this is the average case; an unlucky run of pivots can degrade it to \(\mathcal{O}(n^2)\)).
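To sketch where this comes from for merge sort (not a proof, and assuming for simplicity that \(n\) is a power of two): merge_sort makes two recursive calls on lists of half the size and then does \(\mathcal{O}(n)\) work in merge, giving the recurrence
\[
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + \mathcal{O}(n), \qquad T(1) = \mathcal{O}(1),
\]
which the Master Theorem resolves to \(T(n) = \mathcal{O}(n \log n)\).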
The following is the precise definition of complexity. It looks complicated, but it merely says that a function \(f(n)\) is in the complexity class \(\mathcal{O}(g(n))\) when, eventually, \(g(n)\) (up to scaling) is always at least as large as \(f(n)\). The word “eventually” is necessary because it takes some time for dominant terms to take over. Consider that \(n\) is actually larger than \(n^2\) on the interval \((0, 1)\).
Definition 1 (Big-Oh Complexity) We say that \(f(n)\) is big-oh of \(g(n)\) and write \(f(n) = \mathcal{O}(g(n))\) when \[ \exists\, c,\, k \in \mathbb{R}^{>0} : \forall n \geq k,\; 0 \leq f(n) \leq cg(n). \] That is, when there is some positive constant \(c\) such that \(f\) is eventually bounded above by \(cg\).
Example 1 The function \(2n^2 + n + 1 = \mathcal{O}(n^2)\) because for \(n > 2\), \[2n^2 + n + 1 < 3n^2.\]
Example 2 The function \(n^3 - \sin(n) = \mathcal{O}(n^3)\) because for \(n > 1\), \[n^3 - \sin(n) \leq n^3 + 1 < 2n^3.\]
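We can also sanity-check the bound from Example 1 numerically with a throwaway interpreter check (not part of any algorithm):
>>> all(2*n**2 + n + 1 < 3*n**2 for n in range(3, 10000))
True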
We will now just give some examples of functions of various complexities. At this introductory level, we are effectively just accounting for embedded loops. Namely, loops in loops will have a multiplicative effect on the complexity whereas loops after loops will have an additive effect.
Any function that is bounded by a fixed number of operations is part of the constant complexity class. This can sometimes be confusing because a function that has loops (even embedded ones) can fall into this class if the loops are not dependent on the input.
Here are some examples of functions in the constant complexity class.
def foo(x: int):
    x += 1  # 1 assignment / addition
    # -----------------------
    # 1 = O(1)

def bar(n: int):
    i = 0
    while i < 10**10:  # 10**10 comparisons
        i += 1         # 10**10 assignments / additions
    # ------------------------------
    # 2*10**10 = O(1)
Here is an example of a function in the linear complexity class.
def spam(n: int):
    # Number of ops
    index = 0
    while index < n:  # n comparisons
        index += 1    # n assignments / additions
    # --------------------------
    # 2n + 2 = O(n) operations
Generally speaking, any loop (while or for) whose length depends on the size of the input will contribute an \(n\) (linear term) to the complexity.
Here is an example of a function in the quadratic complexity class.
def eggs(n: int):
    i = 0
    while i < n:      # n comparisons
        j = 0         # n assignments
        while j < n:  # n * n comparisons
            j += 1    # n * n assignments / additions
        i += 1        # n assignments / additions
    # -----------------------------
    # 2n**2 + 3n = O(n**2)

def ham(n: int):
    for _ in range(n):         # n increments
        for _ in range(n//2):  # n * n//2 increments
            pass               # n * n//2 empty instructions
    # ---------------------------
    # n**2 + n = O(n**2)
Notice the multiplicative effect that embedding loops has. Generally speaking, embedded loops that both have size proportional to the input will contribute an \(n^2\) (quadratic term) to the complexity.
On the other hand, if the loops are sequenced then the effect is additive.
def green(n: int):
    for _ in range(n):    # n increments
        pass              # n empty instructions
    for _ in range(2*n):  # 2*n increments
        pass              # 2*n empty instructions
    # ----------------------
    # 2*n + 4*n = O(n)
Also be careful to ensure that the loop is actually proportional to the input, otherwise its contribution is constant. That is, just because a function has embedded loops does not necessarily imply the presence of a quadratic term.
def sam(n: int):
    for _ in range(n):        # n increments
        for _ in range(100):  # n*100 increments
            pass              # n*100 empty instructions
    # ------------------------
    # 201*n = O(n)
Every additional embedded loop that is proportional to the input increases the complexity by another factor of \(n\): from quadratic to cubic and beyond.
def iam(n: int):
    for _ in range(n):          # n increments
        for _ in range(n):      # n*n increments
            for _ in range(n):  # n*n*n increments
                pass            # n*n*n empty instructions
    # ------------------------
    # 2n**3 + n**2 + n = O(n**3)
The logarithmic complexity class is important because it characterizes functions that use divide-and-conquer methods (like merge and quick sort). A list of size \(n\) can be broken in half only \(\log_2(n)\) many times, so merge_sort has \(\log_2(n)\) levels of recursion, and the merging done at each level has total cost \(\mathcal{O}(n)\).
def merge(xs: list[int], ys: list[int]) -> list[int]:
    # let n = len(xs) + len(ys)
    zs = []
    while xs and ys:              # at most n many iterations
        if xs[0] < ys[0]:         # at most n many comparisons
            zs.append(xs.pop(0))  # at most n many appends
        else:
            zs.append(ys.pop(0))  # at most n many appends
    # xs or ys may still have elements left
    zs.extend(xs)                 # at most n many appends
    zs.extend(ys)                 # at most n many appends
    return zs
    # ------------------------
    # 6n = O(n)

def merge_sort(xs: list[int]) -> list[int]:
    # base case
    if len(xs) < 2:
        return xs
    # cut list in two pieces
    n = len(xs)
    ys, zs = xs[:n//2], xs[n//2:]
    return merge(merge_sort(ys), merge_sort(zs))
    # ^ at most log[2](n) levels of recursion, each doing O(n) work
    #   in merge, where n = len(xs)
Consequently, the complexity of merge_sort is \(\mathcal{O}(n \log n)\). Logarithmic complexities are among the best we can hope for, as logarithms increase exceedingly slowly. This means algorithms with logarithmic complexities are typically fast, as we saw in our experiments.
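To get a feel for how slowly logarithms grow, here is a throwaway interpreter check: a list of 100,000 elements can only be halved about 17 times, and even a list of a billion elements only about 30 times.
>>> import math
>>> math.ceil(math.log2(10**5))
17
>>> math.ceil(math.log2(10**9))
30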