Nonprimitive Data

Introduction to Software Engineering (CSSE 1001)

Author

Paul Vrbik

Published

April 24, 2025

Lists

A list is a mutable ordered collection of elements. These elements are not necessarily the same type and are not necessarily distinct.

Square brackets [] are used to create lists in Python.

xs = [1, "apple"]
type(xs)
list

And are indexable like strings.

xs[0]
1

Again (and importantly) that lists are mutable. We can modify a list once it is created, something we could not do with strings and tuples.

xs[0]
1
xs[0] = 2
xs[0]
2
xs[0] = 2*xs[1]
xs[0]
'appleapple'

Comparison

[1,2,3] < [4,5,6] 
True

Point-wise comparisons from position zero.

[7,2,3] < [4,5,6] 
False
[] < [1] 
True

Membership

1 in [1, 2, 3]
True
0 in [1, 2, 3] 
False
123 in [1, 2, 3] 
False
[1] in [1, 2, 3] 
False

Append

This is the original list updated.

xs = [0, 1, 2]  
xs = xs + [3] 
xs
[0, 1, 2, 3]

Alternatively, using a method that list provides…

xs = [0, 1, 2]
xs.append(3)
xs 
[0, 1, 2, 3]
xs = [1, 2, 3]
xs.append([4,5]) 
xs 
[1, 2, 3, [4, 5]]

Extend

xs = [1, 2, 3] 
xs.extend([4,5]) 

Is code similar to.

xs = xs + [4, 5]

List Aliasing

Take care when (trying to) copy lists.

xs = [1, 2, 3]
ys = xs
ys[-1] = 9
ys
[1, 2, 9]
xs 
[1, 2, 9]

Copying Lists

xs = [1, 2, 3]
ys = xs.copy()
ys[-1] = 9
xs
[1, 2, 3]
ys 
[1, 2, 9]

Passing Lists to Functions

A more nuanced case is when passing lists (but really list references) to functions.

def foo(ys):
    ys[0] = -100
xs = [1, 2, 3] 
foo(xs) 
xs 
[-100, 2, 3]

Notice here that a reference to the list is passed to foo and consequently changes made inside function have effect outside the function.

Often this is something we wish to exploit. We can have functions that sort a list “in place” (without duplicating the list) or have utility functions that zero out the contents of a list for example.

Notice in ths visualization that a reference (arrow) is passed to foo.

List Aliasing

There are some minor nuances with list appending. We enumerate them here so they can be identified in debugging.

The following have typical behavior on aliases. Appending to xs changes the list ys is referencing.

xs = [1, 2, 3] 
ys = xs
xs.append(4)
ys # ys has also changed
[1, 2, 3, 4]
xs = [1, 2, 3]
ys = xs
xs += [4]
ys # ys has also changed
[1, 2, 3, 4]

Concatenating like the following, however, breaks the aliasing relationship because xs creates a new list in memory.

xs = [1, 2, 3]
ys = xs
xs = xs + [3]
ys # ys has NOT changed!
[1, 2, 3]

Here the aliasing relationship broken! This operation is concatenating two lists (which produces a new list) and not appending xs.

Slicing

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
xs[3:6] 
[3, 4, 5]
[3, 4, 5]
xs[::2] 
[0, 2, 4, 6, 8]
xs[7:2:-2] 
[7, 5, 3]

Exercise 1 We say the list

[0, 1, 2, 3, 4, 5]

rotated by 2 is

[2, 3, 4, 5, 0, 1].

Write a function

def rotate(xs: list, k: int) -> list

that rotates a list by k.

Nested Lists

A list can have another list as an element.

xs = [ [1,2], [5,6,7] ]
len(xs)
2
len(xs[-1])  
3

It is more accurate to say we a list of references to lists, and not the lists themselves. As illustrated in the following.

Matrices

Notice that a matrix can be represented by a list-of-lists in Python.

\[ \mathbb{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \equiv [[1, 2, 3], [4, 5, 6], [7, 8, 9]] \]

ass = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] 
ass[-1] 
[7, 8, 9]
ass[1:3] 
[[4, 5, 6], [7, 8, 9]]
ass[1]
[4, 5, 6]
ass[1][-1]
6
ass[1,-1] 
TypeError: list indices must be integers or slices, not tuple

Deep-Copy

ass = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
bss = ass.copy()
bss[0][0] = 0 
bss
[[0, 2, 3], [4, 5, 6], [7, 8, 9]]
ass
[[0, 2, 3], [4, 5, 6], [7, 8, 9]]

This is because Python did a shallow-copy (it copied the references, not the lists themselves). For this to work we need a deep-copy, which is not built-in — see copy.deepcopy.

Converting Lists to Arguments

It is possible to “unbracket” a list which is useful for passing all the arguments in on variable

def f(x, y, z):
    return x + y + z
t = [1, 2, 3]
f(t)
TypeError: f() missing 2 required positional arguments: 'y' and 'z'
f(*t) 
6

Exercises

Exercise 2 Write a function

def transpose(ass: List[List[int]]) -> List[List[int]]:

that returns the matrix transpose of \(\mathbb{A}\). Recall \[ \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} \] That is the columns and rows are swapped.

Dictionaries

Here is a deliberately vague task: Write a function that returns the character frequency of a string comprised only of lower case characters.

For example, the character frequency of “hello world” would encode that there is one “h”, three “l”s and so on.

What are some viable representations?

Two Lists

[ ['h', 'e', 'l', 'o', ' ', 'w', 'r', 'd'],  
  [1, 1, 3, 2, 1, 1, 1, 1] ]

One-to-one Encoding

 a b c d e f g h i j k l m n o p q r s t u v w x y z
[0,0,0,1,1,0,0,1,0,0,0,3,0,0,2,0,0,1,0,0,0,0,1,0,0,0]

Hash Function

In both cases we provided a way to map a key (character) to a value (number). \[ \begin{align*} \text{str} &\to \text{int} \\ \text{h} &\mapsto 1 \\ \text{e} &\mapsto 1 \\ \text{k} &\mapsto 3 \\ &\;\; \vdots \end{align*} \] This mapping is called a hash function.

This is the hash-function for the two list encoding.

def hash_1(key: str) -> int:
    """Two list encoding."""
    xs = ['h', 'e', 'l', 'o', ' ', 'w', 'r', 'd']
    ys = [1, 1, 3, 2, 1, 1, 1, 1]
    k = xs.index(key)  # position of key in xs
    return ys[k]
>>> hash_1('o')
2
>>> hash_1('r')
1

This is the hash function for the single list encoding.

def hash_2(key: str) -> int:
    """One-to-one encoding."""
    xs = [0,0,0,1,1,0,0,1,0,0,0,3,0,0,2,0,0,1,0,0,0,0,1,0,0,0]
    pos_in_alpha = ord(key)-ord('a')  # position of key in alphabet
    return xs[pos_in_alpha]
>>> hash_2('o')
2
>>> hash_2('r')
1

Generally speaking we can index a list by any type of key given some hash-function which takes keys to values.

Python has a “magic” hash function that will index anything appropriately. The result is implemented as the dict data-type.

Dictionaries

A dictionary in Python is a data-structure that stores tuple(key, value) pairings. Python has a “magic” hash function to find a key among the tuples fast. They are like lists that can be indexed by more than just index positions.

The “hello world” example encoded using a dictionary is

>>> char_to_freq = {
    'd': 1, 
    'o': 1, 
    ' ': 1, 
    'r': 1, 
    'w': 1, 
    'e': 1, 
    'l': 3, 
    'h': 0
    }

Dictionaries have no ordering because the keys can be of mixed type but are mutable.

Hash tables are a candidate for the most important/useful data-structure in computer science.

Dictionary Methods

items

The items of a dictionary are its key/value pairings written as key: value.

h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }
h.items()
dict_items([('red', ['apple', 'firetrucks', 'cars']), ('yellow', ['banana', 'cars']), ('blue', ['sky', 'cars'])])

values

The values of a dictionary are what keys point to. The value part of key: value.

h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }
h.values() 
dict_values([['apple', 'firetrucks', 'cars'], ['banana', 'cars'], ['sky', 'cars']])

keys

The keys of the dictionary point to values. The key part of key: value.

Warning

The keys of a dictionary must be an immutable type.

d = dict()
d[ [1,2,3] ] = 1
TypeError: unhashable type: 'list'

Tuples are immutable

d[ (1,2,3) ] = 1

Dictionaries are mutable.

h = dict()
d[ h ] = 1
TypeError: unhashable type: 'dict'
h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }
h.keys() 
dict_keys(['red', 'yellow', 'blue'])
h["blue"] 
['sky', 'cars']
h["green"]
KeyError: 'green'

The following is fine because we are assigning not retrieving.

h["green"] = ["leaves"] 

clear

We can empty a dictionary of its items.

h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }

f = h.copy()
h.clear()
h["blue"] 
KeyError: 'blue'
f["blue"] 
['sky', 'cars']

copy

Recalling that copying is required because a dictionary is a mutable type. There is a method for copying a dictionary.

h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }

f = h.copy()
h["blue"].append("windex") 
h["blue"]
['sky', 'cars', 'windex']

Shallow copy.

f["blue"] 
['sky', 'cars', 'windex']

get

The method get is preferred by some because it returns None rather than KeyError.

h = {
    "red": ["apple", "firetrucks", "cars"],
    "yellow": ["banana", "cars"],
    "blue": ["sky", "cars"]
    }
h.get("red") 
['apple', 'firetrucks', 'cars']
h["green"] 
KeyError: 'green'
h.get("green")  # output is invisible because it is None
type(h.get("green"))
NoneType

Summary

Lists are a mutable type that collects another type into a indexed sequence string from index 0.

Dictionaries are for storing key-value relationships. A key must be an immutable type and gets mapped to a value using a hash-function that is magically fast.