Primitive Data

Introduction to Software Engineering (CSSE 1001)

Author

Paul Vrbik

Published

March 13, 2025

The primitive data types in Python comprise

  1. bool,
  2. integer,
  3. float (which we skip),
  4. string, and
  5. tuple.

They are the “built-in” types that Python provides, and each comes with its own operators and functions.

Immutability

Once a piece of primitive data is loaded into memory it cannot be changed. If we store an integer in the name x

x = 7
id(x)
140030289904048

then increment x by three

x = x + 3

we will see x now refers to a new location in memory rather than the memory location itself being overwritten

id(x)
140030289904144

This detail is not overly important, but take care not to try to modify strings; as we will soon see, it is sometimes tempting to do so.
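A minimal sketch of what this rebinding means in practice. The second name y is only there for illustration, and because the exact id values differ from run to run we compare them rather than print them.

```python
x = 7
y = x                     # y refers to the same integer object as x
same_before = id(x) == id(y)

x = x + 3                 # builds a new integer and rebinds the name x to it

print(same_before)        # True
print(x, y)               # 10 7 (the original 7 was never overwritten)
print(id(x) == id(y))     # False
```

The name y still sees 7 because incrementing x created a new value instead of changing the old one.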

Booleans

The boolean type provides exactly two values: True and False.

type(True)
bool
type(False)
bool

Now that we have this type we can introduce operations like greater than that return a true/false answer.

3 > 7
False

We will cover all the comparison operators on numbers that we omitted last week in this lecture.

Boolean Arithmetic

Generally speaking statements like

> "Is it raining?"

or

> "Is it Wednesday?"

evaluate to a boolean and are called predicates.

Sometimes we want to tie two predicates together like

> "Is it raining AND Wednesday?"

This is accomplished with the operators and, or, not.

2 < 3 and 3 < 2
False

And

| and   | True  | False |
|-------|-------|-------|
| True  | True  | False |
| False | False | False |

Table 1: The truth table for and. Read like a multiplication table swapping numbers for booleans and multiplication for and.

False and False
False
False and True
False
True and False
False
True and True
True

Or

| or    | True  | False |
|-------|-------|-------|
| True  | True  | True  |
| False | True  | False |

Table 2: The truth table for or. Read like a multiplication table swapping numbers for booleans and multiplication for or.

False or False
False
False or True
True
True or False
True
True or True
True

Not

not(True)
False
not(False)
True
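The truth tables for and and or can also be generated rather than typed out. The following is a sketch only; the function name truth_table is made up for this example and it uses tuples, which are covered later in this lecture.

```python
def truth_table():
    """Return one row (p, q, p and q, p or q) for every pair of booleans."""
    rows = ()
    for p in (True, False):
        for q in (True, False):
            rows = rows + ((p, q, p and q, p or q),)
    return rows

for row in truth_table():
    print(row)
# (True, True, True, True)
# (True, False, False, True)
# (False, True, False, True)
# (False, False, False, False)
```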

Comparisons

True > False
True
True <= False
False
True != False
True
True == True
True
False == False
True

Integers

Integers in Python can be as big as we want. This requires some work behind the scenes because computer hardware is designed to do arithmetic on integers that fit in a machine word. In many other programming languages it is the responsibility of the programmer not to exceed the largest representable number.
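To see this in action we can step past the largest value a word can hold. The sketch assumes a 64-bit signed word, which is common but not universal; the names largest_64_bit and bigger are illustrative.

```python
largest_64_bit = 2**63 - 1    # assumed word size: 64-bit signed
print(largest_64_bit)         # 9223372036854775807

bigger = largest_64_bit + 1   # no overflow, Python just makes a bigger integer
print(bigger)                 # 9223372036854775808

print(2**100)                 # 1267650600228229401496703205376
```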

Integer Arithmetic

We simply include the table of arithmetic operators here for completeness.

| Operator | Name             | Syntax |
|----------|------------------|--------|
| ()       | Brackets         | (x)    |
| +        | Unary plus       | +x     |
| -        | Unary minus      | -x     |
| +        | Plus             | x + y  |
| -        | Minus            | x - y  |
| *        | Multiplication   | x * y  |
| //       | Integer Division | x // y |
| %        | Modulus          | x % y  |
| **       | Exponentiation   | x ** y |

Table 3: Arithmetic Operators. Here x and y are integers (or, more generally, any legal Python expressions that evaluate to integers). To be more accurate we should add “and store in memory” to the description of each operation.
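A few of these operators are easiest to understand by example, in particular integer division and modulus. The values 17 and 5 below are arbitrary choices for illustration.

```python
x, y = 17, 5

q = x // y                  # integer division: how many whole times y fits in x
r = x % y                   # modulus: what is left over
print(q, r)                 # 3 2
print(q * y + r == x)       # True, quotient and remainder rebuild x

print(-17 // 5, -17 % 5)    # -4 3 (integer division rounds down, not toward zero)
print(2 ** 10)              # 1024
```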

Comparisons

We can now introduce the operations on numbers that evaluate to booleans.

| Operator | Name                  | Syntax |
|----------|-----------------------|--------|
| ==       | Equal                 | x == y |
| >        | Greater than          | x > y  |
| >=       | Greater than or equal | x >= y |
| <        | Less than             | x < y  |
| <=       | Less than or equal    | x <= y |

Table 4: Comparison Operators. Here x and y are integers (or, more generally, any legal Python expressions that evaluate to integers).

67 == 67
True
x, y = 67, 67
x == y
True
3 < 7
True

The evaluation result from this expression can be assigned to a variable.

x = 3 < 7
x
True

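There is a shortcut for 0 < x and x < 100: the two comparisons can be chained. (Here x still holds True from above, which compares as the integer 1.)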

0 < x < 100
True
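The chained form and the long form with and always give the same answer. A small sketch with two hypothetical helper functions makes the equivalence concrete.

```python
def in_range_chained(x: int) -> bool:
    """Chained comparison."""
    return 0 < x < 100

def in_range_long(x: int) -> bool:
    """The same test written with and."""
    return 0 < x and x < 100

for x in (-5, 0, 1, 50, 99, 100):
    print(x, in_range_chained(x), in_range_long(x))
# -5 False False
# 0 False False
# 1 True True
# 50 True True
# 99 True True
# 100 False False
```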

Strings

Anything (with some exceptions) enclosed by single-quotes ' ' or double-quotes " " is considered to be a string by Python.

A string is an ordered sequence of characters, where the available characters are those of a character set such as ASCII or, more generally, Unicode.

"hello world"
'hello world'
'hello world'
'hello world'
type("hello world")  
str
hello world  # note the lack of quotes  
SyntaxError: invalid syntax 
hello  # note the lack of quotes
NameError: name 'hello' is not defined

Adding Strings (Concatenation)

"hello" + "world"
'helloworld'
space = " "  
"hello" + space + "world"  
'hello world'

When strings are added a new string is created. This is because strings are immutable — once they are in memory they cannot change.

Scalar Multiplying Strings

Because we can repeatedly add a string to itself, it makes sense to multiply strings by positive integers.

3 * "Hello World!" 
'Hello World!Hello World!Hello World!'

This mixed-type computation may look odd, but repeatedly adding a string to itself is already allowed, and multiplication by a positive integer is essentially notation for that repeated addition.

"Hello World!" + "Hello World!" + "Hello World!" 
'Hello World!Hello World!Hello World!'

Exercise 1 What is zero times a string?

0 * "hello"

The empty string.

''
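The idea that multiplication is repeated addition can be written out directly. This is only a sketch of the idea, not how Python implements the operator; the function name repeat is made up for the example.

```python
def repeat(cs: str, n: int) -> str:
    """Add cs to the empty string n times."""
    result = ""
    for _ in range(n):
        result = result + cs
    return result

print(repeat("ab", 3))               # ababab
print(repeat("ab", 3) == 3 * "ab")   # True
print(repeat("hello", 0) == "")      # True, adding zero times leaves the empty string
```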

Comparing Strings

The single character a is ordered less than b as you would expect.

"a" < "b" 
True

This is because every character is listed in a table such as ASCII or Unicode. These tables standardize how computers display text by agreeing that, say, character number 97 is the lower-case a.

ord("a")  # the "order" of the character a 
97
ord("b")  # the "order" of the character b
98
chr(97)  # get character from order 
'a'
"A" < "a" 
True
"Z" < "a" 
True
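The last two comparisons hold because every upper-case letter is listed before every lower-case letter in the table. A short sketch using only ord and chr shows this, and that the two functions undo each other.

```python
print(ord("A"), ord("Z"))      # 65 90
print(ord("a"), ord("z"))      # 97 122

print(chr(ord("a") + 1))       # b (the next entry in the table)
print(chr(ord("q")) == "q")    # True, chr undoes ord
```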

The best way to understand how these comparisons on longer strings are resolved is to simply implement a function that does so. However, we require string indexing for that and will therefore wait until then to do it.

"a" < "aa" 
True
"b" < "aa" 
False
"aba" < "ab" 
False
"aZ" < "aa" 
True

Exercise 2 Write a function according to the following specification.

def string_less_than(cs: str, ds: str) -> bool:
    """
    Return true when string <cs> is strictly less than string <ds>.
    """

Do not use < to solve this question.

Escape Characters

New Line

The new line escape character, written \n, can be used in a string so that whatever follows it is printed on a new line.

"hello\nworld"
'hello\nworld'
'hello\nworld'
'hello\nworld'
print("hello\nworld")   
hello
world

Notice how a string can be stored differently than it is printed.

Tab

A tab is a fixed amount of horizontal space. How a tab is displayed depends on the program displaying it. (This is why tabs are the worst :)

"hello\tworld"
'hello\tworld'
'hello\tworld'
'hello\tworld'
print("hello\tworld")
hello   world
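Although an escape character is typed with two keystrokes, it is stored as a single character. Checking lengths is a quick way to convince ourselves of this; the name cs is just an example.

```python
cs = "hello\nworld"
print(len(cs))       # 11 = 5 letters + 1 new line + 5 letters

print(len("\t"))     # 1, a tab is also a single character
print(len("\\n"))    # 2, an escaped backslash followed by the letter n
```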

Numbers versus Strings

3 + 7
10
"3" + "7"
'37'
3 + "7"
TypeError: unsupported operand type(s) for +: 'int' and 'str'
str(3) + "7"
'37'
3 + int("7")
10
int("3.14")
ValueError: invalid literal for int() with base 10: '3.14'
float("123.456")
123.456

Note that these conversions only work for numbers written with digits!

int("seven")
ValueError: invalid literal for int() with base 10: 'seven'
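When it is not known in advance whether a string holds a number, the ValueError can be caught instead of crashing the program. The sketch below uses try and except, which go beyond this lecture, and the helper name to_int_or_none is hypothetical.

```python
from typing import Optional

def to_int_or_none(text: str) -> Optional[int]:
    """Return text as an integer, or None when it cannot be converted."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int_or_none("7"))       # 7
print(to_int_or_none("seven"))   # None
print(to_int_or_none("3.14"))    # None
```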

Substitution

There is a mechanism for substituting the values of variables into a string: the string formatter, which is a string prefixed with f whose curly brackets are replaced by values.

x = 1
y = "two"
f"x={x} y={y}"
'x=1 y=two'

Formatted strings are more typically used with print statements.

x = 1
y = "two"
print(f"x={x} y={y}")
x=1 y=two

Length

A string’s length is the number of characters that comprise it.

len("h")
1
len("hello")
5
cs = "world"  
len(cs)
5
len(cs+"world") == len(cs) + len("world")  
True

String Inclusion

We can ask if a string is a substring of another string using in.

"h" in "hello world"
True
"hello" in "hello world" 
True
cs = "world" 
cs in "hello world" 
True
"ow" in "hello world" 
False

String Indexing

Because a string is ordered we can number its characters starting from zero and access them by using square brackets.

cs = "hello world"
cs[0] 
'h'
cs[1]
'e'
cs[2] 
'l'
cs[len(cs)-1] 
'd'
cs[len(cs)] 
IndexError: string index out of range

We can also index from the end of the string.

cs[-1]
'd'
cs[-2] 
'l'
cs[-3] 
'r'

It is now possible to solve Exercise 2.

def string_less_than(cs: str, ds: str) -> bool:
    """
    Return true when string <cs> is strictly less than string <ds>.
    """
    k = 0
    while k < min(len(cs), len(ds)):
        if ord(cs[k]) < ord(ds[k]):
            return True
        if ord(cs[k]) > ord(ds[k]):
            return False
        k += 1
    # Here cs and ds are equal up to index k
    return len(cs) < len(ds)
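As a sanity check we can compare this solution against Python's own < on the examples from earlier. The function is repeated so that the sketch runs on its own; the pairs are the ones used in this lecture plus one pair of equal strings.

```python
def string_less_than(cs: str, ds: str) -> bool:
    """Return True when string cs is strictly less than string ds."""
    k = 0
    while k < min(len(cs), len(ds)):
        if ord(cs[k]) < ord(ds[k]):
            return True
        if ord(cs[k]) > ord(ds[k]):
            return False
        k += 1
    # Here cs and ds agree on every index below k
    return len(cs) < len(ds)

pairs = (("a", "aa"), ("b", "aa"), ("aba", "ab"),
         ("aZ", "aa"), ("", "A"), ("abc", "abc"))

all_agree = True
for cs, ds in pairs:
    agrees = string_less_than(cs, ds) == (cs < ds)
    print(repr(cs), repr(ds), string_less_than(cs, ds), agrees)
    all_agree = all_agree and agrees

print(all_agree)   # True
```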

String Slicing

Because the string’s characters are numbered we can slice the string to obtain only a part of it. Let us define a special cs so that each digit in the string matches its index.

cs = "0123456789"

Grab the characters from index 1 (inclusive) through index 4 (exclusive).

cs[1:4]
'123'
cs[0:9]
'012345678'
cs[0:10]
'0123456789'
cs[0:]  # Omitting the right endpoint defaults to len(cs), as on the previous line.
'0123456789'
cs = "0123456789"

Consider that (-1) % 10 == 9.

cs[3:-1] 
'345678'

Omitting left-endpoint defaults to 0.

cs[:-1]  
'012345678'

Omitting both gives everything!

cs[:]
'0123456789'

Consider that (-7) % 10 == 3.

cs[-7:]
'3456789'

Increment by two from zero inclusive to ten exclusive.

cs[::2]
'02468'

Decrement by one from the last index (nine) down to and including zero.

cs[::-1]
'9876543210'

Increment by three from one inclusive to seven exclusive.

cs[1:-3:3]
'14'

Decrement by four from zero inclusive to ten exclusive. The result is empty because ten can never be reached by counting down from zero.

cs[0:10:-4]
''

Decrement by four from ten inclusive to zero exclusive. There is no index ten, so the slice starts at the last index, nine.

cs[10:0:-4]
'951'

Decrement by four with both endpoints omitted, which starts at the last index and runs down through index zero.

cs[::-4]
'951'
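When a slice is surprising it helps to list the indices it actually visits. The sketch below leans on the built-in slice object and its indices method; the helper name slice_indices is made up for this example.

```python
cs = "0123456789"

def slice_indices(start, stop, step, length):
    """Return the indices visited by [start:stop:step] on a sequence of this length."""
    return tuple(range(*slice(start, stop, step).indices(length)))

print(slice_indices(1, -3, 3, len(cs)))        # (1, 4)
print(slice_indices(10, 0, -4, len(cs)))       # (9, 5, 1)
print(slice_indices(None, None, -4, len(cs)))  # (9, 5, 1)
print(slice_indices(0, 10, -4, len(cs)))       # ()
```

Passing None plays the role of an omitted endpoint.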

Immutability of Strings

Something is immutable when it cannot be changed. Strings are immutable.

We cannot capitalize a word in the following way.

cs = "hello"  
cs[0] = "H"  
TypeError: 'str' object does not support item assignment
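What we can do instead is build a new string out of the pieces we want to keep. A minimal sketch using concatenation and slicing; the name capitalised is illustrative.

```python
cs = "hello"
capitalised = "H" + cs[1:]   # a new string, cs itself is untouched

print(capitalised)           # Hello
print(cs)                    # hello
```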

Strings as booleans

The only falsy Python string is the empty string. Every other string is truthy.

bool("a")
True
bool("Hello")
True
bool("")
False
"" < "A"
True

The empty string is “smaller” than every other string, including any single character.

"" < chr(0) 
True

Space versus Empty

A space and the empty string are sometimes confused because they look identical when printed.

print(" ")   # 1 space
 
print("  ")  # 2 spaces
  
print("")    # 0 spaces

As you can see, the output is indistinguishable (although you will see a difference if you try to copy the spaces). Nonetheless these values are not equal.

space = " "  # A space.

A space has nonzero length for example.

len(space)
1

And is truthy (whereas we expect “empty” values to be false).

bool(space)
True

On the other hand,

empty = ""  # Empty string.

The empty string has no length.

len(empty)
0

And is falsy.

bool(empty)
False
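One way to tell the two apart on screen is to print the representation of the string, which keeps the quotes. This is a small sketch using the built-in repr; it repeats the names space and empty from above.

```python
space = " "
empty = ""

print(repr(space))      # ' '
print(repr(empty))      # ''
print(space == empty)   # False
```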

Tuples

Generalizing strings to collect any type (not just characters) yields tuples.

In particular, a tuple is an immutable ordered collection of elements. These elements are not necessarily the same type and are not necessarily distinct.

Round brackets () are used to create tuples in Python.

xs = (0, 1, 2, 3, 4, 5)
type(xs) 
tuple

We still have indexing.

xs[-1] 
5

Tuples are immutable.

xs[0] = -1
TypeError: 'tuple' object does not support item assignment

Mixed types are fine.

xs = (1, 'a', 2, 'b')
ys = (3, 'c', 4, 'd')

Tuple addition (which is concatenation) may not behave as expected.

xs + ys  
(1, 'a', 2, 'b', 3, 'c', 4, 'd')

There is an empty tuple.

type(())
tuple
xs + ()
(1, 'a', 2, 'b')

Tuple scalar multiplication

2*xs
(1, 'a', 2, 'b', 1, 'a', 2, 'b')

Nomenclature

(0, 1)  # Couple 
(0, 1, 2)  # Triple 
(0, 1, 2, 3)  # Quadruple 
(0, 1, 2, 3, 4)  # Quintuple 
(0, 1, 2, 3, 4, 5)  # Sextuple 
(0, 1, 2, 3, 4, 5, 6)  # Septuple 
(0, 1, 2, 3, 4, 5, 6, 7)  # Octuple 

Singleton Tuple

We have to be careful when defining the length-one tuple. The expression (1) is not a tuple.

type((1))
int
(1)
1

Here the brackets are for changing evaluation order.

Thus the following

(1) + (2) 
3
(1) + (2,3)
TypeError: unsupported operand type(s)

The singleton tuple requires brackets and a comma.

type((1,))
tuple
(1,) + (2,)
(1, 2)
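The comma is doing the real work here. A short sketch, with illustrative names, contrasting the three spellings:

```python
not_a_tuple = (1)       # brackets only: just the integer 1
singleton = (1,)        # brackets and a comma: a tuple of length one
also_singleton = 1,     # the comma alone is already enough

print(type(not_a_tuple).__name__)     # int
print(type(singleton).__name__)       # tuple
print(singleton == also_singleton)    # True
print(len(singleton))                 # 1
```

Writing the brackets anyway is a good habit because a lone trailing comma is easy to miss.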

Comparing Tuples

(1, 2, 3) == (1, 2, 3) 
True
(1, 2, 3) == (1, 2) 
False
(1, 2, 3) < (1, 2, 4) 
True
(1, 2, 3) < (1, 2) 
False
(1, 2) < (1, 2, 3) 
True

Type Casting

It is sometimes possible to convert from one type into another, albeit imperfectly. We have already seen an example of this when converting from float to integer by way of truncation. The remaining conversions are covered in this section. Note that we omit conversions into floats as we are trying to avoid using this type.

Bool

Python will convert the “zero” of any type to false and everything else to true. What “zero” is for the various types varies but is typically what the “empty” element would be. For instance, the “zero” for integer and float is the number 0; the “zero” for the string type is the empty string ''.

A value that converts to true is called truthy; otherwise it is falsy.

Zero is falsy.

bool(0)
False

123 is truthy.

bool(123)
True
bool(0.0)
False
bool(123.345)
True
bool("")
False
bool("hello world")
True
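Truthiness matters mostly because an if statement performs this conversion for us. The helper describe below is hypothetical and exists only to show the conversion happening without an explicit call to bool.

```python
def describe(value) -> str:
    """Report how value behaves when used as a condition."""
    if value:               # Python converts value to a boolean here
        return "truthy"
    return "falsy"

print(describe(0))      # falsy
print(describe(123))    # truthy
print(describe(""))     # falsy
print(describe(" "))    # truthy
print(describe(()))     # falsy, the empty tuple is the "zero" of tuples
print(describe((0,)))   # truthy, a tuple containing zero is not empty
```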

Empty String vs. Space

Bear in mind that the empty string is distinct from a string of spaces.

bool("")
False
bool(" ")
True

It is true that they are indistinguishable when printed, but spaces are not special: they are merely characters with no strokes that render as blank space. To emphasize this point consider the difference in length for the two strings.

len("")
0
len(" ")
1

They are not of equal length and therefore cannot be the same!

Integer

Strings comprised of digits can be converted into integers.

int("1234")
1234
type(int("1234"))
int

But strings with alphabetic characters cannot.

int("hello world")
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    int("hello world")
ValueError: invalid literal for int() with base 10: 'hello world'

Floats are converted to integers through truncation.

int(123.456)
123

Booleans are converted by taking True to 1 and False to 0.

int(True)
1
int(False)
0

Consequently:

True + True
2
True * False
0
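One practical use of this is counting how often a predicate holds, because each True adds one to a running total. A sketch with a made-up function count_vowels that treats only the lower-case letters a, e, i, o, u as vowels:

```python
def count_vowels(cs: str) -> int:
    """Count the lower-case vowels in cs by adding up booleans."""
    total = 0
    for c in cs:
        total = total + (c in "aeiou")   # adds 1 when True and 0 when False
    return total

print(count_vowels("hello world"))   # 3
print(count_vowels("rhythm"))        # 0
```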

String

The function str returns the string representation of an object.

str(123)
'123'

This is the same string that is printed when using print on an object.

String Formatter

Another, more versatile, way of converting to strings is to use the string formatter.

x = 1
y = 22
f"The sum of {x} and {y} is {x+y}"
'The sum of 1 and 22 is 23'
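For comparison, the same sentence can be built with str and concatenation. The sketch below only illustrates why the formatter is the more convenient of the two; the names with_formatter and with_str are illustrative.

```python
x = 1
y = 22

with_formatter = f"The sum of {x} and {y} is {x+y}"
with_str = "The sum of " + str(x) + " and " + str(y) + " is " + str(x + y)

print(with_str)                     # The sum of 1 and 22 is 23
print(with_formatter == with_str)   # True
```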

Further Resources