Introduction to Software Engineering (CSSE 1001)
Computer memory can be understood to be a (very) long sequence of on/off switches (see Figure 1). We can use different stand-ins for on/off like True/False, 0/1, or even Black/White, but ultimately each encodes two states.
Each of these on/off switches is called a bit (so called because it is a binary digit), and a collection of eight consecutive bits is called a byte. One bit can encode at most two distinct values; two bits, four; three bits, eight; and in general each additional bit doubles the count.
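The doubling pattern can be checked with a few lines of Python. This is only a small sketch; the variable names are chosen here for illustration.

```python
# Each extra bit doubles the number of distinct values that can be encoded.
one_bit = 2 ** 1
two_bits = 2 ** 2
three_bits = 2 ** 3
one_byte = 2 ** 8   # a byte is eight bits

print(one_bit, two_bits, three_bits, one_byte)   # 2 4 8 256
```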
To “put something in memory” means to toggle bits into some meaningful configuration. For instance, the “random” bits in Figure 1 actually encode the unsigned 32-bit integer 3,931,377,233.
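As a rough illustration, Python can show the bit configuration of an integer. The sketch below uses the built-in format function to write the number as 32 binary digits; whether the digits appear in the same order as the switches drawn in Figure 1 is an assumption that cannot be checked from the text alone.

```python
n = 3931377233

# "032b" means: binary digits, padded with leading zeros to a width of 32.
bits = format(n, "032b")

print(bits)               # a string of 32 characters, each '0' or '1'
print(len(bits))          # 32
print(int(bits, 2) == n)  # True: reading the bits back gives the same number
```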
In Python the “somethings” we put in memory are called objects and each object has a type which defines which functions the object can interact with, and how.
We will learn about each of these types in turn and eventually add new types!
Bits in memory are numbered so we can refer to them. This number is called an address, and rather than addressing each individual bit, we divide memory into blocks called machine words and address those instead. The length of a machine word varies, but typically it is 32 bits (though newer computers use 64 bits).
In Python we can use the id function to obtain the address of any object in memory. For example, we can ask where the number one-hundred-and-twenty-three is stored.
id(123)
140207028539440
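The number printed will differ from run to run and from machine to machine. In CPython (the standard Python implementation) the value returned by id is the object's memory address; other implementations only promise that it is unique while the object exists. A minimal sketch, with variable names chosen for illustration:

```python
a = 123
address = id(a)

print(address)            # some large integer; the exact value varies
print(hex(address))       # the same number written in hexadecimal
print(id(a) == address)   # True: the same object keeps the same id
```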
In truth it is not the case that all numbers are stored in memory waiting to be used (that would be quite inefficient). What happened here is that 123 was put in memory by Python when we used it in an expression.
All objects, whether they be numbers, letters, or even music, must be encoded into a sequence of bits in order for a computer to work with them. In order to distinguish these (identical-looking) sequences from one another we imbue them with a type.
Loading the number 123 means going to an address in memory, toggling the bits there into the 123 bit sequence, and then recording somewhere that the value at that address is to be interpreted as an integer.
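To see why the recorded type matters, the sketch below reads the same four bytes twice using the standard struct module: once as an unsigned 32-bit integer and once as a 32-bit floating point number. This is an illustration of the idea only, not a description of how Python stores its own int objects; the byte values are chosen for the example.

```python
import struct

raw = bytes([0x00, 0x00, 0x80, 0x3F])      # four bytes, i.e. 32 bits

# "<I" = little-endian unsigned 32-bit integer, "<f" = little-endian 32-bit float
as_integer = struct.unpack("<I", raw)[0]
as_float = struct.unpack("<f", raw)[0]

print(as_integer)   # 1065353216
print(as_float)     # 1.0
```

The bits are identical in both cases; only the interpretation changes.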
In Python we can use type on any object to obtain its type information.
type(101)
int
When we retrieve a bit-sequence from memory we now also know what type of object we are working with. Types are important because they dictate which functions are compatible with the object.
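A short sketch of how the type decides which operations are allowed. Note that print(type(...)) shows the type as <class 'int'>, whereas a notebook cell displays it as int. The variable names are illustrative.

```python
number = 101
text = "hello"

print(type(number))   # <class 'int'>
print(type(text))     # <class 'str'>

print(number + 1)     # 102: + adds two integers
print(text + "!")     # hello!: + joins two strings

# Mixing the two types with + is not defined, so Python raises a TypeError.
try:
    number + text
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(mixed_ok)       # False
```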
Variables are essentially nicknames we give to addresses in memory. Computing with variables is an important concept for when we eventually start defining functions.
In Python, variables are not objects and therefore do not have any type. Practically speaking this means we can use the same variable to store different types – a feature called variable polymorphism. This is not a feature of every programming language and has consequences: it will be easier to author programs, but also easier to introduce bugs.
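A minimal sketch of the same name referring to objects of different types at different times. The type belongs to the object, not to the name.

```python
x = 2
first_type = type(x)    # the name x currently refers to an int

x = "two"
second_type = type(x)   # the same name now refers to a str

print(first_type)       # <class 'int'>
print(second_type)      # <class 'str'>
```

This flexibility is convenient, but it also means that code expecting x to be a number will fail later if x has silently become a string.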
Assigning to a variable is an imperative or command we give the computer. Python interprets x = 2 as an instruction to put the number two at a memory location we can refer to by x.
x = 2 # put two in memory
Anything after a hash (#) in a code region is ignored by Python and is called a comment.
After assigning a value to a name it is common parlance to say the value is “in” the name, as in “2 is in x” or “x has 2 in it”.
Notice here that nothing was printed because the imperative in this case was to perform an action (a state change). The evaluation of x = 2 is not something that can be printed per se, because the result of its execution was to change memory.
In order to see what is stored in x we can evaluate it.
x
2
Once a variable has been assigned we can retrieve its value by simply utilizing the name in an expression.
2 * x
4
Working with names is beneficial because an expression using names like
Pi * radius ** 2
153.86
has more meaning than one using numbers
3.14 * 7 ** 2
153.86
and will be correct even if the value of radius changes.
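The sketch below spells this out. The values 3.14 and 7 come from the example above; the second radius and the names area_small and area_large are chosen here for illustration.

```python
Pi = 3.14
radius = 7

# ** (power) is applied before *, so this is Pi * (radius ** 2).
area_small = Pi * radius ** 2
print(area_small)

# Changing radius and re-evaluating the same expression gives the new area.
radius = 10
area_large = Pi * radius ** 2
print(area_large)
```

The expression itself never had to be edited; only the value stored in radius changed.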
Variable names cannot start with a digit, and cannot contain spaces or symbols other than the underscore (_).
a variable = 1
a variable = 1
^
SyntaxError: invalid syntax
2much = 2
2much = 2
^
SyntaxError: invalid decimal literal
hello! = 3
hello! = 3
^
SyntaxError: invalid syntax
There is also a list of reserved words that have special meaning to Python that we cannot use.
for = 2
for = 2
^
SyntaxError: invalid syntax
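These rules can be checked without triggering a SyntaxError. The sketch below uses the standard keyword module and the string method isidentifier. Note that isidentifier only checks the spelling rules, so reserved words have to be tested separately. The name my_radius is made up for the example.

```python
import keyword

# Spelling rules: no spaces, no leading digit, no symbols except underscore.
print("a variable".isidentifier())   # False
print("2much".isidentifier())        # False
print("hello!".isidentifier())       # False
print("my_radius".isidentifier())    # True

# Reserved words are spelled like valid names but are still not allowed.
print("for".isidentifier())          # True
print(keyword.iskeyword("for"))      # True
print(keyword.iskeyword("my_radius"))  # False

# The full list of reserved words for the running Python version.
reserved = keyword.kwlist
print(reserved)
```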
We follow the PEP 8 guidelines for naming variables.
A name that has not yet been assigned a value cannot be used in an expression. If Python encounters a name with no associated value, it raises a NameError.
z
NameError: name 'z' is not defined
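A NameError stops the program unless it is handled. As a sketch, the try/except statement below catches the error so its message can be inspected; the name undefined_name_for_demo is deliberately never assigned.

```python
try:
    undefined_name_for_demo      # this name has never been assigned
    message = "no error"
except NameError as error:
    message = str(error)

print(message)   # name 'undefined_name_for_demo' is not defined
```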
Until this point we have been evaluating single expressions with Python. We now evaluate sequences like the following. Note that only the result of the final expression in the sequence is printed.
x = 2
y = 3
y = x
x = y
x
2
Recall that one of our main language features is that the order of the instructions matters. Now that we have an imperative we can illustrate this.
x = 2
y = 3
x = y #
y = x # swapped lines
x
3
Importantly, a variable can be used to reassign itself.
x = 1
x = x + 1
x
2
For this reason we do not read x = 1 as “x equals 1” but as “x gets 1”, so that x = x + 1 means “x gets x + 1” and not “x equals x + 1”.
Interpreting = as “equals” here leads to a contradiction: x = 1 and x = x + 1 together imply 1 = 2 (obviously false).
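Python keeps the two ideas apart by using a different symbol for each: a single = assigns, while a double == asks whether two values are equal. A minimal sketch:

```python
x = 1
x = x + 1            # "x gets x + 1": evaluate the right side, then store it

print(x)             # 2
print(x == 2)        # True:  x is equal to 2
print(x == x + 1)    # False: no number is equal to itself plus one
```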
There is a great tool called Python Tutor that visually illustrates Python’s execution of a program.
We have embedded the site below using the examples from the previous section. Try clicking Next > on each and study the difference.
For now ignore what is meant by “Global frame” (this has to do with scoping) and just care about tracking what variables have been assigned and how their values are changing. The “Objects” column will become relevant when we start loading more complicated data into memory.
Note that a line must be executed in order for the imperative to be acted upon. For example, there is no y in memory until line two is executed.
The gray code regions found in these notes are actually evaluated by Python. The following expression evaluates to 2, which is then printed as it would be in the REPL.
x
2
However, if we load a sequence of instructions into the region, it will print only the result of the final evaluation.
x = 2
y = 3
x
y
3
We can force the printing of a variable that is not the final expression in its sequence using print.
x = 2
y = 3
print(x)
y
2
3
Print will be critical for us when debugging as it allows us to print variables from inside functions that are inaccessible to the REPL.
This distinction is only relevant for reading these notes or when working with Jupyter notebooks.
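The debugging use of print mentioned above can be sketched as follows. Functions are introduced later in the course, so this is only a preview; the function name circle_area and the intermediate variable squared are hypothetical names chosen for the example.

```python
def circle_area(radius):
    Pi = 3.14
    squared = radius ** 2
    # squared exists only inside the function, so print is the simplest
    # way to look at it while the function is running.
    print("squared =", squared)
    return Pi * squared

area = circle_area(7)   # prints: squared = 49
print(area)
```

When the same code is run as a script rather than in a notebook cell or the REPL, nothing is displayed automatically, so print is the only way to see a value.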
We have learned that we can load anything that has an encoding as a bit-sequence into memory. Next we learn about the primitive data that Python provides an interface into.