Input and Output
Introduction to Software Engineering (CSSE 1001)
Why Files?
When Python (e.g. via IDLE) launches, it is allocated space in RAM, which can fill up. This is a problem if we want to solve memory-intensive problems (e.g. analyzing tweets, payroll, or scientific computing). We may also want our data to persist after the program ends.
In either case, we must instruct Python to use the disk, which sits one level up from RAM in the memory hierarchy.
| Type | Order of magnitude | 2016 MacBook Pro | Persistence |
|---|---|---|---|
| CPU Cache L2 | KB | 256KB | Requires power. |
| CPU Cache L3 | MB | 8MB | Requires power. |
| Random Access Memory | GB | 16GB | Requires power. |
| Disk | GB/TB | 256GB | Persistent. |
| Cloud | PB | Functionally Infinite. | Persistent. |
The following are the contents of the file hello.txt, which is in a directory called data.
data/hello.txt
What a¶
wonderful¶
hello world.
Here ¶ denotes the newline character, which is normally not displayed.
Open a file
To open a file in the directory where you are running Python, use:
file = open("file.dat", "<mode>")
| Mode | Description |
|---|---|
| r | read |
| w | write |
| a | append |
Note that to “append” means to write at the end of the file.
To determine and/or set your working directory see os.getcwd() and os.chdir().
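As a minimal sketch of how these two functions fit together, the snippet below records the current working directory, moves to its parent, and then moves back. The variable names start_dir and parent_dir are chosen here for illustration only.

```python
import os

# Remember where we started so we can return afterwards.
start_dir = os.getcwd()
print(start_dir)

# Move to the parent directory; relative paths given to open()
# are now resolved from there.
os.chdir("..")
parent_dir = os.getcwd()
print(parent_dir)

# Restore the original working directory.
os.chdir(start_dir)
```

Restoring the original directory at the end keeps later relative paths such as "data/hello.txt" pointing where you expect.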
Read a file
Open a file called hello.txt that is inside the directory data.
file = open("data/hello.txt", "r")
Read a single line from that file.
file.readline()
'What a\n'
And read a single line again.
file.readline()
'wonderful\n'
And again.
file.readline()
'hello world.'
Note the lack of a newline character at the end of the last line.
file.readline()
''
Once the end of the file is reached, readline() returns the empty string indefinitely.
file.readline()
''
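Because readline() returns the empty string at the end of the file, that value can serve as a stopping condition. The following is a sketch only: it first writes a throwaway copy of hello.txt to a temporary directory (an assumption made so the example runs anywhere), then collects lines until '' comes back.

```python
import os
import tempfile

# Assumption: a throwaway copy of hello.txt, so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as out:
    out.write("What a\nwonderful\nhello world.")

lines = []
file = open(path, "r")
line = file.readline()
while line != "":          # '' signals the end of the file
    lines.append(line)
    line = file.readline()
file.close()

print(lines)  # ['What a\n', 'wonderful\n', 'hello world.']
```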
file = open("data/hello.txt", "r")  # open for read.
for line in file:
    print(line)
What a
¶
wonderful
¶
hello world.
¶
The extra ¶ are newlines inserted by Python’s print (the ¶ symbol itself is not actually displayed).
Observe what happens if we repeat the loop from the previous cell.
for line in file:
    print(line)
(Nothing happens.) This is because the file pointer of file already reached the end of the file during the last loop.
We need to reset the file pointer to the beginning.
file = open("data/hello.txt", "r")
for line in file:
    print(line)
What a
wonderful
hello world.
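Reopening the file is the approach used above. As an alternative sketch, Python file objects also provide a seek method, and seek(0) moves the pointer back to the start without reopening. The temporary copy of hello.txt and the names first_pass, second_pass and third_pass are assumptions made for illustration.

```python
import os
import tempfile

# Assumption: a throwaway copy of hello.txt, so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as out:
    out.write("What a\nwonderful\nhello world.")

file = open(path, "r")
first_pass = [line for line in file]    # reads every line
second_pass = [line for line in file]   # pointer is at the end: nothing left
file.seek(0)                            # move the pointer back to the start
third_pass = [line for line in file]    # reads every line again
file.close()

print(second_pass)               # []
print(third_pass == first_pass)  # True
```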
Closing Files
If you do not close your files, they remain open and vulnerable to side effects. That is, when you neglect to close a file after use, you may find that some of its contents are missing or that it contains extra data.
file = open("data/hello.txt", "r")
for line in file:
    print(line)
file.close()  # close the file once we are done with it
with
A better alternative to calling close is the with construct.
The with construct has the advantage of closing your file for you, even if an error occurs while its code block is executing.
with open("data/hello.txt", "r") as file:
    for line in file:
        print(line)
What a
wonderful
hello world.
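To see the automatic closing in action, the sketch below raises an error on purpose inside a with block and then checks the file object's closed attribute. The temporary copy of hello.txt and the simulated ValueError are assumptions made for illustration.

```python
import os
import tempfile

# Assumption: a throwaway copy of hello.txt, so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as out:
    out.write("What a\nwonderful\nhello world.")

try:
    with open(path, "r") as file:
        first = file.readline()
        # Hypothetical failure in the middle of the block.
        raise ValueError("simulated crash")
except ValueError:
    pass

# The with construct closed the file despite the error.
print(file.closed)  # True
```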
Exercises
Exercise 1 Write a function
def num_empty_lines(file: str) -> int:
that counts the number of empty lines (those that contain only \n) in a file.
Exercise 2 Write a function
from typing import TextIO  # new type hint
def num_empty_lines(file_pointer: TextIO) -> int:
that counts the number of empty lines (those that contain only \n) in a file.
Note that this differs from the previous exercise, which took a file name rather than a file pointer.
Suppose we want to read the numbers from the following file into a list.
data/numbers.dat
1¶
2¶
3¶
4¶
5¶
6
When reading from a file we are always reading strings.
Do not forget to cast your strings to a more appropriate type when necessary.
with open("data/numbers.dat", "r") as file:
    ans = []
    for line in file:
        ans.append(int(line))
ans
Exercise 3 Write a function that finds the highest-rated band in:
band-data.dat
Band,Rating,Plays # this is called a header
Metallica,7,512
The Black Keys,8,193
Amy MacDonald,5,290
The Killers,9,402
U2,9,789
Portishead,6,116
Wiggles,2,1849
Coldplay,7,0
Taylor Swift,11,1234567
Drake,0,1
Here is the solution for the lowest_rated band.
def lowest_rated(path: TextIO) -> str:
    return "Drake"
(This is a joke.)
Exercise 4 Extend the previous answer to take an arbitrary file with a header of attributes like
name, grade, age
and write a function
def most(path: str, attribute: str) -> str:
that finds the maximum of that attribute.
Exercise 5 Write a function
def read_board(fname: str) -> list[list[int]]:
that, given a file containing a partially filled Sudoku board (see below), returns a list of lists of integers with None in unfilled spots.
board.dat
685|13 | 47
7 | | 1
1 |764| 5
-----------
9 | 7 |5 4
8 1| 9| 72
4 3| 6|
-----------
|427|39
4 |9 | 68
1 7| |4
Exercise 6 Write a function
def has_won(board: list[list[int]]) -> bool:
that returns True when the Sudoku board is solved (contains the digits \(1\) through \(9\) in each row, column, and \(3 \times 3\) grid).
Writing to Files
with open("numbers.dat", "w") as file:
    # Write a single string.
    file.write("Hello World.\n")
    # Write a list of strings.
    file.writelines(["Hello\n", "World\n"])
Opening a file for writing creates numbers.dat, or overwrites the old one if it exists. Overwriting discards the previous contents, as if the file had been deleted first.
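The overwriting behaviour can be checked with a short sketch: write to a file, open it for writing a second time, and read back what is left. The file lives in a temporary directory here (an assumption, so that no real numbers.dat is touched), and the name remaining is illustrative.

```python
import os
import tempfile

# Assumption: a scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "numbers.dat")

# First write: the file is created.
with open(path, "w") as file:
    file.write("1\n2\n3\n")

# Second write: opening with "w" discards the old contents.
with open(path, "w") as file:
    file.writelines(["Hello\n", "World\n"])

# Read back what survived.
with open(path, "r") as file:
    remaining = [line for line in file]

print(remaining)  # ['Hello\n', 'World\n']
```

Only the lines from the second write remain; the numbers written first are gone.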
Appending to Files
Appending will open a file without over-writing, instead appending to the end of the file.
A file is created if one does not exist.
with open("numbers.dat", "a") as file:
    file.write("Hello World.\n")
Summary
We can read strings from files and write strings to files. The file pointer moves every time a line is read from the file. Therefore, to read a line twice, the file pointer must be reset, for example by opening the file again.