Tutorial: Python Essentials
If you don't have Python installed, you can download it from python.org. Alternatively, you can install the Anaconda distribution, which already includes many of the libraries that we are going to use.
In this course, we use Python 3.x. Python 3 is not backward-compatible with Python 2, so make sure that you are using Python 3.
Make sure that pip is available, or install it; pip is a Python package manager that allows us to easily install third-party packages. It is also worth getting IPython, a Python shell that is easy to work with. To install IPython:
pip install ipython
Writing Python Code: White Space Formatting
Python uses indentation to delimit blocks of code. For example:
for i in [1, 2, 3, 4, 5]:
    print(i)                    # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print(j)                # first line in "for j" block
        print(i + j)            # last line in "for j" block
    print(i)                    # last line in "for i" block
print("done looping")
This makes Python code readable, but it also means that we need to be careful with our formatting. Whitespace is ignored inside parentheses and brackets, which can be helpful for long computations:
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 +
                           13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)
and for making code easier to read:
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
easier_to_read_list_of_lists = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
We can also use a backslash to indicate that a statement continues onto the next line:
two_plus_three = 2 + \
                 3
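To see why indentation matters, here is a minimal sketch in which two loops differ only in how far one line is indented. The variable names are illustrative and not part of the course material.

```python
# Two nested loops that differ only in the indentation of one line.
total_inside = 0
for i in [1, 2, 3]:
    for j in [10, 20]:
        total_inside += i      # inside the "for j" block: runs 3 * 2 = 6 times

total_outside = 0
for i in [1, 2, 3]:
    for j in [10, 20]:
        pass                   # the inner loop does nothing here
    total_outside += i         # inside the "for i" block only: runs 3 times

print(total_inside)    # 12
print(total_outside)   # 6
```

Moving a single line one level to the left changes which block it belongs to, and therefore how often it runs.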
Modules
Certain features of Python are not loaded by default. These include both features that are part of the language and third-party features that you download yourself. In order to use these features, you'll need to import the modules that contain them.
One approach is to simply import the module itself:
import re
my_regex = re.compile("[0-9]+", re.I)
Here re is the module containing functions and constants for working with regular expressions. After this type of import, you can access those functions only by prefixing them with re. (for example, re.compile).
If you already had a different re in your code, you could use an alias:
import re as regex
my_regex = regex.compile("[0-9]+", regex.I)
You might also do this if your module has an unwieldy name or if you're going to be typing it a lot. For example, when visualizing data with matplotlib, a standard convention is:
import matplotlib.pyplot as plt
If you need a few specific values from a module, you can import them explicitly and use them without qualification:
from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()
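As a rough illustration of the two import styles working together, the sketch below counts words in a made-up sentence. The sample text and variable names are assumptions chosen for the example.

```python
import re                                       # whole-module import: use re.<name>
from collections import Counter, defaultdict    # explicit import: no prefix needed

text = "the quick brown fox jumps over the lazy dog the end"  # made-up sample

# re.findall returns every non-overlapping match of the pattern
words = re.findall("[a-z]+", text)

# Counter maps each distinct word to the number of times it occurs
word_counts = Counter(words)

# defaultdict(int) supplies 0 for a missing key, so += works directly
first_letter_counts = defaultdict(int)
for word in words:
    first_letter_counts[word[0]] += 1

print(word_counts["the"])          # 3
print(first_letter_counts["t"])    # 3
```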
Functions
A function is a rule for taking zero or more inputs and returning a corresponding output. In Python, we define functions using def:
def double(x):
    """this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its input by 2"""
    return x * 2
Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other argument:
def apply_to_one(f):
    """calls the function f with 1 as its argument"""
    return f(1)
my_double = double            # refers to the previously defined function
x = apply_to_one(my_double)   # equals 2
It is also easy to create short anonymous functions or lambdas:
y = apply_to_one(lambda x: x + 4)   # equals 5
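The following sketch shows a few more ways functions can be passed around, assuming a hypothetical helper named apply_twice that is not part of the original tutorial. It also passes a function to the built-in sorted through its key parameter.

```python
def double(x):
    return x * 2

def apply_twice(f, x):
    """calls f on x, then calls f again on the result (hypothetical helper)"""
    return f(f(x))

eight = apply_twice(double, 2)             # double(double(2)) == 8
seven = apply_twice(lambda x: x + 3, 1)    # (1 + 3) + 3 == 7

# built-ins such as sorted also accept a function as an argument
by_abs = sorted([-4, 1, -3, 2], key=abs)   # [1, 2, -3, -4]
```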
Function parameters can also be given default arguments, which only need to be specified when you want a value other than the default:
def my_print(message="my default message"):
    print(message)
my_print("hello")   # prints 'hello'
my_print()          # prints 'my default message'
It is sometimes useful to specify arguments by name:
def subtract(a=0, b=0):
    return a - b
subtract(10, 5)   # returns 5
subtract(0, 5)    # returns -5
subtract(b=5)     # same as previous
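Default values and named arguments can be combined with ordinary positional parameters. The sketch below uses a hypothetical function named power to show this; the name and values are only for illustration.

```python
def power(base, exponent=2):
    """raises base to exponent; exponent defaults to 2 (hypothetical example)"""
    return base ** exponent

square = power(3)                      # uses the default exponent: 9
cube = power(3, exponent=3)            # overrides the default by name: 27
swapped = power(exponent=3, base=2)    # named arguments can come in any order: 8
```

Naming the arguments at the call site makes it harder to mix up their order by accident.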
Strings
Strings can be delimited by single or double quotation marks:
single_quoted_string = 'DD2358'
double_quoted_string = "DD2358"
Python uses backslashes to encode special characters. For example:
tab_string = "\t"   # represents the tab character
len(tab_string)     # is 1
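If we want the backslash itself rather than a special character, Python offers raw strings, and triple quotes allow a string to span several lines. A brief sketch with illustrative variable names:

```python
tab_string = "\t"         # one character: a tab
not_tab_string = r"\t"    # raw string: a backslash followed by the letter t

# triple-quoted strings may span multiple lines
multi_line_string = """This is the first line.
and this is the second line"""

print(len(tab_string))                  # 1
print(len(not_tab_string))              # 2
print(multi_line_string.count("\n"))    # 1
```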
Lists
Probably the most fundamental data structure in Python is the list. A list is simply an ordered collection. This is similar to what in other languages might be called an array, but with some added functionality:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]
list_length = len(integer_list)   # equals 3
list_sum = sum(integer_list)      # equals 6
We can obtain or set the nth element of a list with square brackets:
x = list(range(10))   # is the list [0, 1, ..., 9]
zero = x[0]           # equals 0, lists are 0-indexed
one = x[1]            # equals 1
nine = x[-1]          # equals 9, 'Pythonic' for last element
eight = x[-2]         # equals 8, 'Pythonic' for next-to-last element
x[0] = -1             # now x is [-1, 1, 2, 3, ..., 9]
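One detail worth checking in Python 3 is that range(10) on its own is not a list. It can be indexed, but it cannot be modified until it is converted with list. The sketch below illustrates this under that assumption, with illustrative names.

```python
r = range(10)
first, last = r[0], r[-1]          # indexing a range works: 0 and 9
is_list = isinstance(r, list)      # False: a range is its own type

numbers = list(r)                  # an actual list [0, 1, ..., 9]
numbers[0] = -1                    # lists can be modified in place

range_is_read_only = False
try:
    r[0] = -1                      # a range does not support item assignment
except TypeError:
    range_is_read_only = True

print(numbers[:3])                 # [-1, 1, 2]
print(range_is_read_only)          # True
```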
We can also use square brackets to “slice” lists:
first_three = x[:3]                  # [-1, 1, 2]
three_to_end = x[3:]                 # [3, 4, ..., 9]
one_to_four = x[1:5]                 # [1, 2, 3, 4]
last_three = x[-3:]                  # [7, 8, 9]
without_first_and_last = x[1:-1]     # [1, 2, ..., 8]
copy_of_x = x[:]                     # [-1, 1, 2, ..., 9]
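Slices also accept an optional third value, the step, and a slice always produces a new list rather than a view of the old one. A small sketch with illustrative names:

```python
x = list(range(10))          # [0, 1, ..., 9]

every_other = x[::2]         # [0, 2, 4, 6, 8]
reversed_x = x[::-1]         # [9, 8, ..., 0]
one_to_seven_by_three = x[1:8:3]   # [1, 4, 7]

copy_of_x = x[:]             # a new list with the same elements
copy_of_x[0] = 100           # changing the copy leaves x untouched
print(x[0])                  # 0
```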
To concatenate lists together:
x = [1, 2, 3]
x.extend([4, 5, 6])   # x is now [1,2,3,4,5,6]
If we don't want to modify x, we can use list addition:
x = [1, 2, 3]
y = x + [4, 5, 6]   # y is [1, 2, 3, 4, 5, 6]; x is unchanged
More frequently we will append to lists one item at a time:
x = [1, 2, 3]
x.append(0)   # x is now [1, 2, 3, 0]
y = x[-1]     # equals 0
z = len(x)    # equals 4
It is often convenient to unpack lists if you know how many elements they contain:
x, y = [1, 2]   # now x is 1, y is 2
It’s common to use an underscore for a value you’re going to throw away:
_, y = [1, 2]   # now y == 2, didn't care about the first element
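Unpacking only works when the number of names matches the number of elements; otherwise Python raises a ValueError. When the length is not known in advance, a starred name can collect the remainder. A minimal sketch, with illustrative names:

```python
mismatch_detected = False
try:
    x, y = [1, 2, 3]               # three elements, only two names
except ValueError:
    mismatch_detected = True

first, *rest = [1, 2, 3, 4]        # first is 1, rest is [2, 3, 4]

print(mismatch_detected)           # True
print(first, rest)                 # 1 [2, 3, 4]
```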
Tuples
Tuples are the immutable equivalent of lists. Anything we can do to a list that doesn't involve modifying it, we can also do to a tuple. We specify a tuple by using parentheses (or nothing) instead of square brackets:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4
my_list[1] = 3    # my_list is now [1, 3]
my_tuple[1] = 3   # ERROR: raises TypeError, we cannot modify a tuple!
Tuples are a convenient way to return multiple values from functions:
def sum_and_product(x, y):
    return (x + y), (x * y)
sp = sum_and_product(2, 3)      # equals (5, 6)
s, p = sum_and_product(5, 10)   # s is 15, p is 50
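Because tuples and unpacking work together, they also allow several variables to be assigned at once, including swapping two values without a temporary variable. The helper min_and_max below is a hypothetical example, not part of the tutorial.

```python
def min_and_max(values):
    """returns the smallest and largest value as a tuple (hypothetical helper)"""
    return min(values), max(values)

lo, hi = min_and_max([3, 1, 4, 1, 5])   # lo is 1, hi is 5

x, y = 1, 2     # multiple assignment
x, y = y, x     # swap: now x is 2, y is 1
```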
Dictionaries
Another fundamental data structure is a dictionary, which associates values with keys and allows you to quickly retrieve the value corresponding to a given key:
empty_dict = {}                     # empty dictionary
grades = {"Joel": 80, "Tim": 95}    # dictionary literal
We can look up the value for a key using square brackets:
joels_grade = grades["Joel"]   # equals 80
Dictionaries have a get method that returns a default value (instead of raising an exception) when we look up a key that's not in the dictionary:
joels_grade = grades.get("Joel", 0)      # equals 80
kates_grade = grades.get("Kate", 0)      # equals 0
no_ones_grade = grades.get("No One")     # default default is None
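For comparison, the sketch below shows what happens without get: looking up a missing key with square brackets raises a KeyError, and the in operator checks whether a key exists. It reuses the grades example, redefined here so that the block runs on its own.

```python
grades = {"Joel": 80, "Tim": 95}

try:
    kates_grade = grades["Kate"]     # "Kate" is not a key
except KeyError:
    kates_grade = None               # fall back when no grade is recorded

joel_has_grade = "Joel" in grades    # True
kate_has_grade = "Kate" in grades    # False
```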
You assign key-value pairs using the same square brackets:
grades["Tim"] = 99           # replaces the old value
grades["Kate"] = 100         # adds a third entry
num_students = len(grades)   # equals 3
We use dictionaries as a simple way to represent structured data:
tweet = {
    "user": "mister_x",
    "text": "interesting message",
    "retweet_count": 200
}
Besides looking for specific keys we can look at all of them:
tweet_keys = tweet.keys()       # view of the keys (wrap in list() for a list)
tweet_values = tweet.values()   # view of the values
tweet_items = tweet.items()     # view of the (key, value) tuples
Dictionary keys must be hashable, which in practice means immutable; in particular, we cannot use lists as keys.
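When a key needs several parts, a tuple is the usual choice. The sketch below uses a made-up grid dictionary to show a tuple key working and a list key being rejected with a TypeError.

```python
grid = {(0, 0): "origin", (1, 2): "some point"}   # tuples as keys (made-up data)
label = grid[(1, 2)]                               # "some point"

list_key_rejected = False
try:
    grid[[3, 4]] = "another point"                 # lists are not hashable
except TypeError:
    list_key_rejected = True

print(label)               # some point
print(list_key_rejected)   # True
```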
Control Flow
As in many programming languages, we can perform an action conditionally using if:
if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"
Python has a while loop:
x = 0
while x < 10:
    print(x)
    x += 1
although more often we use for and in:
for x in range(10):
    print(x)
If we need more-complex logic, we can use continue and break:
for x in range(10):
    if x == 3:
        continue   # go immediately to the next iteration
    if x == 5:
        break      # quit the loop entirely
    print(x)
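To make the effect of continue and break visible, the sketch below runs the same loop but collects the values in a list instead of printing them. The list name visited is illustrative.

```python
visited = []
for x in range(10):
    if x == 3:
        continue          # skip 3, go on with 4
    if x == 5:
        break             # stop the loop before 5 is recorded
    visited.append(x)

print(visited)            # [0, 1, 2, 4]
```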
Randomness
In HPC and scientific computing, we frequently need to generate random numbers, which we can do with the random module:
import random
four_uniform_randoms = [random.random() for _ in range(4)]
# [0.8444218515250481,     # random.random() produces numbers
#  0.7579544029403025,     # uniformly between 0 and 1
#  0.420571580830845,      #
#  0.25891675029296335]    #
The random module actually produces pseudorandom (that is, deterministic) numbers based on an internal state that you can set with random.seed if you want to get reproducible results:
random.seed(10)          # set the seed to 10
print(random.random())   # 0.57140259469
random.seed(10)          # reset the seed to 10
print(random.random())   # 0.57140259469 again
If you need to randomly pick one element from a list you can use random.choice:
my_best_friend = random.choice(["Alice", "Bob", "Charlie"])
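Seeding applies to every function in the random module, not only to random.random. As a rough check, the sketch below draws the same sequence of choices twice after resetting the seed; the seed value 42 is an arbitrary assumption.

```python
import random

friends = ["Alice", "Bob", "Charlie"]

random.seed(42)                                   # arbitrary seed
first_run = [random.choice(friends) for _ in range(5)]

random.seed(42)                                   # same seed again
second_run = [random.choice(friends) for _ in range(5)]

print(first_run == second_run)                    # True
```

This is the property that makes seeded experiments repeatable from one run to the next.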
args and kwargs
Let's say we want to create a higher-order function that takes as input some function f and returns a new function that for any input returns twice the value of f:
def doubler(f):
    def g(x):
        return 2 * f(x)
    return g
This works in some cases:
def f1(x):
    return x + 1
g = doubler(f1)
print(g(3))    # 8 (== ( 3 + 1) * 2)
print(g(-1))   # 0 (== (-1 + 1) * 2)
However, it breaks down with functions that take more than a single argument:
def f2(x, y):
    return x + y
g = doubler(f2)
print(g(1, 2))   # TypeError: g() takes 1 positional argument but 2 were given
What we need is a way to specify a function that takes arbitrary arguments. We can do this with argument unpacking:
def magic(*args, **kwargs):
    print("unnamed args:", args)
    print("keyword args:", kwargs)
magic(1, 2, key="word", key2="word2")
# prints
# unnamed args: (1, 2)
# keyword args: {'key': 'word', 'key2': 'word2'}
That is, when we define a function like this, args is a tuple of its unnamed arguments and kwargs is a dict of its named arguments. It works the other way too if we want to use a list (or tuple) and dict to supply arguments to a function:
def other_way_magic(x, y, z):
    return x + y + z
x_y_list = [1, 2]
z_dict = {"z": 3}
print(other_way_magic(*x_y_list, **z_dict))   # 6
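With these two pieces, the doubler problem from earlier in this section can be addressed. The following is one possible sketch: the inner function accepts arbitrary arguments and passes them through to f unchanged. The name doubler_correct is an assumption made for this example.

```python
def doubler_correct(f):
    """works no matter what kind of inputs f expects (illustrative name)"""
    def g(*args, **kwargs):
        """whatever arguments g is supplied, pass them through to f"""
        return 2 * f(*args, **kwargs)
    return g

def f2(x, y):
    return x + y

g = doubler_correct(f2)
print(g(1, 2))        # 6
print(g(x=1, y=2))    # 6
```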