Python for Data Science

A Crash Course

Introduction to Python Programming

Khalil El Mahrsi
2023
Creative Commons License

Outline

Basics (variables, objects, built-in types, ...)
Control Flow (if, while, for statements, ...)
Functions (writing reusable code)
Errors and Exceptions (handling when things go wrong)
Data Structures (lists, tuples, dictionaries, ...)

Basics

What is Python?

High-level, interpreted, general-purpose programming language
Named after the Monty Python's Flying Circus TV series, not the snake species
One specification, multiple implementations

CPython: reference implementation, written in C, offers the highest compatibility with packages and extensions
PyPy, Jython (Java implementation), ...

Currently the most popular programming language for mainstream data science (but not as rich as R for niche needs)

Environment Setup Recommendations

Anaconda (https://www.anaconda.com/products/individual) is the most popular Python distribution among Data Scientists

MKL-optimized binaries of popular data science packages
Easy package and environment management with Conda
Desktop GUI (Anaconda Navigator) for launching applications and managing environments

Recommended tools and IDEs

For learning: Jupyter Notebook or JupyterLab
For cleaner coding practices

Visual Studio Code + Python extension (my favorite)
PyCharm

Python 2 has been retired as of Jan. 1, 2020. Use Python 3.x (preferably 3.7+) only!

Executing Python Code

Two ways to run Python code

By using the Python interpreter in interactive mode
By executing Python scripts (source files)

Interacting With the Python Interpreter

The Python interpreter can be launched in interactive mode by typing the python command in the Unix shell (terminal)
The interpreter prompts for the next instruction with the primary prompt (>>>)
Instructions are read from the standard input (e.g., keyboard)
The results are directed to the standard output (e.g., screen)
Use quit() to exit the interpreter

% python
Python 3.8.5 (default, Aug  5 2020, 03:39:04) 
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> welcome_message = "Hello world!"
>>> print(welcome_message)
Hello world!
>>> 1 + 2
3
>>> quit() # exit() or [ctrl + d] ([ctrl + z] for Windows) also works
%

Using Python Source Files

The Python interpreter can also be invoked while passing the name of a Python script (.py source file) as an argument
The statements in the file are read and executed by the interpreter

Content of hello_world.py file


# -*- coding: utf-8 -*-
welcome_message = "Hello world!"
print(welcome_message)
1 + 2

Execution

% python hello_world.py
Hello world!
%

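Behavior when executing scripts might differ from interactive mode! Here, the value of the expression statement 1 + 2 is not displayed, whereas the interactive interpreter would echo 3.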
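
As a minimal sketch of that difference, the script below is a hypothetical variant of hello_world.py (the name result is an assumption, not part of the original file). It stores the value of 1 + 2 and prints it explicitly, since a bare expression statement produces no output when run as a script.

```python
# Hypothetical variant of hello_world.py
welcome_message = "Hello world!"
print(welcome_message)

1 + 2            # evaluated, but nothing is displayed when run as a script
result = 1 + 2   # keep the value by binding it to a name (assumed name)
print(result)    # an explicit print() call makes the value visible
```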

Statements and Expressions

Instructions that can be executed by the Python interpreter are called statements

A simple statement is contained within a single logical line

Expressions are combinations of values, variables, operators, and function calls that the interpreter can evaluate

The evaluation of an expression produces a value
The expression's content is evaluated in a specific order defined by the operator precedence


>>> welcome_message = "Hello world!" # this is an assignment statement
>>> print(welcome_message) # another statement
Hello world!
>>> 1 + 2 # expression statement with one expression combining the + operator with the values 1 and 2
3

Objects

In Python, everything is an object
Objects have types that define possible values and operations

"Hello world!" is an object of type str (string)
1 and 2 are objects of type int (integer)
Even the function print is an object of type builtin_function_or_method (more on functions later)
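
These claims can be checked with the built-in type() function. The sketch below assumes CPython (other implementations may report a different type name for print), and the variable names are only illustrative.

```python
# Inspect the types of a few objects (variable names are illustrative)
string_type = type("Hello world!")
integer_type = type(1)
function_type = type(print)

print(string_type.__name__)
print(integer_type.__name__)
print(function_type.__name__)  # on CPython: builtin_function_or_method

# Each of these values is an object, including the function itself
everything_is_object = all(
    isinstance(obj, object) for obj in ("Hello world!", 1, print)
)
```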

Variables

Variables are names used to reference objects
The (re)binding between names and objects is done through assignment statements

The expression on the right is evaluated and the name on the left is bound to the resulting object

Syntax


        variable_name = expression

Examples


>>> welcome_message = "Hello world!"
>>> print(welcome_message)
Hello world!
>>> a, b = 1, 2 # multiple assignments can be done in the same statement
>>> print(a)
1
>>> print(b)
2
>>> a += b # augmented assignment statement, equivalent to a = a + b
>>> print(a)
3

Naming Rules

Names given to variables (and functions) must adhere to three rules

The name can only contain letters (A-Z, a-z), digits (0-9), and underscores ( _ ); Python 3 also accepts most non-ASCII letters, but ASCII names are the norm
The first character must be a letter or an underscore (it cannot start with a digit)
The name must not be one of the keywords reserved by Python

Examples of valid names: a, B, foo, bar_2_3, _name, snake_case_name, mixedCaseName, CamelCaseName, UPPERCASENAME, ...
Examples of invalid names: 2items, return, name with spaces, ab>cd@3, X Æ A-12 (sorry, Elon Musk 😢), ...
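
The three rules can be checked programmatically. The sketch below combines str.isidentifier() with keyword.iskeyword() from the standard library; the helper name is_valid_name is an assumption made for illustration.

```python
import keyword

def is_valid_name(candidate):
    """Hypothetical helper: True if `candidate` can be used as a variable name."""
    # isidentifier() covers the allowed characters and the first-character rule,
    # iskeyword() rejects names reserved by Python
    return candidate.isidentifier() and not keyword.iskeyword(candidate)

for candidate in ["bar_2_3", "_name", "2items", "return", "name with spaces"]:
    print(candidate, "->", is_valid_name(candidate))
```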

Naming Rules

Python is case-sensitive! foo and Foo are two different names!

Always use meaningful variable and function names! e.g., name and age are better names than n and a!

Names starting with underscores have a particular (by convention) meaning. Avoid using such names unnecessarily!

The PEP 8 style guide recommends using snake_case names for variables (and functions).

Comments

A comment starts with the hash character (#) and ends at the end of a physical line
Comments are for humans; they are ignored by the interpreter

Excessive comments are considered to be a code smell (bad)!

Obvious or redundant comments must be avoided
Comments should explain why something is done in a specific way
The how (the code) should be self-explanatory (if that is not the case, it can very likely be simplified or restructured!)

Numbers

Python mainly provides three built-in numeric types

Integers (int) : 1, 3, -5, ...
Floating point numbers (float): 1., 2.45, 3.5e-6, ...
Complex numbers (complex): 1 + 2j, 4.5j, ...

Numbers are created by numeric literals or built-in functions and operators

Unadorned numeric literal → integer
Numeric literal containing a decimal point (.) or exponent sign (e) → floating point number
Numeric literal with appended j (or J) → complex number
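
A short sketch of how the form of a literal determines the resulting type, and of numbers created with built-in functions. The variable names are illustrative.

```python
# The form of the literal determines the type of the created object
print(type(42))      # unadorned literal -> int
print(type(42.0))    # decimal point -> float
print(type(3.5e-6))  # exponent -> float
print(type(4.5j))    # appended j -> complex

# Built-in functions also create numbers, e.g., from strings
from_int = int("42")
from_float = float("3.5e-6")
from_complex = complex(1, 2)  # real part 1, imaginary part 2
```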

Operations on Numbers

Numeric types support a wide variety of operations

Arithmetic (+, -, *, /, %, ...)
Comparisons (==, !=, <, >, <=, >=)
Type conversions
Type-specific methods

Execution of operations in the same statement follows a strict order (operator precedence)
Mixed binary arithmetic operations are fully supported: when the types of the operands differ, the “narrower” operand is widened


>>> 1 + 2 * 3

7

>>> (1 + 2) * 3 # () change operations' order

9

>>> 3 ** 2 # power, equivalent to pow(3, 2)

9

>>> 7.5 / 2. # quotient

3.75

>>> 7.5 // 2. # floored quotient

3.0

>>> 7.5 % 2. # remainder

1.5

>>> 2.5 * 1e2

250.0

>>> 2.5 == 3.75

False

>>> (2.5 <= 3.75) and (3.75 < 5)

True

>>> 2.5 <= 3.75 < 5 # this is also permitted

True

>>> float(3) # cast (convert) int to float

3.0

>>> 2 + 4.7 # 2 is converted to float before sum

6.7

>>> 1 + 3j + 2 + 0.4J # both j and J accepted

(3+3.4j)

>>> 1 + 4 j # beware of space in imaginary part

  File "<stdin>", line 1
    1 + 4 j
          ^
SyntaxError: invalid syntax

Booleans (Truth Values)

Two boolean values, the constants False and True, are used by Python to represent truth values
Any Python object can be tested for truth value or used in boolean operations (cf. next slide)
Objects that are considered false

The False and None constants
Zero in any numeric type: 0, 0.0, 0j, ...
Empty sequences and collections: "", {}, (), [], ... (more on these later)

Booleans behave like integers (0 and 1) in numeric contexts (e.g., with arithmetic operations)
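
The rules above can be verified with bool() and a little arithmetic. This is only a sketch; the list and variable names are illustrative.

```python
# Objects considered false (the list name is illustrative)
falsy_objects = [False, None, 0, 0.0, 0j, "", {}, (), []]
all_false = not any(falsy_objects)  # True: none of them is considered true

# Non-zero numbers and non-empty sequences are considered true
truthy_flags = [bool(-4.75), bool("0"), bool([0])]

# Booleans behave like 0 and 1 in arithmetic, which makes counting easy
passed = [True, False, True, True]
number_passed = sum(passed)
```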

Truth Value Testing

Comparisons

Operation	Meaning
`<`	Strictly less than
`<=`	Less than or equal to
`>`	Strictly greater than
`>=`	Greater than or equal to
`==`	Equal to
`!=`	Not equal to
`is`	Object identity
`is not`	Negated object identity
`in`*	Member of
`not in`*	Not member of

* Supported by iterable types (more details later)

<, <=, >, and >= are only defined where it makes sense

Objects of different types (except numeric types) never compare equal

Boolean Operations

Operation	Result
`x or y`	if `x` is false, then `y`, else `x`
`x and y`	if `x` is false, then `x`, else `y`
`not x`	if `x` is false, then `True`, else `False`

and and or are short-circuit operators: the second argument is not always evaluated

and and or return one of their operands, not boolean results

Other operations and built-in functions that have boolean results return 0 or False for false and 1 or True for true
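
To make short-circuiting visible, the sketch below uses a hypothetical helper, check(), that records every operand that actually gets evaluated. It also shows a common idiom that relies on or returning one of its operands.

```python
evaluated = []

def check(value):
    """Hypothetical helper: records that `value` was evaluated, then returns it."""
    evaluated.append(value)
    return value

first = check(0) or check("fallback")   # 0 is false, so the second operand is evaluated
second = check(0) and check("skipped")  # 0 is false, so the second operand is never evaluated
third = check(7.5) or check(True)       # 7.5 is true, so it is returned as is

display_name = "" or "anonymous"        # idiom: fall back to a default value
```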

Truth Value Testing Examples


>>> (1 + 3) == 4

True

>>> 4 == 4.0 # different numeric types, same value

True

>>> 4 == "4" # "4" is non-numeric (string)

False

>>> 3 * False + 5. * True + True

6.0

>>> (3 < 5) and (6 < 2) # both operands evaluated

False

>>> (3 < 5) or (6 < 2) # only (3 < 5) evaluated

True

>>> (3 > 5) and (2 < 6) # only (3 > 5) evaluated

False

>>> bool(-4.75) # cast to boolean

True

>>> 7.5 or True # and and or return operands (!!!)

7.5

>>> True or 7.5 # and and or return operands (!!!)

True

>>> 1 < (1 + 4j) # < does not make sense

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'complex'


>>> # don't worry if you don't fully understand the next
>>> # examples
>>> animals = ["cat", "dog", "chicken"] # a list of animals
>>> cat in animals

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'cat' is not defined

>>> "cat" in animals # check membership

True

>>> ("dog" in animals) and ("elephant" not in animals)

True

>>> domestic_animals = ["cat", "dog", "chicken"]
>>> domestic_animals == animals # equality test

True

>>> domestic_animals is animals # identity test

False

Strings

Strings are sequences of characters that represent textual data
Handled with str objects
String literals can be written in three ways

Single quotes: 'This is a string'
Double quotes: "This is another string"
Triple quotes (can span multiple lines): """Yet another string""" or '''Same but with single quotes'''

Be consistent: choose either single or double quotes and stick with your choice!

Special characters, e.g., quotes (') in single quote strings, can be escaped using backslash (\)

Useful escape sequences: \' (single quote), \" (double quote), \n (line feed), \t (horizontal tab), \v (vertical tab), ...

String Operations and Methods

Strings can be concatenated with the + operator (can be incompatible with other types)
Strings support indexing and slicing string[start:end:step]

Positions always start at 0
start is included, end is excluded
start omitted → 0
end omitted → last_index + 1
step omitted → 1
Negative indexes count from the end of the string (-1 is the last character); a negative step traverses the string in reverse order

Strings implement many useful methods for text manipulation
- lower(), upper(), ... → change letters casing
- lstrip(), rstrip(), strip(), ... → remove leading and/or trailing whitespace
- islower(), isalpha(), isdecimal(), ... → verify properties


>>> "Hello" + "world" + "!" # string concatenation

'Helloworld!'

>>> "John's age is " + 30 # try to concatenate with int

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

>>> "John's age is " + str(30)

"John's age is 30"

>>> 3 * "Hello! "

'Hello! Hello! Hello! '

>>> alphabet = "abcdefghijklmnopqrstuvwxyz"
>>> alphabet[0]

'a'

>>> alphabet[-1] # negative index: counts from the end

'z'

>>> alphabet[:4] # slicing (same as alphabet[0:4])

'abcd'

>>> alphabet[4:7] # 5th to 7th character

'efg'

>>> alphabet[20:] # slice from 21st character to end of string

'uvwxyz'

>>> alphabet[::2] # one every two characters

'acegikmoqsuwy'

>>> alphabet[-1:-6:-1] # last five characters in reverse order

'zyxwv'

>>> book = "The Hound of the Baskervilles"
>>> book.upper()

'THE HOUND OF THE BASKERVILLES'

>>> book.split() # split into a list of words

['The', 'Hound', 'of', 'the', 'Baskervilles']

>>> "      Too much trailing whitespace        ".strip()

'Too much trailing whitespace'

>>> "Hi!".isalpha()

False

>>> "1234".isnumeric()

True

String Formatting

Strings can be constructed from other values and variables by using the format() method

The format string contains replacement fields, specified using curly braces {}
Replacement fields

Are substituted by the provided values
Can contain names (e.g., {name}, {age}, ...) of keyword arguments or numeric indexes (e.g., {0}, {1}, ...) of positional arguments
Can contain format specifications (e.g., :%, :.2f, ...) that customize their presentation

Syntax


format_string.format(value_1, value_2, ...)

String Formatting Examples


>>> author = "Arthur Conan Doyle"
>>> book = "The Hound of the Baskervilles"
>>> birthplace = "Edinburgh"
>>> print("{} was originally published in {}.".format(book, 1902))

The Hound of the Baskervilles was originally published in 1902.

>>> print("{1} is the author of {0}.".format(book, author)) # using indexes of positional arguments

Arthur Conan Doyle is the author of The Hound of the Baskervilles.

>>> print("{name} was born in {birthplace}.".format(name=author, birthplace=birthplace)) # using names of keyword arguments

Arthur Conan Doyle was born in Edinburgh.

>>> print("You scored {:.2%} on the exam!".format(0.95)) # format as percentage with 2 decimal places

You scored 95.00% on the exam!

>>> print("The world's population in {year} was {population:,.0f}.".format(year=2018, population=7.59e9))

The world's population in 2018 was 7,590,000,000.

Mutable vs. Immutable Objects

An immutable object can't be modified after it is created

Operations on immutable objects yield new objects

int, float, bool, and str objects are all immutable
Mutable objects can be modified after they are created

Operations on mutable objects can either modify the object in place or yield new objects (so read the docs)

Mutables you will use the most (detailed in Data Structures section)

Lists (list): ordered sequence of objects

e.g., ["cat", "dog", "chicken"]

Sets (set): unordered collections of unique objects

e.g., {"cat", "dog", "chicken"}

Dictionaries (dict): key → value mappings

e.g., {"name" : "Cloud", "age": 21, "job": "mercenary"}
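
The difference between in-place modification and the creation of a new object can be illustrated with two standard ways of sorting a list: the built-in sorted() function and the list.sort() method. The variable names in this sketch are illustrative.

```python
animals = ["dog", "cat", "chicken"]
alias = animals                    # a second name bound to the same list

sorted_animals = sorted(animals)   # yields a new list, animals is untouched
print(animals)
print(sorted_animals)

return_value = animals.sort()      # sorts in place and returns None
print(animals)
print(alias)                       # the alias sees the change: same object

text = "Meat"
shouted = text.upper()             # str is immutable: a new string is created
```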

Mutable vs. Immutable Objects


>>> text = "Meat"
>>> id(text)  # check object's identity

140385175741424

>>> text[2] = "e"  # trying to modify the string will raise an exception (result in an error)
  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

>>> text = text + " is delicious!"
>>> print(text)

Meat is delicious!

>>> id(text) # text is now bound to a new object

140385175731376

>>> animals = ["cat", "dog", "chicken"]
>>> id(animals)
140385175649664

>>> animals[2] = "duck" # try to replace the third member of the list
>>> animals
['cat', 'dog', 'duck']

>>> id(animals) # re-check the object's identity (it remains the same)
140385175649664

Control Flow

Python executes statements in a sequential (i.e., top-down) order
Control flow statements make it possible to alter this flow

Execute one or more statements only if a specific condition is met (if statement)
Repeatedly execute one or more statements...

... as long as a specific condition is met (while statement)
... for each member of a sequence of objects (for statement)

if, while, and for statements are examples of compound statements, i.e., statements that contain and affect the execution of (groups of) other statements.

`if` Statements

The statement is executed as follows

The truth value of the expression in the if clause's header is evaluated
If (and only if) it evaluates to true, then the block of indented statements (called a suite) is executed
The rest of the program is executed

Syntax (simple form)


if expression:
    statement_1
    statement_2
    ...
    statement_n
rest_of_program

Flow diagram (simple form)

`if` Statements

The statement is executed as follows

If the expression in the if clause's header is true

That clause's suite is executed
The rest of the program is executed

If not, the same principle is applied in order to the following elif (optional) clauses
If all expressions are false

The suite of the else (optional) clause is executed
The rest of the program is executed

Only one suite is executed at most!

Syntax (general form)


if expression_1:
    statement_1
    ...
elif expression_2:
    ...
...
elif expression_n:
    ...
else:
    ...
rest_of_program

`if` Statements

Flow diagram (general form)

`if` Statements Example

Code


number = int(input("Please enter an integer: ")) # ask user to provide input
if (number % 2):
    print("{} is an odd number".format(number))
elif not ((number % 3) or (number % 4)):
    print("{} is even and can be divided by both 3 and 4".format(number))
elif number > 10:
    print("{} is even and greater than 10".format(number))
print("Finished") # rest of the program

Outputs of different executions


Please enter an integer: 1
1 is an odd number
Finished


Please enter an integer: 8
Finished


Please enter an integer: 12
12 is even and can be divided by both 3 and 4
Finished


Please enter an integer: 16
16 is even and greater than 10
Finished

`while` Statements

The statement is executed as follows

The expression in the while clause's header is evaluated
If the expression is true

The suite is executed
The while statement loops on itself (i.e., goes back to the expression evaluation step)

If the expression is false, the rest of the program is executed

If the suite neither contains a break statement nor alters how the expression evaluates, the while statement will loop indefinitely!

Syntax


while expression:
    statement_1
    statement_2
    ...
    statement_n
rest_of_program

Flow diagram

`while` Statements Examples


>>> i = 0
>>> while (i < 5):
...     print(i)
...     i += 1 # if i never changes, the loop will print 0 indefinitely
... 

0
1
2
3
4

>>> i, j = 0, 1
>>> while (i < 10):
...     j += 1
...     print("i = {}. j = {}.".format(i, j))
...     if (j == 4): # if j reaches 4, break out of the loop
...         break
... 

i = 0. j = 2.
i = 0. j = 3.
i = 0. j = 4.

`for` Statements

The for statement is executed as follows

The expression in the clause's header is evaluated to produce an iterable
For each object in the iterable

element is bound to that object
The suite is executed

Once the end of the iterable is reached, the rest of the program is executed

The loop is not executed if the iterable is empty.

for (and while) statements can contain an else clause, whose suite is executed when the loop ends without being interrupted by a break statement.

Syntax


for element in expression:
    statement_1
    statement_2
    ...
    statement_n
rest_of_program
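
A minimal sketch of a loop with an else clause, whose suite runs only when the loop finishes without executing a break statement. The data and the variable names are illustrative.

```python
numbers = [1, 3, 5, 7]
outcome = "found an even number"

for number in numbers:
    if number % 2 == 0:
        break  # leaving the loop this way skips the else suite
else:
    # reached only because the loop ran to the end of the list
    outcome = "no even number found"

print(outcome)
```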

Flow diagram

Iterables

An iterable is an object capable of returning its members one at a time

Iterables contain a collection of objects (members)...
... that can be iterated over (traversed) one by one

Iterables you will interact most with are sequences and collections

Sequences (ordered collections): strings (str), lists* (list), tuples* (tuple), ranges (range), ...
Other collections: sets* (set), frozen sets* (frozenset), dictionaries* (dict), ...

Iteration order is guaranteed for sequences and, since Python 3.7, for dictionaries (insertion order), but not for sets and frozen sets!

* Will be discussed in detail in the Data Structures section of the course
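
The sketch below traverses a few iterables by converting them to lists. It assumes Python 3.7+, where dictionaries preserve insertion order; the variable names are illustrative.

```python
# Sequences are traversed in index order
letters = list("abc")
numbers = list(range(3))

# Iterating over a dictionary yields its keys (in insertion order on Python 3.7+)
character = {"name": "Cloud", "age": 21, "job": "mercenary"}
keys = list(character)

# Sets give no ordering guarantee: sort them when a stable order is needed
unique_animals = {"cat", "dog", "chicken", "cat"}  # the duplicate is dropped
ordered_animals = sorted(unique_animals)
```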

`for` Statements Examples


>>> for char in "Hello": # you can iterate over strings (sequences of characters)
...     print(char)
... 

H
e
l
l
o

>>> for i in range(10): # range objects are immutable sequences of integers (0–9 here)
...     print(i)
...     if i == 5:
...         break # you can also break out of for loops
... 

0
1
2
3
4
5

>>> animals = ["cat", "dog", "chicken"] # a list of animals
>>> for animal in animals:
...     print(animal)
... 

cat
dog
chicken

>>> for i, animal in enumerate(animals): # use the enumerate() function if you need the index while looping
...     print("animals[{}] contains {}.".format(i, animal))
... 

animals[0] contains cat.
animals[1] contains dog.
animals[2] contains chicken.

Unidiomatic Control Flow

Using an additional state variable as an index to iterate over sequences is considered non-pythonic!


>>> animals = ["cat", "dog", "chicken"] # a list of animals
>>> # this is bad
>>> i = 0
>>> while (i < len(animals)): # len() returns the length (number of elements) of a sequence or a collection
...     print(animals[i])
...     i += 1
... 

cat
dog
chicken

>>> for i in range(len(animals)): # this is equally bad
...     print(animals[i])
... 

cat
dog
chicken

>>> for animal in animals: # the most idiomatic (pythonic) way
...     print(animal)
... 

cat
dog
chicken

`break` and `continue` Statements

Loop iterations can be skipped using break or continue statements

Can only occur in for and while loops
break → terminates the nearest enclosing loop
continue → continues with the next cycle of the nearest enclosing loop (terminates the current cycle only)


>>> for i in range(1, 10):
...     if not (i % 3):
...         continue # only iterations involving multiples of 3 are skipped
...     print(i)
...      

1
2
4
5
7
8

>>> for i in range(1, 10):
...     if not (i % 3):
...         break # as soon as the 1st multiple of 3 is encountered, all remaining iterations are skipped
...     print(i)
... 

1
2

Functions

Functions and Methods

Functions are reusable blocks of statements that can be executed (called) any number of times
Functions are one of the fundamental and most important building blocks in programming
Methods are functions associated with objects of a specific type
Benefits

Avoid code and logic redundancies (“Don't Repeat Yourself”)
Reusability and parametrized behavior
Easier code maintenance

Examples of functions used so far: print(), input(), len(), enumerate(), ...
Examples of methods used so far: format(), upper(), isnumeric(), isalpha() (all defined for str objects), ...

Defining Functions

Syntax


      def function_name(param_1, param_2, ...):
          statement_1
          statement_2
          ...

Functions are defined through def statements
Function definition statements are compound statements that contain

The function's signature
- The function's name
- A pair of parentheses (), potentially enclosing a sequence of comma-separated parameters (inputs used by the function)
The function's body: an indented block of statements, executed when the function is called

Executing the statement binds the function name in the current local namespace to a function object (the executable code)

The function definition does not execute the function body!
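
A small sketch showing that the body only runs when the function is called. The list calls is a hypothetical device used here to record executions of the body.

```python
calls = []

def say_hi():
    calls.append("body executed")  # runs only when the function is called
    return "Hi!"

calls_after_definition = len(calls)  # still 0: def only created the function object

greeting = say_hi()                  # the call executes the body
calls_after_call = len(calls)
```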

Calling Functions

Syntax


      function_name(arg_1, arg_2, ...)

Functions are executed through function calls
Function calls are executed as follows

The arguments are evaluated
A new namespace is created (more on those in a bit)
The formal parameters (in the function definition) are bound to the corresponding arguments in the new namespace
The function's body is executed
The namespace is discarded
The function call is replaced by a value (cf. next slide)


>>> def say_hi(): # function definition with no parameters
...     print("Hi!")
... 
>>> say_hi()  

Hi!

>>> def add(a, b): # function definition with two parameters a and b
...     print(a + b)
... 
>>> add(5, 10) # function call: a is bound to 5, b to 10, and the body is executed

15

`return` Statements

Syntax (inside function definition only)


      return expression_list

Example


>>> def add(a, b):
...     return a + b
... 
>>> a = add(7, 10) # a is not the parameter
>>> b = add(20, 5) # same for b
>>> c = add(a, b)
>>> print(c)

42

Functions can return values that can be used later on
This is done by using return statements
When a return statement is encountered

The function's execution is terminated
The value of the function call is the value of the expression list

If the function does not contain a return statement, then its value is None
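
The three behaviors described above (no return statement, an expression list with several values, and early termination) can be sketched as follows. The functions divide() and sign() are hypothetical examples.

```python
def say_hi():
    print("Hi!")  # no return statement: the call evaluates to None

def divide(dividend, divisor):
    """Hypothetical helper returning the floored quotient and the remainder."""
    return dividend // divisor, dividend % divisor  # expression list -> tuple

def sign(number):
    if number < 0:
        return "negative"  # execution stops here for negative numbers
    if number == 0:
        return "zero"
    return "positive"

greeting_value = say_hi()
quotient, remainder = divide(17, 5)  # the returned tuple can be unpacked
```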

Namespaces

A namespace is a collection of mappings between names and the objects they are bound to
Multiple namespaces can co-exist at the same time
Namespaces are isolated: the same name can exist in different namespaces with different bindings
Python has three namespace “types”

Built-in namespace

Created when the Python interpreter starts up
Contains all built-in names and exceptions

Global namespace

Created when the execution of a program starts
Lasts until the interpreter quits

Local (function) namespace

Created when a function is called
Deleted when the function finishes executing

Scopes

A scope is the part of the code where a namespace is directly accessible
To resolve a name, Python looks for it in the different namespaces in the following order

Local namespace (if the name reference is in a function)
Enclosing namespaces (from inner-most to outer-most)
Global namespace
Built-in namespace

global and nonlocal statements change how a name is resolved

global → direct reference to global namespace
nonlocal → direct reference to nearest enclosing namespace
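
As a compact sketch of how assignment interacts with name resolution: assigning to a name inside a function creates a new local binding unless the name is declared nonlocal (or global). The function names below are illustrative.

```python
def outer():
    count = 0

    def shadow():
        count = 100      # assignment creates a new name local to shadow()
        return count

    def rebind():
        nonlocal count   # refer to the binding in outer() instead
        count += 1
        return count

    # Evaluated left to right: shadow() leaves outer's count at 0,
    # rebind() changes it to 1
    return [shadow(), count, rebind(), count]

results = outer()
print(results)
```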

Scopes and Namespaces

Code


l = [1, 2, 3]

def outer_func():
    l = [4, 5, 6]

    def inner_func_a():
        l = [7, 8, 9]
        print("Inside inner_func_a:", l)

    def inner_func_b():
        global l # will be looked up in top-most scope -> global
        print("Inside inner_func_b:", l)

    def inner_func_c():
        nonlocal l  # looked up in nearest enclosing scope -> outer_func()
        print("Inside inner_func_c:", l)

    def inner_func_d():
        nonlocal l
        l = [10, 11, 12] # modify binding
        print("Inside inner_func_d:", l)

    def inner_func_e():
        nonlocal l
        l.append(42) # modify list
        print("Inside inner_func_e:", l)

    print("Before inner_func_* calls:", l)
    inner_func_a()
    print("After inner_func_a call:", l)
    inner_func_b()
    print("After inner_func_b call:", l)
    inner_func_c()
    print("After inner_func_c call:", l)
    inner_func_d()
    print("After inner_func_d call:", l)
    inner_func_e()
    print("After inner_func_e call:", l)

print("Before outer_func call:", l)
outer_func()
print("After outer_func call:", l)

Output


Before outer_func call: [1, 2, 3]
Before inner_func_* calls: [4, 5, 6]
Inside inner_func_a: [7, 8, 9]
After inner_func_a call: [4, 5, 6]
Inside inner_func_b: [1, 2, 3]
After inner_func_b call: [4, 5, 6]
Inside inner_func_c: [4, 5, 6]
After inner_func_c call: [4, 5, 6]
Inside inner_func_d: [10, 11, 12]
After inner_func_d call: [10, 11, 12]
Inside inner_func_e: [10, 11, 12, 42]
After inner_func_e call: [10, 11, 12, 42]
After outer_func call: [1, 2, 3]

Passing Arguments

Arguments can be passed to function calls in two ways

As positional arguments: parameters are bound to arguments in order of appearance (from left to right)
```
function_name(arg_1, arg_2, ...)
```
As keyword arguments: parameters are bound to the arguments they are associated with in the call (order ignored)

function_name(param_1=arg_1, param_2=arg_2, ...)

The number of arguments in the call must be the same as the number of parameters in the function's definition, unless the latter contains default parameter values (presented next)

Positional and keyword arguments can be mixed, in which case positional arguments must appear first in the function call!

Passing Arguments


>>> def add(a, b):
...     return a + b
... 
>>> add(1, 2) # 1 and 2 are positional args, params are bound in order: a -> 1, b -> 2

3

>>> add(b=3, a=4) # 3 and 4 are keyword args, bindings are based on name associations, not order: a -> 4, b -> 3

7

>>> add(4, b=10) # positional and keyword args can be mixed: a -> 4 (based on order), b -> 10 (based on name)

14

>>> add(a=1, 3) # positional args must appear first (once a kwarg appears, all those that follow must be kwargs)

  File "<stdin>", line 1
SyntaxError: positional argument follows keyword argument

>>> add(*[3, 5]) # (advanced) lists can be "unpacked" into positional args (equivalent to add(3, 5))

8

>>> add(**{ "b": 7, "a": 4}) # (advanced) dicts can be unpacked into kwargs (equivalent to add(a=4, b=7))

11

Passing Arguments

In Python, arguments are passed by object reference.

Modifications to a mutable argument inside the function will be visible to the caller
Rebindings inside the function will not change those of the caller

Passing Arguments


>>> def add_animal(animal, animal_list):
...     animal_list.append(animal) # add new animal to list
...     return animal_list
... 
>>> def reassign_animals(animal_list):
...     animal_list = ["dove", "sparrow", "eagle"] # rebind the name to a new list
...     return animal_list
... 
>>> animals = ["cat", "dog", "chicken"]
>>> 
>>> reassigned_animals = reassign_animals(animals)
>>> 
>>> print("animals:", animals) # animals is still bound to the same list

animals: ['cat', 'dog', 'chicken']

>>> print("reassigned_animals:", reassigned_animals)
reassigned_animals: ['dove', 'sparrow', 'eagle']

>>> new_animals = add_animal("elephant", animals)
>>> 
>>> print("animals:", animals) # animals did get altered
animals: ['cat', 'dog', 'chicken', 'elephant']

>>> print("new_animals:", new_animals)
new_animals: ['cat', 'dog', 'chicken', 'elephant']

Default Parameter Values

Function definitions can assign default values to parameters

def function_name(param_1, ..., param_i=expression_i, ..., param_n=expression_n):
    function_body

If a parameter has a default value, the corresponding argument can be omitted in function calls

The default value is used in this case

Parameters with no default values must precede parameters with default values in the function definition statement!

Default Parameter Values


>>> def add(a=1, b): # params without default values must precede those that have defaults
...     return a + b
... 

  File "<stdin>", line 1
SyntaxError: non-default argument follows default argument

>>> def add(a, b=1): # b has a default value of 1
...     return a + b
... 
>>> add(5) # equivalent to add(5, 1)

6

Default Parameter Values Pitfall

Default parameter values in function definitions are evaluated only once!

No re-evaluation each time the function is called
Mutable defaults (e.g., lists) can be mutated for subsequent calls

The following code...


def append_to(element, to=[]):
    to.append(element)
    return to

my_list = append_to(12)
print(my_list)

my_other_list = append_to(42)
print(my_other_list)

... outputs


[12]
[12, 42]
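
A common way to avoid this pitfall is to use None as the default value and to create the list inside the function body. The sketch below rewrites append_to() along those lines.

```python
def append_to(element, to=None):
    if to is None:
        to = []  # a fresh list is created on every call that omits `to`
    to.append(element)
    return to

my_list = append_to(12)
print(my_list)

my_other_list = append_to(42)  # no longer shares state with the first call
print(my_other_list)
```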

Recursion

A recursive function is a function that contains calls to itself (within its body)
A recursive function has

One or more base cases → the result is produced directly
One or more recursive cases → the function calls itself (recurs)

Example*


>>> def factorial(number):
...     if number > 0: # recursive case
...         return number * factorial(number - 1)
...     else: # trivial case (number == 0)
...         return 1
... 
>>> factorial(0) # solved directly

1

>>> factorial(3) # factorial(3) -> calls factorial(2) -> calls factorial(1) -> calls factorial(0)

6

* Something is fishy in this example. Can you point it out?

Function Annotations and Type Hinting

Function definitions can include annotations that provide type hints for parameters and return values (Python 3.5+)
The Python runtime does not enforce function and variable type annotations
But they can be used by third-party tools (IDEs, linters, type checkers, ...)

Syntax


      def function_name(param_1: type_1, param_2: type_2, ...) -> return_type:
          function_body

Example


>>> def add(a: int, b: int = 1) -> int:
...     return a + b
... 
>>> add(1, 2)

3

>>> add("Hello", "world") # this does not generate an error even though strings are used!

'Helloworld'
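
Although they are not enforced, annotations are stored on the function object and can be inspected through its __annotations__ attribute, which is what third-party tools can rely on. A minimal sketch:

```python
def add(a: int, b: int = 1) -> int:
    return a + b

# Annotations are kept in a plain dictionary on the function object
hints = add.__annotations__
print(hints)

# Nothing is checked at call time
unchecked_result = add("Hello", "world")
```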

Documenting Functions (Docstrings)

Syntax


      def function_name(...):
          """Function documentation (docstring)"""
          rest_of_function_body

The very first statement of a function's body can be a string literal that documents the function (docstring)

Can be accessed using the help() function
Can be used by tools such as Sphinx to generate documentation


>>> def add(a: int, b: int = 1) -> int:
...     """Add two integers.
... 
...     Args:
...         a (int): The first integer.
...         b (int, optional): The second integer. Defaults to 1.
... 
...     Returns:
...         int: The sum of the two integers.
...     """
...     return a + b
...
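
Besides help(), the docstring can be read programmatically. The sketch below uses the __doc__ attribute and the standard library's inspect.getdoc(), which also strips the indentation of the docstring body; the shortened docstring and the name summary are only illustrative.

```python
import inspect


def add(a: int, b: int = 1) -> int:
    """Add two integers.

    Returns:
        int: The sum of the two integers.
    """
    return a + b


# The raw docstring is stored on the function object
raw_docstring = add.__doc__

# inspect.getdoc() returns the docstring with its indentation cleaned up
summary = inspect.getdoc(add).splitlines()[0]
print(summary)  # Add two integers.
```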

Errors and Exceptions

Syntax errors occur when statements or expressions are not syntactically correct and can't be understood by the interpreter (e.g., a : is missing, a statement is wrongly indented, ...)
Exceptions are errors that occur during code execution (e.g., attempt to divide by zero)


>>> for i in [1 , 2, 3] print(i)

  File "<stdin>", line 1
    for i in [1 , 2, 3] print(i)
                            ^
SyntaxError: invalid syntax

>>> x = 3 / 0

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

Raising Exceptions

raise statements can be used to raise exceptions so that they can be handled by the surrounding code (cf. next slide)

Python provides built-in exception types for handling common errors (arithmetic errors, type errors, ...)
Unhandled exceptions immediately terminate the program


>>> def add(a: int, b: int = 1) -> int:
...     """Add two integers.
... 
...     Args:
...         a (int): The first integer.
...         b (int, optional): The second integer. Defaults to 1.
... 
...     Returns:
...         int: The sum of the two integers.
...     """
...     if not (isinstance(a, int) and isinstance(b, int)):
...         raise TypeError("Arguments must be of type int") # stops the function's execution
...     return a + b # only executed if no exceptions are raised
...

Handling Exceptions

Syntax


try:
    try_body
except (ErrorType1, ErrorType2, ...):
    except_body

Exceptions can be handled using try (compound) statements
The statement is executed as follows

The try clause is executed
If no exception occurs, the except clause is skipped
If an exception occurs

If its type matches one of those after the except keyword

The except clause is executed
Execution continues after the try statement

Otherwise, it's an unhandled exception → the execution stops


>>> try:
...     add("toto", 1)
... except TypeError:
...     print("Oops! Something went wrong!!!")
... 

Oops! Something went wrong!!!
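
The syntax above allows several exception types in a single except clause. The sketch below illustrates this with a hypothetical helper, safe_divide (not part of the original slides). It also uses the optional "as" keyword, an addition to the syntax shown above, to bind the raised exception to a name.

```python
def safe_divide(a, b):  # hypothetical helper, for illustration only
    try:
        return a / b
    except (ZeroDivisionError, TypeError) as error:
        # error is bound to the exception instance that was raised
        print("Could not divide:", type(error).__name__)
        return None


result_ok = safe_divide(6, 3)         # no exception: the except clause is skipped
result_zero = safe_divide(1, 0)       # ZeroDivisionError is handled
result_type = safe_divide("toto", 2)  # TypeError is handled
print(result_ok, result_zero, result_type)  # 2.0 None None
```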

Data Structures

Data structures are collections of related data (objects)
Python offers 4 built-in data structures

Lists (list): mutable sequences of objects
Tuples (tuple): immutable sequences of objects
Sets (set): collections of unique objects
Dictionaries (dict): mappings between key-value pairs

Lists

Lists (built-in type list) are sequences of objects
Properties

Ordered
Iterable
Mutable
Can contain objects of different types
Subscriptable: an item of the list can be selected (l[i])
Support slicing, i.e., selection of a range of objects (l[start:end:step])


>>> animals = ["cat", "dog", "chicken"]
>>> mixed_list = [1, "pen", True, [1., 2.5, 3+4j]]
>>> print(animals)

['cat', 'dog', 'chicken']

>>> print(mixed_list)
[1, 'pen', True, [1.0, 2.5, (3+4j)]]

Constructing Lists

Lists can be constructed in several ways

A pair of square brackets denotes an empty list: []
Using square brackets with comma-separated values inside: [item_1, item_2, ...]
Using list comprehensions (presented later)
Using the type constructor: list() (for an empty list) or list(iterable)
Using list concatenation: list_1 + list_2
Using list replication: n * l (concatenates n copies of the list l)

Constructing Lists


>>> empty_list = []
>>> print(empty_list)
      
[]

>>> print(len(empty_list)) # print the length (number of members) of the list

0

>>> alphabet = "abcdefghijklmnopqrstuvwxyz"
>>> alphabet_list = list(alphabet) # using the type constructor list() applied to an iterable
>>> print(alphabet_list)

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

>>> wild_animals = ["lion", "giraffe", "elephant"]
>>> domestic_animals = ["cat", "dog", "chicken"]
>>> animals = wild_animals + domestic_animals # list concatenation
>>> print(animals)
['lion', 'giraffe', 'elephant', 'cat', 'dog', 'chicken']

>>> 3 * [1, 2, 3] # concatenate three copies of the list

[1, 2, 3, 1, 2, 3, 1, 2, 3]

Accessing List Content

Lists can be iterated over using for loops
Lists can be indexed: l[i]

Positions start at 0
Negative indexes indicate reverse order (i.e., going back from the end of the list)

Lists support slicing: l[start:end:step]

start is inclusive
end is exclusive
Elements of the slice can be omitted

Defaults: start → 0; end → len(l); step → 1

Slicing produces a new list (based on the original list)

Accessing List Content


>>> animals = ["lion", "giraffe", "elephant", "cat", "dog", "chicken"]
>>> print(animals[0]) # remember: positions start at 0
    
lion

>>> print(animals[-3]) # third member from the end of the list

cat

>>> print(animals[:3]) # slice of the three first members of the list

['lion', 'giraffe', 'elephant']

>>> domestic_animals = animals[3:] # create a new list from the 4th position to the end of the list
>>> print(domestic_animals)

['cat', 'dog', 'chicken']

>>> print(animals[1::2]) # select every other animal, starting from the 2nd item

['giraffe', 'cat', 'chicken']

>>> domestic_animals[1] = "sheep" # you can use indexes to modify individual list members
>>> print(domestic_animals)
['cat', 'sheep', 'chicken']

>>> print(animals) # the original list from which domestic_animals was created was not modified
['lion', 'giraffe', 'elephant', 'cat', 'dog', 'chicken']

>>> animals[:4:2] = ["tiger", "zebra"] # slicing can also be used to modify multiple items simultaneously
>>> print(animals)

['tiger', 'giraffe', 'zebra', 'cat', 'dog', 'chicken']

Common List Methods and Operations*

len(l): length (number of members) of the list l
x in l: check if object x is a member of the list l

Evaluates to True if a member of l equals x, False otherwise
Negation: x not in l (check that x is not a member of l)

l.append(x): insert object x at the end of the list l
l.insert(i, x): insert object x at position i of the list l
l.count(x): number of occurrences of object x in the list l
l.remove(x): remove the first occurrence of x in the list l
l.pop(i): pop (remove) the member at position i and return it
l.clear(): empty the list l
l.sort(): sort the list l (in-place)
l.reverse(): invert the list l (in-place)
max(l), min(l): largest / smallest item in the list

* The supported sequence operations and their priorities are listed in the official Python documentation.

Common List Methods and Operations


>>> numbers = [2, 4, 27, 1, -5, 11, 3]
>>> len(numbers)
  
7

>>> 25 not in numbers # membership test

True

>>> 3 in numbers

True

>>> numbers.append(27) # append 27 at the end of the list
>>> print(numbers)

[2, 4, 27, 1, -5, 11, 3, 27]

>>> numbers.insert(3, 42)
>>> print(numbers)

[2, 4, 27, 42, 1, -5, 11, 3, 27]

>>> numbers.count(27)
2

>>> numbers.remove(27) # remove the first occurrence of 27 from the list
>>> print(numbers)
[2, 4, 42, 1, -5, 11, 3, 27]

>>> numbers.remove(33) # trying to remove an item that doesn't exist raises an exception

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list

>>> numbers.sort(reverse=True) # sort the list in reverse order (the list is directly modified)
>>> print(numbers)
  
[42, 27, 11, 4, 3, 2, 1, -5]

>>> n = numbers.pop(3) # pop the 4th item from the list and bind the name n to it
>>> print(numbers)

[42, 27, 11, 3, 2, 1, -5]
  
>>> print(n)

4

List Comprehensions

List comprehensions provide a concise syntax for creating lists based on other sequences or iterables

For each object in the iterable

element is bound to the object
expression is evaluated

The result is a list of all the values of the expression

Syntax


      [expression for element in iterable]

Example


>>> numbers_squared = [n**2 for n in range(1, 11)]
>>> print(numbers_squared)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

>>> # equivalent to
>>> numbers_squared = []
>>> for n in range(1, 11):
...     numbers_squared.append(n**2)
... 
>>> print(numbers_squared)
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Conditional List Comprehensions

List comprehensions can include if clauses to filter elements based on a condition

Syntax


      [expression for element in iterable if condition]

Example


>>> odd_numbers = [n for n in range(10) if (n % 2)]
>>> print(odd_numbers)

[1, 3, 5, 7, 9]

>>> # equivalent to
>>> odd_numbers = []
>>> for n in range(10):
...     if (n % 2):
...         odd_numbers.append(n)
... 
>>> print(odd_numbers)

[1, 3, 5, 7, 9]

Nested List Comprehensions

List comprehensions can also involve multiple for clauses

Syntax


      [expression for x in iterable_1 for y in iterable_2]

Example


>>> numbers = [2 * x + y for x in range(3) for y in range(0, 50, 10)]
>>> print(numbers)

[0, 10, 20, 30, 40, 2, 12, 22, 32, 42, 4, 14, 24, 34, 44]

>>> # equivalent to
>>> numbers = []
>>> for x in range(3):
...     for y in range(0, 50, 10):
...         numbers.append(2 * x + y)
... 
>>> print(numbers)

[0, 10, 20, 30, 40, 2, 12, 22, 32, 42, 4, 14, 24, 34, 44]

Tuples

Tuples (built-in type tuple) are sequences used typically to store collections of heterogeneous data
Properties

Ordered
Iterable
Immutable (unlike lists)
Can contain objects of different types
Subscriptable
Support slicing


>>> animals = ("cat", "dog", "chicken")
>>> mixed_tuple = (1, "pen", True, [1., 2.5, 3+4j])
>>> print(animals)

('cat', 'dog', 'chicken')

>>> print(mixed_tuple)
(1, 'pen', True, [1.0, 2.5, (3+4j)])

Constructing Tuples

Tuples can be constructed in several ways

A pair of empty parentheses denotes an empty tuple: ()
Using a trailing comma for a singleton tuple: (a,) or a,
Separating items with commas: (a, b, c) or a, b, c
Using the type constructor: tuple() (for an empty tuple) or tuple(iterable)
Using tuple concatenation: tuple_1 + tuple_2
Using tuple replication: n * t

It's actually the comma that makes the tuple, not the parentheses!

Constructing Tuples


>>> t = () # an empty tuple, equivalent to using the type constructor tuple()
>>> print(t)
    
()

>>> t = 1, # singleton tuple
>>> type(t)

<class 'tuple'>

>>> print(t)

(1,)

>>> t = (1, 2, 3) # same as t = 1, 2, 3
>>> print(t)
(1, 2, 3)

>>> t = (1, 2, 3) + (4, 5, 6) # tuple concatenation
>>> print(t)

(1, 2, 3, 4, 5, 6)


>>> t = 3 * (1, 2, 3) # tuple replication (concatenate tuple with itself multiple times)
>>> print(t)

(1, 2, 3, 1, 2, 3, 1, 2, 3)

>>> t = tuple(3*[[]]) # a tuple from a list of "three" empty lists
>>> print(t)

([], [], [])

>>> t[0].append("cat") # a little brain teaser: what is happening here???
>>> print(t)

(['cat'], ['cat'], ['cat'])
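
One way to investigate the teaser above is to check object identity with the is operator. The sketch below suggests that replication copies references to the same list rather than creating new lists, and shows one possible alternative that builds a separate list for each position; the names shared and independent are only illustrative.

```python
# Replication repeats references to the SAME inner list
shared = tuple(3 * [[]])
print(shared[0] is shared[1])  # True: a single list object appears three times

# A generator expression evaluates [] anew on each iteration
independent = tuple([] for _ in range(3))
independent[0].append("cat")
print(independent)  # (['cat'], [], [])
```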

Immutability, Revisited...

The name bindings inside tuples are immutable, but not necessarily the objects those names are bound to!


>>> t = ("John", 23, ["Pasta", "Pizza", "Tiramisu"]) # name, age, and list of favorite dishes
>>> t[0] = "Jane" # The name (immutable str) can't be changed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

>>> t[2] = ["Fondue", "Raclette", "Tartiflette"] # the binding to the list can't be changed...

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

>>> t[2][0] = "Raclette" # ... but the list itself (mutable) can!
>>> t[2].append("Fondue")
>>> print(t)
('John', 23, ['Raclette', 'Pizza', 'Tiramisu', 'Fondue'])

Tuple Operations

Tuples support common sequence operations

Operation	Description
`len(t)`	Length (number of items) of the tuple `t`
`x in t`	`True` if a member of `t` equals `x`, `False` otherwise
`x not in t`	`False` if a member of `t` equals `x`, `True` otherwise
`t.count(x)`	Number of occurrences of `x` in the tuple `t`
`t.index(x)`	Index of the first occurrence of `x` in the tuple `t`
`max(t)`	Largest item in the tuple `t`
`min(t)`	Smallest item in the tuple `t`

Tuple Operations


>>> numbers = (1, 5, 2, 7, 8, 11, -3, 7, 42)
>>> print(len(numbers))

9

>>> print(numbers[3::2]) # tuples are subscriptable and support slicing

(7, 11, 7)

>>> (5 in numbers) and (12 not in numbers) # membership tests

True

>>> numbers.count(7)

2

>>> numbers.index(7) # index of first occurrence of 7 in the tuple

3

>>> numbers.index(7, 5) # index of first occurrence of 7 starting from index 5

7

>>> print("Smallest number: {}. Largest number: {}.".format(min(numbers), max(numbers)))

Smallest number: -3. Largest number: 42.

>>> t = (i for i in range(10) if (i % 2)) # There are no tuple comprehensions!!!
>>> type(t)

<class 'generator'>
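
To obtain a tuple with a comprehension-like syntax, a generator expression can be passed to the tuple() type constructor (which accepts any iterable, as listed earlier). A minimal sketch, with odd_numbers as an illustrative name:

```python
# The generator expression is consumed by tuple() to build the tuple
odd_numbers = tuple(i for i in range(10) if i % 2)

print(type(odd_numbers))  # <class 'tuple'>
print(odd_numbers)        # (1, 3, 5, 7, 9)
```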

Enumerating Iterables

The enumerate() function builds a new iterable from another iterable (passed as an argument)

Each object of the new iterable is a 2-tuple containing

The index in the original iterable...
... and the corresponding object

Used when the index is needed while iterating over an iterable

Usually, the 2-tuple is unpacked
Better than unidiomatic alternatives with additional state variables


>>> animals = ["cat", "dog", "chicken"]
>>> for t in enumerate(animals):
...     print(t)
... 

(0, 'cat')
(1, 'dog')
(2, 'chicken')

>>> for i, animal in enumerate(animals):
...     print("The animal at position {} is {}".format(i, animal))
... 
The animal at position 0 is cat
The animal at position 1 is dog
The animal at position 2 is chicken
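
For comparison, the sketch below places the hand-maintained counter (the unidiomatic alternative mentioned above) next to enumerate(). It also uses enumerate()'s optional start argument, which is not covered in the slides, to begin counting at 1. The names manual, idiomatic, and ranked are only illustrative.

```python
animals = ["cat", "dog", "chicken"]

# Unidiomatic: the index is an extra state variable maintained by hand
i = 0
manual = []
for animal in animals:
    manual.append((i, animal))
    i += 1

# Idiomatic: enumerate() produces the same (index, object) pairs
idiomatic = list(enumerate(animals))
print(manual == idiomatic)  # True

# The optional start argument changes the first index
ranked = ["{}. {}".format(rank, animal) for rank, animal in enumerate(animals, start=1)]
print(ranked)  # ['1. cat', '2. dog', '3. chicken']
```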

Sets

Sets (built-in type set) are unordered collections of unique hashable objects
Sets are very helpful when mathematical set operations (intersection, union, ...) are involved
Properties

Unordered
Iterable
Mutable (for immutable sets, use the frozenset built-in type)
Can contain objects of different types
Can only contain objects that are hashable
Not subscriptable
Do not support slicing

Hashing consists of mapping objects of varying sizes to fixed-size representations (hash values).
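
Hashability can be explored with the built-in hash() function. In the sketch below, is_hashable is a hypothetical helper (not a built-in) that reports whether hash() accepts an object; it relies on hash() raising a TypeError for unhashable objects such as lists.

```python
def is_hashable(obj):  # hypothetical helper, for illustration only
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Objects that compare equal have the same hash value
print(hash((1, 2)) == hash((1, 2)))  # True
print(hash(1) == hash(1.0))          # True, since 1 == 1.0

print(is_hashable("Hello"))   # True
print(is_hashable((1, 2)))    # True
print(is_hashable([1, 2]))    # False: lists are mutable and unhashable
print(is_hashable((1, [2])))  # False: a tuple containing a list is unhashable too
```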

Constructing Sets

Sets can be constructed in different ways

Using curly brackets with comma-separated values inside: {item_1, item_2, ...}
Using the type constructor: set() (for an empty set) or set(iterable)
Using set comprehensions

{} does not produce an empty set!

Concatenations and replications cannot be used to construct sets.

Constructing Sets


>>> type({}) # this is an empty dict (presented next), not an empty set (use set() instead)

<class 'dict'>

>>> s = {"Hello", 1, 3.14, 5+4j, (1, 2)} # type mixing is allowed...
>>> print(s)

{'Hello', 1, (1, 2), 3.14, (5+4j)}

>>> s = { "Hello", 1, 3.14, 5+4j, (1, 2), [1, 4, 5] } # ... as long as you don't use unhashables

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> numbers = {1, 5, 2, 7, -2, 7, 4, 42, -8, 36, 42} # sets can't contain duplicates
>>> print(numbers)

{1, 2, 4, 5, 36, 7, 42, -8, -2}

>>> s = {1, 2, 3} + {4, 5, 6} # sets do not support concatenations (or replication)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'set' and 'set'

>>> s = set(range(10)) # set() can be used to construct sets from iterables
>>> print(s)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Set Comprehensions

Set comprehensions can be used to construct sets efficiently

Similar syntax to list comprehensions (using {} instead of [])
Nesting and conditional comprehensions are supported
Additional constraint: expression must evaluate to a hashable

Syntax


      {expression for element in iterable}

Examples


>>> numbers_squared = {n**2 for n in range(1, 11)}
>>> print(numbers_squared)

{64, 1, 4, 36, 100, 9, 16, 49, 81, 25}

>>> s = { (i, j) for i in range(4) for j in range(4) if i < j}
>>> print(s)
{(0, 1), (1, 2), (1, 3), (2, 3), (0, 3), (0, 2)}

Set Operations

Sets implement extremely useful mathematical set operations

Operation	Description
`x in s`	Returns `True` if `x` is an element of `s`
`x not in s`	Returns `True` if `x` is not an element of `s`
`s1 == s2`	Returns `True` if `s1` and `s2` contain the same elements
`s1.isdisjoint(s2)`	Returns `True` if `s1` and `s2` have no elements in common
`s1.issubset(s2)` `s1 <= s2`	Tests if `s1` is a subset of `s2` (use `s1.issuperset(s2)` or `s1 >= s2` for the other way around)
`s1 < s2`	Tests if `s1` is a proper subset of `s2` (use `s1 > s2` for the other way around)
`s1.intersection(s2)` `s1 & s2`	Returns a new set containing the elements common to both `s1` and `s2`
`s1.union(s2)` `s1 | s2`	Returns a new set containing the elements that are in `s1`, in `s2`, or in both
`s1.difference(s2)` `s1 - s2`	Returns a new set containing the elements that are in `s1` but not in `s2`
`s1.symmetric_difference(s2)` `s1 ^ s2`	Returns a new set containing the elements that are in either `s1` or `s2` but not both

Set Operations


>>> s1 = set(range(0, 30, 3))
>>> print(s1)

{0, 3, 6, 9, 12, 15, 18, 21, 24, 27}

>>> s2 = set(range(0, 30, 4))
>>> print(s2)

{0, 4, 8, 12, 16, 20, 24, 28}

>>> (9 in s1) and (12 in s2) # membership testing

True

>>> s1 & s2 # equivalent to s1.intersection(s2)

{0, 24, 12}

>>> s1 | s2 # equivalent to s1.union(s2)

{0, 3, 4, 6, 8, 9, 12, 15, 16, 18, 20, 21, 24, 27, 28}

>>> s1 - s2 # equivalent to s1.difference(s2)

{3, 6, 9, 15, 18, 21, 27}

>>> s2 - s1

{4, 8, 16, 20, 28}

>>> s1 ^ s2 # equivalent to s1.symmetric_difference(s2)

{3, 4, 6, 8, 9, 15, 16, 18, 20, 21, 27, 28}

>>> s1 <= s2 # equivalent to s1.issubset(s2)

False
  
>>> {3, 6, 9} <= s1

True

>>> s1 < s1 # s1 is not a proper subset of itself (since s1 == s1)

False

Modifying Sets

Since sets are not subscriptable, elements cannot be modified using index-based assignments (i.e., s[i] = expression)
However, since sets are mutable, they provide many element-oriented and set-oriented modification methods

Operation	Description
`s.add(x)`	Adds `x` to the set `s` (`x` must be hashable)
`s.remove(x)`	Removes `x` from `s` (raises an exception if `x not in s`)
`s.discard(x)`	Removes `x` from `s` (does not raise an exception if `x not in s`)
`s.clear()`	Removes all elements from the set `s`
`s.pop()`	Removes an arbitrary element from the set `s` and returns it
`s1 &= s2` `s1 |= s2` `s1 -= s2` `s1 ^= s2`	Update the set `s1` in place with the result of the corresponding set operation (intersection, union, difference, and symmetric difference respectively) applied to `s1` and `s2`

Modifying Sets


>>> s = {5, 2, 42, -7, 3, -16}
>>> s.add(9)
>>> print(s)

{2, 3, 5, 9, 42, -16, -7}

>>> s.remove(3)
>>> print(s)

{2, 5, 9, 42, -16, -7}

>>> s.remove(101) # attempting to remove an element that doesn't exist raises an error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 101

>>> n = s.pop()
>>> print(s, ",", n)

{5, 9, 42, -16, -7} , 2

>>> s -= {5, 4, -7}
>>> print(s)

{9, 42, -16}

>>> s |= {3, 12} | {45, 9} ^ {3, 9}
>>> print(s)

{3, 9, 42, 12, 45, -16}

>>> s.clear()
>>> print(s)

set()

Dictionaries

Dictionaries (built-in type dict) are objects that map hashable values (keys) to arbitrary objects (values)

Can be seen as (key, value) pairs
Keys are unique (they form a set)
Values are accessed through keys instead of integer indices


>>> person = {"name": "John", "age": 24, "profession": "Data Scientist"}
>>> print(person)

{'name': 'John', 'age': 24, 'profession': 'Data Scientist'}

>>> print(person["name"])

John

Dictionaries

Dictionaries (built-in type dict) are objects that map hashable values (keys) to arbitrary objects (values)
Properties

Iterable
Mutable
Preserve insertion order (Python 3.7+)
Can contain keys of different types (must be hashable)
Can contain values of different types
Subscriptable (indexed by keys)
Do not support slicing

Constructing Dictionaries

Dictionaries can be created in a multitude of ways

A pair of empty curly brackets denotes an empty dictionary: {}
Using curly brackets with comma-separated key:value pairs inside: {key_1: value_1, key_2: value_2, ...}
Using dictionary comprehensions
Using the type constructor dict() (for an empty dictionary), dict(**kwargs), dict(mapping, **kwargs), or dict(iterable, **kwargs)

Concatenations and replications cannot be used to construct dictionaries.

Constructing Dictionaries


>>> d = {} # an empty dict, equivalent to d = dict()
>>> print(d, ", ", len(d))

{} ,  0

>>> d = { 1: "Hello", (3, 5): 3+4j, "list": [1, 5]} # type mixing is accepted (for keys and values)...
>>> print(d)

{1: 'Hello', (3, 5): (3+4j), 'list': [1, 5]}

>>> d = { 1: "Hello", [3, 4, 2] : (1, 2)} # ... as long as you use hashable keys

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> # The following are all equivalent constructions
>>> d1 = {"name": "John", "age": 24, "profession": "Data Scientist"}
>>> d2 = dict(name="John", age=24, profession="Data Scientist")
>>> d3 = dict([("name", "John"), ("age", 24), ("profession", "Data Scientist")])
>>> d4 = dict(zip(["name", "age", "profession"], ["John", 24, "Data Scientist"]))
>>> d1 == d2 == d3 == d4

True

>>> print(d1)

{'name': 'John', 'age': 24, 'profession': 'Data Scientist'}

zip() returns an iterator of tuples obtained by iterating over multiple iterables in parallel.
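
A short sketch of zip() on its own, with illustrative data. It also shows that, by default, zip() stops as soon as the shortest iterable is exhausted.

```python
names = ["John", "Jane", "Joe"]
ages = [24, 31]  # deliberately shorter than names

# zip() is lazy; list() is used here to materialize the tuples
pairs = list(zip(names, ages))
print(pairs)  # [('John', 24), ('Jane', 31)]

# The pairs can be passed directly to the dict() type constructor
ages_by_name = dict(zip(names, ages))
print(ages_by_name)  # {'John': 24, 'Jane': 31}
```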

Dictionary Comprehensions

Comprehensions can be used to construct dictionaries

Similar to list and set comprehensions (uses {} like the latter)
Nesting and conditional comprehensions are supported
Constraint: key_expression must produce a hashable

Syntax


      {key_expression: value_expression for element in iterable}

Examples


>>> print({i : i**2 for i in range(11)})
    
    {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
    
>>> print({i: animal for i, animal in enumerate(["cat", "dog", "chicken"])})

{0: 'cat', 1: 'dog', 2: 'chicken'}

>>> print({ animal: len(animal) for animal in ["cat", "dog", "chicken"] if len(animal) < 5 })

{'cat': 3, 'dog': 3}

Indexing Dictionaries

Dictionaries are subscriptable

Indexing by keys: d[key] (not by numeric index)

If key exists in the dictionary → returns the associated value
Else → raises a KeyError exception

d.get(key, default_value) can also be used to access the dictionary's values

If key exists in the dictionary → returns the associated value
Else

default_value is returned if it is specified
None is returned if default_value is omitted

Dictionaries are iterable

By default, iteration over keys
Possible to iterate over values and key-value pairs through views (presented in a bit)

Indexing Dictionaries


>>> d = {"a": 1, "b": 2, "c": 3}
>>> print(d["a"]) # indexing is done by keys

1

>>> print(d["e"]) # using a key that doesn't exist raises an exception

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'e'

>>> print(d.get("e")) # using a nonexistent key with get() doesn't raise an exception

None

>>> print(d.get("e", 0)) # the default value can be customized

0

>>> for i in d: # iterating directly on the dict operates on its keys
...     print(i)
... 

a
b
c

Dictionary View Objects

Dictionaries provide views on their entries through three methods

d.keys() → Collection of keys in the dictionary d
d.values() → Collection of values in the dictionary d
d.items() → Collection of (key, value) pairs in the dictionary d

Dictionary views

Are iterable
Support membership tests (in and not in operators)
Are dynamic: changes to the dictionary are reflected in all its views

Dictionary View Objects


>>> d = {"a": 3, "b": 7, "c": 42, "d": 13}
>>> for key in d.keys():
...     print(key)
... 
  
a
b
c
d
  
>>> for value in d.values():
...     print(value)
... 
  
3
7
42
13
  
>>> for key, value in d.items():
...     print(key, " -> ", value)
... 
  
a  ->  3
b  ->  7
c  ->  42
d  ->  13
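
The dynamic nature of views and their support for membership tests can be sketched as follows. The views are created once, and later changes to the dictionary show up in them without calling keys() or items() again.

```python
d = {"a": 3, "b": 7}
keys = d.keys()
items = d.items()

d["c"] = 42  # modify the dictionary after the views were created

keys_after_insert = list(keys)
print(keys_after_insert)    # ['a', 'b', 'c']
print("c" in keys)          # True
print(("c", 42) in items)   # True

del d["a"]
keys_after_delete = list(keys)
print(keys_after_delete)    # ['b', 'c']
```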

Operating on Dictionaries

Dictionaries support many useful operations

Operation	Description
`d[key] = value`	Inserts the `(key, value)` pair into `d` (overwrites `d[key]` if the key already exists)
`del d[key]`	Removes `key` and its value from `d` (raises a `KeyError` exception if the key is not in the dict)
`len(d)`	Length (number of items) of the dictionary `d`
`x in d`	`True` if a key of `d` equals `x`, `False` otherwise
`x not in d`	`False` if a key of `d` equals `x`, `True` otherwise
`d.pop(key, default)`	If `key in d`, removes the key and returns the associated value, else returns `default` (raises a `KeyError` exception if `default` is omitted)
`d.popitem()`	Removes and returns a `(key, value)` pair from `d` in LIFO (last in, first out) order
`d.clear()`	Removes all items from `d`
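
The operations in the table can be tried out on a small dictionary. The following is a minimal sketch with illustrative keys and values; the names missing and last are not part of the original slides.

```python
d = {"a": 1, "b": 2}

d["c"] = 3   # insert a new (key, value) pair
d["a"] = 10  # the key already exists: its value is overwritten
del d["b"]   # remove a key and its value
print(d)     # {'a': 10, 'c': 3}

# pop() with a default does not raise an exception for a missing key
missing = d.pop("z", None)
print(missing)  # None

# popitem() removes the most recently inserted pair (LIFO order)
last = d.popitem()
print(last)  # ('c', 3)

print("a" in d, len(d))  # True 1
```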

Argument Unpacking in Function Calls

The content of lists, tuples, and dictionaries can be “unpacked” and used as arguments in function calls

Lists and tuples can be unpacked as positional arguments: func(*args)
Dictionaries can be unpacked as keyword arguments: func(**kwargs)


>>> def add(a, b):
...     return a + b
... 
>>> l = [1, 2]
>>> d = {"a": 3, "b": 4}

>>> add(*l) # equivalent to add(l[0], l[1])

3

>>> add(**d) # equivalent to add(a=d["a"], b=d["b"])

7

Where to Go from Here

You are not Python experts yet! Your journey is just beginning...
Many topics remain to be explored

Packaging Python code
Context managers
Generators
Decorators
Object oriented programming
Data model

Interesting readings

The Hitchhiker’s Guide to Python! (free)
Python Cookbook, 3rd Edition (paid)

If books are not your thing, check Corey Schafer's YouTube channel
Equip yourselves with the right tools

A full-fledged IDE (e.g., Visual Studio Code or PyCharm)
Code linters and formatters

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0)
