Python for Data Science
A Crash Course
General Info
Khalil El Mahrsi
2022
About Me
- Ph.D. in computer science
- 12 years working on data-related problems in different contexts (research, consulting, ...)
- Currently: Head of Data Science at Intermarché
- Defining industrialization and coding practices for conducting data science projects
- Building solutions to a wide variety of business problems (churn, sales, stock, and in-store traffic prediction; sentiment analysis; basket analysis; ...)
- Mentoring Data Scientists
- ...
- Fell in love with Python after watching James Powell's “So you want to be a Python expert?” talk
Learning Objectives
By the end of this course, you will be able to:
- Write and understand entry-level to intermediate-level code in the Python programming language
- Use NumPy for scientific computing and efficient manipulation of multi-dimensional arrays and matrices
- Use pandas to load, manipulate, and analyze tabular data
- Use Matplotlib and seaborn to visualize data
Course Philosophy
- Code examples will often tease concepts that are discussed later, so don't worry if you don't understand something right away
- Talking to a passive audience is not fun
- Interact with me
- Follow along with the code examples and try to experiment with them
- No matter how stupid a question might seem, ask it anyway
- Don't be shocked if I don't have answers to all your questions (but I'll look them up)
- Your feedback is precious and crucial for improving the course
Evaluation
- Team project
- Exploratory data analysis of a tabular data set (data sets will be communicated at a later date)
- Conducted in pairs
- Expected deliverables
- Report (5–10 pages)
- Source code (preferably a GitHub repo, zip archive otherwise)
- 10-minute presentation of your main findings + 5 minutes for questions
Course Outline
Introduction to Python Programming
Scientific Computing With NumPy
Processing Tabular Data With pandas
Visualizing Data With Matplotlib and seaborn