Python for Data Science

A Crash Course



General Info



Khalil El Mahrsi
2023
Creative Commons License

About Me

  • Ph.D. in computer science
  • 12 years working on data-related problems in different contexts (research, consulting, ...)
  • Head of Data Science & Analytics at Intermarché & Netto
    • Defining industrialization and coding practices (MLOps) for conducting data science projects
    • Building solutions to a wide variety of business problems (churn, sales, stock, and in-store traffic prediction; sentiment analysis; basket analysis; ...)
    • Managing a team of 10 Data Scientists/Analysts/Engineers
    • ...
  • Fell in love with Python after watching James Powell's “So you want to be a Python expert?” talk

Learning Objectives

By the end of this course, you will be able to:
  • Write and understand entry-level to intermediate-level code in the Python programming language
  • Use NumPy for scientific computing and efficient manipulation of multi-dimensional arrays and matrices
  • Use pandas to load, manipulate, and analyze tabular data
  • Use Matplotlib and seaborn to visualize data

Course Philosophy

  • Code examples will often tease concepts that are discussed later, so don't worry if you don't understand something right away
  • Talking to a passive audience is not fun
    • Interact with me
    • Follow along with the code examples and experiment with them
    • No matter how stupid a question might seem, ask it anyway
  • Don't be shocked if I don't have answers to all your questions (but I'll look them up)
  • Your feedback is precious and crucial for improving the course

Evaluation

  • Team project
    • Exploratory data analysis of a tabular data set (data sets will be communicated at a later date)
    • Conducted in pairs
  • Expected deliverables
    • Report (5–10 pages)
    • Source code (preferably a GitHub repository, otherwise a zip archive)
    • 10-minute presentation of your main findings, plus 5 minutes for questions

Course Outline

  • Introduction to Python Programming
  • Scientific Computing With NumPy
  • Processing Tabular Data With pandas
  • Visualizing Data With Matplotlib and seaborn

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0).