Workshop Contents

This workshop assumes you know some pandas and want to apply idiomatic constructs to existing code. There will be some lecture and then breakout time to apply the constructs to your own code.

We will cover

Types
Chaining and assign
Mutation
Aggregation
Debugging
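
As a rough preview of how several of these topics fit together, here is a minimal sketch. The sample data and column names are invented for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
raw = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Lima", "Lima"],
        "temp_f": ["50", "41", "68", "77"],  # numbers stored as strings
    }
)

summary = (
    raw
    # Types: convert columns to appropriate dtypes up front.
    .astype({"city": "category", "temp_f": "int64"})
    # Chaining and assign: add a column without mutating `raw` in place.
    .assign(temp_c=lambda df: (df["temp_f"] - 32) * 5 / 9)
    # Aggregation: one named aggregate per city.
    .groupby("city", observed=True)
    .agg(mean_temp_c=("temp_c", "mean"))
)

print(summary)
```

Because every step returns a new DataFrame, `raw` is left untouched and the whole transformation reads top to bottom.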

This course is hands-on. We will work through pandas best practices together, and then you will apply them to your own dataset. Make sure you bring a dataset to class to practice on.

Description

This course will teach attendees features of pandas and best practices for data wrangling and exploration.
The course will consist of a mix of lectures and breakout sessions, where attendees will have the opportunity to apply the constructs to their own code.

By the end of the course, you’ll understand:

Best practices for data wrangling and exploration in pandas
How to use chaining and the assign method in pandas
How to handle data mutation and aggregation in pandas
How to debug pandas code
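
One common way to debug a chain, shown here as a sketch rather than the course's prescribed method, is to pass the intermediate frame through a small helper with `DataFrame.pipe`. The helper name `peek` and the sample data are hypothetical.

```python
import pandas as pd


def peek(df, label):
    """Hypothetical helper: report the shape mid-chain and return the frame unchanged."""
    print(f"{label}: shape={df.shape}")
    return df


# Made-up sample data for illustration.
orders = pd.DataFrame({"item": ["a", "b", "a", "c"], "qty": [2, 0, 5, 1]})

result = (
    orders
    .pipe(peek, "raw")            # inspect the input
    .query("qty > 0")             # drop empty orders
    .pipe(peek, "after filter")   # confirm how many rows survived
    .groupby("item")
    .agg(total_qty=("qty", "sum"))
)

print(result)
```

Since `peek` returns its input, the `pipe` lines can be added or removed without changing the result of the chain.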

And you’ll be able to:

Apply idiomatic constructs to existing pandas code
Improve the performance and maintainability of pandas code
Create data visualizations using pandas and other Python libraries

This Online Training is for you because…

You are a software developer or data scientist who already has some experience with pandas and wants to take your skills to the next level
You are a beginner or intermediate-level pandas user looking to improve your data wrangling and exploration skills

Prerequisites

Basic understanding of Python
Familiarity with Jupyter notebook

Course Set-up

Attendees should have pandas and other necessary libraries installed
Attendees should have access to Jupyter Notebook or Google Colab
Attendees are encouraged to bring their own code to work on during breakout sessions

Recommended Preparation

Read: "Effective Pandas" by Matt Harrison

Schedule

Day 1:
* Introduction to Idiomatic Pandas
* Cleaning Data

Day 2:
* Filtering
* Lambda
* Grouping & Pivoting

Day 3:
* Summary Stats
* Histograms
* Scatter plots
* Correlations

Day 4:
* Intro to ML with Pandas
* Clustering
* Dimensionality Reduction