This workshop assumes you know some pandas and want to apply idiomatic constructs to existing code. There will be some lecture and then breakout time to apply the constructs to your own code.

We will cover:

  • Types

  • Chaining and assign

  • Mutation

  • Aggregation

  • Debugging


This course is hands-on: we will work through pandas best practices together, and you will then apply them to your own dataset. Bring a dataset to class to practice on.

Description

  • This course will teach attendees features of pandas and best practices for data wrangling and exploration.

  • The course will consist of a mix of lectures and breakout sessions, where attendees will have the opportunity to apply the constructs to their own code.


By the end of the course, you’ll understand:

  • Best practices for data wrangling and exploration in pandas

  • How to use chaining and the assign method in pandas

  • How to handle data mutation and aggregation in pandas

  • How to debug pandas code
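
As a rough illustration of the chaining style named above, here is a minimal sketch that fixes column types, derives a new column with assign, and aggregates, all in one chain. The data, the column names, and the `summarize` helper are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
raw = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima"],
    "temp_f": ["50", "59", "68", "77"],  # numbers stored as strings
})


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Clean types, add a derived column, and aggregate in a single chain."""
    return (
        df
        # Types: convert the string column to integers and city to a category.
        .astype({"city": "category", "temp_f": "int64"})
        # assign returns a new DataFrame, so the input is not mutated.
        .assign(temp_c=lambda d: (d["temp_f"] - 32) * 5 / 9)
        # Aggregation: one row per city, using named aggregation.
        .groupby("city", observed=True)
        .agg(mean_temp_c=("temp_c", "mean"))
    )


summary = summarize(raw)
print(summary)
```

Because each step returns a new object, `raw` is left unchanged, and any step can be commented out to inspect the intermediate result while debugging.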


And you’ll be able to:

  • Apply idiomatic constructs to existing pandas code

  • Improve the performance and maintainability of pandas code

  • Create data visualizations using pandas and other Python libraries


This online training is for you because…

  • You are a software developer or data scientist who already has some experience with pandas and wants to take your skills to the next level

  • You are a beginner or intermediate-level pandas user looking to improve your data wrangling and exploration skills


Prerequisites

  • Basic understanding of Python

  • Familiarity with Jupyter notebook


Course Set-up

  • Attendees should have pandas and other necessary libraries installed.

  • Attendees should have access to Jupyter notebook or Google Colab.

  • Attendees are encouraged to bring their own code to work on during breakout sessions.


Recommended Preparation

  • Read: "Effective Pandas" by Matt Harrison




Schedule

Day 1:
* Introduction to Idiomatic pandas
* Cleaning Data

Day 2:
* Filtering 
* Lambda
* Grouping & Pivoting

Day 3:
* Summary Stats
* Histograms
* Scatter plots
* Correlations

Day 4:
* Intro to ML with pandas
* Clustering
* Dimensionality Reduction