Workshop Contents
This workshop assumes you know some pandas and want to apply idiomatic constructs to existing code. There will be some lecture and then breakout time to apply the constructs to your own code.
We will cover:
Types
Chaining and assign
Mutation
Aggregation
Debugging
This is a hands-on course. We will work through pandas best practices, and then you will apply them to your own dataset. Bring a dataset to class to practice on.
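To give a flavor of the topics listed above, here is a minimal sketch of what types, chaining with assign, avoiding mutation, and aggregation might look like together. The data, column names, and the `tidy_sales` function are hypothetical and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data; "units" arrives as strings, a common typing problem.
raw = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "units": ["3", "5", "2", "4"],
    "price": [10.0, 12.0, 8.0, 9.0],
})

def tidy_sales(df):
    """Return a per-city summary without modifying the input frame."""
    return (
        df
        # Types: convert columns to appropriate dtypes up front.
        .astype({"units": "int64", "city": "category"})
        # Chaining and assign: add a derived column instead of mutating in place.
        .assign(revenue=lambda d: d["units"] * d["price"])
        # Aggregation: named aggregations give readable output columns.
        .groupby("city", observed=True)
        .agg(total_revenue=("revenue", "sum"), mean_units=("units", "mean"))
    )

summary = tidy_sales(raw)
print(summary)
```

Because every step returns a new object, `raw` is left untouched, which is one reason a chain of this kind can be easier to rerun and reason about in a notebook.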
Description
This course will teach attendees features of pandas and best practices for data wrangling and exploration.
The course will consist of a mix of lectures and breakout sessions, where attendees will have the opportunity to apply the constructs to their own code.
By the end of the course, you’ll understand:
Best practices for data wrangling and exploration in pandas
How to use chaining and the assign method in pandas
How to handle data mutation and aggregation in pandas
How to debug pandas code
And you’ll be able to:
Apply idiomatic constructs to existing pandas code
Improve the performance and maintainability of pandas code
Create data visualizations using pandas and other Python libraries
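As an illustration of the debugging outcome listed above, one possible approach is to inspect intermediate results of a chain with `DataFrame.pipe`. This is only a sketch: the `peek` helper, the `log` list, and the sample data are hypothetical names introduced for the example.

```python
import pandas as pd

def peek(df, label, log):
    """Record the frame's shape at this point in the chain and pass it through unchanged."""
    log.append((label, df.shape))
    return df

# Hypothetical sample data.
orders = pd.DataFrame({"item": ["a", "b", "c", "d"], "qty": [1, 0, 5, 2]})

log = []
result = (
    orders
    .pipe(peek, "start", log)
    .query("qty > 0")                          # drop rows with zero quantity
    .pipe(peek, "after filter", log)
    .assign(double_qty=lambda d: d["qty"] * 2)  # add a derived column
    .pipe(peek, "after assign", log)
)

for label, shape in log:
    print(label, shape)
```

Since `peek` returns the frame it receives, the calls can be added or removed without changing the result of the chain.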
This Online Training is for you because…
You are a software developer or data scientist with some pandas experience, and you want to take your skills to the next level
You are a beginner or intermediate-level pandas user looking to improve your data wrangling and exploration skills
Prerequisites
Basic understanding of Python
Familiarity with Jupyter Notebook
Course Set-up
Attendees should have pandas and other necessary libraries installed.
Attendees should have access to Jupyter Notebook or Google Colab.
Attendees are encouraged to bring their own code to work on during breakout sessions
Recommended Preparation
Read: "Effective Pandas" by Matt Harrison
Schedule
Day 1:
Segment 1: Introduction to Idiomatic Pandas
Segment 2: Cleaning Data
Day 2:
* Filtering
* Lambda
* Grouping & Pivoting
Day 3:
* Summary Stats
* Histograms
* Scatter plots
* Correlations
Day 4:
* Intro to ML with pandas
* Clustering
* Dimensionality Reduction