In machine learning, exploratory data analysis (EDA) is primarily used to see what the data can reveal beyond the formal modeling or hypothesis-testing task. It gives you a better understanding of the variables in a data set and the relationships between them, and it can help determine whether the statistical techniques you are considering are appropriate. Today on the show, Ben and Michael discuss how to use EDA when building machine learning models.
In this episode...
- What is EDA?
- Tips, tricks, and steps for EDA
- How to approach downsampling
- Understanding feature sets relative to your labels
- Optimizing models
- Motivating yourself to get into the data
- Tools for EDA
- A few scenarios for discussion
- What is the most detrimental EDA mistake for ML?