Notebook 1: https://www.kaggle.com/kenjee/kaggle-project-from-scratch
Notebook 2: https://www.kaggle.com/kenjee/analyzing-gender-and-earning-potential-in-tech
I do two main analyses in this video. First, I build more advanced graphs to compare differences in skills and characteristics across data science roles using the Kaggle developer survey data. I then create a function to make this task more scalable.
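As a rough illustration of the kind of reusable function this describes, here is a minimal sketch using pandas. The function name `share_by_role`, the column names, and the toy data are all made up for the example; they are not taken from the notebook or the real survey.

```python
import pandas as pd


def share_by_role(df, role_col, answer_col):
    """Return, for each role, the fraction of respondents giving each answer."""
    # normalize="index" divides every row by its row total, so roles with
    # different sample sizes can be compared on the same scale.
    return pd.crosstab(df[role_col], df[answer_col], normalize="index")


# Hypothetical toy data standing in for survey responses.
toy = pd.DataFrame({
    "role": ["Data Scientist", "Data Scientist", "Data Analyst",
             "Data Analyst", "Data Analyst", "Data Engineer"],
    "language": ["Python", "R", "SQL", "Python", "SQL", "SQL"],
})

shares = share_by_role(toy, "role", "language")
print(shares)
```

Because the function only needs two column names, the same call can be repeated for any other question in the data, and the resulting table can be passed to a plotting call such as `shares.plot(kind="bar")` if matplotlib is installed.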
In the second part of the video, I use a few different techniques to determine whether there is gender bias in data science, particularly relating to salary.
In the notebook I:
- First visualize and normalize gender differences in the sample
- Run a multiple linear regression to understand which factors contribute most to earning potential
- Run a lasso regression to narrow the variable set and try to quantify the extent to which gender impacts earning potential
- Run a random forest on the same data to evaluate feature importance (a nonlinear model like this is a good check)
- Compare models fit on just the subsets of women and men to hopefully control for more variables
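The modeling steps above can be sketched roughly as follows with scikit-learn. This is not the notebook's code: the data is synthetic, the feature names (`experience`, `is_woman`, `noise_feature`) and the salary formula are assumptions chosen only so the example runs on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for survey features (assumed, not real data).
experience = rng.uniform(0, 20, n)
is_woman = rng.integers(0, 2, n)
noise_feature = rng.normal(size=n)  # has no effect on salary by construction
salary = 50_000 + 4_000 * experience - 5_000 * is_woman + rng.normal(0, 5_000, n)

names = ["experience", "is_woman", "noise_feature"]
X = np.column_stack([experience, is_woman, noise_feature])
# Standardize so linear coefficients are comparable across features.
X_std = StandardScaler().fit_transform(X)

# Multiple linear regression: which factors move salary the most?
ols = LinearRegression().fit(X_std, salary)

# Lasso with a cross-validated penalty: shrinks weak predictors toward zero.
lasso = LassoCV(cv=5, random_state=0).fit(X_std, salary)

# Random forest: a nonlinear check on which features matter.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, salary)

for name, b_ols, b_lasso, imp in zip(
    names, ols.coef_, lasso.coef_, forest.feature_importances_
):
    print(f"{name:14s} ols={b_ols:10.1f} lasso={b_lasso:10.1f} importance={imp:.3f}")

# Separate models for each subset, using the non-gender features only.
by_group = {}
for label, mask in [("men", is_woman == 0), ("women", is_woman == 1)]:
    sub = LinearRegression().fit(X[mask][:, [0, 2]], salary[mask])
    by_group[label] = sub.coef_
    print(label, dict(zip(["experience", "noise_feature"], sub.coef_.round(1))))
```

Comparing the three sets of numbers side by side is the point of the exercise: if a variable matters in the linear model, survives the lasso penalty, and also ranks highly in the forest, that is stronger evidence than any single model gives.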
Part 1: https://www.youtube.com/watch?v=r-DR9HBaipU&ab_channel=KenJee
Part 2: https://www.youtube.com/watch?v=KQ80oD_boBM&feature=youtu.be&ab_channel=KenJee