Feature Selection with sklearn and Pandas
towardsdatascience.com

Top Highlights

  • predicting the “MEDV” column
  • The filtering here is done using a correlation matrix, most commonly with Pearson correlation.
  • has a correlation above 0.5 (in absolute value) with the output variable.
  • If these variables are correlated with each other, then we need to keep only one of them and drop the rest.
  • The computed correlations show that the variables RM and LSTAT are highly correlated with each other (-0.613808), so we keep only one of them and drop the other. We keep LSTAT because its correlation with MEDV is stronger in absolute value than that of RM.
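
The highlights above describe a filter-style selection: keep features whose absolute Pearson correlation with the target exceeds 0.5, then drop all but one of any group of features that are strongly correlated with each other. A minimal sketch of that idea follows. It is not the article's code: the data is synthetic (only the column names RM, LSTAT and MEDV are borrowed from the text), the helper `correlation_filter` and its `redundancy_threshold` parameter are hypothetical, and the greedy "keep the strongest first" rule is one reasonable reading of "keep only one of them".

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): these are NOT the real housing values.
# Only the column names are taken from the highlighted text.
rng = np.random.default_rng(0)
n = 500
lstat = rng.normal(size=n)
rm = -0.7 * lstat + 0.5 * rng.normal(size=n)   # deliberately correlated with LSTAT
noise = rng.normal(size=n)                      # unrelated to the target
medv = -1.0 * lstat + 0.3 * rm + 0.5 * rng.normal(size=n)
df = pd.DataFrame({"RM": rm, "LSTAT": lstat, "NOISE": noise, "MEDV": medv})


def correlation_filter(data, target, threshold=0.5, redundancy_threshold=0.5):
    """Hypothetical helper: greedy correlation-based feature filter.

    1. Keep features with |Pearson r| > threshold against the target.
    2. Visit them from strongest to weakest and skip any feature whose
       |r| with an already kept feature exceeds redundancy_threshold.
    """
    corr = data.corr(method="pearson")
    target_corr = corr[target].drop(target).abs()
    candidates = target_corr[target_corr > threshold].sort_values(ascending=False)

    selected = []
    for name in candidates.index:
        redundant = any(
            abs(corr.loc[name, kept]) > redundancy_threshold for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


selected = correlation_filter(df, target="MEDV")
print(selected)
```

On this synthetic data both RM and LSTAT pass the first step, but RM is dropped in the second because it is strongly correlated with LSTAT, which has the stronger relationship with MEDV. The cutoff for "highly correlated with each other" is not stated in the highlights, so `redundancy_threshold=0.5` is only an illustrative default.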
