Feature Selection with sklearn and Pandas

towardsdatascience.com

- The task is predicting the “MEDV” column.
- The filtering here is done with a correlation matrix, most commonly computed using Pearson correlation.
- Only features with a correlation above 0.5 (in absolute value) with the output variable are selected.
- If these variables are correlated with each other, then we need to keep only one of them and drop the rest.
- The article's correlation output shows that RM and LSTAT are highly correlated with each other (-0.613808), so only one of them is kept and the other is dropped. LSTAT is kept because its correlation with MEDV is stronger than that of RM.
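
The two steps described in these highlights (filter by correlation with the target, then drop features that are correlated with each other) can be sketched roughly as follows. This is a minimal illustration, not the article's code: the data is synthetic and only borrows the column names RM, LSTAT and MEDV, the NOISE column and the `correlation_filter` helper are hypothetical, and the 0.5 cutoff for feature-to-feature correlation is an assumption (the highlights give 0.5 only for correlation with the output variable).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (NOT the real housing data). Only the column
# names RM, LSTAT and MEDV come from the highlights; NOISE is made up.
rng = np.random.default_rng(0)
n = 500
lstat = rng.normal(size=n)
rm = -0.7 * lstat + 0.7 * rng.normal(size=n)            # correlated with LSTAT
noise = rng.normal(size=n)                              # unrelated feature
medv = -1.0 * lstat + 0.5 * rm + 0.5 * rng.normal(size=n)
df = pd.DataFrame({"RM": rm, "LSTAT": lstat, "NOISE": noise, "MEDV": medv})


def correlation_filter(df, target, target_threshold=0.5, pair_threshold=0.5):
    """Hypothetical helper: pick features by Pearson correlation.

    1. Keep features whose |correlation| with `target` exceeds
       `target_threshold`.
    2. Among those, walk from strongest to weakest and skip any feature
       whose |correlation| with an already kept feature exceeds
       `pair_threshold` (an assumed cutoff).
    """
    corr = df.corr(method="pearson")
    target_corr = corr[target].drop(target).abs()
    candidates = target_corr[target_corr > target_threshold]
    candidates = candidates.sort_values(ascending=False)

    selected = []
    for name in candidates.index:
        if all(abs(corr.loc[name, kept]) <= pair_threshold for kept in selected):
            selected.append(name)
    return selected


selected = correlation_filter(df, "MEDV")
print(selected)
```

On this synthetic data both RM and LSTAT pass the first step, and RM is then skipped because it is strongly correlated with LSTAT, which mirrors the choice described in the last highlight. Visiting candidates from strongest to weakest is one simple way to make "keep only one of them" deterministic; other tie-breaking rules are possible.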
