{"id":18510389,"url":"https://github.com/rakibhhridoy/machinelearning-featureselection","last_synced_at":"2026-05-02T18:34:03.112Z","repository":{"id":131513631,"uuid":"281357471","full_name":"rakibhhridoy/MachineLearning-FeatureSelection","owner":"rakibhhridoy","description":"Before training a model or feed a model, first priority is on data,not in model. The more data is preprocessed and engineered the more model will learn. Feature selectio one of the methods processing data before feeding the model. Various feature selection techniques is shown here. ","archived":false,"fork":false,"pushed_at":"2020-08-18T10:10:42.000Z","size":1,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-17T02:41:55.390Z","etag":null,"topics":["extratreesclassifier","feature-selection","gridsearchcv","lasso-regression","logistic-regression","machine-learning","numpy","pandas","pca","rfe","rfecv","scikit-learn","selectkbest"],"latest_commit_sha":null,"homepage":"https://rakibhhridoy.github.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rakibhhridoy.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-07-21T09:44:04.000Z","updated_at":"2020-08-18T17:02:22.000Z","dependencies_parsed_at":"2023-03-18T16:24:22.787Z","dependency_job_id":null,"html_url":"https://github.com/rakibhhridoy/MachineLearning-FeatureSelection","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakibhhridoy%2FMachineLearning-FeatureSelecti
on","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakibhhridoy%2FMachineLearning-FeatureSelection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakibhhridoy%2FMachineLearning-FeatureSelection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakibhhridoy%2FMachineLearning-FeatureSelection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rakibhhridoy","download_url":"https://codeload.github.com/rakibhhridoy/MachineLearning-FeatureSelection/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254129552,"owners_count":22019629,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extratreesclassifier","feature-selection","gridsearchcv","lasso-regression","logistic-regression","machine-learning","numpy","pandas","pca","rfe","rfecv","scikit-learn","selectkbest"],"created_at":"2024-11-06T15:23:08.298Z","updated_at":"2025-10-16T04:02:27.279Z","avatar_url":"https://github.com/rakibhhridoy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# *Machine Learning Feature Selection*\n\u003eStep By step\n1. 
*Importing libraries \u0026 functions*\n```python\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn.linear_model import Lasso\nfrom sklearn.model_selection import GridSearchCV\nfrom sklearn.feature_selection import SelectKBest\nfrom sklearn.feature_selection import f_classif\nfrom sklearn.ensemble import ExtraTreesClassifier\nfrom sklearn.decomposition import PCA\nimport os\n\n\nfrom sklearn.feature_selection import RFE\nfrom sklearn.linear_model import LogisticRegression\n```\n\n2. *Loading the dataset*\n```python\n# Assumes the CSV sits in the current working directory.\nfile = os.path.join(os.getcwd(), \"datasets_228_482_diabetes.csv\")\nnames = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']\n# If the CSV already has a header row, also pass header=0 so that row is replaced by names.\ndf = pd.read_csv(file, names=names)\n\narray = df.values\n\nX = array[:, 0:8]  # the eight feature columns\ny = array[:, 8]  # the class label\n```\n\n3. Different feature selection techniques:\n\u003e SelectKBest and RFE\n```python\n# Keep the 4 features with the highest ANOVA F-scores.\ntest = SelectKBest(score_func=f_classif, k=4)\nfit = test.fit(X, y)\n\nfeatures = fit.transform(X)\n\n# Pearson correlation between one feature ('skin') and the target.\ncorr_p = df['skin'].corr(df['class'])\nprint(corr_p)\n\nprint(features[0:5, :])\n\n# Recursive feature elimination (RFE) with logistic regression, keeping 3 features.\n# n_features_to_select must be passed by keyword in current scikit-learn releases.\nmodel = LogisticRegression(solver='lbfgs')\nrfe = RFE(model, n_features_to_select=3)\nfit = rfe.fit(X, y)\n\n\nprint('Num features: %d' % fit.n_features_)\nprint('Selected features: %s' % fit.support_)\nprint('Feature ranking: %s' % fit.ranking_)\n```\n\u003e ExtraTreesClassifier\n```python\n# Tree-based importances: higher values mean the feature was more useful for splitting.\nmodel = ExtraTreesClassifier(n_estimators=10)\nmodel.fit(X, y)\n\nprint(model.feature_importances_)\n```\n\u003e Dimensionality Reduction - PCA\n```python\n# PCA is unsupervised, so it is fitted on X alone.\npca = PCA(n_components=3)\nfit = pca.fit(X)\n\nprint('Explained Variance: %s' % fit.explained_variance_ratio_)\nprint(fit.components_)\n```\n\u003e Best parameters and score with GridSearchCV (Lasso)\n```python\nlasso = Lasso()\n\n# Candidate regularization strengths to search over.\nparameters = {'alpha': [1e-15, 1e-10, 1e-8, 1e-4, 1e-3, 1e-2, 1, 5, 10, 20]}\n\n# 5-fold cross-validation scored by negative mean squared error (closer to 0 is better).\nlasso_regressor = GridSearchCV(lasso, parameters, scoring='neg_mean_squared_error', cv=5)\nlasso_regressor.fit(X, y)\n\nprint(lasso_regressor.best_params_)\nprint(lasso_regressor.best_score_)\n```\n\n#### *Get in Touch With 
Me*\nConnect- [Linkedin](https://linkedin.com/in/rakibhhridoy) \u003cbr\u003e\nWebsite- [RakibHHridoy](https://rakibhhridoy.github.io)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakibhhridoy%2Fmachinelearning-featureselection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frakibhhridoy%2Fmachinelearning-featureselection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakibhhridoy%2Fmachinelearning-featureselection/lists"}