{"id":24018748,"url":"https://github.com/iboud0/scikit_learn_clone","last_synced_at":"2025-02-25T21:28:37.978Z","repository":{"id":244721684,"uuid":"783506980","full_name":"iboud0/scikit_learn_clone","owner":"iboud0","description":"Custom implementation of various machine learning algorithms and utilities inspired by Scikit-Learn","archived":false,"fork":false,"pushed_at":"2024-06-16T23:34:21.000Z","size":615,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-09T08:16:04.634Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://pypi.org/project/sktlearn-clone/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iboud0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-08T03:02:29.000Z","updated_at":"2024-06-16T23:36:22.000Z","dependencies_parsed_at":"2024-06-17T01:39:15.720Z","dependency_job_id":"d6c52098-e6b2-4d08-84f6-5edbef2998a4","html_url":"https://github.com/iboud0/scikit_learn_clone","commit_stats":null,"previous_names":["iboud0/scikit_learn_clone"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iboud0%2Fscikit_learn_clone","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iboud0%2Fscikit_learn_clone/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iboud0%2Fscikit_learn_clone/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iboud0%2Fscikit
_learn_clone/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iboud0","download_url":"https://codeload.github.com/iboud0/scikit_learn_clone/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240749561,"owners_count":19851498,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-08T10:18:44.459Z","updated_at":"2025-02-25T21:28:37.948Z","avatar_url":"https://github.com/iboud0.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scikit Learn Clone\n\nScikit Learn Clone is a custom implementation of various machine learning algorithms and utilities inspired by Scikit-Learn. This project is designed to provide a comprehensive set of tools for data preprocessing, model selection, evaluation, and supervised learning.\n\n## Project Structure\n\nThe project is organized into several modules, each focusing on a specific aspect of machine learning. 
Below is the detailed structure:\n\n```\nscikit_learn_clone/  \n├── data/  \n├── ensemble_methods/  \n│   ├── __init__.py  \n│   ├── adaBoost.py  \n│   ├── bagging.py  \n│   ├── gradient_boosting.py  \n│   ├── randomForest.py  \n│   ├── stacking.py  \n├── metrics_model_evaluation/  \n│   ├── __init__.py  \n│   ├── accuracy.py  \n│   ├── confusion_matrix.py  \n│   ├── f1_score.py  \n│   ├── mean_absolute_error.py  \n│   ├── mean_squared_error.py  \n│   ├── precision.py  \n│   ├── r2_score.py  \n│   ├── recall.py  \n├── model_selection/  \n│   ├── __init__.py  \n│   ├── cross_validation.py  \n│   ├── grid_search.py  \n│   ├── kfold.py  \n│   ├── make_scorer.py  \n│   ├── param_grid.py  \n│   ├── train_test_split.py  \n├── preprocessing/  \n│   ├── __init__.py  \n│   ├── impute_missing_values_mean.py  \n│   ├── impute_missing_values_median.py  \n│   ├── impute_missing_values_mode.py  \n│   ├── labelencoder.py  \n│   ├── normalize_features.py  \n│   ├── numerical_categorical_variable.py  \n│   ├── onehotencoder.py  \n│   ├── outliers.py  \n│   ├── scale_features_min_max.py  \n│   ├── scale_features_standard.py  \n│   ├── select_features.py  \n├── supervised_learning/  \n│   ├── __init__.py  \n│   ├── DecisionTree.py  \n│   ├── knn.py  \n│   ├── LinearRegression.py  \n│   ├── LogisticRegression.py\n│   ├── NaiveBayes.py \n│   ├── NeuralNetwork.py \n├── testing/  \n├── utilities/  \n│   ├── __init__.py  \n│   ├── Estimator.py  \n│   ├── MetaEstimator.py  \n│   ├── ModelSelector.py  \n│   ├── Pipeline.py  \n│   ├── Predictor.py  \n│   ├── Transformer.py  \n├── .gitignore  \n├── README.md  \n└── setup.py  \n\n```\n## Installation\n\nTo install the package, use pip:\n\n```bash\npip install sktlearn-clone\n```\n\n## Usage\n\nHere are some examples of how to use the various modules in this package.\n\n### Example: Decision Tree Classifier\n\n```python\nfrom scikit_learn_clone.supervised_learning.DecisionTree import DecisionTreeClassifier\nfrom 
scikit_learn_clone.model_selection.train_test_split import train_test_split\nfrom scikit_learn_clone.metrics_model_evaluation.accuracy import accuracy_score\n\n# Sample dataset\nX = [[0, 0], [1, 1]]\ny = [0, 1]\n\n# Train-test split\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)\n\n# Initialize and train the classifier\nclf = DecisionTreeClassifier()\nclf.fit(X_train, y_train)\n\n# Predict and evaluate\ny_pred = clf.predict(X_test)\naccuracy = accuracy_score(y_test, y_pred)\nprint(\"Accuracy:\", accuracy)\n```\n\n### Example: K-Fold Cross-Validation\n\n```python\nfrom scikit_learn_clone.model_selection.kfold import KFold\nfrom scikit_learn_clone.supervised_learning.LinearRegression import LinearRegression\n\n# Sample dataset\nX = [[i] for i in range(10)]\ny = [2 * i for i in range(10)]\n\n# Initialize KFold\nkf = KFold(n_splits=5)\n\n# Initialize model\nmodel = LinearRegression()\n\n# Perform K-Fold Cross-Validation\nfor train_index, test_index in kf.split(X):\n    X_train, X_test = [X[i] for i in train_index], [X[i] for i in test_index]\n    y_train, y_test = [y[i] for i in train_index], [y[i] for i in test_index]\n    model.fit(X_train, y_train)\n    predictions = model.predict(X_test)\n    print(f\"Fold results: {predictions}\")\n```\n\n## Features\n\n### Ensemble Methods\n\n- **AdaBoost**: Adaptive Boosting algorithm.\n- **Bagging**: Bootstrap Aggregating algorithm.\n- **Gradient Boosting**: Gradient Boosting algorithm.\n- **Random Forest**: Ensemble of Decision Trees.\n- **Stacking**: Stacked generalization.\n\n### Metrics and Model Evaluation\n\n- **Accuracy**: Classification accuracy.\n- **Confusion Matrix**: Performance measurement for classification.\n- **F1 Score**: Harmonic mean of precision and recall.\n- **Mean Absolute Error**: Regression metric.\n- **Mean Squared Error**: Regression metric.\n- **Precision**: Classification precision.\n- **R2 Score**: Coefficient of determination.\n- **Recall**: Classification recall.\n\n### 
Model Selection\n\n- **Cross-Validation**: Split the dataset into k consecutive folds.\n- **Grid Search**: Exhaustive search over specified parameter values.\n- **K-Fold**: K-Fold cross-validation iterator.\n- **Make Scorer**: Convert metrics into callables.\n- **Param Grid**: Define the parameter grid for search.\n- **Train-Test Split**: Split arrays or matrices into random train and test subsets.\n\n### Preprocessing\n\n- **Imputation**: Handle missing values.\n  - Mean, Median, Mode imputation.\n- **Label Encoding**: Encode categorical features as an integer array.\n- **Normalization**: Scale input vectors individually to unit norm.\n- **One-Hot Encoding**: Encode categorical integer features as a one-hot numeric array.\n- **Outlier Detection**: Identify and handle outliers in the data.\n- **Feature Scaling**: Standardize features by removing the mean and scaling to unit variance.\n  - Min-Max scaling.\n- **Feature Selection**: Select features based on importance or correlation.\n\n### Supervised Learning\n\n- **Decision Tree**: Decision Tree classifier.\n- **k-Nearest Neighbors**: k-Nearest Neighbors algorithm.\n- **Linear Regression**: Linear Regression algorithm.\n- **Logistic Regression**: Logistic Regression algorithm.\n- **Naive Bayes**: Naive Bayes algorithm.\n- **Neural Network**: Neural Network algorithm.\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository.\n2. Create a new branch (`git checkout -b feature/your-feature`).\n3. Make your changes.\n4. Commit your changes (`git commit -m 'Add some feature'`).\n5. Push to the branch (`git push origin feature/your-feature`).\n6. Open a pull request.\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiboud0%2Fscikit_learn_clone","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiboud0%2Fscikit_learn_clone","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiboud0%2Fscikit_learn_clone/lists"}