{"id":18905016,"url":"https://github.com/zerodiscord/ai-ml","last_synced_at":"2026-03-04T21:02:44.052Z","repository":{"id":261539869,"uuid":"870006822","full_name":"ZeroDiscord/AI-ML","owner":"ZeroDiscord","description":"Uni course labs monorepo","archived":false,"fork":false,"pushed_at":"2025-05-02T10:31:57.000Z","size":4248,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-18T18:26:56.623Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ZeroDiscord.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-10-09T09:27:21.000Z","updated_at":"2024-12-16T21:14:03.000Z","dependencies_parsed_at":"2025-05-24T12:09:26.093Z","dependency_job_id":"7abeacf9-0b41-4ea6-9e39-8082e6f2c5f6","html_url":"https://github.com/ZeroDiscord/AI-ML","commit_stats":null,"previous_names":["zerodiscord/ai-ml"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ZeroDiscord/AI-ML","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroDiscord%2FAI-ML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroDiscord%2FAI-ML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroDiscord%2FAI-ML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroDiscord%2FAI-ML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ZeroDiscord"
,"download_url":"https://codeload.github.com/ZeroDiscord/AI-ML/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroDiscord%2FAI-ML/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30092875,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T20:42:30.420Z","status":"ssl_error","status_checked_at":"2026-03-04T20:42:30.057Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T09:10:11.662Z","updated_at":"2026-03-04T21:02:44.027Z","avatar_url":"https://github.com/ZeroDiscord.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [Lab Experiments](https://github.com/ZeroDiscord/AI-ML/tree/master/lab_4-6)\n\n## Lab 4-6: Data Import, EDA, and Preprocessing\n\n### Overview  \nThis repository covers three experiments:  \n1. **Experiment 4**: Import/export data and display basic statistics.  \n2. **Experiment 5**: Perform Exploratory Data Analysis (EDA).  \n3. **Experiment 6**: Handle missing values, outliers, and preprocess data.  
\n\n### File Structure \n```\nlab_4-6/\n│\n├── data/            # Directory for input datasets\n│   └── sample.csv   # Placeholder for the dataset used in experiments\n│\n├── helpers/         # Contains scripts for individual experiments\n│   ├── exp4_impexp.py   # Code for Experiment 4\n│   ├── exp5_eda.py      # Code for Experiment 5\n│   └── exp6_preprocess.py   # Code for Experiment 6\n│\n├── output/          # Stores exported results, plots, and preprocessed data\n│   └── ...\n│\n└── driver.py        # Main driver script to run experiments\n```\n\n### Requirements  \n- Python 3.x  \n- Libraries: `pandas`, `matplotlib`, `seaborn`, `scipy`  \n\n### How to Run  \n1. Place the dataset in the `data/` directory, or simply run the driver script to import the sample dataset. \n2. Execute the main script:  \n   ```bash\n   python driver.py\n   ```\n3. Outputs for evaluation are written to the `output/` directory.\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n# [Project](https://github.com/ZeroDiscord/AI-ML/tree/master/project)\n\n### Machine Learning Pipeline\nThis project demonstrates the creation and training of a machine learning pipeline using `scikit-learn`. The pipeline consists of two main components:\n\n1. **StandardScaler**: Standardizes the features by removing the mean and scaling to unit variance.\n2. **RandomForestClassifier**: A classifier that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.\n\n#### Code Overview\n\n\n#### Steps:\n\n1. **Import Libraries**: Import the necessary libraries from `scikit-learn`.\n2. **Define Pipeline**: Create a pipeline that first scales the data using `StandardScaler` and then applies the `RandomForestClassifier`.\n3. 
**Train the Model**: Fit the pipeline to the training data (`X_train`, `y_train`).\n\nThis setup ensures that the scaler is fit on the training data only and that the same scaling is applied automatically before the classifier at prediction time. Random forests are largely insensitive to feature scaling, so the scaler matters mainly if the classifier is later swapped for a scale-sensitive model.\n\n```python\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.ensemble import RandomForestClassifier\n\n# Create a pipeline: scale the features, then classify\npipe = Pipeline([\n    ('scaler', StandardScaler()),\n    ('rf', RandomForestClassifier())\n])\n\n# Fit the pipeline (X_train and y_train are assumed to be prepared beforehand)\npipe.fit(X_train, y_train)\n```\n\n### References\n- [scikit-learn: Pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html)\n- [Dataset: Sparkify](https://udacity-dsnd.s3.amazonaws.com/sparkify/mini_sparkify_event_data.json)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzerodiscord%2Fai-ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzerodiscord%2Fai-ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzerodiscord%2Fai-ml/lists"}