{"id":28265241,"url":"https://github.com/audy21/datacamp","last_synced_at":"2026-04-11T05:34:00.997Z","repository":{"id":273843869,"uuid":"921061367","full_name":"audy21/datacamp","owner":"audy21","description":"Learning portfolio documenting my progress, while taking Data Analyst \u0026 Data Science certifications from DataCamp.","archived":false,"fork":false,"pushed_at":"2025-04-08T17:43:02.000Z","size":11368,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-07T10:40:15.469Z","etag":null,"topics":["data-analysis","data-science","machine-learning","matplotlib","numpy","pandas","python","scikit-learn","seaborn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/audy21.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-23T09:09:38.000Z","updated_at":"2025-04-08T17:43:05.000Z","dependencies_parsed_at":"2025-06-18T11:43:54.789Z","dependency_job_id":null,"html_url":"https://github.com/audy21/datacamp","commit_stats":null,"previous_names":["audy21/datacamp-excercise"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/audy21/datacamp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy21%2Fdatacamp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy21%2Fdatacamp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy21%2Fdatacamp/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy21%2Fdatacamp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/audy21","download_url":"https://codeload.github.com/audy21/datacamp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy21%2Fdatacamp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31670061,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T17:19:37.612Z","status":"online","status_checked_at":"2026-04-11T02:00:05.776Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","machine-learning","matplotlib","numpy","pandas","python","scikit-learn","seaborn"],"created_at":"2025-05-20T10:13:43.374Z","updated_at":"2026-04-11T05:34:00.979Z","avatar_url":"https://github.com/audy21.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📊 DataCamp Project Portfolio\n\n## Clustering Antarctic Penguin Species\n**Description**:  \nAn unsupervised learning project analyzing physical measurements of three penguin species from Palmer Station, Antarctica.  \n**Goal**:  \nIdentify distinct species clusters using morphological features (bill length, flipper size).  
\n**Lessons Learned**:  \n- Feature scaling is critical for distance-based algorithms like K-Means  \n- Composite features (e.g., bill-to-flipper ratio) can improve separation  \n- Silhouette scores help validate cluster quality  \n\n## Detect Traffic Signs with Deep Learning  \n**Description**:  \nA computer vision system to classify 43 types of German traffic signs from real-world images.  \n**Goal**:  \nBuild a CNN model deployable in autonomous vehicle systems.  \n**Lessons Learned**:  \n- Albumentations outperforms traditional augmentation for distorted signs  \n- Model pruning reduces TensorRT deployment size by 40%  \n- Grayscale conversion improves contrast for low-light signs  \n\n## Financial Fraud Detection Monitoring  \n**Description**:  \nProduction monitoring system for a live credit card fraud detection model.  \n**Goal**:  \nDetect data drift and maintain \u003e99% precision in real-time predictions.  \n**Lessons Learned**:  \n- Feature drift occurs 3x faster than label drift in financial data  \n- Evidently AI's dashboards reduce alert fatigue by 60%  \n- Cold-start problem requires synthetic fraud samples  \n\n## Predictive Modeling for Agriculture  \n**Description**:  \nSatellite-data powered yield prediction for Midwest corn farms (2015-2022).  \n**Goal**:  \nForecast harvest volumes with \u003c10% error using weather and soil data.  \n**Lessons Learned**:  \n- LSTMs outperform RF for sequential NDVI data  \n- Soil moisture embeddings boost accuracy in drought years  \n- SHAP reveals unexpected temperature threshold effects  \n\n## Service Desk Ticket Classification  \n**Description**:  \nNLP system automating 25,000+ IT support ticket categorizations monthly.  \n**Goal**:  \nReduce manual routing time by 70% while maintaining 90%+ accuracy.  
\n**Lessons Learned**:  \n- DistilBERT achieves BERT-level accuracy with 40% fewer parameters  \n- Label smoothing handles ambiguous \"urgent/not urgent\" cases  \n- FastAPI deployment cuts inference latency to \u003c200ms  \n\n## Nobel Prize Winners Visualization  \n**Description**:  \nInteractive exploration of 120 years of laureate demographics and trends.  \n**Goal**:  \nUncover historical patterns in award distribution across genders/countries.  \n**Lessons Learned**:  \n- Animation is powerful for showing temporal trends  \n- Small multiples \u003e complex dashboards for category comparisons  \n- Physics laureates have longest career-to-prize gap (avg. 28 years)  \n\n---\n\n## 🛠️ Technical Stack  \n```text\n▸ Clustering: Scikit-learn | Seaborn  \n▸ CV: TensorFlow | OpenCV \n▸ NLP: Transformers | spaCy \n▸ Analytics: Pandas | Matplotlib\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faudy21%2Fdatacamp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faudy21%2Fdatacamp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faudy21%2Fdatacamp/lists"}