{"id":31389154,"url":"https://github.com/quantum-software-development/7-datamining-regression-techniques-data-integration","last_synced_at":"2025-09-28T23:59:18.466Z","repository":{"id":314477770,"uuid":"1055681269","full_name":"Quantum-Software-Development/7-DataMining-Regression-Techniques-Data-Integration","owner":"Quantum-Software-Development","description":"7-Data Minining - Regression Techniques with Data Integration","archived":false,"fork":false,"pushed_at":"2025-09-24T12:02:47.000Z","size":1247,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-24T14:12:59.085Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://github.com/Quantum-Software-Development/7-DataMining-Regression-Techniques-Data-Integration","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Quantum-Software-Development.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"Quantum-Software-Development","Custom":"https://github.com/sponsors/Quantum-Software-Development/card"}},"created_at":"2025-09-12T16:24:36.000Z","updated_at":"2025-09-24T12:02:49.000Z","dependencies_parsed_at":"2025-09-12T18:52:35.610Z","dependency_job_id":"41658045-f454-41ed-8a96-000d312d4bd9","html_url":"https://github.com/Quantum-Software-Development/7-DataMining-Regression-Techniques-Data-Integration","commit_stats":null,"previous_names":["quantum-software-development
/7-datamining_xxx","quantum-software-development/7-datamining-regression-techniques-data-integration"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Quantum-Software-Development/7-DataMining-Regression-Techniques-Data-Integration","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantum-Software-Development%2F7-DataMining-Regression-Techniques-Data-Integration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantum-Software-Development%2F7-DataMining-Regression-Techniques-Data-Integration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantum-Software-Development%2F7-DataMining-Regression-Techniques-Data-Integration/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantum-Software-Development%2F7-DataMining-Regression-Techniques-Data-Integration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Quantum-Software-Development","download_url":"https://codeload.github.com/Quantum-Software-Development/7-DataMining-Regression-Techniques-Data-Integration/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantum-Software-Development%2F7-DataMining-Regression-Techniques-Data-Integration/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":277446623,"owners_count":25819183,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-28T02:00:08.834Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-28T23:59:16.572Z","updated_at":"2025-09-28T23:59:18.461Z","avatar_url":"https://github.com/Quantum-Software-Development.png","language":"Jupyter Notebook","readme":"\n\u003cbr\u003e\n\n**\\[[🇧🇷 Português](README.pt_BR.md)\\] \\[**[🇺🇸 English](README.md)**\\]**\n\n\n\u003cbr\u003e\u003cbr\u003e\n\n# 7- [Data Mining]() / Regression Techniques with Data Integration\n\n\n\u003c!-- ======================================= Start DEFAULT HEADER ===========================================  --\u003e\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n[**Institution:**]() Pontifical Catholic University of São Paulo (PUC-SP)  \n[**School:**]() Faculty of Interdisciplinary Studies  \n[**Program:**]() Humanistic AI and Data Science\n[**Semester:**]() 2nd Semester 2025  \nProfessor:  [***Professor Doctor in Mathematics Daniel Rodrigues da Silva***](https://www.linkedin.com/in/daniel-rodrigues-048654a5/)\n\n\u003cbr\u003e\u003cbr\u003e\n\n#### \u003cp align=\"center\"\u003e [![Sponsor Quantum Software Development](https://img.shields.io/badge/Sponsor-Quantum%20Software%20Development-brightgreen?logo=GitHub)](https://github.com/sponsors/Quantum-Software-Development)\n\n\n\u003cbr\u003e\u003cbr\u003e\n\n\u003c!--Confidentiality statement --\u003e\n\n#\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003e [!IMPORTANT]\n\u003e \n\u003e ⚠️ Heads Up\n\u003e\n\u003e * Projects and deliverables may be made [publicly available]() whenever possible.\n\u003e * The course emphasizes [**practical, hands-on experience**]() with real datasets to simulate professional consulting scenarios in the fields of **Data Analysis and Data Mining** for partner organizations and institutions affiliated with the university.\n\u003e * All activities comply with the [**academic and ethical guidelines of 
PUC-SP**]().\n\u003e * Any content not authorized for public disclosure will remain [**confidential**]() and securely stored in [private repositories]().  \n\u003e\n\n\n\u003cbr\u003e\u003cbr\u003e\n\n#\n\n\u003c!--END--\u003e\n\n\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\n\n\u003c!-- PUC HEADER GIF\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/user-attachments/assets/0d6324da-9468-455e-b8d1-2cce8bb63b06\" /\u003e\n--\u003e\n\n\n\u003c!-- video presentation --\u003e\n\n\n##### 🎶 Prelude Suite no.1 (J. S. Bach) - [Sound Design Remix]()\n\nhttps://github.com/user-attachments/assets/4ccd316b-74a1-4bae-9bc7-1c705be80498\n\n####  📺 For better resolution, watch the video on [YouTube.](https://youtu.be/_ytC6S4oDbM)\n\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n\u003e [!TIP]\n\u003e \n\u003e  This repository is a review of the Statistics course from the undergraduate program Humanities, AI and Data Science at PUC-SP.\n\u003e\n\u003e  ### ☞ **Access Data Mining [Main Repository](https://github.com/Quantum-Software-Development/1-Main_DataMining_Repository)**\n\u003e \n\u003e\n\n\u003c!-- =======================================END DEFAULT HEADER ===========================================  --\u003e\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n\n## [Overview]()\n\nThis repository covers fundamental concepts and practical techniques in Data Mining focused on clustering (grouping by similarity), various types of regression for modeling data trends, and the crucial steps for data integration and preprocessing. 
Each section includes theoretical explanations, use case examples, mathematical formulations using LaTeX, and Python code snippets to assist practical understanding.\n\n\u003cbr\u003e\u003cbr\u003e\n\n## [Table of Contents]()\n\n- [Clustering](#clustering)\n- [Regression Types](#regression-types)\n- [Data Integration](#data-integration)\n    - [Data Redundancy and Duplicates](#data-redundancy-and-duplicates)\n    - [Data Conflicts](#data-conflicts)\n    - [Data Compression](#data-compression)\n    - [PCA - Principal Component Analysis](#pca)\n    - [Data Standardization](#data-standardization)\n    - [Data Normalization](#data-normalization)\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n\u003c!-- ========================== Bibliography ====================  --\u003e\n\n\n## [Bibliography]()\n\n[1](). **Castro, L. N. \u0026 Ferrari, D. G.** (2016). *Introduction to Data Mining: Basic Concepts, Algorithms, and Applications*. Saraiva.\n\n[2](). **Ferreira, A. C. P. L. et al.** (2024). *Artificial Intelligence – A Machine Learning Approach*. 2nd Ed. LTC.\n\n[3](). **Larson \u0026 Farber** (2015). *Applied Statistics*. Pearson.\n\n\n\u003c!-- ======================================= Bibliography Portuguese ===========================================  --\u003e\n\n\u003c!--\n\n## [Bibliography]()\n\n\n[1](). **Castro, L. N. \u0026 Ferrari, D. G.** (2016). *Introdução à mineração de dados: conceitos básicos, algoritmos e aplicações*. Saraiva.\n\n[2](). **Ferreira, A. C. P. L. et al.** (2024). *Inteligência Artificial - Uma Abordagem de Aprendizado de Máquina*. 2nd Ed. LTC.\n\n[3](). **Larson \u0026 Farber** (2015). *Estatística Aplicada*. 
Pearson.\n\n\n\u003cbr\u003e\u003cbr\u003e\n--\u003e\n\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n\u003c!-- ======================================= Start Footer ===========================================  --\u003e\n\n\n## 💌 [Let the data flow... Ping Me !](mailto:fabicampanari@proton.me)\n\n\u003cbr\u003e\u003cbr\u003e\n\n\n\n#### \u003cp align=\"center\"\u003e  🛸๋ My Contacts [Hub](https://linktr.ee/fabianacampanari)\n\n\n\u003cbr\u003e\n\n### \u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/user-attachments/assets/517fc573-7607-4c5d-82a7-38383cc0537d\" /\u003e\n\n\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003cp align=\"center\"\u003e  ────────────── 🔭⋆ ──────────────\n\n\n\u003cp align=\"center\"\u003e ➣➢➤ \u003ca href=\"#top\"\u003eBack to Top \u003c/a\u003e\n\n\u003c!--\n\u003cp align=\"center\"\u003e  ────────────── ✦ ──────────────\n--\u003e\n\n\n\n\u003c!-- Programmers and artists are the only professionals whose hobby is their profession.\"\n\n\" I love people who are committed to transforming the world \"\n\n\" I'm big fan of those who are making waves in the world! \"\n\n##### \u003cp align=\"center\"\u003e( Rafael Lain ) \u003c/p\u003e   --\u003e\n\n#\n\n###### \u003cp align=\"center\"\u003e Copyright 2025 Quantum Software Development. 
Code released under the [MIT License.](https://github.com/Quantum-Software-Development/Math/blob/3bf8270ca09d3848f2bf22f9ac89368e52a2fb66/LICENSE)\n","funding_links":["https://github.com/sponsors/Quantum-Software-Development","https://github.com/sponsors/Quantum-Software-Development/card"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantum-software-development%2F7-datamining-regression-techniques-data-integration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantum-software-development%2F7-datamining-regression-techniques-data-integration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantum-software-development%2F7-datamining-regression-techniques-data-integration/lists"}