{"id":21432461,"url":"https://github.com/gagolews/datafusion","last_synced_at":"2025-03-16T22:35:24.629Z","repository":{"id":52745393,"uuid":"465098652","full_name":"gagolews/datafusion","owner":"gagolews","description":"Data Fusion (open-access research monograph, 2015)","archived":false,"fork":false,"pushed_at":"2022-08-09T00:52:13.000Z","size":5008,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-23T08:43:13.394Z","etag":null,"topics":["aggregation","data","fusion","fuzzy-logic","mean","multidimensional-analysis","multidimensional-data","spread","statistics","strings","variance"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gagolews.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null}},"created_at":"2022-03-02T00:13:41.000Z","updated_at":"2022-08-04T00:31:13.000Z","dependencies_parsed_at":"2022-08-13T02:01:27.353Z","dependency_job_id":null,"html_url":"https://github.com/gagolews/datafusion","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gagolews%2Fdatafusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gagolews%2Fdatafusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gagolews%2Fdatafusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gagolews%2Fdatafusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gagolews","download_url":"https://codeload.github.com/gagolews/datafusion/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243945321,"owners_count":20372890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aggregation","data","fusion","fuzzy-logic","mean","multidimensional-analysis","multidimensional-data","spread","statistics","strings","variance"],"created_at":"2024-11-22T23:18:38.386Z","updated_at":"2025-03-16T22:35:24.609Z","avatar_url":"https://github.com/gagolews.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Fusion: Theory, Methods, and Applications\n\nAn open-access research monograph by [Marek Gagolewski](https://www.gagolewski.com)\n([download PDF](https://raw.githubusercontent.com/gagolews/datafusion/master/datafusion.pdf))\n\n\n---\n\nA **proper fusion of complex data** is of interest to many researchers in\ndiverse fields, including computational statistics, computational geometry,\nbioinformatics, machine learning, pattern recognition, quality management,\nengineering, statistics, finance, economics, etc. It plays a crucial role in:\n\n* synthetic description of data processes or whole domains,\n* creation of rule bases for approximate reasoning tasks,\n* reaching consensus and selection of the optimal strategy\n    in decision support systems,\n* imputation of missing values,\n* data deduplication and consolidation,\n* record linkage across heterogeneous databases,\n* clustering.\n\nFurthermore, many useful machine learning methods are based on a proper\naggregation of information entities. In particular, the class of ensemble\nmethods for classification is very successful because of the assumption\nthat no single \"weak\" classifier can perform as nicely as the whole group.\nNeural networks and other deep learning tools can be understood as hierarchies\nof individual fusion functions. Appropriate data fusion is crucial\nfor privacy reasons as well (think: GDPR).\n\nThis **open-access research monograph** integrates the spread-out results from\ndifferent domains using the methodology of the well-established classical\naggregation framework, introduces researchers and practitioners\nto Aggregation 2.0, as well as points out the challenges and interesting\ndirections for further research.\n\n---\n\n[Gagolewski M.](https://www.gagolewski.com),\n[*Data Fusion: Theory, Methods, and Applications*](https://raw.githubusercontent.com/gagolews/datafusion/master/datafusion.pdf),\nInstitute of Computer Science, Polish Academy of Sciences,\nWarsaw, Poland, 2015, 290 pp.,\nISBN: 978-83-63159-20-7,\nDOI: [10.5281/zenodo.6960306](https://doi.org/10.5281/zenodo.6960306).\n\nReviewers:\n[Gleb Beliakov](https://scholar.google.com/citations?user=_plRpWEAAAAJ) and\n[Radko Mesiar](https://scholar.google.com/citations?user=_kXpl5YAAAAJ).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgagolews%2Fdatafusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgagolews%2Fdatafusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgagolews%2Fdatafusion/lists"}