{"id":50306899,"url":"https://github.com/cissagatto/multilabelsimilaritiesmeasures","last_synced_at":"2026-05-28T17:01:59.642Z","repository":{"id":192263245,"uuid":"417327987","full_name":"cissagatto/MultiLabelSimilaritiesMeasures","owner":"cissagatto","description":"Compute similarities measures (categorical data) for all labels in label space for a multilabel dataset","archived":false,"fork":false,"pushed_at":"2023-10-30T16:45:05.000Z","size":1831,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2023-10-30T17:41:59.509Z","etag":null,"topics":["binary-coefficients","categorial-data","label-space","machine-learning","multilabel-classification","multilabel-partitions","partitions","similarities-coefficients","similarities-measures"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cissagatto.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-10-15T01:01:50.000Z","updated_at":"2023-09-02T15:19:03.000Z","dependencies_parsed_at":"2023-09-03T20:48:31.705Z","dependency_job_id":"beaa87a8-1c7a-4d96-adf5-73835a115ea1","html_url":"https://github.com/cissagatto/MultiLabelSimilaritiesMeasures","commit_stats":null,"previous_names":["cissagatto/multilabelsimilaritiesmeasures"],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/cissagatto/MultiLabelSimilaritiesMeasures","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cissagatto%2FMultiLabelSimilaritiesMeasures","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cissa
gatto%2FMultiLabelSimilaritiesMeasures/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cissagatto%2FMultiLabelSimilaritiesMeasures/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cissagatto%2FMultiLabelSimilaritiesMeasures/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cissagatto","download_url":"https://codeload.github.com/cissagatto/MultiLabelSimilaritiesMeasures/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cissagatto%2FMultiLabelSimilaritiesMeasures/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33617718,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-28T02:00:06.440Z","response_time":99,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-coefficients","categorial-data","label-space","machine-learning","multilabel-classification","multilabel-partitions","partitions","similarities-coefficients","similarities-measures"],"created_at":"2026-05-28T17:01:55.758Z","updated_at":"2026-05-28T17:01:59.613Z","avatar_url":"https://github.com/cissagatto.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MultiLabel Similarities Measures\nCompute similarities measures (categorical data) for all labels in label space for a multilabel 
dataset.\n\n## Multi-Label Datasets (original)\nClick [here](https://cometa.ujaen.es/datasets/) to go to the Cometa page.\n\n## 10-Fold Cross Validation Multi-Label Datasets\nClick [here](https://www.4shared.com/s/dYpGZWzjQ) to download\n\n## Conda Environment\n[download txt](https://www.4shared.com/s/fUCVTl13zea)\n\n[download yml](https://www.4shared.com/s/f8nOZyxj9iq)\n\n[download yaml](https://www.4shared.com/s/fk5Io4faLiq)\n\nTo use the Conda environment to run this experiment, please consult the instructions [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) \n\n## Tutorial\n\nhttps://rpubs.com/cissagatto/MultiLabelSimilaritiesMeasures\n\n## How to cite \n@misc{Gatto2021, author = {Gatto, E. C.}, title = {Compute Similarities Measures for MultiLabel Classification}, year = {2021}, publisher = {GitHub}, journal = {GitHub repository}, howpublished = {\\url{https://github.com/cissagatto/MultiLabelSimilaritiesMeasures}}}\n\n# Scripts\nThis code has the following scripts in the R folder:\n\n1. functions_contingency_table_multilabel.R\n2. functions_measures_binary_data.R\n3. functions_multilabel_binary_measures.R\n4. libraries.R\n5. utils.R\n6. runCV.R\n7. runNCV.R\n8. mlsm.R\n\n## FLOWCHART\n\n\n## Preparing your experiment\n\n### Step-1\nThis code runs on X-fold cross-validation data. First, you have to obtain the X-fold cross-validation files using this [code](https://github.com/cissagatto/CrossValidationMultiLabel). All the instructions for using that code are in its GitHub repository. After that, put the generated results in the *datasets* folder of this project as \"tar.gz\" files. The folder structure generated by the CrossValidation code is used here. This code doesn't work without these files.\n\n### Step-2\nA file called _datasets.csv_ must be in the project *root* folder. This file provides the information about the datasets that the code uses. All 74 datasets available in *Cometa* are in this file. 
If you want to use another dataset, please add the following information about the dataset to the file:\n\n_Id, Name, Domain, Labels, Instances, Attributes, Inputs, Labelsets, Single, Max freq, Card, Dens, MeanIR, Scumble, TCS, AttStart, AttEnd, LabelStart, LabelEnd, xn, yn, gridn_\n\nThe *Id* of the dataset is a mandatory command-line parameter for running the code. The fields are used in many internal functions. Please make sure that this information is available before running the code. *xn* and *yn* are the two dimensions of the quadrangular Kohonen map, and *gridn* is (xn * yn). Example: xn = 4, yn = 4, gridn = 16.\n\n\n## RUN\n\nTo run the code, open the terminal, enter the */MultiLabelSimilaritiesMeasures/R/* folder, and type\n\n```\nRscript mlsm.R [number_dataset] [number_cores] [number_folds] [name_folder_results]\n```\n\nWhere:\n\n_number_dataset_ is the dataset number in the datasets.csv file\n\n_number_cores_ is the total number of cores you want to use in parallel execution\n\n_number_folds_ is the number of folds you want for cross-validation\n\n_name_folder_results_ is the name of the folder in which to save the results\n\n\nAll parameters are mandatory. Example:\n\n```\nRscript mlsm.R 17 10 10 \"/dev/shm/results\"\n```\n\nThis will execute the code for dataset number 17 in _datasets.csv_, with 10 cores and 10 folds, and the results will be stored in _/dev/shm/results/_. This code automatically makes a copy of */dev/shm/results* in the folder *Reports* - which is in the root of the project. In this way, you can run the code using a temporary folder, such as *scratch* or *shm*, to speed up the execution.\n\n\n## IMPORTANT\nI used the ABS function in all functions that use SQRT. 
Divisions by zero were treated as zero.\n\n## Video Demonstration\nClick [here](https://youtu.be/rrSh7vF60bA) to watch a video that demonstrates how to run this code\n\n## Acknowledgment\n- This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.\n- This study was financed in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico - Brasil (CNPQ) - Process number 200371/2022-3.\n- The authors also thank the Brazilian research agency FAPESP for financial support.\n\n# Contact\nelainececiliagatto@gmail.com\n\n## Links\n\n| [Site](https://sites.google.com/view/professor-cissa-gatto) | [Post-Graduate Program in Computer Science](http://ppgcc.dc.ufscar.br/pt-br) | [Computer Department](https://site.dc.ufscar.br/) |  [Biomal](http://www.biomal.ufscar.br/) | [CNPQ](https://www.gov.br/cnpq/pt-br) | [KU Leuven](https://kulak.kuleuven.be/) | [Embarcados](https://www.embarcados.com.br/author/cissa/) | [Read Prensa](https://prensa.li/@cissa.gatto/) | [Linkedin Company](https://www.linkedin.com/company/27241216) | [Linkedin Profile](https://www.linkedin.com/in/elainececiliagatto/) | [Instagram](https://www.instagram.com/cissagatto) | [Facebook](https://www.facebook.com/cissagatto) | [Twitter](https://twitter.com/cissagatto) | [Twitch](https://www.twitch.tv/cissagatto) | [Youtube](https://www.youtube.com/CissaGatto) |\n\n# Thanks\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcissagatto%2Fmultilabelsimilaritiesmeasures","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcissagatto%2Fmultilabelsimilaritiesmeasures","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcissagatto%2Fmultilabelsimilaritiesmeasures/lists"}