https://github.com/cissagatto/multilabelsimilaritiesmeasures
Compute similarities measures (categorical data) for all labels in label space for a multilabel dataset
binary-coefficients categorial-data label-space machine-learning multilabel-classification multilabel-partitions partitions similarities-coefficients similarities-measures
- Host: GitHub
- URL: https://github.com/cissagatto/multilabelsimilaritiesmeasures
- Owner: cissagatto
- License: gpl-3.0
- Created: 2021-10-15T01:01:50.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-10-30T16:45:05.000Z (over 2 years ago)
- Last Synced: 2023-10-30T17:41:59.509Z (over 2 years ago)
- Topics: binary-coefficients, categorial-data, label-space, machine-learning, multilabel-classification, multilabel-partitions, partitions, similarities-coefficients, similarities-measures
- Language: R
- Homepage:
- Size: 1.75 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# MultiLabel Similarities Measures
Compute similarity measures (for categorical data) for all labels in the label space of a multilabel dataset.
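As a rough illustration of what a label-to-label similarity looks like, the sketch below builds the 2x2 contingency counts for two binary label columns and fills a similarity matrix over a toy label space. The data, the function names (`contingency`, `jaccard`) and the choice of the Jaccard coefficient are assumptions made for this example; they are not taken from the scripts in this repository.

```r
# Toy label space: 5 instances, 3 binary labels (values are made up).
labels <- data.frame(
  L1 = c(1, 1, 0, 1, 0),
  L2 = c(1, 0, 0, 1, 1),
  L3 = c(0, 0, 1, 0, 0)
)

# 2x2 contingency counts for two binary label columns (hypothetical helper).
contingency <- function(x, y) {
  c(a = sum(x == 1 & y == 1),  # both labels present
    b = sum(x == 1 & y == 0),  # only the first label present
    c = sum(x == 0 & y == 1),  # only the second label present
    d = sum(x == 0 & y == 0))  # both labels absent
}

# Jaccard coefficient a / (a + b + c); a zero denominator yields 0.
jaccard <- function(x, y) {
  ct <- contingency(x, y)
  den <- ct[["a"]] + ct[["b"]] + ct[["c"]]
  if (den == 0) 0 else ct[["a"]] / den
}

# Similarity matrix over every pair of labels.
n <- ncol(labels)
sim <- matrix(0, n, n, dimnames = list(names(labels), names(labels)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- jaccard(labels[[i]], labels[[j]])
  }
}
print(sim)
```

Other binary coefficients can be swapped in by replacing `jaccard` with another function of the same four counts.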
## Multi-Label Datasets (original)
Click [here](https://cometa.ujaen.es/datasets/) to go to the Cometa page.
## 10-Fold Cross Validation Multi-Label Datasets
Click [here](https://www.4shared.com/s/dYpGZWzjQ) to download
## Conda Environment
[download txt](https://www.4shared.com/s/fUCVTl13zea)
[download yml](https://www.4shared.com/s/f8nOZyxj9iq)
[download yaml](https://www.4shared.com/s/fk5Io4faLiq)
To set up a conda environment for running this experiment, see the [conda documentation](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).
## Tutorial
https://rpubs.com/cissagatto/MultiLabelSimilaritiesMeasures
## How to cite
@misc{Gatto2021, author = {Gatto, E. C.}, title = {Compute Similarities Measures for MultiLabel Classification}, year = {2021}, publisher = {GitHub}, journal = {GitHub repository}, howpublished = {\url{https://github.com/cissagatto/MultiLabelSimilaritiesMeasures}}}
# Scripts
The R folder contains the following scripts:
1. functions_contingency_table_multilabel.R
2. functions_measures_binary_data.R
3. functions_multilabel_binary_measures.R
4. libraries.R
5. utils.R
6. runCV.R
7. runNCV.R
8. mlsm.R
## FLOWCHART
## Preparing your experiment
### Step-1
This code runs with X-fold cross-validation. First, obtain the X-fold cross-validation files using this [code](https://github.com/cissagatto/CrossValidationMultiLabel). All instructions for using that code are in its GitHub repository. Then put the generated results in the *datasets* folder of this project as "tar.gz" files. The folder structure generated by the CrossValidation code is used here. This code does not work without these files.
### Step-2
A file called _datasets.csv_ must be in the project *root* folder. The code reads information about the datasets from this file. All 74 datasets available in *Cometa* are listed in it. If you want to use another dataset, please add the following information about it to the file:
_Id, Name, Domain, Labels, Instances, Attributes, Inputs, Labelsets, Single, Max freq, Card, Dens, MeanIR, Scumble, TCS, AttStart, AttEnd, LabelStart, LabelEnd, xn, yn, gridn_
The *Id* of the dataset is a mandatory command-line parameter. The other fields are used by many internal functions, so make sure this information is available before running the code. *xn* and *yn* are the dimensions of the quadrangular Kohonen map, and *gridn* is their product (xn × yn). Example: xn = 4, yn = 4, gridn = 16.
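Before launching a run, it can help to check that the row for your dataset is present and that *gridn* really equals xn × yn. The following is a minimal sketch: the data frame stands in for a few columns of _datasets.csv_ with made-up values, and `get_dataset` is a hypothetical helper, not a function from this repository.

```r
# Hypothetical excerpt of datasets.csv: only a few columns, made-up values.
# With a real file this would be: datasets <- read.csv("datasets.csv")
datasets <- data.frame(
  Id = c(1, 2),
  Name = c("toyA", "toyB"),
  xn = c(4, 3),
  yn = c(4, 3),
  gridn = c(16, 9),
  stringsAsFactors = FALSE
)

# Look up one dataset by Id and verify that gridn = xn * yn.
get_dataset <- function(ds, id) {
  row <- ds[ds$Id == id, ]
  if (nrow(row) != 1) {
    stop("Id not found (or duplicated) in datasets.csv: ", id)
  }
  if (row$gridn != row$xn * row$yn) {
    stop("gridn must equal xn * yn for Id ", id)
  }
  row
}

info <- get_dataset(datasets, 2)
print(info)
```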
## RUN
To run the code, open a terminal, go to the */MultiLabelSimilaritiesMeasures/R/* folder, and type:
```
Rscript mlsm.R [number_dataset] [number_cores] [number_folds] [name_folder_results]
```
Where:
- _number_dataset_ is the dataset number in the _datasets.csv_ file.
- _number_cores_ is the total number of cores to use for parallel execution.
- _number_folds_ is the number of folds to use for cross-validation.
- _name_folder_results_ is the name of the folder in which to save the results.
All parameters are mandatory. Example:
```
Rscript mlsm.R 17 10 10 "/dev/shm/results"
```
This runs the code for dataset number 17 in _datasets.csv_ with 10 cores and 10 folds, and stores the results in _/dev/shm/results/_. The code automatically copies */dev/shm/results* to the *Reports* folder in the project root. This way, you can run the code in a temporary folder, such as *scratch* or *shm*, to speed up execution.
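For readers adapting the command line to their own scripts, here is one way four positional arguments like these could be read in R. This is only a sketch under the assumption that the arguments arrive in the order shown above; `parse_args` is a hypothetical helper and may differ from what _mlsm.R_ actually does.

```r
# Parse the four positional arguments described above.
# 'args' is passed in explicitly so the function can be tried without Rscript.
parse_args <- function(args) {
  if (length(args) != 4) {
    stop("Usage: Rscript mlsm.R [number_dataset] [number_cores] ",
         "[number_folds] [name_folder_results]")
  }
  list(
    number_dataset = as.integer(args[1]),
    number_cores = as.integer(args[2]),
    number_folds = as.integer(args[3]),
    folder_results = args[4]
  )
}

# In a script run with Rscript, one would typically call:
#   params <- parse_args(commandArgs(trailingOnly = TRUE))
params <- parse_args(c("17", "10", "10", "/dev/shm/results"))
str(params)
```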
## IMPORTANT
The ABS function is used in every function that uses SQRT. Divisions by zero are treated as zero.
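The two conventions can be sketched as small helpers. This example assumes that ABS is applied to the argument of SQRT, which is one plausible reading of the note above; the helper names and the use of the Ochiai coefficient, a / sqrt((a + b)(a + c)), are illustrative choices and not taken from this repository. Here a counts instances where both labels are present, b where only the first is present, and c where only the second is present.

```r
# Division that returns 0 wherever the denominator is 0.
safe_div <- function(num, den) {
  ifelse(den == 0, 0, num / den)
}

# Square root of the absolute value, so negative inputs do not yield NaN.
abs_sqrt <- function(x) {
  sqrt(abs(x))
}

# Ochiai coefficient from contingency counts a, b, c.
ochiai <- function(a, b, c) {
  safe_div(a, abs_sqrt((a + b) * (a + c)))
}

print(ochiai(2, 1, 1))  # 2 / sqrt(3 * 3)
print(ochiai(0, 0, 0))  # zero denominator, so the result is 0
```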
## Video Demonstration
Click [here](https://youtu.be/rrSh7vF60bA) to watch a video that demonstrates how to run this code.
## Acknowledgment
- This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
- This study was financed in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico - Brasil (CNPQ) - Process number 200371/2022-3.
- The authors also thank the Brazilian research agency FAPESP for financial support.
# Contact
elainececiliagatto@gmail.com
## Links
| [Site](https://sites.google.com/view/professor-cissa-gatto) | [Post-Graduate Program in Computer Science](http://ppgcc.dc.ufscar.br/pt-br) | [Computer Department](https://site.dc.ufscar.br/) | [Biomal](http://www.biomal.ufscar.br/) | [CNPQ](https://www.gov.br/cnpq/pt-br) | [Ku Leuven](https://kulak.kuleuven.be/) | [Embarcados](https://www.embarcados.com.br/author/cissa/) | [Read Prensa](https://prensa.li/@cissa.gatto/) | [Linkedin Company](https://www.linkedin.com/company/27241216) | [Linkedin Profile](https://www.linkedin.com/in/elainececiliagatto/) | [Instagram](https://www.instagram.com/cissagatto) | [Facebook](https://www.facebook.com/cissagatto) | [Twitter](https://twitter.com/cissagatto) | [Twitch](https://www.twitch.tv/cissagatto) | [Youtube](https://www.youtube.com/CissaGatto) |
# Thanks