https://github.com/dice-group/learnalclengths
Concept length prediction for the ALC description logic.
https://github.com/dice-group/learnalclengths
concept-learning description-logics knowledge-base owl owl2 web-ontology-language
Last synced: 4 months ago
JSON representation
Concept length prediction for the ALC description logic.
- Host: GitHub
- URL: https://github.com/dice-group/learnalclengths
- Owner: dice-group
- License: mit
- Created: 2022-03-01T14:19:31.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-02-12T15:18:43.000Z (over 1 year ago)
- Last Synced: 2025-06-24T07:37:23.854Z (12 months ago)
- Topics: concept-learning, description-logics, knowledge-base, owl, owl2, web-ontology-language
- Language: Jupyter Notebook
- Homepage:
- Size: 20.4 MB
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# LearnALCLengths
This repository contains our implementation of concept length predictors in the ALC description logic.
## Installation
- Clone this repository:
```
https://github.com/dice-group/LearnALCLengths.git
```
- Install Anaconda3, then all required librairies by executing the following commands (Linux):
1. ```conda create -n clip python==3.11.5 && conda activate clip ```
2. ```pip install -r requirements.txt ```
3. ```git clone https://github.com/dice-group/Ontolearn.git && cd Ontolearn && git checkout 0.5.4 && pip install -e .```
- Download DL-Learner-1.4.0 from [github](https://github.com/SmartDataAnalytics/DL-Learner/releases) and extract it into this repository (cloned above)
- Clone DLFoil and DLFocl [dlfoil](https://bitbucket.org/grizzo001/dl-foil.git), [dlfocl](https://bitbucket.org/grizzo001/dlfocl.git), and extract the two repositories into `LearnALCLengths/`
- Install Java (version 8+) and Apache Maven (Only necessary for running DL-Learner and DL-Foil/DL-Focl)
## Reproducing the reported results
### Datasets (necessary for running the algorithms)
- Download [datasets](https://files.dice-research.org/archive/CLIP/) and extract the zip file into `LearnALCLengths/` and rename the folder as Datasets
### CLIP (our method)
*Open a terminal and navigate into /reproduce_results/ ``` cd LearnALCLengths/reproduce_results/```
- Reproduce CLIP concept learning results on all KBs ``` sh reproduce_celoe_clp_experiment_all_kbs.sh```
- Reproduce the training of concept length predictors ``` sh reproduce_training_clp_on_all_kbs.sh```
- Furthermore, one can train concept length predictors on a single knowledge base as follows ``` python reproduce_training_length_predictors_K_kb.py```, where ```K``` is one of carcinogenesis, mutagenesis, semantic_bible or vicodi. Use -h to see more training options (example ```python reproduce_training_length_predictors_carcinogenesis_kb.py -h ```).
### CELOE, ELTL, OCEL from DL-Learner
*Open a terminal and navigate into /other_learning_systems/scripts ``` cd LearnALCLengths/dllearner/scripts```
- Reproduce concept learning results on knowledge base K for algorithm Algo ``` python reproduce_dllearner_experiment.py --learning_systems Algo --knowledge_bases K```
- To reproduce the results for multiple algorithms on multiple knowledge bases, use the schema ``` python reproduce_dllearner_experiment.py --learning_systems Algo1 Algo2... --knowledge_bases K1 K2...```
Note that ```Algo``` is one of celoe, ocel or eltl, and ```K``` is one of carcinogenesis, mutagenesis, semantic_bible or vicodi (all lower cased)
### DLFoil and DLFocl
*For DLFoil, open a terminal and navigate into /dl-foil/DLFoil2* ``` cd LearnALCLengths/dl-foil/DLFoil2```
- Run ```mvn clean install```
- Open a different terminal and run the following ```python LearnALCLengths/generators/generate_dlfoil_config_all_kbs.py```
- Now execute the following in the first terminal (in LearnALCLengths/dl-foil/DLFoil2): ```mvn -e exec:java -Dexec.mainClass=it.uniba.di.lacam.ml.DLFoilTest -Dexec.args=K_config.xml >> ../dlfoil_out_K.txt```, where `K` is one of carcinogenesis, mutagenesis, semantic_bible or vicodi.
Note that DLFoil fails to solve our learning problems as it gets stuck on the refinement of certain partial descriptions.
*We could not run DLFocl.*
The authors did not provide sufficient documentation to run their algorithm; the documentation is [here](https://bitbucket.org/grizzo001/dlfocl.git)
### Statistical Test
*Open a terminal and navigate into /reproduce_results/* ``` cd LearnALCLengths/reproduce_results/```
- Run Wilcoxon statistical test on concept learning results `All Algos vs CLIP`: ``` sh run_statistical_test_on_all_kbs.sh```
### Use your own data
- Add your data into Datasets: it should be a folder containing a file formatted as RDF/XML or OWL/XML and should have the same name as the folder.
- Navigate into /generators and run ```python train_data/generate_training_data.py --kb your_folder_name```, use -h to see more options. The generated file Data.json under ```your_folder_name/Train_data/``` should serve for training concept length predictors, see example scripts in ```/reproduce_results/train_clp/```.
- Similarly, learning problems can be generated using one of the example files in generators/learning_problems/ (replace folder names by your folder name)
- Navigate into /Embeddings/Compute-Embeddings/ and run the following to embed your knowledge base: ```python run_script.py --path_dataset_folder your_folder_name```
- Train concept length predictors by preparing and running your python file ``` reproduce_training_length_predictors_K_kb.py ``` following examples in ```/reproduce_results/train_clp/```.
- Finally, prepare a script (see examples in ```/reproduce_results/celoe_clp/```) and run CLIP on your data.
## Acknowledgement
We based our implementation on the open source implementation of [ontolearn](https://docs--ontolearn-docs-dice-group.netlify.app/). We would like to thank the Ontolearn team for the readable codebase.