https://github.com/albertmeronyo/semanticcorrelation

Playground to provide semantic similarity measures between statistically correlated concepts

SemanticCorrelation
===================

Playground to provide semantic similarity measures between
statistically correlated concepts

## What is this?

A script that reads concept descriptions from (Linked Statistical)
datasets and outputs the semantic similarity of every possible pair of
concepts, computed with
[LSI](http://www.cs.bham.ac.uk/~pxt/IDA/lsa_ind.pdf).
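
The idea can be illustrated with a small, dependency-free sketch. It is not the script's implementation: the function names, the whitespace tokenizer, the raw term counts, the made-up concept descriptions, and the basic power-iteration eigen-solver (used here only for brevity, in place of a library SVD) are all assumptions. It builds a term-document matrix, keeps the top latent dimensions, and compares every pair of descriptions by cosine similarity in that latent space.

```python
import math
from itertools import combinations


def term_document_matrix(descriptions):
    """Rows are terms, columns are descriptions; cells hold raw term counts."""
    tokenized = [text.lower().split() for text in descriptions]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    row = {term: i for i, term in enumerate(vocab)}
    matrix = [[0.0] * len(descriptions) for _ in vocab]
    for col, toks in enumerate(tokenized):
        for tok in toks:
            matrix[row[tok]][col] += 1.0
    return matrix


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(r, vec)) for r in matrix]


def top_eigenpairs(sym, k, iters=200):
    """Largest k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(sym)
    work = [r[:] for r in sym]
    pairs = []
    for _topic in range(k):
        vec = [1.0 + i for i in range(n)]  # fixed start vector keeps runs repeatable
        for _step in range(iters):
            nxt = matvec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient: eigenvalue estimate for the current vector.
        val = sum(a * b for a, b in zip(matvec(work, vec), vec)) / sum(x * x for x in vec)
        if val < 1e-12:
            break  # no latent dimensions left
        pairs.append((val, vec))
        # Deflate: remove the component just found before looking for the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= val * vec[i] * vec[j]
    return pairs


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def pairwise_similarities(descriptions, num_topics=2):
    """Return (i, j, similarity) for every pair of descriptions."""
    n = len(descriptions)
    if n < 2:
        return []
    matrix = term_document_matrix(descriptions)
    # Gram matrix A^T A: its eigenvectors are the right singular vectors of A,
    # and its eigenvalues are the squared singular values.
    gram = [[sum(r[a] * r[b] for r in matrix) for b in range(n)] for a in range(n)]
    pairs = top_eigenpairs(gram, num_topics)
    # Coordinates of description j in the latent space: sigma_t * v_t[j].
    coords = [[math.sqrt(val) * vec[j] for val, vec in pairs] for j in range(n)]
    return [(i, j, cosine(coords[i], coords[j])) for i, j in combinations(range(n), 2)]


if __name__ == "__main__":
    # Hypothetical concept descriptions, for illustration only.
    concepts = [
        "gdp per capita growth",
        "gdp growth annual",
        "infant mortality rate",
        "child mortality rate",
    ]
    for i, j, sim in pairwise_similarities(concepts, num_topics=2):
        print("%s | %s | %.3f" % (concepts[i], concepts[j], sim))
```

With only two latent dimensions, descriptions that share vocabulary collapse onto the same direction, so the number of topics controls how coarse the comparison is.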

## Why?

It belongs to a [broader
effort](https://github.com/csarven/linked-dataset-similarity-correlation)
to study the relationship between correlation and semantic similarity
of datasets.

## How to use it?

`./semanticCorrelation.py [-e <endpoint> | -i <input>] -o <output>
[-v] [-t <topics>] [-it <iterations>]`

- `-t` is the number of topics for LSI (default 200)
- `-it` is the number of power iterations for LSI (default 2). More
iterations increase precision but slow the computation
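
For illustration, the options above could be declared with Python's standard `argparse` module. This is a sketch of the command-line interface as described here, not the script's actual code: the `dest` names and help strings are hypothetical, `-v` is assumed to mean verbose output, and `-i` is assumed to name some local input as an alternative to `-e`.

```python
import argparse


def build_parser():
    """Declare the flags from the usage line; names and help texts are assumptions."""
    parser = argparse.ArgumentParser(prog="semanticCorrelation.py")

    # -e and -i are alternatives, mirroring "[-e | -i]" in the usage line.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-e", dest="endpoint", help="SPARQL endpoint URL")
    source.add_argument("-i", dest="input", help="input to read instead of an endpoint (assumed)")

    parser.add_argument("-o", dest="output", required=True, help="output file")
    parser.add_argument("-v", dest="verbose", action="store_true", help="verbose output (assumed)")
    parser.add_argument("-t", dest="topics", type=int, default=200,
                        help="number of topics for LSI")
    parser.add_argument("-it", dest="iterations", type=int, default=2,
                        help="number of power iterations for LSI")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list rather than sys.argv, to keep the demo repeatable.
    args = build_parser().parse_args(
        ["-e", "http://worldbank.270a.info/sparql", "-o", "similarities.csv", "-v", "-t", "300"]
    )
    print(args)
```

Options left out on the command line fall back to the documented defaults, so a run that passes only `-t 300` still uses 2 power iterations.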

## Example

`./semanticCorrelation.py -e http://worldbank.270a.info/sparql -o
similarities.csv -v -t 300`

## Dependencies

- Python 2.7.5
- NLTK 2.0.4
- SPARQLWrapper 1.5.2
- gensim 0.10.0

## Disclaimer

Author: [Albert Meroño-Peñuela](https://github.com/albertmeronyo)

License: [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0)