https://github.com/zechengz/hin-dataset


heterogeneous-information-networks hin meta-path network-embedding


## Heterogeneous Information Network Datasets

### Download links

[DBLP (Google Drive)](https://drive.google.com/open?id=1YG9VR3vd6ewtMhdrcNXF5T_WTbx6MwYK): 601.4MB


[SLAP (Google Drive)](https://drive.google.com/open?id=1mIcLcxyg3WZApq6a4fIlADyU42WQKeGB): 295.8MB


[ACM (Google Drive)](https://drive.google.com/open?id=16R7ewS9cb5Bci7ClC0Ao1IYQmWPb-lHs): 752.1MB


[IMDB (Google Drive)](https://drive.google.com/open?id=1tqzNDkbZWGoG-vpM_M2X-EqRoPT1rp9k): 94.3MB

### Dataset information

| Dataset | # Nodes | Node types | Meta-paths | # Meta-path instances | # Labels | # Features |
|:-------:|:----------:|:-------------------------------------:|:-----------------------------------------------------:|:------------------:|:--------:|:----------:|
| DBLP | 14475(A) | Author(A)<br>Paper(P)<br>Conference(C) | APA<br>APCPA | 40269<br>19445349 | 4 | 5000+ |
| SLAP | 20419(G) | Gene(G)<br>Gene Ontology(O)<br>Pathway(P)<br>Compound(C)<br>Tissue(T)<br>Gene Family(F)<br>Disease(D) | GTG<br>GFG<br>GDG<br>GPG<br>GOG<br>GG<br>GDCDG | 303487<br>582741<br>7494<br>416462<br>3185779<br>172248<br>18095 | 15 | 2695 |
| ACM | 12499(P) | Paper(P)<br>Author(A)<br>Proceeding(O)<br>Institute(I)<br>Conference(C) | PAP<br>PAIAP<br>POP<br>POCOP<br>PP | 91662<br>13303015<br>700386<br>7849967<br>30621 | 11 | 8000 |
| IMDB* | 18352(M) | Movie(M)<br>Actor(A)<br>Actress(E)<br>Director(D) | MAM?<br>MDM?<br>MEM? | 63659?<br>1085810?<br>565443? | 9 | 1000 |
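
To make the "Meta-paths" and "# Meta-path instances" columns concrete, here is a minimal sketch of how instances of a symmetric meta-path such as APA (Author-Paper-Author) could be enumerated from authorship edges. The toy edge list and the `apa_instances` helper are hypothetical and are not part of the datasets. The counting convention (ordered paths between two distinct authors) is also an assumption, since this README does not state how its instance counts were computed.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy data: (author, paper) authorship edges.
author_paper = [
    ("a1", "p1"), ("a2", "p1"),
    ("a1", "p2"), ("a2", "p2"), ("a3", "p2"),
]

def apa_instances(edges):
    """Enumerate Author-Paper-Author paths (a, p, b) with a != b.

    Assumption: paths are ordered, so (a, p, b) and (b, p, a) both count.
    """
    authors_of = defaultdict(set)
    for author, paper in edges:
        authors_of[paper].add(author)

    paths = []
    for paper, authors in authors_of.items():
        # Every ordered pair of distinct co-authors of a paper is one path.
        for a, b in permutations(sorted(authors), 2):
            paths.append((a, paper, b))
    return paths

paths = apa_instances(author_paper)

# Distinct unordered author pairs linked by at least one APA path.
linked_pairs = {frozenset((a, b)) for a, _, b in paths}

print(len(paths), len(linked_pairs))
```

Under a different convention (for example, unordered paths, or counting only distinct linked pairs) the totals change, so counts produced this way are not guaranteed to match the numbers in the table.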

### Notice
* `*`: Multi-label dataset.
* `?`: It is unclear which meta-path corresponds to which number of meta-path instances.
* `+`: Features are a bag-of-words representation extracted using `nltk.corpus.stopwords`.
* For `DBLP`, `SLAP` and `ACM`, please refer to the paper [Meta Path-Based Collective Classification in Heterogeneous Information Networks](https://arxiv.org/pdf/1305.4433.pdf).
* For `IMDB`, please refer to the paper [Column Networks for Collective Classification](https://arxiv.org/pdf/1609.04508.pdf).
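
As a rough illustration of the bag-of-words note for the `+` marker, the sketch below removes stopwords and builds count vectors over a shared vocabulary. It is an assumption-laden example, not the procedure used to build these datasets: the tokenizer, the tiny `STOPWORDS` stand-in set, the `bag_of_words` helper, and the sample titles are all hypothetical. With NLTK installed and its stopwords corpus downloaded, `set(nltk.corpus.stopwords.words("english"))` could replace the stand-in set.

```python
import re
from collections import Counter

# Stand-in stopword list so the sketch runs without any downloads.
# Assumption: the real features used nltk.corpus.stopwords instead.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "and", "on"}

def bag_of_words(docs, stopwords=STOPWORDS):
    """Return (vocab, vectors): a sorted vocabulary and per-document counts."""
    tokenized = [
        [tok for tok in re.findall(r"[a-z]+", doc.lower()) if tok not in stopwords]
        for doc in docs
    ]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Counter returns 0 for terms that do not occur in this document.
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

# Hypothetical paper titles.
docs = ["Mining the Web of Data", "Data mining for networks"]
vocab, vectors = bag_of_words(docs)

print(vocab)
print(vectors)
```

Whether the published features are raw counts or binary indicators is not stated here, so treat the count vectors as one possible choice.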