https://github.com/baranzinilab/graphrain



# graphRAIN

This repository holds the script for the graphRAIN algorithm, which creates explainable node embeddings of a graph.

## Introduction

GraphRAIN stands for Graph Relational Attribute Integrated Node-embedding.

It generates embedding vectors for the nodes of a graph by incorporating the attributes attached to the relationships between nodes. Because the algorithm normalizes these relational attributes, it works for both heterogeneous and homogeneous graphs.

## How to use

The following snippet shows how to use this package:

```
import os

from embedding import RAIN

NCORES = ...  # fill in: number of cores to use
NBATCH = ...  # fill in: number of batches

# Directory where the output files are saved
embedding_save_path = "/path/to/save/output/files"

metadata_dict = dict(
    edge_path="/path/to/edges.tsv",
    edge_metadata_path="/path/to/edge_metadata.tsv",
    node_metadata_path="/path/to/node_metadata.tsv",
    embedding_save_path=embedding_save_path,
    graph_path=os.path.join(embedding_save_path, "name_of_graph.joblib"),
)

# fill in: list of unique nodeIds of the graph whose embedding vectors need to be computed
embedding_nodeId_list = [...]

rain = RAIN()
rain.batch_embedding(
    metadata_dict=metadata_dict,
    embedding_nodeId_list=embedding_nodeId_list,
    ncores=NCORES,
    nbatch=NBATCH,
)
```
**Refer to [run_rain.py](https://github.com/BaranziniLab/graphRAIN/blob/main/run_rain.py) for a full example that you can run on your machine.**
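
Since `embedding_nodeId_list` is expected to contain unique nodeIds, it may help to deduplicate the IDs before passing them in. The following is a minimal sketch, not part of this package: the helper `unique_node_ids` and the example IDs are hypothetical and used only for illustration.

```python
from typing import Hashable, Iterable, List


def unique_node_ids(node_ids: Iterable[Hashable]) -> List[Hashable]:
    """Return node IDs with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order (Python 3.7+), so duplicates are
    # dropped while the original ordering of first occurrences is kept.
    return list(dict.fromkeys(node_ids))


# Hypothetical IDs, for illustration only.
raw_ids = ["D1", "S7", "D1", "S3", "S7"]
embedding_nodeId_list = unique_node_ids(raw_ids)
print(embedding_nodeId_list)
```

Keeping the first-seen order makes the result reproducible from run to run, which a plain `set` would not guarantee.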

## Sample data provided

The [sample data](https://github.com/BaranziniLab/graphRAIN/tree/main/sample_data) provided is a network of disease and symptom nodes. It is a subnetwork taken from a larger network called [SPOKE](https://spoke.rbvi.ucsf.edu/).
The sample data is provided to show users the format of the input files required to run the RAIN algorithm.
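
One quick way to check the format of an input file is to print its header and first few rows. The sketch below uses only the Python standard library and assumes the files are tab-separated with a header row (suggested by the `.tsv` extension, but not confirmed here). The helper `peek_tsv` is hypothetical and not part of this package.

```python
import csv
from itertools import islice
from typing import List, Tuple


def peek_tsv(path: str, n_rows: int = 3) -> Tuple[List[str], List[List[str]]]:
    """Return the header and the first n_rows data rows of a TSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        # Assumption: the first line holds the column names.
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows
```

For example, `header, rows = peek_tsv("/path/to/edges.tsv")` returns the column names and the first three rows of the edge file, which can then be compared against your own input files.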