https://github.com/gyorilab/indra_db

A Database-based knowledge back-end built on and for INDRA. The INDRA Database is a service that can be set up by any user with their own content and knowledge access. Our implementation of the database is the back-end to many of our projects, providing a vast and detailed knowledge base derived from many resources.

database indra machine-reading postgresql sqlalchemy systems-biology


Host: GitHub
URL: https://github.com/gyorilab/indra_db
Owner: gyorilab
License: gpl-3.0
Created: 2018-09-20T14:48:15.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2025-04-24T19:44:52.000Z (29 days ago)
Last Synced: 2025-04-30T14:27:28.770Z (23 days ago)
Topics: database, indra, machine-reading, postgresql, sqlalchemy, systems-biology
Language: Python
Homepage:
Size: 14.3 MB
Stars: 16
Watchers: 3
Forks: 11
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# INDRA DB



The INDRA (Integrated Network and Dynamical Reasoning Assembler) Database is a
framework for creating, maintaining, and accessing a database of content,
readings, and statements. This implementation is currently designed to work
primarily with Amazon Web Services RDS running Postgres 9+. Used as a backend
to INDRA, the INDRA Database provides a systematic way of scaling the knowledge
acquired from other databases, reading, and manual input, and puts that
knowledge at your fingertips through a direct Python client and a REST API.

### REST API

The INDRA DB is available via a web UI at: https://db.indra.bio

A REST service is also available at the same URL for programmatic use,
as documented here: https://github.com/gyorilab/indra_db/blob/master/indra_db_service/README.md

A convenient way to query the INDRA DB is through INDRA's built-in client for the INDRA DB REST service,
which is documented here: https://indra.readthedocs.io/en/latest/modules/sources/indra_db_rest/index.html
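
As a rough illustration of programmatic usage, the sketch below builds a query URL for the REST service using only the Python standard library. The base URL comes from this README; the `statements/from_agents` path and the `subject`, `object`, `type`, and `format` parameter names are assumptions that should be checked against the service README linked above, and `build_from_agents_url` is a hypothetical helper, not part of `indra_db`.

```python
from urllib.parse import urlencode, urljoin

# Base URL taken from this README.
BASE_URL = "https://db.indra.bio/"


def build_from_agents_url(subject=None, obj=None, stmt_type=None,
                          base_url=BASE_URL):
    """Build a GET URL asking the service for statements by agent and type.

    The endpoint path and parameter names are assumptions; verify them
    against the indra_db_service README before relying on them.
    """
    params = {}
    if subject:
        params["subject"] = subject
    if obj:
        params["object"] = obj
    if stmt_type:
        params["type"] = stmt_type
    if not params:
        raise ValueError("Give at least one of subject, obj, or stmt_type.")
    params["format"] = "json"  # assumed switch for JSON output
    return urljoin(base_url, "statements/from_agents") + "?" + urlencode(params)


url = build_from_agents_url(subject="MEK", obj="ERK",
                            stmt_type="Phosphorylation")
print(url)
```

The resulting URL can be fetched with any HTTP client; the service may require an API key for some content. If INDRA is installed, its client offers a higher-level route, for example `indra_db_rest.get_statements(subject='MEK', object='ERK', stmt_type='Phosphorylation')` from `indra.sources`, which handles request details itself.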

### Knowledge sources

The INDRA Database currently integrates and distills knowledge from several
different sources, both biology-focused natural language processing systems and
other pre-existing databases.

#### Daily Readers

We have read all available content, and every day we run the following readers:

- [REACH](https://github.com/clulab/reach)

- [Sparser](https://github.com/ddmcdonald/sparser)

We read all new content with the following readers:

- [Eidos](https://github.com/clulab/eidos)

- [ISI](https://github.com/sgarg87/big_mech_isi_gg)

- [MTI](https://ii.nlm.nih.gov/MTI/index.shtml) - used specifically to tag
  content with topic terms.

We read a limited subset of new content with the following readers:

- [TRIPS](http://trips.ihmc.us/parser/cgi/drum)

These readers run on the latest content drawn from:

- [PubMed](https://www.ncbi.nlm.nih.gov/pubmed/) - ~19 million abstracts and ~29 million titles

- [PubMed Central](https://www.ncbi.nlm.nih.gov/pmc/) - ~2.7 million fulltext

- [Elsevier](https://www.elsevier.com/) - ~0.7 million fulltext
  (requires special access)

#### Other Readers

We also include more or less static content extracted from the following readers:

- [RLIMS-P](https://research.bioinformatics.udel.edu/rlimsp/)

#### Other Databases

We include the information from these pre-existing databases:

- [Pathway Commons database](http://pathwaycommons.org/)

- [BEL Large Corpus](https://github.com/OpenBEL/)

- [SIGNOR](https://signor.uniroma2.it/)

- [BioGRID](https://thebiogrid.org/)

- [TAS](https://www.biorxiv.org/content/10.1101/358978v1)

- [TRRUST](https://omictools.com/trrust-tool)

- [PhosphoSitePlus](https://www.phosphosite.org/)

- [Causal Biological Networks Database](http://www.causalbionet.com/)

- [VirHostNet](http://virhostnet.prabi.fr/)

- [CTD](http://ctdbase.org/)

- [Phospho.ELM](http://phospho.elm.eu.org/)

- [DrugBank](https://www.drugbank.ca/)

- [CONIB](https://pharmacome.github.io/conib/)

- [CRoG](https://github.com/chemical-roles/chemical-roles)

- [DGI](https://www.dgidb.org/)

These databases are retrieved primarily using the tools in `indra.sources`. The
statements extracted from all of these sources are stored and updated in the
database.

### Knowledge Assembly

The INDRA Database uses the internal assembly tools available in INDRA,
adapted for large-scale incremental assembly. The resulting corpus of
cleaned and de-duplicated statements, each with fully maintained provenance, is
the primary product of the database.

For more details on the internal assembly process of INDRA, see the
[INDRA documentation](http://indra.readthedocs.io/en/latest/modules/preassembler).
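
To make the idea of incremental de-duplication with provenance concrete, here is a toy sketch in plain Python. It is not INDRA's or the database's actual implementation: the dictionary layout, the `content_key` and `assemble` helpers, and the example records (including the placeholder references) are all assumptions made for illustration.

```python
import hashlib


def content_key(stmt_type, subj, obj):
    """Return a short, stable hash identifying a statement's content."""
    raw = "|".join((stmt_type, subj, obj))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def assemble(raw_statements, corpus=None):
    """Merge raw statements into a corpus keyed by content hash.

    Statements with identical content collapse into one entry, and each
    entry keeps every distinct (source, ref) pair as its provenance.
    Passing an existing corpus makes the merge incremental.
    """
    corpus = {} if corpus is None else corpus
    for stmt in raw_statements:
        key = content_key(stmt["type"], stmt["subj"], stmt["obj"])
        entry = corpus.setdefault(key, {
            "type": stmt["type"],
            "subj": stmt["subj"],
            "obj": stmt["obj"],
            "evidence": [],
        })
        evidence = (stmt["source"], stmt["ref"])
        if evidence not in entry["evidence"]:
            entry["evidence"].append(evidence)
    return corpus


# Made-up records; the refs are placeholders, not real article IDs.
batch_1 = [
    {"type": "Phosphorylation", "subj": "MAP2K1", "obj": "MAPK1",
     "source": "reach", "ref": "EXAMPLE-REF-1"},
    {"type": "Phosphorylation", "subj": "MAP2K1", "obj": "MAPK1",
     "source": "sparser", "ref": "EXAMPLE-REF-1"},
]
batch_2 = [
    # Exact repeat of an earlier record: adds no new evidence.
    {"type": "Phosphorylation", "subj": "MAP2K1", "obj": "MAPK1",
     "source": "reach", "ref": "EXAMPLE-REF-1"},
    {"type": "Inhibition", "subj": "DUSP6", "obj": "MAPK1",
     "source": "signor", "ref": "EXAMPLE-REF-2"},
]

corpus = assemble(batch_1)
corpus = assemble(batch_2, corpus)  # incremental update with new content
for entry in corpus.values():
    print(entry["type"], entry["subj"], entry["obj"], len(entry["evidence"]))
```

Keying on a content hash means a new batch only needs to be compared against existing keys rather than re-assembling the whole corpus, which is the property that matters at scale.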

### Access

The content in the database can be accessed by those who created it using the
`indra_db.client` submodule. This repo also implements a REST API which can be
used by those without direct access to the database. For access to our REST
API, please contact the authors.

## Installation

The INDRA Database requires Python 3.6+, though some parts are still compatible with 3.5.

First, [install INDRA](http://indra.readthedocs.io/en/latest/installation.html),
then simply clone this repo, and make sure that it is visible in your
`PYTHONPATH`.
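
One way to confirm the setup is to check that both packages can be found on the import path. The snippet below is a minimal sketch using only the standard library; `check_install` is a hypothetical helper, and it assumes the cloned repo exposes a top-level package named `indra_db`.

```python
import importlib.util
import sys


def check_install(modules=("indra", "indra_db"), min_version=(3, 6)):
    """Report whether Python is recent enough and each module is findable."""
    report = {"python_ok": sys.version_info >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module is not on the path.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_install().items():
    print(f"{item}: {'ok' if ok else 'MISSING'}")
```

If `indra_db` is reported as missing, the directory containing the cloned repo is probably not on `PYTHONPATH` for the interpreter being used.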

## Funding

The development of INDRA DB is funded under the DARPA Communicating with Computers program (ARO grant W911NF-15-1-0544).
