https://github.com/dermatologist/pyomop

Python package for managing OHDSI clinical data models. Includes support for LLM based plain text queries!

Host: GitHub
URL: https://github.com/dermatologist/pyomop
Owner: dermatologist
License: gpl-3.0
Created: 2020-05-02T16:09:27.000Z (over 4 years ago)
Default Branch: develop
Last Pushed: 2024-12-08T02:49:27.000Z (15 days ago)
Last Synced: 2024-12-08T03:25:34.882Z (15 days ago)
Topics: cdm, clinical-trials, datawarehouse, hacktoberfest, health-data-analysis, health-informatics, llm, ohdsi, python, text-to-sql
Language: Python
Homepage: https://nuchange.ca
Size: 502 KB
Stars: 38
Watchers: 4
Forks: 8
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Authors: AUTHORS.md

README

# pyomop

![Libraries.io SourceRank](https://img.shields.io/librariesio/sourcerank/pypi/pyomop)

[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)

[![PyPI download total](https://img.shields.io/pypi/dm/pyomop.svg)](https://pypi.python.org/pypi/pyomop/)

[![Build](https://github.com/dermatologist/pyomop/workflows/Python%20Test/badge.svg)](https://nuchange.ca)

### [Documentation](https://dermatologist.github.io/pyomop/)

## UPDATE

Support for **LLM-based natural language queries** of OMOP CDM databases was recently added, using [llama-index](examples/llm_example.py). Install the `llm` extras as follows. Please be mindful of the privacy issues with publicly hosted LLMs. Any feedback is highly appreciated.

```
pip install pyomop[llm]
```

[See usage](examples/llm_example.py).

## Description

The [OHDSI](https://www.ohdsi.org/) OMOP Common Data Model (CDM) allows for the systematic analysis of observational healthcare databases. **pyomop** is a Python library for working with CDM v6-compliant databases, using SQLAlchemy as the ORM. It also supports converting query results to a pandas dataframe (see below) for use in machine learning pipelines. See the OHDSI [Query Library](https://github.com/OHDSI/QueryLibrary) for some useful SQL queries.
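To make the data model concrete, the sketch below builds the CDM `cohort` table by hand with Python's standard `sqlite3` module and runs a plain SQL query against it. It does not use pyomop: the in-memory database and the sample rows are assumptions made up for illustration, and the column types are simplified. With pyomop, the tables are created for you and `cohort` is mapped to the `Cohort` class shown in the usage examples.

```python
import sqlite3
from datetime import date

# In-memory database, for illustration only (not how pyomop connects).
conn = sqlite3.connect(":memory:")

# The four columns of the OMOP CDM cohort table, with simplified types.
conn.execute(
    """
    CREATE TABLE cohort (
        cohort_definition_id INTEGER NOT NULL,
        subject_id           INTEGER NOT NULL,
        cohort_start_date    TEXT    NOT NULL,
        cohort_end_date      TEXT    NOT NULL
    )
    """
)

# Made-up sample rows: (cohort_definition_id, subject_id, start, end).
sample_rows = [
    (2, 100, date(2020, 1, 1).isoformat(), date(2020, 6, 30).isoformat()),
    (2, 101, date(2020, 2, 1).isoformat(), date(2020, 3, 15).isoformat()),
    (3, 100, date(2021, 1, 1).isoformat(), date(2021, 1, 31).isoformat()),
]
conn.executemany("INSERT INTO cohort VALUES (?, ?, ?, ?)", sample_rows)
conn.commit()

# Count the distinct subjects in each cohort definition.
cohort_sizes = conn.execute(
    "SELECT cohort_definition_id, COUNT(DISTINCT subject_id) "
    "FROM cohort GROUP BY cohort_definition_id "
    "ORDER BY cohort_definition_id"
).fetchall()
print(cohort_sizes)  # [(2, 2), (3, 1)]
```

Dates are stored as ISO 8601 strings here because SQLite has no native date type; other backends would typically use a real date column.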

## Installation (stable)

```
pip install pyomop
```

## Installation (current)

* Clone this repository with `git clone`, then run the following from the repository root:

```
pip install -e .
```

## Usage >= 4.0.0 (Async) Example

```
from pyomop import CdmEngineFactory, CdmVocabulary, CdmVector, Cohort, Vocabulary, metadata
from sqlalchemy.future import select
import datetime
import asyncio

async def main():
    cdm = CdmEngineFactory()  # Creates SQLite database by default
    # Postgres example (db='mysql' also supported)
    # cdm = CdmEngineFactory(db='pgsql', host='', port=5432,
    #                       user='', pw='',
    #                       name='', schema='cdm6')

    engine = cdm.engine
    # Create Tables if required
    await cdm.init_models(metadata)
    # Create vocabulary if required
    vocab = CdmVocabulary(cdm)
    # vocab.create_vocab('/path/to/csv/files')  # Uncomment to load vocabulary csv files

    # Add a cohort
    async with cdm.session() as session:
        async with session.begin():
            session.add(Cohort(cohort_definition_id=2, subject_id=100,
                cohort_end_date=datetime.datetime.now(),
                cohort_start_date=datetime.datetime.now()))
        await session.commit()

    # Query the cohort
    stmt = select(Cohort).where(Cohort.subject_id == 100)
    result = await session.execute(stmt)
    for row in result.scalars():
        print(row)
        assert row.subject_id == 100

    # Query the cohort pattern 2
    cohort = await session.get(Cohort, 1)
    print(cohort)
    assert cohort.subject_id == 100

    # Convert result to a pandas dataframe
    vec = CdmVector()
    vec.result = result
    print(vec.df.dtypes)

    result = await vec.sql_df(cdm, 'TEST')  # TEST is defined in sqldict.py
    for row in result:
        print(row)

    result = await vec.sql_df(cdm, query='SELECT * from cohort')
    for row in result:
        print(row)

    # Close session
    await session.close()
    await engine.dispose()

# Run the main function
asyncio.run(main())
```

## Usage <=3.2.0

```
from pyomop import CdmEngineFactory, CdmVocabulary, CdmVector, Cohort, Vocabulary, metadata
from sqlalchemy.sql import select
import datetime

cdm = CdmEngineFactory()  # Creates SQLite database by default
# Postgres example (db='mysql' also supported)
# cdm = CdmEngineFactory(db='pgsql', host='', port=5432,
#                       user='', pw='',
#                       name='', schema='cdm6')

engine = cdm.engine
# Create Tables if required
metadata.create_all(engine)
# Create vocabulary if required
vocab = CdmVocabulary(cdm)
# vocab.create_vocab('/path/to/csv/files')  # Uncomment to load vocabulary csv files

# Create a Cohort (SQLAlchemy as ORM)
session = cdm.session
session.add(Cohort(cohort_definition_id=2, subject_id=100,
            cohort_end_date=datetime.datetime.now(),
            cohort_start_date=datetime.datetime.now()))
session.commit()

result = session.query(Cohort).all()
for row in result:
    print(row)

# Convert result to a pandas dataframe
vec = CdmVector()
vec.result = result
print(vec.df.dtypes)

# Execute a query and convert it to dataframe
vec.sql_df(cdm, 'TEST')  # TEST is defined in sqldict.py
print(vec.df.dtypes)  # vec.df is a pandas dataframe
# OR
vec.sql_df(cdm, query='SELECT * from cohort')
print(vec.df.dtypes)  # vec.df is a pandas dataframe
```

## Command-line usage

```
pyomop -help
```

## Other utils

**Want to convert FHIR to pandas data frame? Try [fhiry](https://github.com/dermatologist/fhiry)**

**Use the same functions in [.NET](https://github.com/dermatologist/omopcdm-dot-net) and [Golang](https://github.com/E-Health/gocdm)!**

### Supported databases

* Postgres
* MySQL
* SQLite
* More to follow.
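Each backend listed above is reached through a SQLAlchemy connection URL. The sketch below shows what such URLs usually look like for async use. `build_url` is a hypothetical helper, not part of pyomop. The `pgsql` and `mysql` keys mirror the `db=` values in the usage examples, while the `sqlite` key, the default values, and the choice of drivers (`aiosqlite`, `asyncpg`, `aiomysql`) are assumptions; pyomop may build its URLs differently.

```python
from urllib.parse import quote_plus

# Dialect+driver prefixes as SQLAlchemy spells them for async use.
# Which drivers pyomop itself uses is an assumption here.
ASYNC_DRIVERS = {
    "sqlite": "sqlite+aiosqlite",
    "pgsql": "postgresql+asyncpg",
    "mysql": "mysql+aiomysql",
}


def build_url(db="sqlite", host="localhost", port=5432,
              user="", pw="", name="cdm.sqlite"):
    """Hypothetical helper: build a SQLAlchemy-style connection URL."""
    if db not in ASYNC_DRIVERS:
        raise ValueError(f"unsupported database: {db!r}")
    prefix = ASYNC_DRIVERS[db]
    if db == "sqlite":
        # SQLite needs only a file path, no host or credentials.
        return f"{prefix}:///{name}"
    # Escape credentials so characters such as '@' do not break the URL.
    return f"{prefix}://{quote_plus(user)}:{quote_plus(pw)}@{host}:{port}/{name}"


print(build_url())
# sqlite+aiosqlite:///cdm.sqlite
print(build_url(db="pgsql", user="omop", pw="p@ss word", name="cdm"))
# postgresql+asyncpg://omop:p%40ss+word@localhost:5432/cdm
```

Escaping the user name and password matters in practice: an unescaped `@` or `/` in a password changes how the URL is parsed.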

## Give us a star ⭐️

If you find this project useful, give us a star. It helps others discover the project.

## Contributors

* [Bell Eapen](https://nuchange.ca) | [![Twitter Follow](https://img.shields.io/twitter/follow/beapen?style=social)](https://twitter.com/beapen)

* PRs welcome. See CONTRIBUTING.md