Cloud-based sandbox for text analytics
https://github.com/data2health/nlp-sandbox

# nlp-sandbox

Developing a continuous benchmarking environment for NLP de-identification (de-id) methods.

## Project vision and values

The widespread adoption of Electronic Health Records (EHRs) has enabled secondary use of EHR data for clinical research and healthcare delivery. Because much of the detailed patient information is recorded in clinical narratives, extracting that information and integrating it with structured EHR data has become critical for EHR-based studies.
However, protected health information (PHI) in clinical narratives is a barrier to conducting EHR-based clinical research and to sharing research data across sites.

## Vision

- Create a cloud-based environment that enables the systematic validation of text analytics tools to solve specific tasks (i.e. the “NLP Sandbox”).
- Populate the “NLP Sandbox” with appropriate reference data sets to be used in shared validation tasks.
- Engage CTSA hubs to contribute tools and methods to the project and demonstrate their performance, reproducibility, and rigor in such a shared environment.
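
As an illustration of what validating a de-id tool against a reference data set could involve, the sketch below scores predicted PHI spans against gold annotations using exact-match precision, recall, and F1. This README does not specify an evaluation metric or annotation format, so the `Span` schema, the `score_spans` helper, and the sample data are all hypothetical.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A PHI annotation: character offsets plus a label (hypothetical schema)."""
    start: int
    end: int
    label: str


def score_spans(gold: set[Span], predicted: set[Span]) -> dict[str, float]:
    """Exact-match precision, recall, and F1 between gold and predicted spans."""
    # A prediction counts as correct only if offsets and label all match.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative data only; not taken from any real reference data set.
gold = {Span(0, 8, "NAME"), Span(20, 30, "DATE"), Span(45, 57, "PHONE")}
predicted = {Span(0, 8, "NAME"), Span(20, 30, "DATE"), Span(60, 65, "ID")}

scores = score_spans(gold, predicted)
print(scores)
```

Exact matching is the strictest choice. A real benchmark might also credit partially overlapping spans or report scores per PHI type.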

## Related Cores

- Tools and Cloud Infrastructure
- Next Generation Data Sharing
- Informatics Maturity and Best Practices

## Contact person

Point person (github handle) | Site | Program Director
----------|--------------|---------------
Justin Guinney (@jguinney) | Sage Bionetworks | Melissa Haendel (@mellybelly)

## Leads

Scientific leadership of the project (one to three people).

Leads (github handle) | Site
----------|--------------|
Thomas Schaffter (@tschaffter) | Sage Bionetworks
James Eddy (@jaeddy) | Sage Bionetworks

## Team members

Members (github handle) | Site
----------|--------------|
Thomas Schaffter (@tschaffter) | Sage Bionetworks
Yao Yan (@yy6linda) | Sage Bionetworks
Yooree Chae (@ychae) | Sage Bionetworks
James Eddy (@jaeddy) | Sage Bionetworks
Justin Guinney (@jguinney) | Sage Bionetworks
George Kowalski (@gkowalski) | MCW
Bradley Taylor (@btaylormcw) | MCW
Tom Dillon (@tmdillon) | WashU

## Resources

Resource | Link | Site
----------|--------------|--------------|
GitHub team | [nlp-team](https://github.com/orgs/data2health/teams/nlp-team) | CD2H
GitHub project | [data2health/projects/7](https://github.com/orgs/data2health/projects/7) | CD2H
Google folder | [NLP Sandbox](https://drive.google.com/drive/folders/1PpFItk7GNvIjbidFNiDHmOn7NHEbpHle) | CD2H
Slack channel | [CD2H workspace / nlp-sandbox](https://app.slack.com/client/T4SPTQGE7/C010044EGTW) | CD2H

Access to resources is limited to onboarded participants ([CD2H Onboarding Form](https://bit.ly/cd2h-onboarding-form)).

## Get involved

We encourage the community to get involved. Please open a ticket (GitHub issue) or comment on an existing one.

## References

1. [NLP Sandbox - CD2H Phase III Project Proposal](https://docs.google.com/document/d/1S8LAtfgU6OitcSbPlKhKtvFxcIYQ-2t9Pw4EYlSBpRg)
2. [data2health/nlp-review](https://github.com/data2health/nlp-review)