https://github.com/closedloop-technologies/PromptedGraphs

From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined.


Host: GitHub
URL: https://github.com/closedloop-technologies/PromptedGraphs
Owner: closedloop-technologies
License: mit
Created: 2023-09-21T15:05:49.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-05-14T18:10:19.000Z (12 months ago)
Last Synced: 2025-01-02T21:47:26.132Z (4 months ago)
Language: Python
Size: 15.5 MB
Stars: 11
Watchers: 2
Forks: 1
Open Issues: 2
Metadata Files:
- Readme: README-USECASE.md
- License: LICENSE

Awesome Lists containing this project

awesome_ai_agents - Promptedgraphs - From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment - The Power of NLP and LLMs Combined. (Building / Datasets)

README

## Steps

1. Start with data from an API endpoint:
   * A description of the endpoint (URL, method, parameters, etc.)
   * Example **Raw Data** from the endpoint
2. Generate a Pydantic **DataModel** from the example data.
3. Repeat for two other endpoints.
4. Construct a **DataGraph** from the **DataModels** to represent the relationships between the data.
5. Generate a **PropertyGraph-Schema** from the **DataGraph** and represent it as an ER diagram.
6. Create a schema alignment between the **PropertyGraph-Schema** and the properties of the **DataGraph**.
   * Indicate how the data models should be transformed to fit the schema.
   * TODO: handle cases where keys and values need to be 'pivoted' to fit the schema.
7. Generate a **Database-Schema** from the **PropertyGraph-Schema**. By default, the schema should be in third normal form (3NF).
8. Create a **Database** from the **Database-Schema**.
9. Implement ETL tasks to transform and load data from the API endpoints into the database.
   1. Transform should take the example data and convert it into in-memory tables reflecting the database schema. These are the **staging** tables.
   2. Load should first `resolve` any existing data and update primary keys with the existing keys.
   3. Keep a human in the loop for any ambiguous keys.
   4. Insert the data into the database.
   5. Optionally update any existing data.
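
The load sub-steps of step 9 can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, not the PromptedGraphs API: the `customer` table, the use of `email` as the natural key, and the function name `load_customers` are all assumptions made for the example.

```python
import sqlite3


def load_customers(conn, staging_rows, update_existing=False):
    """Load staging rows into the (hypothetical) `customer` table.

    Rows are resolved against existing data by `email` (an assumed natural key).
    Returns (resolved_rows, ambiguous_rows).
    """
    resolved, ambiguous = [], []
    for row in staging_rows:
        matches = conn.execute(
            "SELECT id FROM customer WHERE email = ?", (row["email"],)
        ).fetchall()
        if len(matches) > 1:
            # More than one candidate key: leave it for a human to decide.
            ambiguous.append(row)
            continue
        if matches:
            # Resolve: reuse the primary key that already exists.
            row_id = matches[0][0]
            if update_existing:
                conn.execute(
                    "UPDATE customer SET name = ? WHERE id = ?",
                    (row["name"], row_id),
                )
        else:
            # No match: insert and take the newly assigned primary key.
            cur = conn.execute(
                "INSERT INTO customer (name, email) VALUES (?, ?)",
                (row["name"], row["email"]),
            )
            row_id = cur.lastrowid
        resolved.append({**row, "id": row_id})
    conn.commit()
    return resolved, ambiguous


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO customer (name, email) VALUES ('Ada', 'ada@example.com')")

staging = [
    {"name": "Ada L.", "email": "ada@example.com"},  # already in the database
    {"name": "Bob", "email": "bob@example.com"},     # new row
]
resolved, ambiguous = load_customers(conn, staging, update_existing=True)
```

Returning the ambiguous rows rather than guessing keeps the human-in-the-loop step explicit: the caller decides how to review them before retrying the load.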

In this library, we should be able to manually chain together these functions and generate the necessary code.
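
As a rough picture of what manual chaining could look like, the sketch below infers a data model from example raw data and then generates a `CREATE TABLE` statement from that model. It uses standard-library dataclasses as a stand-in for Pydantic so that it runs without dependencies; the helper names `infer_model` and `model_to_ddl` and the type mapping are hypothetical and are not taken from the library.

```python
from dataclasses import fields, make_dataclass

# Assumed mapping from Python types to SQL column types (illustrative only).
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


def infer_model(name, example):
    """Build a dataclass whose fields mirror the keys and value types of `example`."""
    return make_dataclass(name, [(key, type(value)) for key, value in example.items()])


def model_to_ddl(model):
    """Generate a CREATE TABLE statement from a dataclass model."""
    columns = ", ".join(
        f"{f.name} {SQL_TYPES.get(f.type, 'TEXT')}" for f in fields(model)
    )
    return f"CREATE TABLE {model.__name__.lower()} ({columns})"


# Manually chain the two steps: raw data -> data model -> generated schema code.
raw = {"id": 7, "name": "Ada", "score": 9.5}
User = infer_model("User", raw)
ddl = model_to_ddl(User)
print(ddl)
```

Each function takes the previous step's output as its input, which is what makes the steps composable by hand.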

The library **AutoETL** should be able to automatically chain these functions together and run them in a pipeline. The library should also be able to generate the code for the pipeline.
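
One way to picture automatic chaining with code generation is a runner that applies a list of step functions in order, plus a renderer that emits equivalent Python source. This is only a sketch under assumptions: `run_pipeline`, `render_pipeline`, and the three toy steps are invented for illustration, and the source does not describe AutoETL's actual interface.

```python
from functools import reduce


def run_pipeline(steps, data):
    """Apply each step to the output of the previous one."""
    return reduce(lambda value, step: step(value), steps, data)


def render_pipeline(steps, arg="data"):
    """Return Python source for a function that applies `steps` in order."""
    lines = [f"def pipeline({arg}):"]
    for step in steps:
        lines.append(f"    {arg} = {step.__name__}({arg})")
    lines.append(f"    return {arg}")
    return "\n".join(lines)


# Toy steps standing in for real transform tasks.
def extract_names(rows):
    return [row["name"] for row in rows]


def normalize(names):
    return [name.strip().title() for name in names]


def deduplicate(names):
    return sorted(set(names))


steps = [extract_names, normalize, deduplicate]
result = run_pipeline(steps, [{"name": " ada "}, {"name": "bob"}, {"name": "Ada"}])
source = render_pipeline(steps)
print(source)
```

Keeping the step list as plain data means the same list can drive both execution and code generation, so the generated pipeline cannot drift from the one that was run.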
