https://github.com/skyffel/airbyte-connector-generator-poc
proof of concept to generate Airbyte low-code YAML connectors from API documentation
airbyte airbyte-connectors data-engineering data-pipeline etl-automation llm-agent openapi-generator openapi3
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/skyffel/airbyte-connector-generator-poc
- Owner: skyffel
- License: other
- Created: 2024-02-22T09:11:10.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-11T15:30:54.000Z (about 1 year ago)
- Last Synced: 2024-03-12T12:36:18.331Z (about 1 year ago)
- Topics: airbyte, airbyte-connectors, data-engineering, data-pipeline, etl-automation, llm-agent, openapi-generator, openapi3
- Language: Python
- Homepage: https://skyffel.com
- Size: 184 KB
- Stars: 11
- Watchers: 0
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - skyffel/airbyte-connector-generator-poc - proof of concept to generate Airbyte low-code YAML connectors from API documentation (Python)
README
This is a proof of concept to generate Airbyte low-code YAML connectors from API documentation. We want this to serve as inspiration for what can be done with LLMs. Here's how it works:
- Specify a goal, e.g. "_Fetch all pages posts_"
- Provide one or more links to documentation, e.g. for Notion: [API Intro](https://developers.notion.com/reference/intro), [API Versioning](https://developers.notion.com/reference/versioning) and [Search Endpoint](https://developers.notion.com/reference/post-search)

This will generate an OpenAPI specification and an Airbyte low-code connector. Try it!
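
For scripted use, the two inputs above map onto the `--goal` and `--urls` flags of the `skyffel` command described under "How to use". The following is a minimal Python sketch, assuming `skyffel` is installed and on `PATH`; `build_skyffel_command` is a hypothetical helper for illustration and not part of this project.

```python
import shlex
from typing import List


def build_skyffel_command(goal: str, urls: List[str]) -> List[str]:
    """Build the argument list for the skyffel CLI (hypothetical helper)."""
    if not goal.strip():
        raise ValueError("goal must not be empty")
    if not urls:
        raise ValueError("at least one documentation URL is required")
    command = ["skyffel", "--goal", goal]
    for url in urls:
        # --urls is repeated once per documentation page
        command += ["--urls", url]
    return command


if __name__ == "__main__":
    cmd = build_skyffel_command(
        "Fetch all pages posts",
        [
            "https://developers.notion.com/reference/intro",
            "https://developers.notion.com/reference/versioning",
            "https://developers.notion.com/reference/post-search",
        ],
    )
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(cmd))
    # To actually run it (assumes skyffel is on PATH):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when the goal contains spaces.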
## 📽️ Demo
### Notion
https://github.com/skyffel/airbyte-connector-generator-poc/assets/3134895/f7e03f6d-60d9-44b6-88ee-3bc0e4bcd339
### Perplexity
https://github.com/skyffel/airbyte-connector-generator-poc/assets/25622412/e2922aaa-4f19-4608-8fba-e1e08c201033
## ✅ Key features
- Co-pilot for generating ETL code for HTTP APIs
- Supports Airbyte low-code YAML connectors
- Scrapes API documentation
- Produces OpenAPI specifications
## 📦 Installation
1. Clone the repository
```bash
git clone https://github.com/skyffel/airbyte-connector-generator-poc
cd airbyte-connector-generator-poc
```
2. Set up a virtual environment
```bash
# Using virtualenv
virtualenv env
source env/bin/activate

# Or using venv
python3 -m venv env
source env/bin/activate

# Or using poetry
poetry shell
```
3. Install required packages
```bash
poetry install
playwright install
```
## 🚀 How to use
Generates an Airbyte low-code YAML connector using the API documentation provided via URLs.
> Set `DEBUG=true` in `.env` to enable logs
```bash
skyffel --goal "" --urls "" --urls ""
```
Here we generate a connector for extracting all blog posts from the Department of Justice.
```bash
skyffel \
--goal "extract all blog entries from department of justice" \
--urls https://www.justice.gov/developer/api-documentation/api_v1
```
### 📥 Import to Airbyte
After generating the connector, you need to import it into Airbyte. Airbyte might eventually expose an API to do this programmatically 🤞 Until then, here's how:
1. Go to your Airbyte workspace
2. Click on Builder [BETA] in the menu
3. Click “New custom connector” in the upper right corner
4. Click “Import a YAML” and select the generated `airbyte_connector.yaml`
5. Fill in the test values and run the test
6. Press “Publish to workspace”
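
Before uploading the file in step 4, it can help to confirm that the generated YAML at least looks like a declarative manifest. The sketch below is a rough heuristic and not a YAML parser or schema validator. It assumes the file is written to the current directory as `airbyte_connector.yaml`, and it assumes that Airbyte's declarative manifest requires the top-level keys `version`, `type`, `check` and `streams`. The helper names are hypothetical.

```python
import re
from pathlib import Path
from typing import Set

# Assumption: top-level keys required by Airbyte's declarative manifest schema.
EXPECTED_KEYS = {"version", "type", "check", "streams"}

# Matches an unindented "key:" at the start of a line (block-style YAML only).
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:")


def top_level_keys(yaml_text: str) -> Set[str]:
    """Collect unindented keys; indented lines and comments are ignored."""
    keys = set()
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def missing_keys(yaml_text: str) -> Set[str]:
    """Return the expected top-level keys that are absent from the text."""
    return EXPECTED_KEYS - top_level_keys(yaml_text)


if __name__ == "__main__":
    # Assumption: skyffel wrote the connector to the current directory.
    path = Path("airbyte_connector.yaml")
    if path.exists():
        missing = missing_keys(path.read_text(encoding="utf-8"))
        print("missing keys:", sorted(missing) if missing else "none")
    else:
        print(f"{path} not found; run skyffel first")
```

A file that passes this check can still fail in the Builder, so the test run in step 5 remains the real validation.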