{"id":16271093,"url":"https://github.com/semanser/jsongenius","last_synced_at":"2025-10-04T06:30:25.391Z","repository":{"id":198530973,"uuid":"700831450","full_name":"semanser/JsonGenius","owner":"semanser","description":"Get structured JSON data from any page.","archived":false,"fork":false,"pushed_at":"2023-10-11T16:27:07.000Z","size":1099,"stargazers_count":175,"open_issues_count":1,"forks_count":11,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-01-19T17:09:29.205Z","etag":null,"topics":["api","go","golang","gpt","gpt-3","gpt-4","scraper","scraping","web-scraping"],"latest_commit_sha":null,"homepage":"https://singleapi.co","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/semanser.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-10-05T11:38:32.000Z","updated_at":"2024-12-20T07:18:32.000Z","dependencies_parsed_at":"2023-10-10T17:50:30.031Z","dependency_job_id":null,"html_url":"https://github.com/semanser/JsonGenius","commit_stats":null,"previous_names":["semanser/jsongenius"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanser%2FJsonGenius","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanser%2FJsonGenius/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanser%2FJsonGenius/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanser%2FJsonGenius/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/semanser","download_url":"https://codeload.github.com
/semanser/JsonGenius/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235222520,"owners_count":18955328,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","go","golang","gpt","gpt-3","gpt-4","scraper","scraping","web-scraping"],"created_at":"2024-10-10T18:12:24.999Z","updated_at":"2025-10-04T06:30:20.066Z","avatar_url":"https://github.com/semanser.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JsonGenius\n\nJsonGenius is a self-hosted scraping API that extracts structured data described by a [JSON Schema](https://json-schema.org). Provide any URL and a desired JSON Schema, and JsonGenius will return the structured data from the website.\n\n## Demo\n![image](.github/screenshots/demo.png)\n\n## Prerequisites\n- [Docker Compose](https://docs.docker.com/compose/install/)\n- `OPEN_AI_KEY` - An API key for [OpenAI](https://openai.com/). You can get one for free [here](https://platform.openai.com/account/api-keys). This should be set as an environment variable.\n\n## Usage\n####  Docker Compose (recommended)\n```bash\ngit clone https://github.com/semanser/jsongenius\ncd jsongenius\nexport OPEN_AI_KEY=\u003cyour key here\u003e\ndocker compose up\n```\nThe API will be available at http://localhost:3001. 
You can change the port by editing the `docker-compose.yml` file.\n\n#### Compile from source\n```bash\ngit clone https://github.com/semanser/jsongenius\ncd jsongenius\nexport OPEN_AI_KEY=\u003cyour key here\u003e\ngo build .\n./jsongenius\n```\n\n## API\n\n### POST /lookup\nThis endpoint accepts a JSON body with the following fields:\n- `url`: The URL of the page to scrape.\n- `schema`: The JSON Schema describing the data to extract from the page. The schema must be a valid JSON Schema object. Read more about JSON Schema [here](https://json-schema.org/).\n\n#### Example\n```bash\ncurl -X POST -H \"Content-Type: application/json\" -d '{\n  \"url\": \"https://www.amazon.com/s?k=gaming+headsets\",\n  \"schema\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"products\": {\n        \"type\": \"array\",\n        \"items\": {\n          \"type\": \"object\",\n          \"properties\": {\n            \"name\": {\n              \"type\": \"string\",\n              \"description\": \"The product name\"\n            },\n            \"price\": {\n              \"type\": \"number\",\n              \"description\": \"The price of the product in USD\"\n            }\n          }\n        }\n      }\n    }\n  }\n}' http://localhost:3001/lookup\n```\n\n## FAQ\n- **Does it work with JavaScript-heavy websites?** Yes! JsonGenius uses Chromium to render the page, so it can handle any website that a normal browser can.\n- **Can I bring my own Chromium instance?** Yes! Set the `WS_URL` environment variable to point to a [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/) endpoint, and JsonGenius will use it instead of spinning up its own Chromium instance.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsemanser%2Fjsongenius","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsemanser%2Fjsongenius","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsemanser%2Fjsongenius/lists"}