{"id":50519791,"url":"https://github.com/diffbot/diffbot-python","last_synced_at":"2026-06-03T03:06:50.252Z","repository":{"id":13580157,"uuid":"16272753","full_name":"diffbot/diffbot-python","owner":"diffbot","description":"Python client library for Diffbot APIs","archived":false,"fork":false,"pushed_at":"2026-05-26T19:59:50.000Z","size":86,"stargazers_count":124,"open_issues_count":0,"forks_count":39,"subscribers_count":14,"default_branch":"main","last_synced_at":"2026-05-26T21:25:11.846Z","etag":null,"topics":["crawler","knowledge-graph","natural-language-processing","web-data","web-data-extraction"],"latest_commit_sha":null,"homepage":"https://docs.diffbot.com","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"googlemaps/android-samples","license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/diffbot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2014-01-27T08:00:13.000Z","updated_at":"2026-05-26T20:01:16.000Z","dependencies_parsed_at":"2022-08-31T00:42:06.978Z","dependency_job_id":null,"html_url":"https://github.com/diffbot/diffbot-python","commit_stats":null,"previous_names":["diffbot/diffbot-python"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/diffbot/diffbot-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diffbot%2Fdiffbot-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diffbot%2Fdiffbot-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diffbot%2Fdiffbot-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diffbot%2Fdiffbot-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/diffbot","download_url":"https://codeload.github.com/diffbot/diffbot-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diffbot%2Fdiffbot-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33845820,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","knowledge-graph","natural-language-processing","web-data","web-data-extraction"],"created_at":"2026-06-03T03:06:49.405Z","updated_at":"2026-06-03T03:06:50.237Z","avatar_url":"https://github.com/diffbot.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Diffbot Python Library\n\nPython client library for [Diffbot](https://www.diffbot.com) APIs.\n\n\n## Installation\n\n```bash\npip install git+https://github.com/diffbot/diffbot-python.git\n```\n\nOr, for local development:\n\n```bash\npip install -e \".[dev]\"\n```\n\n## Usage\n\n### Authentication\nSet your Diffbot API token in your environment or .env.\n\n```bash\nexport DIFFBOT_API_TOKEN=\u003cTOKEN\u003e\n```\n\n### Extract structured content\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\ndata = db.extract(\"https://www.example.com\")\n```\n\n### Ask Diffbot LLM\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\nfor chunk in db.ask([{\"role\": \"user\", \"content\": \"What's the capital of France?\"}]):\n    print(chunk, end=\"\")\n```\n\n### Crawl a site for structured content\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\nfor event in db.crawl(\"https://www.example.com\", hops=1):\n    print(event)\n```\n\n### Query the Knowledge Graph\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\nresults = db.dql('type:Organization name:\"Diffbot\"')\n```\n\n### Web Search\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\nresults = db.web_search(\"diffbot knowledge graph\")\nfor r in results[\"search_results\"]:\n    print(r[\"score\"], r[\"title\"], r[\"pageUrl\"])\n    print(r[\"content\"])\n```\n\n### Entities (NLP)\n```python\nfrom diffbot import Diffbot\n\ndb = Diffbot(token=\"YOUR_TOKEN\")\nresult = db.entities(\"Apple CEO Tim Cook announced record quarterly earnings.\")\nfor entity in result[\"entities\"]:\n    print(entity[\"name\"], entity.get(\"type\"), entity.get(\"id\"))\nprint(\"sentiment:\", result.get(\"sentiment\"))\n```\n\n## Async Usage\n\n### Extract structured content\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        data = await db.extract(\"https://www.example.com\")\n        print(data)\n\nasyncio.run(main())\n```\n\n### Ask Diffbot LLM\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        async for chunk in db.ask([{\"role\": \"user\", \"content\": \"What's the capital of France?\"}]):\n            print(chunk, end=\"\")\n\nasyncio.run(main())\n```\n\n### Crawl a site for structured content\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        async for event in db.crawl(\"https://www.example.com\", hops=1):\n            print(event)\n\nasyncio.run(main())\n```\n\n### Query the Knowledge Graph\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        results = await db.dql('type:Organization name:\"Diffbot\"')\n        print(results)\n\nasyncio.run(main())\n```\n\n### Web Search\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        results = await db.web_search(\"diffbot knowledge graph\")\n        for r in results[\"search_results\"]:\n            print(r[\"score\"], r[\"title\"], r[\"pageUrl\"])\n            print(r[\"content\"])\n\nasyncio.run(main())\n```\n\n### Entities (NLP)\n```python\nimport asyncio\nfrom diffbot import DiffbotAsync\n\nasync def main():\n    async with DiffbotAsync(token=\"YOUR_TOKEN\") as db:\n        result = await db.entities(\"Apple CEO Tim Cook announced record quarterly earnings.\")\n        for entity in result[\"entities\"]:\n            print(entity[\"name\"], entity.get(\"type\"), entity.get(\"id\"))\n        print(\"sentiment:\", result.get(\"sentiment\"))\n\nasyncio.run(main())\n```\n\n## CLI\n\nThis library also includes a CLI.\n\n```bash\nexport DIFFBOT_API_TOKEN=your-token-here\n\ndb extract https://www.example.com\ndb ask \"What's the capital of France?\"\ndb crawl https://www.example.com --hops 1\ndb crawl-list-jobs\ndb crawl-delete-job crawl-1234567890\ndb web-search \"diffbot knowledge graph\"\ndb web-search \"diffbot knowledge graph\" -n 5 -f json\ndb entities \"Apple CEO Tim Cook announced record quarterly earnings.\"\ndb entities \"Apple CEO Tim Cook announced record quarterly earnings.\" -f dql\n```\n\n## Tests\n\nRun the mock test suite:\n```bash\npython -m pytest\n```\n\nRun live integration tests against the real API (requires a valid token):\n```bash\nDIFFBOT_TOKEN=your_token python -m pytest -m live\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiffbot%2Fdiffbot-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdiffbot%2Fdiffbot-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiffbot%2Fdiffbot-python/lists"}