{"id":21569341,"url":"https://github.com/aweirddev/researches","last_synced_at":"2025-04-10T14:07:08.821Z","repository":{"id":248881555,"uuid":"830069710","full_name":"AWeirdDev/researches","owner":"AWeirdDev","description":"The Google search scraper for any use case. Simple as it should be.","archived":false,"fork":false,"pushed_at":"2024-11-16T13:28:08.000Z","size":154,"stargazers_count":2,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-24T12:47:53.166Z","etag":null,"topics":["google","google-scraper","google-search","google-search-api","python"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/researches","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AWeirdDev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-17T14:25:27.000Z","updated_at":"2024-11-21T05:30:46.000Z","dependencies_parsed_at":"2024-07-21T16:06:49.203Z","dependency_job_id":"347fede5-3c07-4e15-b01a-817d92a8f8b9","html_url":"https://github.com/AWeirdDev/researches","commit_stats":null,"previous_names":["aweirddev/researches"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AWeirdDev%2Fresearches","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AWeirdDev%2Fresearches/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AWeirdDev%2Fresearches/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/AWeirdDev%2Fresearches/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AWeirdDev","download_url":"https://codeload.github.com/AWeirdDev/researches/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248231928,"owners_count":21069428,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["google","google-scraper","google-search","google-search-api","python"],"created_at":"2024-11-24T11:09:12.719Z","updated_at":"2025-04-10T14:07:08.796Z","avatar_url":"https://github.com/AWeirdDev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# researches\nResearches is a Google scraper. Minimal requirements.\n\nKey designs:\n- **No beautifulsoup.** We want to make sure everything is running smoothly and not slowly.\n- **Simple API.** Great developer experience, that's all that matters.\n- **Typed.** We support typing for everything you see.\n\nNote that `researches` **does not clean up data for you**, meaning it's better for LLM-based content consumption.\n\n```python\nsearch(\"Who invented papers?\")\n# Result(snippet=Snippet(…), aside=None, weather=None, web=[Web(…), …], …)\n```\n\n## Requirements\n- A decent computer with an Internet connection\n- Python ≥ 3.9 (`dataclasses` support)\n- `primp` – 🪞 HTTP connections \u0026 fingerprint impersonation.\n- `selectolax` – 🌯 The HTML parser.\n\n## Usage\nJust start searching right away. 
Don't worry, Gemini won't hurt you (also [gemini](https://preview.redd.it/l-gemini-lmao-v0-6a6q0pl4ac2d1.png?auto=webp\u0026s=31cd6b33329d895501d727e6346153bc2a3ea1d6)).\n\n```python\n# Sync code\nsearch(\n    \"US to Japan\",  # query\n    hl=\"en\",        # language\n    ua=None,        # custom user agent or ours\n    **kwargs        # kwargs to pass to primp (optional)\n) -\u003e Result\n```\n\nFor people who love async, we've also got you covered:\n```python\n# Async code\nawait asearch(\n    \"US to Japan\",  # query\n    hl=\"en\",        # language\n    ua=None,        # custom user agent or ours\n    **kwargs        # kwargs to pass to primp (optional)\n) -\u003e Result\n```\n\nSo, what does the `Result` class have to offer? At a glance:\n```haskell\nresult.snippet?\n      ⤷  .text: str\n      ⤷  .name: str?\n\nresult.aside?\n      ⤷ .text: str\n\nresult.weather?\n      ⤷ .c: str\n      ⤷ .f: str\n      ⤷ .precipitation: str\n      ⤷ .humidty: str\n      ⤷ .wind_metric: str\n      ⤷ .wind_imperial: str\n      ⤷ .description: str\n      ⤷ .forecast: PartialWeatherForReport[]\n                   ⤷ .weekday: str\n                   ⤷ .high_c: str\n                   ⤷ .low_c: str\n                   ⤷ .high_f: str\n                   ⤷ .low_f: str\n\nresult.web: Web[]\n            ⤷ .title: str\n            ⤷ .url: str\n            ⤷ .text: str\n\nresult.flights: Flight[]\n                ⤷ .title: str\n                ⤷ .description: str\n                ⤷ .duration: str\n                ⤷ .price: str\n\nresult.lyrics?\n      ⤷ .text: str\n      ⤷ .is_partial: bool\n```\n\n## Background\nData comes in different shapes and sizes, and Google played it extremely well. That also includes randomizing CSS class names, making it almost impossible to scrape data.\n\nGoogle sucks, but it's actually the knowledge base we all need. 
Say, there are these types of result pages:\n- **Links** – What made Google, \"Google.\" Or, `\u0026udm=14`.\n- **Weather** – Weather forecast.\n- **Wikipedia (aside)** – Wikipedia text.\n- **Flights** – Flights.\n- **Lyrics** – Both full and partial lyrics. \u003ckbd\u003eunstable\u003c/kbd\u003e\n\n...and many more. (Contribute!)\n\nScraper APIs out there are hella expensive, and ain't no way I'm paying or entering their free tier. So, I made my own that's perfect for extracting data with LLMs.\n\n## Other projects\nIf you're looking for something other than Google or something more general-purpose, check these out:\n\n- [`air_web`](https://github.com/AWeirdDev/air-web) – A lightweight package for crawling with minimal code.\n- [`ddginternal`](https://github.com/AWeirdDev/ddginternal) – Simple DuckDuckGo scraper.\n\n***\n\n(c) 2024 AWeirdDev, [sus2790](https://github.com/sus2790), and other silly people\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faweirddev%2Fresearches","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faweirddev%2Fresearches","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faweirddev%2Fresearches/lists"}