{"id":21385066,"url":"https://github.com/ryshaal/google-scholar-scraping","last_synced_at":"2025-08-24T17:17:29.868Z","repository":{"id":259485559,"uuid":"863675299","full_name":"ryshaal/Google-Scholar-Scraping","owner":"ryshaal","description":"Google Scholar Scraper","archived":false,"fork":false,"pushed_at":"2024-09-27T09:41:36.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-16T11:43:33.260Z","etag":null,"topics":["article-extractor","scraping-websites"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ryshaal.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-26T17:53:30.000Z","updated_at":"2024-10-02T14:04:32.000Z","dependencies_parsed_at":"2024-10-25T23:10:52.044Z","dependency_job_id":"a3313898-661f-4e3d-9f07-25d0025118ed","html_url":"https://github.com/ryshaal/Google-Scholar-Scraping","commit_stats":null,"previous_names":["ryshaal/google-scholar-scraping"],"tags_count":0,"template":true,"template_full_name":null,"purl":"pkg:github/ryshaal/Google-Scholar-Scraping","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryshaal%2FGoogle-Scholar-Scraping","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryshaal%2FGoogle-Scholar-Scraping/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryshaal%2FGoogle-Scholar-Scraping/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/ryshaal%2FGoogle-Scholar-Scraping/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ryshaal","download_url":"https://codeload.github.com/ryshaal/Google-Scholar-Scraping/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryshaal%2FGoogle-Scholar-Scraping/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271912608,"owners_count":24842763,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-24T02:00:11.135Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["article-extractor","scraping-websites"],"created_at":"2024-11-22T11:44:28.456Z","updated_at":"2025-08-24T17:17:29.826Z","avatar_url":"https://github.com/ryshaal.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Google Scholar Scraper\n\nThis Python script allows you to scrape article data from Google Scholar based on a query provided in a text file. The script extracts information such as the article's title, authors, publication year, journal name, volume, citation count, and PDF links (if available). 
The results are saved as RIS files for easy citation management, and PDFs can be downloaded when available.\n\n## Features\n\n- Fetch articles from Google Scholar based on a query\n- Extract information including:\n  - Title\n  - Authors\n  - Year of publication\n  - Journal name and volume\n  - PDF link (if available)\n  - Citation count\n- Save article metadata as `.ris` files\n- Download PDFs of articles when available\n\n## How to Use\n\n1. Clone this repository:\n\n    ```bash\n    git clone https://github.com/ryshaal/Google-Scholar-Scraping/\n    ```\n\n2. Navigate to the project directory:\n    ```bash\n    cd Google-Scholar-Scraping\n    ```\n\n3. Create a text file for your query:\nIn the `input_query` folder, create a file named `query.txt` and enter your search query.\n\n4. Run the script:\n    ```bash\n    python gscholar.py\n    ```\n\n5. Output:\nThe script will display the article information in the terminal and save the metadata as a `.ris` file in the `output` folder. If a PDF link is available, the script will download the PDF to the same folder.\n\n**Example Query**\n\nIn `input_query/query.txt`, you might add a search query like:\n```plaintext\nmachine learning applications in education\n```\n\n## Requirements\n\n- Python 3.x\n- `requests`\n- `beautifulsoup4`\n\nYou can install the required packages by running:\n\n```bash\npip install -r requirements.txt\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryshaal%2Fgoogle-scholar-scraping","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fryshaal%2Fgoogle-scholar-scraping","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryshaal%2Fgoogle-scholar-scraping/lists"}