{"id":20761290,"url":"https://github.com/abdullahashfaqvirk/Earth-Engine-Data-Scraper","last_synced_at":"2025-09-27T23:30:26.897Z","repository":{"id":225872443,"uuid":"766466377","full_name":"abdullahashfaq-ds/Earth-Engine-Data-Scraper","owner":"abdullahashfaq-ds","description":"A Python based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research, and analysis purposes.","archived":false,"fork":false,"pushed_at":"2024-10-20T06:11:48.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-17T10:19:18.931Z","etag":null,"topics":["beautifulsoup","data","data-science","python","requests","scraper","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abdullahashfaq-ds.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-03T10:54:03.000Z","updated_at":"2024-10-20T06:11:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"1883b609-7426-4800-81e5-95024e56ddc1","html_url":"https://github.com/abdullahashfaq-ds/Earth-Engine-Data-Scraper","commit_stats":null,"previous_names":["abdullahashfaq-ds/earth-engine-data-scrapper","abdullahashfaq-ds/earth-engine-data-scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdullahashfaq-ds%2FEarth-Engine-Data-Scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/abdullahashfaq-ds%2FEarth-Engine-Data-Scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdullahashfaq-ds%2FEarth-Engine-Data-Scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdullahashfaq-ds%2FEarth-Engine-Data-Scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/abdullahashfaq-ds","download_url":"https://codeload.github.com/abdullahashfaq-ds/Earth-Engine-Data-Scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234469186,"owners_count":18838541,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","data","data-science","python","requests","scraper","web-scraping"],"created_at":"2024-11-17T10:18:44.833Z","updated_at":"2025-09-27T23:30:26.546Z","avatar_url":"https://github.com/abdullahashfaq-ds.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Earth Engine Data Scraper\n\nThis repository contains a Python based web scraper designed to extract dataset metadata from the [Google Earth Engine Datasets Catalog](https://developers.google.com/earth-engine/datasets/catalog). 
It can be used for data gathering in research, analysis, or integration into larger systems related to environmental or geospatial data exploration.\n\n## Features\n\n- Scrapes dataset information from multiple pages of the Google Earth Engine Datasets Catalog.\n- Extracts detailed metadata, including:\n  - Dataset title\n  - Availability information\n  - Provider name and URL\n  - Associated tags\n  - Table values, when available\n- Outputs the scraped data in a structured format for easy access and further analysis.\n\n## Installation\n\nTo set up and run the scraper, follow these steps:\n\n1. **Clone the Repository**\n\n    ```bash\n    git clone git@github.com:abdullahashfaq-ds/Earth-Engine-Data-Scraper.git\n    cd Earth-Engine-Data-Scraper\n    ```\n\n2. **Create and Activate a Virtual Environment**\n\n    ```bash\n    python -m venv venv\n    \n    # For Windows, use:\n    venv\\Scripts\\activate\n    \n    # For macOS/Linux, use:\n    source venv/bin/activate\n    ```\n\n3. **Install Dependencies**\n\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n4. **Run the Scraper**\n\n   The scraper logic is implemented in a Jupyter notebook located in the `Notebooks` directory. Open it with JupyterLab or Jupyter Notebook, and run the cells to start the scraping process.\n\n## Note\n\nIf you see an unverified signature in the commits, no worries: I've just misplaced my GPG key!\n\n## License\n\nThis project is licensed under the MIT License. For more details, see the [LICENSE](LICENSE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabdullahashfaqvirk%2FEarth-Engine-Data-Scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabdullahashfaqvirk%2FEarth-Engine-Data-Scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabdullahashfaqvirk%2FEarth-Engine-Data-Scraper/lists"}