{"id":22457585,"url":"https://github.com/sadmanca/imdb-scraper","last_synced_at":"2026-05-03T16:32:23.969Z","repository":{"id":105361303,"uuid":"335142040","full_name":"sadmanca/imdb-scraper","owner":"sadmanca","description":"Scrapes IMDb's movie database and outputs the data to CSV files.","archived":false,"fork":false,"pushed_at":"2021-02-13T01:35:32.000Z","size":29,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-27T13:43:37.057Z","etag":null,"topics":["beautifulsoup","data-scraping","imdb","numpy","pandas","python","requests"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sadmanca.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-02T02:14:50.000Z","updated_at":"2021-07-24T00:56:22.000Z","dependencies_parsed_at":"2023-07-15T15:33:15.266Z","dependency_job_id":null,"html_url":"https://github.com/sadmanca/imdb-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sadmanca/imdb-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sadmanca%2Fimdb-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sadmanca%2Fimdb-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sadmanca%2Fimdb-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sadmanca%2Fimdb-scraper/manifests","owner_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sadmanca","download_url":"https://codeload.github.com/sadmanca/imdb-scraper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sadmanca%2Fimdb-scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32577121,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","data-scraping","imdb","numpy","pandas","python","requests"],"created_at":"2024-12-06T08:08:43.759Z","updated_at":"2026-05-03T16:32:23.962Z","avatar_url":"https://github.com/sadmanca.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# IMDB Top 1000 Movies\nThis project scrapes the [IMDb \"Top 1000\" movies (sorted by popularity)](https://www.imdb.com/search/title/?groups=top_1000) webpages and outputs the data to a CSV file.\n\n## Sample Output\n\n|    | movie                      | year | runtime | imdb | metascore | votes   | grossMillions |\n| -- | -------------------------- | ---- | ------- | ---- | --------- | ------- | ------------- |\n| 0  | Dara of Jasenovac          | 2020 | 130     | 8.7  |           | 51892   |               |\n| 1  | Soul                       | 2020 | 
100     | 8.1  | 83.0      | 172275  |               |\n| 2  | Groundhog Day              | 1993 | 101     | 8.0  | 72.0      | 580305  | 70.91         |\n| 3  | The Sound of Music         | 1965 | 172     | 8.0  | 63.0      | 206581  | 163.21        |\n| 4  | Avengers: Endgame          | 2019 | 181     | 8.4  | 78.0      | 815967  | 858.37        |\n| 5  | Deadpool 2                 | 2018 | 119     | 7.7  | 66.0      | 480793  | 324.59        |\n| ...| ...                        | ...  | ...     | ...  | ...       | ...     | ...           |\n\n### Downloads\n\n* [imdb_scraper.py](imdb_scraper.py) - main program that scrapes the IMDb webpages\n* [movies.csv](movies.csv) - output CSV file\n\n## Built With\n\n* [Requests](https://requests.readthedocs.io) - library for making HTTP requests\n* [Beautiful Soup 4](https://pypi.org/project/beautifulsoup4/) - library for scraping information from web pages\n* [NumPy](https://numpy.org) - high-performance library for multi-dimensional arrays\n* [Pandas](https://pandas.pydata.org) - provides tools for manipulating tables","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsadmanca%2Fimdb-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsadmanca%2Fimdb-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsadmanca%2Fimdb-scraper/lists"}