https://github.com/kumaranand05/imdb-data-scraper
Java Selenium based scraper to collect all media details from IMDb website.
- Host: GitHub
- URL: https://github.com/kumaranand05/imdb-data-scraper
- Owner: kumarAnand05
- Created: 2024-04-29T15:46:54.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-30T20:11:50.000Z (about 2 years ago)
- Last Synced: 2025-05-01T07:59:49.177Z (about 1 year ago)
- Topics: imdb, java, scraping, selenium
- Language: Java
- Homepage: https://www.imdb.com
- Size: 7.81 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# IMDb Data Scraper
By Anand Kumar
## Features
* **Entire Database Collection** : Extracts all the media data stored on the IMDb website.
* **CSV Output** : Writes the extracted data in CSV format.
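
As an illustration of the CSV output feature, here is a minimal sketch of how one scraped record could be turned into a CSV line. It is not taken from the project's source: the class name `CsvRowSketch`, the helper names, and the sample record (title, year, type) are all assumptions made for the example. It quotes a field only when it contains a comma, a double quote, or a line break, which matters for titles that contain commas.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the project: builds one CSV line from a list of fields.
class CsvRowSketch {

    // Quote a field only if it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled, following the common CSV convention.
    static String escapeField(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields with commas to form a single CSV line.
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvRowSketch::escapeField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Assumed sample record: title, year, media type.
        List<String> row = List.of("The Good, the Bad and the Ugly", "1966", "Movie");
        System.out.println(toCsvLine(row));
    }
}
```

Without the quoting step, the comma inside the sample title would split it into two columns when the file is opened in a spreadsheet.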
# Instructions
After you have downloaded the project files, follow the instructions below to set up your machine so that the code can run.
## Downloading/Installing dependencies
You need a [JDK](https://www.oracle.com/in/java/technologies/downloads/) and an IDE such as [VSCode](https://code.visualstudio.com) or [IntelliJ](https://www.jetbrains.com/idea/) installed on your machine.
> Download Dependencies
Open the project in your IDE and make sure you are connected to the internet. The dependencies are declared in the pom.xml file in the project directory; download them using your IDE's own steps for importing a pom.xml.
## Dos and Don'ts
> Dos
+ You can keep using your machine while the scraper runs.
+ You can keep the browser and IDE in the background.
> Don'ts
+ Do not click on any element of the webpage, as this can cause the code to terminate.
+ Do not use the console while the scraper runs.
+ Do not turn off the internet connection or close the automated browser session.