{"id":29406320,"url":"https://github.com/ishansarvaiya/hotstarscraper","last_synced_at":"2026-04-13T13:01:50.648Z","repository":{"id":303945984,"uuid":"1017243045","full_name":"ishansarvaiya/HotstarScraper","owner":"ishansarvaiya","description":"This project is a C# application designed to scrape movie and show data from Hotstar and store it in a SQL Server database.","archived":false,"fork":false,"pushed_at":"2025-07-16T08:58:28.000Z","size":32781,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-23T07:10:30.449Z","etag":null,"topics":["csharp","dotnet","git","hotstar","scraper","selenium","sql","ssms"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ishansarvaiya.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-10T08:36:26.000Z","updated_at":"2025-07-16T08:58:32.000Z","dependencies_parsed_at":"2025-07-10T17:21:42.631Z","dependency_job_id":null,"html_url":"https://github.com/ishansarvaiya/HotstarScraper","commit_stats":null,"previous_names":["ishansarvaiya/hotstarscraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ishansarvaiya/HotstarScraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ishansarvaiya%2FHotstarScraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ishansarvaiya%2FHotstarScraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ishansarvaiya%2FHo
tstarScraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ishansarvaiya%2FHotstarScraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ishansarvaiya","download_url":"https://codeload.github.com/ishansarvaiya/HotstarScraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ishansarvaiya%2FHotstarScraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31753551,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T09:16:15.125Z","status":"ssl_error","status_checked_at":"2026-04-13T09:16:05.023Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csharp","dotnet","git","hotstar","scraper","selenium","sql","ssms"],"created_at":"2025-07-10T23:19:58.705Z","updated_at":"2026-04-13T13:01:50.629Z","avatar_url":"https://github.com/ishansarvaiya.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# HotstarScraper\n\nThis project is a C# application designed to scrape movie and show data from Hotstar and store it in a SQL Server database.\n\n## Demo\n\nhttps://github.com/user-attachments/assets/b2e8e62e-7249-44c9-9bc3-f5e4bd86f616\n\n\n\n\n## Run Locally\n\nClone the project\n\n```bash\ngit clone 
https://github.com/ishansarvaiya/HotstarScraper.git\n```\n\nGo to the project directory\n\n```bash\ncd HotstarScraper/HotstarScraper\n```\n\nInstall dependencies\n\n```bash\ndotnet restore\n```\n\nRun the application\n\n```bash\ndotnet run\n```\n\n## Features\n\n- Web Scraping: Uses Selenium with ChromeDriver to navigate Hotstar, scroll through content, click on items, and extract details like titles, descriptions, release years, ratings, durations/seasons, languages, genres, and image URLs for both movies and shows.\n- Data Persistence: Uses Entity Framework Core to interact with a SQL Server database.\n- Database Schema: Defines models for Movie, Show, Genre, Language, and their many-to-many relationships (MovieGenre, ShowGenre, MovieLanguage).\n- Data Handling: Includes a DataService responsible for saving scraped data, checking for existing entries, and managing relationships with genres and languages, adding new ones if they don't exist.\n- Logging: Integrates Serilog to log information messages, warnings, and errors, with output directed to both the console and a daily rolling file.\n- Configuration: Reads database connection strings and Serilog settings from an appsettings.json file.\n- Headless Browsing: Configures ChromeDriver to run in headless mode for efficient scraping without a visible browser UI.\n\n## Structure\n\n- Program.cs: The main entry point of the application, handling setup, database migrations, and orchestrating the scraping and data saving processes.\n- HotstarDbContext.cs: The Entity Framework Core DbContext for interacting with the database, defining DbSet properties for all models.\n- Data/Configurations: Contains EF Core fluent API configurations for each model, defining primary keys, required properties, maximum lengths, and unique indexes.\n- Interfaces: Defines IDataService and IScraperService interfaces for abstraction.\n- Models: Contains the POCO classes representing the data 
entities (Movie, Show, Genre, Language) and data transfer objects for scraped data (MovieScrapeData, ShowScrapeData).\n- Services/DataService.cs: Implements IDataService, responsible for saving scraped movie and show data to the database, including handling associated genres and languages.\n- Services/ScraperService.cs: Implements IScraperService, containing the core logic for configuring the web driver, navigating Hotstar, and extracting data for movies and shows.\n- appsettings.json: Configuration file for database connection strings and Serilog settings.\n\n## Database Schema\n\n- Movies (Id: INT (Primary Key), Title: NVARCHAR(255) (Required), Description: NVARCHAR(4000), ReleaseYear: NVARCHAR(10), Rating: NVARCHAR(10), Duration: NVARCHAR(50), ImageUrl: NVARCHAR(MAX))\n- Shows (Id: INT (Primary Key), Title: NVARCHAR(255) (Required), Description: NVARCHAR(4000), ReleaseYear: NVARCHAR(10), Rating: NVARCHAR(10), Season: NVARCHAR(50), ImageUrl: NVARCHAR(MAX))\n- Genres (Id: INT (Primary Key), Name: NVARCHAR(255) (Required))\n- Languages (Id: INT (Primary Key), Name: NVARCHAR(255) (Required))\n- MovieGenres (MovieId: INT (Primary Key, Foreign Key to Movies), GenreId: INT (Primary Key, Foreign Key to Genres))\n- MovieLanguages (MovieId: INT (Primary Key, Foreign Key to Movies), LanguageId: INT (Primary Key, Foreign Key to Languages))\n- ShowGenres (ShowId: INT (Primary Key, Foreign Key to Shows), GenreId: INT (Primary Key, Foreign Key to Genres))\n\n## NuGet Packages\n\n- Microsoft.EntityFrameworkCore.SqlServer\n- Microsoft.EntityFrameworkCore.Tools\n- Microsoft.Extensions.Configuration.Json\n- Selenium.WebDriver\n- Selenium.WebDriver.ChromeDriver\n- Serilog.Enrichers.Environment\n- Serilog.Enrichers.Process\n- Serilog.Enrichers.Thread\n- Serilog.Settings.Configuration\n- Serilog.Sinks.Console\n- 
Serilog.Sinks.File\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fishansarvaiya%2Fhotstarscraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fishansarvaiya%2Fhotstarscraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fishansarvaiya%2Fhotstarscraper/lists"}