{"id":13823607,"url":"https://github.com/dailydotdev/daily-scraper","last_synced_at":"2025-07-08T18:30:41.718Z","repository":{"id":36955239,"uuid":"269609125","full_name":"dailydotdev/daily-scraper","owner":"dailydotdev","description":"Fetches information about every webpage 🤖","archived":false,"fork":false,"pushed_at":"2024-11-12T11:16:23.000Z","size":1581,"stargazers_count":108,"open_issues_count":17,"forks_count":28,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-11-12T11:35:10.963Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dailydotdev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-05T11:04:59.000Z","updated_at":"2024-11-12T11:02:38.000Z","dependencies_parsed_at":"2024-01-13T16:20:34.164Z","dependency_job_id":"23fd3bd3-8318-4e4d-8213-7373320300bc","html_url":"https://github.com/dailydotdev/daily-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dailydotdev%2Fdaily-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dailydotdev%2Fdaily-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dailydotdev%2Fdaily-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dailydotdev%2Fdaily-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/da
ilydotdev","download_url":"https://codeload.github.com/dailydotdev/daily-scraper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225453341,"owners_count":17476706,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T09:00:38.151Z","updated_at":"2025-07-08T18:30:41.713Z","avatar_url":"https://github.com/dailydotdev.png","language":"HTML","readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003eDaily Scraper\u003c/h1\u003e\n  \u003cstrong\u003eFetches information about every webpage 🤖\u003c/strong\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://circleci.com/gh/dailydotdev/daily-scraper\"\u003e\n    \u003cimg src=\"https://img.shields.io/circleci/build/github/dailydotdev/daily-scraper/master.svg\" alt=\"Build Status\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/dailydotdev/daily-scraper/blob/master/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/dailydotdev/daily-scraper.svg\" alt=\"License\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://stackshare.io/daily/daily\"\u003e\n    \u003cimg src=\"http://img.shields.io/badge/tech-stack-0690fa.svg?style=flat\" alt=\"StackShare\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\nThe service uses [Puppeteer](https://github.com/puppeteer/puppeteer), a headless Chrome automation library, to scrape webpages.\nCurrently, its only purpose is to provide information when a user suggests a new source.\nThe scraper can find the icon, RSS feed, name, and other relevant information for 
every page.\n\n## Stack\n\n* Node v16.20.0 (a `.nvmrc` is provided for [nvm](https://github.com/nvm-sh/nvm) users).\n\n* NPM for managing dependencies.\n\n* Fastify as the web framework.\n\n## Project structure\n\n* `__tests__` - Contains all the tests and fixtures. Tests are written using `jest`.\n\n* `helm` - The service's Helm chart for easily deploying it to Kubernetes.\n\n* `src` - The source files.\n\n  * `scrape` - Utility functions for scraping information from a webpage.\n\n## Local environment\n\nDaily Scraper has no external dependencies: it requires no database or other backing service.\n\n[.env](.env) is used to set the required environment variables. It is loaded automatically by the project.\n\nFinally, run `npm run dev` to start the service; it listens on port `5001`.\n\n## Want to Help?\n\nSo you want to contribute to Daily Scraper and make an impact? We are glad to hear it. :heart_eyes:\n\nBefore you proceed, we have a few guidelines for contribution that will make everything much easier.\nWe would appreciate it if you could take the time to read them carefully:\n\nhttps://github.com/dailydotdev/.github/blob/master/CONTRIBUTING.md\n","funding_links":[],"categories":["HTML"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdailydotdev%2Fdaily-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdailydotdev%2Fdaily-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdailydotdev%2Fdaily-scraper/lists"}