{"id":13423613,"url":"https://github.com/bauripalash/escraper","last_synced_at":"2026-01-16T22:53:26.144Z","repository":{"id":103212345,"uuid":"284467461","full_name":"bauripalash/escraper","owner":"bauripalash","description":"Scrap Email Addresses From PDFs and Photos! in C++ (Python was tooo easy to do)","archived":false,"fork":false,"pushed_at":"2020-10-05T14:09:29.000Z","size":118,"stargazers_count":14,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-26T23:11:35.865Z","etag":null,"topics":["cpp","email-parsing","hacktoberfest","scraper"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bauripalash.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":"palash","tidelift":null,"community_bridge":null,"liberapay":"bauripalash","issuehunt":null,"otechie":null,"custom":["https://paypal.me/bauripalash","https://buymeacoffee.com/palash","https://p-y.tm/9V-oX9y"]}},"created_at":"2020-08-02T13:30:56.000Z","updated_at":"2024-02-23T03:38:00.000Z","dependencies_parsed_at":"2023-06-29T00:01:05.532Z","dependency_job_id":null,"html_url":"https://github.com/bauripalash/escraper","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bauripalash%2Fescraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bauripalash%2Fescraper/tags","releases_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bauripalash%2Fescraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bauripalash%2Fescraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bauripalash","download_url":"https://codeload.github.com/bauripalash/escraper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243767010,"owners_count":20344855,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","email-parsing","hacktoberfest","scraper"],"created_at":"2024-07-31T00:00:38.842Z","updated_at":"2026-01-16T22:53:26.137Z","avatar_url":"https://github.com/bauripalash.png","language":"C++","readme":"![# Project X29 - Email Scraper](./media/banner.png)\n#### Escraper: Fast Email Scraper from PDFs and Photos in C++ (Python was too easy to do)\n\n**If you like this project, show your support by donating or giving a 🌟 star to this repository**\n\n### 🦋 What is this?\nEscraper, aka Project X29, is a simple project to scrape email addresses from PDFs and photos. 
Just feed in the input file and get the output as a `.txt` file.\n\n### 🦋 How to Use?\n\u003e (Assume we have an input file called `card.pdf`, which is a business card and includes some email addresses that we want to extract.)\n\nExecute this:\n```bash\n$ escraper -p card.pdf\n```\nAfter this, we will get an output file called `card.pdf.txt`, which will contain all the email addresses present in `card.pdf`.\n\n### 🦋 Features:\n* Extract emails from a PDF file:\n```bash\n$ escraper -p/--pdf FILENAME\n```\n* Extract emails from an image file:\n```bash\n$ escraper -i/--image FILENAME\n```\n* Choose a custom output file:\n```bash\n$ escraper -o/--out OUTPUT\n```\n\n### 🔨 How to Build?\n* Prerequisites:\n    * A C++ Compiler\n    ```bash\n    sudo apt install build-essential\n    ```\n    * ImageMagick Library\n    ```bash\n    sudo apt install graphicsmagick-libmagick-dev-compat\n    ```\n    * Tesseract OCR Library\n    ```bash\n    sudo apt install tesseract-ocr libtesseract-dev libleptonica-dev\n    ```\n    * Make\n    ```bash\n    sudo apt install make\n    ```\n* Git clone or download this repo\n```bash\ngit clone https://github.com/bauripalash/escraper\n```\n* `cd` into the project folder\n```bash\ncd escraper\n```\n* Run Make\n```bash\nmake\n```\n* Now you'll have a binary called `escraper`\n\n---\nIf you like this project, consider giving it a 🌟 star or donating. 
Follow me on socials [[Twitter]](https://twitter.com/bauripalash) | [[Facebook]](https://facebook.com/bauripalash) | [[Instagram]](https://instagram.com/bauripalash) | or even [[GitHub]](https://github.com/bauripalash)","funding_links":["https://ko-fi.com/palash","https://liberapay.com/bauripalash","https://paypal.me/bauripalash","https://buymeacoffee.com/palash","https://p-y.tm/9V-oX9y"],"categories":["C++"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbauripalash%2Fescraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbauripalash%2Fescraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbauripalash%2Fescraper/lists"}