{"id":43778515,"url":"https://github.com/apex-woot/mr-scraper","last_synced_at":"2026-02-05T18:00:52.463Z","repository":{"id":336407012,"uuid":"1149489399","full_name":"apex-woot/mr-scraper","owner":"apex-woot","description":"LinkedIn Profile Scraper \u0026 Data Exporter - TS/Playwright tool for extracting profiles, companies, jobs, and posts powered by Bun runtime","archived":false,"fork":false,"pushed_at":"2026-02-04T16:17:21.000Z","size":117,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-04T20:20:09.071Z","etag":null,"topics":["automation","browser-automation","bun","company-data","data-exporter","data-extraction","job-scraper","linkedin","linkedin-api","linkedin-data","linkedin-jobs","linkedin-posts","linkedin-scraper","playwright","profile-extraction","profile-scraper","scraper","typescript","web-scraping","zod"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apex-woot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-04T07:04:41.000Z","updated_at":"2026-02-04T16:55:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/apex-woot/mr-scraper","commit_stats":null,"previous_names":["apex-woot/mr-linkedin"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/apex-woot/mr-scraper","repository_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/apex-woot%2Fmr-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apex-woot%2Fmr-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apex-woot%2Fmr-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apex-woot%2Fmr-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apex-woot","download_url":"https://codeload.github.com/apex-woot/mr-scraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apex-woot%2Fmr-scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29128621,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-05T17:12:17.649Z","status":"ssl_error","status_checked_at":"2026-02-05T17:11:23.670Z","response_time":65,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","browser-automation","bun","company-data","data-exporter","data-extraction","job-scraper","linkedin","linkedin-api","linkedin-data","linkedin-jobs","linkedin-posts","linkedin-scraper","playwright","profile-extraction","profile-scraper","scraper","typescript","web-scraping","zod"],"created_at":"2026-02-05T18:00:27.883Z","updated_at":"2026-02-05T18:00:52.456Z","avatar_url":"https://github.com/apex-woot.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# @apexwoot/mr-scraper\n\n[![npm version](https://img.shields.io/npm/v/@apexwoot/mr-scraper.svg)](https://www.npmjs.com/package/@apexwoot/mr-scraper)\n[![License: GPL-3.0](https://img.shields.io/badge/License-GPL--3.0-blue.svg)](LICENSE)\n[![Bun](https://img.shields.io/badge/Bun-%23000000.svg?style=flat\u0026logo=bun\u0026logoColor=white)](https://bun.sh)\n\nA high-performance LinkedIn scraper for **Bun + Node.js**. 
Built with **Playwright** and **Zod** for robust automation and type-safe data extraction.\n\n## Features\n\n- **Dual Runtime Support:** Optimized native builds for both **Bun** and **Node.js**.\n- **Data Extraction:** Profiles, Companies, Job Postings, and Company Posts.\n- **Type Safety:** Full TypeScript support with Zod-validated schemas.\n- **Session Management:** Persist authentication via `storageState` to skip repeated logins.\n- **Extensible:** Custom callbacks for real-time progress tracking (JSON, Multi, Console).\n\n### 🚀 Improved Robustness\n\n| Feature | Python Version | This Version |\n| :--- | :---: | :---: |\n| **Experience** | Basic | **Robust \u0026 Detailed** |\n| **Patents** | Limited | **Full Extraction** |\n| **Data Validation** | Pydantic | **Strict Zod Schemas** |\n| **Concurrency** | Threading | **Modern Async/Await** |\n\n## Session Persistence\n\nTo avoid repeated logins and bot detection, save and reuse your session state:\n\n```typescript\n// Save session\nawait loginWithCredentials(page, { email, password });\nawait browser.context.storageState({ path: 'state.json' });\n\n// Reuse session\nconst browser = new BrowserManager({ storageState: 'state.json' });\nawait browser.start();\n```\n\n## Development\n\n```bash\nbun install    # Setup\nbun test       # Run tests\nbun run build  # Build dist\n```\n\n## Roadmap / TODO\n\n- [x] High-performance Bun + Playwright core\n- [x] Robust Experience \u0026 Patent extraction\n- [ ] Robust extraction of other sections (Education, Publications, Skills, Interests, etc.)\n- [ ] Proxy support integration\n- [ ] LinkedIn Messaging scraping support\n- [ ] Recruiter-specific data points\n- [ ] Automated CAPTCHA solving hooks\n\n---\n\n*Disclaimer: This tool is for educational purposes only. 
Users are responsible for complying with LinkedIn's Terms of Service.*\n\n\u003csmall\u003eTypeScript port of [linkedin_scraper](https://github.com/joeyism/linkedin_scraper) by [joeyism](https://github.com/joeyism) done mostly by AI.\u003c/small\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapex-woot%2Fmr-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapex-woot%2Fmr-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapex-woot%2Fmr-scraper/lists"}