{"id":15102680,"url":"https://github.com/ausboss/4chan-scraper-playwright","last_synced_at":"2026-01-28T02:51:11.080Z","repository":{"id":250187003,"uuid":"833734906","full_name":"ausboss/4chan-scraper-playwright","owner":"ausboss","description":"A node js scraper for 4chan threads. Finds threads by text in the subject line. ","archived":false,"fork":false,"pushed_at":"2024-07-25T22:09:46.000Z","size":32,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-03T21:10:35.161Z","etag":null,"topics":["4chan","4chan-scraper","playwright","playwright-javascript","scraper"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ausboss.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-25T16:28:31.000Z","updated_at":"2024-08-03T16:03:35.000Z","dependencies_parsed_at":"2024-07-25T19:34:03.821Z","dependency_job_id":"8ea8aaf3-c44b-488f-a6d4-e76996c9b95f","html_url":"https://github.com/ausboss/4chan-scraper-playwright","commit_stats":{"total_commits":4,"total_committers":1,"mean_commits":4.0,"dds":0.0,"last_synced_commit":"714a609d197890bdaedac731416ae30b4bc8e6f1"},"previous_names":["ausboss/4chan-scraper-playwright"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ausboss/4chan-scraper-playwright","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ausboss%2F4chan-scraper-playwright","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ausbo
ss%2F4chan-scraper-playwright/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ausboss%2F4chan-scraper-playwright/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ausboss%2F4chan-scraper-playwright/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ausboss","download_url":"https://codeload.github.com/ausboss/4chan-scraper-playwright/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ausboss%2F4chan-scraper-playwright/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28835080,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-28T02:10:51.810Z","status":"ssl_error","status_checked_at":"2026-01-28T02:10:50.806Z","response_time":57,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["4chan","4chan-scraper","playwright","playwright-javascript","scraper"],"created_at":"2024-09-25T19:04:34.792Z","updated_at":"2026-01-28T02:51:11.060Z","avatar_url":"https://github.com/ausboss.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 4chan Thread Extractor\n\nThis project is a Node.js application that extracts posts from the most popular thread matching specific subjects on any 4chan board.\n\n## Features\n\n- Flexibly 
search for 'general' threads or any thread by subject on any 4chan board\n- Specify multiple subject texts to match against a thread's subject line\n- Automatically finds the most popular matching thread\n- Extracts all posts from the thread, including text and image information\n- Returns the extracted data as a formatted string\n\n## Todo\n\n- Improve the formatting of the extracted text\n\n## Prerequisites\n\n- Node.js (v14 or later recommended)\n- npm (comes with Node.js)\n\n## Installation\n\n1. Clone this repository:\n   ```\n   git clone https://github.com/yourusername/4chan-thread-extractor.git\n   ```\n2. Navigate to the project directory:\n   ```\n   cd 4chan-thread-extractor\n   ```\n3. Install the dependencies:\n   ```\n   npm install\n   ```\n\n## Project Structure\n\n```\n4chan-thread-extractor/\n├── src/\n│   ├── index.js\n│   └── utils/\n│       └── threadExtractor.js\n├── package.json\n└── README.md\n```\n\n- `src/index.js`: The entry point of the application\n- `src/utils/threadExtractor.js`: Contains the main function for extracting the thread data\n- `package.json`: Project configuration and dependencies\n\n## Usage\n\nThe main function `extract4chanThread` in `src/utils/threadExtractor.js` takes two parameters:\n\n1. `board`: The 4chan board to search (e.g., \"g\" for /g/, \"v\" for /v/, etc.)\n2. `subjectTexts`: An array of strings to match in thread subjects\n\nTo run the script, you can modify `src/index.js` to search for specific threads. 
For example, to search for \"/lmg/\" threads on /g/:\n\n```javascript\nimport { extract4chanThread } from \"./utils/threadExtractor.js\";\n\nasync function main() {\n  console.log(\"Extracting posts from the most popular /lmg/ thread...\");\n  const extractedText = await extract4chanThread(\"g\", [\n    \"/lmg/\",\n    \"Local Models General\",\n  ]);\n  console.log(extractedText);\n}\n\nmain().catch(console.error);\n```\n\nThen run the script using:\n\n```\nnpm start\n```\n\nThis will execute the script and print the extracted posts to the console.\n\n### Examples\n\n1. To scrape the local models general thread on /g/:\n\n   ```javascript\n   const extractedText = await extract4chanThread(\"g\", [\n     \"/lmg/\",\n     \"Local Models General\",\n   ]);\n   ```\n\n2. To scrape the Fortnite General on /vg/:\n   ```javascript\n   const extractedText = await extract4chanThread(\"vg\", [\n     \"/fng/\",\n     \"Fortnite General\",\n   ]);\n   ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fausboss%2F4chan-scraper-playwright","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fausboss%2F4chan-scraper-playwright","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fausboss%2F4chan-scraper-playwright/lists"}