Projects in Awesome Lists by apify
A curated list of projects in awesome lists by apify.
https://github.com/apifytech/apify-js
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping
Last synced: 06 Jul 2025
https://github.com/apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping
Last synced: 03 Nov 2025
https://github.com/apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation beautifulsoup crawler crawling hacktoberfest headless headless-chrome pip playwright python scraper scraping web-crawler web-crawling web-scraping
Last synced: 06 Mar 2026
https://github.com/apify/fingerprint-suite
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
fingerprinting playwright puppeteer scraping typescript
Last synced: 01 Mar 2026
https://github.com/apify/proxy-chain
Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
headless-chrome javascript-library proxy-server proxychains
Last synced: 03 Nov 2025
https://github.com/apify/apify-mcp-server
The Apify MCP server enables your AI agents to extract data from social media, search engines, maps, e-commerce sites, or any other website using thousands of ready-made scrapers, crawlers, and automation tools available on the Apify Store.
Last synced: 30 Jan 2026
https://github.com/apify/got-scraping
HTTP client made for scraping based on got.
Last synced: 03 Nov 2025
https://github.com/apify/impit
impit | Rust library for browser impersonation
Last synced: 02 Mar 2026
https://github.com/apify/mcp-cli
mcpc is a CLI client for MCP. It supports persistent sessions, stdio/HTTP, OAuth 2.1, JSON output for code mode, proxy for AI sandboxes, and much more.
ai-agents bash claude cli code-mode command-line mcp mcp-client model-context-protocol shell
Last synced: 04 Mar 2026
https://github.com/apify/apify-cli
Apify command-line interface helps you create, develop, build and run Apify actors, and manage the Apify cloud platform.
apify command-line hacktoberfest headless-chrome puppeteer serveless
Last synced: 04 Mar 2026
https://github.com/apify/actor-page-analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
headless-chrome javascript web-scraping
Last synced: 12 Apr 2025
https://github.com/apify/apify-sdk-js
Apify SDK monorepo
actor apify javascript nodejs sdk typescript
Last synced: 19 Feb 2026
https://github.com/apify/apify-sdk-python
The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.
apify automation python scraping sdk
Last synced: 25 Feb 2026
https://github.com/apify/actor-scraper
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
Last synced: 03 Nov 2025
https://github.com/apify/browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
browser-automation headless-browsers playwright puppeteer rpa scraping web-scraping
Last synced: 03 Nov 2025
https://github.com/apify/fingerprint-generator
Generates realistic browser fingerprints
Last synced: 03 Nov 2025
https://github.com/apify/apify-actor-docker
Base Docker images for Apify actors.
Last synced: 03 Nov 2025
https://github.com/apify/mcp-client-capabilities
Index of all Model Context Protocol (MCP) clients and their capabilities
mcp mcp-clients model-context-protocol
Last synced: 05 Mar 2026
https://github.com/apify/apify-client-js
Apify API client for JavaScript / Node.js.
Last synced: 03 Nov 2025
https://github.com/apify/header-generator
Node.js package for generating browser-like headers.
Last synced: 03 Nov 2025
https://github.com/apify/fingerprint-injector
Home of fingerprint injector.
Last synced: 03 Nov 2025
https://github.com/apify/actor-whitepaper
This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.
agents automation node-js python scraping serverless
Last synced: 03 Nov 2025
https://github.com/apify/actor-web-automation-agent
This is the experimental version of Web Automation Agent. The agent uses natural language instructions to browse the web and extract data.
Last synced: 04 Mar 2026
https://github.com/apify/apify-docs
This project is the home of Apify's documentation.
Last synced: 03 Nov 2025
https://github.com/apify/xlsx-stream
JavaScript / Node.js library to stream data into an XLSX file
Last synced: 03 Nov 2025
https://github.com/apify/actor-templates
This project is the home of Apify Actor templates to help users quickly get started. Contributions welcome!
Last synced: 03 Nov 2025
https://github.com/apify/super-scraper
Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!
api apify cheerio javascript nodejs playwright scraping typescript web-scraping
Last synced: 03 Nov 2025
https://github.com/apify/got-cjs
An action to release a CommonJS version of the popular library got, which is soon to be available only in an ESM format.
Last synced: 03 Nov 2025
https://github.com/apify/actor-content-checker
You can use this actor to monitor any page's content and get a notification when the content changes.
apify content-selector web-scraping
Last synced: 03 Nov 2025
https://github.com/apify/apify-shared-js
Utilities and constants shared across Apify projects.
Last synced: 25 Feb 2026
https://github.com/apify/actor-quick-start
Contains a boilerplate of an Apify actor to help you quickly get started building your own actors.
Last synced: 03 Nov 2025
https://github.com/apify/devtools-server
Runs a simple server that allows you to connect to Chrome DevTools running on dynamic hosts, not only localhost.
Last synced: 03 Nov 2025
https://github.com/apify/idcac
I Don't Care About Cookies extension compiled for use with Playwright/Puppeteer
Last synced: 03 Nov 2025
https://github.com/apify/apify-zapier-integration
Apify integration for Zapier
Last synced: 03 Mar 2026
https://github.com/apify/actor-scrapy-executor
Apify actor to run web spiders written in Python in the Scrapy library
Last synced: 03 Nov 2025
https://github.com/apify/better-sqlite3-with-prebuilds
Better SQLite prebuild & publish action
Last synced: 03 Nov 2025
https://github.com/apify/actor-llmstxt-generator
The /llms.txt Generator Actor 🕸️📄 extracts website content to create an llms.txt file for AI apps 🤖✨ like LLM fine-tuning and indexing. Output is available 📥 in the Key-Value Store for easy download and integration into workflows. 🚀
Last synced: 03 Jan 2026
https://github.com/apify/chat-with-a-website
A simple app that lets you chat with a given website.
Last synced: 03 Nov 2025
https://github.com/apify/actor-legacy-phantomjs-crawler
The actor implements the legacy Apify Crawler product. It uses PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of JavaScript code.
apify headless-browsers phantomjs web-crawler web-scraping
Last synced: 19 Feb 2026
https://github.com/apify/actor-vector-database-integrations
Transfer data from Apify Actors to vector databases (Chroma, Milvus, Pinecone, PostgreSQL (PG-Vector), Qdrant, and Weaviate)
Last synced: 03 Nov 2025
https://github.com/apify/browser-headers-generator
Package generating randomized browser-like headers.
Last synced: 03 Nov 2025
https://github.com/apify/langchain-apify
Apify integration for LangChain 🦜🔗
Last synced: 06 Mar 2026
https://github.com/apify/actor-whitepaper-web
Documentation site for the Actor Programming Model – a fresh take on serverless microapps. Built with Astro.
actor-model apify astro website
Last synced: 03 Nov 2025
https://github.com/apify/input-schema-editor-react
Apify input schema editor written in React.js
Last synced: 03 Nov 2025
https://github.com/apify/crawlee-parallel-scraping-example
An example repository showcasing how you can scrape in parallel using one request queue
Last synced: 03 Nov 2025
https://github.com/apify/actor-example-proxy-intercept-request
Example: Intercept requests from an HTTPS connection using a "man-in-the-middle" proxy solution.
Last synced: 10 Feb 2026
https://github.com/apify/actor-example-secret-input
Example actor showcasing the secret input fields
Last synced: 26 Jan 2026
https://github.com/apify/apify-web-covid-19
A list of public COVID-19 APIs to be rendered on https://apify.com/covid-19
Last synced: 03 Nov 2025
https://github.com/apify/apify-storage-local-js
Local emulation of the apify-client NPM package, which enables local use of Apify SDK.
Last synced: 03 Nov 2025
https://github.com/apify/actor-imagediff
Returns an image containing the difference between two given images.
Last synced: 03 Nov 2025
https://github.com/apify/aidevworld2023
Slides and supporting materials for "How to get clean web data for chatbots and LLMs".
Last synced: 03 Nov 2025
https://github.com/apify/llama-hub
A library of data loaders for LLMs made by the community -- to be used with GPT Index and/or LangChain
Last synced: 11 Apr 2025
https://github.com/apify/apify-eslint-config
Apify ESLint preset to be shared between projects
Last synced: 03 Nov 2025
https://github.com/apify/waw-file-specification
Contains specification of the Web Automation Workflow (WAW) file.
Last synced: 23 Feb 2026
https://github.com/apify/scraping-tools-js
A library of utility functions that make scraping, data extraction and usage of headless browsers easier and faster.
Last synced: 03 Nov 2025
https://github.com/apify/actor-example-php
Example of an Apify actor using PHP
Last synced: 03 Nov 2025
https://github.com/apify/apify-shared-python
Constants and utilities shared across Apify's Python libraries and projects.
Last synced: 16 Jan 2026
https://github.com/apify/rag-web-browser
RAG Web Browser is a tool to provide your RAG pipelines with up-to-date information from the web.
Last synced: 03 Nov 2025
https://github.com/apify/openapi
An OpenAPI specification for the Apify API.
Last synced: 12 Apr 2025
https://github.com/apify/slack-messages-action
Wraps the sending of messages from Apify GitHub workflows to Slack.
Last synced: 03 Nov 2025
https://github.com/apify/playwright-test-actor
Source code for the Playwright Test public actor.
Last synced: 03 Nov 2025
https://github.com/apify/actor-selenium-mocha-runner
Actor that runs Selenium based Mocha tests.
Last synced: 01 Mar 2026
https://github.com/apify/scrapy-migrator
A standalone POC script for wrapping Scrapy projects with Apify middleware.
Last synced: 09 Nov 2025
https://github.com/apify/actor-crawler-puppeteer
DEPRECATED: An Apify actor that enables crawling of websites using headless Chrome and Puppeteer. The actor is highly customizable and supports recursive crawling of websites as well as lists of URLs.
Last synced: 03 Nov 2025
https://github.com/apify/actor-crawler-cheerio
DEPRECATED: An actor that crawls websites and parses HTML pages using Cheerio library. Supports recursive crawling as well as URL lists.
Last synced: 03 Nov 2025
https://github.com/apify/actor-email-signature-generator
Apify Email Signature Generator
Last synced: 20 Jan 2026
https://github.com/apify/actor-algolia-website-indexer
Apify actor that crawls a website and indexes selected web pages to an Algolia index. It's used to power the search on https://help.apify.com
Last synced: 03 Nov 2025
https://github.com/apify/apify-eslint-config-ts
TypeScript ESLint configuration shared across projects in Apify.
Last synced: 09 Nov 2025
https://github.com/apify/pull-request-toolkit-action
The GitHub action that makes sure that each PR is correctly set up and has a milestone set.
Last synced: 03 Nov 2025
https://github.com/apify/apify-tsconfig
TypeScript configuration shared across projects in Apify.
Last synced: 03 Nov 2025
https://github.com/apify/actor-monorepo-example
An example repository with multiple Apify Actors sharing code between each other.
Last synced: 03 Nov 2025
https://github.com/apify/actor-scrapy-books-example
Example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.
Last synced: 03 Nov 2025
https://github.com/apify/apify-sdk-v2
Snapshot of Apify SDK v2 + sdk.apify.com website. This project is no longer maintained. See the https://github.com/apify/apify-sdk-js repo instead!
Last synced: 03 Nov 2025
https://github.com/apify/apify-haystack
The official integration for Apify and Haystack 2.0
Last synced: 03 Nov 2025
https://github.com/apify/keboola-ex-apify
Apify extractor for Keboola Connection
Last synced: 03 Nov 2025
https://github.com/apify/actor-integration-tests
This Apify actor is used for integration tests.
Last synced: 03 Nov 2025
https://github.com/apify/apify-docs-preset
Common preset for the v2 documentation Docusaurus instances.
Last synced: 03 Nov 2025
https://github.com/apify/release-pr-action
This action simplifies creating a release PR.
Last synced: 01 Feb 2026
https://github.com/apify/actor-aws-costs-to-slack
This tool integrates with AWS to monitor service usage costs and posts a summary of these costs to a Slack channel. The summary includes costs for various AWS services along with a chart that provides a visual breakdown of the costs over time.
Last synced: 03 Nov 2025
https://github.com/apify/apify.github.io
The top-level organization GitHub Page.
Last synced: 25 Jan 2026
https://github.com/apify/docs-search-modal
Custom Algolia search modal for Apify Documentation.
Last synced: 03 Nov 2025
https://github.com/apify/echo-standby-actor
An example Actor using Standby mode
Last synced: 02 Feb 2026