Projects in Awesome Lists by apify

A curated list of projects in awesome lists by apify.

https://github.com/apifytech/apify-js

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 06 Jul 2025

https://github.com/apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 03 Nov 2025

https://github.com/apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation beautifulsoup crawler crawling hacktoberfest headless headless-chrome pip playwright python scraper scraping web-crawler web-crawling web-scraping

Last synced: 06 Mar 2026

https://github.com/apify/fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

fingerprinting playwright puppeteer scraping typescript

Last synced: 01 Mar 2026

https://github.com/apify/proxy-chain

Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

headless-chrome javascript-library proxy-server proxychains

Last synced: 03 Nov 2025

https://github.com/apify/apify-mcp-server

The Apify MCP server enables your AI agents to extract data from social media, search engines, maps, e-commerce sites, or any other website using thousands of ready-made scrapers, crawlers, and automation tools available on the Apify Store.

agents ai mcp mcp-server

Last synced: 30 Jan 2026

https://github.com/apify/got-scraping

HTTP client made for scraping based on got.

Last synced: 03 Nov 2025

https://github.com/apify/impit

impit | Rust library for browser impersonation

Last synced: 02 Mar 2026

https://github.com/apify/mcp-cli

mcpc is a CLI client for MCP. It supports persistent sessions, stdio/HTTP, OAuth 2.1, JSON output for code mode, proxy for AI sandboxes, and much more.

ai-agents bash claude cli code-mode command-line mcp mcp-client model-context-protocol shell

Last synced: 04 Mar 2026

https://github.com/apify/apify-cli

The Apify command-line interface helps you create, develop, build and run Apify actors, and manage the Apify cloud platform.

apify command-line hacktoberfest headless-chrome puppeteer serveless

Last synced: 04 Mar 2026

https://github.com/apify/actor-page-analyzer

Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.

headless-chrome javascript web-scraping

Last synced: 12 Apr 2025

https://github.com/apify/apify-sdk-python

The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.

apify automation python scraping sdk

Last synced: 25 Feb 2026

https://github.com/apify/actor-scraper

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.

apify web-scraping

Last synced: 03 Nov 2025

https://github.com/apify/browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

browser-automation headless-browsers playwright puppeteer rpa scraping web-scraping

Last synced: 03 Nov 2025

https://github.com/apify/apify-client-python

Apify API client for Python

api apify client python scraping

Last synced: 30 Jan 2026

https://github.com/apify/fingerprint-generator

Generates realistic browser fingerprints

Last synced: 03 Nov 2025

https://github.com/apify/apify-actor-docker

Base Docker images for Apify actors.

Last synced: 03 Nov 2025

https://github.com/apify/mcp-client-capabilities

Index of all Model Context Protocol (MCP) clients and their capabilities

mcp mcp-clients model-context-protocol

Last synced: 05 Mar 2026

https://github.com/apify/apify-client-js

Apify API client for JavaScript / Node.js.

Last synced: 03 Nov 2025

https://github.com/apify/header-generator

Node.js package for generating browser-like headers.

Last synced: 03 Nov 2025

https://github.com/apify/fingerprint-injector

Home of fingerprint injector.

Last synced: 03 Nov 2025

https://github.com/apify/actor-whitepaper

This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.

agents automation node-js python scraping serverless

Last synced: 03 Nov 2025

https://github.com/apify/covid-19

Open APIs with statistics about Covid-19

Last synced: 03 Nov 2025

https://github.com/apify/actor-web-automation-agent

This is the experimental version of Web Automation Agent. The agent uses natural language instructions to browse the web and extract data.

Last synced: 04 Mar 2026

https://github.com/apify/apify-docs

This project is the home of Apify's documentation.

Last synced: 03 Nov 2025

https://github.com/apify/xlsx-stream

JavaScript / Node.js library to stream data into an XLSX file

Last synced: 03 Nov 2025

https://github.com/apify/actor-templates

This project is the 🏠 home of Apify Actor templates to help users quickly get started. Contributions welcome!

Last synced: 03 Nov 2025

https://github.com/apify/super-scraper

Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!

api apify cheerio javascript nodejs playwright scraping typescript web-scraping

Last synced: 03 Nov 2025

https://github.com/apify/got-cjs

An action to release a CommonJS version of the popular library got, which is soon to be available only in an ESM format.

Last synced: 03 Nov 2025

https://github.com/apify/actor-content-checker

You can use this actor to monitor any page's content and get a notification when the content changes.

apify content-selector web-scraping

Last synced: 03 Nov 2025

https://github.com/apify/apify-shared-js

Utilities and constants shared across Apify projects.

Last synced: 25 Feb 2026

https://github.com/apify/actor-quick-start

Contains a boilerplate Apify actor to help you quickly get started building your own actors.

Last synced: 03 Nov 2025

https://github.com/apify/devtools-server

Runs a simple server that allows you to connect to Chrome DevTools running on dynamic hosts, not only localhost.

Last synced: 03 Nov 2025

https://github.com/apify/idcac

I Don't Care About Cookies extension compiled for use with Playwright/Puppeteer

Last synced: 03 Nov 2025

https://github.com/apify/apify-zapier-integration

Apify integration for Zapier

api apify web-scraping zapier

Last synced: 03 Mar 2026

https://github.com/apify/homebrew-tap

A Homebrew tap for Apify tools

Last synced: 03 Nov 2025

https://github.com/apify/actor-scrapy-executor

Apify actor to run web spiders written in Python with the Scrapy library

apify scrapy scrapy-spiders

Last synced: 03 Nov 2025

https://github.com/apify/better-sqlite3-with-prebuilds

Better SQLite prebuild & publish action

Last synced: 03 Nov 2025

https://github.com/apify/actor-llmstxt-generator

The /llms.txt Generator Actor 🕸️📄 extracts website content to create an llms.txt file for AI apps 🤖✨ like LLM fine-tuning and indexing. Output is available 📥 in the Key-Value Store for easy download and integration into workflows. 🚀

Last synced: 03 Jan 2026

https://github.com/apify/chat-with-a-website

A simple app that lets you chat with a given website.

Last synced: 03 Nov 2025

https://github.com/apify/actor-legacy-phantomjs-crawler

The actor implements the legacy Apify Crawler product. It uses PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of JavaScript code.

apify headless-browsers phantomjs web-crawler web-scraping

Last synced: 19 Feb 2026

https://github.com/apify/workflows

Apify's reusable GitHub workflows

Last synced: 03 Nov 2025

https://github.com/apify/actor-vector-database-integrations

Transfer data from Apify Actors to vector databases (Chroma, Milvus, Pinecone, PostgreSQL (PG-Vector), Qdrant, and Weaviate)

Last synced: 03 Nov 2025

https://github.com/apify/browser-headers-generator

Package generating randomized browser-like headers.

Last synced: 03 Nov 2025

https://github.com/apify/langchain-apify

Apify integration for LangChain 🦜🔗

Last synced: 06 Mar 2026

https://github.com/apify/actor-whitepaper-web

Documentation site for the Actor Programming Model – a fresh take on serverless microapps. Built with Astro.

actor-model apify astro website

Last synced: 03 Nov 2025

https://github.com/apify/input-schema-editor-react

Apify input schema editor written in React.js

Last synced: 03 Nov 2025

https://github.com/apify/crawlee-parallel-scraping-example

An example repository showcasing how you can scrape in parallel using one request queue

Last synced: 03 Nov 2025

https://github.com/apify/actor-example-proxy-intercept-request

Example: Intercept requests from an HTTPS connection using a "man-in-the-middle" proxy solution.

Last synced: 10 Feb 2026

https://github.com/apify/actor-example-secret-input

Example actor showcasing the secret input fields

Last synced: 26 Jan 2026

https://github.com/apify/apify-web-covid-19

A list of public COVID-19 APIs to be rendered on https://apify.com/covid-19

Last synced: 03 Nov 2025

https://github.com/apify/apify-storage-local-js

Local emulation of the apify-client NPM package, which enables local use of the Apify SDK.

Last synced: 03 Nov 2025

https://github.com/apify/actor-imagediff

Returns an image containing the difference between two given images.

Last synced: 03 Nov 2025

https://github.com/apify/aidevworld2023

Slides and supporting materials for "How to get clean web data for chatbots and LLMs".

Last synced: 03 Nov 2025

https://github.com/apify/llama-hub

A library of data loaders for LLMs made by the community -- to be used with GPT Index and/or LangChain

Last synced: 11 Apr 2025

https://github.com/apify/apify-eslint-config

Apify ESLint preset to be shared between projects

Last synced: 03 Nov 2025

https://github.com/apify/waw-file-specification

Contains the specification of the Web Automation Workflow (WAW) file.

Last synced: 23 Feb 2026

https://github.com/apify/scraping-tools-js

A library of utility functions that make scraping, data extraction and usage of headless browsers easier and faster.

Last synced: 03 Nov 2025

https://github.com/apify/actor-example-php

Example of an Apify actor using PHP

Last synced: 03 Nov 2025

https://github.com/apify/apify-shared-python

Constants and utilities shared across Apify's Python libraries and projects.

Last synced: 16 Jan 2026

https://github.com/apify/rag-web-browser

RAG Web Browser is a tool to provide your RAG pipelines with up-to-date information from the web.

crawling llm scraper serp

Last synced: 03 Nov 2025

https://github.com/apify/openapi

An OpenAPI specification for the Apify API.

Last synced: 12 Apr 2025

https://github.com/apify/slack-messages-action

Wraps the sending of messages from Apify GitHub workflows to Slack.

Last synced: 03 Nov 2025

https://github.com/apify/playwright-test-actor

Source code for the Playwright Test public actor.

Last synced: 03 Nov 2025

https://github.com/apify/actor-selenium-mocha-runner

Actor that runs Selenium based Mocha tests.

Last synced: 01 Mar 2026

https://github.com/apify/scrapy-migrator

A standalone POC script for wrapping Scrapy projects with Apify middleware.

Last synced: 09 Nov 2025

https://github.com/apify/actor-crawler-puppeteer

DEPRECATED: An Apify actor that enables crawling of websites using headless Chrome and Puppeteer. The actor is highly customizable and supports recursive crawling of websites as well as lists of URLs.

Last synced: 03 Nov 2025

https://github.com/apify/actor-crawler-cheerio

DEPRECATED: An actor that crawls websites and parses HTML pages using the Cheerio library. Supports recursive crawling as well as URL lists.

Last synced: 03 Nov 2025

https://github.com/apify/actor-email-signature-generator

Apify Email Signature Generator

Last synced: 20 Jan 2026

https://github.com/apify/actor-algolia-website-indexer

Apify actor that crawls a website and indexes selected web pages into an Algolia index. It's used to power the search on https://help.apify.com

Last synced: 03 Nov 2025

https://github.com/apify/apify-eslint-config-ts

TypeScript ESLint configuration shared across projects in Apify.

Last synced: 09 Nov 2025

https://github.com/apify/pull-request-toolkit-action

The GitHub action that makes sure that each PR is correctly set up and has a milestone set.

Last synced: 03 Nov 2025

https://github.com/apify/apify-tsconfig

TypeScript configuration shared across projects in Apify.

Last synced: 03 Nov 2025

https://github.com/apify/actor-monorepo-example

An example repository with multiple Apify Actors sharing code between each other.

Last synced: 03 Nov 2025

https://github.com/apify/actor-scrapy-books-example

Example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.

apify scrapy

Last synced: 03 Nov 2025

https://github.com/apify/komparz

Special, yet insignificant actors

Last synced: 03 Nov 2025

https://github.com/apify/apify-sdk-v2

Snapshot of Apify SDK v2 + sdk.apify.com website. This project is no longer maintained. See the https://github.com/apify/apify-sdk-js repo instead!

Last synced: 03 Nov 2025

https://github.com/apify/apify-haystack

The official integration for Apify and Haystack 2.0

apify haystack-ai rag

Last synced: 03 Nov 2025

https://github.com/apify/keboola-ex-apify

Apify extractor for Keboola Connection

Last synced: 03 Nov 2025

https://github.com/apify/actor-integration-tests

This Apify actor is used for integration tests.

Last synced: 03 Nov 2025

https://github.com/apify/apify-docs-preset

Common preset for the v2 documentation Docusaurus instances.

Last synced: 03 Nov 2025

https://github.com/apify/ow-cjs

Last synced: 27 Jan 2026

https://github.com/apify/page-analyzer-ui

Interface for act-page-analyzer

Last synced: 26 Jan 2026

https://github.com/apify/release-pr-action

This action simplifies creating a release PR.

Last synced: 01 Feb 2026

https://github.com/apify/actor-aws-costs-to-slack

This tool integrates with AWS to monitor service usage costs and posts a summary of these costs to a Slack channel. The summary includes costs for various AWS services along with a chart that provides a visual breakdown of the costs over time.

Last synced: 03 Nov 2025

https://github.com/apify/apify.github.io

The top-level organization GitHub Page.

Last synced: 25 Jan 2026

https://github.com/apify/docs-search-modal

Custom Algolia search modal for Apify Documentation.

Last synced: 03 Nov 2025

https://github.com/apify/echo-standby-actor

An example Actor using Standby mode

Last synced: 02 Feb 2026