Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists tagged with apify

A curated list of projects in awesome lists tagged with apify.

https://github.com/apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 29 Sep 2024

https://github.com/apifytech/apify-js

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 05 Aug 2024

https://github.com/apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation beautifulsoup crawler crawling headless headless-chrome pip playwright python scraper scraping web-crawler web-crawling web-scraping

Last synced: 30 Sep 2024

https://github.com/apify/apify-cli

Apify command-line interface helps you create, develop, build and run Apify actors, and manage the Apify cloud platform.

apify command-line headless-chrome puppeteer serveless

Last synced: 29 Sep 2024

https://github.com/apify/actor-scraper

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.

apify web-scraping

Last synced: 01 Aug 2024

https://github.com/VaclavRut/actor-amazon-crawler

Amazon crawler - this configuration extracts items for the keywords you specify in the input and automatically extracts all pages for each keyword. You can specify multiple keywords in the input for a single run.

amazon-com amazon-crawler amazon-de amazon-extractor apify apify-cli apify-proxy apify-sdk extract-items

Last synced: 01 Aug 2024

https://github.com/apify/actor-content-checker

You can use this actor to monitor any page's content and get a notification when the content changes.

apify content-selector web-scraping

Last synced: 01 Aug 2024

https://github.com/Nikolay-Lysenko/servifier

An easy-to-use tool for making a web service with an API from your own Python functions.

api-maker apify ml-engineering model-to-production web-service

Last synced: 02 Aug 2024

https://github.com/shruagarwal/activeloop-langchain-coursebot

Ask questions related to the LangChain course offered by Activeloop.

apify apify-api cohere langchain llm-chatbot openai-api streamlit

Last synced: 01 Oct 2024

https://github.com/umran/apify

A tool to bootstrap a headless content management and delivery system using GraphQL, MongoDB, Redis, and Elasticsearch.

apify backend bootstrap-modern-backends cms cms-backend elasticsearch graphql mongodb mongoosejs redis

Last synced: 27 Sep 2024