Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Crawler

A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).
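As a rough illustration of "systematically browses", the minimal sketch below walks a set of pages breadth-first: it fetches a page, extracts its links, and queues any URL it has not seen yet. It is not taken from any project listed here. The names `LinkExtractor`, `crawl`, `fetch`, `max_pages`, and the toy `PAGES` dictionary are assumptions made for the example, and the fetch step is injected so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl starting at `seed`.

    `fetch` is a hypothetical callable mapping a URL to an HTML string,
    or to None when the page cannot be retrieved.
    Returns the URLs in the order they were successfully visited.
    """
    queue = deque([seed])
    seen = {seed}          # URLs already queued, so none is fetched twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Toy in-memory "web" (assumed data) standing in for real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/#top">home</a>',
    "http://example.test/b": '<a href="/missing">gone</a>',
}

order = crawl("http://example.test/", PAGES.get)
print(order)
```

A crawler run against real sites would typically also honour robots.txt, limit its request rate, and restrict itself to allowed hosts; those concerns are left out to keep the sketch short.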

https://github.com/sxoxgxi/webcrawler

A multi threaded web crawler

crawler python webcrawling

Last synced: 25 Jan 2025

https://github.com/zfael/scrape-it-all

Modular web scraper for Node.js

crawler scraper scraping scraping-websites web-scraping

Last synced: 23 Dec 2024

https://github.com/leegeunhyeok/python-gongucrawler

Python 3 crawler for images and their detailed information from Gongu Madang (공유마당)

crawler python

Last synced: 22 Dec 2024

https://github.com/filipsedivy/tachometer-check

🚘 MDČR - odometer (tachometer) check

crawler czech-republic mdcr

Last synced: 23 Dec 2024

https://github.com/alancesar/crawler

HTML crawler

crawler docker spider

Last synced: 30 Jan 2025

https://github.com/s3rgeym/wscrap

Command line web scraping tool.

crawler scraping

Last synced: 23 Dec 2024

https://github.com/mahdijamebozorg/cryptonewscrawler

An end-to-end AI pipeline that performs technical and fundamental analysis of different cryptocurrencies.

crawler crypto cryptocurrency data-mining datamining information-retrieval llm python

Last synced: 16 Jan 2025

https://github.com/phanletrunghieu/webcrawler

A web crawler with Spring MVC

crawler java servlet spring-mvc springframework

Last synced: 28 Jan 2025

https://github.com/jyasskin/pbot-crawler

Crawler for PBOT's website to show what has changed.

crawler

Last synced: 28 Jan 2025

https://github.com/allotmentandy/socialmedialinkextractor

PHP Laravel package that extracts social media links from an array of links; used as part of a spider that checks londinium.com website links

crawler extractor facebook laravel linked-list php social social-network spider twitter url youtube

Last synced: 23 Dec 2024

https://github.com/sssshefer/web-crawler-http

Basic web crawler that maps the link structure of a website

crawler jest jest-tests js

Last synced: 12 Jan 2025

https://github.com/serge45/pytwgasprices

APIs to fetch the latest Taiwan gas prices

crawler gas price python taiwan

Last synced: 14 Jan 2025

https://github.com/nagilum/focus

Simple CLI tool, written in C#, to crawl a site and log the responses.

cli crawl crawler csharp playwright

Last synced: 16 Jan 2025

https://github.com/ilovebacteria/digikala-api

This Python package sends requests to the Digikala API and retrieves product details.

crawler digikala pypi

Last synced: 14 Nov 2024

https://github.com/bingxyz/btcethcrawler

Telegram broadcast channel for Bitcoin and Ethereum

bash bash-script crawler telegram-bot

Last synced: 22 Jan 2025

https://github.com/kimi0230/pstocks

Python stock market crawler

crawler numpy pandas python python3 stocks

Last synced: 16 Jan 2025

https://github.com/vaibhavyadav-dev/codeforces-problemset-scrapper

Web scraper that scrapes the entire Codeforces problemset into a CSV or JSON file.

codeforces competative competative-programming crawler problemset programming python scrapy-crawler scrapy-spider

Last synced: 16 Jan 2025

https://github.com/iamtonmoy0/sitemap-crawler

Sitemap crawler written in Go with goquery

crawler

Last synced: 05 Jan 2025

https://github.com/brnrajoriya/india-s-states-and-cities-crawler

Crawler that collects all of India's states and cities

cities crawler india php script states

Last synced: 16 Jan 2025

https://github.com/erickj3/strike-api

A web scraping API built with NestJS

api crawler flow nestjs scraping typescript

Last synced: 24 Jan 2025

https://github.com/huakunshen/cron-crawler-template

Web Crawler Cron Job Template running with GitHub Action. Capable of sending email notifications.

crawler github-actions python

Last synced: 17 Jan 2025

https://github.com/aristotelesbr/api_quotes

Test project for a job application.

crawler mongodb rails5

Last synced: 17 Jan 2025

https://github.com/kahsolt/qzone_mood_dumper

Dump your Qzone mood (说说) history to a local SQL database

crawler dumper qzone-mood

Last synced: 03 Jan 2025

https://github.com/thiiagoms/car-stealth

REST API for all cars that were stolen

api cars crawler student

Last synced: 16 Jan 2025

https://github.com/bruce-lee-ly/crawler

Several fun crawler cases implemented in Python.

crawler python

Last synced: 16 Jan 2025

https://github.com/mstephen19/apify-click-events

Like TypeScript, but for clicking ;) Manage automated clicks, and ensure your Apify web-crawler is only clicking exactly what you allow it to

apify apify-sdk crawler scraper web-automation

Last synced: 10 Dec 2024

https://github.com/luanpotter/series-api

A simple IMDB crawler feeding a Series API

api crawler imdb json rest series

Last synced: 16 Dec 2024

https://github.com/brianbruggeman/vax

A vaccination signup tool

covid-19 crawler signup vaccination

Last synced: 16 Jan 2025

https://github.com/notreeceharris/webstalker

🕸 A Powerful Relational Web Crawler

crawler datamining webcrawler

Last synced: 14 Jan 2025

https://github.com/frobware/grawler

Web Crawler

crawler go

Last synced: 16 Jan 2025

https://github.com/mustafadalga/website-crawler

A web crawler script that lists links by scanning the target website.

crawl crawler crawling-sites hacking hacking-tool web-crawler web-crawler-python web-crawling

Last synced: 18 Jan 2025

https://github.com/gxjansen/website-to-pdf

Creates a PDF based on the content of a website/subdomain

claude-3-sonnet crawler python3

Last synced: 10 Dec 2024

https://github.com/athulmurali/flickr-api-docs-crawler

A Python-based crawler that extracts API documentation and writes it to a file as JSON. A beautiful documentation page can be built from the JSON file using Docusaurus.

api beautifulsoup4 crawler documentation python3

Last synced: 09 Jan 2025

https://github.com/jannchie/go-probe

HTML and JSON data crawler based on Golang. Simple and fast, very easy to use.

collector crawler fetcher golang spider

Last synced: 23 Dec 2024

https://github.com/sc0vu/gocrawl

Simple crawl for golang

crawler golang

Last synced: 30 Jan 2025

https://github.com/tssujt/async-crawler-sample

A simple crawler sample based on asyncio~

aiohttp asyncio crawler

Last synced: 22 Jan 2025

https://github.com/freakwill/mycrawlers

🕷 My Crawlers for Movies, Information, Encyclopedia...

baidu crawler douban movie quotes taobao

Last synced: 26 Jan 2025

https://github.com/guilhem/cachanais

Populate cache by crawling pages

cache crawler hacktoberfest

Last synced: 22 Dec 2024

https://github.com/ymdarake/otenki-crawler

Yet another weather data scraper.

crawler weather weather-data

Last synced: 16 Jan 2025

https://github.com/shentengtu/cht-yp-crawler

Simple Crawler of www.iyp.com.tw.

crawler node-js nodejs yellow-pages yellowpages

Last synced: 11 Jan 2025

https://github.com/kartikmehta8/pycrawler

PyCrawler is a web scraper that takes a link as input and returns all the links connected to the page(s). Goes beyond recursion. Threaded.

crawler cybersecurity python

Last synced: 16 Jan 2025

https://github.com/frostming/daily-wallpaper

A small crawler to get wallpapers from Unsplash

crawler python requests unsplash wallpaper

Last synced: 25 Jan 2025

https://github.com/pinpox/go-random-downloader

Download HTML using "Random Page"

crawler golang

Last synced: 28 Jan 2025

https://github.com/truongdd03/searchengine

A search engine written in C++.

cpp crawler search search-engine

Last synced: 20 Dec 2024

https://github.com/jeanluc162/prnt-sc-crawler

Crawler for the Website prnt.sc

crawler net5 net50 prntsc screenshots

Last synced: 16 Jan 2025

https://github.com/kodemartin/webcrawler

A simple webcrawler

crawler rust

Last synced: 25 Jan 2025

https://github.com/kenanbek/tutorial-python-crawler

Crawling website data using Python with requests and Beautiful Soup libraries

beautifulsoup crawler crawling miner parser python python-requests requests

Last synced: 11 Dec 2024

https://github.com/zahraarshia/cti_crawl

This cyber threat intelligence crawler can be used to gather information from various sources, including open-source and commercial feeds.

crawler cti cyber-news-bot cyber-threat-intelligence mongodb python scrapy sqlite3 web-scraper

Last synced: 09 Jan 2025

https://github.com/yaoshanliang/linkedinspider

Crawl job information from LinkedIn for data analysis

big-data crawler python social-network-analysis

Last synced: 11 Dec 2024

https://github.com/landrisek/contentbot

Creates simple content (discussion posts and product descriptions) from previously used data, or crawls it from public data.

content crawler golang php php72

Last synced: 12 Jan 2025

https://github.com/claudio-code/nap-web-crawler

A crawler that finds broken links in the documentation of frameworks and languages

crawler

Last synced: 11 Dec 2024

https://github.com/r3c0ger/douban-movie-top250-crawler

Crawl the movie information of Douban Movie Top-250, including movie name, movie link, director, starring, release time, production country/region, type, rating, number of reviews and introduction.

beautifulsoup4 crawler lxml python3 spider

Last synced: 09 Jan 2025

https://github.com/tetreum/puppeteer-for-crawling

Daily use crawling methods for puppeteer

crawler crawling puppeteer

Last synced: 04 Feb 2025

https://github.com/leonardopinho/instagramfeed

Image list based on a tag for the Instagram feed.

crawler instagram python

Last synced: 02 Feb 2025

https://github.com/yanglr/csharp_spider

Crawler in C#

crawler csharp spider

Last synced: 22 Jan 2025

https://github.com/zhanziyuan/webdownloader

Download elements from the specified website.

crawler downloader image image-downloader python python-crawler web

Last synced: 08 Jan 2025

https://github.com/raspi/scrapy-kuntavaalit2021-almamedia

Fetch Almamedia data for the 2021 Finnish municipal elections (kuntavaalit)

crawler mirror python scrapy spider webcrawler

Last synced: 10 Nov 2024

https://github.com/kianoushamirpour/crawl_google_scholar_with_selenium_fastapi_mongodb

Crawl Google Scholar profiles with Selenium, store the extracted data in MongoDB, and serve queries with FastAPI.

crawler fastapi google-scholar mongodb python selenium

Last synced: 25 Dec 2024

https://github.com/sandrewtx08/gearbest_scraper

Finds catalog ads on the Gearbest website, scrapes the catalog information, and stores it in a relational database through a sequence of SQL commands.

crawler gearbest lxml python scraper scraping sqlite3

Last synced: 10 Jan 2025

https://github.com/vhdm/twitter-hashtag-crawler

Twitter hashtag crawler built with Selenium, without using the Twitter API ;)

crawler python tor twitter

Last synced: 05 Jan 2025

https://github.com/raspi/scrapy-kuntavaalit2021-sanoma

Fetch Sanoma data for the 2021 Finnish municipal elections (kuntavaalit)

crawler mirror python scrapy spider webcrawler

Last synced: 10 Nov 2024

https://github.com/n3d1117/sisop17

Exercise for the Operating Systems exam - 2017

crawler html java parser semaphores synchronization thread-safety threading

Last synced: 19 Dec 2024

https://github.com/viktorholk/ranged

A Rust-based web crawler and pattern matcher

crawler regex rust scraper web

Last synced: 12 Dec 2024

https://github.com/docongminh/vinbdi-crawler

crawl data using scrapy + bs4

bs4-requests crawler scrapy splash

Last synced: 28 Dec 2024

https://github.com/raspi/scrapy-transcend

Crawler for transcend (us.transcend-info.com)

crawler hardware memory scrapy spider

Last synced: 08 Jan 2025

https://github.com/raspi/scrapy-kuntavaalit2021-keskisuomalainen

Fetch Keskisuomalainen data for the 2021 Finnish municipal elections (kuntavaalit)

crawler mirror python scrapy spider webcrawler

Last synced: 10 Nov 2024

https://github.com/antoniowd/crawly

A web crawler that explores the web in search of specific information (emails, phone numbers, etc.)

crawler got jsdom nodejs webcrawler webscraping

Last synced: 12 Dec 2024

https://github.com/zaneh/ocw-crawler

Crawl MIT OpenCourseWare courses with Kimurai. Not affiliated.

crawler kimurai mit ocw opencourseware spider

Last synced: 15 Jan 2025

https://github.com/fredcodee/pexel.com-image-scrapper

download images from pexel.com

crawler image python selenium

Last synced: 08 Jan 2025

https://github.com/madret/selenium_crawler

Selenium web crawler based on ChromeDriver.

chromedriver crawler human-like selenium selenium-webdriver webcrawler

Last synced: 15 Jan 2025

https://github.com/rutopio/crawler-cpbl-player-data

Crawls and organizes player data from the official CPBL (Chinese Professional Baseball League) website.

cpbl crawler crawling python

Last synced: 31 Jan 2025

https://github.com/webdevcave/directory-crawler-php

Directory Crawler PHP is a simple PHP library for recursively crawling through directories and listing files and directories.

crawler crawling directory path php php-library

Last synced: 09 Nov 2024

https://github.com/engageintellect/scrapers

A repository of web scrapers using Python & Scrapy

crawler python scrapy spider

Last synced: 13 Dec 2024

https://github.com/zenixls2/2chpreprocess

Dump messages from 2ch with some preprocessing for ML analysis

2ch crawler python

Last synced: 31 Jan 2025

https://github.com/daviddavo/blogspot-crawler

Crawler for blogspot and blogger with beautifulsoup

crawler hacktoberfest python

Last synced: 23 Jan 2025

https://github.com/moe131/webcrawler

Python web crawler designed to scrape websites

crawler crawling-python python python-crawler scraping simhash web-crawler

Last synced: 23 Dec 2024

https://github.com/luickk/vulnerability-crawler

Small Python program that analyzes random sites found on Google for vulnerabilities.

crawler xss

Last synced: 28 Dec 2024

https://github.com/aweirddev/air-web

A lightweight package for crawling the web with minimal code.

crawl crawler markdown scrape scraper web

Last synced: 09 Nov 2024

https://github.com/copha-project/copha

Open-Source Software For Managing Tasks

crawler framework nodejs puppeteer selenium

Last synced: 15 Jan 2025

https://github.com/lesterrry/mutt

More Usable Time Tracker

crawler ios-calendar parser

Last synced: 07 Jan 2025

https://github.com/tetreum/xupopter_chrome_extension

Extension to easily create crawling recipes

crawler scrapper scrapping webscraper

Last synced: 17 Dec 2024

https://github.com/devindon/movie-crawler

Movie crawler for douban.com, pianku.tv, etc.

crawler nodejs typescript

Last synced: 02 Feb 2025

https://github.com/izh318/genie-music-artist-album-crawler

A Python script that crawls, in one pass, the album information of a specific artist registered on Genie Music.

crawler genie genie-music gui

Last synced: 28 Dec 2024