An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with nutch

A curated list of projects in awesome lists tagged with nutch.

https://github.com/apache/nutch

Apache Nutch is an extensible and scalable web crawler

apache crawling hadoop java nutch web-crawler

Last synced: 23 Apr 2025

https://github.com/USCDataScience/sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

big-data distributed-systems information-retrieval nutch search search-engine solr spark tika web-crawler

Last synced: 25 Mar 2025

https://github.com/nasa-jpl-memex/memex-explorer

Viewers for statistics and dashboarding of Domain Search Engine data

ache anaconda apache crawler dashboard domain-discovery memex-explorer miniconda nutch tika

Last synced: 25 Nov 2024

https://github.com/daijiale/ocr_fontssearchengine

An OCR search engine built with Tesseract, Nutch, Solr, and PHP

font mac-tesseract nutch ocr-php ocr-web solr tesseract-ocr

Last synced: 15 Apr 2025

https://github.com/apache/nutch-webapp

Apache Nutch is an extensible and scalable web crawler

apache crawling hadoop java nutch web-crawler

Last synced: 09 Apr 2025

https://github.com/nbro/financialnewssearchengine

A very simple search engine "specialised" in searching financial news.

angularjs hbase nutch search-engine solr spring-boot

Last synced: 11 Apr 2025

https://github.com/jgimeno/solr-nutch-orchestrator

Quickly and easily launch Apache Solr linked with Apache Nutch in separate Docker containers.

nutch orchestration solr

Last synced: 04 Mar 2025

https://github.com/balestrapatrick/applesearch

A Vapor app consisting of a simple search engine, built for my information retrieval course project.

nutch solr swift vapor wikipedia

Last synced: 13 Mar 2025

https://github.com/apache/nutch-site

Apache Nutch Website

apache hugo nutch

Last synced: 25 Mar 2025