Projects in Awesome Lists tagged with nutch
A curated list of projects in awesome lists tagged with nutch.
https://github.com/apache/nutch
Apache Nutch is an extensible and scalable web crawler
apache crawling hadoop java nutch web-crawler
Last synced: 23 Apr 2025
https://github.com/USCDataScience/sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
big-data distributed-systems information-retrieval nutch search search-engine solr spark tika web-crawler
Last synced: 25 Mar 2025
https://github.com/nasa-jpl-memex/memex-explorer
Viewers for statistics and dashboarding of Domain Search Engine data
ache anaconda apache crawler dashboard domain-discovery memex-explorer miniconda nutch tika
Last synced: 25 Nov 2024
https://github.com/daijiale/ocr_fontssearchengine
An OCR Search Engine With Tesseract, Nutch, Solr, and PHP
font mac-tesseract nutch ocr-php ocr-web solr tesseract-ocr
Last synced: 15 Apr 2025
https://github.com/apache/nutch-webapp
Apache Nutch is an extensible and scalable web crawler
apache crawling hadoop java nutch web-crawler
Last synced: 09 Apr 2025
https://github.com/nbro/financialnewssearchengine
A very simple search engine "specialised" in searching financial news.
angularjs hbase nutch search-engine solr spring-boot
Last synced: 11 Apr 2025
https://github.com/jgimeno/solr-nutch-orchestrator
Quickly and easily launch Apache Solr linked with Apache Nutch in separate Docker containers.
Last synced: 04 Mar 2025