https://github.com/eea/eea-crawler

EEA Crawler contains the Apache Airflow DAGs (task workflows) that index content from various EEA-Eionet websites into a central Elasticsearch instance, also known as the content hub.
airflow-dags crawler elasticsearch etl-pipeline indexing

# Airflow and Logstash configurations for EEA-Crawler

See https://github.com/eea/eea.docker.airflow for integration and deployment.

Building this Docker image creates a volume containing the Python DAG code
that the crawler needs to run.
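
To give a rough idea of what the Python code in such a DAG task might do when indexing crawled content, here is a minimal sketch that serializes documents into the newline-delimited JSON body accepted by Elasticsearch's `_bulk` endpoint. The function name `build_bulk_payload`, the index name `content-hub`, and the document fields are hypothetical and chosen for illustration only; they are not taken from this repository.

```python
import json


def build_bulk_payload(docs, index_name):
    """Serialize documents into an NDJSON body for Elasticsearch's _bulk endpoint.

    Hypothetical helper: the name and the document shape are assumptions.
    """
    lines = []
    for doc in docs:
        # Action line: index this document, using its URL as the _id
        # so that re-crawling the same page overwrites the old entry.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["url"]}}))
        # Source line: the document body itself.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Example input: two crawled pages (made-up data).
pages = [
    {"url": "https://example.org/a", "title": "Page A"},
    {"url": "https://example.org/b", "title": "Page B"},
]
payload = build_bulk_payload(pages, "content-hub")
```

In an Airflow deployment, a function like this would typically be called from a task inside a DAG, with the resulting payload sent to the cluster by an HTTP or Elasticsearch client.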