Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eea/eea-crawler
EEA Crawler contains the tasks (DAGs) used by Apache Airflow to index content from various EEA-Eionet websites into a central Elasticsearch instance (also known as the content hub).
airflow-dags crawler elasticsearch etl-pipeline indexing
- Host: GitHub
- URL: https://github.com/eea/eea-crawler
- Owner: eea
- Created: 2021-05-31T11:41:24.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2024-11-22T08:12:51.000Z (2 months ago)
- Last Synced: 2024-11-22T09:19:54.841Z (2 months ago)
- Topics: airflow-dags, crawler, elasticsearch, etl-pipeline, indexing
- Language: Python
- Homepage:
- Size: 477 KB
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Airflow and Logstash configurations for EEA-Crawler
See https://github.com/eea/eea.docker.airflow for integration and deployment.
Building this Docker image creates a volume that contains the Python DAG code
required for the crawler's operation.
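
As a rough illustration of the kind of indexing step such DAGs might perform, the sketch below builds a request body in the NDJSON format accepted by Elasticsearch's `_bulk` API, using only the Python standard library. It is not taken from this repository: the function name `build_bulk_body`, the index name `content_hub`, the document fields, and the choice of the page URL as document ID are all assumptions made for the example.

```python
import json


def build_bulk_body(docs, index_name):
    """Serialize documents into an NDJSON body for Elasticsearch's _bulk API.

    Each document produces two lines: an action line, then the document source.
    """
    lines = []
    for doc in docs:
        # Assumption for this sketch: the page URL doubles as the document ID.
        action = {"index": {"_index": index_name, "_id": doc["url"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical crawled pages; the field names are illustrative only.
pages = [
    {"url": "https://example.org/a", "title": "Page A"},
    {"url": "https://example.org/b", "title": "Page B"},
]
body = build_bulk_body(pages, "content_hub")
```

A body like this would typically be sent to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header.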