Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/apache/tez
Apache Tez
- Host: GitHub
- URL: https://github.com/apache/tez
- Owner: apache
- License: apache-2.0
- Created: 2013-04-08T07:20:23.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2025-01-08T07:12:00.000Z (11 days ago)
- Last Synced: 2025-01-10T02:04:18.090Z (9 days ago)
- Topics: apache, big-data, hadoop, java, tez
- Language: Java
- Homepage: https://tez.apache.org/
- Size: 29.1 MB
- Stars: 484
- Watchers: 34
- Forks: 425
- Open Issues: 59
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-dataops - Apache Tez - A generic data-processing pipeline engine envisioned as a low-level engine. (Data Processing)
README
Apache Tez
==========

Apache Tez is a generic data-processing pipeline engine envisioned as a low-level engine for higher abstractions
such as Apache Hadoop Map-Reduce, Apache Pig, Apache Hive etc.

At its heart, tez is very simple and has just two components:

* The data-processing pipeline engine, into which one can plug input, processing and output implementations to
  perform arbitrary data-processing. Every 'task' in tez has the following:
   - Input to consume key/value pairs from.
   - Processor to process them.
   - Output to collect the processed key/value pairs.

* A master for the data-processing application, whereby one can put together the arbitrary data-processing 'tasks'
  described above into a task-DAG to process data as desired.
  The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
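
The two components above can be illustrated with a small, self-contained sketch. It is a toy model in plain Java and deliberately does not use the Tez API: the names `Input`, `Output`, `Processor`, `Buffer`, `Task` and `wordCount` are all hypothetical, and the "master" here is simply a loop that runs tasks in dependency order inside one JVM, whereas the real master is a YARN ApplicationMaster. The sketch wires two tasks into the simplest possible DAG (a two-vertex chain) to count words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the task/DAG shape described above. NOT the Apache Tez API. */
final class TaskDagSketch {

    /** Hypothetical: a source of key/value pairs. */
    interface Input<K, V> {
        Iterable<Map.Entry<K, V>> read();
    }

    /** Hypothetical: a collector for processed key/value pairs. */
    interface Output<K, V> {
        void write(K key, V value);
    }

    /** Hypothetical: consumes pairs from an Input and emits pairs to an Output. */
    interface Processor<K1, V1, K2, V2> {
        void run(Input<K1, V1> in, Output<K2, V2> out);
    }

    /** In-memory buffer that is the Output of one task and the Input of the next (an "edge"). */
    static final class Buffer<K, V> implements Input<K, V>, Output<K, V> {
        private final List<Map.Entry<K, V>> pairs = new ArrayList<>();

        @Override
        public void write(K key, V value) {
            pairs.add(Map.entry(key, value));
        }

        @Override
        public Iterable<Map.Entry<K, V>> read() {
            return pairs;
        }

        List<Map.Entry<K, V>> pairs() {
            return pairs;
        }
    }

    /** One task = Input + Processor + Output. */
    static final class Task<K1, V1, K2, V2> implements Runnable {
        private final Input<K1, V1> input;
        private final Processor<K1, V1, K2, V2> processor;
        private final Output<K2, V2> output;

        Task(Input<K1, V1> input, Processor<K1, V1, K2, V2> processor, Output<K2, V2> output) {
            this.input = input;
            this.processor = processor;
            this.output = output;
        }

        @Override
        public void run() {
            processor.run(input, output);
        }
    }

    /** Builds and runs a two-task chain: tokenize lines, then sum the counts per word. */
    static List<Map.Entry<String, Integer>> wordCount(List<String> lines) {
        Buffer<Integer, String> source = new Buffer<>();
        for (int i = 0; i < lines.size(); i++) {
            source.write(i, lines.get(i)); // key = line index, value = line text
        }
        Buffer<String, Integer> edge = new Buffer<>(); // connects the two tasks
        Buffer<String, Integer> sink = new Buffer<>();

        // Emits (word, 1) for every whitespace-separated word.
        Processor<Integer, String, String, Integer> tokenizer = (in, out) -> {
            for (Map.Entry<Integer, String> e : in.read()) {
                for (String word : e.getValue().trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        out.write(word, 1);
                    }
                }
            }
        };

        // Sums the counts per word; TreeMap keeps the output sorted by word.
        Processor<String, Integer, String, Integer> summer = (in, out) -> {
            Map<String, Integer> totals = new TreeMap<>();
            for (Map.Entry<String, Integer> e : in.read()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
            totals.forEach(out::write);
        };

        Task<Integer, String, String, Integer> tokenize = new Task<>(source, tokenizer, edge);
        Task<String, Integer, String, Integer> sum = new Task<>(edge, summer, sink);

        // Stand-in for the master: run the tasks in dependency order.
        List<Runnable> dag = List.of(tokenize, sum);
        dag.forEach(Runnable::run);
        return sink.pairs();
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("a b a", "b a")));
    }
}
```

For the input lines `"a b a"` and `"b a"`, `wordCount` returns two pairs: `a` with count 3 and `b` with count 2. Sharing one `Buffer` between the two tasks is only a convenient stand-in for an edge in the task-DAG; it says nothing about how tez actually moves data between tasks.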