https://github.com/apache/incubator-tez
Mirror of Apache Tez (Incubating)
- Host: GitHub
- URL: https://github.com/apache/incubator-tez
- Owner: apache
- License: apache-2.0
- Created: 2013-06-12T07:00:18.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2023-06-15T16:13:52.000Z (over 2 years ago)
- Last Synced: 2025-06-16T00:57:19.176Z (8 months ago)
- Topics: big-data, java, tez
- Language: Java
- Size: 8.59 MB
- Stars: 60
- Watchers: 16
- Forks: 41
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.txt
- License: LICENSE.txt
README
Apache Tez
==========
Apache Tez is a generic data-processing pipeline engine envisioned as a low-level engine for higher abstractions
such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.
At its heart, Tez is very simple and has just two components:
* The data-processing pipeline engine, into which one can plug input, processing, and output implementations to
  perform arbitrary data processing. Every 'task' in Tez has the following:
  - Input to consume key/value pairs from.
  - Processor to process them.
  - Output to collect the processed key/value pairs.
* A master for the data-processing application, with which one can assemble the arbitrary data-processing 'tasks'
  described above into a task DAG to process data as desired.
The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
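
The two components described above can be illustrated with a small, self-contained sketch. It is not the Tez API: every name in it (`TezModelSketch`, `KV`, `Input`, `Processor`, `Output`, `Buffer`, `Dag`, `wordCount`) is hypothetical and chosen only for illustration, and everything runs in a single JVM with no YARN involved. It shows a task as an input, a processor, and an output, and a toy "master" that wires tasks into a DAG and runs them in topological order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; these types are hypothetical, not the real Tez classes. */
public class TezModelSketch {

    /** One key/value pair flowing between tasks. */
    static final class KV {
        final String key;
        final int value;
        KV(String key, int value) { this.key = key; this.value = value; }
    }

    interface Input { List<KV> read(); }                          // where a task consumes pairs from
    interface Output { void write(KV kv); }                       // where a task collects results
    interface Processor { void process(Input in, Output out); }  // the task's logic

    /** In-memory buffer: the Output of one task and the Input of the next. */
    static final class Buffer implements Input, Output {
        private final List<KV> pairs = new ArrayList<>();
        public List<KV> read() { return pairs; }
        public void write(KV kv) { pairs.add(kv); }
    }

    /** Toy master: vertices are processors, edges carry key/value pairs. */
    static final class Dag {
        private final Map<String, Processor> processors = new LinkedHashMap<>();
        private final Map<String, Buffer> inputs = new HashMap<>();
        private final Map<String, List<String>> successors = new HashMap<>();
        private final Map<String, Integer> inDegree = new HashMap<>();

        void addVertex(String name, Processor p) {
            processors.put(name, p);
            inputs.put(name, new Buffer());
            successors.put(name, new ArrayList<>());
            inDegree.put(name, 0);
        }

        void addEdge(String from, String to) {
            successors.get(from).add(to);
            inDegree.merge(to, 1, Integer::sum);
        }

        /** Runs each vertex once its predecessors are done; returns what sink vertices emit. */
        List<KV> run(List<KV> source) {
            Buffer sink = new Buffer();
            Map<String, Integer> remaining = new HashMap<>(inDegree);
            Deque<String> ready = new ArrayDeque<>();
            for (String v : processors.keySet()) {
                if (remaining.get(v) == 0) {
                    source.forEach(inputs.get(v)::write); // root vertices read the source data
                    ready.add(v);
                }
            }
            while (!ready.isEmpty()) {
                String v = ready.poll();
                List<String> next = successors.get(v);
                Output out;
                if (next.isEmpty()) {
                    out = sink;                           // no successors: collect final output
                } else {
                    out = kv -> next.forEach(succ -> inputs.get(succ).write(kv));
                }
                processors.get(v).process(inputs.get(v), out);
                for (String n : next) {
                    if (remaining.merge(n, -1, Integer::sum) == 0) ready.add(n);
                }
            }
            return sink.read();
        }
    }

    /** Two-vertex DAG: "tokenize" emits (word, 1), "sum" adds the counts per word. */
    static List<KV> wordCount(List<String> lines) {
        Dag dag = new Dag();
        dag.addVertex("tokenize", (in, out) -> {
            for (KV kv : in.read())
                for (String w : kv.key.split(" ")) out.write(new KV(w, 1));
        });
        dag.addVertex("sum", (in, out) -> {
            Map<String, Integer> totals = new LinkedHashMap<>();
            for (KV kv : in.read()) totals.merge(kv.key, kv.value, Integer::sum);
            totals.forEach((k, v) -> out.write(new KV(k, v)));
        });
        dag.addEdge("tokenize", "sum");
        List<KV> source = new ArrayList<>();
        for (String line : lines) source.add(new KV(line, 0));
        return dag.run(source);
    }

    public static void main(String[] args) {
        for (KV kv : wordCount(List.of("a b a", "b c")))
            System.out.println(kv.key + "=" + kv.value);
    }
}
```

Running `main` prints `a=2`, `b=2`, and `c=1`, one per line. The sketch executes vertices sequentially for simplicity; a real engine would schedule tasks across a cluster, which this example does not attempt to model.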