https://github.com/chucheng92/hadoopdedup
:watermelon: Large-scale deduplication of massive datasets based on Hadoop and HBase
big-data cdc dedup fsp mapreduce
- Host: GitHub
- URL: https://github.com/chucheng92/hadoopdedup
- Owner: chucheng92
- Created: 2016-09-07T12:27:39.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2018-04-08T10:15:11.000Z (about 8 years ago)
- Last Synced: 2025-03-28T15:11:35.424Z (about 1 year ago)
- Topics: big-data, cdc, dedup, fsp, mapreduce
- Language: Java
- Homepage: http://rann.cc
- Size: 12 MB
- Stars: 29
- Watchers: 9
- Forks: 16
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Large-Scale Deduplication of Massive Datasets Based on Hadoop and HBase
## Directory Structure
data - datasets
docs - documentation
src - MapReduce
## Environment
Hadoop 1.1.2
HBase 0.94.8