https://github.com/aikuyun/bigdata-doc
Big data study notes, learning roadmaps, and a curated collection of technical case studies.
bigdata flink hadoop hdfs hive kafka mapreduce
- Host: GitHub
- URL: https://github.com/aikuyun/bigdata-doc
- Owner: aikuyun
- Created: 2018-10-17T05:46:50.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-01-04T15:27:52.000Z (over 3 years ago)
- Last Synced: 2023-03-05T13:22:33.699Z (over 3 years ago)
- Topics: bigdata, flink, hadoop, hdfs, hive, kafka, mapreduce
- Language: Shell
- Homepage: https://data.cuteximi.com
- Size: 2.38 MB
- Stars: 39
- Watchers: 2
- Forks: 21
- Open Issues: 20
Metadata Files:
- Readme: README.md
README



# Big Data Learning Resources
Notes on big data and machine learning, updated continuously.
# Article Categories
- Big Data Tech Weekly
  - [Big Data Tech Weekly, updated every week](https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzU0OTgxNjMyNA==&action=getalbum&album_id=2052897255342309378&scene=173&from_msgid=2247484324&from_itemidx=1&count=3&nolastread=1#wechat_redirect)
- Machine Learning
  - [Starting with machine learning](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/ml-guid.md)
  - [Machine learning terminology](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/ml-term.md)
  - [Machine learning roadmap](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/study-road.md)
  - [Two recommended websites for gauging where you stand](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/study-website.md)
- Distributed Systems Fundamentals
  - [Distributed systems fundamentals](https://github.com/aikuyun/ziyuan/blob/master/docs/distribute/distribute.md)
- Big Data Ecosystem
  - [HDFS](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan01#hdfs)
  - [MapReduce](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan01#mapreduce)
  - [Hive](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan01/Hive.md)
- Digging into the Internals
  - [Hadoop HA mechanism](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan02#hadoop-ha-%E6%9C%BA%E5%88%B6)
  - [MapReduce principles and execution flow](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/MRyuanli.md)
  - [NameNode internals](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/MRyuanli.md)
  - [Secondary sort](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/secondarySort.md)
  - [Kafka](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/kafka-01.md)
- Solutions
  - [Solutions from major tech companies](https://github.com/aikuyun/ziyuan/blob/master/docs/It-chat/case.md)
  - [How to process trillions of records per day? How iQIYI's real-time computing platform does it](https://mp.weixin.qq.com/s/DKP08aUSNMOySNcs_y6ODA)
  - [How WeChat's "Top Stories" (看一看) generates recommendations for you](https://mp.weixin.qq.com/s/Regv8UUc5PH9HcnUq_zq3A)
- Curated Technical Articles
  - [Curated technical articles](https://github.com/aikuyun/ziyuan/blob/master/docs/artical/artical.md)
- Spark
  - [Spark tuning](https://mp.weixin.qq.com/s/iNovecaYkKrytNgQMvIMZw)
  - [Spark shuffle address lookup flow](https://mp.weixin.qq.com/s/0eQPmVnXCbEr1ziPAW569A)
  - [Spark shuffle tuning](https://mp.weixin.qq.com/s/keJnU0trtTW9W-zBWPKD5A)
  - [Spark data locality levels](https://mp.weixin.qq.com/s/kF4zjiambBohSJG9gZW8_g)
  - [Spark's core RDD, details of stage division, and a summary of run modes](https://mp.weixin.qq.com/s/aPwsPTkFakBwv3MIioaOOg)
- Kafka
  - [Kafka + Spark Streaming](https://mp.weixin.qq.com/s/wKjSalxFdVkRXGPnNVg_2g)
  - [Kafka data loss and duplicate consumption](https://mp.weixin.qq.com/s/ROoVOVgNW8jzdCZeAwLTDQ)
- HBase
  - [HBase architecture](https://mp.weixin.qq.com/s/j2Kbi003Etzw_15KwV0TyQ)
  - [HBase architecture, supplementary notes](https://mp.weixin.qq.com/s/7yRequ0pqGN_00zi704wwA)
- Hadoop
  - [Analysis of how Hadoop HA works](https://mp.weixin.qq.com/s/BmVvoi8k0mU9pmGQCl2Sug)
  - [Hadoop series: 1.0 and 2.0 architecture](https://mp.weixin.qq.com/s/B_wOtK1gSVlmB4cF5hZG2A)
  - [Hadoop series: Hive](https://mp.weixin.qq.com/s/fWKX6NR908fLbVUMFwpj8A)
  - [Hadoop series: MapReduce](https://mp.weixin.qq.com/s/JDDTTy6QfZtwz547M88GMQ)
  - [Hadoop series: HDFS](https://mp.weixin.qq.com/s/Dcsat0-iRB_xYRBoMfhoXg)
- Flink
  - [Flink community e-book](https://mp.weixin.qq.com/s?__biz=MzIwMjA2MTk4Ng==&mid=2247485438&idx=1&sn=2bb7f82402dc4607f94cdb78e48cd48b&chksm=96e52633a192af25a5c6b2371dfed395aa46168639c01bb49dbc36381f2b3dd889bfe9256d6a&xtrack=1&scene=0&subscene=91&sessionid=1555230598&clicktime=1555230760&ascene=7&devicetype=android-27&version=27000334&nettype=cmnet&abtest_cookie=BAABAAoACwASABMABQAjlx4AVpkeAMeZHgDRmR4A3JkeAAAA&lang=zh_CN&pass_ticket=cRjrq%2F8EqXfIhZvDoJO4rqTvtx1hEu4fyHiignznzsezMHPtQ83VFn8G02ozwToC&wx_header=1)
  - [A milestone Flink release is coming soon; get started now](https://mp.weixin.qq.com/s/OmTmPHaP0vSPT128eAf2Ig)
  - ["Apache Flink: Ten Technical Challenges in Practice" released, to help you handle technical problems in production](https://mp.weixin.qq.com/s/U3c4oXFLPuc4XiUNUY55gg)
  - [Flink learning resources for 2020, worth bookmarking](https://mp.weixin.qq.com/s/wuKBvNbkO-pTWZEMSvGLNg)
- Data Warehouse
  - [Offline vs. real-time data warehouses (Part 1)](https://mp.weixin.qq.com/s/dpwQ4sx-IWL66m03lPa6rg)
  - [Building a site-wide user-behavior data warehouse at 58: construction and practice](https://mp.weixin.qq.com/s/MnfdsLHGjK9okv020cS_Kg)
  - [In depth | Building the Ctrip flight-ticket data warehouse](https://mp.weixin.qq.com/s/oPQFDl-A-6BnPXhNdwnePA)
  - [In depth | Ctrip's cross-data-center Hadoop architecture in practice](https://mp.weixin.qq.com/s/S5SXNabYqwyUMl1ReLayKw)
- Hive Basics
  - [Summary of Hive data compression formats](https://mp.weixin.qq.com/s/T6Y4vMYghb_asWdtsZnjpA)
  - [Summary of CombineFileInputFormat file splitting](https://mp.weixin.qq.com/s/DZ-CfrVrr7i0iA2GRdBN1g)
  - [Hive SQL window functions](https://mp.weixin.qq.com/s/qhP2tOS5plxaczPN1JkWJw)
  - [Hive SQL analytic functions](https://mp.weixin.qq.com/s/6nNr97z-Rj5Alofl8wCwhw)
- Low-Level Fundamentals
  - [Understanding the underlying principles of MySQL indexes](https://mp.weixin.qq.com/s/J7eQcwBgQEGJk4bGIa9wDA)
  - [Solutions for cache breakdown, cache expiration, and hot keys](https://mp.weixin.qq.com/s/TqqTDy2YizLMwE0tyHxKVA)
## Follow the Author's WeChat Official Account
Official account: 大数据学习指南 (Big Data Learning Guide), focused on big data technology.

Updates are also synced to other platforms from time to time.
- [Yuque](https://www.yuque.com/cuteximi/base)
- [Zhihu](https://zhuanlan.zhihu.com/bigdata1995)
- [Toutiao](https://www.toutiao.com/c/user/70068423102/#mid=1579500719412238)