https://github.com/aikuyun/bigdata-doc

Big data study notes, learning roadmaps, and technical case studies.

bigdata flink hadoop hdfs hive kafka mapreduce


![Self-study: Big Data](https://img.shields.io/badge/%E8%87%AA%E5%AD%A6-%E5%A4%A7%E6%95%B0%E6%8D%AE-brightgreen.svg)
![Self-study: Machine Learning](https://img.shields.io/badge/%E8%87%AA%E5%AD%A6-%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0-brightgreen.svg)
![Self-study: The Road to Big Data](https://img.shields.io/badge/%E8%87%AA%E5%AD%A6-%E5%A4%A7%E6%95%B0%E6%8D%AE%E8%BF%9B%E5%87%BB%E4%B9%8B%E8%B7%AF-blue.svg)

# Big Data Learning Resources

Notes on big data and machine learning, continuously updated.

# Article Categories
- Big Data Technology Weekly
  - [Big Data Technology Weekly, updated every week](https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzU0OTgxNjMyNA==&action=getalbum&album_id=2052897255342309378&scene=173&from_msgid=2247484324&from_itemidx=1&count=3&nolastread=1#wechat_redirect)

- Machine Learning
  - [Starting with Machine Learning](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/ml-guid.md)
  - [Machine Learning Terminology](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/ml-term.md)
  - [Machine Learning Roadmap](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/study-road.md)
  - [Two Recommended Websites for Gauging Your Current Level](https://github.com/aikuyun/bigdata-doc/blob/master/docs/ml/study-website.md)

- Distributed Systems Fundamentals
  - [Distributed Systems Fundamentals](https://github.com/aikuyun/ziyuan/blob/master/docs/distribute/distribute.md)

- Big Data Ecosystem
  - [HDFS](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan01#hdfs)
  - [MapReduce](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan01#mapreduce)
  - [Hive](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan01/Hive.md)

- Digging into the Internals
  - [Hadoop HA Mechanism](https://github.com/aikuyun/ziyuan/tree/master/docs/ziyuan02#hadoop-ha-%E6%9C%BA%E5%88%B6)
  - [MapReduce Principles and Execution Flow](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/MRyuanli.md)
  - [NameNode Internals](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/MRyuanli.md)
  - [Secondary Sort](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/secondarySort.md)
  - [Kafka](https://github.com/aikuyun/ziyuan/blob/master/docs/ziyuan02/kafka-01.md)

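- Solutions
  - [Solutions from Major Tech Companies](https://github.com/aikuyun/ziyuan/blob/master/docs/It-chat/case.md)
  - [How to Process Trillions of Records per Day: The iQIYI Real-Time Computing Platform's Approach](https://mp.weixin.qq.com/s/DKP08aUSNMOySNcs_y6ODA)
  - [How WeChat's 看一看 (Top Stories) Generates Recommendations for You](https://mp.weixin.qq.com/s/Regv8UUc5PH9HcnUq_zq3A)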

- Technical Article Collection
  - [Technical Article Collection](https://github.com/aikuyun/ziyuan/blob/master/docs/artical/artical.md)

- Spark
  - [Spark Tuning](https://mp.weixin.qq.com/s/iNovecaYkKrytNgQMvIMZw)
  - [Spark Shuffle Addressing Flow](https://mp.weixin.qq.com/s/0eQPmVnXCbEr1ziPAW569A)
  - [Spark Shuffle Tuning](https://mp.weixin.qq.com/s/keJnU0trtTW9W-zBWPKD5A)
  - [Spark Data Locality Levels](https://mp.weixin.qq.com/s/kF4zjiambBohSJG9gZW8_g)
  - [Spark's Core RDD, Stage Division Details, and a Summary of Run Modes](https://mp.weixin.qq.com/s/aPwsPTkFakBwv3MIioaOOg)

- Kafka
  - [Kafka + Spark Streaming](https://mp.weixin.qq.com/s/wKjSalxFdVkRXGPnNVg_2g)
  - [Kafka Data Loss and Duplicate Consumption](https://mp.weixin.qq.com/s/ROoVOVgNW8jzdCZeAwLTDQ)

- HBase
  - [HBase Architecture](https://mp.weixin.qq.com/s/j2Kbi003Etzw_15KwV0TyQ)
  - [HBase Architecture, Continued](https://mp.weixin.qq.com/s/7yRequ0pqGN_00zi704wwA)

- Hadoop
  - [Hadoop HA Principles](https://mp.weixin.qq.com/s/BmVvoi8k0mU9pmGQCl2Sug)
  - [Hadoop Series: 1.0 and 2.0 Architecture](https://mp.weixin.qq.com/s/B_wOtK1gSVlmB4cF5hZG2A)
  - [Hadoop Series: Hive](https://mp.weixin.qq.com/s/fWKX6NR908fLbVUMFwpj8A)
  - [Hadoop Series: MapReduce](https://mp.weixin.qq.com/s/JDDTTy6QfZtwz547M88GMQ)
  - [Hadoop Series: HDFS](https://mp.weixin.qq.com/s/Dcsat0-iRB_xYRBoMfhoXg)

- Flink
  - [Flink Community E-book](https://mp.weixin.qq.com/s?__biz=MzIwMjA2MTk4Ng==&mid=2247485438&idx=1&sn=2bb7f82402dc4607f94cdb78e48cd48b&chksm=96e52633a192af25a5c6b2371dfed395aa46168639c01bb49dbc36381f2b3dd889bfe9256d6a&xtrack=1&scene=0&subscene=91&sessionid=1555230598&clicktime=1555230760&ascene=7&devicetype=android-27&version=27000334&nettype=cmnet&abtest_cookie=BAABAAoACwASABMABQAjlx4AVpkeAMeZHgDRmR4A3JkeAAAA&lang=zh_CN&pass_ticket=cRjrq%2F8EqXfIhZvDoJO4rqTvtx1hEu4fyHiignznzsezMHPtQ83VFn8G02ozwToC&wx_header=1)
  - [A Flink Milestone Release Is Coming Soon: Get Started Now](https://mp.weixin.qq.com/s/OmTmPHaP0vSPT128eAf2Ig)
  - ["Apache Flink: Ten Technical Challenges in Practice" Released to Help You Handle Technical Problems in Production](https://mp.weixin.qq.com/s/U3c4oXFLPuc4XiUNUY55gg)
  - [2020 Flink Learning Resources Roundup (Worth Bookmarking)](https://mp.weixin.qq.com/s/wuKBvNbkO-pTWZEMSvGLNg)

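- Data Warehouse
  - [Offline and Real-Time Data Warehouses (Part 1)](https://mp.weixin.qq.com/s/dpwQ4sx-IWL66m03lPa6rg)
  - [Building and Operating 58's Site-Wide User Behavior Data Warehouse](https://mp.weixin.qq.com/s/MnfdsLHGjK9okv020cS_Kg)
  - [Deep Dive | Building Ctrip's Flight Ticket Data Warehouse](https://mp.weixin.qq.com/s/oPQFDl-A-6BnPXhNdwnePA)
  - [Deep Dive | Ctrip's Cross-Datacenter Hadoop Architecture in Practice](https://mp.weixin.qq.com/s/S5SXNabYqwyUMl1ReLayKw)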

- Hive Basics
  - [Summary of Hive Data Compression Formats](https://mp.weixin.qq.com/s/T6Y4vMYghb_asWdtsZnjpA)
  - [Summary of CombineFileInputFormat File Splitting](https://mp.weixin.qq.com/s/DZ-CfrVrr7i0iA2GRdBN1g)
  - [Hive SQL Window Functions](https://mp.weixin.qq.com/s/qhP2tOS5plxaczPN1JkWJw)
  - [Hive SQL Analytic Functions](https://mp.weixin.qq.com/s/6nNr97z-Rj5Alofl8wCwhw)

- Low-Level Fundamentals
  - [Understanding the Underlying Principles of MySQL Indexes](https://mp.weixin.qq.com/s/J7eQcwBgQEGJk4bGIa9wDA)
  - [Solutions for Cache Breakdown, Cache Expiration, and Hot Keys](https://mp.weixin.qq.com/s/TqqTDy2YizLMwE0tyHxKVA)

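## Follow the Original-Content WeChat Official Account

Official account: 大数据学习指南 (Big Data Learning Guide), focused on big data technology.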

![Scan me](https://cdn.nlark.com/yuque/0/2021/png/199648/1631944506464-83677e15-283f-43de-b106-5ff823300c85.png?x-oss-process=image%2Fresize%2Cw_900%2Climit_0)

Updates are also synced to other platforms from time to time.

- [Yuque](https://www.yuque.com/cuteximi/base)
- [Zhihu](https://zhuanlan.zhihu.com/bigdata1995)
- [Toutiao](https://www.toutiao.com/c/user/70068423102/#mid=1579500719412238)