Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-data-engineer
This is a repo with links to everything you'd ever want to learn about data engineering
https://github.com/xiaomingx/awesome-data-engineer
Last synced: 2 days ago
-
Getting started guide
-
A curated [list of more than 25 classic data engineering books](books.md)
-
Data engineering content from company blogs:
-
Data engineering white papers:
- Big Data Quality: A Data Quality Profiling Model
- Spark: Cluster Computing with Working Sets
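- A Five-Layered Business Intelligence Architecture
- Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics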
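- The Data Lakehouse: Data Warehousing and More
- The Google File System
- Building the Universal Data Lakehouse
- XTable in Action: Seamless Interoperability in Data Lakes
- MapReduce: Simplified Data Processing on Large Clusters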
-
Data engineering companies by category
- dbt
- Databricks
- Onehouse
- Delta Lake
- Firebolt
- Gable
- Coalesce
- Soda
- DQOps
- HEDDA.IO
- DataExpert.io
- LearnDataEngineering.com
- Airflow
- Kestra
- Shipyard
- Hamilton
- Tabular
- ByteByteGo
- Metabase
- Looker Studio
- Tableau
- Apache Superset
- Cube
- Meltano
- dlt
- Sling
- Apache Druid
- ClickHouse
- Apache Pinot
- Apache Kylin
- DuckDB
- QuestDB
- AdalFlow
- LangChain
- LlamaIndex
- Aggregations.io
- Responsive
- RisingWave
- Striim
- Snowflake
-
A curated [list of more than 10 data engineering communities worth joining](communities.md)
-
List of social media accounts
- Li Yin
- Jaco van Gelder
- Joseph Machado
- Dipankar Mazumdar
- Darshil Parmar (100k+ followers)
- Data with Zach
- E-learning Bridge
- TrendyTech
- ByteByteGo
- The Ravit Show
- Guy in a Cube
- Adam Marczak
- nullQueries
- TECHTFQ by Thoufiq
- SQLBI
- Azure Lib (100k+ followers)
- Kahan Data Solutions
- Ankit Bansal
- Mr. K Talks Tech
- Simon Späti
- Hugo Lu
- Tobias Macey
- Eric Roby
- Daniel Ciocirlan
- Marcos Ortiz
- Julien Hurault
- Alex The Analyst (100k+ followers; TikTok: [@alex_the_analyst](https://www.tiktok.com/@alex_the_analyst), 10k+)
- Marc Lamberti
- Chip Huyen
- Alex Merced Data
- John Kutay
- Lakshmi Sontenam
- Hassaan Akbar
- Python Basics
- Constantin Lungu
- Ijaz Ali
- Subhankar
- Big Data Show
-
Great podcasts
- The Data Engineering Show
- Data Engineering Podcast
- DataTopics
- The Data Engineering Side Of Data
- The Data Coffee Break Podcast
- The Datastack Show
- Intricity101 Data Sharks Podcast
- Analytics Power Hour
- Catalog & Cocktails
- Datatalks
- Data Brew by Databricks
- The Data Cloud Podcast by Snowflake
- Open||Source||Data by Datastax
- The Data Scientist Show
- MLOps.community
- Monday Morning Data Chat
- The Data Chief
- DataWare
- What's New in Data
- Streaming Audio by Confluent
-
Data engineering glossary resources
-
Recommended data engineering newsletters
-
Great podcasts
-
A curated [list of 20+ recommended newsletters](newsletters.md)
-
Glossaries:
-
Keywords (with counts)
- 6: rag
- 4: python, framework, llm, machine-learning
- 2: agent, software-engineering, pandas, orchestration, mlops, llmops, lineage, feature-engineering, etl-pipeline, etl-framework, etl, dataframe, data-science, data-engineering, data-analysis, dag, vector-database, multi-agents, llamaindex, fine-tuning, data, application, agents, trainer, summarization, retriever, reranker, question-answering, optimizer, nlp, information-retrieval, generative-ai, faiss, chatbot, bm25, ai