{"id":13570389,"url":"https://github.com/DataExpert-io/data-engineer-handbook","last_synced_at":"2025-04-04T06:32:30.391Z","repository":{"id":208119553,"uuid":"720853375","full_name":"DataExpert-io/data-engineer-handbook","owner":"DataExpert-io","description":"This is a repo with links to everything you'd ever want to learn about data engineering","archived":false,"fork":false,"pushed_at":"2024-11-06T21:53:22.000Z","size":162,"stargazers_count":11688,"open_issues_count":7,"forks_count":1623,"subscribers_count":258,"default_branch":"main","last_synced_at":"2024-11-08T01:45:42.684Z","etag":null,"topics":["apachespark","awesome","bigdata","data","dataengineering","sql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DataExpert-io.png","metadata":{"files":{"readme":"README.md","changelog":"newsletters.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-19T19:47:41.000Z","updated_at":"2024-11-08T01:45:20.000Z","dependencies_parsed_at":"2023-12-06T07:55:27.035Z","dependency_job_id":"aa5f7b3c-3d82-4e14-96ae-4b840ea35b1a","html_url":"https://github.com/DataExpert-io/data-engineer-handbook","commit_stats":null,"previous_names":["dataengineer-io/data-engineer-handbook","dataexpert-io/data-engineer-handbook"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataExpert-io%2Fdata-engineer-handbook","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataExpert-io%2Fdata-engineer-handbook/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/DataExpert-io%2Fdata-engineer-handbook/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataExpert-io%2Fdata-engineer-handbook/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DataExpert-io","download_url":"https://codeload.github.com/DataExpert-io/data-engineer-handbook/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247134992,"owners_count":20889412,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apachespark","awesome","bigdata","data","dataengineering","sql"],"created_at":"2024-08-01T14:00:51.874Z","updated_at":"2025-04-04T06:32:25.382Z","avatar_url":"https://github.com/DataExpert-io.png","language":null,"readme":"# The Data Engineering Handbook\n\nThis repo has all the resources you need to become an amazing data engineer!\n\nMake sure to check out the [projects](projects.md) section for more hands-on examples!\n\nMake sure to check out the [interviews](interviews.md) section for more advice on how to pass data engineering interviews!\n\n## Resources\n\nGreat books:\n\n- [Fundamentals of Data Engineering](https://www.amazon.com/Fundamentals-Data-Engineering-Robust-Systems/dp/1098108302/)\n- [Designing Data-Intensive Applications](https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321/)\n- [Designing Machine Learning Systems](https://www.amazon.com/Designing-Machine-Learning-Systems-Production-Ready/dp/1098107969)\n- [The Hundred Page Machine Learning 
Book](https://www.amazon.com/Hundred-Page-Machine-Learning-Book/dp/199957950X)\n- [Kimball - The Data Warehouse Toolkit](https://ia801609.us.archive.org/14/items/the-data-warehouse-toolkit-kimball/The%20Data%20Warehouse%20Toolkit%20-%20Kimball.pdf)\n- [Data Mesh](https://www.oreilly.com/library/view/data-mesh/9781492092384/)\n- [Machine Learning System Design Interview](https://www.amazon.com/Machine-Learning-System-Design-Interview/dp/1736049127)\n- [Streaming Systems](https://www.amazon.com/Streaming-Systems-Where-Large-Scale-Processing/dp/1491983876)\n- [High Performance Spark](https://www.amazon.com/High-Performance-Spark-Practices-Optimizing/dp/1491943203)\n- [Building Evolutionary Architectures, 2nd Edition](https://www.oreilly.com/library/view/building-evolutionary-architectures/9781492097532/)\n- [Data Management at Scale, 2nd Edition](https://www.oreilly.com/library/view/data-management-at/9781098138851/)\n- [Deciphering Data Architectures](https://www.oreilly.com/library/view/deciphering-data-architectures/9781098150754/)\n- [97 Things Every Data Engineer Should Know: Collective Wisdom from the Experts](https://www.amazon.com/Things-Every-Data-Engineer-Should/dp/1492062413)\n- [Data Governance: The Definitive Guide](https://www.oreilly.com/library/view/data-governance-the/9781492063483/)\n- [Trino: The Definitive Guide](https://trino.io/trino-the-definitive-guide.html)\n- [Delta Lake: The Definitive Guide](https://www.oreilly.com/library/view/delta-lake-the/9781098151935/)\n- [Hadoop: The Definitive Guide](https://www.oreilly.com/library/view/hadoop-the-definitive/9781491901687/)\n- [Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications](https://www.amazon.com/Modern-Engineering-Apache-Spark-Hands/dp/1484274512)\n- [Data Engineering with dbt: A practical guide to building a dependable data platform with SQL](https://www.amazon.com/Data-Engineering-dbt-cloud-based-dependable-ebook/dp/B0C4LL19G7)\n- 
[Data Engineering with AWS](https://www.oreilly.com/library/view/data-engineering-with/9781804614426/)\n- [Practical DataOps: Delivering Agile Data Science at Scale](https://www.amazon.com/Practical-DataOps-Delivering-Agile-Science/dp/1484251032)\n- [Data Engineering Design Patterns](https://www.dedp.online/)\n- [Snowflake Data Engineering](https://www.manning.com/books/snowflake-data-engineering)\n- [Unlocking dbt](https://www.amazon.com/Unlocking-dbt-Design-Transformations-Warehouse/dp/1484296990/)\n- [Learning Spark, Second Edition](https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf)\n\nCommunities:\n\n- [Seattle Data Guy Discord](https://discord.gg/ah95MZKkFF)\n- [EcZachly Data Engineering Discord](https://discord.gg/JGumAXncAK)\n- [AdalFlow Discord (LLM Library)](https://discord.com/invite/ezzszrRZvT)\n- [Chip Huyen MLOps Discord](https://discord.gg/dzh728c5t3)\n- [Data Engineer Things Community](https://www.dataengineerthings.org/aboutus/)\n- [DBT Community](https://www.getdbt.com/community/join-the-community/)\n- [r/dataengineering](https://www.reddit.com/r/dataengineering)\n- [Microsoft Fabric Community](https://community.fabric.microsoft.com/)\n- [r/MicrosoftFabric](https://www.reddit.com/r/MicrosoftFabric/)\n- [Data Talks Club Slack](https://datatalks.club/slack)\n- [Data Engineering Wiki](https://dataengineering.wiki/)\n\nCompanies:\n\n- Orchestration  \n  - [Mage](https://www.mage.ai)\n  - [Astronomer](https://www.astronomer.io)\n  - [Prefect](https://www.prefect.io)\n  - [Dagster](https://www.dagster.io)\n  - [Airbyte](https://airbyte.com)\n  - [Kestra](https://kestra.io/) \n  - [Shipyard](https://www.shipyardapp.com/)\n  - [Hamilton](https://github.com/dagworks-inc/hamilton)\n- Data Lake / Cloud\n  - [Tabular](https://www.tabular.io)\n  - [Microsoft](https://www.microsoft.com)\n  - [Databricks](https://www.databricks.com/company/about-us)\n  - [Onehouse](https://www.onehouse.ai)\n  - [Delta Lake](https://delta.io/)\n- Data 
Warehouse\n  - [Snowflake](https://www.snowflake.com/en/)\n  - [Firebolt](https://www.firebolt.io/)\n- Data Quality\n  - [dbt](https://www.getdbt.com/)\n  - [Gable](https://www.gable.ai)\n  - [Great Expectations](https://www.greatexpectations.io)\n  - [Streamdal](https://streamdal.com)\n  - [Coalesce](https://coalesce.io/)\n  - [Soda](https://www.soda.io/)\n  - [DQOps](https://dqops.com/)\n- Education Companies\n  - [DataExpert.io](https://www.dataexpert.io)\n  - [LearnDataEngineering.com](https://www.learndataengineering.com)\n  - [AlgoExpert](https://www.algoexpert.io)\n  - [ByteByteGo](https://www.bytebytego.com)\n- Analytics / Visualization\n  - [Preset](https://www.preset.io)\n  - [Starburst](https://www.starburst.io)\n  - [Metabase](https://www.metabase.com/)\n  - [Looker Studio](https://lookerstudio.google.com/overview)\n  - [Tableau](https://www.tableau.com/)\n  - [Power BI](https://powerbi.microsoft.com/)\n  - [Apache Superset](https://superset.apache.org/)\n- Data Integration\n  - [Cube](https://cube.dev)\n  - [Fivetran](https://www.fivetran.com)\n  - [Airbyte](https://airbyte.io)\n  - [dlt](https://dlthub.com/)\n  - [Sling](https://slingdata.io/)\n  - [Meltano](https://meltano.com/)\n- Modern OLAP\n  - [Apache Druid](https://druid.apache.org/)\n  - [ClickHouse](https://clickhouse.com/)\n  - [Apache Pinot](https://pinot.apache.org/)\n  - [Apache Kylin](https://kylin.apache.org/)\n  - [DuckDB](https://duckdb.org/)\n- LLM application library\n  - [AdalFlow](https://github.com/SylphAI-Inc/AdalFlow)\n\nData Engineering blogs of companies:\n\n- [Netflix](https://netflixtechblog.com/tagged/big-data)\n- [Uber](https://www.uber.com/blog/houston/data/?uclick_id=b2f43229-f3f4-4bae-bd5d-10a05db2f70c)\n- [Databricks](https://www.databricks.com/blog/category/engineering/data-engineering)\n- [Airbnb](https://medium.com/airbnb-engineering/data/home)\n- [Amazon AWS Blog](https://aws.amazon.com/blogs/big-data/)\n- [Microsoft Data Architecture 
Blogs](https://techcommunity.microsoft.com/t5/data-architecture-blog/bg-p/DataArchitectureBlog)\n- [Microsoft Fabric Blog](https://blog.fabric.microsoft.com/)\n- [Oracle](https://blogs.oracle.com/datawarehousing/)\n- [Meta](https://engineering.fb.com/category/data-infrastructure/)\n- [Onehouse](https://www.onehouse.ai/blog)\n\nData Engineering Whitepapers:\n\n- [A Five-Layered Business Intelligence Architecture](https://ibimapublishing.com/articles/CIBIMA/2011/695619/695619.pdf)\n- [Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics](https://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf)\n- [Big Data Quality: A Data Quality Profiling Model](https://link.springer.com/chapter/10.1007/978-3-030-23381-5_5)\n- [The Data Lakehouse: Data Warehousing and More](https://arxiv.org/abs/2310.08697)\n- [Spark: Cluster Computing with Working Sets](https://dl.acm.org/doi/10.5555/1863103.1863113)\n- [The Google File System](https://research.google/pubs/the-google-file-system/)\n- [Building a Universal Data Lakehouse](https://www.onehouse.ai/whitepaper/onehouse-universal-data-lakehouse-whitepaper)\n- [XTable in Action: Seamless Interoperability in Data Lakes](https://arxiv.org/abs/2401.09621)\n- [MapReduce: Simplified Data Processing on Large Clusters](https://research.google/pubs/mapreduce-simplified-data-processing-on-large-clusters/)\n\nGreat YouTube Channels:\n\n- 100k+ subscribers\n  - [E-learning Bridge](https://www.youtube.com/@shashank_mishra)\n  - [TrendyTech](https://www.youtube.com/c/TrendytechInsights)\n  - [Darshil Parmar](https://www.youtube.com/@DarshilParmar)\n  - [Andreas Kretz](https://www.youtube.com/c/andreaskayy)\n  - [ByteByteGo](https://www.youtube.com/c/ByteByteGo)\n  - [The Ravit Show](https://youtube.com/@theravitshow)\n  - [Guy in a Cube](https://www.youtube.com/@GuyInACube)\n  - [Adam Marczak](https://www.youtube.com/@AdamMarczakYT)\n  - [nullQueries](https://www.youtube.com/@nullQueries)\n  - [TECHTFQ by 
Thoufiq](https://www.youtube.com/@techTFQ)\n- 10k+ subscribers\n  - [Data with Zach](https://www.youtube.com/c/datawithzach)\n  - [Seattle Data Guy](https://www.youtube.com/c/SeattleDataGuy)\n  - [Azure Lib](https://www.youtube.com/@azurelib-academy)\n  - [Advancing Analytics](https://www.youtube.com/@AdvancingAnalytics)\n  - [Kahan Data Solutions](https://www.youtube.com/@KahanDataSolutions)\n  - [Ankit Bansal](https://youtube.com/@ankitbansal6)\n  - [Mr. K Talks Tech](https://www.youtube.com/channel/UCzdOan4AmF65PmLLks8Lmww)\n- 1k+ subscribers\n  - [Eric Roby](https://www.youtube.com/@codingwithroby)\n\nGreat Podcasts\n\n- [The Data Engineering Show](https://www.dataengineeringshow.com/)\n- [Data Engineering Podcast](https://www.dataengineeringpodcast.com/)\n- [DataTopics](https://www.datatopics.io/)\n- [The Engineering Side of Data](https://podcasts.apple.com/us/podcast/the-engineering-side-of-data/id1566999533)\n- [DataAware](https://www.ascend.io/dataaware-podcast/)\n- [The Data Coffee Break Podcast](https://www.deezer.com/us/show/5293247)\n- [The Data Stack Show](https://datastackshow.com/)\n- [Intricity101 Data Sharks Podcast](https://www.intricity.com/learningcenter/podcast)\n- [Drill to Detail with Mark Rittman](https://www.rittmananalytics.com/drilltodetail/)\n- [Analytics Power Hour](https://analyticshour.io/)\n- [Catalog \u0026 cocktails](https://listen.casted.us/public/127/Catalog-%26-Cocktails-2fcf8728)\n- [Datatalks](https://datatalks.club/podcast.html)\n- [Data Brew by Databricks](https://www.databricks.com/discover/data-brew)\n- [The Data Cloud Podcast by Snowflake](https://rise-of-the-data-cloud.simplecast.com/)\n- [What's New in data](https://www.striim.com/podcast/)\n- [Open||Source||Data by Datastax](https://www.datastax.com/resources/podcast/open-source-data)\n- [Streaming Audio by confluent](https://developer.confluent.io/podcast/)\n- [The Data Scientist Show](https://podcasts.apple.com/us/podcast/the-data-scientist-show/id1584430381)\n- 
[MLOps.community](https://podcast.mlops.community/)\n- [Monday Morning Data Chat](https://open.spotify.com/show/3Km3lBNzJpc1nOTJUtbtMh)\n- [The Data Chief](https://www.thoughtspot.com/data-chief/podcast)\n\nNewsletters:\n\n- [DataEngineer.io Newsletter](https://blog.dataengineer.io)\n- [Seattle Data Guy](https://seattledataguy.substack.com)\n- [Joe Reis](https://joereis.substack.com)\n- [Data Engineering Weekly](https://www.dataengineeringweekly.com)\n- [Data Engineering Central](https://dataengineeringcentral.substack.com)\n- [Dutch Engineer](https://dutchengineer.substack.com)\n- [ByteByteGo](https://blog.bytebytego.com)\n- [Start Data Engineering](https://www.startdataengineering.com)\n- [Developing Dev](https://www.developing.dev)\n- [High Growth Engineer](https://careercutler.substack.com/)\n- [Learn Analytics Engineering](https://learnanalyticsengineering.substack.com/)\n- [Marvelous MLOps](https://marvelousmlops.substack.com/)\n- [Medium Data Engineering Newsletter](https://medium.com/data-engineering-weekly)\n- [Benn Stancil](https://benn.substack.com/)\n- [Metadata Weekly](https://metadataweekly.substack.com/)\n- [Technically](https://technically.substack.com/)\n- [Blef.fr Data News](https://www.blef.fr/blog/)\n- [All Hands on Data](https://allhandsondata.substack.com/)\n- [Modern Data 101](https://moderndata101.substack.com/)\n- [SELECT Insights](https://newsletter.ssp.sh/)\n- [Interesting Data Gigs](https://newsletter.interestinggigs.com)\n- [Ju Data Engineering Weekly](https://juhache.substack.com/)\n- [From An Engineer Sight](https://fromanengineersight.substack.com/)\n\nGlossaries:\n- [Data Engineering Vault](https://www.ssp.sh/brain/data-engineering/)\n- [Airbyte Data Glossary](https://glossary.airbyte.com/)\n- [Data Engineering Wiki by Reddit](https://dataengineering.wiki/Index)\n- [Secoda Glossary](https://www.secoda.co/glossary/)\n- [Glossary Databricks](https://www.databricks.com/glossary)\n- [Airtable 
Glossary](https://airtable.com/shrGh8BqZbkfkbrfk/tbluZ3ayLHC3CKsDb)\n- [Data Engineering Glossary by Dagster](https://dagster.io/glossary)\n\nLinkedIn\n\n- 100k+ Followers\n  - [Zach Wilson](https://www.linkedin.com/in/eczachly)\n  - [Ben Rogojan](https://www.linkedin.com/in/benjaminrogojan)\n  - [Sumit Mittal](https://www.linkedin.com/in/bigdatabysumit/)\n  - [Shashank Mishra](https://www.linkedin.com/in/shashank219/)\n  - [Chip Huyen](https://www.linkedin.com/in/chiphuyen/)\n  - [Alex Xu](https://www.linkedin.com/in/alexxubyte)\n  - [Deepak Goyal](https://www.linkedin.com/in/deepak-goyal-93805a17/)\n  - [Andreas Kretz](https://www.linkedin.com/in/andreas-kretz)\n- 50k+ Followers\n  - [Joe Reis](https://www.linkedin.com/in/josephreis)\n  - [Darshil Parmar](https://www.linkedin.com/in/darshil-parmar/)\n  - [Ankit Bansal](https://www.linkedin.com/in/ankitbansal6/)\n  - [Marc Lamberti](https://www.linkedin.com/in/marclamberti)\n- 10k+ Followers\n  - [Li Yin](https://www.linkedin.com/in/li-yin-ai/)\n  - [Joseph Machado](https://www.linkedin.com/in/josephmachado1991/)\n  - [Eric Roby](https://www.linkedin.com/in/codingwithroby/)\n  - [Simon Whiteley](https://www.linkedin.com/in/simon-whiteley-uk/)\n  - [Simon Späti](https://www.linkedin.com/in/sspaeti/)\n- 5k+ Followers\n  - [Dipankar Mazumdar](https://www.linkedin.com/in/dipankar-mazumdar/)\n  - [Daniel Ciocirlan](https://www.linkedin.com/in/danielciocirlan)\n  - [Hugo Lu](https://www.linkedin.com/in/hugo-lu-confirmed/)\n  - [Tobias Macey](https://www.linkedin.com/in/tmacey)\n  - [Marcos Ortiz](https://www.linkedin.com/in/mlortiz)\n  - [Julien Hurault](https://www.linkedin.com/in/julienhuraultanalytics/)\n- 1k+ Followers\n  - [Shruti Mantri](https://www.linkedin.com/in/shruti-mantri-88527a67/)\n  - [Volker Janz](https://www.linkedin.com/in/vjanz/)\n  - [Benoit Pimpaud](https://www.linkedin.com/in/pimpaudben/)\n\nTwitter / X\n\n- [Zach Wilson](https://www.twitter.com/EcZachly)\n- [Seattle Data 
Guy](https://www.twitter.com/SeattleDataGuy)\n- [Sumit Mittal](https://www.twitter.com/bigdatasumit)\n- [Joseph Machado](https://twitter.com/startdataeng)\n- [Alex Xu](https://twitter.com/alexxubyte/)\n- [Eric Roby](https://twitter.com/codingwithroby)\n- [Andreas Kretz](https://twitter.com/andreaskayy)\n- [Marc Lamberti](https://twitter.com/marclambertiml)\n- [Dipankar Mazumdar](https://twitter.com/Dipankartnt)\n- [Start Data Engineering](https://twitter.com/startdataeng)\n- [Data Cyborg](https://twitter.com/data_cyborg)\n- [Simon Späti](https://twitter.com/sspaeti)\n- [Marcos Ortiz](https://twitter.com/marcosluis2186)\n\nInstagram\n\n- [Zach Wilson](https://www.instagram.com/eczachly)\n- [Andreas Kretz](https://www.instagram.com/learndataengineering)\n- [Seattle Data Guy](https://www.instagram.com/seattledataguy)\n\nTikTok\n\n- [Zach Wilson](https://www.tiktok.com/@eczachly)\n- [Alex The Analyst](https://www.tiktok.com/@alex_the_analyst)\n- [Marcos Ortiz](https://www.tiktok.com/@marcosluis2186)\n\nDesign Patterns\n\n- [Cumulative Table Design](https://www.github.com/EcZachly/cumulative-table-design)\n- [Microbatch Deduplication](https://www.github.com/EcZachly/microbatch-hourly-deduped-tutorial)\n- [The Little Book of Pipelines](https://www.github.com/EcZachly/little-book-of-pipelines)\n- [Data Developer Platform](https://datadeveloperplatform.org/architecture/)\n\nCourses / Academies\n\n- [DataExpert.io course](https://www.dataexpert.io) use code **HANDBOOK10** for a discount!\n- [LearnDataEngineering.com](https://www.learndataengineering.com)\n- [Technical Freelancer Academy](https://www.technicalfreelanceracademy.com/) Use code **zwtech** for a discount!\n- [IBM Data Engineering for Everyone](https://www.edx.org/learn/data-engineering/ibm-data-engineering-basics-for-everyone)\n- [Qwiklabs](https://www.qwiklabs.com/)\n- [DataCamp](https://www.datacamp.com/)\n- [Udemy Courses from Shruti Mantri](https://www.udemy.com/user/shruti-mantri-5/)\n- [Rock the 
JVM](https://rockthejvm.com/) teaches Spark (in Scala), Flink and others\n- [Data Engineering Zoomcamp by DataTalksClub](https://datatalks.club/)\n- [Efficient Data Processing in Spark](https://josephmachado.podia.com/efficient-data-processing-in-spark)\n- [Scaler](https://www.scaler.com/)\n\nCertifications Courses\n\n- [Google Cloud Certified - Professional Data Engineer](https://cloud.google.com/certification/data-engineer)\n- [Databricks - Data Engineer Professional](https://www.databricks.com/learn/certification/data-engineer-professional)\n- [Azure Data Engineer Associate](https://learn.microsoft.com/credentials/certifications/azure-data-engineer/)\n- [Microsoft Fabric Analytics Engineer Associate](https://learn.microsoft.com/credentials/certifications/fabric-analytics-engineer-associate/)\n- [Exam DP-203: Data Engineering on Microsoft Azure](https://learn.microsoft.com/en-us/credentials/certifications/exams/dp-203/?tab=tab-learning-paths)\n- [AWS Certified Data Engineer - Associate](https://aws.amazon.com/certification/certified-data-engineer-associate/)\n\nConferences\n\n- [Trino Summit - December 13-14, 2023 - Virtual](https://www.starburst.io/info/trinosummit2023/)\n- [Data Universe - April 10-11, 2024 - New York City](https://www.datauniverseevent.com/)\n- [Data Nova @ Data Universe - April 10-11, 2024 - New York City](https://www.starburst.io/datanova/)\n- [DataTune Conference - March 8-9, 2024 - Nashville, TN](https://www.datatuneconf.com/)\n","funding_links":[],"categories":["Jupyter Notebook","A01_机器学习教程","Repos","Comprehensive Getting Started with DE Resources","\u003ca name=\"Jupyter%20Notebook\"\u003e\u003c/a\u003eJupyter Notebook","⚙️ Data 
Engineering"],"sub_categories":["Resources"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDataExpert-io%2Fdata-engineer-handbook","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDataExpert-io%2Fdata-engineer-handbook","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDataExpert-io%2Fdata-engineer-handbook/lists"}