https://github.com/tomaztk/azure-databricks

Azure Databricks - Advent of 2020 Blogposts

azure-data-factory azure-databricks azure-machine-learnning data-analytics data-engineerg databricks databricks-notebooks machine-learning mlflow mllib notebook notebooks pyspark python r-language scala spark spark-structured-streaming sparkr sql

# Microsoft Azure Databricks

![](http://img.shields.io/badge/Azure-Databricks-red.svg) ![](http://img.shields.io/badge/Microsoft-Azure-blue.svg)
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Ftomaztk%2FAzure-Databricks&count_bg=%2379C83D&title_bg=%23555555&icon=microsoftazure.svg&icon_color=%230A6BFF&title=hits&edge_flat=false)](https://hits.seeyoufarm.com)
![](https://img.shields.io/github/forks/tomaztk/azure-databricks?style=social)


This repository is a set of blog posts, the **Advent of Azure Databricks** _2020_, written to make onboarding with Azure Databricks easier.

## Table of contents / Featured blog posts

1. [Dec 01 2020 - What is Azure DataBricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2001%202020%20-%20What%20is%20Azure%20DataBricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/01/advent-of-2020-day-1-what-is-azure-databricks/))
2. [Dec 02 2020 - How to get started with Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/%20Dec%2002%202020%20-%20How%20to%20get%20started%20with%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/02/advent-of-2020-day-2-how-to-get-started-with-azure-databricks/))
3. [Dec 03 2020 - Getting to know the workspace and Azure Databricks platform](https://github.com/tomaztk/Azure-Databricks/blob/main/%20Dec%2003%202020%20-%20Getting%20to%20know%20the%20workspace%20and%20Azure%20Databricks%20platform.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/03/advent-of-2020-day-3-getting-to-know-the-workspace-and-azure-databricks-platform/))
4. [Dec 04 2020 - Creating your first Azure Databricks cluster](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2004%202020%20-%20Creating%20your%20first%20Azure%20Databricks%20cluster.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/04/advent-of-2020-day-4-creating-your-first-azure-databricks-cluster/))
5. [Dec 05 2020 - Basics on architecture of clusters, workers, DBFS storage jobs](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2005%202020%20-%20Understanding%20Azure%20Databricks%20cluster%20architecture%2C%20workers%2C%20drivers%20and%20jobs.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/05/advent-of-2020-day-5-understanding-azure-databricks-cluster-architecture-workers-drivers-and-jobs/))
6. [Dec 06 2020 - Importing and storing data to Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2006%202020%20-%20Importing%20and%20storing%20data%20to%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/06/advent-of-2020-day-6-importing-and-storing-data-to-azure-databricks/))
7. [Dec 07 2020 - Starting with Databricks notebooks and loading data](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2007%202020%20-%20Starting%20with%20Databricks%20notebooks%20and%20loading%20data%20to%20DBFS.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/07/advent-of-2020-day-7-starting-with-databricks-notebooks-and-loading-data-to-dbfs/))
8. [Dec 08 2020 - Using Databricks CLI and DBFS CLI for file upload](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2008%202020%20-%20Using%20Databricks%20CLI%20and%20DBFS%20CLI%20for%20file%20upload.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/08/advent-of-2020-day-8-using-databricks-cli-and-dbfs-cli-for-file-upload/))
9. [Dec 09 2020 - Connect to Azure Blob storage using Notebooks in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2009%202020%20-%20Connect%20to%20Azure%20Blob%20storage%20using%20Notebooks%20in%20%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/09/advent-of-2020-day-9-connect-to-azure-blob-storage-using-notebooks-in-azure-databricks/))
10. [Dec 10 2020 - Using Azure Databricks Notebooks with SQL for Data engineering tasks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2010%202020%20-%20Using%20Azure%20Databricks%20Notebooks%20with%20SQL%20for%20Data%20engineering%20tasks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/10/advent-of-2020-day-10-using-azure-databricks-notebooks-with-sql-for-data-engineering-tasks/))
11. [Dec 11 2020 - Using Azure Databricks Notebooks with R to do Data engineering and data analytics](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2011%202020%20-%20Using%20Azure%20Databricks%20Notebooks%20with%20SQL%20for%20Data%20engineering%20tasks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/11/advent-of-2020-day-11-using-azure-databricks-notebooks-with-r-language-for-data-analytics/))
12. [Dec 12 2020 - Using Azure Databricks Notebooks with Python to do Data engineering and data analytics](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2012%202020%20-%20Using%20Azure%20Databricks%20Notebooks%20with%20Python%20Language%20for%20data%20analytics.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/12/advent-of-2020-day-12-using-azure-databricks-notebooks-with-python-language-for-data-analytics/))
13. [Dec 13 2020 - Using Python Databricks Koalas with Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2013%202020%20-%20Using%20Python%20Databricks%20Koalas%20with%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/13/advent-of-2020-day-13-using-python-databricks-koalas-with-azure-databricks/))
14. [Dec 14 2020 - From configuration to execution of Databricks jobs](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2014%202020%20-%20%20From%20configuration%20to%20execution%20of%20Databricks%20jobs.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/14/advent-of-2020-day-14-from-configuration-to-execution-of-databricks-jobs/))
15. [Dec 15 2020 - Databricks Spark UI, Event Logs, Driver logs and Metrics](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2015%202020%20-%20Databricks%20Spark%20UI%2C%20Event%20Logs%2C%20Driver%20logs%20and%20Metrics.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/15/advent-of-2020-day-15-databricks-spark-ui-event-logs-driver-logs-and-metrics/))
16. [Dec 16 2020 - Databricks experiments, models and MLFlow](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2016%202020%20-%20Databricks%20experiments%2C%20models%20and%20MLFlow.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/16/advent-of-2020-day-16-databricks-experiments-models-and-mlflow/))
17. [Dec 17 2020 - End-to-End Machine learning project in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2017%202020%20-%20End-to-End%20Machine%20learning%20project%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/17/advent-of-2020-day-17-end-to-end-machine-learning-project-in-azure-databricks/))
18. [Dec 18 2020 - Using Azure Data Factory with Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2018%202020%20-%20Using%20Azure%20Data%20Factory%20with%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/18/advent-of-2020-day-18-using-azure-data-factory-with-azure-databricks/))
19. [Dec 19 2020 - Using Azure Data Factory with Azure Databricks for merging CSV files](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2019%202020%20-%20Using%20Azure%20Data%20Factory%20with%20Azure%20Databricks%20for%20merging%20CSV%20files.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/19/advent-of-2020-day-19-using-azure-data-factory-with-azure-databricks-for-merging-csv-files/))
20. [Dec 20 2020 - Orchestrating multiple notebooks with Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2020%202020%20-%20Orchestrating%20multiple%20notebooks%20with%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/16/advent-of-2020-day-16-databricks-experiments-models-and-mlflow/))
21. [Dec 21 2020 - Using Scala with Spark Core API in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2021%202020%20-%20Using%20Scala%20with%20Spark%20Core%20API%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/21/advent-of-2020-day-21-using-scala-with-spark-core-api-in-azure-databricks/))
22. [Dec 22 2020 - Using Spark SQL and DataFrames in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2022%202020%20-%20Using%20Spark%20SQL%20and%20DataFrames%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/22/advent-of-2020-day-22-using-spark-sql-and-dataframes-in-azure-databricks/))
23. [Dec 23 2020 - Using Spark Streaming in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2023%202020%20-%20Using%20Spark%20Streaming%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/23/advent-of-2020-day-23-using-spark-streaming-in-azure-databricks/))
24. [Dec 24 2020 - Using Spark MLlib for Machine Learning in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2024%202020%20-%20Using%20Spark%20MLlib%20for%20Machine%20Learning%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/24/advent-of-2020-day-24-using-spark-mllib-for-machine-learning-in-azure-databricks/))
25. [Dec 25 2020 - Using Spark GraphFrames in Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2025%202020%20-%20Using%20Spark%20GraphFrames%20in%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/25/advent-of-2020-day-25-using-spark-graphframes-in-azure-databricks/))
26. [Dec 26 2020 - Connecting Azure Machine Learning Services Workspace and Azure Databricks](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2026%202020%20-%20Connecting%20Azure%20Machine%20Learning%20Services%20Workspace%20and%20Azure%20Databricks.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/26/advent-of-2020-day-26-connecting-azure-machine-learning-services-workspace-and-azure-databricks/))
27. [Dec 27 2020 - Connecting Azure Databricks with on premise environment](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2027%202020%20-%20Connecting%20Azure%20Databricks%20with%20on%20premise%20environment.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/27/advent-of-2020-day-27-connecting-azure-databricks-with-on-premise-environment/))
28. [Dec 28 2020 - Infrastructure as Code and how to automate, script and deploy Azure Databricks with Powershell](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2028%202020%20-%20Infrastructure%20as%20Code%20and%20how%20to%20automate%2C%20script%20and%20deploy%20Azure%20Databricks%20with%20Powershell.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/28/advent-of-2020-day-28-infrastructure-as-code-and-how-to-automate-script-and-deploy-azure-databricks-with-powershell/))
29. [Dec 29 2020 - Performance tuning for Apache Spark](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2029%202020%20-%20Performance%20tuning%20for%20Apache%20Spark.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/29/advent-of-2020-day-29-performance-tuning-of-apache-spark/))
30. [Dec 30 2020 - Monitoring and troubleshooting of Apache Spark](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2030%202020%20-%20Monitoring%20and%20troubleshooting%20of%20Apache%20Spark.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/30/advent-of-2020-day-30-monitoring-and-troubleshooting-of-apache-spark/))
31. [Dec 31 2020 - Azure Databricks documentation, learning materials and additional resources](https://github.com/tomaztk/Azure-Databricks/blob/main/Dec%2031%202020%20-%20Azure%20Databricks%20documentation%2C%20learning%20materials%20and%20additional%20resources.md) ([blogpost](https://tomaztsql.wordpress.com/2020/12/31/advent-of-2020-day-31-azure-databricks-documentation-learning-materials-and-additional-resources/))

## Additional Material

Additional demo materials from different sessions are also available in this repository.

## Blog

All posts were originally published on my [blog](https://tomaztsql.wordpress.com) and copied here to GitHub, where it is simple to clone the code, the Markdown files, and all the materials.

## Cloning the repository
Run the following command to clone the repository.

```
git clone https://github.com/tomaztk/Azure-Databricks.git
```

## Contact
Get in contact:

[![Gmail](https://img.shields.io/badge/Gmail-D14836?style=for-the-badge&logo=gmail&logoColor=white&)](mailto:[email protected]?subject=[GithubRepo]%20AzureDatabricks)

[![Github URL](https://img.shields.io/twitter/url/https/twitter.com/tomaz_tsql.svg?style=social&label=Follow%20%40tomaz_tsql)](https://github.com/tomaztk)

## Contributing
Do the usual GitHub fork and pull request dance. Add yourself to the contributors section if you want to (or I will add you).

## Suggestions
Feel free to suggest any new topics that you would like to be covered.

## License
[MIT](https://choosealicense.com/licenses/mit/) © Tomaž Kaštrun