awesome-data-science-and-engineering
A curated list of Data Science and Engineering frameworks, tools, and libraries, with related tutorials.
https://github.com/Dineshkarthik/awesome-data-science-and-engineering
-
Frameworks
- Installation
- Getting started - simple Airflow setup and running DAGs.
- Key concepts - DAGs, operators, sensors, tasks
- Executors - explanation of the different types of executors
- Triggering DAGs - triggering DAGs via the UI, REST API, and command line
- Kubernetes operator - explanation of how to use the Kubernetes operator
- Using kubernetes executor - run Airflow with the Kubernetes executor in Minikube using Helm.
- Deep Dive into airflow on kubernetes executor - running Airflow reliably with Kubernetes.
- Tips, Tricks and Best practices
-
Libraries
- Homepage
- Documentation
- Basics - Reading files into data frames and selecting.
- Aggregation and grouping - aggregation (such as min, max, sum, count, etc.) and grouping.
- Data wrangling - merge, sort, reset_index, fillna.
- Pivot table - explains the pandas [pivot_table](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.tools.pivot.pivot_table.html) function and how to use it for data analysis.
- Pandas plotting - using Pandas and Matplotlib to produce graphs.
- Time series analysis - the powerful time series tools in the pandas library.
- Pandas Cheatsheet
- Documentation
- Basics - Data types, Array creation, I/O with NumPy, Indexing, Broadcasting, Byte-swapping, Structured arrays, Subclassing ndarray
- Arrays and Vectorized Computation - ndarrays, array & scalar operations, transposing arrays and swapping axes
- Mathematical functions - trigonometric, hyperbolic, exponents and logarithms, complex numeric functions
- UFuncs - Universal Functions - introduction, exploration, trigonometric functions, advanced ufuncs, special ufuncs
- Array manipulation - changing shapes, transpose-like operations, changing number of dimensions, changing kind of array, joining, splitting and tiling arrays, adding, removing and rearranging elements
- Fancy Indexing - fancy indexing, combined indexing
- Numpy Cheatsheet
- Documentation
- Getting started - setting up an Alembic connection, running migrations, downgrading
- Batch migrations - running batch migrations
- Working with branches - merging two divergent source trees
- Alembic and PostgreSQL - running simple migrations on Postgres using Alembic.
- Getting started - a short dive into the above-mentioned basic concepts.
- Auto-generate migrations - Alembic can view the status of the database and compare against the table metadata in the application, generating the “obvious” migrations based on a comparison.
-
Tools
- Installation
- Documentation
- Using Jupyter Notebook - explains how to install, run, and use Jupyter Notebooks for data science, including tips, best practices, and examples.
- Jupyter notebook tips, tricks & shortcuts - keyboard shortcuts, IPython magic commands.
- JupyterLab evolution of the Jupyter Notebook - an overview of JupyterLab, the next generation of the Jupyter Notebook.
- Using notebook inside JupyterLab - using classic Jupyter notebooks inside JupyterLab.
- Basic Features
- Importing Libraries
- More resources
- GPU Notebooks
- Getting started - a small dive into basic features
- Running PySpark in Colab - how to run the Apache Spark Python API in Google Colab.
-
Big Data
- Introduction to Dataframes in PySpark - creating dataframes, commonly used functions on dataframes.
- Spark SQL and other higher-level tools - MLlib for machine learning, [GraphX](https://spark.apache.org/docs/latest/graphx-programming-guide.html) for graph processing, and [Spark Streaming](https://spark.apache.org/docs/latest/streaming-programming-guide.html).
- Homepage
- Installation
- Documentation
- Getting Started - overview of SparkContext, SQLContext, Spark ML, basic operations, data processing.
- Introduction to Dataframes in PySpark - Part 2 - hands-on DataFrame work using open-source datasets.
- PySpark SQL - basics of PySpark SQL.
- PySpark SQL Cheatsheet
- UDF - User-Defined Functions - register UDFs, invoke UDFs.
- Structured streaming - batch/interactive processing, stream processing.
- Example scripts - set of example scripts from the Apache Spark GitHub repo.
- PySpark Streaming with Kafka - using the PySpark StreamingContext to connect to Kafka and process data from a Kafka stream.
- Learning Apache Spark with Python
- Using Parquet files - loading data, partitions, schema merging, metadata refreshing.