https://github.com/baptvit/big_data

My courses and activities in Big Data

big-data hadoop hbase hive kafka mapreduce oozie pig python3 scala spark zookeeper

# Learning Big Data
# Resource attributes

Since resources across the internet vary in their prerequisites and general accessibility, it is useful to
tag them with attributes so that it is easy to see where a resource fits into the wider machine learning scope. Below are a few suggested attributes (please extend):

- :blue_book: = In progress
- :heavy_check_mark: = Completed
- :rainbow: = Creative
- :bowtie: = Beginner
- :sweat_smile: = Intermediate, some prerequisites
- :godmode: = Advanced, many prerequisites

#### Tools Used
Hadoop, Hive, HBase, ZooKeeper, Oozie, Solr, Kafka, Pig, MapReduce, YARN, Spark, Scala and Python.

### Accelerated Learning Techniques
- Watch videos at 2x or 3x speed using a browser extension
- Handwrite notes as you watch for memory retention
- Immerse yourself in the [community](https://medium.com/@exastax/top-20-data-science-blogs-and-websites-for-data-scientists-d88b7d99740)

# Real-World Tools

## Big Data Fundamentals
- #### [Big Data Fundamentals](https://cognitiveclass.ai/learn/big-data/) [RESULTS](https://github.com/helpthx/Big_Data/tree/master/Big%20Data%20Fundamentals) :heavy_check_mark:
- [Big Data 101](https://cognitiveclass.ai/courses/what-is-big-data/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Big%20Data%20Fundamentals/Cognitive%20Class%20BD0101EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Hadoop 101](https://cognitiveclass.ai/courses/introduction-to-hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Big%20Data%20Fundamentals/BigDataUniversity%20BD0111EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Spark Fundamentals I](https://cognitiveclass.ai/courses/what-is-spark/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Big%20Data%20Fundamentals/Cognitive%20Class%20BD0211EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:

- Big Data Fundamentos 2.0 from Data Science Academy [RESULTS](https://github.com/helpthx/Big_Data/blob/master/certificate-big-data-fundamentos-20.pdf) :heavy_check_mark:

## Hadoop
- #### [Hadoop Fundamentals](https://cognitiveclass.ai/learn/hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/tree/master/Hadoop%20Fundamentals) :heavy_check_mark:
- [Hadoop 101](https://cognitiveclass.ai/courses/introduction-to-hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Big%20Data%20Fundamentals/BigDataUniversity%20BD0111EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [MapReduce and YARN](https://cognitiveclass.ai/courses/mapreduce-and-yarn/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0115EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Moving Data into Hadoop](https://cognitiveclass.ai/courses/flume-sqoop-moving-data-into-hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0131EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Accessing Hadoop Data Using Hive](https://cognitiveclass.ai/courses/hadoop-hive/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0141EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- #### [Hadoop Programming](https://cognitiveclass.ai/learn/big-data-hadoop-programming/) [RESULTS](https://github.com/helpthx/Big_Data/tree/master/Hadoop%20Programming) :heavy_check_mark:
- [MapReduce and YARN](https://cognitiveclass.ai/courses/mapreduce-and-yarn/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0115EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Apache Pig 101](https://cognitiveclass.ai/courses/introduction-to-pig/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Programming/Big%20Data%20University%20BD0121EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Simplifying Data Pipelines with Apache Kafka](https://cognitiveclass.ai/courses/simplifyingdatapipelines/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Programming/Big%20Data%20University%20BD0123EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:

- #### [Hadoop Administration](https://cognitiveclass.ai/learn/hadoop-administration/) [RESULTS](https://github.com/helpthx/Big_Data/tree/master/Hadoop%20Administration) :heavy_check_mark:
- [Moving Data into Hadoop](https://cognitiveclass.ai/courses/flume-sqoop-moving-data-into-hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0131EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Controlling Hadoop Jobs Using Oozie](https://cognitiveclass.ai/courses/controlling-hadoop-jobs-using-oozie/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Administration/Big%20Data%20University%20BD0133EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Developing Distributed Applications Using ZooKeeper](https://cognitiveclass.ai/courses/developing-distributed-applications-using-zookeeper/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Administration/Big%20Data%20University%20BD0135EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Solr 101](https://cognitiveclass.ai/courses/introduction-to-solr/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Administration/Big%20Data%20University%20BD0137EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:

- #### [Hadoop Data Access](https://cognitiveclass.ai/learn/big-data-storage-and-retrieval/) [RESULTS](https://github.com/helpthx/Big_Data/tree/master/Hadoop%20Data%20Access) :heavy_check_mark:
- [Accessing Hadoop Data Using Hive](https://cognitiveclass.ai/courses/hadoop-hive/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Fundamentals/Big%20Data%20University%20BD0141EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Using HBase for Real-time Access to your Big Data](https://cognitiveclass.ai/courses/using-hbase-for-real-time-access-to-your-big-data/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Data%20Access/Big%20Data%20University%20BD0143EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [SQL Access for Hadoop](https://cognitiveclass.ai/courses/sql-access-for-hadoop/) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Hadoop%20Data%20Access/Big%20Data%20University%20BD0145EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:

- #### [Intro to Hadoop and MapReduce](https://www.udacity.com/course/intro-to-hadoop-and-mapreduce--ud617)

## Scala
- #### [Scala Programming for Data Science](https://cognitiveclass.ai/learn/scala)
- [Scala 101](https://courses.cognitiveclass.ai/courses/course-v1:BigDataUniversity+SC0101EN+2016/info) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Scala%20Programming%20for%20Data%20Science/Scala%20101/Module%201:%20Introduction/Big%20Data%20University%20SC0101EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Spark Overview for Scala Analytics](https://courses.cognitiveclass.ai/courses/course-v1:BigDataUniversity+SC0103EN+2016/info) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Scala%20Programming%20for%20Data%20Science/Spark%20Overview%20for%20Scala%20Analytics/Big%20Data%20University%20SC0103EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:
- [Data Science for Scala](https://cognitiveclass.ai/courses/data-science-scala) [RESULTS](https://github.com/helpthx/Big_Data/blob/master/Scala%20Programming%20for%20Data%20Science/Data%20Science%20with%20Scala/Lightbend%20SC0105EN%20Certificate%20_%20Cognitive%20Class.pdf) :heavy_check_mark:

## Data Storytelling
- edX: [Analytics Storytelling for Impact](https://www.edx.org/course/analytics-storytelling-impact-1)

## Spark
- [Spark Workshop PDF](https://stanford.edu/~rezab/sparkclass/slides/itas_workshop.pdf)