https://github.com/wittline/sparksql-with-python

This repository has some examples of using Spark and SparkSQL with Python through PySpark

flask-api python spark sparksql


# SparkSQL with Python

This repository has some examples of using Spark and SparkSQL with Python through PySpark

## Profeco

We will work with the Profeco dataset, which you can download here: [Profeco](https://drive.google.com/uc?export=download&id=0B-4W2dww7ELNazFfOFVhNG5vckE). It is a daily historical record of more than 2,000 products, as of 2015, in various establishments in Mexico.

Check the code here

* How many records are there?
* How many categories are there?
* How many trade chains are being monitored (and therefore reported in that database)?
* What are the most monitored products in each state of the country?
* What is the trade chain with the greatest variety of monitored products?
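
The questions above could be expressed as SparkSQL queries along the following lines. This is only a sketch, not the code from this repository: the view name `profeco` and the column names `producto`, `categoria`, `cadenaComercial` and `estado` are assumptions about the CSV header and may need to be adjusted, and `run_queries` is a hypothetical helper.

```python
# Sketch only: column names below are ASSUMED, check them against the CSV header.
QUERIES = {
    # How many records are there?
    "records": "SELECT COUNT(*) AS n FROM profeco",
    # How many categories are there?
    "categories": "SELECT COUNT(DISTINCT categoria) AS n FROM profeco",
    # How many trade chains are being monitored?
    "chains": "SELECT COUNT(DISTINCT cadenaComercial) AS n FROM profeco",
    # Most monitored product in each state (one row per state).
    "top_product_per_state": """
        SELECT estado, producto, n FROM (
            SELECT estado, producto, n,
                   ROW_NUMBER() OVER (PARTITION BY estado ORDER BY n DESC) AS rn
            FROM (
                SELECT estado, producto, COUNT(*) AS n
                FROM profeco
                GROUP BY estado, producto
            ) AS counts
        ) AS ranked
        WHERE rn = 1
    """,
    # Trade chain with the greatest variety of monitored products.
    "chain_with_most_products": """
        SELECT cadenaComercial, COUNT(DISTINCT producto) AS n
        FROM profeco
        GROUP BY cadenaComercial
        ORDER BY n DESC
        LIMIT 1
    """,
}


def run_queries(csv_path):
    """Hypothetical helper: load the CSV and run every query in QUERIES."""
    # Imported here so the query definitions can be inspected without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("profeco").getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    df.createOrReplaceTempView("profeco")
    return {name: spark.sql(sql).collect() for name, sql in QUERIES.items()}
```

Calling `run_queries("all_data.csv")` (the file name is a placeholder) would return a dictionary mapping each question to a list of Spark `Row` objects. Keeping the queries in a plain dictionary makes it easy to run them one at a time from a notebook.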

## Countries airports

Check the code here

## API to count the number of tweets in a radius of 1km

I will extract all the distinct tweets, together with their geographic data, into a separate file, "tweets_geo.csv". This makes the data easier to manipulate in a SparkSQL query.

Check the data preparation code here

The code for the REST API is in the `API` folder of this repository.
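
One possible way to count the tweets within 1 km of a point is to filter on the haversine (great-circle) distance. The sketch below is not taken from the `API` folder: the view name `tweets_geo`, the column names `latitude` and `longitude`, and the helper functions are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_count_sql(lat, lon, radius_km=1.0, view="tweets_geo"):
    """Build a SparkSQL query counting rows of `view` within radius_km of (lat, lon).

    ASSUMES the view has numeric columns `latitude` and `longitude` in degrees.
    """
    # Coerce to float so only numeric literals end up in the SQL text.
    lat, lon, radius_km = float(lat), float(lon), float(radius_km)
    return f"""
        SELECT COUNT(*) AS n
        FROM {view}
        WHERE 2 * {EARTH_RADIUS_KM} * ASIN(SQRT(
                  POW(SIN(RADIANS(latitude - {lat}) / 2), 2)
                  + COS(RADIANS({lat})) * COS(RADIANS(latitude))
                    * POW(SIN(RADIANS(longitude - {lon}) / 2), 2)
              )) <= {radius_km}
    """


def count_tweets_in_radius(spark, lat, lon, radius_km=1.0):
    """Run the query on an existing SparkSession that has the view registered."""
    return spark.sql(radius_count_sql(lat, lon, radius_km)).first()["n"]
```

A REST endpoint could then pass the request's latitude and longitude to `count_tweets_in_radius` and return the result as JSON. The pure-Python `haversine_km` mirrors the SQL expression and is handy for spot-checking individual rows.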

![alt text](https://wittline.github.io/SparkSQL-with-Python/images/api1.PNG)

![alt text](https://wittline.github.io/SparkSQL-with-Python/images/api2.PNG)

![alt text](https://wittline.github.io/SparkSQL-with-Python/images/api3.PNG)

# Contributing and Feedback
Any ideas or feedback about this repository? Help me improve it.

# Authors
- Created by Ramses Alexander Coraspe Valdez
- Created in 2020

# License
This project is licensed under the terms of the MIT license.