https://github.com/javaidiqbal11/arabic-tweets-sentiment-analysis-using-spark
This repo is for Twitter Arabic dataset for sentiment analysis using Apache Spark.
apache-spark arabic-nlp arabic-tweets flask pyhton3 sentiment-analysis spark twitter-api
- Host: GitHub
- URL: https://github.com/javaidiqbal11/arabic-tweets-sentiment-analysis-using-spark
- Owner: javaidiqbal11
- Created: 2021-01-27T16:55:00.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-01-27T17:01:56.000Z (about 4 years ago)
- Last Synced: 2024-07-22T17:39:23.665Z (7 months ago)
- Topics: apache-spark, arabic-nlp, arabic-tweets, flask, pyhton3, sentiment-analysis, spark, twitter-api
- Language: Python
- Homepage: https://www.jtech.com.pk/
- Size: 376 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Installation Process
Spark is Java-based, so you need to install Java on your system to run it. Download Apache Spark, place the archive in a directory of your choice, then uncompress it and save it as a `spark` directory.
Add environment variables for Spark to `.bashrc` if you are using Linux or another Unix-based OS. For example, I downloaded Spark into `/home/irfan`, so I add these lines to the `.bashrc` file:
```textmate
export SPARK_HOME=/home/irfan/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export PYSPARK_PYTHON=python3
```
Test your Spark installation by running this command in a terminal:
```shell script
pyspark
```
It will start the Python shell in Spark.
## Install requirements
You need to install the Python packages required to run the training.
```shell script
pip3 install -r requirements.txt
```
Now you are good to go.
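As an optional sanity check before starting the server, the environment set up above can be verified from Python. This is a minimal sketch, assuming the `SPARK_HOME` and `PATH` variables from the `.bashrc` step; `check_spark_env` is a hypothetical helper name and is not part of this repo.

```python
import os
import shutil


def check_spark_env(env=None):
    """Return a list of human-readable problems with the Spark setup."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    problems = []

    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(spark_home):
        problems.append("SPARK_HOME does not point to a directory: " + spark_home)

    # The pyspark launcher lives in $SPARK_HOME/bin, which should be on PATH.
    if shutil.which("pyspark", path=env.get("PATH", "")) is None:
        problems.append("pyspark was not found on PATH")

    return problems
```

Calling `check_spark_env()` in a fresh terminal should return an empty list if the `.bashrc` changes were picked up; any returned strings describe what still needs fixing.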
## Start spark server
Run `start_server.sh` to start Spark. It automatically trains the model on startup and exposes an API for finding the sentiment of a given tweet.
## Test server
```shell script
# json request
curl -X POST -d '{"sentence":"التعلم الرقمي من خلال التسجيل الرقمي افتتاح الموقع قري"}' -H "Content-Type: application/json" http://0.0.0.0:5432/analyse
```
If you want to test it with PyCharm, you can run the `api_test.http` file.
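The same request can also be sent from Python. The sketch below uses only the standard library and explicitly encodes the JSON body as UTF-8 so the Arabic text survives the round trip. The endpoint and the `sentence` field are taken from the curl example; `build_request` and `analyse` are hypothetical helper names, and since the response format is not documented here, the response is returned as raw text.

```python
import json
import urllib.request

API_URL = "http://0.0.0.0:5432/analyse"  # endpoint from the curl example


def build_request(sentence, url=API_URL):
    """Build a POST request whose JSON body is UTF-8 encoded."""
    # ensure_ascii=False keeps the Arabic characters instead of \uXXXX escapes.
    body = json.dumps({"sentence": sentence}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyse(sentence, url=API_URL):
    """Send the sentence to the running server and return the raw response text."""
    with urllib.request.urlopen(build_request(sentence, url)) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(analyse("قراءة المزيد"))` should print whatever the API returns for that sentence.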
## Note
Postman has encoding issues with Arabic-script languages such as Urdu, so use curl or PyCharm to test the API instead. Alternatively, you can integrate it with your app or website.
## Create database to store user tweets
```python
import findspark

# Point findspark at the local Spark installation (adjust the path to your own).
findspark.init("/home/irfan/spark")

import pyspark as ps
from pyspark.sql import SQLContext

# Start a local Spark context with two worker threads.
sc = ps.SparkContext('local[2]')
sqlContext = SQLContext(sc)

csv_file = "./user_tweets.csv"

# Create the database if needed and make it the current one.
sqlContext.sql("CREATE DATABASE IF NOT EXISTS sentiment")
sqlContext.sql("USE sentiment")

# Load the tweets from the CSV file, treating the first row as the header.
df = (sqlContext.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "true")
      .load(csv_file))

# Save the DataFrame as a table in the current database.
schema = "tweet varchar(512)"
df.write.saveAsTable("user_tweets", schema=schema)
# df.write.format("csv").saveAsTable("user_tweets", schema=schema)

# Read the table back, either from the warehouse files or through SQL.
df = sqlContext.read.load("spark-warehouse/sentiment.db/user_tweets")
df_sfo = sqlContext.sql("SELECT * FROM user_tweets")
tbl = sqlContext.read.format("parquet").load("spark-warehouse/sentiment.db/user_tweets")

# Insert a new tweet into the table.
tbl.sql_ctx.sql("INSERT INTO user_tweets VALUES ('{}')".format('قراءة المزيد')).show()
```
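The snippet above expects a `user_tweets.csv` file with a header row. A minimal sketch for producing one is shown below, assuming a single `tweet` column as the schema string suggests; `write_tweets_csv` is a hypothetical helper name and the actual file layout used by the repo may differ.

```python
import csv


def write_tweets_csv(tweets, path="./user_tweets.csv"):
    """Write tweets to a UTF-8 CSV file with a single `tweet` header column."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["tweet"])  # header row read by option("header", "true")
        for tweet in tweets:
            writer.writerow([tweet])
    return path
```

For example, `write_tweets_csv(["قراءة المزيد"])` creates the file in the current directory, ready to be loaded by the Spark code.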