https://github.com/javaidiqbal11/arabic-tweets-sentiment-analysis-using-spark

This repo is for Twitter Arabic dataset for sentiment analysis using Apache Spark.

apache-spark arabic-nlp arabic-tweets flask pyhton3 sentiment-analysis spark twitter-api


Host: GitHub
URL: https://github.com/javaidiqbal11/arabic-tweets-sentiment-analysis-using-spark
Owner: javaidiqbal11
Created: 2021-01-27T16:55:00.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2021-01-27T17:01:56.000Z (over 4 years ago)
Last Synced: 2025-01-03T00:35:35.389Z (9 months ago)
Topics: apache-spark, arabic-nlp, arabic-tweets, flask, pyhton3, sentiment-analysis, spark, twitter-api
Language: Python
Homepage: https://www.jtech.com.pk/
Size: 376 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


## Installation Process

Spark runs on the Java Virtual Machine (JVM), so Java must be installed on your system before you can run Spark.

Download Apache Spark, uncompress it in a directory of your choice, and save the extracted folder as the `spark` directory.

Add environment variables for Spark to `.bashrc` if you are using Linux or another Unix-based OS. For example, if Spark was downloaded into the `irfan` home directory, add these lines to the `.bashrc` file:

```shell script
export SPARK_HOME=/home/irfan/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export PYSPARK_PYTHON=python3
```

Test your Spark installation by running the following command in a terminal:

```shell script

pyspark

```

This starts the PySpark interactive Python shell.
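
If `pyspark` fails to start, it can help to confirm that the prerequisites described above are in place. The following is a minimal sketch, not part of this repository: the helper name `check_spark_environment` is hypothetical, and it only checks that a `java` executable is on the `PATH` and that `SPARK_HOME` points to a directory containing `bin/pyspark`.

```python
import os
import shutil


def check_spark_environment(env=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    env = os.environ if env is None else env
    problems = []

    # Spark needs a Java runtime, so look for `java` on the given PATH.
    if shutil.which("java", path=env.get("PATH", "")) is None:
        problems.append("java not found on PATH")

    # SPARK_HOME should point at the uncompressed Spark directory.
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isfile(os.path.join(spark_home, "bin", "pyspark")):
        problems.append("bin/pyspark not found under SPARK_HOME")

    return problems
```

Calling `check_spark_environment()` with no arguments inspects the current process environment, so run it from a shell that has already loaded the updated `.bashrc`.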

## Install requirements

You need to install the required Python packages to run the training.

```shell script

pip3 install -r requirements.txt

```

Now you are good to go.

## Start Spark server

Run `start_server.sh` to start Spark. It automatically trains the model on startup and exposes an API for finding the sentiment of a given tweet.

## Test server

```shell script

# json request

curl -X POST -d '{"sentence":"التعلم الرقمي من خلال التسجيل الرقمي افتتاح الموقع قري"}' -H "Content-Type: application/json" http://0.0.0.0:5432/analyse

```

If you want to test it with PyCharm, you can run the `api_test.http` file.

## Note

Postman has encoding issues with Arabic-script languages such as Urdu, so use curl or PyCharm to test the API. Alternatively, you can integrate it with your app or website.
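
For integration into a Python application, the curl request from the Test server section can be reproduced with the standard library. This is a minimal sketch under assumptions: the function names `build_request` and `analyse` are hypothetical, the server is assumed to listen at the URL used in the curl example, and the response is returned as raw text because this README does not document its format. Encoding the JSON body explicitly as UTF-8 is intended to avoid the kind of encoding problem described in this note.

```python
import json
import urllib.request

# URL taken from the curl example; adjust host and port to your deployment.
API_URL = "http://0.0.0.0:5432/analyse"


def build_request(sentence, url=API_URL):
    """Build a POST request whose JSON body is encoded as UTF-8."""
    # ensure_ascii=False keeps Arabic characters as-is instead of \uXXXX escapes.
    body = json.dumps({"sentence": sentence}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyse(sentence, url=API_URL, timeout=10):
    """Send the sentence to the server and return the response body as text."""
    with urllib.request.urlopen(build_request(sentence, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, `print(analyse("قراءة المزيد"))` prints whatever the server returns for that sentence.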

## Create database to store user tweets

```python
import findspark

# Point findspark at the Spark installation (adjust the path to your SPARK_HOME).
findspark.init("/home/irfan/spark")

import pyspark as ps
from pyspark.sql import SQLContext

sc = ps.SparkContext('local[2]')
sqlContext = SQLContext(sc)

csv_file = "./user_tweets.csv"

# Create the database if needed and make it the current one.
sqlContext.sql("CREATE DATABASE IF NOT EXISTS sentiment")
sqlContext.sql("USE sentiment")

# Load the tweets from CSV, treating the first row as the header.
df = (sqlContext.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "true")
      .load(csv_file))

# Save the DataFrame as a table in the current database.
schema = "tweet varchar(512)"
df.write.saveAsTable("user_tweets", schema=schema)
# df.write.format("csv").saveAsTable("user_tweets", schema=schema)

# Read the table back, either from the warehouse files or through SQL.
df = sqlContext.read.load("spark-warehouse/sentiment.db/user_tweets")
df_sfo = sqlContext.sql("SELECT * FROM user_tweets")
tbl = sqlContext.read.format("parquet").load("spark-warehouse/sentiment.db/user_tweets")

# Insert a new tweet into the table.
tbl.sql_ctx.sql("INSERT INTO user_tweets VALUES ('{}')".format('قراءة المزيد')).show()
```
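
The final `INSERT` in the block above builds its SQL with string formatting, which breaks if a tweet contains a single quote. One alternative, sketched here under assumptions, is to append rows through the DataFrame API. The helper names `tweets_to_rows` and `append_tweets` are hypothetical, and the sketch assumes that `user_tweets` already exists and has a single string column, as the `schema` string suggests.

```python
def tweets_to_rows(tweets):
    """Turn raw tweet strings into one-column rows, skipping blank entries."""
    return [(t.strip(),) for t in tweets if t and t.strip()]


def append_tweets(spark, tweets, table="user_tweets"):
    """Append tweets to an existing table without building SQL strings.

    `spark` can be a SparkSession or the `sqlContext` from the block above.
    Assumes the table has exactly one string column (an assumption of this sketch).
    Returns the number of rows written.
    """
    rows = tweets_to_rows(tweets)
    if not rows:
        return 0
    df = spark.createDataFrame(rows, "tweet string")
    # insertInto matches columns by position, not by name.
    df.write.insertInto(table)
    return len(rows)
```

For example, `append_tweets(sqlContext, ['قراءة المزيد'])` would add one row to `user_tweets` in the current database.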
