https://github.com/m-taghizadeh/bigdata_projects
Projects related to Big Data course will be implemented in this repository.
big-data computer-vision dna-sequencing fake-news-detection image-captioning machine-learning transformer vision-transformer
- Host: GitHub
- URL: https://github.com/m-taghizadeh/bigdata_projects
- Owner: M-Taghizadeh
- Created: 2022-10-23T14:05:39.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-12-19T19:58:15.000Z (over 1 year ago)
- Last Synced: 2023-12-20T15:13:19.530Z (over 1 year ago)
- Topics: big-data, computer-vision, dna-sequencing, fake-news-detection, image-captioning, machine-learning, transformer, vision-transformer
- Language: Jupyter Notebook
- Homepage:
- Size: 40.4 MB
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Implementation of big data and data mining projects
In this repository, various projects in big data and data mining are implemented using different machine learning and deep learning approaches. It is intended for people interested in applying artificial intelligence and machine learning to real-world problems. The projects implemented so far are listed below.
|Project Title|Project Description|
|-------------|-------------------|
|[Image Captioning](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/Image%20Captioning)|In this project, we use deep learning with the **Vision Transformer** architecture to implement image captioning, achieving high precision and BLEU score. The Vision Transformer applies the Transformer architecture to computer vision; the Transformer itself was first proposed by Google researchers in the 2017 paper **Attention Is All You Need**. This implementation uses the **transformers** Python library from **Hugging Face**.|
|[Fake News Detection](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/Fake%20News%20Detection)|Today we are exposed to a large volume of information and news, much of which is fake and spread to serve particular interests. In this project, we use natural language processing techniques, TF-IDF vectorization, and a PassiveAggressiveClassifier to distinguish fake news from real news. We reached 93.13% accuracy.|
|[DNA Sequencing](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/DNA%20Sequencing%20Classifier)|Machine learning is widely used by researchers in **bioinformatics** and the natural sciences. In this project, we used a **Naive Bayes classifier** to classify DNA sequences, with each sequence represented using the **k-mers** technique. We reached more than 98% accuracy.|
|[Diabetes Analysis](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/Diabetes%20Analysis)|In this project, we used diabetes as a case study. First, we visualized and analyzed the dataset and then applied dimensionality reduction techniques such as PCA to it. Finally, we used the **KNN classifier** to separate healthy people from people with diabetes based on the features in the dataset.|
|[Predicting if a person likes a song or not](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/Predicting%20if%20a%20person%20likes%20a%20song%20or%20not)|In this project, we used people's interest in music as a case study. First, we visualized and analyzed the dataset and then applied dimensionality reduction techniques such as PCA to it. Finally, we used the **KNN classifier** to predict whether a person likes a song based on its features.|
|[Handling Imbalanced Data](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/Handling%20Imbalanced%20Data)|Handling imbalanced data with the **SMOTE** and **NearMiss** algorithms in Python.|
|[Dimensionality reduction](https://github.com/M-Taghizadeh/BigData_Projects/tree/master/PCA)|Dimensionality reduction with PCA in Python using the scikit-learn library.|
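
The k-mers technique mentioned for the DNA Sequencing project splits a sequence into overlapping substrings of length k, which can then be counted and fed to a text-style classifier such as Naive Bayes. The following is a minimal sketch of that idea only; the function names `kmers` and `kmer_counts`, the default `k=6`, and the example sequence are illustrative assumptions, not taken from the repository's notebook.

```python
from collections import Counter


def kmers(sequence, k=6):
    """Return all overlapping substrings of length k (lowercased).

    Hypothetical helper for illustration; k=6 is an assumed default.
    """
    sequence = sequence.lower()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_counts(sequence, k=6):
    """Count how often each k-mer occurs in the sequence."""
    return Counter(kmers(sequence, k))


# Made-up example sequence, not from the project's dataset.
example = "ATGCATGCA"
print(kmers(example, k=4))
print(kmer_counts(example, k=4))
```

Joining the k-mers with spaces turns a sequence into a "sentence" of words, which is one common way to reuse bag-of-words tooling on DNA data.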
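
The KNN classifier used in the Diabetes Analysis and song-preference projects labels a new sample by majority vote among its k nearest training samples. The sketch below shows that voting rule in plain Python under stated assumptions: the function name `knn_predict`, the Euclidean distance, and the toy points are all illustrative, and the PCA step described in the table is omitted. The projects themselves presumably rely on a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest points.

    Hypothetical helper; uses Euclidean distance as an assumed metric.
    """
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up two-feature data: class 0 clusters near (1, 1), class 1 near (5, 5).
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (1.0, 1.0)))
print(knn_predict(points, labels, (5.0, 5.0)))
```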
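
SMOTE, named in the Handling Imbalanced Data project, balances a dataset by creating synthetic minority-class samples on the line segment between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that interpolation step; the function name `smote_like_oversample`, its parameters, and the sample points are assumptions for illustration, and it is not a substitute for a full library implementation such as the one in imbalanced-learn.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper: picks a random minority point, one of its k nearest
    minority neighbours, and a random position on the segment between them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class samples with two features each.
minority_samples = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8)]
new_points = smote_like_oversample(minority_samples, n_new=5)
print(new_points)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region the minority class already occupies.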