Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/anurima-saha/yelp_review_classification_with_roberta
The project begins by web scraping Yelp reviews and ratings with 'BeautifulSoup' in Python, followed by natural language processing (text cleaning, stopword removal, tokenization, and lemmatization) using 'NLTK'. RoBERTa from 'HuggingFace' is then fine-tuned for text classification with early stopping and regularization using PyTorch.
beautifulsoup4 deep-learning early-stopping fine-tuning huggingface-transformers large-language-model natural-language-processing nltk-python pytorch roberta-model roberta-tokenizer webscapping
Last synced: 4 days ago
- Host: GitHub
- URL: https://github.com/anurima-saha/yelp_review_classification_with_roberta
- Owner: anurima-saha
- License: mit
- Created: 2024-12-22T05:23:43.000Z (12 days ago)
- Default Branch: main
- Last Pushed: 2024-12-22T05:39:49.000Z (12 days ago)
- Last Synced: 2024-12-22T06:21:26.057Z (12 days ago)
- Topics: beautifulsoup4, deep-learning, early-stopping, fine-tuning, huggingface-transformers, large-language-model, natural-language-processing, nltk-python, pytorch, roberta-model, roberta-tokenizer, webscapping
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Yelp_Review_Classification_with_RoBERTa
The project begins by web scraping Yelp reviews and ratings with 'BeautifulSoup' in Python, followed by natural language processing (text cleaning, stopword removal, tokenization, and lemmatization) using 'NLTK'. RoBERTa from 'HuggingFace' is then fine-tuned for text classification with early stopping and regularization using PyTorch.
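
The README does not say how early stopping is implemented, so the following is only a minimal, framework-agnostic sketch of one common approach: stop when the validation loss has not improved for a fixed number of epochs. The names `EarlyStopping`, `patience`, `min_delta`, and `epochs_run` are hypothetical and are not taken from the repository.

```python
class EarlyStopping:
    """Patience-based early stopping on validation loss (hypothetical helper)."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember the new best loss and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            # No improvement this epoch.
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=2):
    """Return how many epochs would run for a given sequence of validation losses."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)


# Illustrative losses only (not results from the project).
print(epochs_run([0.9, 0.7, 0.72, 0.71, 0.6], patience=2))
```

In a fine-tuning loop, `step` would typically be called once per epoch after evaluating on a held-out validation split, and the checkpoint with the best validation loss would be the one kept.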