https://github.com/jordhy97/final_project
Aspect and opinion terms extraction for hotel's review from AiryRooms in Bahasa Indonesia
- Host: GitHub
- URL: https://github.com/jordhy97/final_project
- Owner: jordhy97
- Created: 2018-12-06T10:51:03.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-07-03T14:29:14.000Z (almost 5 years ago)
- Last Synced: 2024-01-21T03:06:33.595Z (5 months ago)
- Language: Python
- Size: 31.1 MB
- Stars: 15
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
Lists
- Awesome-Indonesia-NLP - Aspect and Opinion Terms Extraction for Hotel Reviews
README
# final_project
Aspect and opinion terms extraction for hotel's review from AiryRooms in Bahasa Indonesia

## Corpus description
The corpus is located in the folder data/reviews. It consists of 5,000 reviews (78,604 tokens), split into train.txt (4,000 reviews) and test.txt (1,000 reviews). The table below shows the label distribution of the corpus.

| Label | train.txt | test.txt |
| ------------- |-------------:| -----:|
| B-ASPECT | 7005 | 1758 |
| I-ASPECT | 2292 | 584 |
| B-SENTIMENT | 9646 | 2384 |
| I-SENTIMENT | 4265 | 1067 |
| OTHER | 39897 | 9706 |
| Total | 63105 | 15499 |

reviews.txt contains the raw reviews, and reviews_preprocessed.txt contains the preprocessed reviews that are used to train the word embedding.
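
As a quick sanity check on the label distribution table above, the following minimal sketch recomputes the per-split totals and each label's share of the corpus. The counts are copied from the table; the names `LABEL_COUNTS`, `split_totals`, and `label_shares` are illustrative only and are not part of this repository.

```python
# Label counts copied from the table above: label -> (train.txt, test.txt).
# The dictionary and function names are hypothetical, chosen for illustration.
LABEL_COUNTS = {
    "B-ASPECT": (7005, 1758),
    "I-ASPECT": (2292, 584),
    "B-SENTIMENT": (9646, 2384),
    "I-SENTIMENT": (4265, 1067),
    "OTHER": (39897, 9706),
}


def split_totals(counts):
    """Return the total number of labeled tokens in (train, test)."""
    train = sum(train_count for train_count, _ in counts.values())
    test = sum(test_count for _, test_count in counts.values())
    return train, test


def label_shares(counts):
    """Return each label's share of all tokens (train + test), in percent."""
    grand_total = sum(train + test for train, test in counts.values())
    return {
        label: 100.0 * (train + test) / grand_total
        for label, (train, test) in counts.items()
    }


train_total, test_total = split_totals(LABEL_COUNTS)
print(train_total, test_total, train_total + test_total)  # 63105 15499 78604

for label, share in label_shares(LABEL_COUNTS).items():
    print(f"{label:<12} {share:5.1f}%")
```

The totals match the Total row and the 78,604 tokens stated in the corpus description. The shares show that OTHER accounts for roughly 63% of all tokens, so the labels are noticeably imbalanced.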