https://github.com/edserranoc/nlp_rest_mex2025
nlp nltk programming-contest transformers
- Host: GitHub
- URL: https://github.com/edserranoc/nlp_rest_mex2025
- Owner: edserranoc
- License: gpl-3.0
- Created: 2025-03-16T18:09:21.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-03-31T04:18:13.000Z (3 months ago)
- Last Synced: 2025-03-31T05:22:34.663Z (3 months ago)
- Topics: nlp, nltk, programming-contest, transformers
- Language: Jupyter Notebook
- Homepage:
- Size: 43.9 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Rest-Mex: Research on Sentiment Analysis Task for Mexican Tourist Texts 🇲🇽
The goal of this task is to analyze Spanish-language TripAdvisor reviews and classify each one along three key aspects:
- sentiment polarity
- type of site
- associated Pueblo Mágico

Each review contains valuable information about a traveler's experience, and our objective is to extract meaningful insights from it. First, we determine the sentiment polarity of the review by assigning it a rating from 1 (very negative) to 5 (very positive), based on the original score given by the tourist. This helps in understanding overall visitor satisfaction.

Next, we classify the review according to the type of site being reviewed. The review could describe a hotel, a restaurant, or an attraction, and this categorization is based on contextual keywords and available metadata.
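
As a rough illustration of keyword-based categorization, here is a minimal sketch that counts keyword hits per site type. The keyword lists, the `guess_site_type` name, and the fallback to `Attractive` are assumptions made up for this example; they are not taken from the task definition or from the notebooks in this repository.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
KEYWORDS = {
    "Hotel": {"hotel", "habitación", "recepción", "hostal", "cama"},
    "Restaurant": {"restaurante", "comida", "mesero", "platillo", "desayuno"},
    "Attractive": {"museo", "playa", "iglesia", "ruinas", "cascada"},
}


def guess_site_type(text, default="Attractive"):
    """Return the site type whose keywords occur most often in the text.

    Falls back to `default` when no keyword matches at all.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = {
        label: sum(token in words for token in tokens)
        for label, words in KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default


print(guess_site_type("La habitación del hotel estaba limpia"))
```

A baseline like this is only a sanity check; a trained classifier would normally replace it.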
## Training Dataset
- File: Rest-Mex_2025_Train.csv
- Size: 208,051 instances (70% of the original dataset)
- Columns:
  - Title: The title given by the tourist to their opinion (Text).
  - Review: The full review written by the tourist (Text).
  - Polarity: The sentiment polarity of the review (1 to 5).
  - Town: The town where the review is focused (Text).
  - Region: The Mexican state where the town is located (Text). This feature is not a classification target but can provide additional information.
  - 🍽️ Type: The category of the reviewed place (Hotel, Restaurant, Attractive).

To access the data, I recommend registering for the contest at [Rest-Mex 2025](https://sites.google.com/cimat.mx/rest-mex-2025/).
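
Once you have the file, a quick schema check can catch problems early. The sketch below uses only the standard library and assumes the CSV is comma-separated, has a header row with exactly the column names listed above, and stores polarity as `5` or `5.0`; none of this is confirmed by the contest documentation. The `read_reviews` helper and the two sample rows are made up for illustration.

```python
import csv
import io

EXPECTED_COLUMNS = ["Title", "Review", "Polarity", "Town", "Region", "Type"]
VALID_TYPES = {"Hotel", "Restaurant", "Attractive"}


def read_reviews(handle):
    """Read reviews from an open text file and validate the labels."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        polarity = int(float(row["Polarity"]))  # tolerate "5" and "5.0"
        if not 1 <= polarity <= 5:
            raise ValueError(f"line {line_no}: polarity {polarity} out of range")
        if row["Type"] not in VALID_TYPES:
            raise ValueError(f"line {line_no}: unknown type {row['Type']!r}")
        row["Polarity"] = polarity
        rows.append(row)
    return rows


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    "Title,Review,Polarity,Town,Region,Type\n"
    "Excelente,La habitación estaba limpia,5.0,Tulum,QuintanaRoo,Hotel\n"
    "Regular,La comida llegó fría,2,Tulum,QuintanaRoo,Restaurant\n"
)
rows = read_reviews(sample)
print(len(rows))
```

For the real data, the same function could be called on `open("Rest-Mex_2025_Train.csv", encoding="utf-8", newline="")`, assuming the file is UTF-8 encoded.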