{"id":19607059,"url":"https://github.com/bookingcom/ml-dataset-reviews","last_synced_at":"2026-03-04T10:31:47.925Z","repository":{"id":233386279,"uuid":"753617423","full_name":"bookingcom/ml-dataset-reviews","owner":"bookingcom","description":null,"archived":false,"fork":false,"pushed_at":"2024-07-11T14:07:54.000Z","size":7,"stargazers_count":2,"open_issues_count":1,"forks_count":2,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-07-08T12:55:09.908Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bookingcom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-06T13:31:20.000Z","updated_at":"2025-04-19T22:43:49.000Z","dependencies_parsed_at":"2024-04-16T02:10:20.751Z","dependency_job_id":"fd2ff6b8-60c6-4a59-bf85-de2f7c62e2de","html_url":"https://github.com/bookingcom/ml-dataset-reviews","commit_stats":null,"previous_names":["bookingcom/ml-dataset-reviews"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bookingcom/ml-dataset-reviews","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bookingcom%2Fml-dataset-reviews","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bookingcom%2Fml-dataset-reviews/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bookingcom%2Fml-dataset-reviews/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bookingcom%2Fml-dataset-reviews/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/bookingcom","download_url":"https://codeload.github.com/bookingcom/ml-dataset-reviews/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bookingcom%2Fml-dataset-reviews/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30078308,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T08:01:56.766Z","status":"ssl_error","status_checked_at":"2026-03-04T08:00:42.919Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T10:08:37.037Z","updated_at":"2026-03-04T10:31:47.881Z","avatar_url":"https://github.com/bookingcom.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Booking.com Accommodation Review Dataset\n\nThis repository contains the training set of the user-generated review dataset of Booking.com reviews. The training set contains about 1.6M reviews from 40k accommodations around the world. All reviews were written by guests who stayed at the accommodation.\n\nThe dataset consists of English reviews published in 2023. All reviews have passed a moderation process ensuring they are genuine and do not violate the platform guidelines. 
In order to preserve user privacy, no personally identifiable information was included in the data. Similarly, to protect business-sensitive statistics, the dataset is limited to only tens of thousands of accommodations. Finally, we selected only informative reviews that include at least 3 topics based on the [Text2topic model](https://arxiv.org/pdf/2310.14817).\n\nThe following table describes the fields in the dataset:\n|Column                     | Description                                                          |\n| ------------------------- |:--------------------------------------------------------------------:|\n| review_title              | The title of the review                                              |\n| review_positive           | Positive (\"liked\") section in review.                                |\n| review_negative           | Negative (\"disliked\") section in review.                             |\n| guest_score               | Review score for the stay                                            |\n| review_helpful_votes      | How many users marked the review as helpful                          |\n| guest_type                | One of 4 traveller types: Solo traveller (1 adult) /\u003cbr\u003eCouple (2 adults) / Group (\u003e2 adults) / Family with\u003cbr\u003echildren (adults \u0026 children) |\n| guest_country             | Anonymized country from which the reservation was made               |\n| room_nights               | The length of the reservation, i.e. number of nights booked          |\n| month                     | The month of the check-in date of the reservation                    |\n| accommodation_id          | An anonymized accommodation ID                                       |\n| accommodation_type        | The type of the accommodation, e.g. 
hotel, apartment, hostel         |\n| accommodation_score       | The overall average guest review score for the accommodation         |\n| accommodation_country     | Country of the accommodation                                         |\n| accommodation_star_rating | Accommodation star rating is provided by the property, and is\u003cbr\u003eusually determined by an official accommodation rating\u003cbr\u003eorganisation or another third party |\n| location_is_beach         | Is the accommodation located in a beach location                     |\n| location_is_ski           | Is the accommodation located in a ski location                       |\n| location_is_city_center   | Is the accommodation located in the city center                      |\n\n## License\nThe dataset is published under the following non-commercial [license](https://creativecommons.org/licenses/by-sa/4.0/deed.en)\n\n## IMPORTANT NOTE\nIf you encounter any issue fetching the data with git-lfs, the data is also available through [HuggingFace](https://huggingface.co/datasets/efainman/booking-reviews-dataset/tree/main)\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbookingcom%2Fml-dataset-reviews","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbookingcom%2Fml-dataset-reviews","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbookingcom%2Fml-dataset-reviews/lists"}