https://github.com/vicba/schoolproject
Assignment for course Machine Learning at Howest university of Applied sciences
https://github.com/vicba/schoolproject
Last synced: about 2 months ago
JSON representation
Assignment for course Machine Learning at Howest university of Applied sciences
- Host: GitHub
- URL: https://github.com/vicba/schoolproject
- Owner: Vicba
- License: mit
- Created: 2023-02-16T13:07:30.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-13T21:13:49.000Z (about 2 years ago)
- Last Synced: 2025-02-16T04:27:24.092Z (4 months ago)
- Language: HTML
- Homepage:
- Size: 15.6 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: License.md
Awesome Lists containing this project
README
# SchoolProject
## Context - drunk smurfs
Among all international hotel guests, Smurfs are burdened with the upkeep of a singular reputation:
they are (supposedly) the rowdiest bunch one can entertain, and are equally well-known for
unbridled spending as for racking up extensive costs in damages to hotel infrastructure, staff, and
occasionally also other guests – costs which typically cannot be recovered once the guest has sought
out the safety of his (or her) homeland.
It is your job as a data scientist to screen applying Smurfs clients for an exclusive hotel in the
Bahamas - yes, it's the kind of hotel you need to apply for!## Data
At your disposal is a training set containing data about the behavior of 5000 Smurf hotel guests
(train_V2.csv). This data set contains information about the profit the hotel made during their last
visit (excluding damages), but also whether they caused damages during their last visit, and for what
amount. These outcomes are respectively called 'outcome_profit', 'outcome_damage_inc', and
'outcome_damage_amount'. To predict them, you have access to a host of personal information:
previous history of profits and damages, use of hotel facilities, socio-demographics and behavioral
scores from the staff of other hotels within the hotel chains. A minor description of features is
available in dictionary.csv.
You also get information on the 500 applicants for the 2024 season (score.csv).
It is your job to return a list of 150 clients that offer an attractive balance between projected profit for the hotel, and
anticipated damages.
You will notice the data set contains a large number of oddities. You are expected to think yourself
about what is intuitive and acceptable in terms of approach, and to provide some minor reflection
on this in your technical report.