https://github.com/mbbrainz/datamining_expedia-ranking

Expedia Ranking assignment for Datamining techniques. Implemented XGB Ranker Model.

datamining python ranking-algorithm xgboost

# dmt-a2-group155
epic stuff

## 2021 winners and their tekkers

Performance was measured with the following evaluation metric:
Normalized Discounted Cumulative Gain (NDCG)
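
As a rough illustration of the metric, here is a minimal NDCG@k sketch in plain Python. The exponential gain `2**rel - 1` and the relevance grades (5 = booked, 1 = clicked, 0 = ignored) are assumptions made for the example; they are not stated in this README, and the function names are made up here.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items, in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so position = rank + 1
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance grades listed in the order the model ranked the hotels
# (assumed grades: 5 = booked, 1 = clicked, 0 = ignored).
perfect = ndcg_at_k([5, 1, 0, 0], k=4)
booked_last = ndcg_at_k([0, 0, 1, 5], k=4)
```

A ranking that puts the booked hotel first scores 1.0, and the score drops as the booked hotel moves down the list.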

1st place:
--------------------
model: Gradient Boosting Machines (GBM)

Two types of models:
- without EXP features (A)
  - 5000 elementary trees
  - 30 hours to train
- with EXP features (B)
  - 2500 elementary trees
  - 20 hours to train

Most important features:
- Position
- Price
- Location desirability (ver. 2)

Downsampling negative instances improves both
training time and predictive performance
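
The notes above do not say how the winner downsampled, so the following is only one possible sketch: keep every positive row and a uniform random fraction of the negative rows. The column names `srch_id` and `click_bool`, the `keep_frac` value, and the helper name are assumptions for illustration.

```python
import pandas as pd

def downsample_negatives(df, label_col="click_bool", keep_frac=0.2, seed=0):
    """Keep every positive row and a random fraction of the negative rows."""
    positives = df[df[label_col] == 1]
    negatives = df[df[label_col] == 0].sample(frac=keep_frac, random_state=seed)
    # Restore the original row order so rows of one search stay together.
    return pd.concat([positives, negatives]).sort_index()

# Toy data: two searches, one clicked hotel each (assumed column names).
toy = pd.DataFrame({
    "srch_id": [1] * 6 + [2] * 6,
    "click_bool": [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
})
sampled = downsample_negatives(toy, keep_frac=0.5)
```

Only the training set should be downsampled; the validation and test sets need the full candidate lists so that NDCG is computed on realistic rankings.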

2nd place winner:
------------------

model: LambdaMART

LambdaMART is a learning-to-rank algorithm based
on Multiple Additive Regression Trees (MART).
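
To give a feel for what the boosted trees are fitted to, here is a simplified sketch of the per-document "lambda" values for a single search: each pair of hotels with different relevance contributes a RankNet-style weight scaled by how much NDCG would change if the two swapped positions. This is a teaching sketch, not the 2nd place implementation; `sigma`, the exponential gain, and all function names are assumptions.

```python
import math

def gain(rel):
    return 2 ** rel - 1

def discount(position):
    """Position is 1-based."""
    return 1.0 / math.log2(position + 1)

def lambda_gradients(scores, relevances, sigma=1.0):
    """Return one lambda per document for a single query.

    Sign convention used here: a positive lambda means "raise this score".
    """
    n = len(scores)
    # Current 1-based rank of each document when sorted by descending score.
    order = sorted(range(n), key=lambda i: -scores[i])
    position = {doc: rank + 1 for rank, doc in enumerate(order)}
    ideal_dcg = sum(
        gain(rel) * discount(pos + 1)
        for pos, rel in enumerate(sorted(relevances, reverse=True))
    )
    lambdas = [0.0] * n
    if ideal_dcg == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if relevances[i] <= relevances[j]:
                continue  # only pairs where i should rank above j
            # |change in NDCG| if documents i and j swapped positions.
            delta_ndcg = abs(
                (gain(relevances[i]) - gain(relevances[j]))
                * (discount(position[i]) - discount(position[j]))
            ) / ideal_dcg
            # Pairwise weight: close to 1 when i is scored well below j.
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * rho * delta_ndcg
            lambdas[j] -= sigma * rho * delta_ndcg
    return lambdas

# The most relevant hotel (index 0) currently has the lowest score.
lams = lambda_gradients(scores=[0.2, 1.5, 0.7], relevances=[5, 0, 1])
```

In a MART setting, each boosting round would fit a regression tree to values like these, so the booked hotel that is ranked too low gets pushed up and the irrelevant hotel at the top gets pushed down.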

## Incorporated methods:

1.) Scan for columns whose NaN count is greater than 60% of the total number of rows
https://www.kaggle.com/code/vishalkasa/feature-engineering-k-means

2.) Remove the users who did not book a hotel
https://www.kaggle.com/code/jiaofenx/expedia-hotel-recommendations

3.) Look at when the bookings were made, i.e. weekdays vs. Saturday

4.) Example using K-means and various plots for data understanding
https://www.kaggle.com/code/putdejudomthai/expedia-exploratory-data-destination-search

5.) Use chi-squared feature analysis as well as PCA
https://medium.com/@zander.b.tedjo/expedia-hotel-recommendations-using-machine-learning-9a8eccd4ecba
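
Steps 1 to 3 of the list above could look roughly like this in pandas. The column names (`srch_id`, `date_time`, `booking_bool`, `visitor_hist_starrating`) and the helper names are assumptions for illustration. Step 2 is grouped by search here because the toy data has no user id; swap in a user column where one exists.

```python
import pandas as pd

def high_nan_columns(df, threshold=0.6):
    """Names of columns whose share of NaN values exceeds the threshold."""
    nan_share = df.isna().mean()
    return nan_share[nan_share > threshold].index.tolist()

def searches_with_booking(df, group_col="srch_id", booking_col="booking_bool"):
    """Keep only the rows of groups that contain at least one booking."""
    has_booking = df.groupby(group_col)[booking_col].transform("max") == 1
    return df[has_booking]

def add_weekday_features(df, time_col="date_time"):
    """Add day-of-week (Monday=0) and an is-Saturday flag from a timestamp column."""
    out = df.copy()
    timestamps = pd.to_datetime(out[time_col])
    out["day_of_week"] = timestamps.dt.dayofweek
    out["is_saturday"] = (out["day_of_week"] == 5).astype(int)
    return out

# Toy data: search 1 happened on a Saturday and ended in a booking.
toy = pd.DataFrame({
    "srch_id": [1, 1, 2, 2],
    "date_time": ["2013-04-06 10:00:00", "2013-04-06 10:00:00",
                  "2013-04-08 09:30:00", "2013-04-08 09:30:00"],
    "booking_bool": [0, 1, 0, 0],
    "visitor_hist_starrating": [None, None, None, 4.0],
})

sparse_cols = high_nan_columns(toy)    # columns with more than 60% NaN
booked = searches_with_booking(toy)    # rows of searches with a booking
dated = add_weekday_features(toy)      # adds day_of_week and is_saturday
```

What to do with the columns flagged in step 1 (drop them, impute them, or turn them into "is missing" indicators) is a separate decision that the list above leaves open.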

# questions 13-05
how to evaluate scores?
- from sklearn.metrics import ndcg_score
- booking
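
One way to use `ndcg_score` here, sketched under assumptions: score each search separately and average the results. The relevance grades (5 = booked, 1 = clicked), the `mean_ndcg` helper, and the toy data are illustrative. `ndcg_score` expects 2-D input with one row per query, so each search is wrapped in a list.

```python
import numpy as np
from sklearn.metrics import ndcg_score

def mean_ndcg(srch_ids, y_true, y_score, k=None):
    """Average ndcg_score over searches; srch_ids marks which rows belong together."""
    srch_ids = np.asarray(srch_ids)
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    per_search = []
    for srch_id in np.unique(srch_ids):
        mask = srch_ids == srch_id
        if mask.sum() < 2:
            continue  # NDCG needs at least two documents to compare
        # One row per query: wrap the 1-D arrays of this search in a list.
        per_search.append(ndcg_score([y_true[mask]], [y_score[mask]], k=k))
    return float(np.mean(per_search)) if per_search else float("nan")

srch_ids = [1, 1, 1, 2, 2, 2]
relevance = [5, 1, 0, 0, 0, 5]  # assumed grades: 5 = booked, 1 = clicked
good_scores = [0.9, 0.5, 0.1, 0.2, 0.1, 0.8]
bad_scores = [0.1, 0.5, 0.9, 0.8, 0.2, 0.1]

good = mean_ndcg(srch_ids, relevance, good_scores)
bad = mean_ndcg(srch_ids, relevance, bad_scores)
```

Note that scikit-learn uses the relevance values directly as gains, so its numbers can differ from an NDCG variant that uses `2**rel - 1`.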