https://github.com/aenguerrand/thu-ml-hw3

Tsinghua project - Machine Learning course - Paper clustering

clustering machine-learning tsinghua


# thu-ml-hw3
## Usage
Put all dataset files in a directory named "data", then run all cells of the Jupyter notebook.
The results are written to the file "result.json".

### Usage (script version)
````bash
python3 hw3.py
````
## Model description
For this competition, I used a clustering method built on sklearn tools.
Clustering is based on the names of co-authors, with a pre-processing step that merges first name and last name.

The clusters are built using the "cosine" metric, with the other parameters tuned to obtain a better score (0.594).
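
The idea can be illustrated with a minimal sketch. This is not the code from `hw3.py`: the helper names (`merge_name`, `cluster_papers`), the toy co-author lists, the choice of `DBSCAN` as the sklearn estimator, and the `eps` value are all assumptions made for illustration. Only the use of sklearn, co-author names with merged first and last names, and the "cosine" metric come from the description above.

```python
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import CountVectorizer


def merge_name(name):
    """Hypothetical pre-processing: lowercase and join first name and last name."""
    return "".join(name.lower().split())


def cluster_papers(papers, eps=0.6):
    """Cluster papers by their co-author lists using cosine distance.

    `papers` is a list of co-author name lists, one list per paper.
    DBSCAN and the eps value are assumptions, not the author's settings.
    """
    # Each paper becomes a binary bag of merged co-author names.
    vectorizer = CountVectorizer(
        analyzer=lambda names: [merge_name(n) for n in names],
        binary=True,
    )
    features = vectorizer.fit_transform(papers)

    # min_samples=1 means every paper is assigned to a cluster (no noise label).
    model = DBSCAN(eps=eps, min_samples=1, metric="cosine")
    return model.fit_predict(features).tolist()


# Toy data (invented): papers 0 and 1 share a co-author, as do papers 2 and 3.
papers = [
    ["Li Ming", "Wang Fang"],
    ["Li  Ming", "Chen Jie"],
    ["John Smith", "Mary Jones"],
    ["Mary Jones", "Paul Brown"],
]
labels = cluster_papers(papers)
print(labels)
```

In this toy example, papers that share a co-author are at cosine distance 0.5 from each other and papers with no co-author in common are at distance 1, so a threshold between the two groups them by shared co-authors. On the real dataset the threshold and the other parameters would need tuning against the competition score.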