https://github.com/aenguerrand/thu-ml-hw3
Tsinghua project - Machine Learning course - Paper clustering
clustering machine-learning tsinghua
- Host: GitHub
- URL: https://github.com/aenguerrand/thu-ml-hw3
- Owner: AEnguerrand
- License: mit
- Created: 2019-01-12T12:18:19.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-01-14T06:06:51.000Z (over 6 years ago)
- Last Synced: 2025-05-20T02:10:06.332Z (5 months ago)
- Topics: clustering, machine-learning, tsinghua
- Language: Jupyter Notebook
- Homepage:
- Size: 86.9 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# thu-ml-hw3
## Usage
Put all dataset files in a directory named `data`, then run all cells of the Jupyter notebook.
The result file created is `result.json`.

### Usage (script version)
````bash
python3 hw3.py
````
## Model description
For this competition, I used a clustering method built with sklearn tools.
Clustering is based on co-author names, with a pre-processing step that merges first name and last name into a single token. The clusters are built with the "cosine" metric, and the other parameters were tuned to obtain a better score (0.594).
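
The approach described above could be sketched roughly as follows. This is only an illustration, not the code from the notebook: the README does not say which sklearn estimator or parameters were used, so `DBSCAN`, the `eps` value, and the helper names `merge_name` and `cluster_papers` are all assumptions, and the toy data is made up.

```python
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import CountVectorizer


def merge_name(name):
    """Merge first name and last name into one lowercase token.

    Hypothetical pre-processing: 'Wei Zhang' -> 'wei_zhang'.
    """
    cleaned = name.lower().replace(".", " ").replace("-", " ")
    return "_".join(cleaned.split())


def cluster_papers(coauthor_lists, eps=0.5):
    """Cluster papers by their co-author names using cosine distance.

    DBSCAN and eps=0.5 are assumptions chosen for illustration only.
    """
    # One "document" per paper: its merged co-author tokens, space separated.
    docs = [" ".join(merge_name(n) for n in names) for names in coauthor_lists]
    # Split on whitespace only, so each merged name stays a single feature.
    vectorizer = CountVectorizer(
        tokenizer=str.split, lowercase=False, token_pattern=None
    )
    vectors = vectorizer.fit_transform(docs)
    # min_samples=1 means every paper is assigned to some cluster (no noise label).
    model = DBSCAN(eps=eps, min_samples=1, metric="cosine")
    return model.fit_predict(vectors)


# Toy example: four papers, two groups of co-authors.
papers = [
    ["Wei Zhang", "Li Na"],
    ["Wei  Zhang", "Li Na", "J. Smith"],
    ["Anna Keller", "Tom Berg"],
    ["Tom Berg", "Anna Keller"],
]
labels = cluster_papers(papers)
```

In this toy run, the first two papers share most of their co-authors and should land in one cluster, while the last two form another. Merging each name into a single token keeps a common first or last name from matching across unrelated authors.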