https://github.com/generousman/zhihurecapp

A Flask App For Analyzing ZhihuRec Dataset.


Host: GitHub
URL: https://github.com/generousman/zhihurecapp
Owner: GenerousMan
Created: 2022-11-01T08:26:00.000Z (over 3 years ago)
Default Branch: master
Last Pushed: 2022-11-21T17:53:35.000Z (over 3 years ago)
Last Synced: 2023-05-23T10:41:26.878Z (about 3 years ago)
Language: Python
Size: 11.9 MB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# ZhihuRec Data-mining

A Flask app for analyzing the ZhihuRec dataset.

## Requirements

``` bash

 pip install -r requirements.txt

```

## Usage

- [Dataset] Put the ZhihuRec dataset in the root directory.

- [Work Path] Set the working directory to the repository root.

- [Preprocess] Run `tools/io.py` to convert `answer_infos.txt` into CSV files.

`1.` First, run this command to generate the answers' CSV files:

``` bash

 python tools/io.py

```

Alternatively, download the preprocessed files from here:

```

Baidu NetDisk 

Link:https://pan.baidu.com/s/1Ey-R9yo6_HNuoZuhEJivjg 

Code: 8rc7

```

Unzip the archive and put the `answer_csv` folder into `source/`.

`2.` Then run the Flask app with this command:
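The README does not show how `tools/io.py` lays out its output, only that the CSV files are sorted and that each file name is the smallest answer index it contains. As a rough illustration of that naming scheme (not the repository's actual code), the sketch below writes index-sorted rows into chunked CSV files and then finds the file that holds a given index. The function names `split_into_csv` and `find_csv`, the `(index, text)` row shape, and the `chunk_size` parameter are all assumptions made for this example.

```python
import bisect
import csv
from pathlib import Path


def split_into_csv(rows, out_dir, chunk_size):
    """Write index-sorted rows into CSV files named by their smallest index.

    `rows` is an iterable of (index, text) pairs. This column layout is a
    placeholder and not the real answer_infos.txt schema.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    ordered = sorted(rows, key=lambda row: row[0])
    starts = []
    for offset in range(0, len(ordered), chunk_size):
        chunk = ordered[offset:offset + chunk_size]
        start = chunk[0][0]  # smallest index in this chunk becomes the file name
        with open(out_dir / f"{start}.csv", "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(chunk)
        starts.append(start)
    return starts


def find_csv(starts, answer_index):
    """Return the name of the file whose index range contains answer_index."""
    # `starts` is sorted, so the last start <= answer_index identifies the file.
    pos = bisect.bisect_right(starts, answer_index) - 1
    if pos < 0:
        raise KeyError(f"index {answer_index} precedes the first file")
    return f"{starts[pos]}.csv"
```

Because the file names are sorted start indices, a lookup only needs a binary search over the names rather than a scan through every file.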

``` bash

 python app.py

```

The Flask app will run at `127.0.0.1:5000`.

## Files

- `[model]` The tf-idf model will be saved here.

- `[source]` Processed files.

  - `[answer_csv]` Answers' CSV files. All files are sorted.

    - `[xxxx.csv]` `xxxx` is the starting (smallest) answer index in this file.

- `[tools]` Tools that help you analyze the dataset.

  - `[io.py]` Used to read/write/convert dataset.

  - `[tfidf.py]` TF-IDF algorithm. Its main functions are:

    - `train()`

    - `load_tfidf()`

    - `save_tfidf()`

    - `compare_similarity()`

- `[zhihuRec]` The dataset. Put the dataset's txt files here.

- `[app.py]` The entry point of the Flask app.

- `[preprocess.py]` Uses the code in `tools` to build the TF-IDF matrix and saves the result into `model`.
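The README lists the function names in `tools/tfidf.py` but not their signatures or internals. To give a feel for what such a module typically does, here is a minimal standard-library sketch that reuses those four names. Everything about it is an assumption: the argument lists, the smoothed IDF formula, the sparse-dict vectors, the helper `vectorize`, and the use of `pickle` for storage are illustrative choices, and the input is assumed to be already tokenized.

```python
import math
import pickle
from collections import Counter


def vectorize(tokens, idf):
    """Map a token list to a sparse {term: tf * idf} dict; unseen terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    return {t: (c / total) * idf[t] for t, c in counts.items()} if total else {}


def train(docs):
    """Fit IDF weights on tokenized documents and return (idf, vectors)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return idf, [vectorize(doc, idf) for doc in docs]


def compare_similarity(vec_a, vec_b):
    """Cosine similarity of two sparse TF-IDF vectors."""
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def save_tfidf(model, path):
    """Persist the fitted model (hypothetical storage format: pickle)."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_tfidf(path):
    """Load a model written by save_tfidf."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

In this sketch, a workflow in the spirit of `preprocess.py` would call `train()` once, store the result under `model/` with `save_tfidf()`, and later call `load_tfidf()` followed by `compare_similarity()` to score pairs of answers.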
