https://github.com/cleoold/markov_cn_node
Backend that generates random Chinese sentences from chat history
- Host: GitHub
- URL: https://github.com/cleoold/markov_cn_node
- Owner: cleoold
- License: mit
- Created: 2020-03-09T03:56:28.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-06T16:26:17.000Z (over 3 years ago)
- Last Synced: 2025-03-24T15:01:57.366Z (about 1 year ago)
- Topics: chinese, jieba, markov
- Language: TypeScript
- Homepage: https://cleoold.github.io/Markov_CN_node
- Size: 320 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
### construction site!
### Markov-Chinese
Generates random text based on the input sentences. Simple, one process, no database.
### Intro
The program starts a server that does only two things:
* receive sentences to build/update models and
* output new random Chinese sentences using the Markov chain algorithm, on request.
It can serve as a minimalist backend for a Markov chat bot using nothing but HTTP requests. Each model must have a unique name (a `string`) that distinguishes it from the others, and text is generated from the model with the requested name.
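For readers unfamiliar with the technique, here is a minimal sketch of first-order Markov chain text generation. It is not this project's implementation: the names `Model`, `train`, and `generate` are hypothetical, and sentences are passed in already split into words, whereas the project segments text with nodejieba.

```typescript
// Hypothetical sketch of a first-order Markov chain over pre-segmented words.
// Not the project's actual code; word segmentation is assumed to be done already.

type Model = Map<string, string[]>;

// Sentinel tokens marking sentence boundaries (an assumption of this sketch).
const START = "__START__";
const END = "__END__";

// Record, for every word, the words that were observed right after it.
function train(model: Model, words: string[]): void {
  const chain = [START, ...words, END];
  for (let i = 0; i < chain.length - 1; i++) {
    const followers = model.get(chain[i]) ?? [];
    followers.push(chain[i + 1]);
    model.set(chain[i], followers);
  }
}

// Walk the chain from START, picking a random observed follower at each step,
// until END is drawn or maxWords words have been emitted.
function generate(
  model: Model,
  maxWords: number = 10,
  random: () => number = Math.random
): string {
  const out: string[] = [];
  let current = START;
  while (out.length < maxWords) {
    const followers = model.get(current);
    if (!followers || followers.length === 0) break;
    const next = followers[Math.floor(random() * followers.length)];
    if (next === END) break;
    out.push(next);
    current = next;
  }
  // Chinese text is joined without spaces.
  return out.join("");
}
```

Training on more sentences gives each word more candidate followers, which is what makes the output vary between calls. The `random` parameter exists only so the walk can be made deterministic when experimenting.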
### Usage
Run `npm run start` in the *project root folder* to start the app. By default it listens on `127.0.0.1:6758`; this can be changed in `config.ts`.
The app can manage multiple models. The examples below use a model named `erika`.
#### Insert sentences / modify table
Send the following POST request to the endpoint `/sentence`:
```
POST /sentence HTTP/1.1
Host: 127.0.0.1:6758
Content-Type: application/json

{"id":"erika","message":"你好,我是马尔可夫机器人"}
// OR
{"id":"erika","message":["你好,我是马尔可夫机器人", "哈喽", "楼上是群龙王"]}
```
The request body must be a JSON object with these fields:
* field `id`: the name of the model. It cannot contain any of the characters `/\?%*:|"<>`.
* field `message`: the message passed to the model. This can be a single string or an array of strings; with an array, every string is added.
If everything works, status code `200` is returned and a folder `models/erika` is created to hold the history. `conversation.txt` records all the messages sent, while the `json` files contain the actual Markov model. Otherwise status code `400` or `500` is returned, and details are logged to the console.
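As an illustration, a client could check the `id` and assemble the POST request as sketched below. The helper names (`isValidModelId`, `buildSentenceRequest`) are hypothetical and not part of this project; only the endpoint, the field names, and the forbidden characters come from the description above.

```typescript
// Hypothetical client-side helpers; not part of the project itself.

// Characters that a model id must not contain: / \ ? % * : | " < >
const FORBIDDEN_ID_CHARS = /[\/\\?%*:|"<>]/;

function isValidModelId(id: string): boolean {
  return !FORBIDDEN_ID_CHARS.test(id);
}

interface SentenceRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the options for a POST to /sentence.
// `message` may be one sentence or an array of sentences.
function buildSentenceRequest(id: string, message: string | string[]): SentenceRequest {
  if (!isValidModelId(id)) {
    throw new Error(`invalid model id: ${id}`);
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ id, message }),
  };
}

// Usage, assuming a runtime with a global fetch (for example Node 18+)
// and the default address:
//   const res = await fetch("http://127.0.0.1:6758/sentence",
//                           buildSentenceRequest("erika", ["哈喽", "楼上是群龙王"]));
//   res.status is 200 on success, 400 or 500 otherwise.
```

Rejecting a bad `id` on the client side simply avoids a round trip; the server's own checks remain authoritative.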
#### Generate sentence
Once the model has received enough sentences, send a GET request to the endpoint `/sentence` to retrieve a generated sentence:
```
GET /sentence?id=erika&wc=20 HTTP/1.1
Host: 127.0.0.1:6758
```
* parameter `id`: the name of the model.
* parameter `wc`: the *maximum* word count of the sentence. Defaults to 10 if omitted.
If the model does not exist, status code `404` is returned. Otherwise the result JSON is returned, and it looks like
```
{
    "status": 200,
    "message": "楼上是马尔可夫机器人"
}
```
plus a status code of `200`.
Yes, that's all.
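For completeness, the GET side could be wrapped by a client roughly as follows. This is a sketch under assumptions: `buildGenerateUrl` and `parseGenerateResponse` are hypothetical names, and treating every status other than `200` and `404` as an error is a choice made here, not documented behaviour.

```typescript
// Hypothetical client-side helpers; not part of the project itself.

// Build the URL for GET /sentence. `wc` is optional; the server
// falls back to 10 when it is absent.
function buildGenerateUrl(
  id: string,
  wc?: number,
  baseUrl: string = "http://127.0.0.1:6758"
): string {
  let url = `${baseUrl}/sentence?id=${encodeURIComponent(id)}`;
  if (wc !== undefined) {
    url += `&wc=${wc}`;
  }
  return url;
}

// Shape of the JSON returned on success.
interface GenerateResult {
  status: number;
  message: string;
}

// Turn the HTTP status and raw body into a sentence.
// Returns null when the model does not exist (404).
function parseGenerateResponse(httpStatus: number, body: string): string | null {
  if (httpStatus === 404) return null;
  if (httpStatus !== 200) {
    // Assumption of this sketch: any other status is treated as a failure.
    throw new Error(`unexpected status ${httpStatus}`);
  }
  const result = JSON.parse(body) as GenerateResult;
  return result.message;
}

// Usage, assuming a runtime with a global fetch (for example Node 18+):
//   const res = await fetch(buildGenerateUrl("erika", 20));
//   const sentence = parseGenerateResponse(res.status, await res.text());
```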
#### Scripts
* `npm run start`
  Starts the server at 127.0.0.1:6758.
* `npm run dict`
  Regenerates all the dictionaries based on `conversations.txt`. Run this script after every breaking update.
### Thanks to
Thanks to [nodejieba](https://github.com/yanyiwu/nodejieba) for providing such a wonderful word segmentation library.
### TODO list
* Complete API readme docs.
* Improve English sentences (probably necessary).