https://github.com/portasynthinca3/markov
Text generation library for Elixir/Erlang based on Markov chains
- Host: GitHub
- URL: https://github.com/portasynthinca3/markov
- Owner: portasynthinca3
- License: wtfpl
- Created: 2021-09-06T13:58:21.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-10-19T02:59:22.000Z (over 1 year ago)
- Last Synced: 2024-01-27T02:42:50.792Z (12 months ago)
- Topics: context-awareness, elixir, erlang, markov-chain, nlp, text-generation
- Language: Elixir
- Size: 1.46 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Markov
Text generation library based on nth-order Markov chains
![Hex.pm](https://img.shields.io/hexpm/v/markov)
![Hex.pm](https://img.shields.io/hexpm/dw/markov)

## Features
- **Token sanitization** (optional): ignores letter case and punctuation when switching states, but still keeps the output as-is
- **Operation history** (optional): recalls the operations it was instructed to perform, including past training data
- **Probability shifting** (optional): gives less frequent generation paths a higher chance of being used, which makes the output more original but may produce nonsense
- **Tagging** (optional): you can tag your source data and alter the probabilities of tagged generation paths according to your rules
- **Prompted generation** (optional): lets your model answer questions given to it, provided that the training data consists mostly of Q&A pairs
- **Managed disk storage** so you don't have to worry about storing and loading the models
- **Transparent fragmentation** reduces RAM usage and loading times with huge models

## Usage
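For orientation before the library's own API, the ideas behind several of the features listed above (nth-order states, token sanitization, probability shifting) can be sketched in plain Elixir. This is a minimal illustration that does not use or reproduce Markov's internals: the `ChainSketch` module, its function names, and the square-root weighting used for shifting are all assumptions made for this example.

```elixir
defmodule ChainSketch do
  # Hypothetical, library-independent sketch of an nth-order Markov chain.
  # It is NOT the Markov package's implementation or API. Assumes order >= 1.

  # Sanitized form used only for state lookup: lowercase, punctuation removed.
  def sanitize(token) do
    token |> String.downcase() |> String.replace(~r/[^\p{L}\p{N}]/u, "")
  end

  # Adds one string to the chain. The chain maps a list of `order` sanitized
  # tokens to a map of %{original_next_token => count}.
  def train(chain, text, order \\ 2) do
    tokens = List.duplicate(:start, order) ++ String.split(text) ++ [:end]

    tokens
    |> Enum.chunk_every(order + 1, 1, :discard)
    |> Enum.reduce(chain, fn window, acc ->
      {state, [next]} = Enum.split(window, order)
      key = Enum.map(state, &key_token/1)

      Map.update(acc, key, %{next => 1}, fn counts ->
        Map.update(counts, next, 1, &(&1 + 1))
      end)
    end)
  end

  # One possible way to favour rare paths: compress counts with a square root.
  # This formula is an assumption for illustration only.
  def shift(counts) do
    Map.new(counts, fn {token, count} -> {token, :math.sqrt(count)} end)
  end

  # Picks a token with probability proportional to its weight.
  def pick(weights) do
    total = weights |> Map.values() |> Enum.sum()
    target = :rand.uniform() * total

    Enum.reduce_while(weights, 0.0, fn {token, weight}, seen ->
      seen = seen + weight
      if seen >= target, do: {:halt, token}, else: {:cont, seen}
    end)
  end

  # Walks the chain from the start state until the :end marker is picked.
  def generate(chain, order \\ 2, opts \\ []) do
    state = List.duplicate(:start, order)
    walk(chain, state, [], Keyword.get(opts, :shift, false))
  end

  defp walk(chain, state, acc, shift?) do
    counts = Map.fetch!(chain, state)
    weights = if shift?, do: shift(counts), else: counts

    case pick(weights) do
      :end -> acc |> Enum.reverse() |> Enum.join(" ")
      token -> walk(chain, tl(state) ++ [sanitize(token)], [token | acc], shift?)
    end
  end

  # The :start marker is kept as-is; real tokens are sanitized for lookup.
  defp key_token(token) when is_atom(token), do: token
  defp key_token(token), do: sanitize(token)
end

chain =
  %{}
  |> ChainSketch.train("hello, world!")
  |> ChainSketch.train("Hello, Elixir!")

# "hello," and "Hello," share one state, so the two strings can recombine.
IO.puts(ChainSketch.generate(chain))
IO.puts(ChainSketch.generate(chain, 2, shift: true))
```

Because states are keyed by sanitized tokens while the stored next tokens keep their original form, the output preserves case and punctuation even though lookups ignore them.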
In `mix.exs`:
```elixir
defp deps do
  [{:markov, "~> 4.0"}]
end
```

Unlike Markov 1.x, this version is strongly opinionated about how you create and persist your models (its approach also differs from 2.x and 3.x).
Example workflow (click [here](https://hexdocs.pm/markov/api-reference.html) for full docs):
```elixir
# The model will be stored under this path
{:ok, model} = Markov.load("./model_path", sanitize_tokens: true, store_log: [:train])

# train using four strings
:ok = Markov.train(model, "hello, world!")
:ok = Markov.train(model, "example string number two")
:ok = Markov.train(model, "hello, Elixir!")
:ok = Markov.train(model, "fourth string")

# generate text
{:ok, text} = Markov.generate_text(model)
IO.puts(text)

# commit all changes and unload
Markov.unload(model)

# these will return errors because the model is unloaded
# Markov.generate_text(model)
# Markov.train(model, "hello, world!")

# load the model again
{:ok, model} = Markov.load("./model_path")

# enable probability shifting and generate text
:ok = Markov.configure(model, shift_probabilities: true)
{:ok, text} = Markov.generate_text(model)
IO.puts(text)

# print log
model |> Markov.read_log() |> IO.inspect()

# this will also write our new just-set option
Markov.unload(model)
```

## Credits
- [The English dictionary in a CSV format](https://www.bragitoff.com/2016/03/english-dictionary-in-csv-format/)