{"id":13413544,"url":"https://github.com/Shixzie/nlp","last_synced_at":"2025-03-14T19:32:34.243Z","repository":{"id":57496895,"uuid":"79992041","full_name":"shixzie/nlp","owner":"shixzie","description":"[UNMANTEINED] Extract values from strings and fill your structs with nlp.","archived":true,"fork":false,"pushed_at":"2017-09-18T14:32:30.000Z","size":52,"stargazers_count":389,"open_issues_count":3,"forks_count":34,"subscribers_count":22,"default_branch":"master","last_synced_at":"2024-08-01T21:27:29.128Z","etag":null,"topics":["go","golang","natural-language-processing","nlp","parse","text","text-extraction"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shixzie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-01-25T07:19:03.000Z","updated_at":"2024-07-02T15:31:54.000Z","dependencies_parsed_at":"2022-09-03T02:30:56.732Z","dependency_job_id":null,"html_url":"https://github.com/shixzie/nlp","commit_stats":null,"previous_names":["nymiun/nlp"],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shixzie%2Fnlp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shixzie%2Fnlp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shixzie%2Fnlp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shixzie%2Fnlp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shixzie","download_url":"https://codeload.github.com/shixzie/nlp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https
://github.com","kind":"github","repositories_count":243635521,"owners_count":20322952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","golang","natural-language-processing","nlp","parse","text","text-extraction"],"created_at":"2024-07-30T20:01:42.837Z","updated_at":"2025-03-14T19:32:33.929Z","avatar_url":"https://github.com/shixzie.png","language":"Go","funding_links":[],"categories":["Natural Language Processing","自然语言处理","自然語言處理","\u003cspan id=\"自然语言处理-natural-language-processing\"\u003e自然语言处理 Natural Language Processing\u003c/span\u003e"],"sub_categories":["Advanced Console UIs","Uncategorized","暂未分类","形态分析","高級控制台界面","高级控制台界面","交流","Strings","Morphological Analyzers","\u003cspan id=\"高级控制台用户界面-advanced-console-uis\"\u003e高级控制台用户界面 Advanced Console UIs\u003c/span\u003e"],"readme":"[![GoDoc](https://godoc.org/github.com/shixzie/nlp?status.svg)](https://godoc.org/github.com/shixzie/nlp) \n[![Go Report Card](https://goreportcard.com/badge/github.com/shixzie/nlp)](https://goreportcard.com/report/github.com/shixzie/nlp)\n[![Build Status](https://travis-ci.org/shixzie/nlp.svg?branch=master)](https://travis-ci.org/shixzie/nlp)\n[![codecov](https://codecov.io/gh/shixzie/nlp/branch/master/graph/badge.svg)](https://codecov.io/gh/shixzie/nlp)\n\n\n# nlp\n\n\u003e `nlp` is a general purpose any-lang Natural Language Processor that parses the data inside a text and returns a filled model\n\n## Supported types\n```go\nint  int8  int16  int32  int64\nuint uint8 uint16 uint32 uint64\nfloat32 float64\nstring\ntime.Time\ntime.Duration\n```\n\n## Installation\n```\n// go1.8+ is 
required\ngo get -u github.com/shixzie/nlp\n```\n\n\n**Feel free to create PR's and open Issues :)**\n\n## How it works\n\nYou always begin by creating an NL value with nlp.New(). The NL type is a \nNatural Language Processor that exposes 3 methods: RegisterModel(), Learn() and P().\n\n### RegisterModel(i interface{}, samples []string, ops ...ModelOption) error\n\nRegisterModel takes 3 parameters: an empty struct, a set of samples and some options for the model.\n\nThe empty struct lets nlp know all possible values inside the text, for example:\n```go\ntype Song struct {\n\tName        string // fields must be exported\n\tArtist      string\n\tReleasedAt  time.Time\n}\nerr := nl.RegisterModel(Song{}, someSamples, nlp.WithTimeFormat(\"2006\"))\nif err != nil {\n\tpanic(err)\n}\n// ...\n```\n\nThis tells nlp that the text may contain a Song.Name, a Song.Artist and a Song.ReleasedAt.\n\nThe samples are the key part of nlp, not just because they set the *limits*\nbetween *keywords* but also because they are used to choose which model \nto use to handle an expression.\n\nSamples must follow a special syntax to set those *limits* and *keywords*.\n```go\nsongSamples := []string{\n\t\"play {Name} by {Artist}\",\n\t\"play {Name} from {Artist}\",\n\t\"play {Name}\",\n\t\"from {Artist} play {Name}\",\n\t\"play something from {ReleasedAt}\",\n}\n```\n\nIn the example below, you can see we're referring to the Name and Artist fields\nof the `Song` type declared above. Both `{Name}` and `{Artist}` are our *keywords* \nand yes! you guessed it! 
Everything between `play` and `by` will be treated as a\n`{Name}`, and everything after `by` will be treated as an `{Artist}`, meaning \nthat `play` and `by` are our *limits*.\n```\n     limits\n ┌─────┴─────┐\n┌┴─┐        ┌┴┐\nplay {Name} by  {Artist}\n     └─┬──┘     └───┬──┘\n       └──────┬─────┘\n           keywords\n```\n\nAny character can be a *limit*; a `,`, for example, can be used as one.\n\n*keywords* as well as *limits* are case-sensitive, so be sure to type them right.\n\n**Note that putting 2 *keywords* together means that only 1 or none of them will be detected**\n\n\u003e *limits are important* - Me :3\n\n\n### Learn() error\n\nLearn maps all model samples to their respective models using the NaiveBayes \nalgorithm based on those samples. `Learn()` also trains all registered models\nso they're able to fit expressions in the future.\n\n```go\n// must be called after all models are registered and before calling nl.P()\nerr := nl.Learn() \nif err != nil {\n\tpanic(err)\n}\n// ...\n```\n\nOnce the algorithm has finished learning, we're ready to start processing \nthose texts.\n\n**Note that you must call NL.Learn() after all models are registered and before calling NL.P()**\n\n### P(expr string) interface{}\n\nP first asks the trained algorithm which model should be used. Once we get\nthe right *and already trained* model, we just make it fit the expression.\n\n**Note that everything in the expression must be separated by a _space_ or _tab_**\n\nWhen processing an expression, nlp searches for the *limits* inside that \nexpression and evaluates which sample fits the expression best. It doesn't\nmatter if the text has `trash`. 
In this example:\n```\n     limits\n ┌─────┴─────┐\n┌┴─┐        ┌┴┐\nplay {Name} by  {Artist}\n     └─┬──┘     └───┬──┘\n       └──────┬─────┘\n           keywords\n```\n\nwe have 2 *limits*, `play` and `by`. It wouldn't matter if we had the expression \n*hello sir can you pleeeeeease play King by Lauren Aquilina*, since:\n```\n                                  limits\n            trash              ┌────┴────┐\n┌─────────────┴─────────────┐ ┌┴─┐      ┌┴┐\nhello sir can you pleeeeeease play King by  Lauren Aquilina\n                                   └┬─┘     └─────┬───────┘\n                                 {Name}       {Artist}\n                                 └─┬──┘       └───┬──┘\n                                   └──────┬───────┘\n                                       keywords\n```\n\n`{Name}` would be replaced with `King`, \n`{Artist}` would be replaced with `Lauren Aquilina`, \n`trash` would be ignored as well as the *limits* `play` and `by`, \nand then **a pointer to a filled struct with the type used to register the model** (`Song`) \n( `Song.Name` being `{Name}` and `Song.Artist` being `{Artist}` ) \n**will be returned**.\n\n## Usage\n\n```go\ntype Song struct {\n\tName       string\n\tArtist     string\n\tReleasedAt time.Time\n}\n\nsongSamples := []string{\n\t\"play {Name} by {Artist}\",\n\t\"play {Name} from {Artist}\",\n\t\"play {Name}\",\n\t\"from {Artist} play {Name}\",\n\t\"play something from {ReleasedAt}\",\n}\n\nnl := nlp.New()\nerr := nl.RegisterModel(Song{}, songSamples, nlp.WithTimeFormat(\"2006\"))\nif err != nil {\n\tpanic(err)\n}\n\nerr = nl.Learn() // you must call Learn after all models are registered and before calling P\nif err != nil {\n\tpanic(err)\n}\n\n// after learning you can call P as many times as you want\ns := nl.P(\"hello sir can you pleeeeeease play King by Lauren Aquilina\") \nif song, ok := s.(*Song); ok {\n\tfmt.Println(\"Success\")\n\tfmt.Printf(\"%#v\\n\", song)\n} else {\n\tfmt.Println(\"Failed\")\n}\n\n// Prints\n//\n// 
Success\n// \u0026main.Song{Name: \"King\", Artist: \"Lauren Aquilina\"}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FShixzie%2Fnlp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FShixzie%2Fnlp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FShixzie%2Fnlp/lists"}