{"id":16876179,"url":"https://github.com/geange/lucene-go","last_synced_at":"2025-10-04T05:59:50.956Z","repository":{"id":37809065,"uuid":"504222937","full_name":"geange/lucene-go","owner":"geange","description":"A Go port of Apache Lucene（Go版Lucene）","archived":false,"fork":false,"pushed_at":"2025-03-14T14:02:55.000Z","size":1577,"stargazers_count":54,"open_issues_count":2,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-14T15:22:52.448Z","etag":null,"topics":["database","go","golang","lucene","search-engine"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/geange.png","metadata":{"files":{"readme":"README-zh_CN.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-16T16:10:48.000Z","updated_at":"2025-03-05T02:55:19.000Z","dependencies_parsed_at":"2024-07-15T17:17:06.227Z","dependency_job_id":"855e0b0f-fa90-46e6-b241-09fe6bbdf92f","html_url":"https://github.com/geange/lucene-go","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geange%2Flucene-go","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geange%2Flucene-go/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geange%2Flucene-go/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geange%2Flucene-go/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/geange","download_url":"htt
ps://codeload.github.com/geange/lucene-go/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243847060,"owners_count":20357317,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","go","golang","lucene","search-engine"],"created_at":"2024-10-13T15:38:27.924Z","updated_at":"2025-10-04T05:59:45.901Z","avatar_url":"https://github.com/geange.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# lucene-go\n\n[![GoDoc](https://godoc.org/github.com/geange/lucene-go?status.svg)](https://godoc.org/github.com/geange/lucene-go)\n[![Go](https://github.com/geange/lucene-go/actions/workflows/go.yml/badge.svg)](https://github.com/geange/lucene-go/actions/workflows/go.yml)\n[![CodeQL](https://github.com/geange/lucene-go/actions/workflows/codeql.yml/badge.svg)](https://github.com/geange/lucene-go/actions/workflows/codeql.yml)\n[![codecov](https://codecov.io/gh/geange/lucene-go/graph/badge.svg?token=52HZJSPPS6)](https://codecov.io/gh/geange/lucene-go)\n\n**[English](README.md)**\n\n## Overview\n\nLucene is a search engine library. `lucene-go` is its Go implementation.\n\n### Current version\n\n* Requires Go 1.21+\n* Based on lucene-8.11.2\n* Some packages are basically usable; unit tests are still being filled in\n\n### Our goals\n\n* Initially, stay as compatible as possible with the API of the Java version of Lucene\n* Maintain a high-quality search engine library for Go\n* Deliver better performance than the Java version of Lucene\n\n### Current tasks\n\n* Improve the unit tests of the base libraries\n* Improve the development and design documentation\n* Add code examples\n\n### Project overview\n\nThe project's goals shifted several times during development. The difficulties were far greater than expected, owing to differences between the languages and gaps in knowledge of the underlying principles. After a year of development, the following major modules have gradually been completed:\n\n* core/store: Lucene's storage module, mainly responsible for serializing data\n* core/document: defines the search-related data structures of Lucene\n* core/index: the implementation of the Lucene index, and the main package exposed to users\n* core/search: mainly contains the query implementations (used to retrieve data from the index)\n* memory: 
an in-memory search engine, a simplified version of Lucene\n* util/fst: the FST implementation (an important Lucene data structure)\n* util/automaton: the automaton implementation\n* codecs: serialization; currently only the simpleText format (index data recorded as plain text) is supported\n\n\u003e Note that the project is not yet complete! Do not use it in any project.\n\n## Technical documentation\n\n### FST\n\n* [1. The FST construction algorithm, illustrated](https://juejin.cn/post/7311603506222088207)\n* [2. FST construction: engineering optimizations](https://juejin.cn/post/7311957969423663119)\n\n## Try it out\n\n\u003e go1.21+\n\n### Examples\n\n#### IndexWriter\n\nUsing `IndexWriter`\n\n```go\npackage main\n\nimport (\n  \"context\"\n  \"fmt\"\n  \"os\"\n\n  \"github.com/geange/lucene-go/codecs/simpletext\"\n  \"github.com/geange/lucene-go/core/document\"\n  \"github.com/geange/lucene-go/core/index\"\n  \"github.com/geange/lucene-go/core/search\"\n  \"github.com/geange/lucene-go/core/store\"\n)\n\nfunc main() {\n  // Remove any index left over from a previous run.\n  err := os.RemoveAll(\"data\")\n  if err != nil {\n    panic(err)\n  }\n\n  // Open the on-disk directory that will hold the index.\n  dir, err := store.NewNIOFSDirectory(\"data\")\n  if err != nil {\n    panic(err)\n  }\n\n  // Configure the writer with the plain-text simpletext codec and BM25 similarity.\n  codec := simpletext.NewCodec()\n  similarity, err := search.NewBM25Similarity()\n  if err != nil {\n    panic(err)\n  }\n\n  config := index.NewIndexWriterConfig(codec, similarity)\n\n  ctx := context.Background()\n\n  writer, err := index.NewIndexWriter(ctx, dir, config)\n  if err != nil {\n    panic(err)\n  }\n  defer writer.Close()\n\n  {\n    doc := document.NewDocument()\n    doc.Add(document.NewTextField(\"a\", \"74\", true))\n    doc.Add(document.NewTextField(\"b\", \"86\", true))\n    doc.Add(document.NewTextField(\"c\", \"1237\", true))\n    docID, err := writer.AddDocument(ctx, doc)\n    if err != nil {\n      panic(err)\n    }\n    fmt.Println(\"add new document:\", docID)\n  }\n\n  {\n    doc := document.NewDocument()\n    doc.Add(document.NewTextField(\"a\", \"74\", true))\n    doc.Add(document.NewTextField(\"b\", \"123\", true))\n    doc.Add(document.NewTextField(\"c\", \"789\", true))\n\n    docID, err := writer.AddDocument(ctx, doc)\n    if err != nil {\n      panic(err)\n    }\n    fmt.Println(\"add new document:\", docID)\n  }\n\n  {\n    
doc := document.NewDocument()\n    doc.Add(document.NewTextField(\"a\", \"741\", true))\n    doc.Add(document.NewTextField(\"b\", \"861\", true))\n    doc.Add(document.NewTextField(\"c\", \"12137\", true))\n    docID, err := writer.AddDocument(context.Background(), doc)\n    if err != nil {\n      panic(err)\n    }\n    fmt.Println(\"add new document:\", docID)\n  }\n}\n\n```\n\n#### IndexReader\n\n```go\npackage main\n\nimport (\n\t\"context\"\n\t\"fmt\"\n\n\t_ \"github.com/geange/lucene-go/codecs/simpletext\"\n\t\"github.com/geange/lucene-go/core/index\"\n\tindex2 \"github.com/geange/lucene-go/core/interface/index\"\n\t\"github.com/geange/lucene-go/core/search\"\n\t\"github.com/geange/lucene-go/core/store\"\n)\n\nfunc main() {\n\tdir, err := store.NewNIOFSDirectory(\"data\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\treader, err := index.OpenDirectoryReader(context.Background(), dir, nil, nil)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\tsearcher, err := search.NewIndexSearcher(reader)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\tquery := search.NewTermQuery(index.NewTerm(\"a\", []byte(\"74\")))\n\tbuilder := search.NewBooleanQueryBuilder()\n\tbuilder.AddQuery(query, index2.OccurMust)\n\tbooleanQuery, err := builder.Build()\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\ttopDocs, err := searcher.SearchTopN(context.Background(), booleanQuery, 2)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\tresult := topDocs.GetScoreDocs()\n\tfor _, scoreDoc := range result {\n\t\tdocID := scoreDoc.GetDoc()\n\t\tdocument, err := reader.Document(context.Background(), docID)\n\t\tif err != nil {\n\t\t\tpanic(err)\n\t\t}\n\n\t\tvalues := make([]any, 0)\n\t\tfor _, field := range document.Fields() {\n\t\t\tvalues = append(values, fmt.Sprintf(\"name:%s, value:%v\", field.Name(), field.Get()))\n\t\t}\n\t\tfmt.Println(\"docId: \", scoreDoc.GetDoc(), \"values\", 
values)\n\t}\n}\n\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeange%2Flucene-go","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeange%2Flucene-go","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeange%2Flucene-go/lists"}