{"id":38144422,"url":"https://github.com/lokicui/doc2vec-golang","last_synced_at":"2026-01-16T22:53:51.147Z","repository":{"id":57511734,"uuid":"73475874","full_name":"lokicui/doc2vec-golang","owner":"lokicui","description":"doc2vec , word2vec, implemented by golang. word embedding representation","archived":false,"fork":false,"pushed_at":"2018-03-30T07:15:11.000Z","size":4317,"stargazers_count":39,"open_issues_count":1,"forks_count":11,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-06-20T11:09:43.168Z","etag":null,"topics":["doc2vec","doc2vec-golang","golang","word2vec"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lokicui.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-11-11T12:27:15.000Z","updated_at":"2024-06-08T05:04:38.000Z","dependencies_parsed_at":"2022-08-31T08:51:08.655Z","dependency_job_id":null,"html_url":"https://github.com/lokicui/doc2vec-golang","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lokicui/doc2vec-golang","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lokicui%2Fdoc2vec-golang","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lokicui%2Fdoc2vec-golang/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lokicui%2Fdoc2vec-golang/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lokicui%2Fdoc2vec-golang/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lokicui","download_url":"https://codeload.github.
com/lokicui/doc2vec-golang/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lokicui%2Fdoc2vec-golang/sbom","scorecard":{"id":597697,"data":{"date":"2025-08-11","repo":{"name":"github.com/lokicui/doc2vec-golang","commit":"1dcd526672bd38c819d86cd29dee2506b69633f3"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has 
generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security 
file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-20T23:34:12.884Z","repository_id":57511734,"created_at":"2025-08-20T23:34:12.884Z","updated_at":"2025-08-20T23:34:12.884Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28486797,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T22:47:06.728Z","status":"ssl_error","status_checked_at":"2026-01-16T22:46:52.401Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["doc2vec","doc2vec-golang","golang","word2vec"],"created_at":"2026-01-16T22:53:50.269Z","updated_at":"2026-01-16T22:53:51.133Z","avatar_url":"https://github.com/lokicui.png","language":"Go","readme":"# doc2vec-golang\nA Go implementation of Tomas Mikolov's word and document embeddings. For the basic idea, see Mikolov's two original papers, [word2vec](http://arxiv.org/pdf/1301.3781.pdf) and [doc2vec](http://cs.stanford.edu/~quocle/paragraph_vector.pdf). More recently, Andrew M. 
Dai et al. from Google reported on its power in more [detail](http://arxiv.org/pdf/1507.07998.pdf)\n\n# Usage\n```javascript\n[@bjsjs_11_83 doc2vec-golang]$ ./control build\ntraning Exec build ok\nbuild ok\n\n# The training data (data/zhihu_data.1w) has one document per line, in two tab-separated columns:\n# the first column is the id, and the second column is the segmented document, with tokens separated by spaces.\n[@bjsjs_11_83 doc2vec-golang]$ ./train  data/zhihu_data.1w          \nSkip-Gram Iter:48 Alpha: 0.000796  Progress: 96.81%  Words/sec: 24.27k  \n2018-03-30 14:53:00.218536235 +0800 CST training end, 1342521 26861\n\n[@bjsjs_11_83 doc2vec-golang]$ ./knn 2.model \n\nplease select operation type:\n        0:word2words\n        1:doc_likelihood\n        2:leave one out key words\n        3:sen2words\n        4:sen2docs\n        5:word2docs\n        6:doc2docs\n        7:doc2words\n0\nEnter text:网页\n        1       网页\n        0.7823723719117796      不让\n        0.7651260773728028      浏览\n        0.7642516944020028      邮件\n        0.7601415883811553      近\n        0.7517607921006224      迷恋\n        0.7492900066365179      等同\n        0.7485966355448261      传说\n        0.7463299535930537      基于\n        0.7447865182221745      版\n\nplease select operation type:\n        0:word2words\n        1:doc_likelihood\n        2:leave one out key words\n        3:sen2words\n        4:sen2docs\n        5:word2docs\n        6:doc2docs\n        7:doc2words\n```\n\n# Dependencies\n* [golang](https://golang.org/)\n* [msgp](https://github.com/tinylib/msgp)\n\n# Implemented features\n* doc2vec supports both the CBOW and Skip-Gram models; the Negative Sampling and Hierarchical Softmax optimizations are both implemented\n* online inference of document vectors\n* [likelihood of document](http://arxiv.org/abs/1504.07295)\n* doc2words\n* doc2docs\n* word2words\n* word2docs\n\n# Unimplemented features\n* [wmd](https://github.com/hiyijian/doc2vec/blob/master/jmlr.org/proceedings/papers/v37/kusnerb15.pdf)\n* [Adding synonym semantic constraints to doc2vec](http://home.ustc.edu.cn/~quanliu/papers/SWE.pdf)\n* Extracting core words from a sentence\n\n# References\n* Google 
[word2vec](https://code.google.com/archive/p/word2vec/source/default/source) implementation\n* [hiyijian/doc2vec](https://github.com/hiyijian/doc2vec)\n* [Semantic constraints for word2vec](https://github.com/iunderstand/SWE)\n* [Adding synonym semantic constraints to doc2vec](http://home.ustc.edu.cn/~quanliu/papers/SWE.pdf)\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flokicui%2Fdoc2vec-golang","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flokicui%2Fdoc2vec-golang","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flokicui%2Fdoc2vec-golang/lists"}