{"id":35304372,"url":"https://github.com/vinhkhuc/JFastText","last_synced_at":"2026-01-05T04:01:21.523Z","repository":{"id":57723907,"uuid":"81030108","full_name":"vinhkhuc/JFastText","owner":"vinhkhuc","description":"Java interface for fastText","archived":false,"fork":false,"pushed_at":"2023-06-05T07:07:59.000Z","size":59,"stargazers_count":244,"open_issues_count":50,"forks_count":98,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-12-16T04:17:01.125Z","etag":null,"topics":["java","jni","machine-learning","model-compression","nlp","text-classification","word-embeddings"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vinhkhuc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-02-05T23:33:29.000Z","updated_at":"2025-11-30T15:17:52.000Z","dependencies_parsed_at":"2025-05-08T23:39:15.008Z","dependency_job_id":null,"html_url":"https://github.com/vinhkhuc/JFastText","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/vinhkhuc/JFastText","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinhkhuc%2FJFastText","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinhkhuc%2FJFastText/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinhkhuc%2FJFastText/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinhkhuc%2FJFastText/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/vinhkhuc","download_url":"https://codeload.github.com/vinhkhuc/JFastText/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinhkhuc%2FJFastText/sbom","scorecard":{"id":922421,"data":{"date":"2025-08-11","repo":{"name":"github.com/vinhkhuc/JFastText","commit":"130e5e0243e8946124ba8bfa7ab81175bb1995b7"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.6,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Code-Review","score":1,"reason":"Found 3/26 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow 
detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":9,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Warn: project license file does not contain an FSF or OSI license."],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 7 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-25T05:38:40.827Z","repository_id":57723907,"created_at":"2025-08-25T05:38:40.827Z","updated_at":"2025-08-25T05:38:40.827Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27769802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-16T02:00:10.477Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java","jni","machine-learning","model-compression","nlp","text-classification","word-embeddings"],"created_at":"2025-12-30T17:00:49.812Z","updated_at":"2026-01-05T04:01:21.518Z","avatar_url":"https://github.com/vinhkhuc.png","language":"Java","readme":"[![Build Status](https://travis-ci.org/vinhkhuc/JFastText.svg?branch=master)](https://travis-ci.org/vinhkhuc/JFastText)\n\nTable of Contents\n=================\n\n  * [Introduction](#introduction)\n  * [Maven Dependency](#maven-dependency)\n  * [Building](#building)\n  * [Quick Application - Language Identification](#quick-application-\\--language-identification)\n  * [Detailed Examples](#detailed-examples)\n  * [API](#api)\n  * [FastText's Command Line](#fasttexts-command-line)\n  * [License](#license)\n  * [References](#references)\n  \n\n## Introduction\nJFastText is a Java wrapper for Facebook's 
[fastText](https://github.com/facebookresearch/fastText), \na library for efficient learning of word embeddings and fast sentence classification. The JNI interface\nis built using [javacpp](https://github.com/bytedeco/javacpp).\n\nThe library exposes fastText's full command-line interface. It also provides an API for\nloading a trained model from a file and predicting labels in memory. Model training and quantization\nare supported via the command-line interface.\n\nJFastText is ideal for building fast text classifiers in Java.\n\n## Maven Dependency\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.github.vinhkhuc\u003c/groupId\u003e\n  \u003cartifactId\u003ejfasttext\u003c/artifactId\u003e\n  \u003cversion\u003e0.5\u003c/version\u003e\n\u003c/dependency\u003e\n```\nThe JAR on Maven Central bundles precompiled fastText libraries for 64-bit Windows, Linux, and\nmacOS.\n\n## Building\nA C++ compiler (g++ on Mac/Linux or cl.exe on Windows) is required to compile fastText's code.\n\n```bash\ngit clone --recursive https://github.com/vinhkhuc/JFastText\ncd JFastText\nmvn package\n```\n\n## Quick Application - Language Identification\nJFastText can use FastText's pretrained models directly. 
Language identification models can be downloaded [here](https://fasttext.cc/docs/en/language-identification.html).\nIn this quick example, we will use the [quantized model](https://s3-us-west-1.amazonaws.com/fasttext-vectors/supervised_models/lid.176.ftz),\nwhich is much smaller than the original model but slightly less accurate.\n\n```bash\n$ wget -q https://s3-us-west-1.amazonaws.com/fasttext-vectors/supervised_models/lid.176.ftz \\\n    \u0026\u0026 { echo \"This is English\"; echo \"Xin chào\"; echo \"Привет\"; } \\\n    | java -jar target/jfasttext-*-jar-with-dependencies.jar predict lid.176.ftz -\n__label__en\n__label__vi\n__label__ru\n```\n\n## Detailed Examples\nExamples of how to use JFastText can be found in [examples/api](examples/api) and [examples/cmd](examples/cmd).\n\n## API\n\n### Initialization\n\n```java\nimport com.github.jfasttext.JFastText;\n...\nJFastText jft = new JFastText();\n```\n\n### Word embedding learning\n```java\njft.runCmd(new String[] {\n        \"skipgram\",\n        \"-input\", \"src/test/resources/data/unlabeled_data.txt\",\n        \"-output\", \"src/test/resources/models/skipgram.model\",\n        \"-bucket\", \"100\",\n        \"-minCount\", \"1\"\n});\n```\n\n### Text classification\n```java\n// Train supervised model\njft.runCmd(new String[] {\n        \"supervised\",\n        \"-input\", \"src/test/resources/data/labeled_data.txt\",\n        \"-output\", \"src/test/resources/models/supervised.model\"\n});\n\n// Load model from file\njft.loadModel(\"src/test/resources/models/supervised.model.bin\");\n\n// Do label prediction\nString text = \"What is the most popular sport in the US ?\";\nJFastText.ProbLabel probLabel = jft.predictProba(text);\nSystem.out.printf(\"\\nThe label of '%s' is '%s' with probability %f\\n\",\n        text, probLabel.label, Math.exp(probLabel.logProb));\n```\n\n## FastText's Command Line\nFastText's command-line interface can be accessed as follows:\n```bash\n$ java -jar 
target/jfasttext-*-jar-with-dependencies.jar\nusage: fasttext \u003ccommand\u003e \u003cargs\u003e\n\nThe commands supported by fasttext are:\n\n  supervised              train a supervised classifier\n  quantize                quantize a model to reduce the memory usage\n  test                    evaluate a supervised classifier\n  predict                 predict most likely labels\n  predict-prob            predict most likely labels with probabilities\n  skipgram                train a skipgram model\n  cbow                    train a cbow model\n  print-word-vectors      print word vectors given a trained model\n  print-sentence-vectors  print sentence vectors given a trained model\n  print-ngrams            print ngrams given a trained model and word\n  nn                      query for nearest neighbors\n  analogies               query for analogies\n  dump                    dump arguments,dictionary,input/output vectors\n\n```\n\nFor example:\n\n```bash\n$ java -jar target/jfasttext-*-jar-with-dependencies.jar quantize -h\n```\n\n## License\nBSD\n\n## References\n(From fastText's [references](https://github.com/facebookresearch/fastText#references))\n\nPlease cite [1](#enriching-word-vectors-with-subword-information) if using this code for learning word representations or [2](#bag-of-tricks-for-efficient-text-classification) if using for text classification.\n\n### Enriching Word Vectors with Subword Information\n\n[1] P. Bojanowski\\*, E. Grave\\*, A. Joulin, T. Mikolov, [*Enriching Word Vectors with Subword Information*](https://arxiv.org/abs/1607.04606)\n\n```\n@article{bojanowski2016enriching,\n  title={Enriching Word Vectors with Subword Information},\n  author={Bojanowski, Piotr and Grave, Edouard and Joulin, Armand and Mikolov, Tomas},\n  journal={arXiv preprint arXiv:1607.04606},\n  year={2016}\n}\n```\n\n### Bag of Tricks for Efficient Text Classification\n\n[2] A. Joulin, E. Grave, P. Bojanowski, T. 
Mikolov, [*Bag of Tricks for Efficient Text Classification*](https://arxiv.org/abs/1607.01759)\n\n```\n@article{joulin2016bag,\n  title={Bag of Tricks for Efficient Text Classification},\n  author={Joulin, Armand and Grave, Edouard and Bojanowski, Piotr and Mikolov, Tomas},\n  journal={arXiv preprint arXiv:1607.01759},\n  year={2016}\n}\n```\n\n### FastText.zip: Compressing text classification models\n\n[3] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, T. Mikolov, [*FastText.zip: Compressing text classification models*](https://arxiv.org/abs/1612.03651)\n\n```\n@article{joulin2016fasttext,\n  title={FastText.zip: Compressing text classification models},\n  author={Joulin, Armand and Grave, Edouard and Bojanowski, Piotr and Douze, Matthijs and J{\\'e}gou, Herv{\\'e} and Mikolov, Tomas},\n  journal={arXiv preprint arXiv:1612.03651},\n  year={2016}\n}\n```\n\n(\\* These authors contributed equally.)\n","funding_links":[],"categories":["人工智能"],"sub_categories":["Spring Cloud框架"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinhkhuc%2FJFastText","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvinhkhuc%2FJFastText","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinhkhuc%2FJFastText/lists"}