{"id":23080640,"url":"https://github.com/chen0040/java-text-embedding","last_synced_at":"2025-08-15T22:31:13.138Z","repository":{"id":53769630,"uuid":"123841784","full_name":"chen0040/java-text-embedding","owner":"chen0040","description":"Word embedding in Java","archived":false,"fork":false,"pushed_at":"2021-03-15T08:25:30.000Z","size":57,"stargazers_count":5,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2023-07-26T21:56:13.759Z","etag":null,"topics":["document-embedding","glove","glove-embeddings","word-embeddings"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chen0040.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-05T00:12:07.000Z","updated_at":"2023-06-13T11:43:40.000Z","dependencies_parsed_at":"2022-09-01T17:02:50.125Z","dependency_job_id":null,"html_url":"https://github.com/chen0040/java-text-embedding","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen0040%2Fjava-text-embedding","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen0040%2Fjava-text-embedding/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen0040%2Fjava-text-embedding/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen0040%2Fjava-text-embedding/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chen0040","download_url":"https://codeload.github.com/chen0040/java-text-embedding/tar
.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229964387,"owners_count":18152034,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["document-embedding","glove","glove-embeddings","word-embeddings"],"created_at":"2024-12-16T13:15:54.038Z","updated_at":"2024-12-16T13:15:55.400Z","avatar_url":"https://github.com/chen0040.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# java-word-embedding\n\nWord embedding in Java\n\nThis project provides GloVe word embeddings that developers can use directly in their own projects.\n\n# Install\n\nAdd the following dependency to your POM file:\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.github.chen0040\u003c/groupId\u003e\n  \u003cartifactId\u003ejava-text-embedding\u003c/artifactId\u003e\n  \u003cversion\u003e1.0.1\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n# Usage\n\nThe [sample code](src/main/java/com/github/chen0040/embeddings/GloVeModelDemo.java) below shows how to use\n[GloVeModel](src/main/java/com/github/chen0040/embeddings/GloVeModel.java) to create GloVe word embeddings of different\ndimensions (e.g., 50, 100, 200, 300):\n\n```java\n\nimport org.slf4j.Logger;\nimport org.slf4j.LoggerFactory;\nimport com.github.chen0040.embeddings.GloVeModel;\n\npublic class GloVeModelDemo {\n\n    private static final Logger logger = LoggerFactory.getLogger(GloVeModelDemo.class);\n\n    public static void main(String[] args) {\n        GloVeModel model = new GloVeModel();\n        model.load100();\n\n        
logger.info(\"word2em size: {}\", model.size());\n        logger.info(\"word2em dimension for individual word: {}\", model.getWordVecDimension());\n\n        logger.info(\"father: {}\", model.encodeWord(\"father\"));\n        logger.info(\"mother: {}\", model.encodeWord(\"mother\"));\n        logger.info(\"man: {}\", model.encodeWord(\"man\"));\n        logger.info(\"woman: {}\", model.encodeWord(\"woman\"));\n        logger.info(\"boy: {}\", model.encodeWord(\"boy\"));\n        logger.info(\"girl: {}\", model.encodeWord(\"girl\"));\n        \n        logger.info(\"distance between boy and girl: {}\", model.distance(\"boy\", \"girl\"));\n\n\n        String doc = \"The Zen of Python. Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. Special cases aren't special enough to break the rules.\";\n\n        logger.info(\"doc: {}\", model.encodeDocument(doc));\n\n\n    }\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchen0040%2Fjava-text-embedding","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchen0040%2Fjava-text-embedding","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchen0040%2Fjava-text-embedding/lists"}