{"id":18275748,"url":"https://github.com/davideuler/word2vec","last_synced_at":"2025-08-25T06:32:32.430Z","repository":{"id":138210917,"uuid":"44059131","full_name":"davideuler/word2vec","owner":"davideuler","description":"Automatically exported from code.google.com/p/word2vec","archived":false,"fork":false,"pushed_at":"2015-10-11T16:22:37.000Z","size":204,"stargazers_count":0,"open_issues_count":35,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-09T04:13:44.890Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davideuler.png","metadata":{"files":{"readme":"README.txt","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-10-11T16:10:30.000Z","updated_at":"2015-10-11T16:22:39.000Z","dependencies_parsed_at":"2023-03-13T12:11:59.152Z","dependency_job_id":null,"html_url":"https://github.com/davideuler/word2vec","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/davideuler/word2vec","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davideuler%2Fword2vec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davideuler%2Fword2vec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davideuler%2Fword2vec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davideuler%2Fword2vec/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/davideuler","download_url":"https://codeload.github.com/davideuler/word2vec/tar.gz/refs/heads/master","sbom_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/davideuler%2Fword2vec/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272013576,"owners_count":24858475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-25T02:00:12.092Z","response_time":1107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T12:13:54.938Z","updated_at":"2025-08-25T06:32:32.398Z","avatar_url":"https://github.com/davideuler.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"Tools for computing distributed representation of words\n-------------------------------------------------------\n\nWe provide an implementation of the Continuous Bag-of-Words (CBOW) and the Skip-gram model (SG), as well as several demo scripts.\n\nGiven a text corpus, the word2vec tool learns a vector for every word in the vocabulary using the Continuous\nBag-of-Words or the Skip-Gram neural network architectures. 
The user should specify the following:\n - desired vector dimensionality\n - the size of the context window for either the Skip-Gram or the Continuous Bag-of-Words model\n - training algorithm: hierarchical softmax and/or negative sampling\n - threshold for downsampling the frequent words\n - number of threads to use\n - the format of the output word vector file (text or binary)\n\nUsually, the other hyper-parameters such as the learning rate do not need to be tuned for different training sets.\n\nThe script demo-word.sh downloads a small (100 MB) text corpus from the web, and trains a small word vector model. After the training\nis finished, the user can interactively explore the similarity of the words.\n\nMore information about the scripts is provided at https://code.google.com/p/word2vec/\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavideuler%2Fword2vec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavideuler%2Fword2vec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavideuler%2Fword2vec/lists"}