{"id":20623222,"url":"https://github.com/soskek/skip_thought","last_synced_at":"2026-04-18T08:02:16.971Z","repository":{"id":113372908,"uuid":"108285152","full_name":"soskek/skip_thought","owner":"soskek","description":"Language Model and Skip-Thought Vectors (Kiros et al. 2015)","archived":false,"fork":false,"pushed_at":"2018-02-17T06:22:53.000Z","size":31,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-14T06:37:01.267Z","etag":null,"topics":["chainer","language-model","neural-network","skip-thought-vectors","skip-thoughts"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soskek.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-25T14:56:17.000Z","updated_at":"2019-04-30T14:40:30.000Z","dependencies_parsed_at":"2023-06-15T11:30:24.575Z","dependency_job_id":null,"html_url":"https://github.com/soskek/skip_thought","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/soskek/skip_thought","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soskek%2Fskip_thought","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soskek%2Fskip_thought/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soskek%2Fskip_thought/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soskek%2Fskip_thought/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/soskek","download_url":"https://codeload.github.com/soskek/skip_thought/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soskek%2Fskip_thought/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31961348,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chainer","language-model","neural-network","skip-thought-vectors","skip-thoughts"],"created_at":"2024-11-16T12:26:19.735Z","updated_at":"2026-04-18T08:02:16.952Z","avatar_url":"https://github.com/soskek.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sentence-level Language Model and Skip-thought Vector\n\nThe training script is run as follows:\n\n```\npython -u train.py -g 3 --train train_data --valid valid_data --vocab vocab.t100 -u 512 --layer 1 --dropout 0.1 --batchsize 128 --out output_dir\n```\n\nIf you add `--language-model`, the script trains a sentence-level language model.\nOtherwise, it trains a skip-thought model by default.\n\nTraining and validation data should be in one-sentence-per-line format.\nTraining a skip-thought model uses only neighboring sentences within a paragraph; paragraphs are separated by blank lines.\n\nA count-based vocabulary file 
`vocab.t100` can be constructed by the script below:\n\n```\npython construct_vocab.py --data train_data -t 100 --save vocab.t100\n```\n\n\nFor details on skip-thought vectors, see [Skip-Thought Vectors](https://arxiv.org/pdf/1506.06726.pdf), Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler, NIPS 2015.\n\n\n### Computation Cost\n\n#### Sentence-level Language Model\n\nWith 128 sentence pairs per minibatch, a 512-unit LSTM with a vocabulary size of 22231 processes 10 iterations per second with 7.5GB of GPU memory.\nOn a dataset with 4,300,000 pairs, training for 5 epochs takes 4.5 hours.\n\n#### Skip-thought Vector\n\nWith 128 sentence pairs per minibatch, a 512-unit GRU with a vocabulary size of 22231 processes 2-2.5 iterations per second with 7.5GB of GPU memory.\nOn a dataset with 4,000,000 pairs, training for 5 epochs takes 18-22 hours.\n\n\n### Use wikitext103 as Dataset\n\n```\nsh prepare_rawwikitext.sh\n```\n\n```\nPYTHONIOENCODING=utf-8 python preprocess_spacy.py datasets/wikitext-103-raw/wiki.train.raw \u003e datasets/wikitext-103-raw/spacy_wikitext-103-raw.train\nPYTHONIOENCODING=utf-8 python preprocess_spacy.py datasets/wikitext-103-raw/wiki.valid.raw \u003e datasets/wikitext-103-raw/spacy_wikitext-103-raw.valid\nPYTHONIOENCODING=utf-8 python preprocess_spacy.py datasets/wikitext-103-raw/wiki.test.raw \u003e datasets/wikitext-103-raw/spacy_wikitext-103-raw.test\n```\n\n```\nPYTHONIOENCODING=utf-8 python preprocess_after_spacy.py datasets/wikitext-103-raw/spacy_wikitext-103-raw.train \u003e datasets/wikitext-103-raw/spacy_wikitext-103-raw.train.after\nPYTHONIOENCODING=utf-8 python preprocess_after_spacy.py datasets/wikitext-103-raw/spacy_wikitext-103-raw.valid \u003e datasets/wikitext-103-raw/spacy_wikitext-103-raw.valid.after\nPYTHONIOENCODING=utf-8 python preprocess_after_spacy.py datasets/wikitext-103-raw/spacy_wikitext-103-raw.test \u003e 
datasets/wikitext-103-raw/spacy_wikitext-103-raw.test.after\n```\n\n```\npython construct_vocab.py --data datasets/wikitext-103-raw/spacy_wikitext-103-raw.train.after -t 100 --save datasets/wikitext-103-raw/spacy_wikitext-103-raw.train.after.vocab.t100\n```\n\n```\npython -u train.py -g 3 --train datasets/wikitext-103-raw/spacy_wikitext-103-raw.train.after --valid datasets/wikitext-103-raw/spacy_wikitext-103-raw.valid.after --vocab datasets/wikitext-103-raw/spacy_wikitext-103-raw.train.after.vocab.t100 -u 512 --layer 1 --dropout 0.1 --batchsize 128 --out outs/st.u512.l1.d01.b128\n```\n\n\n---\n---\n\n# Efficient Softmax Approximation\n\nImplementations of Blackout and Adaptive Softmax for efficiently calculating word distribution for language modeling of very large vocabularies.\n\nLSTM language models are derived from [rnnlm_chainer](https://github.com/soskek/rnnlm_chainer).\n\nAvailable output layers are as follows\n\n- Linear + softmax with cross entropy loss. A usual output layer.\n- `--share-embedding`: A variant using the word embedding matrix shared with the input layer for the output layer.\n- `--adaptive-softmax`: [Adaptive softmax](http://proceedings.mlr.press/v70/grave17a/grave17a.pdf)\n- `--blackout`: [BlackOut](https://arxiv.org/pdf/1511.06909.pdf) (BlackOut is not faster on GPU.)\n\n### Adaptive Softmax\n\n- Efficient softmax approximation for GPUs\n- Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou, ICML 2017\n- [paper](http://proceedings.mlr.press/v70/grave17a/grave17a.pdf)\n- [authors' Lua code](https://github.com/facebookresearch/adaptive-softmax)\n\n### BlackOut\n\n- BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies\n- Shihao Ji, S. V. N. Vishwanathan, Nadathur Satish, Michael J. 
Anderson, Pradeep Dubey, ICLR 2016\n- [paper](https://arxiv.org/pdf/1511.06909.pdf)\n- [authors' C++ code](https://github.com/IntelLabs/rnnlm)\n\n# How to Run\n\n```\npython -u train.py -g 0\n```\n\n## Datasets\n\n- PennTreeBank\n- Wikitext-2\n- Wikitext-103\n\nFor the Wikitext datasets, run `prepare_wikitext.sh` to download them.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoskek%2Fskip_thought","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoskek%2Fskip_thought","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoskek%2Fskip_thought/lists"}