{"id":24176775,"url":"https://github.com/prashantranjan09/elmo-tutorial","last_synced_at":"2026-03-17T18:03:28.430Z","repository":{"id":236587969,"uuid":"139959098","full_name":"PrashantRanjan09/Elmo-Tutorial","owner":"PrashantRanjan09","description":"A short tutorial on Elmo training (Pre trained, Training on new data, Incremental training)","archived":false,"fork":false,"pushed_at":"2020-06-20T06:41:13.000Z","size":406,"stargazers_count":153,"open_issues_count":0,"forks_count":38,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-08-14T00:30:08.009Z","etag":null,"topics":["allen","allennlp","elmo","elmo-tutorial","tutorial","word-embeddings","word-vectors"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PrashantRanjan09.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-06T08:51:48.000Z","updated_at":"2025-05-31T01:39:25.000Z","dependencies_parsed_at":"2024-04-27T23:38:39.901Z","dependency_job_id":"72fa029d-fb1a-41ef-a8d0-58bfd3974529","html_url":"https://github.com/PrashantRanjan09/Elmo-Tutorial","commit_stats":null,"previous_names":["prashantranjan09/elmo-tutorial"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/PrashantRanjan09/Elmo-Tutorial","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrashantRanjan09%2FElmo-Tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrashantRanjan09%2FElmo-Tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/PrashantRanjan09%2FElmo-Tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrashantRanjan09%2FElmo-Tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PrashantRanjan09","download_url":"https://codeload.github.com/PrashantRanjan09/Elmo-Tutorial/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrashantRanjan09%2FElmo-Tutorial/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30628405,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-17T17:32:55.572Z","status":"ssl_error","status_checked_at":"2026-03-17T17:32:38.732Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["allen","allennlp","elmo","elmo-tutorial","tutorial","word-embeddings","word-vectors"],"created_at":"2025-01-13T03:34:12.808Z","updated_at":"2026-03-17T18:03:28.399Z","avatar_url":"https://github.com/PrashantRanjan09.png","language":"Jupyter Notebook","readme":"# Elmo-Tutorial\n\nThis is a short tutorial on using Deep contextualized word representations (ELMo), introduced in the paper https://arxiv.org/abs/1802.05365.\nThis tutorial covers:\n\n* **Using a pre-trained ELMo model** - see _Elmo_tutorial.ipynb_ \u003cbr\u003e\n* **Training an ELMo model on your own data from scratch** \u003cbr\u003e\n\nTo 
train and evaluate a biLM, you need to provide:\n   * a vocabulary file \n   * a set of training files \n   * a set of heldout files \n\nThe vocabulary file is a text file with one token per line, sorted in descending order by token count in your training data. It must also include the special tokens `\u003cS\u003e`, `\u003c/S\u003e` and `\u003cUNK\u003e`, which must appear as the first three entries/lines: \u003cbr\u003e\n`\u003cS\u003e` , \u003cbr\u003e\n`\u003c/S\u003e`  and \u003cbr\u003e\n`\u003cUNK\u003e`.\u003cbr\u003e\n\nThe training data should be randomly split into many training files, each containing one slice of the data. Each file contains pre-tokenized, whitespace-separated text, one sentence per line. \n\n**Don't include the `\u003cS\u003e` or `\u003c/S\u003e` tokens in your training data.**\n\nOnce done, git clone **https://github.com/allenai/bilm-tf.git**\nand run:\n\n    python bin/train_elmo.py --train_prefix=\u003cpath to training folder\u003e --vocab_file \u003cpath to vocab file\u003e --save_dir \u003cpath where models will be checkpointed\u003e\n\nTo get the weights file, \nrun:\n\n    python bin/dump_weights.py --save_dir /output_path/to/checkpoint --outfile /output_path/to/weights.hdf5\n\nAn options.json will be written to the save dir, and the command above dumps the weights file; together, the options file and the weights file are what you need to create an ELMo model.\n\nFor more information refer to **Elmo_tutorial.ipynb**\n\n\n* ## Incremental Learning/Training \u003cbr\u003e\n\nTo incrementally train an existing model with new data: \u003cbr\u003e \n\ngit clone https://github.com/allenai/bilm-tf.git\n\nOnce done, replace _train_elmo.py_ within allenai/bilm-tf/bin/ with **train_elmo_updated.py** provided in the root of this repository.\n\n**Updated changes** :\u003cbr\u003e\n\n_train_elmo_updated.py_\n\n    tf_save_dir = args.save_dir\n    tf_log_dir = args.save_dir\n    train(options, data, n_gpus, tf_save_dir, tf_log_dir, restart_ckpt_file)\n    \n    if 
__name__ == '__main__':\n    parser = argparse.ArgumentParser()\n    parser.add_argument('--save_dir', help='Location of checkpoint files')\n    parser.add_argument('--vocab_file', help='Vocabulary file')\n    parser.add_argument('--train_prefix', help='Prefix for train files')\n    parser.add_argument('--restart_ckpt_file', help='latest checkpoint file to start with')\n    \nThe added argument (--restart_ckpt_file) accepts the path of the checkpoint file to resume training from. \n\n\nNext, replace _training.py_ within allenai/bilm-tf/bilm/ with **training_updated.py** provided in the root of this repository.\nAlso, make sure to put your embedding layer name in line 758 in **training_updated.py** :\n\n    exclude = ['the embedding layer name you want to remove']\n    \n**Updated changes** :\u003cbr\u003e\n\n_training_updated.py_\n\n        # load the checkpoint data if needed\n        if restart_ckpt_file is not None:\n            reader = tf.train.NewCheckpointReader(restart_ckpt_file)\n            cur_vars = reader.get_variable_to_shape_map()\n            exclude = ['the embedding layer name you want to remove']\n            variables_to_restore = tf.contrib.slim.get_variables_to_restore(exclude=exclude)\n            loader = tf.train.Saver(variables_to_restore)\n            #loader = tf.train.Saver()\n            loader.save(sess, '/tmp')\n            loader.restore(sess, '/tmp')\n            with open(os.path.join(tf_save_dir, 'options.json'), 'w') as fout:\n                fout.write(json.dumps(options))\n\n        summary_writer = tf.summary.FileWriter(tf_log_dir, sess.graph)\n        \nThis code reads the checkpoint file, lists all the variables currently in the graph, excludes the layers named in the _exclude_ variable, and restores the remaining variables along with their associated weights.\n\nFor training run: \n\n     python bin/train_elmo_updated.py --train_prefix=\u003cpath to training folder\u003e --vocab_file \u003cpath to vocab file\u003e --save_dir \u003cpath where models will be 
checkpointed\u003e --restart_ckpt_file \u003cpath to checkpointed model\u003e\n \n \nIn _train_elmo_updated.py_ within bin/, set these options based on your data:\n    \n    batch_size = 128  # batch size for each GPU\n    n_gpus = 3\n\n    # number of tokens in training data \n    n_train_tokens = \n\n    options = {\n     'bidirectional': True,\n     'dropout': 0.1,\n     'all_clip_norm_val': 10.0,\n\n     'n_epochs': 10,\n     'n_train_tokens': n_train_tokens,\n     'batch_size': batch_size,\n     'n_tokens_vocab': vocab.size,\n     'unroll_steps': 20,\n     'n_negative_samples_batch': 8192,\n       \n\n**Visualisation**\n\nVisualization of the word vectors using ELMo:\n\n* t-SNE\n![Optional Text](../master/Tsne_vis.png)\n\n* TensorBoard \n\n![Optional Text](../master/tensorboard_vis.png)\n\n\n### Using the ELMo embedding layer in downstream models\nIf you want to use the ELMo embedding layer when building a downstream model, refer to: https://github.com/PrashantRanjan09/WordEmbeddings-Elmo-Fasttext-Word2Vec\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprashantranjan09%2Felmo-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprashantranjan09%2Felmo-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprashantranjan09%2Felmo-tutorial/lists"}