{"id":13697086,"url":"https://github.com/hyqneuron/pytorch-avitm","last_synced_at":"2025-05-03T17:33:11.697Z","repository":{"id":145586484,"uuid":"93694187","full_name":"hyqneuron/pytorch-avitm","owner":"hyqneuron","description":"PyTorch implementation of AVITM","archived":false,"fork":false,"pushed_at":"2018-07-14T08:54:03.000Z","size":3217,"stargazers_count":159,"open_issues_count":5,"forks_count":43,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-08-03T18:21:48.370Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hyqneuron.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2017-06-08T01:17:32.000Z","updated_at":"2024-05-10T06:50:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"fedbe499-645a-495b-8a27-17995949176a","html_url":"https://github.com/hyqneuron/pytorch-avitm","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyqneuron%2Fpytorch-avitm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyqneuron%2Fpytorch-avitm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyqneuron%2Fpytorch-avitm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyqneuron%2Fpytorch-avitm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hyqneuron","download_url":"https://codeload.github.com/hyqneuron/pytorch-avitm/tar.gz/refs/heads/maste
r","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224369864,"owners_count":17299961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T18:00:52.471Z","updated_at":"2024-11-13T00:31:26.436Z","avatar_url":"https://github.com/hyqneuron.png","language":"Python","funding_links":[],"categories":["Models"],"sub_categories":["Embedding based Topic Models"],"readme":"\n# PyTorch Implementation of Autoencoding Variational Inference for Topic Models\n\n[Original Paper](https://arxiv.org/abs/1703.01488). \n[Original Tensorflow implementation](https://github.com/akashgit/autoencoding_vi_for_topic_models).\n\nMuch of the code and all of the data is copied from the above repo.\n\nWhat this repo contains:\n- `pytorch_run.py`: PyTorch code for training, testing and visualizing AVITM\n- `pytorch_model.py`: PyTorch code for ProdLDA\n- `pytorch_visualize.py`: code for PyTorch graph visualization\n- `tf_run.py`: Tensorflow code for training and testing AVITM, entirely copied from source repo.\n- `tf_model.py`: Tensorflow code for ProdLDA, adapted from source repo.\n- `data` folder: 20Newsgroup dataset, entirely copied from source repo.\n\nNote that the tensorflow implementation prints the topic words first, then has to wait a few seconds to print the\nperplexity, as testing right now isn't parallelized.\n\n# Running the code\n\nCode can be run with pytorch 0.1.12. 
Later PyTorch releases changed several interfaces and broke the code.\n\n```bash\n\n# PyTorch version\npython pytorch_run.py --start\n\n# Tensorflow version\npython tf_run.py -p\n\n```\n\nTunable parameters for both scripts:\n\n```bash\n-f 100   # hidden layer size of encoder1\n-s 100   # hidden layer size of encoder2\n-t 50    # number of topics\n-b 200   # batch size\n-e 80    # number of epochs to train\n-r 0.002 # learning rate\n```\n\nTunable parameters for the PyTorch script only:\n\n```bash\n-m 0.99  # momentum\n-v 0.995 # variance in the prior\n```\n\nIf you want to run the Tensorflow code, note that I'm using Tensorflow 1.1. Older versions may have\ncompatibility issues due to interface differences (for example, `tf.mul` became `tf.multiply`).\n\n\n# Sample output\n\n```\nxxx@xxx:xxx/pytorch_avitm$ python pytorch_run.py --start\nConverting data to one-hot representation\nData Loaded\nDim Training Data (11258, 1995)\nDim Test Data (7487, 1995)\nEpoch 0, loss=779.540215743\nEpoch 5, loss=682.539052863\nEpoch 10, loss=665.758558307\nEpoch 15, loss=660.786747447\nEpoch 20, loss=646.323563425\nEpoch 25, loss=639.089690627\nEpoch 30, loss=638.143001623\nEpoch 35, loss=632.981146561\nEpoch 40, loss=626.119186669\nEpoch 45, loss=622.517933093\nEpoch 50, loss=619.359790467\nEpoch 55, loss=618.568074544\nEpoch 60, loss=622.428580301\nEpoch 65, loss=613.454376756\nEpoch 70, loss=614.152974447\nEpoch 75, loss=614.537361547\n---------------Printing the Topics------------------\nmidea\n     lebanese arab israel lebanon israeli arabs palestinian peace village civilian\nsport\n     cup wings leafs gm st coach playoff det rangers montreal\nsport\n     player defensive offense coach hitter playoff braves pitcher deserve pitch\ncomp \n     jpeg gif converter compression xlib official extension fund stephanopoulos toolkit\n\n     abuse legitimate anonymous cryptography usenet secure privacy server mechanism directory\n\n     cs push ax ah al null db 
byte oname bh\njesus\n     doctrine eternal bible christ jesus pray church sin holy god\n\n     mw eus ax sl bhj mg mi pl pd rg\nnasa \n     spacecraft nasa star medical volume patient japanese mission culture rocket\ncomp \n     dos shipping printer manual parallel adapter software port remote video\n\n     thanks uucp _eos_ georgia appreciate kevin curious anyone hus gordon\npolit|crime\n     firearm amendment minority crime militia homicide federal prohibit assault weapon\ngears\n     bike honda bmw sport _eos_ ground wave andy front motorcycle\ngears\n     bike battery gear helmet plug dealer mile transmission oil amp\n\n     turkish turks island muslim mountain armenia war armenian southern village\ncomp \n     ide scsi quadra scsus isa spec cpu cache mhz meg\n\n     mw sl bio mi jumper wm mb connector mg adapter\nnasa \n     spacecraft nasa rocket km orbit shuttle solar mission star billion\n\n     oname contest remark winner entry prior output char null io\ngears|car  \n     bike dog rider wheel ride oil accident safety helmet batf\n\n     wire wiring voltage neutral ground nec trip outlet panel circuit\ncomp \n     xterm cpu font binary extension vga workstation server toolkit distribution\ncrime\n     apartment girl rape armenians neighbor soldier burn hide woman armenian\n\n     stephanopoulos apartment myers meeting armenians february job consideration walk federal\n\n     puck player score acquire penalty game cup playoff offense defense\n\n     annual player cup excellent hockey app sport green nhl update\n\n     annual june origin shipping papers copy rider excellent nasa print\ncomp \n     screen gateway swap meg menu font frame mouse setup colormap\ncomp \n     workstation hp database graphics amiga dec render processing frame directory\njesus\n     eternal sin heaven faith jesus pray christianity bible god resurrection\njesus\n     absolute doctrine bible scripture truth interpretation belief faith god christianity\n\n     pp winnipeg pt rangers 
louis minnesota philadelphia calgary jose montreal\n\n     gateway quadra vga mouse card video port boot setup ram\ncrime\n     weapon crime criminal violent gun batf gang firearm insurance accident\n\n     mw cross cache link motherboard ram sl wm eus unit\n\n     enforcement encrypt escrow key clipper ripem secure algorithm chip session\n\n     det cup que van tor pit gm leafs wings playoff\ngears|car  \n     bike helmet gear rear detector honda wheel saturn dealer engine\nmidea\n     islamic islam atheist israel religious israeli muslims atheism arab religion\njesus\n     passage jesus verse prophecy worship matthew scripture doctrine biblical holy\n\n     escrow clipper wiretap crypto secure nsa scheme proposal chip warrant\n\n     phil germany april _eos_ curious usa gordon ticket associate reserve\n\n     enforcement americans federal conversation policy encryption legitimate militia clipper economic\ngears|sport|car  \n     gear pitch hitter hit ab helmet rear wheel player worst\n\n     battery amp brand modem shipping electronics voltage external hus audio\n\n     armenia turks turkish genocide armenians armenian muslim escape nazi minority\ncomp \n     button font menu expose specify screen xterm colormap render event\nmidea\n     oo israeli palestinian pl sl rg israel bhj arab arabs\n\n     hus shipping thanks brand appreciate hello condition advance gateway tube\n\n     moral morality reasoning evidence definition existence science conclusion murder objective\n---------------End of Topics------------------\n('The approximated perplexity is: ', 1152.8633604900842)\n\n```\n# PyTorch Graph visualization\n\nRed nodes are weights, orange ones operations, and blue ones variables. Input at top, output at bottom.\n\n![PyTorch forward graph](pytorch_model.png)\n\n\n# Tensorflow Graph visualization\n\nVisualization with Tensorboard. Gives a better high-level overview. 
Note input is at the bottom, and output is at the\ntop.\n\n![Tensorflow forward graph](tf_model.png)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyqneuron%2Fpytorch-avitm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhyqneuron%2Fpytorch-avitm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyqneuron%2Fpytorch-avitm/lists"}