{"id":15933051,"url":"https://github.com/nzw0301/numba-lda","last_synced_at":"2025-04-03T14:42:14.206Z","repository":{"id":85563417,"uuid":"171824850","full_name":"nzw0301/numba-lda","owner":"nzw0301","description":null,"archived":false,"fork":false,"pushed_at":"2019-02-21T07:50:39.000Z","size":5,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-10-29T07:04:18.160Z","etag":null,"topics":["latent-dirichlet-allocation","lda","machine-learning","nlp","topic-modeling"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nzw0301.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-21T07:46:54.000Z","updated_at":"2020-12-14T15:08:28.000Z","dependencies_parsed_at":"2023-07-20T12:00:50.194Z","dependency_job_id":null,"html_url":"https://github.com/nzw0301/numba-lda","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nzw0301%2Fnumba-lda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nzw0301%2Fnumba-lda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nzw0301%2Fnumba-lda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nzw0301%2Fnumba-lda/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nzw0301","download_url":"https://codeload.github.com/nzw0301/numba-lda/tar.gz/refs/h
eads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247023248,"owners_count":20870928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["latent-dirichlet-allocation","lda","machine-learning","nlp","topic-modeling"],"created_at":"2024-10-07T02:20:43.818Z","updated_at":"2025-04-03T14:42:14.176Z","avatar_url":"https://github.com/nzw0301.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Numba implementation of Latent Dirichlet Allocation (LDA) with Gibbs sampling\n\nThis code is almost 100x faster than a pure Python version.\n\n## Requirements\n\n- Python \u003e= 3.6\n- numba\n- numpy\n\n---\n\n## Example\n\n### Generate dataset\n\n```sh\nwget https://cs.nyu.edu/~roweis/data/nips12raw_str602.mat\npython mat2doc.py\n```\n\n### Run LDA!\n\n```sh\npython run_lda.py\n```\n\n```\ntopic k=0\nlearning 0.03689455257690798\nstate 0.024255139224188087\ntime 0.015599175458748034\naction 0.009397594140824722\nfunction 0.009253218702735502\nreinforcement 0.009115405784559428\nalgorithm 0.009108843264646282\ncontrol 0.008977592866383355\npolicy 0.008767592229162671\noptimal 0.007323837848270471\n\ntopic k=1\nfunction 0.018532924551682692\nalgorithm 0.009569897494138776\nfunctions 0.009339126613127633\nlearning 0.009057073314114013\ncase 0.008156781975848319\nvector 0.007316320125252179\nnumber 0.006957343199234844\nlinear 0.006891815665120569\nset 0.006626856505441108\nproblem 0.005464455030718311\n\ntopic k=2\nneural 0.014693381961396692\nanalog 0.013947414087294368\ncircuit 0.012824694761625214\nfigure 
0.011822535900457445\ninput 0.011649230232736703\nchip 0.01130261889729522\noutput 0.011084103055386457\ntime 0.009908638526497947\nsignal 0.009441466726555077\ncurrent 0.008266002197666569\n\ntopic k=3\ndata 0.024904547519382585\nmodel 0.022418628784658737\nmodels 0.011401092737355295\ngaussian 0.010383016621591813\ndistribution 0.010236581015899806\nparameters 0.009445131432754906\nalgorithm 0.009344021133586615\nprobability 0.008169049726010267\nnoise 0.006310712158537881\nlikelihood 0.006164276552845873\n\ntopic k=4\nspeech 0.021998096896237575\nrecognition 0.018257055892100794\nword 0.013921241087767149\nsystem 0.012335177620575517\ncontext 0.01093875217663506\nstate 0.01036121819673376\ntree 0.009887123138605828\nsequence 0.009869883318310266\ntime 0.009697485115354653\nhmm 0.007844204433581825\n\ntopic k=5\nnetwork 0.022863073278914698\nneural 0.019493265579492308\ntime 0.019026000237810575\nmodel 0.01865768520377909\nsystem 0.017063485802747292\nstate 0.013792628410975153\ncontrol 0.012621716437113866\ndynamics 0.011648705078553078\nmemory 0.011362848634230134\nnetworks 0.009730168558001014\n\ntopic k=6\nmodel 0.013859093043465\ncells 0.011926354592537792\nneurons 0.010983271372507045\ncell 0.010296334212237737\ninput 0.008293737745011954\nactivity 0.007750397618245268\nresponse 0.007653372595608361\nneuron 0.00706346045797596\nstimulus 0.006908220421756907\nsynaptic 0.006621026354751659\n\ntopic k=7\nimage 0.019469298410382575\nimages 0.012860513840618495\nfigure 0.010751225970009355\nvisual 0.009627542138262725\nobject 0.009603735277420635\nmotion 0.008327687536284632\nfeature 0.007589674850179854\nrecognition 0.006908798630096091\nfeatures 0.00663740041649627\nmodel 0.006394570435906955\n\ntopic k=8\nnetwork 0.043920836523675455\nlearning 0.02732697867929228\ninput 0.02563930610404473\nnetworks 0.022206751713710726\nunits 0.020890939197416025\noutput 0.01894264675179126\ntraining 0.017299470251973967\nweights 0.015611797676726415\nhidden 
0.015224046162262759\nneural 0.014175210098549591\n\ntopic k=9\nset 0.018181579066742296\ntraining 0.01680302603405747\ndata 0.014843678261210032\nperformance 0.012490458193309786\ntest 0.009546429682830329\nclassification 0.009299425023220943\nnumber 0.008758685092724715\nresults 0.007022976673847939\nneural 0.006872771137598987\nclass 0.006338707008713824\n\ndoc id=0: network network network network ...\ntopic k=0, θ_dk=0.09238653001464127\ntopic k=1, θ_dk=0.48477306002928255\ntopic k=2, θ_dk=0.1509516837481698\ntopic k=3, θ_dk=0.00014641288433382135\ntopic k=4, θ_dk=0.00014641288433382135\ntopic k=5, θ_dk=0.08067349926793557\ntopic k=6, θ_dk=0.0030746705710102485\ntopic k=7, θ_dk=0.017715959004392382\ntopic k=8, θ_dk=0.11142020497803803\ntopic k=9, θ_dk=0.058711566617862365\n```\n\n---\n\n## References\n\n- Thomas L. Griffiths and Mark Steyvers. [Finding Scientific Topics](http://psiexp.ss.uci.edu/research/papers/sciencetopics.pdf). PNAS, 2004.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnzw0301%2Fnumba-lda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnzw0301%2Fnumba-lda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnzw0301%2Fnumba-lda/lists"}