{"id":15519626,"url":"https://github.com/rmitsch/tapas","last_synced_at":"2025-07-05T06:32:28.217Z","repository":{"id":90224632,"uuid":"108546558","full_name":"rmitsch/tAPAS","owner":"rmitsch","description":"Assisted hyperparameter optimization for t-SNE in a word embedding context.","archived":false,"fork":false,"pushed_at":"2018-07-13T12:16:11.000Z","size":4208,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-12-29T10:44:31.535Z","etag":null,"topics":["dimensionality-reduction","hyperparameter-optimization","t-sne","visualization","word-embeddings"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rmitsch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-27T12:59:42.000Z","updated_at":"2024-10-12T14:47:02.000Z","dependencies_parsed_at":null,"dependency_job_id":"5c12003f-ebec-4db1-9ecc-ef5fc399fa5d","html_url":"https://github.com/rmitsch/tAPAS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmitsch%2FtAPAS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmitsch%2FtAPAS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmitsch%2FtAPAS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rmitsch%2FtAPAS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/ow
ners/rmitsch","download_url":"https://codeload.github.com/rmitsch/tAPAS/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239697013,"owners_count":19682345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dimensionality-reduction","hyperparameter-optimization","t-sne","visualization","word-embeddings"],"created_at":"2024-10-02T10:22:13.706Z","updated_at":"2025-02-19T16:41:01.852Z","avatar_url":"https://github.com/rmitsch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tAPAS\n### Assisted Parameter optimization by Approximating neighbourhood Similarity\n\nBayesian optimization of hyperparameters for t-SNE in the context of word embeddings. The input is a word embedding (labels + coordinates in high-dimensional space). The optimization procedure samples the parameter space to generate low-dimensional approximations of the original word embedding data using t-SNE. The quality/truthfulness of the resulting models is evaluated with several metrics:\n* Trustworthiness: Measure of the proportion of points too close together\u003csup\u003e1\u003c/sup\u003e in the low-dimensional space.\n* Continuity: Measure of the proportion of points too far apart\u003csup\u003e1\u003c/sup\u003e in the low-dimensional space.\n* Generalization: Generalization error of a 1-nearest neighbour classifier (e. g. 
word embedding is clustered in high-dimensional and low-dimensional space - the higher the similarity between the cluster labels, the lower the generalization error).\n* Relative word embedding quality: QVEC [1] is used to evaluate the intrinsic quality of the original word embedding and of its dimensionality-reduced projection. The ratio of the two scores is referred to as 'relative word embedding quality'.\n\nThe first three measures were chosen following [2].\n\n![Main View](https://raw.githubusercontent.com/rmitsch/tapas/master/doc/main.png)\n\n![Generation of New Runs](https://raw.githubusercontent.com/rmitsch/tapas/master/doc/run_generation.png)\n\n\u003csup\u003e1\u003c/sup\u003e: In terms of neighbourhood ranks, not absolute distances.\n\n_____\n\n[1] Y. Tsvetkov, M. Faruqui, W. Ling, G. Lample, and C. Dyer, “Evaluation of Word Vector Representations by Subspace Alignment,” 2015, pp. 2049–2054.\n\n[2] L. J. P. van der Maaten, E. O. Postma, and H. J. van den Herik, Dimensionality Reduction: A Comparative Review. 2008.\n ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmitsch%2Ftapas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frmitsch%2Ftapas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmitsch%2Ftapas/lists"}