{"id":18295649,"url":"https://github.com/koheiw/lsx","last_synced_at":"2025-04-05T12:31:40.743Z","repository":{"id":43696079,"uuid":"58551823","full_name":"koheiw/LSX","owner":"koheiw","description":"Semi-supervised algorithm for document scaling","archived":false,"fork":false,"pushed_at":"2024-05-29T06:58:56.000Z","size":119921,"stargazers_count":54,"open_issues_count":6,"forks_count":5,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-05-29T20:15:35.290Z","etag":null,"topics":["lsa","quanteda","sentiment-analysis","text-analysis"],"latest_commit_sha":null,"homepage":"https://koheiw.github.io/LSX/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/koheiw.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-05-11T14:24:06.000Z","updated_at":"2024-06-06T06:23:00.619Z","dependencies_parsed_at":"2024-06-06T06:36:14.762Z","dependency_job_id":null,"html_url":"https://github.com/koheiw/LSX","commit_stats":{"total_commits":650,"total_committers":3,"mean_commits":"216.66666666666666","dds":"0.013846153846153841","last_synced_commit":"1f5a68764f86eb01fd529b2ef4b8531ff5bd8aa7"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koheiw%2FLSX","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koheiw%2FLSX/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koheiw%2FLSX/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koheiw%2FL
SX/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/koheiw","download_url":"https://codeload.github.com/koheiw/LSX/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247338917,"owners_count":20923000,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lsa","quanteda","sentiment-analysis","text-analysis"],"created_at":"2024-11-05T14:36:53.145Z","updated_at":"2025-04-05T12:31:35.725Z","avatar_url":"https://github.com/koheiw.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: \n  rmarkdown::github_document\n---\n\n```{r, echo=FALSE}\nknitr::opts_chunk$set(\n  collapse = FALSE,\n  comment = \"##\",\n  fig.path = \"images/\",\n  dpi = 150,\n  fig.height = 5,\n  fig.width = 10\n)\n```\n\n# LSS: Semi-supervised algorithm for document scaling\n\n\u003c!-- badges: start --\u003e\n[![CRAN\nVersion](https://www.r-pkg.org/badges/version/LSX)](https://CRAN.R-project.org/package=LSX)\n[![Downloads](https://cranlogs.r-pkg.org/badges/LSX)](https://CRAN.R-project.org/package=LSX)\n[![Total\nDownloads](https://cranlogs.r-pkg.org/badges/grand-total/LSX?color=orange)](https://CRAN.R-project.org/package=LSX)\n[![R build\nstatus](https://github.com/koheiw/LSX/workflows/R-CMD-check/badge.svg)](https://github.com/koheiw/LSX/actions)\n[![codecov](https://codecov.io/gh/koheiw/LSX/branch/master/graph/badge.svg)](https://app.codecov.io/gh/koheiw/LSX)\n\u003c!-- badges: end --\u003e\n\nIn quantitative text analysis, the cost of training supervised machine learning models 
tends to be very high when the corpus is large. Latent Semantic Scaling (LSS) is a semi-supervised document scaling technique that I developed to perform large-scale analysis at low cost. Taking user-provided *seed words* as weak supervision, it estimates the polarity of words in the corpus by latent semantic analysis and locates documents on a unidimensional scale (e.g. sentiment). \n\n## Installation\n\nFrom CRAN:\n\n```{r, eval=FALSE}\ninstall.packages(\"LSX\")\n```\n\nFrom GitHub:\n\n```{r, eval=FALSE}\ndevtools::install_github(\"koheiw/LSX\")\n```\n\n## Examples\n\nPlease visit the package website to understand the usage of the functions:\n\n- [Introduction to LSX](https://koheiw.github.io/LSX/articles/pkgdown/basic.html)\n- [Application in research](https://koheiw.github.io/LSX/articles/pkgdown/research.html)\n- [Selection of seed words](https://koheiw.github.io/LSX/articles/pkgdown/seedwords.html)\n\nPlease read the following papers for the algorithm and methodology, and its application to non-English texts (Japanese and Hebrew): \n\n- Watanabe, Kohei. 2020. [\"Latent Semantic Scaling: A Semisupervised Text Analysis Technique for New Domains and Languages\"](https://www.tandfonline.com/doi/full/10.1080/19312458.2020.1832976), *Communication Methods and Measures*.\n- Watanabe, Kohei, Segev, Elad, \u0026 Tago, Atsushi. (2022). [\"Discursive diversion: Manipulation of nuclear threats by the conservative leaders in Japan and Israel\"](https://journals.sagepub.com/doi/full/10.1177/17480485221097967), *International Communication Gazette*. \n\n## Other publications\n\nLSS has been used for research in various fields of social science.\n\n- Nakamura, Kentaro. 2022. [Balancing Opportunities and Incentives: How Rising China’s Mediated Public Diplomacy Changes Under Crisis](https://ijoc.org/index.php/ijoc/article/view/18676/3968), *International Journal of Communication*.\n- Zollinger, Delia. 
2022. [Cleavage Identities in Voters’ Own Words: Harnessing Open-Ended Survey Responses](https://onlinelibrary.wiley.com/doi/10.1111/ajps.12743), *American Journal of Political Science*.\n- Brändle, Verena K., and Olga Eisele. 2022. [\"A Thin Line: Governmental Border Communication in Times of European Crises\"](https://onlinelibrary.wiley.com/doi/full/10.1111/jcms.13398) *Journal of Common Market Studies*.\n- Umansky, Natalia. 2022. [\"Who gets a say in this? Speaking security on social media\"](https://journals.sagepub.com/doi/10.1177/14614448221111009). *New Media \u0026 Society*.\n- Rauh, Christian. 2022. [\"Supranational emergency politics? What executives’ public crisis communication may tell us\"](https://www.tandfonline.com/doi/full/10.1080/13501763.2021.1916058), *Journal of European Public Policy*.\n- Trubowitz, Peter and Watanabe, Kohei. 2021. [\"The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats\"](https://academic.oup.com/isq/advance-article/doi/10.1093/isq/sqab029/6278490), *International Studies Quarterly*.\n- Vydra, Simon and Kantorowicz, Jaroslaw. 2020. [\"Tracing Policy-relevant Information in Social Media: The Case of Twitter before and during the COVID-19 Crisis\"](https://www.degruyter.com/document/doi/10.1515/spp-2020-0013/html). *Statistics, Politics and Policy*.\n- Watanabe, Kohei. 2017. [\"Measuring News Bias: Russia's Official News Agency ITAR-TASS’s Coverage of the Ukraine Crisis\"](http://journals.sagepub.com/eprint/TBc9miIc89njZvY3gyAt/full), *European Journal of Communication*.\n\nMore publications are available on [Google Scholar](https://scholar.google.com/scholar?oi=bibs\u0026hl=en\u0026cites=5312969973901591795).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkoheiw%2Flsx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkoheiw%2Flsx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkoheiw%2Flsx/lists"}