{"id":15805300,"url":"https://github.com/lsys/lexicalrichness","last_synced_at":"2025-04-09T12:04:16.947Z","repository":{"id":32410625,"uuid":"132715931","full_name":"LSYS/LexicalRichness","owner":"LSYS","description":":smile_cat: :speech_balloon: A module to compute textual lexical richness (aka lexical diversity).","archived":false,"fork":false,"pushed_at":"2023-08-27T05:12:59.000Z","size":3625,"stargazers_count":90,"open_issues_count":3,"forks_count":19,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-06T02:07:44.390Z","etag":null,"topics":["data-mining","data-science","information-retrieval","lexical-analysis","lexical-analyzer","linguistic-analysis","natural-language","natural-language-processing","nlp","python"],"latest_commit_sha":null,"homepage":"http://lexicalrichness.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LSYS.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"docs/CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-09T07:06:02.000Z","updated_at":"2024-09-29T09:21:47.000Z","dependencies_parsed_at":"2024-06-19T00:18:42.491Z","dependency_job_id":"2bbc26de-204c-43a5-bdbc-24eb73f4a8b5","html_url":"https://github.com/LSYS/LexicalRichness","commit_stats":{"total_commits":86,"total_committers":8,"mean_commits":10.75,"dds":0.4418604651162791,"last_synced_commit":"e7e2816911039d0d713fcbefcd70e3c20db044e3"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LSYS%2FLexicalRichness"
,"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LSYS%2FLexicalRichness/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LSYS%2FLexicalRichness/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LSYS%2FLexicalRichness/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LSYS","download_url":"https://codeload.github.com/LSYS/LexicalRichness/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248036064,"owners_count":21037092,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-mining","data-science","information-retrieval","lexical-analysis","lexical-analyzer","linguistic-analysis","natural-language","natural-language-processing","nlp","python"],"created_at":"2024-10-05T02:08:02.206Z","updated_at":"2025-04-09T12:04:16.924Z","avatar_url":"https://github.com/LSYS.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"===============\nLexicalRichness\n===============\n|\t|pypi| |conda-forge| |latest-release| |python-ver| \n|\t|ci-status| |rtfd| |maintained|\n|\t|PRs| |codefactor| |isort|\n|\t|license| |mybinder| |zenodo|\n\n`LexicalRichness \u003chttps://github.com/lsys/lexicalrichness\u003e`__ is a small Python module to compute textual lexical richness (aka lexical diversity) measures.\n\nLexical richness refers to the range and variety of vocabulary deployed in a text by a speaker/writer `(McCarthy and Jarvis 2007) 
\u003chttps://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1028.8657\u0026rep=rep1\u0026type=pdf\u003e`_ . Lexical richness is used interchangeably with lexical diversity, lexical variation, lexical density, and vocabulary richness and is measured by a wide variety of indices. Uses include (but are not limited to) measuring writing quality, vocabulary knowledge `(Šišková 2012) \u003chttps://www.researchgate.net/publication/305999633_Lexical_Richness_in_EFL_Students'_Narratives\u003e`_ , speaker competence, and socioeconomic status `(McCarthy and Jarvis 2007) \u003chttps://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1028.8657\u0026rep=rep1\u0026type=pdf\u003e`_. \nSee the `notebook \u003chttps://nbviewer.org/github/LSYS/LexicalRichness/blob/master/docs/example.ipynb\u003e`_ for examples.\n\n.. TOC\n.. contents:: **Table of Contents**\n   :depth: 1\n   :local:\n\t\n1. Installation\n---------------\n**Install using PIP**\n\n.. code-block:: bash\n\n\tpip install lexicalrichness\n\nIf you encounter, \n\n.. code-block:: python\n\n\tModuleNotFoundError: No module named 'textblob'\n\ninstall textblob:\n\n.. code-block:: bash\n\n\tpip install textblob\n\n*Note*: This error should only occur for :code:`versions \u003c= v0.1.3`. Fixed in \n`v0.1.4 \u003chttps://github.com/LSYS/LexicalRichness/releases/tag/0.1.4\u003e`__ by `David Lesieur \u003chttps://github.com/davidlesieur\u003e`__ and `Christophe Bedetti \u003chttps://github.com/cbedetti\u003e`__.\n\n\n**Install from Conda-Forge**\n\n*LexicalRichness* is now also available on conda-forge. If you are using the `Anaconda \u003chttps://www.anaconda.com/distribution/#download-section\u003e`__ or `Miniconda \u003chttps://docs.conda.io/en/latest/miniconda.html\u003e`__ distribution, you can create a conda environment and install the package from conda.\n\n.. 
code-block:: bash\n\n\tconda create -n lex\n\tconda activate lex \n\tconda install -c conda-forge lexicalrichness\n\n*Note*: If you get the error :code:`CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'` with :code:`conda activate lex` in *Bash* either try\n\n\t* :code:`conda activate bash` in the *Anaconda Prompt* and then retry :code:`conda activate lex` in *Bash*\n\t* or just try :code:`source activate lex` in *Bash*\n\n**Install manually using Git and GitHub**\n\n.. code-block:: bash\n\n\tgit clone https://github.com/LSYS/LexicalRichness.git\n\tcd LexicalRichness\n\tpip install .\n\n**Run from the cloud**\n\nTry the package on the cloud (without setting anything up on your local machine) by clicking the icon here:  \n\n|mybinder|\n\n\n\n2. Quickstart\n-------------\n\n.. code-block:: python\n\n\t\u003e\u003e\u003e from lexicalrichness import LexicalRichness\n\n\t# text example\n\t\u003e\u003e\u003e text = \"\"\"Measure of textual lexical diversity, computed as the mean length of sequential words in\n            \t\ta text that maintains a minimum threshold TTR score.\n\n            \t\tIterates over words until TTR scores falls below a threshold, then increase factor\n            \t\tcounter by 1 and start over. McCarthy and Jarvis (2010, pg. 
385) recommends a factor\n            \t\tthreshold in the range of [0.660, 0.750].\n            \t\t(McCarthy 2005, McCarthy and Jarvis 2010)\"\"\"\n\n\t# instantiate new text object (use the tokenizer=blobber argument to use the textblob tokenizer)\n\t\u003e\u003e\u003e lex = LexicalRichness(text)\n\n\t# Return word count.\n\t\u003e\u003e\u003e lex.words\n\t57\n\n\t# Return (unique) word count.\n\t\u003e\u003e\u003e lex.terms\n\t39\n\n\t# Return type-token ratio (TTR) of text.\n\t\u003e\u003e\u003e lex.ttr\n\t0.6842105263157895\n\n\t# Return root type-token ratio (RTTR) of text.\n\t\u003e\u003e\u003e lex.rttr\n\t5.165676192553671\n\n\t# Return corrected type-token ratio (CTTR) of text.\n\t\u003e\u003e\u003e lex.cttr\n\t3.6526846651686067\n\n\t# Return mean segmental type-token ratio (MSTTR).\n\t\u003e\u003e\u003e lex.msttr(segment_window=25)\n\t0.88\n\n\t# Return moving average type-token ratio (MATTR).\n\t\u003e\u003e\u003e lex.mattr(window_size=25)\n\t0.8351515151515151\n\n\t# Return Measure of Textual Lexical Diversity (MTLD).\n\t\u003e\u003e\u003e lex.mtld(threshold=0.72)\n\t46.79226361031519\n\n\t# Return hypergeometric distribution diversity (HD-D) measure.\n\t\u003e\u003e\u003e lex.hdd(draws=42)\n\t0.7468703323966486\n\t\n\t# Return voc-D measure.\n\t\u003e\u003e\u003e lex.vocd(ntokens=50, within_sample=100, iterations=3)\n\t46.27679899103406\n\n\t# Return Herdan's lexical diversity measure.\n\t\u003e\u003e\u003e lex.Herdan\n\t0.9061378160786574\n\n\t# Return Summer's lexical diversity measure.\n\t\u003e\u003e\u003e lex.Summer\n\t0.9294460323356605\n\n\t# Return Dugast's lexical diversity measure.\n\t\u003e\u003e\u003e lex.Dugast\n\t43.074336212149774\n\n\t# Return Maas's lexical diversity measure.\n\t\u003e\u003e\u003e lex.Maas\n\t0.023215679867353005\n\n\t# Return Yule's K.\n\t\u003e\u003e\u003e lex.yulek\n\t153.8935056940597\n\n\t# Return Yule's I.\n\t\u003e\u003e\u003e lex.yulei\n\t22.36764705882353\n\t\n\t# Return Herdan's Vm.\n\t\u003e\u003e\u003e 
lex.herdanvm\n\t0.08539428890448784\n\n\t# Return Simpson's D.\n\t\u003e\u003e\u003e lex.simpsond\n\t0.015664160401002505\n\n\t\n3. Use LexicalRichness in your own pipeline\n-------------------------------------------\n:code:`LexicalRichness` comes packaged with minimal preprocessing + tokenization for a quick start. \n\nBut for intermediate users, you likely have your preferred :code:`nlp_pipeline`:\n\n.. code-block:: python\n\n\t# Your preferred preprocessing + tokenization pipeline\n\tdef nlp_pipeline(text):\n\t    ...\n\t    return list_of_tokens\n\nUse :code:`LexicalRichness` with your own :code:`nlp_pipeline`:\n\n.. code-block:: python\n\n\t# Initiate new LexicalRichness object with your preprocessing pipeline as input\n\tlex = LexicalRichness(text, preprocessor=None, tokenizer=nlp_pipeline)\n\n\t# Compute lexical richness\n\tmtld = lex.mtld()\n\t\nOr use :code:`LexicalRichness` at the end of your pipeline and input the :code:`list_of_tokens` with :code:`preprocessor=None` and :code:`tokenizer=None`:\n\t\n.. code-block:: python\n\n\t# Preprocess the text\n\tlist_of_tokens = nlp_pipeline(text)\n\t\n\t# Initiate new LexicalRichness object with your list of tokens as input\n\tlex = LexicalRichness(list_of_tokens, preprocessor=None, tokenizer=None)\n\n\t# Compute lexical richness\n\tmtld = lex.mtld()\t\n\t\n4. Using with Pandas\n--------------------\nHere's a minimal example using `lexicalrichness` with a `Pandas` `dataframe` with a column containing text:\n\n.. code-block:: python\n\n\tdef mtld(text):\n\t    lex = LexicalRichness(text)\n\t    return lex.mtld()\n\t\t\n\tdf['mtld'] = df['text'].apply(mtld)\n\n\n5. 
Attributes\n-------------\n\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``wordlist``            | list of words                                                   \t\t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``words``  \t\t  | number of words (w) \t\t\t\t   \t\t\t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``terms``\t\t  | number of unique terms (t)\t\t\t                                      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``preprocessor``        | preprocessor used\t\t                                                      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``tokenizer``           | tokenizer used\t\t                                                      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``ttr``\t\t  | type-token ratio computed as t / w (Chotlos 1944, Templin 1957)         \t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``rttr``\t          | root TTR computed as t / sqrt(w) (Guiraud 1954, 1960)                             |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``cttr``\t          | corrected TTR computed as t / sqrt(2w) (Carrol 1964)\t\t              |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Herdan`` \t          | log(t) / log(w) (Herdan 1960, 1964)                                               
|\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Summer``    \t  | log(log(t)) / log(log(w)) (Summer 1966)                                           |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Dugast``          \t  | (log(w) ** 2) / (log(w) - log(t) (Dugast 1978)\t\t\t\t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``Maas`` \t          | (log(w) - log(t)) / (log(w) ** 2) (Maas 1972)                                     |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``yulek``\t          | Yule's K (Yule 1944, Tweedie and Baayen 1998)                                     |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``yulei``\t          | Yule's I (Yule 1944, Tweedie and Baayen 1998)                                     |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``herdanvm``\t          | Herdan's Vm (Herdan 1955, Tweedie and Baayen 1998)                                |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``simpsond``\t          | Simpson's D (Simpson 1949, Tweedie and Baayen 1998)                               |\n+-------------------------+-----------------------------------------------------------------------------------+\n\n6. 
Methods\n----------\n\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``msttr``            \t  | Mean segmental TTR (Johnson 1944)\t\t\t\t\t\t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``mattr``  \t\t  | Moving average TTR (Covington 2007, Covington and McFall 2010)\t\t      |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``mtld``\t\t  | Measure of Textual Lexical Diversity (McCarthy 2005, McCarthy and Jarvis 2010)    |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``hdd``                 | HD-D (McCarthy and Jarvis 2007)                                                   |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``vocd``                | voc-D (Mckee, Malvern, and Richards 2010)                                         |\n+-------------------------+-----------------------------------------------------------------------------------+\n| ``vocd_fig``            | Utility to plot empirical voc-D curve \t                                      |\n+-------------------------+-----------------------------------------------------------------------------------+\n\n**Plot the empirical voc-D curve**\n\n.. code-block:: python\n\n\tlex.vocd_fig(\n\t    ntokens=50,  # Maximum number for the token/word size in the random samplings\n\t    within_sample=100,  # Number of samples\n\t    seed=42,  # Seed for reproducibility\n\t)\n\n.. image:: https://raw.githubusercontent.com/LSYS/LexicalRichness/master/docs/images/vocd.png\n\t:width: 450\n\n\n**Accessing method docstrings**\n\n.. 
code-block:: python\n\n\t\u003e\u003e\u003e import inspect\n\n\t# docstring for hdd (HD-D)\n\t\u003e\u003e\u003e print(inspect.getdoc(LexicalRichness.hdd))\n\n\tHypergeometric distribution diversity (HD-D) score.\n\n\tFor each term (t) in the text, compute the probabiltiy (p) of getting at least one appearance\n\tof t with a random draw of size n \u003c N (text size). The contribution of t to the final HD-D\n\tscore is p * (1/n). The final HD-D score thus sums over p * (1/n) with p computed for\n\teach term t. Described in McCarthy and Javis 2007, p.g. 465-466.\n\t(McCarthy and Jarvis 2007)\n\n\tParameters\n\t__________\n\tdraws: int\n\t    Number of random draws in the hypergeometric distribution (default=42).\n\n\tReturns\n\t_______\n\tfloat\n\t\nAlternatively, just do\n\n.. code-block:: python\n\n\t\u003e\u003e\u003e print(lex.hdd.__doc__)\n\t\n\tHypergeometric distribution diversity (HD-D) score.\n\n            For each term (t) in the text, compute the probabiltiy (p) of getting at least one appearance\n            of t with a random draw of size n \u003c N (text size). The contribution of t to the final HD-D\n            score is p * (1/n). The final HD-D score thus sums over p * (1/n) with p computed for\n            each term t. Described in McCarthy and Javis 2007, p.g. 465-466.\n            (McCarthy and Jarvis 2007)\n\n            Parameters\n            ----------\n            draws: int\n                Number of random draws in the hypergeometric distribution (default=42).\n\n            Returns\n            -------\n            float\t\n\t    \n7. Formulation \u0026 Algorithmic Details\n------------------------------------\nFor details under the hood, please see `this section \u003chttps://lexicalrichness.readthedocs.io/en/latest/#details-of-lexical-richness-measures\u003e`_ in the docs (or `see here \u003chttps://www.lucasshen.com/software/lexicalrichness/doc#details-of-lexical-richness-measures\u003e`_).\n\n\t    \n8. 
Example use cases\n--------------------\n* `[1] \u003chttps://doi.org/10.1007/s10579-021-09562-4\u003e`_ **SENTiVENT** used the metrics that LexicalRichness provides to estimate the classification difficulty of annotated categories in their corpus (Jacobs \u0026 Hoste 2020). The metrics show which categories will be more difficult for modeling approaches that rely on linguistic inputs because greater lexical diversity means greater data scarcity and more need for generalization. (h/t Gilles Jacobs)\n\n\tJacobs, Gilles, and Véronique Hoste. \"SENTiVENT: enabling supervised information extraction of company-specific events in economic and financial news.\" Language Resources and Evaluation (2021): 1-33.\n\n\t.. raw:: html\n\n\t   \u003cdetails\u003e\n\t   \u003csummary\u003e\u003ca\u003eClick here for citation metadata\u003c/a\u003e\u003c/summary\u003e\n\n\t.. code-block:: bib\n\n\t\t@article{jacobs2021sentivent, \n\t\t\ttitle={SENTiVENT: enabling supervised information extraction of company-specific events in economic and financial news},\n\t\t\tauthor={Jacobs, Gilles and Hoste, V{\\'e}ronique},\n\t\t\tjournal={Language Resources and Evaluation},\n\t\t\tpages={1--33},\n\t\t\tyear={2021},\n\t\t\tpublisher={Springer}\n\t\t}\n\t\n\t.. raw:: html\n\n    \n* | `[2] \u003chttps://www.lucasshen.com/research/media.pdf\u003e`_ **Measuring political media using text data.** This chapter of my thesis investigates whether political media bias manifests by coverage accuracy. As covariates, I use characteristics of the text data (political speech and news article transcripts). One of the ways speeches can be characterized is via lexical richness.\n    \n\t.. raw:: html\n\n\t   \u003cdetails\u003e\n\t   \u003csummary\u003e\u003ca\u003eShen, Lucas (2021). Measuring political media using text data [Click for metadata]\u003c/a\u003e\u003c/summary\u003e\n\n\t.. 
code-block:: bib\n\n\t\t@techreport{accuracybias, \n\t\t\ttitle={Measuring Political Media Slant Using Text Data},\n\t\t\tauthor={Shen, Lucas},\n\t\t\turl={https://www.lucasshen.com/research/media.pdf},\n\t\t\tyear={2021}\n\t\t}\n\t\n\t.. raw:: html    \t    \n\t\n* `[3] \u003chttps://github.com/notnews/unreadable_news\u003e`_ **Unreadable News: How Readable is American News?** This study characterizes modern news by readability and lexical richness. Focusing on the NYT, they find increasing readability and lexical richness, suggesting that NYT feels competition from alternative sources to be accessible while maintaining its key demographic of college-educated Americans. \n   \n\t.. raw:: html\n\n\t   \u003cdetails\u003e\n\t   \u003csummary\u003e\u003ca\u003eNYT's lexical superiority?\u003c/a\u003e\u003c/summary\u003e\n\t\t\n\t\t\u003cp align=\"left\"\u003e\n\t\t\t\u003cimg width=\"45%\" src=\"https://raw.githubusercontent.com/lsys/lexicalrichness/master/docs/images/boxplot_lex_nyt_cnn_npr_msnbc.png\"\u003e\n\t\t\t\u003cbr\u003e\n\t\t\tSource: \u003ca href=\"https://github.com/notnews/unreadable_news\"\u003e(https://github.com/notnews/unreadable_news)\u003c/a\u003e\n\t\t\u003c/p\u003e\n\t   \n\t\n\t.. raw:: html    \n\n* `[4] \u003chttps://github.com/g-hurst/Comparing-Properties-of-German-and-English-Books\u003e`_ **German is more complicated than English** This study analyses a small sample of English books and compares them to their German translation. Within the sample, it can be observed that the German translations tend to be shorter in length, but contain more unique terms than their English counterparts. LexicalRichness was used to generate the statistics modeled within the study. \n   \n\t.. 
raw:: html\n\n\t   \u003cdetails\u003e\n\t   \u003csummary\u003e\u003ca\u003eWords vs Terms in Each Book\u003c/a\u003e\u003c/summary\u003e\n\t\t\n\t\t\u003cp align=\"left\"\u003e\n\t\t\t\u003cimg width=\"50%\" src=\"https://github.com/g-hurst/Comparing-Properties-of-German-and-English-Books/blob/main/figures/words%20vs%20terms%20scatter.png\"\u003e\n\t\t\t\u003cbr\u003e\n\t\t\tSource: \u003ca href=\"https://github.com/g-hurst/Comparing-Properties-of-German-and-English-Books\"\u003e(https://github.com/g-hurst/Comparing-Properties-of-German-and-English-Books)\u003c/a\u003e\n\t\t\u003c/p\u003e  \n\t\n\t.. raw:: html    \n\t\n\t    \n9. Contributing\n---------------\n**Author**\n\n`Lucas Shen \u003chttps://www.lucasshen.com/\u003e`__\n\n**Contributors**\n\n.. image:: https://contrib.rocks/image?repo=lsys/lexicalrichness\n   :target: https://github.com/lsys/lexicalrichness/graphs/contributors\n\nContributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given. \nSee here for `how to contribute  \u003c./docs//CONTRIBUTING.rst\u003e`__ to this project.\nSee here for `Contributor Code of\nConduct \u003chttp://contributor-covenant.org/version/1/0/0/\u003e`__.\n\nIf you'd like to contribute via a Pull Request (PR), feel free to open an issue on the `Issue Tracker\n\u003chttps://github.com/LSYS/LexicalRichness/issues\u003e`__ to discuss the potential contribution via a PR.\n\n10. Citing\n----------\nIf you have used this codebase and wish to cite it, here is the citation metadata.\n\nCodebase:\n\n.. code-block:: bib\n\n\t@misc{lex,\n\t\tauthor = {Shen, Lucas},\n\t\tdoi = {10.5281/zenodo.6607007},\n\t\tlicense = {MIT license},\n\t\ttitle = {{LexicalRichness: A small module to compute textual lexical richness}},\n\t\turl = {https://github.com/LSYS/lexicalrichness},\n\t\tyear = {2022}\n\t}\n\nDocumentation on formulations and algorithms:\n\n.. 
code-block:: bib\n\n\t@misc{accuracybias, \n\t\ttitle={Measuring Political Media Slant Using Text Data},\n\t\tauthor={Shen, Lucas},\n\t\turl={https://www.lucasshen.com/research/media.pdf},\n\t\tyear={2021}\n\t}\n\nThe package is released under the `MIT\nLicense \u003chttps://opensource.org/licenses/MIT\u003e`__.\n\n.. macros -------------------------------------------------------------------------------------------------------\n.. badges\n.. |pypi| image:: https://badge.fury.io/py/lexicalrichness.svg\n\t:target: https://pypi.org/project/lexicalrichness/\n.. |conda-forge| image:: https://img.shields.io/conda/vn/conda-forge/lexicalrichness   \n\t:target: https://anaconda.org/conda-forge/lexicalrichness\n.. |latest-release| image:: https://img.shields.io/github/v/release/lsys/lexicalrichness   \n\t:target: https://github.com/LSYS/LexicalRichness/releases\n.. |ci-status| image:: https://github.com/LSYS/LexicalRichness/actions/workflows/build.yml/badge.svg?branch=master   \n\t:target: https://github.com/LSYS/LexicalRichness/actions/workflows/build.yml\n.. |python-ver| image:: https://img.shields.io/pypi/pyversions/lexicalrichness   \n\t:target: https://img.shields.io/pypi/pyversions/lexicalrichness\n.. |codefactor| image:: https://www.codefactor.io/repository/github/lsys/lexicalrichness/badge\n\t:target: https://www.codefactor.io/repository/github/lsys/lexicalrichness     \n.. |lgtm| image:: https://img.shields.io/lgtm/grade/python/g/LSYS/LexicalRichness.svg?logo=lgtm\u0026logoWidth=18)\n\t:target: https://lgtm.com/projects/g/LSYS/LexicalRichness/context:python   \n.. |maintained| image:: https://img.shields.io/badge/Maintained%3F-yes-green.svg\n   :target: https://GitHub.com/Naereen/StrapDown.js/graphs/commit-   \n.. |PRs| image:: https://img.shields.io/badge/PRs-welcome-brightgreen.svg\n\t:target: http://makeapullrequest.com   \n.. 
|license| image:: https://img.shields.io/github/license/LSYS/LexicalRichness?color=blue\u0026label=License  \n\t:target: https://github.com/LSYS/LexicalRichness/blob/master/LICENSE   \n.. |mybinder| image:: https://mybinder.org/badge_logo.svg\n   :target: https://mybinder.org/v2/gh/LSYS/lexicaldiversity-example/main?labpath=example.ipynb\t\n.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.6607007.svg\n   :target: https://doi.org/10.5281/zenodo.6607007\n\t\t\n.. |rtfd| image:: https://readthedocs.org/projects/lexicalrichness/badge/?version=latest\n    :target: https://lexicalrichness.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n.. |isort| image:: https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat\u0026amp;labelColor=ef8336\n\t:target: https://pycqa.github.io/isort\n\t:alt: Imports: isort\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsys%2Flexicalrichness","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flsys%2Flexicalrichness","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsys%2Flexicalrichness/lists"}