{"id":15642098,"url":"https://github.com/transitive-bullshit/text-summarization","last_synced_at":"2025-04-14T15:55:19.096Z","repository":{"id":41512174,"uuid":"219609995","full_name":"transitive-bullshit/text-summarization","owner":"transitive-bullshit","description":"Automagically generates summaries from html or text.","archived":false,"fork":false,"pushed_at":"2023-02-10T05:18:52.000Z","size":334,"stargazers_count":66,"open_issues_count":2,"forks_count":20,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-28T04:41:56.948Z","etag":null,"topics":["extractive-summarization","extractive-text-summarization","summarization","summarize","summary","text"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/transitive-bullshit.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-04T22:38:00.000Z","updated_at":"2024-09-13T05:17:19.000Z","dependencies_parsed_at":"2023-02-10T13:46:15.496Z","dependency_job_id":null,"html_url":"https://github.com/transitive-bullshit/text-summarization","commit_stats":{"total_commits":17,"total_committers":2,"mean_commits":8.5,"dds":0.05882352941176472,"last_synced_commit":"82ebef3c2bc01a0d5a3fad7511c6253f75909a8a"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transitive-bullshit%2Ftext-summarization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transitive-bullshit%2Ftext-summarization/tags","rele
ases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transitive-bullshit%2Ftext-summarization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transitive-bullshit%2Ftext-summarization/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/transitive-bullshit","download_url":"https://codeload.github.com/transitive-bullshit/text-summarization/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248912391,"owners_count":21182267,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extractive-summarization","extractive-text-summarization","summarization","summarize","summary","text"],"created_at":"2024-10-03T11:54:27.472Z","updated_at":"2025-04-14T15:55:19.061Z","avatar_url":"https://github.com/transitive-bullshit.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# text-summarization\n\n\u003e Automagically generates summaries from html or text.\n\n[![NPM](https://img.shields.io/npm/v/text-summarization.svg)](https://www.npmjs.com/package/text-summarization) [![Build Status](https://travis-ci.com/transitive-bullshit/text-summarization.svg?branch=master)](https://travis-ci.com/transitive-bullshit/text-summarization) [![JavaScript Style Guide](https://img.shields.io/badge/code_style-standard-brightgreen.svg)](https://standardjs.com)\n\n## Intro\n\nThis module powers Automagical's text summarization, which was [acquired by Verblio in 2018](https://www.verblio.com/blog/we-bought-a-company).\n\nIt 
provides the most powerful and comprehensive text summarization available on NPM.\n\n## Features\n\n- Uses a variety of metrics to generate quality extractive text summaries\n- Handles html or text-based content\n- Utilizes html structure as a signal of text importance\n- Includes basic abstractive shortening of extracted sentences\n- Usable as a node module or cli\n- Thoroughly tested and used in production\n\n## Install\n\nThis module is usable either as a CLI or as a module.\n\n```bash\nnpm install --save text-summarization\n```\n\n## Usage\n\n```js\nconst fs = require('fs')\nconst summarize = require('text-summarization')\n\n// await is only valid inside an async function in a CommonJS module\nasync function main () {\n  // read the fixture as a utf8 string rather than a raw Buffer\n  const html = fs.readFileSync('fixtures/automagical-1.html', 'utf8')\n\n  const summary = await summarize({ html })\n  console.log(JSON.stringify(summary, null, 2))\n}\n\nmain()\n```\n\nwhich outputs:\n\n```\n{\n  \"extractive\": [\n    \"Why you should drop everything and try Automagical\",\n    \"Video content is significantly more engaging than text content\",\n    \"Go from blog post → video in 5 minutes.\",\n    \"Our builder is exceptionally easy to use.\",\n    \"For the cost of 1 highly produced video, you can get a year's worth of videos from Automagical.\"\n  ]\n}\n```\n\n## CLI\n\n```\nnpm install -g text-summarization\n```\n\nThis installs a `summarize` binary globally.\n\n```bash\n  Usage: summarize [options] \u003cfile\u003e\n\n  Options:\n    -V, --version              output the version number\n    -n, --num-sentences \u003cn\u003e    number of sentences (defaults to variable length)\n    -t, --title \u003ctitle\u003e        title\n    -c, --content-type \u003ctype\u003e  sets content type to html or text\n    -d, --detailed             print detailed info for top sentences\n    -D, --detailedAll          print detailed info for all sentences\n    -m, --media                resolve \u003ca\u003e links using iframely and return best matching media\n    -P, --no-pretty-print      disable pretty-printing output\n    -h, --help                 output 
usage information\n```\n\n## Metrics\n\n- tfidf overlap for base relative sentence importance\n- html node boosts for tags like `\u003ch1\u003e` and `\u003cstrong\u003e`\n- listicle boosts for lists like `2) second item`\n- penalty for poor readability or really long sentences\n\nHere's an example of a sentence's internal structure after normalization, processing, and scoring:\n\n```js\n{\n  \"index\": 8,\n  \"sentence\": {\n    \"original\": \"4. For the cost of 1 highly produced video, you can get a year's worth of videos from Automagical.\",\n    \"listItem\": 4,\n    \"actual\": \"For the cost of 1 highly produced video, you can get a year's worth of videos from Automagical.\",\n    \"normalized\": \"for the cost of 1 highly produced video you can get a years worth of videos from automagical\",\n    \"tokenized\": [\n      \"cost\",\n      \"highly\",\n      \"produced\",\n      \"video\",\n      \"years\",\n      \"worth\",\n      \"videos\",\n      \"automagical\"\n    ]\n  },\n  \"liScore\": 1,\n  \"nodeScore\": 0.7,\n  \"readabilityPenalty\": 0,\n  \"tfidfScore\": 0.8019447657605553,\n  \"score\": 5.601944765760555\n}\n```\n\n## Iframely\n\nThis module optionally supports using [iframely](https://iframely.com) to get social previews for any external links in the source html, adding the resulting images and summary text to the source pool of candidate sentences.\n\nTo enable this, set the `IFRAMELY_BASE_URL` and `IFRAMELY_API_KEY` environment variables.\n\n## References\n\n- [node-summary](https://github.com/jbrooksuk/node-summary)\n- [natural nlp](https://github.com/NaturalNode/natural)\n- [retext](https://github.com/wooorm/retext)\n- [retext-readability](https://github.com/wooorm/retext-readability)\n- [retext-simplify](https://github.com/wooorm/retext-simplify)\n- [retext-redundant-acronyms](https://github.com/wooorm/retext-redundant-acronyms)\n- [retext-repeated-words](https://github.com/wooorm/retext-repeated-words)\n\n## License\n\nMIT © [Travis 
Fischer](https://transitivebullsh.it)\n\nSupport my OSS work by \u003ca href=\"https://twitter.com/transitive_bs\"\u003efollowing me on twitter \u003cimg src=\"https://storage.googleapis.com/saasify-assets/twitter-logo.svg\" alt=\"twitter\" height=\"24px\" align=\"center\"\u003e\u003c/a\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftransitive-bullshit%2Ftext-summarization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftransitive-bullshit%2Ftext-summarization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftransitive-bullshit%2Ftext-summarization/lists"}