{"id":26453379,"url":"https://github.com/flawed-hooman/text-summarisation","last_synced_at":"2025-03-18T18:58:04.981Z","repository":{"id":249389745,"uuid":"831385548","full_name":"flawed-hooman/Text-Summarisation","owner":"flawed-hooman","description":"Summarise text using the various libraries available for Python: pyteaser, sumy, gensim, pytldr, XLNET, BERT, and GPT2. ","archived":false,"fork":false,"pushed_at":"2024-07-20T12:32:13.000Z","size":73,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-07-20T13:23:44.904Z","etag":null,"topics":["bert","gensim","gpt2","library","natural-language-processing","nlp","pyteaser","python","pytldr","sumy","text-summarization","xlnet"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flawed-hooman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-20T11:56:38.000Z","updated_at":"2024-07-20T13:23:55.866Z","dependencies_parsed_at":"2024-07-20T13:35:48.322Z","dependency_job_id":null,"html_url":"https://github.com/flawed-hooman/Text-Summarisation","commit_stats":null,"previous_names":["flawed-hooman/text-summarisation"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flawed-hooman%2FText-Summarisation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flawed-hooman%2FText-Summarisation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/flawed-hooman%2FText-Summarisation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flawed-hooman%2FText-Summarisation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flawed-hooman","download_url":"https://codeload.github.com/flawed-hooman/Text-Summarisation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244287811,"owners_count":20428890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","gensim","gpt2","library","natural-language-processing","nlp","pyteaser","python","pytldr","sumy","text-summarization","xlnet"],"created_at":"2025-03-18T18:58:04.338Z","updated_at":"2025-03-18T18:58:04.969Z","avatar_url":"https://github.com/flawed-hooman.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Extractive text summarization\nVarious ways to summarise text using the libraries available for Python :\n  1. pyteaser\n  2. sumy\n  3. gensim\n  4. pytldr\n  5. XLNET\n  6. BERT\n  7. 
GPT2\n  \n## INSTALLATION\npip install sumy\u003cbr\u003e\npip install gensim\u003cbr\u003e\npip install pyteaser\u003cbr\u003e\npip install pytldr\u003cbr\u003e\npip install bert-extractive-summarizer\u003cbr\u003e\npip install spacy==2.0.12\u003cbr\u003e\npip install transformers==2.2.0\u003cbr\u003e\n\u003cbr\u003e\n## Pyteaser\nPyteaser has two functions:\u003cbr\u003e\n  Summarize: takes a title and text and summarizes them\u003cbr\u003e\n  SummarizeURL: takes a URL and summarizes the content at that URL\u003cbr\u003e\n  \n## Sumy\nSumy has various preprocessing and summarizer modules\u003cbr\u003e\n  sumytoken: for tokenizing the text\u003cbr\u003e\n  get_stop_words: to remove the stop words from the text\u003cbr\u003e\n  stemmer: to stem the words\u003cbr\u003e\n  LexRankSummarizer: summarizes based on lexical ranking\u003cbr\u003e\n  LsaSummarizer: summarizes based on latent semantic analysis\u003cbr\u003e\n  LuhnSummarizer: summarizes based on Luhn's algorithm\u003cbr\u003e\n\n## Gensim\n  gensim has a summarize function which can be imported and used directly.\n  \n## pytldr\n pytldr is similar to sumy in that it has various NLP utilities such as a tokenizer.\u003cbr\u003e\n Here we have used TextRankSummarizer, RelevanceSummarizer, and LsaSummarizer from pytldr\n\n## XLNET\nXLNet is an auto-regressive language model which outputs the joint probability of a sequence of tokens based on the transformer architecture with recurrence.\n\n## BERT\nExtractive text summarization refers to extracting the relevant information from a large document while retaining the most important information. BERT (Bidirectional Encoder Representations from Transformers) introduces a rather advanced approach to performing NLP tasks.\n\n## GPT2\nThe GPT-2 model, with 1.5 billion parameters, is a large transformer-based language model. It is trained to predict the next word, so we can use this ability to summarize the data.\n\n## Note:\nRun main.py from the \"for_python3\" folder when using Python 3; otherwise, test by running \"summarize.py\" or the notebook named \"Text Summarizer Notebook.ipynb\"\u003cbr\u003e\nPS: pytldr and pyteaser don't work with Python 3\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflawed-hooman%2Ftext-summarisation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflawed-hooman%2Ftext-summarisation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflawed-hooman%2Ftext-summarisation/lists"}