{"id":20458322,"url":"https://github.com/asifdotexe/natural-langwiz","last_synced_at":"2026-04-17T22:03:45.380Z","repository":{"id":248229176,"uuid":"828125385","full_name":"Asifdotexe/Natural-LangWiz","owner":"Asifdotexe","description":"Natural LangWiz is a repository for exploring Natural Language Processing (NLP) techniques through Jupyter notebooks. It covers everything from text preprocessing and sentiment analysis to advanced transformer models. Dive in to see how we turn raw text into actionable insights with a touch of NLP wizardry!","archived":false,"fork":false,"pushed_at":"2026-02-24T02:05:09.000Z","size":29470,"stargazers_count":1,"open_issues_count":6,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-24T08:39:50.322Z","etag":null,"topics":["learning-repository","machine-learning","natural-language-preprocessing","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Asifdotexe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-07-13T07:34:35.000Z","updated_at":"2026-02-07T15:34:19.000Z","dependencies_parsed_at":"2024-07-13T09:23:02.173Z","dependency_job_id":"aac84d9e-647f-4b0f-adf4-675b2ef9a7a3","html_url":"https://github.com/Asifdotexe/Natural-LangWiz","commit_stats":null,"previous_names":["asifdotexe/machineunderstandstextdata","asifdotexe/natural-langwiz"],"tags_count":0,"template":false,"template_full_name":null,
"purl":"pkg:github/Asifdotexe/Natural-LangWiz","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Asifdotexe%2FNatural-LangWiz","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Asifdotexe%2FNatural-LangWiz/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Asifdotexe%2FNatural-LangWiz/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Asifdotexe%2FNatural-LangWiz/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Asifdotexe","download_url":"https://codeload.github.com/Asifdotexe/Natural-LangWiz/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Asifdotexe%2FNatural-LangWiz/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31947761,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T17:29:20.459Z","status":"ssl_error","status_checked_at":"2026-04-17T17:28:47.801Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["learning-repository","machine-learning","natural-language-preprocessing","python"],"created_at":"2024-11-15T12:11:56.588Z","updated_at":"2026-04-17T22:03:41.519Z","avatar_url":"https://github.com/Asifdotexe.png","language":"Jupyter 
Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Natural LangWiz\n\nWelcome to the **Natural LangWiz** repository! Here, we perform a bit of language wizardry to make text data magically understandable for machines. With our collection of Jupyter notebooks, we delve into various aspects of Natural Language Processing (NLP), offering detailed explanations and hands-on examples.\n\nThink of us as modern-day language wizards, transforming raw text into structured data and insightful information—no magic wand required!\n\n## Table of Contents\n\n1. [Data Preprocessing](#data-preprocessing)\n   - [Text Cleaning](#text-cleaning)\n   - [Converting Text to Lowercase](#converting-text-to-lowercase)\n   - [Removing Whitespace and Non-Textual Characters](#removing-whitespace-and-non-textual-characters)\n   - [Removing Digits](#removing-digits)\n   - [Tokenization](#tokenization)\n   - [Stemming](#stemming)\n   - [Lemmatization](#lemmatization)\n   - [Part of Speech Tagging](#part-of-speech-tagging)\n2. [Web Scraping](#web-scraping)\n   - [Wikipedia Scraping using Beautiful Soup](#wikipedia-scraping-using-beautiful-soup)\n   - [Amazon Scraping using Beautiful Soup](#amazon-scraping-using-beautiful-soup)\n3. [Word Cloud](#word-cloud)\n4. [Emojification](#emojification)\n   - [Removing Emojis](#removing-emojis)\n   - [Replacing Emojis with Text](#replacing-emojis-with-text)\n5. [Sentiment Analysis](#sentiment-analysis)\n   - [AFINN Sentiment Analysis](#afinn-sentiment-analysis)\n   - [General Sentiment Analysis](#general-sentiment-analysis)\n6. [Named Entity Recognition](#named-entity-recognition)\n7. [Similarity Checking](#similarity-checking)\n8. [Spam Detection](#spam-detection)\n9. [Transformer Models](#transformer-models)\n   - [Text Summarization](#text-summarization)\n   - [Text Generation](#text-generation)\n   - [Emotion Analysis](#emotion-analysis)\n10. [Translation](#translation)\n11. [Vectorization](#vectorization)\n12. 
[API Calling](#api-calling)\n13. [Grammar Checking](#grammar-checking)\n14. [N-Grams](#n-grams)\n15. [Demojification](#demojification)\n16. [Python Gemini Integration](#python-gemini-integration)\n      - [Python Gemini Notebook](#python-gemini-notebook)\n      - [Gemini TKinter Script](#gemini-tkinter-script)\n17. [Topic Modelling](#topic-modelling)\n\n## Data Preprocessing\n\nData preprocessing is a crucial step in NLP to clean and prepare text data for analysis and modeling. The following preprocessing steps are covered in the [Data Preprocessing Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_data_processing.ipynb):\n\n### Text Cleaning\n\nUsing regular expressions (regex), unwanted characters and patterns are removed from the text to make it clean and uniform.\n\n### Converting Text to Lowercase\n\nConverts all characters in the text to lowercase to ensure uniformity and avoid case sensitivity issues during analysis.\n\n### Removing Whitespace and Non-Textual Characters\n\nRemoves unnecessary whitespace and non-textual characters to streamline the text.\n\n### Removing Digits\n\nDigits are removed from the text to focus on the textual content.\n\n### Tokenization\n\nSplits the text into individual words or tokens, which are the basic units for further NLP tasks.\n\n### Stemming\n\nReduces words to their base or root form by removing suffixes. For example, \"running\" becomes \"run\".\n\n### Lemmatization\n\nSimilar to stemming, but more sophisticated. It reduces words to their dictionary form. For example, \"running\" becomes \"run\" and \"better\" becomes \"good\".\n\n### Part of Speech Tagging\n\nIdentifies and labels the part of speech (e.g., noun, verb, adjective) for each token in the text.\n\n## Web Scraping\n\nWeb scraping is the process of extracting data from websites. 
The following web scraping tasks are covered in the [Web Scraping Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_web_scraping.ipynb):\n\n### Wikipedia Scraping using Beautiful Soup\n\nExtracts data from Wikipedia pages using the Beautiful Soup library.\n\n### Amazon Scraping using Beautiful Soup\n\nExtracts product data from Amazon using the Beautiful Soup library.\n\n## Word Cloud\n\nA word cloud is a visual representation of text data, where the size of each word indicates its frequency or importance. The [Word Cloud Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_word_cloud.ipynb) demonstrates how to create a word cloud from a given corpus.\n\n## Emojification\n\nEmojification involves handling emojis in text data, either by removing them or replacing them with corresponding text. The following tasks are covered in the [Emojification Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_demojification.ipynb):\n\n### Removing Emojis\n\nUses the `demoji` library to identify and remove emojis from the text.\n\n### Replacing Emojis with Text\n\nUses the `emoji` library to replace emojis with their corresponding text descriptions.\n\n## Sentiment Analysis\n\nSentiment analysis determines the sentiment or emotional tone of a piece of text. 
The following notebooks cover different approaches:\n\n### AFINN Sentiment Analysis\n\nThe [AFINN Sentiment Analysis Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_affin_sentimental_analysis.ipynb) uses the AFINN lexicon to classify sentiment into positive, negative, or neutral.\n\n### General Sentiment Analysis\n\nThe [General Sentiment Analysis Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_sentimental_analysis.ipynb) covers broader sentiment analysis techniques and models.\n\n## Named Entity Recognition\n\nNamed Entity Recognition (NER) identifies and classifies key entities in text, such as names of people, organizations, and locations. The [Named Entity Recognition Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_name_entity_recognition.ipynb) demonstrates how to recognize and classify entities using NER techniques.\n\n## Similarity Checking\n\nSimilarity checking involves determining how similar two pieces of text are. The [Similarity Checker Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_similarity_checker.ipynb) explores various methods to compute textual similarity.\n\n## Spam Detection\n\nSpam detection identifies whether a piece of text is spam or not. The [Spam Detection Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_spam_detection.ipynb) covers techniques for classifying text as spam or non-spam.\n\n## Transformer Models\n\nTransformer models are advanced neural network architectures for NLP tasks. 
The following notebooks cover different applications:\n\n### Text Summarization\n\nThe [Text Summarization Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_transformer_summarization.ipynb) demonstrates how to summarize text using transformer models.\n\n### Text Generation\n\nThe [Text Generation Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_transformer_text_generation.ipynb) showcases generating coherent and contextually relevant text with transformer models.\n\n### Emotion Analysis\n\nThe [Emotion Analysis Notebook](https://github.com/Asifdotexe/MachineUnderstandsTextData/blob/main/code/nlp_transformer_emotion_analysis.ipynb) showcases emotion analysis using transformer models.\n\n## Translation\n\nThe [Translation Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_translator.ipynb) covers techniques for translating text between different languages.\n\n## Vectorization\n\nVectorization converts text into numerical representations. The [Vectorization Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_vectorizer.ipynb) explains different vectorization techniques, such as Bag of Words and TF-IDF.\n\n## API Calling\n\nThe [API Calling Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_api_calling.ipynb) demonstrates how to interact with external APIs to retrieve and manipulate text data.\n\n## Grammar Checking\n\nThe [Grammar Checking Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_grammar_checker.ipynb) covers techniques for identifying and correcting grammatical errors in text.\n\n## N-Grams\n\nThe [N-Grams Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_n_grams.ipynb) explains the concept of n-grams and their use in text analysis and modeling.\n\n## Demojification\n\nDemojification involves handling emojis in text data, either by removing or replacing them. 
For more details, refer to the [Demojification Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_demojification.ipynb).\n\n## Python Gemini Integration\n\n### Python Gemini Notebook\n\nThe [Python Gemini Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_gemini_python_integration.ipynb) contains the code for using Gemini through Python.\n\n### Gemini TKinter Script\n\nThe [Gemini TKinter Script](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/prompt_generator_tkinter.py) contains a script that runs Gemini from Python through a Tkinter graphical interface.\n\n![Gemini Chat Interface](gemini_chat_interface.png)\n\n## Topic Modelling\n\nThe [Topic Modelling Notebook](https://github.com/Asifdotexe/Natural-LangWiz/blob/main/code/nlp_topic_modelling.ipynb) contains an explanation and code for topic modelling, using LDA (Latent Dirichlet Allocation) to discover topics within a corpus and pyLDAvis to visualize them.\n\n---\n\nFeel free to explore the notebooks and enhance your understanding of basic NLP concepts. If you have any questions or suggestions, please open an issue or submit a pull request.\n\nHappy Learning!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasifdotexe%2Fnatural-langwiz","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasifdotexe%2Fnatural-langwiz","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasifdotexe%2Fnatural-langwiz/lists"}