{"id":25007889,"url":"https://github.com/samashi47/nlp_labs","last_synced_at":"2025-03-30T01:14:58.581Z","repository":{"id":232133658,"uuid":"782275813","full_name":"Samashi47/NLP_Labs","owner":"Samashi47","description":"NLP Labs for a university course","archived":false,"fork":false,"pushed_at":"2024-05-26T23:13:38.000Z","size":42057,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-05T02:55:54.635Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Samashi47.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-05T01:15:50.000Z","updated_at":"2024-05-26T23:13:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"71ea4130-088d-48d5-861e-bfc8f2325222","html_url":"https://github.com/Samashi47/NLP_Labs","commit_stats":null,"previous_names":["samashi47/nlp_labs"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samashi47%2FNLP_Labs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samashi47%2FNLP_Labs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samashi47%2FNLP_Labs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samashi47%2FNLP_Labs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Samashi47","download_url":"https://codeload.github.com/Samashi47/NLP_Labs/tar
.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246262607,"owners_count":20749175,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-05T02:56:06.041Z","updated_at":"2025-03-30T01:14:58.558Z","avatar_url":"https://github.com/Samashi47.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NLP Labs\n## Description\nThis repository contains the labs completed for the Natural Language Processing course.\n\n### Author: Ahmed Samady\n### Supervised by: Lotfi El Aachak\n\n## Get Started\n\nTo start, clone this branch of the repo onto your local machine:\n```bash\ngit clone -b main --single-branch https://github.com/Samashi47/NLP_Labs\n```\nCreate a virtual environment in the repository by running the following command:\n```bash\npython -m venv /path/to/repo/on/your/local/machine/.venv\n```\nAfter cloning the project and creating your venv, activate it (the command below is for Windows):\n\n```bash\n.venv\\Scripts\\activate\n```\nThen install the dependencies:\n```bash\npip3 install -r requirements.txt\n```\n## [Lab 1](https://github.com/Samashi47/NLP_Labs/tree/main/Lab1)\nLab 1 focuses on scraping a website with Arabic text, storing the scraped text in a MongoDB database, applying an NLP pipeline (tokenization, normalization, stemming, lemmatization, and stopword removal), PoS tagging, and NER.\n## [Lab 2](https://github.com/Samashi47/NLP_Labs/tree/main/Lab2)\nLab 2 focuses on two main areas. 
The first covers RegEx and rule-based NLP, where we attempted to generate a bill from a sentence with a specific pattern. The second part centers on word embeddings. In this section, we explored various techniques, including One-Hot Encoding, Bag of Words, TF-IDF, Word2Vec (both CBOW and Skip-gram), FastText, and GloVe. Additionally, we visualized the encoded vectors using t-SNE for dimensionality reduction. This allowed us to compare these methods and to see how they capture the semantic relationships among words in our corpus.\n## [Lab 3](https://github.com/Samashi47/NLP_Labs/tree/main/Lab3)\nLab 3 focuses on language modeling, which includes regression for short answer grading and classification for sentiment analysis. In this lab, we built an NLP preprocessing pipeline, generated word embeddings, and evaluated multiple models.\n## [Lab 4](https://github.com/Samashi47/NLP_Labs/tree/main/Lab4)\nLab 4 focuses on classification, regression, transformers (text generation), and BERT (text classification). In this lab, we scraped articles from Wikipedia, built an NLP pipeline, worked with embeddings and language modeling, and evaluated the models. We also generated text using GPT-2 and classified text using BERT.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamashi47%2Fnlp_labs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamashi47%2Fnlp_labs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamashi47%2Fnlp_labs/lists"}