{"id":20164706,"url":"https://github.com/nourmorsy/topic_modelling","last_synced_at":"2026-04-11T15:39:06.689Z","repository":{"id":212573183,"uuid":"731824786","full_name":"nourmorsy/Topic_Modelling","owner":"nourmorsy","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-01T22:48:15.000Z","size":109,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-03T22:32:17.962Z","etag":null,"topics":["jupyter-notebook","numpy","pandas","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nourmorsy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-15T01:05:41.000Z","updated_at":"2024-11-01T22:53:23.000Z","dependencies_parsed_at":"2023-12-15T02:55:09.278Z","dependency_job_id":"0cb81c18-555a-4f76-98cb-182da18b59c4","html_url":"https://github.com/nourmorsy/Topic_Modelling","commit_stats":null,"previous_names":["nourmorsy/topic_modelling"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nourmorsy/Topic_Modelling","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nourmorsy%2FTopic_Modelling","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nourmorsy%2FTopic_Modelling/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nourmorsy%2FTopic_Modelling/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nourmorsy%2FTopic_Modelling/manifests","owner_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nourmorsy","download_url":"https://codeload.github.com/nourmorsy/Topic_Modelling/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nourmorsy%2FTopic_Modelling/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31686141,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T13:07:20.380Z","status":"ssl_error","status_checked_at":"2026-04-11T13:06:47.903Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jupyter-notebook","numpy","pandas","python"],"created_at":"2024-11-14T00:35:30.960Z","updated_at":"2026-04-11T15:39:06.663Z","avatar_url":"https://github.com/nourmorsy.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Topic Modelling\n\n## Overview\nThis project applies topic modeling techniques on the ArXiv dataset to uncover hidden thematic structures within academic research papers. 
Using natural language processing (NLP) and machine learning, we analyze the dataset to categorize papers into topics, allowing for insights into prevailing research areas and trends in scientific literature.\n\n---\n\n## Dependencies\nTo run this project, ensure you have the following dependencies installed:\n- Python 3.x\n- Jupyter Notebook\n- Libraries:\n  - `pandas`\n  - `numpy`\n  - `scikit-learn` (imported as `sklearn`)\n  - `nltk`\n  - `gensim`\n  - `matplotlib`\n  - `seaborn`\n\nYou can install these dependencies using pip:\n```bash\npip install pandas numpy scikit-learn nltk gensim matplotlib seaborn\n```\n\n---\n\n## Usage and Files\nThis project is structured around a Jupyter Notebook for ease of use and reproducibility.\n\n- **`topic_modelling.ipynb`**: The primary Jupyter Notebook that contains code for data loading, preprocessing, topic modeling, and visualization. Each section in the notebook guides you through the process step-by-step.\n\n---\n\n## Dataset Used\n\nThis project uses the ArXiv dataset, which contains metadata of research papers hosted on ArXiv. The dataset can be found and downloaded from Kaggle: [ArXiv](https://www.kaggle.com/datasets/Cornell-University/arxiv)\n\n## Running the Project\n\nTo run this project, follow these steps:\n\n1. **Download the Dataset**: Download the ArXiv dataset (see the link in the Dataset Used section) and place it in the `/data` directory within the project folder.\n2. **Run the Project**:\n```bash\n  jupyter notebook topic_modelling.ipynb\n```\n\n---\n  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnourmorsy%2Ftopic_modelling","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnourmorsy%2Ftopic_modelling","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnourmorsy%2Ftopic_modelling/lists"}