{"id":25961883,"url":"https://github.com/aashish1-1-1/unlinked","last_synced_at":"2025-08-12T18:17:48.105Z","repository":{"id":277658693,"uuid":"848049380","full_name":"Aashish1-1-1/Unlinked","owner":"Aashish1-1-1","description":"DNA test for links :wink:","archived":false,"fork":false,"pushed_at":"2025-02-15T07:49:48.000Z","size":139,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-04T19:50:03.547Z","etag":null,"topics":["embeddings","links","python3","scraper","stackoverflow","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Aashish1-1-1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-27T03:15:35.000Z","updated_at":"2025-02-15T07:49:51.000Z","dependencies_parsed_at":"2025-02-15T08:28:40.751Z","dependency_job_id":"69cd2cfa-3ce3-43e0-a1d2-3f37facd2f9b","html_url":"https://github.com/Aashish1-1-1/Unlinked","commit_stats":null,"previous_names":["aashish1-1-1/unlinked"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Aashish1-1-1/Unlinked","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aashish1-1-1%2FUnlinked","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aashish1-1-1%2FUnlinked/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aashish1-1-1%2FUnlinked/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aashish1-1-1%2FUnli
nked/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Aashish1-1-1","download_url":"https://codeload.github.com/Aashish1-1-1/Unlinked/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aashish1-1-1%2FUnlinked/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270111005,"owners_count":24529189,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embeddings","links","python3","scraper","stackoverflow","vector-database"],"created_at":"2025-03-04T19:50:12.165Z","updated_at":"2025-08-12T18:17:47.993Z","avatar_url":"https://github.com/Aashish1-1-1.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **Unlinked**\n\nUnlinked is a web scraper designed to identify and detect unrelated links in community-based posts, such as those on platforms like StackOverflow, Medium, and similar forums. \n\n## **How It Works**\n\nUnlinked leverages vector-based similarity detection to analyze the relevance of links within a post. The core concept is built around the use of a **vector database**, which is pre-trained on a vast corpus of data. 
The scraper uses **spaCy**, a popular NLP library, to compute word and sentence embeddings.\n\nIn the vector space, embeddings of semantically related words or sentences point in similar directions, so the angle between them is small and their cosine similarity is high (close to 1). Unrelated words or sentences are separated by larger angles, which gives a low or even negative cosine similarity. Comparing these scores allows the application to determine whether the links in a post are contextually related to its content.\n\n![Vector Database](./images/vectordb.webp)\n\n## **Getting Started**\n\n### **Running Locally**\n\nTo run Unlinked on your local machine, follow the steps below:\n\n1. Clone the repository:\n    ```bash\n    git clone https://github.com/Aashish1-1-1/Unlinked\n    ```\n\n2. Navigate to the project directory:\n    ```bash\n    cd Unlinked/unlinked\n    ```\n\n3. Build the Docker image:\n    ```bash\n    sudo docker build -t unlinked .\n    ```\n\n4. Run the application:\n    ```bash\n    sudo docker run -i unlinked\n    ```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faashish1-1-1%2Funlinked","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faashish1-1-1%2Funlinked","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faashish1-1-1%2Funlinked/lists"}