{"id":18710855,"url":"https://github.com/sayamalt/text-similarity-quantifier","last_synced_at":"2026-04-14T15:32:25.686Z","repository":{"id":133945273,"uuid":"482073563","full_name":"SayamAlt/Text-Similarity-Quantifier","owner":"SayamAlt","description":"Successfully developed a machine learning model for computing the similarity score between two text paragraphs taken as input from a webpage. ","archived":false,"fork":false,"pushed_at":"2022-06-01T20:18:02.000Z","size":8434,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-19T08:49:59.072Z","etag":null,"topics":["bag-of-words","cosine-similarity","cosine-similarity-scores","countvectorizer","flask","machine-learning","nlp","pandas","python","text-preprocessing","tfidf"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SayamAlt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-15T20:13:04.000Z","updated_at":"2022-06-01T20:21:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"7f358b76-faaa-4b60-bfb6-5c00eb5084e6","html_url":"https://github.com/SayamAlt/Text-Similarity-Quantifier","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/SayamAlt/Text-Similarity-Quantifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SayamAlt%2FText-Similarity-Quantifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SayamAlt%2FTex
t-Similarity-Quantifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SayamAlt%2FText-Similarity-Quantifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SayamAlt%2FText-Similarity-Quantifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SayamAlt","download_url":"https://codeload.github.com/SayamAlt/Text-Similarity-Quantifier/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SayamAlt%2FText-Similarity-Quantifier/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31803274,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T11:13:53.975Z","status":"ssl_error","status_checked_at":"2026-04-14T11:13:53.299Z","response_time":153,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bag-of-words","cosine-similarity","cosine-similarity-scores","countvectorizer","flask","machine-learning","nlp","pandas","python","text-preprocessing","tfidf"],"created_at":"2024-11-07T12:36:01.976Z","updated_at":"2026-04-14T15:32:25.670Z","avatar_url":"https://github.com/SayamAlt.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Text-Similarity-Quantifier\n\n## Objective\n\nEstablish an algorithm that can quantify 
the degree of similarity between two text documents based on semantic similarity. \n\nSemantic Textual Similarity (STS) assesses the degree to which two sentences\nare semantically equivalent to each other.\n\u003cul\u003e\n  \u003cli\u003e1 means highly similar\u003c/li\u003e\n  \u003cli\u003e0 means highly dissimilar\u003c/li\u003e\n\u003c/ul\u003e\n\n## Technologies Used\n\n\u003cul\u003e\n  \u003cli\u003ePython\u003c/li\u003e\n    \u003col\u003e\n      \u003cb\u003e\u003cem\u003eLibraries Used:\u003c/em\u003e\u003c/b\u003e\n      \u003cli\u003eNumpy\u003c/li\u003e\n      \u003cli\u003ePandas\u003c/li\u003e\n      \u003cli\u003eSeaborn\u003c/li\u003e\n      \u003cli\u003eMatplotlib.pyplot\u003c/li\u003e\n      \u003cli\u003eJoblib\u003c/li\u003e\n      \u003cli\u003ewarnings\u003c/li\u003e\n      \u003cli\u003estring\u003c/li\u003e\n      \u003cli\u003eGensim Downloader\u003c/li\u003e\n      \u003cli\u003eSklearn\u003c/li\u003e\n      \u003cli\u003enltk\u003c/li\u003e\n      \u003cli\u003emath\u003c/li\u003e\n      \u003cli\u003ejson\u003c/li\u003e\n      \u003cli\u003erequests\u003c/li\u003e\n    \u003c/ol\u003e\n  \u003cli\u003eFlask\u003c/li\u003e\n  \u003cli\u003eMachine Learning\u003c/li\u003e\n  \u003cli\u003eNatural Language Processing\u003c/li\u003e\n\u003c/ul\u003e\n\n## API Endpoint\n\nThe final algorithm should be exposed as a Server API Endpoint. To test the API, send a request to the server; the result is returned in the response. The request and response bodies should use the following format:\n\nRequest body: {\"text1\": \"nuclear body seeks new tech …....\", \"text2\": \"terror suspects face arrest ……\"}\nResponse body: {\"similarity score\": 0.2}\n\nNote: The \"text1\", \"text2\", and \"similarity score\" keys must be kept exactly as shown, without any change.\n\n## Important aspect to consider\n\n\u003cp\u003eThe given dataset does not contain any labels. Therefore, it can be treated as an unsupervised learning problem. 
However, this does not imply that supervised techniques/algorithms are not applicable. The candidate is free to use any technique.\u003c/p\u003e\n \n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayamalt%2Ftext-similarity-quantifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsayamalt%2Ftext-similarity-quantifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayamalt%2Ftext-similarity-quantifier/lists"}