{"id":20251814,"url":"https://github.com/jillmpla/sentimentanalysis","last_synced_at":"2026-04-27T18:32:57.128Z","repository":{"id":130358827,"uuid":"112402719","full_name":"jillmpla/sentimentanalysis","owner":"jillmpla","description":"Comment sentiment analysis of the top 25 posts (from the last 24 hrs) on a subreddit (reddit.com) using a web scraper.","archived":false,"fork":false,"pushed_at":"2020-08-27T02:52:54.000Z","size":8900,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-06-12T14:09:20.300Z","etag":null,"topics":["python","reddit","sentiment-analysis","sqlite","web-scraper"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jillmpla.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"github":"JMPPMJ"}},"created_at":"2017-11-28T23:48:30.000Z","updated_at":"2020-08-27T02:52:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"47f7d10b-77da-4109-9aa3-7bbf728b5469","html_url":"https://github.com/jillmpla/sentimentanalysis","commit_stats":null,"previous_names":["jillmpla/sentimentanalysis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jillmpla/sentimentanalysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jillmpla%2Fsentimentanalysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jillmpla%2Fsentimentanalysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jillmpla%2Fsentimentanalysis/relea
ses","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jillmpla%2Fsentimentanalysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jillmpla","download_url":"https://codeload.github.com/jillmpla/sentimentanalysis/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jillmpla%2Fsentimentanalysis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32349590,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T17:12:42.749Z","status":"ssl_error","status_checked_at":"2026-04-27T17:12:41.658Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["python","reddit","sentiment-analysis","sqlite","web-scraper"],"created_at":"2024-11-14T10:13:11.914Z","updated_at":"2026-04-27T18:32:57.112Z","avatar_url":"https://github.com/jillmpla.png","language":"HTML","funding_links":["https://github.com/sponsors/JMPPMJ"],"categories":[],"sub_categories":[],"readme":"\u003cb\u003eComment Sentiment Analysis of the Top 25 Posts on a Subreddit (www.reddit.com) (from the last 24 hrs)\u003c/b\u003e\n\n\u003ci\u003eLanguages:\u003c/i\u003e Python, SQL(SQLite)\n\n\u003cb\u003ePurpose of the program:\u003c/b\u003e\nTo define, evaluate, and visualize overall public sentiment towards various news articles. 
\n\nThree versions of the program are available in this repository:\u003cul\u003e\n\u003cli\u003eRedditbotSpidernews scrapes and analyzes the top posts from the last 24 hrs on \u003ca href=\"https://www.reddit.com/r/news/top/\"\u003e/r/news/\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003eRedditbotSpiderpolitics scrapes and analyzes the top posts from the last 24 hrs on \u003ca href=\"https://www.reddit.com/r/politics/top/\"\u003e/r/politics/\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003eRedditbotSpiderworldnews scrapes and analyzes the top posts from the last 24 hrs on \u003ca href=\"https://www.reddit.com/r/worldnews/top/\"\u003e/r/worldnews/\u003c/a\u003e.\u003c/li\u003e\u003c/ul\u003e\n  \n\u003ci\u003eNote\u003c/i\u003e: The program can (theoretically) be used on any subreddit by changing the address and, if needed, altering the XPath expressions within RedditbotSpider.py.\n\n\u003chr\u003e\n\n\u003cb\u003eWhat the program does:\u003c/b\u003e\n\u003cul\u003e\n\u003cli\u003eA web scraper connects to the subreddit and collects the top 25 post titles, as well as the comments within each post.\u003c/li\u003e\n\u003cli\u003eData is inserted into a SQLite database.\u003c/li\u003e\n\u003cli\u003eData is cleaned up: any rows lacking a comment are deleted.\u003c/li\u003e\n\u003cli\u003eComments are combined for each corresponding title and placed into a new database table.\u003c/li\u003e\n\u003cli\u003eA unique ID (1-25) is added for each title and corresponding group of comments.\u003c/li\u003e\n\u003cli\u003eA word-based lexicon for sentiment analysis is applied to each set of comments.\u003c/li\u003e\n\u003cli\u003eData visualization: an interactive (HTML) bar chart, CSV file, and completion window are generated.\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003chr\u003e\n\n\u003cb\u003eLexicon used to extract an overall sentiment level:\u003c/b\u003e\n\u003ctable style=\"width:100%\"\u003e\n  \u003ctr\u003e\n    \u003cth\u003ePositive +1\u003c/th\u003e\n    \u003cth\u003eNegative -1\u003c/th\u003e \n  \u003c/tr\u003e\n  
\u003ctr\u003e\n    \u003ctd\u003egood\u003c/td\u003e\n    \u003ctd\u003efuck\u003c/td\u003e \n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003egreat\u003c/td\u003e\n    \u003ctd\u003ecorrupt\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003ehappy\u003c/td\u003e\n    \u003ctd\u003estupid\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003ewin\u003c/td\u003e\n    \u003ctd\u003eirrelevant\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003elove\u003c/td\u003e\n    \u003ctd\u003ecolluding\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003enice\u003c/td\u003e\n    \u003ctd\u003ehorrible\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003eauthentic\u003c/td\u003e\n    \u003ctd\u003eunfair\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003elike\u003c/td\u003e\n    \u003ctd\u003eguilty\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003efun\u003c/td\u003e\n    \u003ctd\u003efoolish\u003c/td\u003e \n  \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd\u003eappreciate\u003c/td\u003e\n    \u003ctd\u003ehateful\u003c/td\u003e \n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003chr\u003e\n\n\u003cb\u003eHow to run the program:\u003c/b\u003e\n\u003cul\u003e\n\u003cli\u003eDownload and install \u003ca href=\"https://sqlite.org/download.html\"\u003eSQLite\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003eDownload and install \u003ca href=\"https://www.python.org/downloads/\"\u003ePython 3.6.3\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003eMake sure your System PATH includes the path to Python's interpreter\u003c/li\u003e\n\u003cli\u003eIn Windows Command Prompt do/install the following:\u003c/li\u003e\u003cul\u003e\n\u003cli\u003epip3 install pandas\u003c/li\u003e\n\u003cli\u003epip3 install scrapy\u003c/li\u003e\n\u003cli\u003epip3 install plotly\u003c/li\u003e\n\u003cli\u003epip install 
pypiwin32\u003c/li\u003e\u003c/ul\u003e\n\u003cli\u003eDownload this repository \u0026 unzip it\u003c/li\u003e\n\u003cli\u003eOpen sentimentanalysis-master, then RedditbotSpidernews, RedditbotSpiderpolitics, or RedditbotSpiderworldnews; right-click main.py and choose Edit with IDLE, then Run-\u003eRun Module\u003c/li\u003e\n\u003c/ul\u003e\n\u003ci\u003eNote\u003c/i\u003e: Before running the program a \u003ci\u003esecond\u003c/i\u003e time, move or delete the generated result files (test.db, temp-plot.html, and results.csv) from the RedditbotSpidernews/RedditbotSpiderpolitics/RedditbotSpiderworldnews folder.\n\n\u003chr\u003e\n\n\u003cb\u003eTools/Libraries/Packages used:\u003c/b\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"https://www.python.org/downloads/\"\u003ePython 3.6.3\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://scrapy.org/\"\u003eScrapy 1.4\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://sqlite.org/download.html\"\u003eSQLite3\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://pypi.python.org/pypi/pip\"\u003ePip/Pip3\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://pandas.pydata.org/\"\u003ePandas\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://plot.ly/python/\"\u003ePlotly\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://docs.python.org/3/library/tk.html\"\u003eTkinter\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjillmpla%2Fsentimentanalysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjillmpla%2Fsentimentanalysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjillmpla%2Fsentimentanalysis/lists"}