{"id":17132965,"url":"https://github.com/cutwell/data-mining-casestudy","last_synced_at":"2025-03-24T05:42:06.105Z","repository":{"id":89428271,"uuid":"533511969","full_name":"Cutwell/data-mining-casestudy","owner":"Cutwell","description":"NLP and data mining for insights into Twitch and Youtube brand perception within media and on Twitter","archived":false,"fork":false,"pushed_at":"2022-09-06T21:47:13.000Z","size":794,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-29T11:28:19.247Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Cutwell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-06T21:45:45.000Z","updated_at":"2022-09-06T23:22:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"8295584b-e238-4866-859b-8b916649fa85","html_url":"https://github.com/Cutwell/data-mining-casestudy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cutwell%2Fdata-mining-casestudy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cutwell%2Fdata-mining-casestudy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cutwell%2Fdata-mining-casestudy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cutwell%2Fdata-mining-casestudy/manifests","owner_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/owners/Cutwell","download_url":"https://codeload.github.com/Cutwell/data-mining-casestudy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245217791,"owners_count":20579297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T19:29:01.355Z","updated_at":"2025-03-24T05:42:06.079Z","avatar_url":"https://github.com/Cutwell.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Casestudy: Twitch/Youtube sentiment in the media and on Twitter\nNLP and data mining for insights into Twitch and Youtube brand perception within media and on Twitter. 
\n\n## _Hypothesis 1: There is a correlation between follower count and Tweet engagement_\n![](/.github/images/twitter_favourites.png)\n\n* We show that, when engagement (measured by favorite count) is normalized by the ratio of @Youtube to @Twitch brand account followers, the @Twitch brand account has higher relative engagement with its followers.\n\n![](/.github/images/relative_engagement.png)\n\n* We further show that, when engagement is adjusted relative to each account's total followers, both brands engage actively with only a small percentage of their followers, but @Twitch has higher relative engagement than @Youtube.\n\n## _Hypothesis 2: There is a correlation between media sentiment and Twitter brand sentiment_\n\n* Tweets from a brand account will share content with media headlines for a given time period.\n* The media will reflect a similar sentiment to the brand account.\n\n![](/.github/images/sentiment_twitter_media.png)\n\n* We show that there is negligible correlation between media and Twitter sentiment.\n* This is likely due to article headlines taking a more neutral tone to appear factual, whilst Tweets are more emotive to engage with an audience.\n\n## _Hypothesis 3: There is an intersection between article headlines and Tweet text_\n\n![](/.github/images/intersection_twitter_media.png)\n\n* We show that there is a high intersection between article headlines and a brand's Twitter feed.\n* We also show that this intersection is meaningful: it coincides with peaks in article count, indicating that the overlaps are not accidental.\n* Notable examples for the @Twitch brand account include spikes for new features as well as controversies.\n* Article volume (and thus the intersection score) is higher in recent weeks due to the limitations of Google's news feed.\n\n## Limitations of collecting data using Google's RSS 
feed\n\n![](/.github/images/frequency_twitter_google.png)\n\n* We collected article data for our media analysis using Google's RSS feed for the topic of \"Twitch\".\n* Compared with the data collected from Twitter for the @Twitch brand account, the RSS feed articles were not evenly distributed over time, resulting in a concentration of articles within the last few weeks.\n* Whilst this limited the scope of our analysis, the data still proved useful.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcutwell%2Fdata-mining-casestudy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcutwell%2Fdata-mining-casestudy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcutwell%2Fdata-mining-casestudy/lists"}