Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/edochiari/tiktok-project
This project builds a predictive model to help TikTok classify user-reported content claims, improving moderation efficiency by identifying and prioritizing content that may need review. Insights from this model enable TikTok to manage reports more effectively, ensuring a safer and more engaging platform.
- Host: GitHub
- URL: https://github.com/edochiari/tiktok-project
- Owner: EdoChiari
- Created: 2024-11-07T20:46:34.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-11-08T10:30:31.000Z (3 months ago)
- Last Synced: 2024-12-09T13:40:21.494Z (about 2 months ago)
- Topics: content-claims, dataanalysis, datacleaning, hypothesis-testing, jupyter-notebook, regression, tiktok
- Language: Jupyter Notebook
- Homepage:
- Size: 1.49 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# TikTok Claims Classification Project
## Overview
This project develops a predictive model to help TikTok’s moderation team classify user-reported content efficiently. By analyzing user reports on videos and comments, it aims to distinguish content that makes a claim from content that expresses an opinion. This helps TikTok reduce the backlog of reports, prioritize moderation efforts, and maintain a safe and engaging community.

## Project Goals
1. **Classify Content Claims**: Build and evaluate a model to predict whether a video contains a claim or an opinion, enabling TikTok to streamline content moderation.
2. **Enhance Moderation Efficiency**: Provide a scalable solution for handling high volumes of user reports, improving the speed and accuracy of content review processes.
3. **Deliver Insights for Stakeholders**: Generate actionable insights from user reports to aid TikTok leadership in understanding content trends and moderation needs.

## Deliverables
The final project deliverables include:

- **Model Evaluation**: Comprehensive assessment of the classification model, including accuracy, precision, and recall, to gauge its effectiveness in content claim prediction.
- **Data Visualizations**: Interactive Tableau dashboards summarizing user report trends, claim types, and other key insights, accessible to non-technical stakeholders.
- **Feature Analysis**: Examination of the features that contribute most to accurate claim classification, with discussion of potential causal relationships.
- **Future Model Improvements**: Recommendations for additional features and data sources that may enhance the accuracy and relevance of the model.

## Tools and Libraries Used
- **Data Analysis and Visualization**: Pandas, NumPy, Matplotlib, Seaborn, Tableau
- **Machine Learning**: Scikit-learn (for regression and classification models)
- **Notebook Environment**: Jupyter Notebook

## Project Structure
The project is organized as follows:

1. **Data Preparation**: Building and organizing a comprehensive dataset from user reports for claims classification, ensuring data quality and suitability for analysis.
2. **Exploratory Data Analysis (EDA)**: Analyzing user reports to identify claim patterns, trends, and factors that may impact claim classification.
3. **Hypothesis Testing**: Conducting hypothesis tests to determine the significance of various factors within user reports, informing model feature selection.
4. **Model Building and Evaluation**: Developing and testing a regression model to classify content claims, followed by evaluation using key performance metrics.
5. **Executive Summary**: A presentation-ready summary for stakeholders, highlighting findings, model performance, and potential impact on moderation efforts.

## Conclusion
This project offers TikTok a data-driven approach to improve content moderation by predicting user claims more effectively. With a model that enhances the prioritization of user reports, TikTok can maintain a safe, enjoyable platform while efficiently managing moderation resources.

## Badges
[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://choosealicense.com/licenses/mit/)
[![GPLv3 License](https://img.shields.io/badge/License-GPL%20v3-yellow.svg)](https://opensource.org/licenses/)
[![AGPL License](https://img.shields.io/badge/license-AGPL-blue.svg)](http://www.gnu.org/licenses/agpl-3.0)
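
## Example: Evaluation Metrics Sketch
The Model Evaluation deliverable names accuracy, precision, and recall. As a minimal sketch of what those three numbers measure, the snippet below computes them from scratch in plain Python. It assumes that "claim" is treated as the positive class. The `evaluate` helper and the label lists are hypothetical illustrations, not code or data from the project's notebooks.

```python
from collections import Counter


def evaluate(y_true, y_pred, positive="claim"):
    """Return accuracy, precision, and recall for a binary labelling.

    `positive` names the class treated as the positive label
    (assumed here to be "claim"; the project may define it differently).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted claims that really were claims.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual claims that the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for illustration only; not taken from the project's data.
y_true = ["claim", "claim", "claim", "claim", "claim", "opinion", "opinion", "opinion"]
y_pred = ["claim", "claim", "claim", "opinion", "opinion", "opinion", "opinion", "claim"]

metrics = evaluate(y_true, y_pred)
print(metrics)  # {'accuracy': 0.625, 'precision': 0.75, 'recall': 0.6}
```

In a scikit-learn workflow, `accuracy_score`, `precision_score`, and `recall_score` from `sklearn.metrics` compute the same quantities, with `pos_label` selecting the positive class for the latter two. For a moderation use case, recall on the claim class indicates how many claims the model would surface for review, while precision indicates how much reviewer time would go to content that turns out to be opinion.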