https://github.com/danitilahun/nlp-content-moderator

AI-driven solution for automated content moderation.
bert named-entity-recognition python sentiment-analysis text-classification

Host: GitHub
URL: https://github.com/danitilahun/nlp-content-moderator
Owner: Danitilahun
Created: 2024-12-08T20:41:32.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-03-02T08:02:07.000Z (4 months ago)
Last Synced: 2025-05-14T17:16:14.636Z (about 1 month ago)
Topics: bert, named-entity-recognition, python, sentiment-analysis, text-classification
Language: Jupyter Notebook
Homepage:
Size: 4.69 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Amharic Text Moderation

This project provides a framework for building an Amharic text moderation system: a tool that detects harmful and inappropriate content in Amharic text, particularly on social media platforms. The system uses natural language processing (NLP) techniques to identify and moderate hate speech, cyberbullying, explicit content, and sensitive topics. Although the initial focus is social media, the system can be adapted to other platforms, including online forums, news articles, and messaging apps.

The **Amharic Text Moderation** pipeline processes text through multiple stages: sentiment analysis, entity recognition, text classification, and topic classification. The project aims to provide an efficient, scalable, and adaptable approach to content moderation in Amharic-language settings.
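To make the staged design concrete, here is a minimal sketch of how the four stages could be chained. It is an illustration only, not the repository's implementation: every name in it (`ModerationResult`, `run_pipeline`, `STAGES`, the term sets) is hypothetical, and the keyword lookups are placeholders standing in for trained models such as a fine-tuned BERT classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical placeholder vocabularies; a real system would use trained models.
NEGATIVE_TERMS = {"badword"}
FLAGGED_TERMS = {"slur_placeholder"}
POLITICAL_TERMS = {"election"}


@dataclass
class ModerationResult:
    """Collects the output of every stage for one piece of text."""
    text: str
    signals: Dict[str, object] = field(default_factory=dict)


def sentiment_stage(text: str) -> str:
    # Placeholder for a sentiment model: negative if any known term appears.
    return "negative" if set(text.split()) & NEGATIVE_TERMS else "neutral"


def entity_stage(text: str) -> list:
    # Placeholder for entity recognition: treat @mentions as entities.
    return [tok for tok in text.split() if tok.startswith("@")]


def category_stage(text: str) -> str:
    # Placeholder for the harmful-content text classifier.
    return "hate_speech" if set(text.split()) & FLAGGED_TERMS else "none"


def topic_stage(text: str) -> str:
    # Placeholder for the topic classifier.
    return "politics" if set(text.split()) & POLITICAL_TERMS else "general"


# Stage name -> callable; dicts preserve insertion order, so stages run in sequence.
STAGES: Dict[str, Callable[[str], object]] = {
    "sentiment": sentiment_stage,
    "entities": entity_stage,
    "category": category_stage,
    "topic": topic_stage,
}


def run_pipeline(text: str, stages: Dict[str, Callable[[str], object]]) -> ModerationResult:
    """Run each stage on the text and record its output under the stage name."""
    result = ModerationResult(text=text)
    for name, stage in stages.items():
        result.signals[name] = stage(text)
    return result


if __name__ == "__main__":
    print(run_pipeline("@user1 badword election", STAGES).signals)
```

Keeping each stage behind the same `str -> object` signature means a placeholder can be swapped for a real model without touching `run_pipeline`, and the collected `signals` can then be passed to a separate decision step.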

---

## Documentation

- **Overview**: An in-depth introduction to the project, its goals, and architecture.
[Read Overview](docs/01_overview.md)

- **Dataset Requirements**: Details on the necessary datasets to build a robust moderation system.
[Read Dataset Requirements](docs/02_dataset_requirements.md)

- **Found Datasets**: A list of available datasets suitable for training models for Amharic text moderation.
[Read Found Datasets](docs/03_found_datasets.md)

- **Dataset Challenges**: Discusses the challenges in working with Amharic text data and potential solutions.
[Read Dataset Challenges](docs/04_dataset_challenges.md)

- **Model Selection**: Criteria and suggestions for selecting appropriate models for sentiment analysis, entity recognition, and more.
[Read Model Selection](docs/05_model_selection.md)

- **Pipeline Architecture**: A detailed explanation of the system architecture and its components.
[Read Pipeline Architecture](docs/06_pipeline_architecture.md)

- **Decision Engine**: Describes the logic used to determine moderation decisions.
[Read Decision Engine](docs/07_decision_engine.md)

---

## Architecture Diagram

Here is the high-level architecture of the Amharic text moderation pipeline:

![Text Moderation Pipeline](/Diagram/RLNTZkCq5BxtKnn5MZGL7J7CWeH66hJzQLc9Tgdw0qW8hLpdD35rx60xiriXxWiy0XBYYXi4T-pxz4quTYRPj9gRoSJvllFpVUUuBvpdrbU5ObvwXJ0eScwbW3M-zJ0tALhkfT6ma2KggP6nal1Io99h3yd6eKs0EzYSdo-uHNekZAC_r.png)
High-level architecture of the Amharic text moderation pipeline. View the diagram in SVG format [here](https://www.plantuml.com/plantuml/svg/RLNTZkCq5BxtKnn5MZGL7J7CWeH66hJzQLc9Tgdw0qW8hLpdD35rx60xiriXxWiy0XBYYXi4T-pxz4quTYRPj9gRoSJvllFpVUUuBvpdrbU5ObvwXJ0eScwbW3M-zJ0tALhkfT6ma2KggP6nal1Io99h3yd6eKs0EzYSdo-uHNekZAC_rWXqpjJ0nM9n3fsZvodt75RUM9uXN8tfRSiTzY9XlZh7ZqnHeBuCExg8EiISFqWAg62WkJeuwI9kE3Z7ZL585ngSw8pQYvZnv1pJaM4WmiFLXLs4cYAcMoxsC3EcZ8ZfZB7xUtX23Aapk5fO5AOeAuzf3-xlsGQkFm_wtC788lVuqQPCwH69ZCxRpnDDPXrWIVHCorZOrQDnryF9Sjq7L_BHeNT6MfkIV1qfJjbtKl2QzkrilZHRj9xVKHv5mOJhmyJdMWhAqQSaIcO4xSCNg97qRAAICNVHCuJ6mevrZCrNRUht5eiS2ywqyGXMPha7imi6_yG0afBQV2tJv0uIlXKVtpvF-k4OGxCI7Pr_bvn-_Ujq_EVqoxlJR-zEn_-IFjH7_vwEVzGlVvwElv-EVoVVHx8FJTVKjgU0QrfzX5LHwFGrzo7xxStj9zStdrrJ1UndXZg5K72qycBuoCRMk14rkggsXVGDYerQdQReHTuCGYC7VM-9eSl0QQVrtD5UYttQ_RevMAdAadrVuk6DiQdhik92Nxh_CepNgaGKUH--skrGE-p34BNCT4CU3c9vQTjxy1Xsna9OVcWMhIqywlJ4-WMXXeCcuYX7iQyZl9JE6tku_1Z7FWWyjVXZXLeqYF6445o9IeNgbzBjQQwChGqQ7pGlf82BlGx9nvE6FSWoYrbWHw8x4m8-lNa67tOSUHxV9uzZ2BStpthTMEaFbVEmpYbERbJgOCWTfc1qOtTSFDUbbAKY_og5mUNAc-kXCc9FQ-S43qtrOPNprBoXK6G3uyh3UnW30AY3pKchUDPu4x_DeqgREtXblDmTQf6lsdjpYOLvm3Gws6axwJ5K3jiQBjB42OYKgk2QWZn8V1CPdLelE3FPMGRuclpOIaMZ6rQAiRhCwOnm9kl-r3JMHJ_FBjNWudJMqDOsd4-vSl6A97SUqC47y8giZVztiDyWxeFcw2sNEaoLsSMSgeEY0-rpWitmhHHKINt_EiPUKCLLeVu7)

---
