https://github.com/ali7haider/classification_of_documents_using_graph-based-features_and_knn_gt
Classification of Documents Using Graph-Based Features and KNN. This project offers hands-on experience with graph theory and machine learning, building skills in data representation, algorithm implementation, and analytical thinking in the context of document classification.
document-classification graph-construction graph-theory knn-classification machine-learning scrapping-python
- Host: GitHub
- URL: https://github.com/ali7haider/classification_of_documents_using_graph-based-features_and_knn_gt
- Owner: ali7haider
- Created: 2024-04-27T10:27:21.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-08T17:09:58.000Z (about 2 years ago)
- Last Synced: 2025-01-05T13:09:52.528Z (over 1 year ago)
- Topics: document-classification, graph-construction, graph-theory, knn-classification, machine-learning, scrapping-python
- Language: Python
- Homepage:
- Size: 1.79 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Classification_of_Documents_Using_Graph-Based-Features_and_KNN_GT
- Project Title: Classification of Documents Using Graph-Based Features and KNN. This project offers hands-on experience with graph theory and machine learning, building skills in data representation, algorithm implementation, and analytical thinking in the context of document classification.
# Objective:
Develop a system to classify documents into predefined topics by representing each document as a directed
graph, identifying common subgraphs, and applying the K-Nearest Neighbors (KNN) algorithm based on
graph similarity measures.
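
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes each document graph is a set of directed word-to-next-word edges, that the common subgraph is approximated by the edges two graphs share, and that distance is `1 - |common edges| / max(|E1|, |E2|)`. The helper names `build_graph`, `graph_distance`, and `knn_predict`, and the toy training sentences, are hypothetical.

```python
from collections import Counter


def build_graph(text):
    """Represent a document as a directed graph: a set of (word, next_word) edges.

    Assumption: whitespace tokenization with lowercasing, no stemming or stop-word removal.
    """
    words = text.lower().split()
    return set(zip(words, words[1:]))


def graph_distance(g1, g2):
    """Distance derived from the shared edges (an approximation of the common subgraph)."""
    if not g1 or not g2:
        return 1.0
    common = g1 & g2
    return 1.0 - len(common) / max(len(g1), len(g2))


def knn_predict(train, query_graph, k=3):
    """Return the majority label among the k training graphs closest to query_graph.

    train is a list of (graph, label) pairs.
    """
    nearest = sorted(train, key=lambda item: graph_distance(item[0], query_graph))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data invented for illustration only.
train_docs = [
    ("fever and cough are common flu symptoms", "disease"),
    ("chronic cough and fever need medical care", "disease"),
    ("the team won the cricket match", "sports"),
    ("the striker scored in the football match", "sports"),
]
train = [(build_graph(text), label) for text, label in train_docs]

prediction = knn_predict(train, build_graph("the team won the football match"), k=3)
print(prediction)
```

With real scraped articles, preprocessing (stop-word removal, stemming) and the choice of `k` would likely matter much more than in this toy example.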
# Data Scraped:
- Diseases and Symptoms
  - Scraped from RemediesLab (https://www.remedieslabs.com)
- Sports
  - Scraped from TimesOfIndia (https://timesofindia.indiatimes.com)
- Science and Education
  - Scraped from SSEC (https://ssec.si.edu)