https://github.com/cyberpsychofc/project-based-learning-1

Oncology Document Analyzer System

Host: GitHub
URL: https://github.com/cyberpsychofc/project-based-learning-1
Owner: cyberpsychofc
Created: 2025-03-20T05:23:49.000Z (3 months ago)
Default Branch: main
Last Pushed: 2025-03-20T06:16:37.000Z (3 months ago)
Last Synced: 2025-03-20T06:31:35.162Z (3 months ago)
Language: Python
Homepage:
Size: 0 Bytes
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Oncology Document Analyzer System
ODAS (Oncology Document Analyzer System) is part of Project Based Learning. It focuses on oncology-specific knowledge discovery using NLP and machine learning, with the goal of supporting improved drug advisory and diagnostic solutions.

### Project Structure
.
├── venv
│   └── ODAS
│       ├── data             # dataset with embedding model
│       ├── getters          # LLM calls
│       ├── models           # classifier models
│       ├── utils            # prediction and PDF functionality
│       ├── app.py           # frontend server
│       └── requirements.txt # dependencies
├── .gitignore
└── README.md

### Environment Variables

Place your `.env` file in your Python virtual environment directory (`venv` in this project). To run this project, add the following environment variables to your `.env` file:

`GROQ_API_KEY`

`LANGCHAIN_API_KEY`

`HUGGINGFACEHUB_API_TOKEN`
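
Before starting the app, it can help to confirm that all three variables are actually set. The following is a minimal sketch using only the standard library. The helper name `missing_env_vars` is hypothetical and not part of ODAS, and the sketch assumes the variables have already been loaded into the process environment by whatever `.env` loader the project uses.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("GROQ_API_KEY", "LANGCHAIN_API_KEY", "HUGGINGFACEHUB_API_TOKEN")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of the ODAS codebase.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching your real environment.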

To start working with this project, install the required dependencies. Inside the 'ODAS' directory, run the command below:

```
pip install -r requirements.txt
```

To start the Streamlit server:
```
streamlit run app.py
```
### Training your models
#### To train classifier models
```
python models/[model-name]/train.py
```
All trained models are stored in the 'saved_models' directory.
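
The `[model-name]` placeholder in the command above stands for a subdirectory of `models`. As a rough sketch, assuming each classifier keeps its script at `models/<model-name>/train.py` as the command suggests, the available names can be listed as follows. The function `list_trainable_models` is a hypothetical helper and not part of ODAS.

```python
from pathlib import Path


def list_trainable_models(models_dir="models"):
    """Return sorted names of subdirectories that contain a train.py script.

    Hypothetical helper; assumes the models/<model-name>/train.py layout.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "train.py").is_file()
    )


if __name__ == "__main__":
    # Print one ready-to-run training command per discovered model.
    for name in list_trainable_models():
        print(f"python models/{name}/train.py")
```

Run it from inside the 'ODAS' directory so that the relative `models` path resolves correctly.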

#### To train word embedding model
```
python data/train_w2v.py
```
The embedding model is stored in the 'embedding_model' directory.

## 🛠️ Contributors
- **Om Aryan**
- **Pranay Rokade**
- **Mehansh Masih**
- **Vivek Sharma**
- **Harsh Saoji**

#### Download the dataset from the link below:
https://www.kaggle.com/datasets/falgunipatel19/biomedical-text-publication-classification
