https://github.com/sigdelsanjog/gptmed
pip install gptmed
casual-inference conversation-ai custom-model deep-learning deep-learning-algorithms gpt language-model llm medical-llm medical-question-answering-llm model-training-and-optimization nlp pip pytorch question-answering-model redis tiny-language-model tiny-llm transformer-architecture
- Host: GitHub
- URL: https://github.com/sigdelsanjog/gptmed
- Owner: sigdelsanjog
- License: mit
- Created: 2026-01-08T09:02:12.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-02-05T20:06:29.000Z (5 months ago)
- Last Synced: 2026-02-06T05:47:48.639Z (5 months ago)
- Topics: casual-inference, conversation-ai, custom-model, deep-learning, deep-learning-algorithms, gpt, language-model, llm, medical-llm, medical-question-answering-llm, model-training-and-optimization, nlp, pip, pytorch, question-answering-model, redis, tiny-language-model, tiny-llm, transformer-architecture
- Language: Python
- Homepage: https://pypi.org/project/gptmed
- Size: 8.38 MB
- Stars: 1
- Watchers: 0
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# GptMed 🤖
A lightweight GPT-based framework for training custom question-answering language models on any domain. The package provides a transformer-based GPT architecture that you can train on your own Q&A datasets, whether they cover casual conversation, technical support, education, or any other domain.
## Citation
If you use this model in your research, please cite:
```bibtex
@software{gptmed_2026,
author = {Sanjog Sigdel},
title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},
year = {2026},
url = {https://github.com/sigdelsanjog/gptmed}
}
```
## Table of Contents
- [Installation](#installation)
  - [From PyPI (Recommended)](#from-pypi-recommended)
  - [From Source](#from-source)
  - [With Optional Dependencies](#with-optional-dependencies)
- [Quick Start](#quick-start)
  - [Using the High-Level API](#using-the-high-level-api)
  - [Inference (Generate Answers)](#inference-generate-answers)
  - [Using Command Line](#using-command-line)
  - [Training Your Own Model](#training-your-own-model)
- [Model Architecture](#model-architecture)
- [Configuration](#configuration)
  - [Model Sizes](#model-sizes)
  - [Training Configuration](#training-configuration)
- [Observability](#observability)
- [Project Structure](#project-structure)
- [Requirements](#requirements)
- [Documentation](#documentation)
- [Performance](#performance)
- [Examples](#examples)
- [Contributing](#contributing)
- [Citation](#citation)
- [License](#license)
- [Support](#support)
## Installation
### From PyPI (Recommended)
```bash
pip install gptmed
```
### From Source
```bash
git clone https://github.com/sigdelsanjog/gptmed.git
cd gptmed
pip install -e .
```
### With Optional Dependencies
```bash
# For development
pip install gptmed[dev]
# For training with logging integrations
pip install gptmed[training]
# For visualization (loss curves, metrics plots)
pip install gptmed[visualization]
# For Explainable AI features
pip install gptmed[xai]
# All dependencies
pip install gptmed[dev,training,visualization,xai]
```
## Quick Start
### Using the High-Level API
The easiest way to use GptMed is through the high-level API:
```python
import gptmed

# 1. Create a training configuration
gptmed.create_config('my_config.yaml')

# 2. Edit my_config.yaml with your settings (data paths, model size, etc.)

# 3. Train the model
gptmed.train_from_config('my_config.yaml')

# 4. Generate answers
answer = gptmed.generate(
    checkpoint='model/checkpoints/best_model.pt',
    tokenizer='tokenizer/my_tokenizer.model',
    prompt='What is machine learning?',
    max_length=150,
    temperature=0.7
)
print(answer)
```
For a complete API testing workflow, see the [gptmed-api folder](https://github.com/sigdelsanjog/gptmed/tree/main/gptmed-api) with ready-to-run examples.
### Inference (Generate Answers)
```python
import torch  # needed once the checkpoint line below is uncommented

from gptmed.inference.generator import TextGenerator
from gptmed.model.architecture import GPTTransformer
from gptmed.model.configs.model_config import get_small_config

# Load model
config = get_small_config()
model = GPTTransformer(config)

# Load your trained checkpoint
# model.load_state_dict(torch.load('path/to/checkpoint.pt'))

# Create generator
generator = TextGenerator(
    model=model,
    tokenizer_path='path/to/tokenizer.model'
)

# Generate answer
question = "What's your favorite programming language?"
answer = generator.generate(
    prompt=question,
    max_length=100,
    temperature=0.7
)
print(f"Q: {question}")
print(f"A: {answer}")
```
### Using Command Line
```bash
# Generate answers
gptmed-generate --prompt "How do I train a custom model?" --max-length 100
# Train model
gptmed-train --model-size small --num-epochs 10 --batch-size 16
```
### Training Your Own Model
```python
from gptmed.training.train import main
from gptmed.configs.train_config import get_default_config
from gptmed.model.configs.model_config import get_small_config
# Configure training
train_config = get_default_config()
train_config.batch_size = 16
train_config.num_epochs = 10
train_config.learning_rate = 3e-4
# Start training
main()
```
## Model Architecture
The model uses a custom GPT-based transformer architecture:
- **Embedding**: Token + positional embeddings
- **Transformer Blocks**: Multi-head self-attention + feed-forward networks
- **Parameters**: ~2M (tiny), ~10M (small), ~50M (medium)
- **Context Length**: 512 tokens
- **Vocabulary**: Custom SentencePiece tokenizer trained on your data
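
To get a feel for where figures such as ~10M come from, the following sketch estimates the parameter count of a generic GPT-style decoder. The hyperparameter values are illustrative assumptions, not GptMed's actual configurations. The formula assumes a standard block (four attention projections, a 4x feed-forward expansion, two LayerNorms), learned positional embeddings, and an output head tied to the token embedding.

```python
def estimate_gpt_params(vocab_size, context_len, d_model, n_layers, ffn_mult=4):
    """Rough parameter count for a generic GPT-style decoder (tied output head)."""
    # Token embedding table plus learned positional embedding table.
    embeddings = vocab_size * d_model + context_len * d_model
    # Attention: Q, K, V and output projections, each d_model x d_model plus bias.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward: d_model -> d_ff -> d_model, with biases.
    d_ff = ffn_mult * d_model
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + feed_forward + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layers * per_block + final_norm


# Hypothetical settings, chosen only for illustration.
total = estimate_gpt_params(vocab_size=8000, context_len=512, d_model=256, n_layers=6)
print(f"{total / 1e6:.2f}M parameters")
```

In this estimate the per-block cost grows with the square of `d_model`, so widening the model changes the total far more than extending the context length does.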
## Configuration
### Model Sizes
```python
from gptmed.model.configs.model_config import (
    get_tiny_config,    # ~2M parameters - for testing
    get_small_config,   # ~10M parameters - recommended
    get_medium_config   # ~50M parameters - higher quality
)
```
### Training Configuration
```python
from gptmed.configs.train_config import TrainingConfig
config = TrainingConfig(
    batch_size=16,
    learning_rate=3e-4,
    num_epochs=10,
    warmup_steps=100,
    grad_clip=1.0
)
```
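
The `warmup_steps` and `grad_clip` settings above correspond to two common training techniques. The sketch below shows one conventional interpretation in plain Python: a linear learning-rate warmup and clipping by global gradient norm. The function names are hypothetical, and GptMed's own schedule and clipping may differ in detail.

```python
import math


def warmup_lr(step, base_lr=3e-4, warmup_steps=100):
    """Linearly ramp the learning rate up to base_lr, then hold it constant."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def clip_by_global_norm(grads, max_norm=1.0):
    """Scale all gradients so their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


print(warmup_lr(0), warmup_lr(49), warmup_lr(500))
print(clip_by_global_norm([3.0, 4.0]))  # a norm of 5.0 is rescaled to a norm of 1.0
```

Warmup keeps early updates small while optimizer statistics are still noisy, and clipping bounds the size of any single update when a batch produces an unusually large gradient.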
## Observability
**New in v0.4.0**: Built-in training monitoring, implemented with the Observer pattern.
### Features
- 📊 **Loss Curves**: Track training/validation loss over time
- 📈 **Metrics Tracking**: Perplexity, gradient norms, learning rates
- 🔔 **Callbacks**: Console output, JSON logging, early stopping
- 📁 **Export**: CSV export, matplotlib visualizations
- 🔌 **Extensible**: Add custom observers for integrations (W&B, TensorBoard)
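
The Observer pattern behind these features can be sketched in a few lines. This is a generic illustration, not GptMed's implementation: the class and method names (`TrainingObserver`, `on_step`, `PerplexityObserver`, `ToyTrainer`) are hypothetical, and the loss values are made up. It also shows how perplexity is derived from a cross-entropy loss as `exp(loss)`.

```python
import math


class TrainingObserver:
    """Hypothetical observer interface: receives one event per training step."""

    def on_step(self, step, loss):
        raise NotImplementedError


class PerplexityObserver(TrainingObserver):
    """Records (step, loss, perplexity) rows, with perplexity = exp(loss)."""

    def __init__(self):
        self.history = []

    def on_step(self, step, loss):
        self.history.append((step, loss, math.exp(loss)))


class ToyTrainer:
    """Stand-in for a training loop that notifies every registered observer."""

    def __init__(self, observers):
        self.observers = list(observers)

    def train(self, losses):
        for step, loss in enumerate(losses):
            for observer in self.observers:
                observer.on_step(step, loss)


tracker = PerplexityObserver()
ToyTrainer([tracker]).train([2.0, 1.5, 1.0])  # made-up loss values
for step, loss, ppl in tracker.history:
    print(f"step={step} loss={loss:.2f} perplexity={ppl:.2f}")
```

Because the loop only calls `on_step`, new behavior (logging to a file, sending metrics to an external service) can be added by registering another observer without touching the loop itself.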
### Quick Example
```python
from gptmed.observability import MetricsTracker, ConsoleCallback, EarlyStoppingCallback

# Create observers
tracker = MetricsTracker(output_dir='./metrics')
console = ConsoleCallback(print_every=50)
early_stop = EarlyStoppingCallback(patience=3)

# Use with TrainingService (automatic)
from gptmed.services import TrainingService
service = TrainingService(config_path='config.yaml')
service.train()  # Automatically creates MetricsTracker

# Or use with Trainer directly
# (assumes Trainer is imported and model, train_loader, and config are already defined)
trainer = Trainer(model, train_loader, config, observers=[tracker, console])
trainer.train()
```
### Available Observers
| Observer | Description |
| ----------------------- | --------------------------------------------------------- |
| `MetricsTracker` | Comprehensive metrics collection with export capabilities |
| `ConsoleCallback` | Real-time console output with progress bars |
| `JSONLoggerCallback` | Structured JSON logging for analysis |
| `EarlyStoppingCallback` | Stop training when validation loss plateaus |
| `LRSchedulerCallback` | Learning rate scheduling integration |
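
As an illustration of what "stop training when validation loss plateaus" typically means, here is a minimal patience-based early-stopping check. It is a generic sketch with hypothetical names (`PatienceStopper`, `min_delta`), not the code behind `EarlyStoppingCallback`.

```python
class PatienceStopper:
    """Signals a stop after `patience` consecutive evaluations without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_evals = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss  # new best: reset the counter
            self.bad_evals = 0
        else:
            self.bad_evals += 1  # no improvement this evaluation
        return self.bad_evals >= self.patience


stopper = PatienceStopper(patience=3)
val_losses = [1.20, 1.05, 1.06, 1.07, 1.05, 0.90]  # made-up values
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    if stopper.should_stop(val_loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at)
```

Note that the last value in the list (0.90) is never seen: a small `patience` saves compute but can stop before a late improvement, which is the trade-off to weigh when choosing it.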
See [XAI.md](XAI.md) for the roadmap of planned Explainable AI features.
## Project Structure
```
gptmed/
├── model/
│ ├── architecture/ # GPT transformer implementation
│ └── configs/ # Model configurations
├── inference/
│ ├── generator.py # Text generation
│ └── sampling.py # Sampling strategies
├── training/
│ ├── train.py # Training script
│ ├── trainer.py # Training loop
│ └── dataset.py # Data loading
├── observability/ # Training monitoring & XAI (v0.4.0+)
│ ├── base.py # Observer pattern interfaces
│ ├── metrics_tracker.py # Loss curves & metrics
│ └── callbacks.py # Console, JSON, early stopping
├── tokenizer/
│ └── train_tokenizer.py # SentencePiece tokenizer
├── configs/
│ └── train_config.py # Training configurations
├── services/
│ └── training_service.py # High-level training orchestration
└── utils/
├── checkpoints.py # Model checkpointing
└── logging.py # Training logging
```
## Requirements
- Python >= 3.8
- PyTorch >= 2.0.0
- sentencepiece >= 0.1.99
- numpy >= 1.24.0
- tqdm >= 4.65.0
## Documentation
📚 **[Complete User Manual](USER_MANUAL.md)** - Step-by-step guide for training your own model
### Quick Links
- [User Manual](USER_MANUAL.md) - **Start here!** Complete training pipeline guide
- [Architecture Guide](ARCHITECTURE_EXTENSION_GUIDE.md) - Understanding the model architecture
- [XAI Roadmap](XAI.md) - Explainable AI features & implementation guide
- [Deployment Guide](DEPLOYMENT_GUIDE.md) - Publishing to PyPI
- [Changelog](CHANGELOG.md) - Version history
## Performance
| Model Size | Parameters | Training Time | Inference Speed |
| ---------- | ---------- | ------------- | --------------- |
| Tiny | ~2M | 2 hours | ~100 tokens/sec |
| Small | ~10M | 8 hours | ~80 tokens/sec |
| Medium | ~50M | 24 hours | ~50 tokens/sec |
_Tested on GTX 1080 8GB_
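
As a rough worked example, the throughput figures above translate into the following generation times for a 150-token answer (the `max_length` used in the Quick Start). These are simple divisions of the table's approximate values and ignore model loading and tokenization overhead.

```python
# Approximate tokens/sec, copied from the performance table above.
tokens_per_sec = {"Tiny": 100, "Small": 80, "Medium": 50}
answer_length = 150  # tokens, matching max_length in the Quick Start example

seconds = {name: answer_length / rate for name, rate in tokens_per_sec.items()}
for name, secs in seconds.items():
    print(f"{name}: ~{secs:.1f} s for {answer_length} tokens")
```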
## Examples
### Domain-Agnostic Usage
GptMed works with **any domain** - just train on your own Q&A data:
```python
# Technical Support Bot
question = "How do I reset my WiFi router?"
answer = generator.generate(question, temperature=0.7)
# Educational Assistant
question = "Explain the water cycle in simple terms"
answer = generator.generate(question, temperature=0.6)
# Customer Service
question = "What is your return policy?"
answer = generator.generate(question, temperature=0.5)
# Medical Q&A (example domain)
question = "What are the symptoms of flu?"
answer = generator.generate(question, temperature=0.7)
```
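
The examples above use different `temperature` values. The sketch below shows the usual meaning of that parameter: logits are divided by the temperature before the softmax, so lower values concentrate probability on the top-scoring token. It is a generic illustration with made-up logits, and GptMed's sampling code may apply other strategies on top.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.5, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures therefore suit tasks where consistent answers matter, while higher ones give more varied wording.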
### Training Observability (v0.4.0+)
Monitor your training with built-in observability:
```python
import gptmed
from gptmed.observability import MetricsTracker, ConsoleCallback

# Create observers
tracker = MetricsTracker(output_dir='./metrics')
console = ConsoleCallback(print_every=10)

# Train with observability
gptmed.train_from_config(
    'my_config.yaml',
    observers=[tracker, console]
)

# After training - get the report
report = tracker.get_report()
print(f"Final Loss: {report['final_loss']:.4f}")
print(f"Total Steps: {report['total_steps']}")

# Export metrics
tracker.export_to_csv('training_metrics.csv')
tracker.plot_loss_curves('loss_curves.png')  # Requires matplotlib
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- MedQuAD dataset creators
- PyTorch team
## Support
- 📫 **[User Manual](USER_MANUAL.md)** - Complete step-by-step training guide
- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)
- 💬 Discussions: [GitHub Discussions](https://github.com/sigdelsanjog/gptmed/discussions)
- 📧 Email: sigdelsanjog@gmail.com | sanjog.sigdel@ku.edu.np
## Changelog
[Full Changelog](https://github.com/sigdelsanjog/gptmed/blob/main/CHANGELOG.md)
---
#### Made with ❤️ from Nepal