{"id":37228770,"url":"https://github.com/sigdelsanjog/gptmed","last_synced_at":"2026-02-08T11:15:20.023Z","repository":{"id":331823809,"uuid":"1130263398","full_name":"sigdelsanjog/gptmed","owner":"sigdelsanjog","description":"pip install gptmed","archived":false,"fork":false,"pushed_at":"2026-02-05T20:06:29.000Z","size":8785,"stargazers_count":1,"open_issues_count":2,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-06T05:47:48.639Z","etag":null,"topics":["casual-inference","conversation-ai","custom-model","deep-learning","deep-learning-algorithms","gpt","language-model","llm","medical-llm","medical-question-answering-llm","model-training-and-optimization","nlp","pip","pytorch","question-answering-model","redis","tiny-language-model","tiny-llm","transformer-architecture"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/gptmed","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sigdelsanjog.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-08T09:02:12.000Z","updated_at":"2026-02-05T20:06:20.000Z","dependencies_parsed_at":null,"dependency_job_id":"23f6f03f-4990-423c-b72d-d8bc20259eb5","html_url":"https://github.com/sigdelsanjog/gptmed","commit_stats":null,"previous_names":["sigdelsanjog/gptmed"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/sigdelsanjog/gptmed","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/sigdelsanjog%2Fgptmed","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sigdelsanjog%2Fgptmed/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sigdelsanjog%2Fgptmed/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sigdelsanjog%2Fgptmed/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sigdelsanjog","download_url":"https://codeload.github.com/sigdelsanjog/gptmed/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sigdelsanjog%2Fgptmed/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29228808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-08T09:43:19.170Z","status":"ssl_error","status_checked_at":"2026-02-08T09:42:55.556Z","response_time":57,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["casual-inference","conversation-ai","custom-model","deep-learning","deep-learning-algorithms","gpt","language-model","llm","medical-llm","medical-question-answering-llm","model-training-and-optimization","nlp","pip","pytorch","question-answering-model","redis","tiny-language-model","tiny-llm","transformer-architecture"],"created_at":"2026-01-15T03:29:19.391Z","updated_at":"2026-02-08T11:15:19.996Z","avatar_url":"https://github.com/sigdelsanjog.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GptMed 🤖\n\n[![Downloads](https://static.pepy.tech/badge/gptmed)](https://pepy.tech/project/gptmed)\n[![Downloads/Month](https://static.pepy.tech/badge/gptmed/month)](https://pepy.tech/project/gptmed)\n[![PyPI version](https://badge.fury.io/py/gptmed.svg)](https://badge.fury.io/py/gptmed)\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nA lightweight GPT-based language model framework for training custom question-answering models on any domain. 
This package provides a transformer-based GPT architecture that you can train on your own Q\u0026A datasets - whether it's casual conversations, technical support, education, or any other domain.\n\n## Citation\n\nIf you use this model in your research, please cite:\n\n```bibtex\n@software{gptmed_2026,\n  author = {Sanjog Sigdel},\n  title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},\n  year = {2026},\n  url = {https://github.com/sigdelsanjog/gptmed}\n}\n```\n\n## Table of Contents\n\n- [Installation](#installation)\n  - [From PyPI (Recommended)](#from-pypi-recommended)\n  - [From Source](#from-source)\n  - [With Optional Dependencies](#with-optional-dependencies)\n- [Quick Start](#quick-start)\n  - [Using the High-Level API](#using-the-high-level-api)\n  - [Inference (Generate Answers)](#inference-generate-answers)\n  - [Using Command Line](#using-command-line)\n  - [Training Your Own Model](#training-your-own-model)\n- [Model Architecture](#model-architecture)\n- [Configuration](#configuration)\n  - [Model Sizes](#model-sizes)\n  - [Training Configuration](#training-configuration)\n- [Observability](#observability)\n- [Project Structure](#project-structure)\n- [Requirements](#requirements)\n- [Documentation](#documentation)\n- [Performance](#performance)\n- [Examples](#examples)\n- [Contributing](#contributing)\n- [Citation](#citation)\n- [License](#license)\n- [Support](#support)\n\n## Installation\n\n### From PyPI (Recommended)\n\n```bash\npip install gptmed\n```\n\n### From Source\n\n```bash\ngit clone https://github.com/sigdelsanjog/gptmed.git\ncd gptmed\npip install -e .\n```\n\n### With Optional Dependencies\n\n```bash\n# For development\npip install gptmed[dev]\n\n# For training with logging integrations\npip install gptmed[training]\n\n# For visualization (loss curves, metrics plots)\npip install gptmed[visualization]\n\n# For Explainable AI features\npip install gptmed[xai]\n\n# All dependencies\npip 
install gptmed[dev,training,visualization,xai]\n```\n\n## Quick Start\n\n### Using the High-Level API\n\nThe easiest way to use GptMed is through the high-level API:\n\n```python\nimport gptmed\n\n# 1. Create a training configuration\ngptmed.create_config('my_config.yaml')\n\n# 2. Edit my_config.yaml with your settings (data paths, model size, etc.)\n\n# 3. Train the model\ngptmed.train_from_config('my_config.yaml')\n\n# 4. Generate answers\nanswer = gptmed.generate(\n    checkpoint='model/checkpoints/best_model.pt',\n    tokenizer='tokenizer/my_tokenizer.model',\n    prompt='What is machine learning?',\n    max_length=150,\n    temperature=0.7\n)\nprint(answer)\n```\n\nFor a complete API testing workflow, see the [gptmed-api folder](https://github.com/sigdelsanjog/gptmed/tree/main/gptmed-api) with ready-to-run examples.\n\n### Inference (Generate Answers)\n\n```python\nfrom gptmed.inference.generator import TextGenerator\nfrom gptmed.model.architecture import GPTTransformer\nfrom gptmed.model.configs.model_config import get_small_config\n\n# Load model\nconfig = get_small_config()\nmodel = GPTTransformer(config)\n\n# Load your trained checkpoint\n# model.load_state_dict(torch.load('path/to/checkpoint.pt'))\n\n# Create generator\ngenerator = TextGenerator(\n    model=model,\n    tokenizer_path='path/to/tokenizer.model'\n)\n\n# Generate answer\nquestion = \"What's your favorite programming language?\"\nanswer = generator.generate(\n    prompt=question,\n    max_length=100,\n    temperature=0.7\n)\n\nprint(f\"Q: {question}\")\nprint(f\"A: {answer}\")\n```\n\n### Using Command Line\n\n```bash\n# Generate answers\ngptmed-generate --prompt \"How do I train a custom model?\" --max-length 100\n\n# Train model\ngptmed-train --model-size small --num-epochs 10 --batch-size 16\n```\n\n### Training Your Own Model\n\n```python\nfrom gptmed.training.train import main\nfrom gptmed.configs.train_config import get_default_config\nfrom gptmed.model.configs.model_config import 
get_small_config\n\n# Configure training\ntrain_config = get_default_config()\ntrain_config.batch_size = 16\ntrain_config.num_epochs = 10\ntrain_config.learning_rate = 3e-4\n\n# Start training\nmain()\n```\n\n## Model Architecture\n\nThe model uses a custom GPT-based transformer architecture:\n\n- **Embedding**: Token + positional embeddings\n- **Transformer Blocks**: Multi-head self-attention + feed-forward networks\n- **Parameters**: ~10M (small), ~50M (medium)\n- **Context Length**: 512 tokens\n- **Vocabulary**: Custom SentencePiece tokenizer trained on your data\n\n## Configuration\n\n### Model Sizes\n\n```python\nfrom gptmed.model.configs.model_config import (\n    get_tiny_config,   # ~2M parameters - for testing\n    get_small_config,  # ~10M parameters - recommended\n    get_medium_config  # ~50M parameters - higher quality\n)\n```\n\n### Training Configuration\n\n```python\nfrom gptmed.configs.train_config import TrainingConfig\n\nconfig = TrainingConfig(\n    batch_size=16,\n    learning_rate=3e-4,\n    num_epochs=10,\n    warmup_steps=100,\n    grad_clip=1.0\n)\n```\n\n## Observability\n\n**New in v0.4.0**: Built-in training monitoring with Observer Pattern architecture.\n\n### Features\n\n- 📊 **Loss Curves**: Track training/validation loss over time\n- 📈 **Metrics Tracking**: Perplexity, gradient norms, learning rates\n- 🔔 **Callbacks**: Console output, JSON logging, early stopping\n- 📁 **Export**: CSV export, matplotlib visualizations\n- 🔌 **Extensible**: Add custom observers for integrations (W\u0026B, TensorBoard)\n\n### Quick Example\n\n```python\nfrom gptmed.observability import MetricsTracker, ConsoleCallback, EarlyStoppingCallback\n\n# Create observers\ntracker = MetricsTracker(output_dir='./metrics')\nconsole = ConsoleCallback(print_every=50)\nearly_stop = EarlyStoppingCallback(patience=3)\n\n# Use with TrainingService (automatic)\nfrom gptmed.services import TrainingService\nservice = TrainingService(config_path='config.yaml')\nservice.train()  
# Automatically creates MetricsTracker\n\n# Or use with Trainer directly\ntrainer = Trainer(model, train_loader, config, observers=[tracker, console])\ntrainer.train()\n```\n\n### Available Observers\n\n| Observer                | Description                                               |\n| ----------------------- | --------------------------------------------------------- |\n| `MetricsTracker`        | Comprehensive metrics collection with export capabilities |\n| `ConsoleCallback`       | Real-time console output with progress bars               |\n| `JSONLoggerCallback`    | Structured JSON logging for analysis                      |\n| `EarlyStoppingCallback` | Stop training when validation loss plateaus               |\n| `LRSchedulerCallback`   | Learning rate scheduling integration                      |\n\nSee [XAI.md](XAI.md) for future Explainable AI features roadmap.\n\n## Project Structure\n\n```\ngptmed/\n├── model/\n│   ├── architecture/      # GPT transformer implementation\n│   └── configs/           # Model configurations\n├── inference/\n│   ├── generator.py       # Text generation\n│   └── sampling.py        # Sampling strategies\n├── training/\n│   ├── train.py          # Training script\n│   ├── trainer.py        # Training loop\n│   └── dataset.py        # Data loading\n├── observability/         # Training monitoring \u0026 XAI (v0.4.0+)\n│   ├── base.py           # Observer pattern interfaces\n│   ├── metrics_tracker.py # Loss curves \u0026 metrics\n│   └── callbacks.py      # Console, JSON, early stopping\n├── tokenizer/\n│   └── train_tokenizer.py # SentencePiece tokenizer\n├── configs/\n│   └── train_config.py   # Training configurations\n├── services/\n│   └── training_service.py # High-level training orchestration\n└── utils/\n    ├── checkpoints.py    # Model checkpointing\n    └── logging.py        # Training logging\n```\n\n## Requirements\n\n- Python \u003e= 3.8\n- PyTorch \u003e= 2.0.0\n- sentencepiece \u003e= 0.1.99\n- numpy 
\u003e= 1.24.0\n- tqdm \u003e= 4.65.0\n\n## Documentation\n\n📚 **[Complete User Manual](USER_MANUAL.md)** - Step-by-step guide for training your own model\n\n### Quick Links\n\n- [User Manual](USER_MANUAL.md) - **Start here!** Complete training pipeline guide\n- [Architecture Guide](ARCHITECTURE_EXTENSION_GUIDE.md) - Understanding the model architecture\n- [XAI Roadmap](XAI.md) - Explainable AI features \u0026 implementation guide\n- [Deployment Guide](DEPLOYMENT_GUIDE.md) - Publishing to PyPI\n- [Changelog](CHANGELOG.md) - Version history\n\n## Performance\n\n| Model Size | Parameters | Training Time | Inference Speed |\n| ---------- | ---------- | ------------- | --------------- |\n| Tiny       | ~2M        | 2 hours       | ~100 tokens/sec |\n| Small      | ~10M       | 8 hours       | ~80 tokens/sec  |\n| Medium     | ~50M       | 24 hours      | ~50 tokens/sec  |\n\n_Tested on GTX 1080 8GB_\n\n## Examples\n\n### Domain-Agnostic Usage\n\nGptMed works with **any domain** - just train on your own Q\u0026A data:\n\n```python\n# Technical Support Bot\nquestion = \"How do I reset my WiFi router?\"\nanswer = generator.generate(question, temperature=0.7)\n\n# Educational Assistant\nquestion = \"Explain the water cycle in simple terms\"\nanswer = generator.generate(question, temperature=0.6)\n\n# Customer Service\nquestion = \"What is your return policy?\"\nanswer = generator.generate(question, temperature=0.5)\n\n# Medical Q\u0026A (example domain)\nquestion = \"What are the symptoms of flu?\"\nanswer = generator.generate(question, temperature=0.7)\n```\n\n### Training Observability (v0.4.0+)\n\nMonitor your training with built-in observability:\n\n```python\nimport gptmed\nfrom gptmed.observability import MetricsTracker, ConsoleCallback\n\n# Create observers\ntracker = MetricsTracker(output_dir='./metrics')\nconsole = ConsoleCallback(print_every=10)\n\n# Train with observability\ngptmed.train_from_config(\n    'my_config.yaml',\n    observers=[tracker, console]\n)\n\n# After 
training - get the report\nreport = tracker.get_report()\nprint(f\"Final Loss: {report['final_loss']:.4f}\")\nprint(f\"Total Steps: {report['total_steps']}\")\n\n# Export metrics\ntracker.export_to_csv('training_metrics.csv')\ntracker.plot_loss_curves('loss_curves.png')  # Requires matplotlib\n```\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n1. Fork the repository\n2. Create your feature branch (`git checkout -b feature/AmazingFeature`)\n3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)\n4. Push to the branch (`git push origin feature/AmazingFeature`)\n5. Open a Pull Request\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n- MedQuAD dataset creators\n- PyTorch team\n\n## Support\n\n- 📫 [User Manual](USER_MANUAL.md) - Complete step-by-step training guide\n- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)\n- 💬 Discussions: [GitHub Discussions](https://github.com/sigdelsanjog/gptmed/discussions)\n- 📧 Email: sigdelsanjog@gmail.com | sanjog.sigdel@ku.edu.np\n\n## Changelog\n\n[Full Changelog](https://github.com/sigdelsanjog/gptmed/blob/main/CHANGELOG.md)\n\n---\n\n#### Made with ❤️ from Nepal\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsigdelsanjog%2Fgptmed","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsigdelsanjog%2Fgptmed","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsigdelsanjog%2Fgptmed/lists"}