{"id":48126734,"url":"https://github.com/rybkr/modelregistry","last_synced_at":"2026-04-04T16:27:29.996Z","repository":{"id":320961561,"uuid":"1083906246","full_name":"rybkr/modelregistry","owner":"rybkr","description":null,"archived":false,"fork":false,"pushed_at":"2025-12-13T02:20:42.000Z","size":1109,"stargazers_count":1,"open_issues_count":10,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-14T16:51:20.720Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rybkr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-26T23:46:16.000Z","updated_at":"2025-12-13T02:20:45.000Z","dependencies_parsed_at":"2025-10-27T01:27:06.989Z","dependency_job_id":null,"html_url":"https://github.com/rybkr/modelregistry","commit_stats":null,"previous_names":["rybkr/modelregistry"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rybkr/modelregistry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rybkr%2Fmodelregistry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rybkr%2Fmodelregistry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rybkr%2Fmodelregistry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rybkr%2Fmodelregistry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/rybkr","download_url":"https://codeload.github.com/rybkr/modelregistry/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rybkr%2Fmodelregistry/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31405701,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T10:20:44.708Z","status":"ssl_error","status_checked_at":"2026-04-04T10:20:06.846Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-04T16:27:29.391Z","updated_at":"2026-04-04T16:27:29.988Z","avatar_url":"https://github.com/rybkr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Model Registry\n\nA trustworthy model registry for machine learning models with REST API, web interface, and cloud deployment on AWS. 
The Model Registry provides comprehensive quality metrics evaluation, package management, and a user-friendly web interface for managing ML models, datasets, and code repositories.\n\n## Purpose\n\nThe Model Registry is designed to help organizations:\n- **Evaluate ML Models**: Automatically compute quality metrics including license compatibility, code quality, performance claims, and more\n- **Manage Packages**: Upload, search, and manage ML models, datasets, and code repositories\n- **Ingest from HuggingFace**: Automatically ingest and evaluate models from HuggingFace with quality scoring\n- **Track Quality Metrics**: Monitor model quality through comprehensive metrics including Net Score, Ramp-Up Time, Bus Factor, and more\n- **Web Interface**: User-friendly web UI for browsing, uploading, and managing packages\n- **Health Monitoring**: System health dashboard with activity tracking and logging\n\n## Features\n\n### ✅ Implemented Features\n- **REST API**: Full CRUD operations for packages\n- **Web Interface**: Modern, accessible web UI with WCAG 2.1 AA compliance\n- **Package Management**: Upload, list, search, and delete packages\n- **HuggingFace Integration**: Ingest models directly from HuggingFace\n- **Quality Metrics**: Comprehensive metric evaluation system\n- **Health Dashboard**: System monitoring and activity tracking\n- **Authentication**: User authentication and authorization system\n- **Search \u0026 Filtering**: Advanced search with regex support, sorting, and pagination\n- **CI/CD Pipeline**: Automated testing and deployment via GitHub Actions\n- **AWS Deployment**: Elastic Beanstalk deployment with automated CI/CD\n- **End-to-End Testing**: Comprehensive Selenium-based GUI tests\n\n### Quality Metrics\nThe registry evaluates models using the following metrics:\n- **Net Score**: Overall quality score (weighted combination of all metrics)\n- **License**: License clarity and LGPLv2.1 compatibility\n- **Ramp-Up Time**: How quickly new users can start 
using the repository\n- **Bus Factor**: Risk of abandonment based on contributors and activity\n- **Dataset \u0026 Code Availability**: Presence of datasets and code for reproducibility\n- **Dataset Quality**: Trustworthiness and maintenance of datasets\n- **Code Quality**: Code style, type checking, and popularity signals\n- **Performance Claims**: Evidence of performance benchmarks and results\n- **Size**: Model size and device compatibility\n- **Reviewedness**: Code review coverage and quality\n\n## Quick Start\n\n### Prerequisites\n- Python 3.9 or higher\n- pip package manager\n- (Optional) AWS CLI for cloud deployment\n- (Optional) Chrome/ChromeDriver for end-to-end tests\n\n### Installation\n\n1. **Clone the repository**\n   ```bash\n   git clone \u003crepository-url\u003e\n   cd modelregistry\n   ```\n\n2. **Install dependencies**\n   ```bash\n   pip install -e \".[dev]\"\n   ```\n   Or using requirements.txt:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n3. **Set up pre-commit hooks (optional)**\n   ```bash\n   pre-commit install\n   ```\n\n## Configuration\n\n### Environment Variables\n\nThe Model Registry can be configured using the following environment variables:\n\n#### Required for Full Functionality\n- **`PORT`** (default: `8000`): Port number for the Flask server\n- **`GH_API_TOKEN`** or **`GITHUB_TOKEN`**: GitHub API token for accessing GitHub repositories (required for code quality and reviewedness metrics)\n- **`PURDUE_GENAI_API_KEY`**: Purdue GenAI API key for LLM-based metric evaluation (required for license, dataset/code, and performance claims metrics)\n\n#### Optional Configuration\n- **`USER_STORAGE_BUCKET`**: AWS S3 bucket name for storing package files (optional, uses in-memory storage if not set)\n- **`DEFAULT_ADMIN_PASSWORD_HASH`**: Bcrypt hash for default admin user password (auto-generated if not set)\n- **`LOG_FILE`**: Path to log file (logging disabled if not set)\n- **`LOG_LEVEL`**: Logging level (`0` = silent, `1` = 
info, `2` = debug, default: `0`)\n\n#### AWS Deployment\n- **`AWS_ACCESS_KEY_ID`**: AWS access key for deployment\n- **`AWS_SECRET_ACCESS_KEY`**: AWS secret key for deployment\n- **`AWS_REGION`**: AWS region for deployment (e.g., `us-east-1`)\n\n### Configuration File\n\nCreate a `.env` file in the project root (optional, for local development):\n```bash\nPORT=8000\nGITHUB_TOKEN=your_github_token_here\nPURDUE_GENAI_API_KEY=your_purdue_genai_key_here\nLOG_LEVEL=1\nLOG_FILE=model_registry.log\n```\n\n**Note**: The `.env` file is gitignored and should not be committed to version control.\n\n## Running the Application\n\n### Local Development\n\n1. **Start the API server**\n   ```bash\n   python src/api_server.py\n   ```\n   Or using the application entry point:\n   ```bash\n   python src/application.py\n   ```\n\n2. **Access the web interface**\n   - Web UI: http://localhost:8000\n   - API: http://localhost:8000/api/health\n\n3. **Run tests**\n   ```bash\n   # Run all tests\n   pytest\n   \n   # Run with coverage\n   pytest --cov=src --cov-report=html\n   \n   # Run end-to-end tests (requires Chrome)\n   pytest test/e2e/ -v -m e2e\n   \n   # Run specific test file\n   pytest test/test_api_crud.py -v\n   ```\n\n### Using the Legacy CLI Tool\n\nThe Phase 1 command-line tool is still available:\n```bash\n./run install    # Install dependencies\n./run test       # Run tests  \n./run urls.txt   # Evaluate models from URLs\n```\n\n## Deployment\n\n### AWS Elastic Beanstalk Deployment\n\nThe Model Registry can be deployed to AWS Elastic Beanstalk with automated CI/CD.\n\n#### Quick Deploy (5 minutes)\n\n1. **Configure AWS CLI**\n   ```bash\n   aws configure\n   ```\n\n2. **Run setup script**\n   ```bash\n   ./scripts/aws_setup.sh\n   ```\n\n3. **Add secrets to GitHub**\n   - Go to: Settings → Secrets → Actions\n   - Add: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`\n\n4. 
**Push to deploy**\n   ```bash\n   git push origin main\n   ```\n\n**Cost**: $0/month while usage stays within AWS Free Tier limits ✅\n\n#### Detailed Deployment Guides\n\n- **[QUICK_START_AWS.md](./QUICK_START_AWS.md)** - 5-minute quick start\n- **[AWS_DEPLOYMENT_GUIDE.md](./AWS_DEPLOYMENT_GUIDE.md)** - Detailed guide\n- **[AWS_SETUP_CHECKLIST.md](./AWS_SETUP_CHECKLIST.md)** - Step-by-step checklist\n\n### Manual Deployment\n\n1. **Set environment variables** on your deployment platform\n2. **Install dependencies**: `pip install -r requirements.txt`\n3. **Run the application**: `python src/application.py` or use a WSGI server like gunicorn:\n   ```bash\n   gunicorn -w 4 -b 0.0.0.0:8000 application:application\n   ```\n\n## Interacting with the Model Registry\n\n### Web Interface\n\nThe Model Registry provides a user-friendly web interface accessible at `http://localhost:8000` (or your deployment URL).\n\n#### Available Pages\n\n1. **Packages Page** (`/`)\n   - Browse all packages\n   - Search and filter packages\n   - Sort by name, version, size, or date\n   - Pagination support\n   - Regex search option\n\n2. **Upload Page** (`/upload`)\n   - Upload new packages manually\n   - Enter package name, version, URL, content, and metadata\n   - Form validation and error handling\n\n3. **Ingest Page** (`/ingest`)\n   - Ingest models directly from HuggingFace\n   - Automatic quality metric evaluation\n   - Progress tracking during evaluation\n\n4. **Package Detail Page** (`/packages/\u003cid\u003e`)\n   - View detailed package information\n   - See quality metrics and scores\n   - Rate packages\n   - Delete packages\n   - View metadata\n\n5. 
**Health Dashboard** (`/health`)\n   - System health status\n   - Package count and statistics\n   - Activity timeline\n   - System logs\n   - Reset registry option\n\n#### Web Interface Features\n\n- **WCAG 2.1 AA Compliant**: Fully accessible interface\n- **Responsive Design**: Works on desktop and mobile devices\n- **Real-time Updates**: Dynamic content loading with live regions\n- **Error Handling**: User-friendly error messages and validation\n- **Keyboard Navigation**: Full keyboard accessibility support\n\n### REST API\n\nThe Model Registry provides a comprehensive REST API for programmatic access.\n\n#### Base URL\n- Local: `http://localhost:8000/api`\n- Production: `https://your-deployment-url/api`\n\n#### Authentication\n\nMost endpoints require authentication. Authenticate first:\n```bash\nPUT /api/authenticate\nContent-Type: application/json\n\n{\n  \"user\": {\"name\": \"username\"},\n  \"secret\": {\"password\": \"password\"}\n}\n```\n\nResponse includes an authentication token to use in subsequent requests:\n```bash\nX-Authorization: \u003ctoken\u003e\n```\n\n#### API Endpoints\n\n##### Health Check\n```bash\nGET /api/health\n```\nReturns system health status.\n\n##### Package Management\n\n**List Packages**\n```bash\nGET /api/packages?offset=0\u0026limit=25\u0026name=search_term\u0026version=1.0.0\n```\nQuery parameters:\n- `offset`: Pagination offset (default: 0)\n- `limit`: Results per page (default: 25)\n- `name`: Search by name (supports regex if `use_regex=true`)\n- `version`: Filter by version\n- `use_regex`: Enable regex search (default: false)\n- `sort_field`: Sort by `alpha`, `version`, `size`, or `date`\n- `sort_order`: `ascending` or `descending`\n\n**Get Package**\n```bash\nGET /api/packages/\u003cpackage_id\u003e\n```\nReturns detailed package information including metrics.\n\n**Upload Package**\n```bash\nPOST /api/packages\nContent-Type: application/json\nX-Authorization: \u003ctoken\u003e\n\n{\n  \"name\": \"My Model\",\n  
\"version\": \"1.0.0\",\n  \"metadata\": {\n    \"url\": \"https://huggingface.co/org/model\",\n    \"description\": \"Model description\"\n  }\n}\n```\n\n**Delete Package**\n```bash\nDELETE /api/packages/\u003cpackage_id\u003e\nX-Authorization: \u003ctoken\u003e\n```\n\n**Rate Package**\n```bash\nPOST /api/packages/\u003cpackage_id\u003e/rate\nContent-Type: application/json\nX-Authorization: \u003ctoken\u003e\n\n{\n  \"rating\": 4.5\n}\n```\n\n##### Model Ingestion\n\n**Ingest from HuggingFace**\n```bash\nPOST /api/ingest\nContent-Type: application/json\nX-Authorization: \u003ctoken\u003e\n\n{\n  \"url\": \"https://huggingface.co/org/model-name\"\n}\n```\n\nThe system will:\n1. Fetch model metadata from HuggingFace\n2. Evaluate all quality metrics\n3. Only ingest if all non-latency metrics score ≥ 0.5\n4. Return package ID and metrics\n\n##### System Management\n\n**Reset Registry**\n```bash\nDELETE /api/reset\nX-Authorization: \u003ctoken\u003e\n```\n⚠️ **Warning**: This deletes all packages and resets to default state.\n\n**Get Health Dashboard Data**\n```bash\nGET /api/health/dashboard\nX-Authorization: \u003ctoken\u003e\n```\n\n**Get Activity Logs**\n```bash\nGET /api/health/activity\nX-Authorization: \u003ctoken\u003e\n```\n\n**Get System Logs**\n```bash\nGET /api/health/logs\nX-Authorization: \u003ctoken\u003e\n```\n\n#### Example API Usage\n\n**Python Example**\n```python\nimport requests\n\nBASE_URL = \"http://localhost:8000/api\"\n\n# Authenticate\nauth_response = requests.put(\n    f\"{BASE_URL}/authenticate\",\n    json={\"user\": {\"name\": \"admin\"}, \"secret\": {\"password\": \"password\"}}\n)\ntoken = auth_response.json()\n\nheaders = {\"X-Authorization\": token}\n\n# List packages\nresponse = requests.get(f\"{BASE_URL}/packages\", headers=headers)\npackages = response.json()\n\n# Upload a package\npackage_data = {\n    \"name\": \"My Model\",\n    \"version\": \"1.0.0\",\n    \"metadata\": {\"url\": 
\"https://huggingface.co/org/model\"}\n}\nresponse = requests.post(\n    f\"{BASE_URL}/packages\",\n    json=package_data,\n    headers=headers\n)\npackage = response.json()\n```\n\n**cURL Example**\n```bash\n# Authenticate\nTOKEN=$(curl -X PUT http://localhost:8000/api/authenticate \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"user\":{\"name\":\"admin\"},\"secret\":{\"password\":\"password\"}}')\n\n# List packages\ncurl -H \"X-Authorization: $TOKEN\" \\\n  http://localhost:8000/api/packages\n\n# Upload package\ncurl -X POST http://localhost:8000/api/packages \\\n  -H \"X-Authorization: $TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"name\": \"My Model\",\n    \"version\": \"1.0.0\",\n    \"metadata\": {\"url\": \"https://huggingface.co/org/model\"}\n  }'\n```\n\n## Development\n\n### Project Structure\n\n```\nmodelregistry/\n├── src/                    # Source code\n│   ├── api_server.py      # Flask API server\n│   ├── application.py     # WSGI entry point\n│   ├── storage.py         # Storage layer\n│   ├── auth.py            # Authentication\n│   ├── metrics/           # Quality metrics\n│   ├── resources/         # Resource adapters\n│   ├── templates/         # HTML templates\n│   └── static/            # CSS, JS, assets\n├── test/                  # Test suite\n│   ├── unit/              # Unit tests\n│   ├── integration/       # Integration tests\n│   └── e2e/               # End-to-end Selenium tests\n├── scripts/               # Utility scripts\n├── .github/workflows/     # CI/CD pipelines\n└── docs/                  # Documentation\n```\n\n### Development Standards\n\n#### CI Requirements\n\nOur CI runs on **Python 3.11 and 3.12** and enforces these gates:\n\n1. **Formatting (Black)**: `black --check .`\n2. **Import Sorting (isort)**: `isort --check-only --diff .`\n3. **Linting (Flake8)**: `flake8 .`\n4. **Type Checks (mypy)**: `mypy`\n5. 
**Tests + Coverage**: ≥ 70% coverage required\n\n#### Running Tests\n\n```bash\n# All tests\npytest\n\n# With coverage report\npytest --cov=src --cov-report=html\n\n# Specific test category\npytest test/unit/          # Unit tests only\npytest test/integration/    # Integration tests only\npytest test/e2e/ -m e2e    # End-to-end tests only\n\n# Specific test file\npytest test/test_api_crud.py -v\n```\n\n#### Pre-commit Hooks\n\n```bash\npip install -e \".[dev]\"\npre-commit install\npre-commit run --all-files\n```\n\n### Code Quality Metrics\n\nThe registry computes each metric as follows:\n\n| **Metric**                  | **Description** |\n|------------------------------|-----------------|\n| **Size**                     | Computes total model file size and scores compatibility against device memory budgets (Pi, Nano, PC, AWS) using a smoothstep curve |\n| **License**                  | Uses Purdue GenAI LLM to parse README/metadata for license clarity and LGPLv2.1 compatibility |\n| **Ramp-Up Time**             | Estimates how quickly a new user can start using the repository based on README quality, examples, and HuggingFace likes |\n| **Bus Factor**               | Combines number of unique contributors with recency of updates (exponential decay, 1-year half-life) |\n| **Available Dataset \u0026 Code** | Uses GenAI to semantically evaluate README for dataset and code references |\n| **Dataset Quality**          | Evaluates dataset trustworthiness based on metadata, recency, and community validation |\n| **Code Quality**             | Runs flake8 and mypy on linked repos, combines with popularity signals |\n| **Performance Claims**       | Structured scoring over five levels: 0.0 (no claims) → 0.25 → 0.6 → 0.9 → 1.0 (concrete results) |\n| **Reviewedness**             | Evaluates code review coverage and quality from GitHub PRs and reviews |\n| **Net Score**                | Weighted combination: License (0.20), Ramp-Up (0.15), Bus Factor (0.15), Dataset 
\u0026 Code (0.10), Dataset Quality (0.10), Code Quality (0.10), Performance (0.10), Size (0.10) |\n\n## Documentation\n\n- **[E2E Tests README](./test/e2e/README.md)** - End-to-end testing guide\n- **[WCAG Compliance Assessment](./WCAG_COMPLIANCE_ASSESSMENT.md)** - Accessibility compliance details\n- **[OpenAPI Spec](./openapi-spec.yml)** - API specification\n\n## Troubleshooting\n\n### Common Issues\n\n1. **Port already in use**\n   - Change the `PORT` environment variable\n   - Or stop the process using port 8000\n\n2. **GitHub API rate limiting**\n   - Ensure `GH_API_TOKEN` or `GITHUB_TOKEN` is set\n   - Use a personal access token with appropriate permissions\n\n3. **Metrics evaluation fails**\n   - Check that `PURDUE_GENAI_API_KEY` is set (if using LLM-based metrics)\n   - Some metrics work without API keys but with reduced functionality\n\n4. **AWS deployment issues**\n   - Verify AWS credentials are set correctly\n   - Check Elastic Beanstalk logs: `eb logs`\n   - Ensure environment variables are set in EB configuration\n\n## Contributing\n\nSee the project's contribution guidelines for information on:\n- Code style and standards\n- Testing requirements\n- Pull request process\n- Development workflow\n\n## Contributors\n\n* [Aadhavan Srinivasan](https://github.com/aadhavans2027)\n* [Ryan Baker](https://github.com/rybkr)\n* [Luisa Cruz Miotto](https://github.com/lcruzmio)\n* [Nikhil Chaudhary](https://github.com/chaudhary-nikhil)\n\n## License\n\n[Add license information here]\n\n---\n\n**Last Updated**: 2025\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frybkr%2Fmodelregistry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frybkr%2Fmodelregistry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frybkr%2Fmodelregistry/lists"}