{"id":32481003,"url":"https://github.com/iamfaham/model-inference-profiler","last_synced_at":"2026-05-06T10:34:33.340Z","repository":{"id":319780892,"uuid":"1079578049","full_name":"iamfaham/model-inference-profiler","owner":"iamfaham","description":"A PyTorch-based tool for profiling deep learning model inference performance, analyzing computational bottlenecks, and visualizing resource utilization.","archived":false,"fork":false,"pushed_at":"2025-10-20T04:02:09.000Z","size":181,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-20T07:33:08.971Z","etag":null,"topics":["cuda","memory","pytorch","visualizations"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iamfaham.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-20T03:45:50.000Z","updated_at":"2025-10-20T04:03:33.000Z","dependencies_parsed_at":"2025-10-20T07:33:12.376Z","dependency_job_id":null,"html_url":"https://github.com/iamfaham/model-inference-profiler","commit_stats":null,"previous_names":["iamfaham/model-inference-profiler"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/iamfaham/model-inference-profiler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamfaham%2Fmodel-inference-profiler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamfaham%2Fmodel-inference-profiler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamfaham%2Fmodel-inference-profiler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamfaham%2Fmodel-inference-profiler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iamfaham","download_url":"https://codeload.github.com/iamfaham/model-inference-profiler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamfaham%2Fmodel-inference-profiler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32689207,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-06T08:33:17.875Z","status":"ssl_error","status_checked_at":"2026-05-06T08:33:17.221Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","memory","pytorch","visualizations"],"created_at":"2025-10-27T02:19:46.685Z","updated_at":"2026-05-06T10:34:33.335Z","avatar_url":"https://github.com/iamfaham.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Model Inference Profiler\n\nA PyTorch-based tool for profiling deep learning model inference performance, analyzing computational bottlenecks, and visualizing resource utilization.\n\n## Overview\n\nThis project provides a comprehensive profiling solution for PyTorch models, enabling you to:\n- Analyze inference performance on GPU (CUDA)\n- Identify computational bottlenecks\n- Visualize layer-wise execution time\n- Monitor memory usage patterns\n- Profile any pretrained torchvision model\n\n## Features\n\n- **Performance Profiling**: Track CPU and CUDA execution times for each operation\n- **Memory Analysis**: Monitor memory allocation and usage across layers\n- **Visual Analytics**: Generate bar charts for top time-consuming and memory-intensive operations\n- **Model Summary**: Display detailed architecture information with parameter counts\n- **Easy Integration**: Works with any PyTorch model from torchvision or custom models\n\n## Requirements\n\n```bash\ntorch\ntorchvision\ntorchinfo\nmatplotlib\n```\n\n## Installation\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/iamfaham/model-inference-profiler.git\ncd model-inference-profiler\n```\n\n2. Install dependencies:\n```bash\npip install torch torchvision torchinfo matplotlib\n```\n\n## Usage\n\n### Running in Google Colab\n\nClick the \"Open in Colab\" badge at the top to run the notebook directly in Google Colab with free GPU access.\n\n### Local Execution\n\n1. Open the Jupyter notebook:\n```bash\njupyter notebook model_inference_profiler.ipynb\n```\n\n2. Ensure GPU is available:\n```python\nimport torch\nprint(torch.cuda.is_available())  # Should return True\n```\n\n3. The notebook will guide you through:\n   - Loading a pretrained model (default: ViT-B/16)\n   - Running warm-up iterations\n   - Profiling inference\n   - Visualizing results\n\n### Customizing the Model\n\nTo profile a different model, simply change the model loading line:\n\n```python\n# Vision Transformer (default)\nmodel = models.vit_b_16(weights=models.ViT_B_16_Weights.DEFAULT)\n\n# Or try other models:\n# model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)\n# model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)\n# model = models.mobilenet_v3_small(weights=models.MobileNet_V3_Small_Weights.DEFAULT)\n```\n\n## Output Examples\n\nThe profiler generates:\n\n1. **Model Summary**: Detailed architecture breakdown with parameter counts and memory estimates\n2. **Performance Table**: Top operations sorted by CUDA execution time\n3. **CUDA Time Visualization**: Bar chart showing the most time-consuming layers\n4. **Memory Usage Visualization**: Bar chart displaying memory allocation per layer\n\n## How It Works\n\n1. **Model Loading**: Loads a pretrained model and moves it to GPU\n2. **Warm-up**: Runs multiple inference passes to stabilize GPU performance\n3. **Profiling**: Uses PyTorch's built-in profiler to capture:\n   - CPU and CUDA activities\n   - Operation shapes\n   - Memory allocations\n4. **Analysis**: Extracts and visualizes performance metrics\n\n## Use Cases\n\n- **Model Optimization**: Identify bottlenecks before deployment\n- **Hardware Selection**: Understand resource requirements\n- **Comparative Analysis**: Compare different architectures\n- **Educational**: Learn about model internals and performance characteristics\n\n## Example Output\n\nFor ViT-B/16, you'll see:\n- Total parameters: ~86.5M\n- Top operations: Matrix multiplications (addmm, sgemm)\n- Memory-intensive layers: Attention mechanisms and linear layers\n\n## Contributing\n\nContributions are welcome! Feel free to:\n- Add support for more profiling metrics\n- Implement additional visualization options\n- Extend to other frameworks\n- Improve documentation\n\n## License\n\nThis project is open source and available under the MIT License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiamfaham%2Fmodel-inference-profiler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiamfaham%2Fmodel-inference-profiler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiamfaham%2Fmodel-inference-profiler/lists"}