{"id":21190209,"url":"https://github.com/roryclear/transformer","last_synced_at":"2025-03-14T20:44:27.352Z","repository":{"id":232450429,"uuid":"754080464","full_name":"roryclear/transformer","owner":"roryclear","description":null,"archived":false,"fork":false,"pushed_at":"2024-04-12T19:30:53.000Z","size":401963,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-04-14T04:23:43.111Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roryclear.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-02-07T11:10:19.000Z","updated_at":"2024-04-15T06:49:59.107Z","dependencies_parsed_at":"2024-04-15T06:49:56.815Z","dependency_job_id":null,"html_url":"https://github.com/roryclear/transformer","commit_stats":null,"previous_names":["roryclear/transformer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roryclear%2Ftransformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roryclear%2Ftransformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roryclear%2Ftransformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roryclear%2Ftransformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roryclear","download_url":"https://codeload.github.com/roryclear/transformer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"http
s://github.com","kind":"github","repositories_count":243646540,"owners_count":20324583,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-20T18:59:47.367Z","updated_at":"2025-03-14T20:44:27.328Z","avatar_url":"https://github.com/roryclear.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Transformer\n## Running:\n### Online Notebook (CUDA)\nhttps://www.kaggle.com/code/roryclear/cuda-gpt2-demo\n\n### Locally\ngit clone --depth 1 https://github.com/roryclear/transformer.git\n\npip install -r requirements.txt\n\npip install pycuda (CUDA only)\n\npip install pyobjc-framework-Metal (Apple Silicon only)\n\npython demo.py --p=\"your prompt\" \n\n## CUDA, OpenCL and Metal GPT-2 inference\n- [ ] Fastest Inference\n\n- [X] Support for all GPT-2 Models\n\n- [X] Multiple Compute Languages\n\n- [X] Remove all Numpy usage (For calculations)\n\n- [ ] Support any transformer\n\n## Performance:\n\n### CUDA\n|T4 x2 (Kaggle Cloud)         | GPT2          |GPT2-Medium    |GPT2-Large |\n| -----------                 | -----------   |------         |----       |\n| tinygrad                    |74 t/s         |39 t/s         |24 t/s     |\n| huggingface/transformers    |30 t/s         |12 t/s         |6.1 t/s    |  \n|**roryclear/transformer**    |**75 t/s**     |**31 t/s**     |**15 t/s** |\n\n|P100 (Kaggle Cloud)          | GPT2          |GPT2-Medium    |GPT2-Large |\n| -----------                 | -----------   |------         |----       |\n| tinygrad                    |57 t/s         |31 t/s         |20 t/s     |\n| 
huggingface/transformers    |31 t/s         |12 t/s         |6.0 t/s    |  \n|**roryclear/transformer**    |**59 t/s**     |**21 t/s**     |**10 t/s** |\n\n### Metal\n|Apple M2                   | GPT2          |GPT2-Medium    |GPT2-Large |\n| -----------               | -----------   |------         |----       |\n| tinygrad                  |30 t/s         |22 t/s         |15 t/s     |\n| huggingface/transformers  |53 t/s         |17 t/s         |8 t/s      |  \n| **roryclear/transformer** |**33 t/s**     |**16 t/s**     |**10 t/s**  |\n\n### OpenCL\n|Intel Integrated Graphics (2020 XPS13)         | GPT2          |GPT2-Medium    |GPT2-Large |\n| -----------                                   | -----------   |------         |----       |\n| tinygrad                                      |16 t/s         |5.8 t/s        |2.1 t/s    |\n| huggingface/transformers                      |34 t/s         |15 t/s         |7.7 t/s    |  \n|**roryclear/transformer**                      |**11 t/s**     |**5.1 t/s**    |**3.1 t/s**|\n\n*Measured by generating 100 tokens from a 13-token prompt. I don't own any Nvidia hardware, so the CUDA speeds could not be measured properly.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froryclear%2Ftransformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froryclear%2Ftransformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froryclear%2Ftransformer/lists"}