{"id":25980027,"url":"https://github.com/pxl-th/nnop.jl","last_synced_at":"2025-03-05T07:25:55.393Z","repository":{"id":280528872,"uuid":"928021667","full_name":"pxl-th/NNop.jl","owner":"pxl-th","description":"Pure Julia NN kernels.","archived":false,"fork":false,"pushed_at":"2025-03-03T22:53:10.000Z","size":6,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-03T23:29:13.028Z","etag":null,"topics":["gpgpu","gpu","julia"],"latest_commit_sha":null,"homepage":"","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pxl-th.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-05T23:34:10.000Z","updated_at":"2025-03-03T22:58:00.000Z","dependencies_parsed_at":"2025-03-03T23:39:26.104Z","dependency_job_id":null,"html_url":"https://github.com/pxl-th/NNop.jl","commit_stats":null,"previous_names":["pxl-th/nnop.jl"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pxl-th%2FNNop.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pxl-th%2FNNop.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pxl-th%2FNNop.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pxl-th%2FNNop.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pxl-th","download_url":"https://codeload.github.com/pxl-th/NNop.jl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https
://github.com","kind":"github","repositories_count":241985068,"owners_count":20053021,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpgpu","gpu","julia"],"created_at":"2025-03-05T07:25:54.728Z","updated_at":"2025-03-05T07:25:55.387Z","avatar_url":"https://github.com/pxl-th.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NNop.jl\n\nPure Julia NN kernels.\n\n\u003e [!WARNING]\n\u003e The package is in the early stages and is not yet fully ready.\n\n## Ops\n\n### Flash Attention\n\nImplementation of [FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness](https://arxiv.org/abs/2205.14135).\n\n```julia\nE = 64\nL = 4096\nH, B = 4, 4\n\nq = ROCArray(rand(Float32, E, L, H, B))\nk = ROCArray(rand(Float32, E, L, H, B))\nv = ROCArray(rand(Float32, E, L, H, B))\n\no = flash_attention(q, k, v)\n```\n\n#### Benchmarks:\n\nFor the problem size `(E=64, L=4096, H=4, B=4)`.\n\n||Naïve attention|Flash Attention|\n|-|-|-|\n|Execution time|55.034 ms|18.490 ms|\n|Peak memory usage|4.044 GiB|16.500 MiB|\n\n#### Features:\n\n- Forward \u0026 backward passes.\n- Arbitrary sequence length.\n- Arbitrary head sizes.\n- FP32, FP16, BFP16 support.\n\nIn progress:\n\n- [ ] Causal masking.\n- [ ] Variable sequence length.\n\n### Fused (online) Softmax\n\nImplementation of [Online normalizer calculation for softmax](https://arxiv.org/abs/1805.02867).\n\n```julia\nx = ROCArray(ones(Float32, 8192, 1024))\ny = online_softmax(x)\n```\n\n||Naïve Softmax|Online Softmax|\n|-|-|-|\n|Execution time|745.123 μs|61.600 μs|\n|Peak memory usage|64.258 
MiB|32.000 MiB|\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpxl-th%2Fnnop.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpxl-th%2Fnnop.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpxl-th%2Fnnop.jl/lists"}