{"id":19156228,"url":"https://github.com/kyegomez/shallowff","last_synced_at":"2025-06-28T20:04:01.911Z","repository":{"id":208163310,"uuid":"720957738","full_name":"kyegomez/ShallowFF","owner":"kyegomez","description":"Zeta implemantion of \"Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers\"","archived":false,"fork":false,"pushed_at":"2025-04-19T12:53:19.000Z","size":37970,"stargazers_count":10,"open_issues_count":2,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-19T20:16:56.561Z","etag":null,"topics":["artificial-intelligence","attention","attention-is-all-you-need","attention-mechanism","attention-mechanisms","feedforward","transformer","transformer-encoder","transformer-models","transformers-models"],"latest_commit_sha":null,"homepage":"https://discord.gg/Yx5y5VBahs","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kyegomez.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["kyegomez"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2023-11-20T03:49:13.000Z","updated_at":"2025-01-27T03:00:18.000Z","dependencies_parsed_at":"2023-11-20T04:34:45.595Z","dependency_job_id":"4ee3b319-acf0-477e-a655-fa77bff2c7c5","html_url":"https://github.com/kyegomez/ShallowFF","commit_stats":{"total_commits":12,"total_committers":2,"mean_com
mits":6.0,"dds":0.08333333333333337,"last_synced_commit":"b67c9232daf6a6ff8a578546c33047fd87c5b741"},"previous_names":["kyegomez/shallowff"],"tags_count":0,"template":false,"template_full_name":"kyegomez/Python-Package-Template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyegomez%2FShallowFF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyegomez%2FShallowFF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyegomez%2FShallowFF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyegomez%2FShallowFF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kyegomez","download_url":"https://codeload.github.com/kyegomez/ShallowFF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252834181,"owners_count":21811331,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention","attention-is-all-you-need","attention-mechanism","attention-mechanisms","feedforward","transformer","transformer-encoder","transformer-models","transformers-models"],"created_at":"2024-11-09T08:33:40.532Z","updated_at":"2025-05-07T07:34:59.719Z","avatar_url":"https://github.com/kyegomez.png","language":"Python","funding_links":["https://github.com/sponsors/kyegomez"],"categories":[],"sub_categories":[],"readme":"[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)\n\n# ALR Transformer\nALR Transformer that replaces the original transformer implementation of an joint 
encoder + decoder block with a feedforward/alr block with a decoder block\n\n\n## Install\n`pip install alr-transformer`\n\n\n## Usage\n```python\nimport torch\nfrom alr_transformer import ALRTransformer\n\n# One sequence of 2048 random token ids drawn from a 100000-token vocabulary\nx = torch.randint(0, 100000, (1, 2048))\n\nmodel = ALRTransformer(\n    dim = 512,\n    depth = 6,\n    num_tokens = 100000,\n    dim_head = 64,\n    heads = 8,\n    ff_mult = 4\n)\n\n# Forward pass, then inspect the output and its shape\nout = model(x)\nprint(out)\nprint(out.shape)\n```\n\n## Train\n- Clone the repo, then run the following:\n```\npython3 train.py\n```\n\n\n\n## Citation\n```bibtex\n@misc{bozic2023rethinking,\n    title={Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers},\n    author={Vukasin Bozic and Danilo Dordervic and Daniele Coppola and Joseph Thommes},\n    year={2023},\n    eprint={2311.10642},\n    archivePrefix={arXiv},\n    primaryClass={cs.CL}\n}\n\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkyegomez%2Fshallowff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkyegomez%2Fshallowff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkyegomez%2Fshallowff/lists"}