# Attention

> Repository: https://github.com/SavinRazvan/attention (MIT license)

This project demonstrates how to visualize the attention mechanism of a pre-trained BERT model. The script lets users input text containing a mask token, retrieves the top predictions for that token, and generates attention diagrams showing the attention weights across different layers and heads.

## Description

This project uses the BERT (Bidirectional Encoder Representations from Transformers) model to predict masked words within a text sequence. BERT is a transformer-based language model developed by Google, trained to predict a masked word from its surrounding context. The project involves two main tasks:

1. **Masked Word Prediction**: Using the Hugging Face transformers library, the program predicts masked words in an input text. The user inputs text containing a `[MASK]` token, and the program outputs the top predictions for the masked word.

2. **Attention Visualization**: The program generates diagrams visualizing attention scores for each of BERT's 144 attention heads (12 layers × 12 heads). These diagrams help users understand which words BERT focuses on when making predictions.

## Features

- **Masked Word Prediction**: Predicts the masked word in a text sequence using BERT.
- **Attention Diagrams**: Generates and saves visualizations of attention scores for each attention head across all layers.
- **Configurable Parameters**: Allows customization of the number of predictions, font settings, grid size, and output options.
- **Console Output**: Optionally prints predictions and attention values to the console.
- **Output Directory**: Saves generated diagrams to a specified output folder.

## Goal

The goal of this project is to provide insight into the inner workings of the BERT model by visualizing its attention mechanism. By analyzing the attention diagrams, users can better understand what BERT pays attention to when processing language, which is particularly useful for debugging and improving language models.

## Getting Started

1. **Installation**:
   - Clone the repository and navigate to the project directory.
   - Create and activate a virtual environment.
   - Install the required packages with `pip install -r requirements.txt`.

2. **Running the Script**:
   - Execute the script with `python script.py`.
   - Enter text containing the `[MASK]` token when prompted.
   - View the predictions and attention diagrams according to the configured output options.

## Example Usage

```bash
$ python script.py
Text: We turned down a narrow lane and passed through a small [MASK].
Top Predictions:
1. We turned down a narrow lane and passed through a small field.
2. We turned down a narrow lane and passed through a small clearing.
3. We turned down a narrow lane and passed through a small park.
```

## Configuration Parameters

- **MODEL**: Identifier for the pre-trained masked language model (e.g., `"bert-base-uncased"`).
- **K**: Number of top predictions to generate for the mask token.
- **FONT**: Default font for drawing text on images; can be adjusted as needed.
- **BASE_GRID_SIZE**: Base size of each grid cell in the attention diagram.
- **PIXELS_PER_WORD**: Pixels allocated per word when sizing the image.
- **PRINT_TO_CONSOLE**: If `True`, prints the predictions and attention values to the console.
- **SAVE_TO_OUTPUT_FOLDER**: If `True`, saves the generated diagrams to the output folder.

## License

This project is licensed under the MIT License.

> For further details and updates, see the [CS50 AI course page](https://cs50.harvard.edu/ai/2024/projects/6/attention/).
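The attention scores each diagram renders are the scaled dot-product weights softmax(QKᵀ/√d) for a single head: row *i* is token *i*'s attention distribution over every token in the sequence. A minimal NumPy sketch of that quantity, with toy data and illustrative names only (this is not the repository's code):

```python
import numpy as np

def attention_weights(q, k):
    """Scaled dot-product attention weights: softmax(q @ k.T / sqrt(d))."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

# Toy inputs standing in for one head's query/key projections of 5 tokens.
rng = np.random.default_rng(0)
n_tokens, d_head = 5, 8
q = rng.normal(size=(n_tokens, d_head))
k = rng.normal(size=(n_tokens, d_head))

W = attention_weights(q, k)
# W[i, j] is how much token i attends to token j; each row sums to 1.
# In an attention diagram, W[i, j] determines the shade of grid cell (i, j).
```

In the actual script, these weights would come from the model itself (Hugging Face's BERT classes can return per-layer, per-head attention tensors when attention outputs are requested) rather than from random projections as in this toy sketch.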