{"id":25774392,"url":"https://github.com/codewithdark-git/latentrecurrentdepthlm","last_synced_at":"2025-02-27T05:29:48.913Z","repository":{"id":279333274,"uuid":"937671733","full_name":"codewithdark-git/LatentRecurrentDepthLM","owner":"codewithdark-git","description":"This project presents a novel approach that integrates latent representations with recurrent architectures for depth modeling. It is designed for efficient learning and inference in complex environments.","archived":false,"fork":false,"pushed_at":"2025-02-25T03:18:55.000Z","size":16,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-25T03:25:50.277Z","etag":null,"topics":["ann","hugging","huggingface-transformers","language-model","latent","llm","lm","model","nn","pythin","pytorch","recurrent-neural-networks","rlms","rnn","tlm","torch","transformer"],"latest_commit_sha":null,"homepage":"https://huggingface.co/codewithdark/latent-recurrent-depth-lm","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codewithdark-git.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-23T16:26:16.000Z","updated_at":"2025-02-25T03:24:36.000Z","dependencies_parsed_at":"2025-02-25T03:25:53.535Z","dependency_job_id":"8691b09d-bdb2-4edb-86d6-f6eddafb8f4c","html_url":"https://github.com/codewithdark-git/LatentRecurrentDepthLM","commit_stats":null,"previous_names":["codewithdark-git/latentrecurrentdepthlm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/codewithdark-git%2FLatentRecurrentDepthLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codewithdark-git%2FLatentRecurrentDepthLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codewithdark-git%2FLatentRecurrentDepthLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codewithdark-git%2FLatentRecurrentDepthLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codewithdark-git","download_url":"https://codeload.github.com/codewithdark-git/LatentRecurrentDepthLM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240983632,"owners_count":19888731,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ann","hugging","huggingface-transformers","language-model","latent","llm","lm","model","nn","pythin","pytorch","recurrent-neural-networks","rlms","rnn","tlm","torch","transformer"],"created_at":"2025-02-27T05:29:47.571Z","updated_at":"2025-02-27T05:29:48.899Z","avatar_url":"https://github.com/codewithdark-git.png","language":"Python","readme":"# Latent Recurrent Depth Language Model\n\n[![Model](https://img.shields.io/badge/transformer-Model-orange?logo=pytorch)](https://huggingface.co/codewithdark/latent-recurrent-depth-lm)  \n[![Hugging Face](https://img.shields.io/badge/HuggingFace-Space-yellow?logo=huggingface)](https://huggingface.co/spaces/codewithdark/LatentRecurrentDepthLM)  
\n[![arXiv](https://img.shields.io/badge/arXiv-2502.05171-b31b1b.svg)](https://arxiv.org/abs/2502.05171)  \n[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)  \n\nWelcome to the **Latent Recurrent Depth Language Model** repository! This project provides an implementation of a deep language model that combines latent recurrent architectures with modern attention mechanisms. The model is designed for efficient sequence modeling and language understanding tasks.\n\n---\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Features](#features)\n- [Directory Structure](#directory-structure)\n- [Installation](#installation)\n- [Usage](#usage)\n  - [Data Preparation](#data-preparation)\n  - [Training](#training)\n  - [Inference](#inference)\n- [Model Architecture](#model-architecture)\n- [Push to Hub](#push-to-hub)\n- [Contributing](#contributing)\n- [License](#license)\n\n---\n\n## Overview\n\nThis repository implements a novel language modeling architecture that leverages:\n- **Latent Recurrent Blocks**: To capture long-term dependencies.\n- **Multi-Head Attention**: For modeling complex interactions between tokens.\n- **Deep Stacking of Model Blocks**: To achieve depth and expressivity in the network.\n\nThe project is modularized to separate concerns such as data handling, tokenization, model definition, training pipelines, and inference utilities. 
This makes it easy to experiment with different configurations and extend the model.\n\n---\n\n## Features\n\n- **Custom Dataset Processing**: Easily preprocess and load your text data using `dataset.py`.\n- **Flexible Training Pipeline**: Train the model with configurable options using `trainer.py` and `pipeline.py`.\n- **Inference Utilities**: Generate sequences or test model predictions with scripts in the `Inference/` directory.\n- **Model Hub Integration**: Push trained models to popular hubs using `push_to_hub.py`.\n- **Modular Model Design**: Separate model components in the `Model/` directory including:\n  - `latent_Recurrent.py`\n  - `recurrent_Block.py`\n  - `prelude_Block.py`\n  - `codaBlock.py`\n  - `multi_head_Attention.py`\n\n---\n\n## Directory Structure\n\n```plaintext\ncodewithdark-git-latentrecurrentdepthlm/\n├── README.md\n├── LICENSE\n├── dataset.py\n├── pipeline.py\n├── push_to_hub.py\n├── tokenizer.py\n├── trainer.py\n├── Inference/\n│   ├── One_word.py\n│   ├── Squence_Generator.py\n│   └── locally.py\n└── Model/\n    ├── codaBlock.py\n    ├── latent_Recurrent.py\n    ├── model.py\n    ├── multi_head_Attention.py\n    ├── prelude_Block.py\n    └── recurrent_Block.py\n```\n\n- **Root Files**: Core utilities for data processing, training, tokenization, and hub integration.\n- **Inference/**: Contains scripts for various inference scenarios:\n  - `One_word.py`: For single-word prediction and quick testing.\n  - `Squence_Generator.py`: For generating sequences.\n  - `locally.py`: For running inference locally.\n- **Model/**: Contains model definitions and components that build the architecture.\n\n---\n\n## Installation\n\n1. **Clone the Repository:**\n\n   ```bash\n   git clone https://github.com/codewithdark-git/LatentRecurrentDepthLM.git\n   cd LatentRecurrentDepthLM\n   ```\n\n2. 
**Create a Virtual Environment (Optional but Recommended):**\n\n   ```bash\n   python -m venv venv\n   source venv/bin/activate   # On Windows use `venv\\Scripts\\activate`\n   ```\n\n3. **Install Dependencies:**\n\n   Install the required Python packages. For example, if using `pip`:\n\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n   \u003e **Note:** If a `requirements.txt` is not provided, ensure you have the following installed:\n   \u003e - Python 3.7+\n   \u003e - PyTorch\n   \u003e - NumPy\n   \u003e - (Any other library required by your specific implementation)\n\n---\n\n## Usage\n\n### Data Preparation\n\nUse `dataset.py` to preprocess and load your text data.\n\n### Training\n\nStart training the model by running the pipeline. You can adjust hyperparameters and training configurations within `pipeline.py`.\n\n### Inference\n\nGenerate predictions with the scripts in the `Inference/` directory: `One_word.py` for single-word prediction, `Squence_Generator.py` for generating sequences, and `locally.py` for running inference locally.\n\n---\n\n## Model Architecture\n\nThe model architecture is composed of several custom blocks:\n\n- **latent_Recurrent.py \u0026 recurrent_Block.py**: Implement the recurrent components for sequence modeling.\n- **prelude_Block.py \u0026 codaBlock.py**: Serve as the input and output blocks, respectively, to preprocess input tokens and postprocess model outputs.\n- **multi_head_Attention.py**: Implements multi-head attention mechanisms that allow the model to focus on different parts of the input simultaneously.\n- **model.py**: Combines all these components into a cohesive model that can be trained and evaluated.\n\nThe modular design allows for easy experimentation with different configurations and architectures.\n\n---\n\n## Push to Hub\n\nTo share your trained model with the community or deploy it on a model hub, use the `push_to_hub.py` script.\n\n---\n\n## Contributing\n\nContributions are welcome! If you have suggestions, bug fixes, or improvements, please open an issue or submit a pull request.\n\n1. Fork the repository.\n2. Create a new branch (`git checkout -b feature/your-feature`).\n3. 
Commit your changes (`git commit -am 'Add new feature'`).\n4. Push to the branch (`git push origin feature/your-feature`).\n5. Create a new Pull Request.\n\n---\n\n## License\n\nThis project is licensed under the terms of the [MIT License](LICENSE).\n\n---\n\nHappy Modeling!\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodewithdark-git%2Flatentrecurrentdepthlm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodewithdark-git%2Flatentrecurrentdepthlm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodewithdark-git%2Flatentrecurrentdepthlm/lists"}