{"id":24818597,"url":"https://github.com/kempnerinstitute/tmrc","last_synced_at":"2025-10-13T19:30:35.491Z","repository":{"id":266147881,"uuid":"836897224","full_name":"KempnerInstitute/tmrc","owner":"KempnerInstitute","description":"Transformer Model Research Codebase (TMRC)","archived":false,"fork":false,"pushed_at":"2025-01-27T02:28:46.000Z","size":5563,"stargazers_count":2,"open_issues_count":14,"forks_count":0,"subscribers_count":3,"default_branch":"develop","last_synced_at":"2025-01-27T03:23:16.793Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://kempnerinstitute.github.io/tmrc/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KempnerInstitute.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-01T19:35:21.000Z","updated_at":"2025-01-27T02:28:50.000Z","dependencies_parsed_at":"2024-12-03T02:45:39.662Z","dependency_job_id":"fc7931a1-3dad-48c0-a3ba-4d31aa677379","html_url":"https://github.com/KempnerInstitute/tmrc","commit_stats":null,"previous_names":["kempnerinstitute/tmrc"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KempnerInstitute%2Ftmrc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KempnerInstitute%2Ftmrc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KempnerInstitute%2Ftmrc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KempnerInstitute%2Ftmrc/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/KempnerInstitute","download_url":"https://codeload.github.com/KempnerInstitute/tmrc/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":236385713,"owners_count":19140690,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-30T17:09:05.205Z","updated_at":"2025-10-13T19:30:35.486Z","avatar_url":"https://github.com/KempnerInstitute.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/KempnerInstitute/tmrc/actions/workflows/deploy-docs.yml\"\u003e\n    \u003cimg src=\"https://github.com/KempnerInstitute/tmrc/actions/workflows/deploy-docs.yml/badge.svg?branch=develop\" alt=\"docs\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/KempnerInstitute/tmrc/actions/workflows/python-package.yml\"\u003e\n    \u003cimg src=\"https://github.com/KempnerInstitute/tmrc/actions/workflows/python-package.yml/badge.svg\" alt=\"tests\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://codecov.io/gh/KempnerInstitute/tmrc\" \u003e \n    \u003cimg src=\"https://codecov.io/gh/KempnerInstitute/tmrc/graph/badge.svg?token=PONKB6HEEH\"/\u003e \n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n# TMRC\n\n_Transformer model research codebase_\n\nTMRC (Transformer Model Research Codebase) is a simple, explainable codebase to train transformer-based models. It was developed with simplicity and ease of modification in mind, particularly for researchers. 
The codebase will eventually be used to train foundation models and experiment with architectural and training modifications.\n\n## Documentation\n[TMRC Documentation](https://kempnerinstitute.github.io/tmrc/)\n\n\n## Installation\n\n- Step 1: Load required modules\n\n  If you are using the Kempner AI cluster, load the required modules:\n\n  ```bash\n  module load python/3.12.5-fasrc01\n  module load cuda/12.4.1-fasrc01\n  module load cudnn/9.1.1.17_cuda12-fasrc01\n  ```\n\n  If you are not using the Kempner AI cluster, install the PyTorch and CUDA dependencies by following the instructions on the [PyTorch website](https://pytorch.org). TMRC has been tested with torch `2.5.0+cu124` and Python `3.12`.\n\n- Step 2: Create a Conda environment\n\n  ```bash\n  conda create -n tmrc_env python=3.12\n  conda activate tmrc_env\n  ```\n\n- Step 3: Clone the repository\n\n  ```bash\n  git clone git@github.com:KempnerInstitute/tmrc.git\n  ```\n\n- Step 4: Install the package\n\n  ```bash\n  cd tmrc\n  pip install poetry\n  poetry install\n  ```\n\n## Running Experiments\n\n- Step 1: Log in to Weights \u0026 Biases to enable experiment tracking\n\n  ```bash\n  wandb login\n  ```\n\n### Single-GPU Training\n- Step 2: Request compute resources. For example, on the Kempner AI cluster, to request an H100 80GB GPU, run:\n\n  ```bash\n  salloc --partition=kempner_h100 --account=\u003cfairshare account\u003e --nodes=1 --ntasks=1 --cpus-per-task=24 --mem=375G --gres=gpu:1 --time=00-07:00:00\n  ```\n\n  If you are not using the Kempner AI cluster, you can run experiments on your local machine (if you have a GPU) or on cloud services like AWS, GCP, or Azure. TMRC should automatically find the available GPU. 
If there are no GPUs available, TMRC will run on the CPU (though this is not recommended, since training will be prohibitively slow for any reasonable model size).\n\n- Step 3: Activate the Conda environment\n\n  ```bash\n  conda activate tmrc_env\n  ```\n\n- Step 4: Launch training\n\n  ```bash\n  python src/tmrc/core/training/train.py\n  ```\n\n### Multi-node multiple-GPU Training\n- Step 2: Request compute resources. For example, on the Kempner AI cluster, to request eight H100 80GB GPUs across two nodes, run:\n\n  ```bash\n  salloc --partition=kempner_h100 --account=\u003cfairshare account\u003e --nodes=2 --ntasks-per-node=4 --ntasks=8 --cpus-per-task=24 --mem=375G --gres=gpu:4 --time=00-07:00:00\n  ```\n- Step 3: Activate the Conda environment\n\n  ```bash\n  conda activate tmrc_env\n  ```\n\n- Step 4: Launch training\n\n  ```bash\n  srun python src/tmrc/core/training/train.py\n  ```\n\u003e [!NOTE]\n\u003e For distributed training, TMRC uses `Distributed Data Parallelism (DDP)` by default. To use `Fully Sharded Data Parallelism (FSDP)` for larger models, set `distributed_strategy` to `fsdp` in the `training` section of the config file. See the next section for how to use a custom config file.\n\n### Configuration\n\nBy default, the training script uses the configuration defined in `configs/training/default_train_config.yaml`.\n\nTo use a custom configuration file, run:\n\n    python src/tmrc/core/training/train.py --config-name YOUR_CONFIG\n\n\u003e [!NOTE]\n\u003e The `--config-name` parameter should be specified without the `.yaml` extension.\n\n\u003e [!TIP]\n\u003e Configuration files should be placed in the `configs/training/` directory. For example, if your config is named `my_experiment.yaml`, use `--config-name my_experiment`.\n\nMake sure to change the `path` under the `datasets` block in the config file. 
\n\n## Build the documentation locally\n\n- Step 1: Install the required packages\n  ```bash\n  poetry install --with dev\n  ```\n\n- Step 2: Build the documentation\n  ```bash\n  cd docs\n  make html\n  ```\n\n- Step 3: Open the documentation in your browser\n  ```bash\n  open _build/html/index.html\n  ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkempnerinstitute%2Ftmrc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkempnerinstitute%2Ftmrc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkempnerinstitute%2Ftmrc/lists"}