{"id":27734354,"url":"https://github.com/rosinality/halite","last_synced_at":"2025-07-28T19:10:54.366Z","repository":{"id":259383542,"uuid":"748054378","full_name":"rosinality/halite","owner":"rosinality","description":"Acceleration framework for Human Alignment Learning","archived":false,"fork":false,"pushed_at":"2025-07-03T01:14:51.000Z","size":640,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-07-03T02:26:01.963Z","etag":null,"topics":["evaluation-framework","inference","large-language-models","proximal-policy-optimization","reinforcement-learning","reinforcement-learning-from-human-feedback","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rosinality.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-01-25T07:13:32.000Z","updated_at":"2025-07-03T02:06:56.000Z","dependencies_parsed_at":"2024-10-25T01:29:46.076Z","dependency_job_id":"5c98f000-fe7d-4ed8-9ff1-4491831328e1","html_url":"https://github.com/rosinality/halite","commit_stats":null,"previous_names":["rosinality/halite"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rosinality/halite","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosinality%2Fhalite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosinality%2Fhalite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosinality%2Fhalite/relea
ses","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosinality%2Fhalite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rosinality","download_url":"https://codeload.github.com/rosinality/halite/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosinality%2Fhalite/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265577779,"owners_count":23791225,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation-framework","inference","large-language-models","proximal-policy-optimization","reinforcement-learning","reinforcement-learning-from-human-feedback","transformers"],"created_at":"2025-04-28T13:08:44.175Z","updated_at":"2025-07-17T07:34:46.683Z","avatar_url":"https://github.com/rosinality.png","language":"Python","readme":"# halite\n\nHalite is an acceleration framework for pre-training, post-training, inference, and evaluation of large language models, built from scratch with PyTorch.\n\nThis is my ongoing project, but I designed this framework with the following things in mind.\n\n- **Post-Training**: Halite started from my earlier work on accelerating post-training of LLMs, especially RLHF and PPO. Halite supports an easier way to implement various sophisticated alignment techniques.\n- **Transformers**: Halite supports design and modification of novel transformer architectures with composable components. 
None of the components are tied to a specific architecture, and you can compose them in your config alone, without any framework-level code changes, thanks to [slickconf](https://github.com/rosinality/slickconf). Of course, it also supports converting checkpoints from other frameworks in a declarative way.\n- **Parallelism**: Halite is designed to support multi-dimensional parallelism, not only plain FSDP, in a performant and flexible way without hassle.\n- **Inference**: Most post-training methods need to sample from the model, a lot. Efficient sampling is crucial for a post-training framework to be practical. Halite embeds an inference engine inspired by [SGLang](https://github.com/sgl-project/sglang) that lets you switch the model between training and inference modes without any additional cost or checkpointing.\n- **Evaluation**: There are great frameworks for evaluating LLMs, like [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). But if you have a framework that allows fast inference, then it is convenient to have a unified framework that also supports evaluation.\n- **Pre-Training**: It is safer to use verified frameworks for experiments like pre-training, which require a lot of compute. But if you have a framework that allows flexible configurations, various architectures, efficient parallelization, and evaluation, then it is useful to have support for pre-training, especially for small-scale exploratory experiments. 
Actually, pre-training is just one of the many kinds of experiments that can be implemented with Halite, like many post-training methods.\n\n## Overall Structure\n\n```\nconfigs/            root directory for config files\nsrc/halite          root directory for halite library\n    data/           dataset loading and preprocessing related tools\n    projects/       root directory for experiment and method related codes, like PPO, evaluation, etc\n    transformers/   composable components for building transformer architectures\n        infer/      inference engine for models composed using components above\nscripts/            root directory for experiment and utility scripts\n```\n\n## Configuration\n\nThe aspect in which Halite differs most from other frameworks is its configuration system. Many will find it unfamiliar.\n\nSlickConf, the configuration system used in Halite, is inspired by other configuration systems: [Hydra](https://hydra.cc/), [detectron2](https://detectron2.readthedocs.io/en/latest/tutorials/lazyconfigs.html), and [Fiddle](https://github.com/google/fiddle). It allows you to use Python code to define your configuration, and to set Python classes or functions in the config. 
But, importantly, it converts these classes or functions into a dictionary without Python dependencies, and validates the config with pydantic.\n\nFor example, the Llama 3 architecture is defined as follows in the [config file](https://github.com/rosinality/halite/blob/main/configs/models/llama/llama3_2_3b.py):\n\n```python\nfrom halite.transformers.position import Llama3RoPE, apply_rotary_emb\n\nfrom ..transformer import transformer\n\nconf = field()\n\ndim = 3072\nn_heads = 24\nhead_dim = dim // n_heads\ncontext_len = 8192\nuse_complex_rope = True\nqkv_split = True\n\ntransformer_config = field(\n    vocab_size=128256,\n    dim=dim,\n    n_heads=n_heads,\n    head_dim=head_dim,\n    n_layers=28,\n    n_key_value_heads=8,\n    intermediate_size=8192,\n    rms_norm_epsilon=1e-5,\n    context_len=context_len,\n    pos_embed=Llama3RoPE(\n        head_dim,\n        context_len,\n        use_scaled_rope=True,\n        use_complex=use_complex_rope,\n    ),\n    pos_embed_apply_fn=partial(apply_rotary_emb, use_complex=use_complex_rope),\n    qkv_split=qkv_split,\n    gated_ff_split=qkv_split,\n)\n\nconf.model = call[transformer](**transformer_config)\n```\n\nBecause you can use Python classes and functions, you can compose your model just in your config, without any framework-level code changes. (For example, in the example above you can change the position embedding in your config.) In fact, the transformer itself is configured in the [config](https://github.com/rosinality/halite/blob/main/configs/models/transformer.py), composed of components defined in the [transformers directory](https://github.com/rosinality/halite/tree/main/src/halite/transformers).\n\nThis allows you to extend the framework easily. 
For example, if you want to use a new optimizer, you can just assign it in the configuration, like [this](https://github.com/rosinality/halite/blob/main/configs/lm/scale_383m_shampoo.py):\n\n```python\nfrom distributed_shampoo.distributed_shampoo import DistributedShampoo\nfrom distributed_shampoo.shampoo_types import (\n    AdamGraftingConfig,\n    FullyShardShampooConfig,\n    PrecisionConfig,\n)\n\nconf.training = field(\n    train_batch_size=320,\n    eval_batch_size=320,\n    max_iter=50000,\n    gradient_checkpointing=False,\n    optimizer=partial(\n        DistributedShampoo,\n        lr=lr,\n        betas=(0.9, 0.95),\n        epsilon=1e-12,\n        max_preconditioner_dim=8192,\n        precondition_frequency=10,\n        use_decoupled_weight_decay=True,\n        inv_root_override=2,\n        distributed_config=FullyShardShampooConfig(),\n        grafting_config=AdamGraftingConfig(\n            beta2=0.95,\n            epsilon=1e-08,\n        ),\n    ),\n    scheduler=partial(\n        lr_scheduler.cycle_scheduler,\n        lr=lr,\n        initial_multiplier=1e-6,\n        warmup=5000,\n        decay=(\"linear\", \"cos\"),\n    ),\n    criterion=CrossEntropyLoss(z_loss=1e-4, fast=True),\n    weight_decay=weight_decay,\n    clip_grad_norm=1.0,\n    n_epochs=1,\n)\n```\n\nIn the example above I used the `DistributedShampoo` optimizer from [Optimizers](https://github.com/facebookresearch/optimizers) directly. You don't need any code changes to the Halite framework itself. You don't need to add configuration fields, `if` conditions, and so on. It is just function assignment and composition.\n\nYou may feel this is too complex compared with simple YAML-based configuration systems. But Halite is tightly coupled with this style of configuration, and it would be hard to use without it. 
(For example, transformers consist of individual components, and it is hard to compose them into a working model without this style of configuration.)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frosinality%2Fhalite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frosinality%2Fhalite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frosinality%2Fhalite/lists"}