{"id":26742809,"url":"https://github.com/murapadev/phinetuning","last_synced_at":"2025-12-26T14:02:13.522Z","repository":{"id":222147211,"uuid":"756393166","full_name":"murapadev/Phinetuning","owner":"murapadev","description":"A repository dedicated to finetuning phi2 models using advanced machine learning techniques. This includes training scripts, model evaluation methods, and data processing tools.","archived":false,"fork":false,"pushed_at":"2025-03-03T18:53:48.000Z","size":17,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-12T07:35:14.002Z","etag":null,"topics":["deep-learning","finetuning","machine-learning","model-training","models","natural-language-processing","nlp","phi2","python","pytorch","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/murapadev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-12T15:25:40.000Z","updated_at":"2025-06-14T10:56:56.000Z","dependencies_parsed_at":"2025-03-20T14:52:46.455Z","dependency_job_id":null,"html_url":"https://github.com/murapadev/Phinetuning","commit_stats":null,"previous_names":["murapa96/phinetuning","murapadev/phinetuning"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/murapadev/Phinetuning","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murapadev%2FPhinetuning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murapadev%2FPhinetuning/tags
","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murapadev%2FPhinetuning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murapadev%2FPhinetuning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/murapadev","download_url":"https://codeload.github.com/murapadev/Phinetuning/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murapadev%2FPhinetuning/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275312968,"owners_count":25442564,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-15T02:00:09.272Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","finetuning","machine-learning","model-training","models","natural-language-processing","nlp","phi2","python","pytorch","transformers"],"created_at":"2025-03-28T06:31:57.961Z","updated_at":"2025-09-15T20:05:53.798Z","avatar_url":"https://github.com/murapadev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Phinetuning\n\nAdvanced finetuning toolkit for Phi-2 and other language models, featuring distributed training, hyperparameter optimization, and mixed precision training.\n\n## Table of Contents\n\n- [Features](#features)\n- [Installation](#installation)\n- [Usage](#usage)\n  - 
[Training](#training)\n  - [Generation](#generation)\n- [Advanced Configuration](#advanced-configuration)\n- [Examples](#examples)\n- [Contributing](#contributing)\n- [License](#license)\n\n## Features\n\n- Distributed training support\n- Mixed precision training (FP16/BF16)\n- 4-bit and 8-bit quantization\n- Hyperparameter optimization using Optuna\n- Data augmentation techniques\n- Advanced text generation with various decoding strategies\n- Streaming generation support\n- Comprehensive logging and error handling\n- Support for multiple model architectures\n\n## Installation\n\n```bash\ngit clone https://github.com/murapa96/Phinetuning.git\ncd Phinetuning\npip install -r requirements.txt\n```\n\n## Usage\n\n### Training\n\nBasic training command:\n\n```bash\npython train.py \\\n  --model_path microsoft/phi-2 \\\n  --dataset_path your_dataset.json \\\n  --output_dir ./results \\\n  --batch_size 1 \\\n  --grad_accum_steps 4\n```\n\nEnable advanced features:\n\n```bash\npython train.py \\\n  --model_path microsoft/phi-2 \\\n  --dataset_path your_dataset.json \\\n  --output_dir ./results \\\n  --mixed_precision bf16 \\\n  --load_in_4bit \\\n  --distributed \\\n  --tune_hyperparams \\\n  --n_trials 10 \\\n  --augment_data\n```\n\n### Generation\n\nBasic text generation:\n\n```bash\npython generate.py \\\n  --model_path ./results/final_model \\\n  --input_text \"Your prompt here\" \\\n  --max_length 200\n```\n\nAdvanced generation with sampling:\n\n```bash\npython generate.py \\\n  --model_path ./results/final_model \\\n  --input_file inputs.json \\\n  --output_file outputs.json \\\n  --temperature 0.7 \\\n  --top_p 0.95 \\\n  --top_k 50 \\\n  --do_sample \\\n  --streaming \\\n  --mixed_precision\n```\n\n## Advanced Configuration\n\n### Training Arguments\n\n- `--mixed_precision`: Choose between 'no', 'fp16', or 'bf16'\n- `--load_in_4bit`: Enable 4-bit quantization\n- `--load_in_8bit`: Enable 8-bit quantization\n- `--distributed`: Enable distributed training\n- 
`--tune_hyperparams`: Enable Optuna hyperparameter tuning\n- `--augment_data`: Enable data augmentation\n- `--max_seq_length`: Maximum sequence length (default: 2048)\n- `--learning_rate`: Learning rate (default: 2e-4)\n- `--batch_size`: Per device batch size\n- `--grad_accum_steps`: Gradient accumulation steps\n\n### Generation Arguments\n\n- `--temperature`: Controls randomness (0.0-1.0)\n- `--top_k`: Top-k sampling parameter\n- `--top_p`: Nucleus sampling parameter\n- `--num_beams`: Number of beams for beam search\n- `--repetition_penalty`: Penalize repeated tokens\n- `--streaming`: Enable token-by-token generation\n- `--mixed_precision`: Enable mixed precision inference\n- `--load_in_4bit`: Enable 4-bit quantization for memory efficiency\n\n## Examples\n\n1. **Distributed training with mixed precision:**\n\n```bash\npython train.py \\\n  --model_path microsoft/phi-2 \\\n  --dataset_path your_dataset.json \\\n  --distributed \\\n  --mixed_precision bf16 \\\n  --batch_size 2 \\\n  --grad_accum_steps 4 \\\n  --learning_rate 2e-4 \\\n  --max_steps 10000\n```\n\n2. **Hyperparameter optimization:**\n\n```bash\npython train.py \\\n  --model_path microsoft/phi-2 \\\n  --dataset_path your_dataset.json \\\n  --tune_hyperparams \\\n  --n_trials 20 \\\n  --load_in_4bit\n```\n\n3. **Batch generation with advanced sampling:**\n\n```bash\npython generate.py \\\n  --model_path ./results/final_model \\\n  --input_file inputs.json \\\n  --output_file outputs.json \\\n  --batch_size 4 \\\n  --temperature 0.7 \\\n  --top_p 0.95 \\\n  --do_sample \\\n  --mixed_precision\n```\n\n## Contributing\n\nWe welcome contributions! Please:\n\n1. Fork the repository\n2. Create a feature branch: `git checkout -b feature-name`\n3. Commit your changes: `git commit -am 'Add feature'`\n4. Push to the branch: `git push origin feature-name`\n5. 
Submit a pull request\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmurapadev%2Fphinetuning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmurapadev%2Fphinetuning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmurapadev%2Fphinetuning/lists"}