{"id":26856611,"url":"https://github.com/ZhangYiqun018/GENOME","last_synced_at":"2025-03-31T00:03:05.969Z","repository":{"id":275377283,"uuid":"924991154","full_name":"ZhangYiqun018/GENOME","owner":"ZhangYiqun018","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-03T10:14:28.000Z","size":571,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-03T11:26:57.949Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ZhangYiqun018.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-31T02:21:51.000Z","updated_at":"2025-03-03T10:14:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"9c62e77d-e09b-44df-8a2b-9e30bbe85e78","html_url":"https://github.com/ZhangYiqun018/GENOME","commit_stats":null,"previous_names":["zhangyiqun018/genome"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZhangYiqun018%2FGENOME","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZhangYiqun018%2FGENOME/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZhangYiqun018%2FGENOME/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZhangYiqun018%2FGENOME/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ZhangYiqun018","download_url":"https://codeload.github.com/ZhangYiqun018/GENOME/tar.gz/
refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246395595,"owners_count":20770243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-31T00:02:40.477Z","updated_at":"2025-03-31T00:03:05.957Z","avatar_url":"https://github.com/ZhangYiqun018.png","language":"Python","funding_links":[],"categories":["基因"],"sub_categories":["资源传输下载"],"readme":"# GENOME(+)\n\n\u003e **GENOME(+)** is a framework for population-based evolution of Large Language Models (LLMs), inspired by natural evolution. Starting with a population of parent LLMs, the framework enables model evolution through five key operations:\n\n### Key Operations\n- **Crossover**: Merges weights of different parent LLMs to create offspring\n- **Mutation**: Introduces controlled random changes to foster diversity\n- **Selection**: Prioritizes high-performing models\n- **Succession**: Transfers learned experience from parents to offspring\n- **Ensemble**: Combines the strengths of multiple evolved models for robust predictions\n\n---\n\n🌟 **Key Features**:\n- **Rapid adaptation** with only 200 samples per new task\n- **No gradients** required for evolution\n- **Up to 54.8% accuracy gains** over initial population (on DROP dataset)\n- **Effective scaling** with populations up to 40 LLMs\n- **Zero-shot generalization** to unseen tasks\n- **Runs on a single 4090 GPU** (24GB memory)\n\n![GENOME+ Architecture](assets/genome.png)\n\n## 📦 Installation\n\n1. 
Clone the repository:\n\n```bash\ngit clone https://github.com/ZhangYiqun018/GENOME.git\ncd GENOME\n```\n\n2. Install dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\n## 🚀 Usage\n\n### GENOME\n\n```bash\npython run_genome.py \\\n    --tasks mmlu gsm8k arc_c \\\n    --task_weights 0.4 0.3 0.3 \\\n    --model_path meta-llama/Meta-Llama-3-8B-Instruct \\\n    --lora_dir lora_adapters \\\n    --combine_method ties \\\n    --population_size 30 \\\n    --max_iter 50\n```\n\n## 📁 Project Structure\n\n```\nGENOME/\n├── src/                   # Source code\n│   ├── genome/            # Genome optimization algorithms\n│   ├── evaluate/          # Task evaluators (MMLU, GSM8K, etc.)\n│   ├── base/              # Base classes and configurations\n│   └── analysis/          # Analysis and visualization tools\n├── scripts/               # Utility scripts\n├── config/                # Configuration files\n├── datas/                 # Datasets\n└── run_genome.py          # GENOME entry point\n```\n\n## 🔧 Extension Guide\n\n### Adding New Evaluators\n\n1. 
Create a new evaluator class in `src/evaluate`:\n\n```python\nfrom src.evaluate.eval import Evaluator, Method, Split\nfrom typing import Dict, List\n\nclass NewTaskEvaluator(Evaluator):\n    def __init__(self):\n        super().__init__()\n        self.data = {}\n    \n    def load_data(self, split: str):\n        \"\"\"Load dataset for specific split\n        Args:\n            split: One of 'train', 'valid', 'test', 'full'\n        \"\"\"\n        data_path = f\"datas/new_task/{split}.jsonl\"\n        self.data[split] = self.load_jsonl(data_path)\n    \n    def api_evaluate(self, client: 'OpenAI', **kwargs) -\u003e float:\n        \"\"\"Evaluate using OpenAI API interface\n        Args:\n            client: OpenAI client instance\n            **kwargs: Additional parameters\n        Returns:\n            float: Evaluation score\n        \"\"\"\n        # Implement API-based evaluation\n        return score\n    \n    def local_evaluate(self, model: 'LLM', **kwargs) -\u003e float:\n        \"\"\"Evaluate using local vLLM model\n        Args:\n            model: vLLM model instance\n            **kwargs: Additional parameters\n        Returns:\n            float: Evaluation score\n        \"\"\"\n        # Implement local model evaluation\n        return score\n```\n\n2. Register the new evaluator in `src/evaluate/factory.py`:\n\n```python\nfrom enum import Enum\nfrom .new_task_evaluator import NewTaskEvaluator\n\nclass Benchmark(Enum):\n    # ... existing benchmarks ...\n    NEW_TASK = \"new_task\"  # Add new benchmark\n\nclass EvaluatorFactory:\n    def get_evaluator(self, task: str):\n        if isinstance(task, str):\n            task = Benchmark(task.lower())\n            \n        if not isinstance(task, Benchmark):\n            raise TypeError(f\"Task must be a string or Benchmark enum, got {type(task)}\")\n            \n        # ... 
existing evaluators ...\n        elif task == Benchmark.NEW_TASK:\n            return NewTaskEvaluator()\n        else:\n            raise ValueError(f\"Evaluator for task {task} not found.\")\n```\n\n\n### Adding New Methods\n\n1. Create a new method configuration class in `src/base`:\n\n```python\nfrom src.base.base_config import BaseConfig\n\nclass NewMethodConfig(BaseConfig):\n    def __init__(self, **kwargs):\n        super().__init__(**kwargs)\n        # Add method-specific configuration parameters\n        \n    def validate(self):\n        \"\"\"Validate configuration parameters\"\"\"\n        super().validate()\n        # Add method-specific validation\n```\n\n2. Create a new method class in `src`:\n\n```python\nfrom src.base.base_method import BaseMethod\n\nclass NewMethod(BaseMethod):\n    def __init__(self, config: 'NewMethodConfig'):\n        self.config = config\n        self.config.validate()\n        # Initialize method-specific properties\n        \n    def search(self):\n        \"\"\"Implement search logic\"\"\"\n        # Implement core optimization method logic\n```\n\n3. Create a run script `run_new_method.py`:\n\n```python\nimport argparse\nfrom src.new_method import NewMethod, NewMethodConfig\n\ndef parse_args():\n    parser = argparse.ArgumentParser()\n    # Add command line arguments\n    return parser.parse_args()\n\ndef main():\n    args = parse_args()\n    config = NewMethodConfig(**vars(args))\n    method = NewMethod(config)\n    method.search()\n\nif __name__ == \"__main__\":\n    main()\n```\n\n### Using New Components\n\n1. Using the new evaluator:\n\n```bash\npython run_genome.py \\\n    --tasks new_task \\\n    --task_weights 1.0 \\\n    # ... other parameters\n```\n\n2. Using the new optimization method:\n\n```bash\npython run_new_method.py \\\n    --model_path meta-llama/Meta-Llama-3-8B-Instruct \\\n    --lora_dir lora_adapters \\\n    --task mmlu \\\n    # ... 
method-specific parameters\n```\n\n## 📝 Notes\n\n- Ensure sufficient GPU memory for model deployment; the framework is reported to run on a single 4090 GPU (24GB memory)\n- vLLM is recommended for efficient inference\n- Performance can be tuned through parameters such as `--population_size`, `--max_iter`, and `--combine_method`\n\n\n## 📄 License\n\nThis project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FZhangYiqun018%2FGENOME","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FZhangYiqun018%2FGENOME","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FZhangYiqun018%2FGENOME/lists"}