{"id":19932188,"url":"https://github.com/amazon-science/adaptive-in-context-learning","last_synced_at":"2026-02-25T20:31:41.614Z","repository":{"id":204585372,"uuid":"712024351","full_name":"amazon-science/adaptive-in-context-learning","owner":"amazon-science","description":"AdaICL: Which Examples to Annotate of In-Context Learning? Towards Effective and Efficient Selection","archived":false,"fork":false,"pushed_at":"2023-10-30T20:14:11.000Z","size":354,"stargazers_count":17,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-19T07:48:36.992Z","etag":null,"topics":["active-learning-in-nlp","few-shot-learning","in-context-learning","large-language-models","nlp"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2310.20046","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-10-30T16:34:21.000Z","updated_at":"2025-04-16T10:13:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"2aba5f37-b31b-45db-9b30-fd125cbd5344","html_url":"https://github.com/amazon-science/adaptive-in-context-learning","commit_stats":null,"previous_names":["amazon-science/adaptive-in-context-learning"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amazon-science/adaptive-in-context-learning","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fadaptive-in-context-learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fadaptive-in-cont
ext-learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fadaptive-in-context-learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fadaptive-in-context-learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amazon-science","download_url":"https://codeload.github.com/amazon-science/adaptive-in-context-learning/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Fadaptive-in-context-learning/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29838054,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-25T19:08:47.527Z","status":"ssl_error","status_checked_at":"2026-02-25T18:59:04.705Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning-in-nlp","few-shot-learning","in-context-learning","large-language-models","nlp"],"created_at":"2024-11-12T23:09:20.292Z","updated_at":"2026-02-25T20:31:41.597Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AdaICL: Which Examples to Annotate for In-Context Learning? 
Towards Effective and Efficient Selection\n\nIn this work, we investigate an active learning approach for ICL, where there is a limited budget for annotating examples. We propose a model-adaptive, optimization-free algorithm, termed AdaICL, which identifies examples that the model is uncertain about and performs semantic diversity-based example selection. Diversity-based sampling improves overall effectiveness, while uncertainty sampling improves budget efficiency and helps the LLM learn new information. Moreover, AdaICL poses its sampling strategy as a Maximum Coverage problem that dynamically adapts based on the model’s feedback and can be approximately solved via greedy algorithms.\n![AdaICL algorithm.](assets/AdaICL_alg.pdf \"AdaICL algorithm.\")\n\n\n\n## Installation\nTo set up the environment, run the following commands in a shell:\n```\nconda env create -f selective_annotation.yml\nconda activate selective_annotation\n```\nWe follow the general setup of Votek (\u003chttps://github.com/xlang-ai/icl-selective-annotation\u003e).\n\n## Usage\n\n### Datasets\n\nAll datasets will be automatically downloaded from huggingface/datasets and stored here.\n\n### End-to-end pipeline: selection, inference, evaluation\nThe script below uses GPT-Neo as the in-context learning model, TREC and SST2 as the tasks, and AdaICL as the selective annotation method, with an additional annotation budget of 20.\n```\nCUDA_VISIBLE_DEVICES=0 ./scripts/run_adaicl.sh\n```\n\nExample:\n```\nCUDA_VISIBLE_DEVICES=0 python main_adaptive_phases.py --evaluate_calibration --few_shot 5 --task_name ag_news --selective_annotation_method \"ada_icl_default\" --model_cache_dir \"models\" --data_cache_dir \"datasets\" --output_dir outputs --annotation_size 20 --model_name \"gpt-neo\" --seed 0 --init \"cluster\"  --sample_k \n```\n\n\n## Directory Layout\nBelow you can find the scripts to reproduce the key results.\n\n```bash\n./active-in-context-learning\n|---- MetaICL/                      # the model will be loaded similar to MetaICL for 
classification problems. That way we do not encounter invalid label generation.\n|---- logs/                         # Folder for storing logfiles.\n|---- outputs/                      # Folder for storing output results.\n|---- scripts/                      # Run these scripts to reproduce results.\n|\n|---- algorithms.py                 # k-means, fast-votek, model_uncertainty_estimation, votek utilities\n|---- annotation_methods.py         # Supported active learning algos.\n|---- get_task.py                   # Dataset-specific utilities.\n|---- main_adaptive_phases.py       # Execution of AL algos in an adaptive manner (inductive).\n|---- main_generative.py            # Generation tasks.\n|---- prompt_retrieval.py           # Retrieve prompts from annotated pool.\n|---- utils.py                      # BERT embeddings, plots, calibration error etc.\n```\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## License\n\nThis project is licensed under the Apache-2.0 License.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fadaptive-in-context-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Fadaptive-in-context-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Fadaptive-in-context-learning/lists"}