{"id":30288570,"url":"https://github.com/da03/epanadiplosis_benchmark","last_synced_at":"2025-08-16T22:37:55.377Z","repository":{"id":282936599,"uuid":"615491484","full_name":"da03/Epanadiplosis_Benchmark","owner":"da03","description":"Benchmarking the performance of various language models in generating epanadiplosis, i.e., generating sentences that start with and end with the same word.","archived":false,"fork":false,"pushed_at":"2023-03-18T00:20:13.000Z","size":202,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-17T19:11:40.622Z","etag":null,"topics":["nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/da03.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-17T20:28:14.000Z","updated_at":"2023-06-15T03:46:48.000Z","dependencies_parsed_at":"2025-03-17T19:11:42.954Z","dependency_job_id":"25fbe137-1695-4b03-b735-42f059bc9b01","html_url":"https://github.com/da03/Epanadiplosis_Benchmark","commit_stats":null,"previous_names":["da03/epanadiplosis_benchmark"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/da03/Epanadiplosis_Benchmark","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/da03%2FEpanadiplosis_Benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/da03%2FEpanadiplosis_Benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/da03%2FEpanadiplosis_Benchmark/releases","man
ifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/da03%2FEpanadiplosis_Benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/da03","download_url":"https://codeload.github.com/da03/Epanadiplosis_Benchmark/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/da03%2FEpanadiplosis_Benchmark/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270781214,"owners_count":24643808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nlp"],"created_at":"2025-08-16T22:37:54.344Z","updated_at":"2025-08-16T22:37:55.356Z","avatar_url":"https://github.com/da03.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Epanadiplosis Benchmark\n\n![ChatGPT Demo](./epanadiplosis_chatgpt.png)\n![GPT-4 Demo](./epanadiplosis_gpt4.png)\n\nBenchmarking the performance of various language models in generating [epanadiplosis](https://en.wiktionary.org/wiki/epanadiplosis), i.e., generating sentences that start with and end with the same word. 
Below is an example from Philippians 4:4:\n\n```\nRejoice in the Lord always: and again I say, Rejoice.\n```\n\n## Dependencies\n\nInstall the required dependencies with:\n\n```\npip install -r requirements.txt\n```\n\n## Usage\n\nFirst, make sure that your OpenAI API key is set as an environment variable `API_KEY`:\n\n```\nexport API_KEY=\"my_api_key\"\n```\n\nThen, run the script using the following command:\n\n```\npython main.py --api_key ${API_KEY} --num_generations 100\n```\n\n## Results\n\n| Model            | Success Rate↑  (%) | Repetitiveness↓  (%) |\n|------------------|--------------------|----------------------|\n| code-davinci-002 | 4                  | 63                   |\n| text-davinci-002 | 18                 | 89                   |\n| text-davinci-003 | 43                 | 67                   |\n| gpt-3.5-turbo    | 22                 | 63                   |\n| gpt-4            | 92                 | 98                   |\n\nEvaluation metrics:\n\n* Success rate: The proportion of generated sentences that demonstrate epanadiplosis (i.e., using the same word as both the first and last word in the sentence).\n* Repetitiveness: The proportion of first words (counting from the second generation) that have been used in previous generations. For example, if all generated sentences start with the same word \"Dreams\", then this number will be 100%.\n\nThe raw generations corresponding to the above results are included in [evaluation_results.json](./evaluation_results.json) as part of the repo. Note that each run will result in different generations since we cannot set the random seed using the OpenAI API.\n\n## Prompt\n\nThe prompt used to obtain the results in this benchmark is:\n\n```\nWrite a sentence with at least 10 words that begins with and ends with the same word:\n```\n\nThe requirement of \"at least 10 words\" helps ensure that the generated sentences are of a reasonable length. 
Without this constraint, some models tend to produce short and ungrammatical phrases.\n\n## Acknowledgements\n\nThis repository was inspired by Yao Fu and Litu Ou's [Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance](https://github.com/FranxYao/chain-of-thought-hub).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fda03%2Fepanadiplosis_benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fda03%2Fepanadiplosis_benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fda03%2Fepanadiplosis_benchmark/lists"}