{"id":26276789,"url":"https://github.com/codeupdaterbot/aivsstep","last_synced_at":"2025-05-06T20:22:47.684Z","repository":{"id":273958669,"uuid":"921447199","full_name":"CodeUpdaterBot/AIvsSTEP","owner":"CodeUpdaterBot","description":"A Python program that uses only the text from the the 2024 STEP-Practice Exams 1-3 and runs it through OpenAI, Claude, Groq, OpenRouter, and open-source Ollama models to see how each scores and which is the best overall.","archived":false,"fork":false,"pushed_at":"2025-02-27T10:58:29.000Z","size":245,"stargazers_count":6,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-31T02:34:50.116Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CodeUpdaterBot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-24T00:49:39.000Z","updated_at":"2025-03-18T20:52:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"4dd266fc-e9d4-4452-8824-d4af26743f9b","html_url":"https://github.com/CodeUpdaterBot/AIvsSTEP","commit_stats":null,"previous_names":["codeupdaterbot/aivsstep"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeUpdaterBot%2FAIvsSTEP","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeUpdaterBot%2FAIvsSTEP/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeUpdaterBot%2FAIvsSTEP/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeUpdaterBot%2FAIvsSTEP/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CodeUpdaterBot","download_url":"https://codeload.github.com/CodeUpdaterBot/AIvsSTEP/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252762016,"owners_count":21800237,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-14T11:19:31.174Z","updated_at":"2025-05-06T20:22:47.665Z","avatar_url":"https://github.com/CodeUpdaterBot.png","language":"Python","readme":"# AI vs STEP\nA Python program that uses **only the text** from the 2024 STEP-Practice Exams 1-3 and runs it through OpenAI, Claude, Groq, OpenRouter, and open-source Ollama models to see how each scores and which is the best overall.\n\n![o1](https://github.com/user-attachments/assets/96a8ab8a-0b60-4496-a6ab-5e4e3f3ffa18)\n\n![Step1-3 All models](https://github.com/user-attachments/assets/b7e54165-d322-40c3-a1a2-248afb83df92)\n\nModels used (2-2-2025):\n\n    mistralai/mistral-large-2411, qwen/qwen-max, meta-llama/llama-3.1-405b-instruct, deepseek-r1-distill-llama-70b, mistralai/mistral-small-24b-instruct-2501, deepseek/deepseek-r1, gpt-4o, gpt-4o-mini, o1-mini, o1-preview, o3-mini-low, o3-mini-medium, o3-mini-high, claude-3-5-haiku-latest, claude-3-5-sonnet-20241022, claude-3-opus-latest\n\nChoose which models to use at the bottom (uncomment to run):\n\n    ### Open Source APIs\n    {\"model_name\": \"mistralai/mistral-large-2411\", \"engine\": \"openrouter\"},\n    {\"model_name\": 
\"qwen/qwen-max\", \"engine\": \"openrouter\"},\n    {\"model_name\": \"meta-llama/llama-3.1-405b-instruct\", \"engine\": \"openrouter\"},\n    {\"model_name\": \"deepseek-r1-distill-llama-70b\", \"engine\": \"groq\"},\n    {\"model_name\": \"mistralai/mistral-small-24b-instruct-2501\", \"engine\": \"openrouter\"},\n    {\"model_name\": \"llama-3.3-70b-versatile\", \"engine\": \"groq\"},\n    {\"model_name\": \"deepseek/deepseek-r1\", \"engine\": \"openrouter\"},\n\n    ### OpenAI\n    {\"model_name\": \"gpt-4o\", \"engine\": \"openai\"},\n    {\"model_name\": \"gpt-4o-mini\", \"engine\": \"openai\"},\n    {\"model_name\": \"o1-mini\", \"engine\": \"openai\"},\n    {\"model_name\": \"o1-preview\", \"engine\": \"openai\"},\n    {\"model_name\": \"o3-mini-low\", \"engine\": \"openai\"},\n    {\"model_name\": \"o3-mini-medium\", \"engine\": \"openai\"},\n    {\"model_name\": \"o3-mini-high\", \"engine\": \"openai\"},\n\n    ### Claude / Anthropic\n    {\"model_name\": \"claude-3-5-haiku-latest\", \"engine\": \"claude\"},\n    {\"model_name\": \"claude-3-5-sonnet-20241022\", \"engine\": \"claude\"},\n    {\"model_name\": \"claude-3-opus-latest\", \"engine\": \"claude\"},\n\n    ### Ollama (local)\n    #{\"model_name\": \"mistral-small:24b-instruct-2501-q8_0\", \"engine\": \"ollama\"},\n    #{\"model_name\": \"deepseek-r1:14b-qwen-distill-q4_K_M\", \"engine\": \"ollama\"},\n    #{\"model_name\": \"mistral-small:24b-instruct-2501-q4_K_M\", \"engine\": \"ollama\"},\n    #{\"model_name\": \"openchat:7b-v3.5-1210-q8_0\", \"engine\": \"ollama\"},\n    #{\"model_name\": \"llama3.1:8b-instruct-q8_0\", \"engine\": \"ollama\"},\n\n## Run with:\nPython - https://www.python.org/downloads/\n\n    C:\\Users\\PC\\Downloads\\medbot\u003epython main.py\n\ncd to the folder where the main.py file is stored and run\n `python main.py` \n - See imports at the top of main.py, you might have to pip install a library like 'pip install anthropic'\n\n## **To use ChatGPT/OpenAI:**\n\n You must 
set the openai.api_key to your OpenAI API key\n\n## **To use Claude/Anthropic:**\n\n You must set the anthropic_api_key to your Claude API key\n\n## **To use Groq:**\n\n You must set the groq_api_key to your Groq API key\n\n## **To use OpenRouter (any LLM/Model):**\n\n You must set the Bearer Authorization to your OpenRouter API key\n\n## **To use Ollama models:**\n\n You must be running Ollama and have the proper model name entered. To check which Ollama models you have installed, open the Command Prompt and type 'ollama list'.\n- You only need a few GB of VRAM (GPU/video card memory) to run your own AI models for free\n\n## **Use this on any test**\n\n Replace the test data at the top with your own. The test must be multiple-choice.\n- You can copy/select the text in a PDF and use ChatGPT to provide you the test in a format matching the one provided\n\nNote: By my count, Step 1 has 5 images that actually require interpretation to answer the question correctly, meaning the best score possible is 95.8% (unless the model guesses lucky on one of those). This runs on only the text of the questions.\n\nPractice STEP 1: https://www.usmle.org/sites/default/files/2021-10/Step_1_Sample_Items.pdf\nStep 2 \u0026 3 are from the same source; see them in the .py file\nYou can Ctrl+A to select the content of a PDF and have ChatGPT convert it to a Python dictionary for you (provide it a few of the existing entries so it can match the format exactly)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodeupdaterbot%2Faivsstep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodeupdaterbot%2Faivsstep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodeupdaterbot%2Faivsstep/lists"}