{"id":23787794,"url":"https://github.com/schultzm/arbanker","last_synced_at":"2026-06-21T02:31:51.149Z","repository":{"id":40983588,"uuid":"178139874","full_name":"schultzm/ARBanker","owner":"schultzm","description":"Download metadata from the CDC and FDA AR Isolate Bank","archived":false,"fork":false,"pushed_at":"2022-07-01T06:54:06.000Z","size":449,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-21T11:39:53.180Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/schultzm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-28T06:25:16.000Z","updated_at":"2022-07-01T06:54:00.000Z","dependencies_parsed_at":"2022-09-05T08:50:32.703Z","dependency_job_id":null,"html_url":"https://github.com/schultzm/ARBanker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/schultzm/ARBanker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/schultzm%2FARBanker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/schultzm%2FARBanker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/schultzm%2FARBanker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/schultzm%2FARBanker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/schultzm","download_url":"https://codeload.github.com/schultzm/ARBanker/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/schultzm%2FARBanker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34592050,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-21T02:00:05.568Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-01T15:17:42.262Z","updated_at":"2026-06-21T02:31:51.121Z","avatar_url":"https://github.com/schultzm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ARBanker\n\n\n[![Build Status](https://travis-ci.com/schultzm/ARBanker.svg?branch=master)](https://travis-ci.com/schultzm/ARBanker)  \n[![codecov](https://codecov.io/gh/schultzm/ARBanker/branch/master/graph/badge.svg)](https://codecov.io/gh/schultzm/ARBanker)  \n[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)  \n[![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)](https://www.python.org/downloads/release/python-360/)  \n[![Powered by](https://img.shields.io/badge/powered%20by-jekyl-blue.svg)](https://schultzm.github.io/ARBanker/)  \n\n\nDownload metadata for isolates stored in the \n[CDC \u0026 FDA's Antibiotic Resistance Isolate Bank](https://www.cdc.gov/drugresistance/resistance-bank/index.html)  \n\nThis program will scrape the CDC webpages and parse out the tables to file for each AR Bank 
ID (\"isolate\").  \n\n## Motivation \n\nUpon requesting (via email) metadata from the CDC for all isolates stored in the ARIsolateBank, we were informed that there was no API, so we would need to click through manually to get the data tables (in Excel and PDF format).  Since this did not suit our purposes and other analysts were likely to need the same thing, ARBanker was born.\n\n## Authors\n\n[Mark Schultz](https://github.com/schultzm)  \n[Torsten Seemann](https://github.com/tseemann)  \n\n## Installation\n\nIf you don't already have it, install `python3` (with `brew install python3`).  Install [`pipenv`](https://docs.pipenv.org/en/latest/) (with `pip3 install pipenv`).  After this, do:  \n\n```\ngit clone https://github.com/schultzm/ARBanker.git\ncd ARBanker\npipenv --python 3.6 install\npipenv shell\narbanker test\n# cd to wherever, work wherever. arbanker will run from the venv until `exit` from venv.\n```\n\nIf at any time you need to exit the venv activated by `pipenv shell`, just do `exit`.  \nTo get back into the venv at a later time, do:  \n\n```\ncd ARBanker\npipenv shell\n```\n\n\nOn installing (i.e., `pipenv --python 3.6 install`), you should see something like:\n\n```\nCreating a virtualenv for this project…\nPipfile: ...\nUsing /usr/bin/python3 (3.6.8) to create virtualenv…\n⠸ Creating virtual environment...Using base prefix '/usr'\n  No LICENSE.txt / LICENSE found in source\nNew python executable in ...\nAlso creating executable in ...\nInstalling setuptools, pip, wheel...\ndone.\nRunning virtualenv with interpreter /usr/bin/python3\n\n✔ Successfully created virtual environment! 
\nVirtualenv location: ...\nInstalling dependencies from Pipfile.lock (7f0f58)…\n  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 6/6 — 00:00:06\nTo activate this project's virtualenv, run pipenv shell.\nAlternatively, run a command inside the virtualenv with pipenv run.\n```\n\nIf you want to run this program in parallel, use `gnu parallel`, installed via:  \n\n```\nbrew install parallel\n```\n\n#### Upgrading the installation  \n\nTo get the latest version, as arbanker is installed in editable mode, simply do:\n```\ncd ARBanker\ngit pull origin\n```\n\n### Testing installation\n\nRun the test suite:  \n```\narbanker test\n```\n\nYou should see:  \n\n```\ntest_hit_url (arbanker.tests.test_isolate.IsolateTestCasePass) ... ok\ntest_hit_xml (arbanker.tests.test_isolate.IsolateTestCasePass) ... ok\ntest_render_metadatatable (arbanker.tests.test_isolate.IsolateTestCasePass) ... ok\ntest_render_datatables (arbanker.tests.test_isolate.IsolateTestCasePass) ... ok\ntest_hit_url (arbanker.tests.test_isolate.IsolateTestCaseFail) ... ok\ntest_hit_xml (arbanker.tests.test_isolate.IsolateTestCaseFail) ... ok\ntest_render_metadatatable (arbanker.tests.test_isolate.IsolateTestCaseFail) ... ok\ntest_render_datatables (arbanker.tests.test_isolate.IsolateTestCaseFail) ... ok\n\n----------------------------------------------------------------------\nRan 8 tests in 11.004s\n\nOK\n\n```\n\n\n## Quick start\n\nIn this example we will get data for isolate number 1.  
\n\n```\narbanker grab 1\n```\n\nThe stdout should look like:\n\n```\nWritten ${HOME}/arbanker_results/Metadata/0001.tab.\nWritten ${HOME}/arbanker_results/MIC/0001.tab.\nWritten ${HOME}/arbanker_results/MMR/0001.tab.\n```\n\nAnd the contents of each file are outlined below.\n\n### Metadata/0001.tab  \n\n```\nAR Bank\tBiosample Accession\tPanel\tSpecies\n0001\tSAMN04014842\tEnterobacteriaceae Carbapenem Breakpoint\tEscherichia coli\n```\n\n### MMR/0001.tab\n\n```\nAR Bank\tCategory\tGene\n0001\tAminoglycoside\taac(6')Ib-cr,aadA5\n0001\tBeta-lactam\tKPC-3 ,OXA-1\n0001\tMacrolide-Lincosamide-Streptogramin\tmph(A)\n0001\tSulfonamides\tsul1\n0001\tTetracyclines\ttet(A)\n0001\tTrimethoprim\tdfrA17\n```\n\n### MIC/0001.tab\n\n```\nAR Bank\tDrug\tMIC (μg/ml)\tINT\n0001\tAmikacin\t16\tS\n0001\tAmpicillin\t\u003e32\tR\n0001\tAmpicillin/sulbactam\t\u003e32\tR\n0001\tAztreonam\t\u003e64\tR\n0001\tCefazolin\t\u003e8\tR\n0001\tCefepime\t\u003e32\tR\n0001\tCefotaxime\t\u003e64\tR\n0001\tCefotaxime/clavulanic acid\t8\t---\n0001\tCefoxitin\t\u003e16\tR\n0001\tCeftazidime\t128\tR\n0001\tCeftazidime/avibactam\t\u003c =0.5\tS\n0001\tCeftazidime/clavulanic acid\t\u003e64\t---\n0001\tCeftolozane/tazobactam\t\u003e16\tR\n0001\tCeftriaxone\t\u003e32\tR\n0001\tCiprofloxacin\t\u003e8\tR\n0001\tColistin\t0.5\t---\n0001\tDoripenem\t4\tR\n0001\tErtapenem\t8\tR\n0001\tGentamicin\t4\tS\n0001\tImipenem\t4\tR\n0001\tImipenem+chelators\t4\t---\n0001\tLevofloxacin\t\u003e8\tR\n0001\tMeropenem\t4\tR\n0001\tPiperacillin/tazobactam\t\u003e128\tR\n0001\tTetracycline\t\u003e32\tR\n0001\tTigecycline\t\u003c =0.5\tS\n0001\tTobramycin\t\u003e16\tR\n0001\tTrimethoprim/sulfamethoxazole\t\u003e8\tR\n```\n\n## Advanced usage\n\nOutput the data to a user-defined folder:  \n \n```\narbanker grab 1 -o ~/tmp/arbanker\n```\n\nRun `arbanker` for multiple queries in parallel, outputting to custom \ndestination:  \n\n```\nparallel --bar -j 8 arbanker grab {} -o ~/tmp/arbankerparallel ::: $(seq 1 500)\n```\n\n## 
Acknowledgements\n\n[Josua Schmid](https://github.com/schmijos) (wrote the [parser.py](https://github.com/schmijos/html-table-parser-python3/blob/master/html_table_parser/parser.py) script, which contains the `HTMLTableParser` class that I have used for this package)  \n[Microbiological Diagnostic Unit - Public Health Laboratory (my employer)](https://biomedicalsciences.unimelb.edu.au/departments/microbiology-Immunology/research/services/microbiological-diagnostic-unit-public-health-laboratory)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fschultzm%2Farbanker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fschultzm%2Farbanker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fschultzm%2Farbanker/lists"}