{"id":18560766,"url":"https://github.com/eepp/findtb","last_synced_at":"2025-11-01T12:30:33.198Z","repository":{"id":57429619,"uuid":"257342274","full_name":"eepp/findtb","owner":"eepp","description":"Tarball finder","archived":false,"fork":false,"pushed_at":"2021-03-12T18:20:14.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-27T04:06:03.934Z","etag":null,"topics":["crawling","tarball"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eepp.png","metadata":{"files":{"readme":"README.adoc","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-20T16:35:19.000Z","updated_at":"2020-04-20T16:41:54.000Z","dependencies_parsed_at":"2022-09-17T15:42:36.919Z","dependency_job_id":null,"html_url":"https://github.com/eepp/findtb","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eepp%2Ffindtb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eepp%2Ffindtb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eepp%2Ffindtb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eepp%2Ffindtb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eepp","download_url":"https://codeload.github.com/eepp/findtb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239287705,"owners_count":19613976,"icon_url":"https://github.com/github.png","version"
:null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawling","tarball"],"created_at":"2024-11-06T22:04:31.808Z","updated_at":"2025-11-01T12:30:33.088Z","avatar_url":"https://github.com/eepp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"// Render with Asciidoctor\n\n= findtb\nPhilippe Proulx \u003chttps://eepp.ca/\u003e\n:toc:\n\nimage:https://img.shields.io/pypi/v/findtb.svg?label=Latest%20version[link=\"https://pypi.python.org/pypi/findtb\"]\n\n_**findtb**_ is a Python{nbsp}3 package which finds\nhttps://en.wikipedia.org/wiki/Tar_(computing)[tarball] URLs within an\nHTML (HTTP) page or FTP directory.\n\nfindtb finds tarball URLs directly in the URL's resource as well as in\nminor/path version directories, if any.\n\n== Installation\n\nInstall findtb from https://pypi.org/project/findtb/[PyPI]:\n\n----\n$ sudo pip3 install findtb\n----\n\n== Usage\n\nUse `findtb.find_tarball_urls()` to get a set of all the tarball URLs\nfound from a given URL:\n\n[source,python]\n----\nimport findtb\nimport pprint\n\n# find tarball URLs\nurls = findtb.find_tarball_urls('ansible',\n                                'https://releases.ansible.com/ansible/')\n\n# print URLs\npprint.pprint(urls)\n----\n\n----\n{'https://releases.ansible.com/ansible/ansible-1.1.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-1.2.1.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-1.2.2.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-1.2.3.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-1.2.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-1.3.0.tar.gz',\n ...\n 
'https://releases.ansible.com/ansible/ansible-2.9.0rc5.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.1.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.2.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.3.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.4.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.5.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.6.tar.gz',\n 'https://releases.ansible.com/ansible/ansible-2.9.7.tar.gz'}\n----\n\nThe first parameter is the project's name as expected in the tarball\nnames. The second parameter is the URL from which to start the search.\n\nYou can pass a `logging.Logger` object to `findtb.find_tarball_urls()`.\nThis can be useful as the search process can be long:\n\n[source,python]\n----\nimport findtb\nimport pprint\nimport logging\n\n# configure logging\nlogging.basicConfig(style='{',\n                    format='{asctime} [{name}] {{{levelname}}}: {message}')\n\n# create logger\nlogger = logging.getLogger('findtb')\nlogger.setLevel(logging.INFO)\n\n# find tarball URLs\nurls = findtb.find_tarball_urls('glib',\n                                'ftp://ftp.gnome.org/pub/gnome/sources/glib/',\n                                logger)\n\n# print URLs\npprint.pprint(urls)\n----\n\n----\n2020-04-20 12:08:22,200 [findtb] {INFO}: FTP: connecting to `ftp.gnome.org`.\n2020-04-20 12:08:27,908 [findtb] {INFO}: FTP: changing working directory to `/pub/gnome/sources/glib/`.\n2020-04-20 12:08:28,061 [findtb] {INFO}: FTP: listing files in `/pub/gnome/sources/glib/`.\n2020-04-20 12:08:28,824 [findtb] {INFO}: FTP: changing working directory to `/pub/gnome/sources/glib/2.39`.\n2020-04-20 12:08:28,983 [findtb] {INFO}: FTP: listing files in `/pub/gnome/sources/glib/2.39`.\n2020-04-20 12:08:29,770 [findtb] {INFO}: FTP: changing working directory to `..`.\n2020-04-20 12:08:29,922 [findtb] {INFO}: FTP: changing working directory to `/pub/gnome/sources/glib/2.19`.\n2020-04-20 
12:08:30,079 [findtb] {INFO}: FTP: listing files in `/pub/gnome/sources/glib/2.19`.\n2020-04-20 12:08:30,871 [findtb] {INFO}: FTP: changing working directory to `..`.\n2020-04-20 12:08:31,027 [findtb] {INFO}: FTP: changing working directory to `/pub/gnome/sources/glib/2.60`.\n2020-04-20 12:08:31,177 [findtb] {INFO}: FTP: listing files in `/pub/gnome/sources/glib/2.60`.\n...\n{'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.1/glib-1.1.12.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.1/glib-1.1.15.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.2/glib-1.2.0.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.2/glib-1.2.10.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.2/glib-1.2.2.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.2/glib-1.2.5.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/1.2/glib-1.2.6.tar.gz',\n ...\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.4.tar.bz2',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.4.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.5.tar.bz2',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.5.tar.gz',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.6.tar.bz2',\n 'ftp://ftp.gnome.org/pub/gnome/sources/glib/2.9/glib-2.9.6.tar.gz'}\n----\n\n`findtb.find_tarball_urls()` is quite low-level: it returns a set of\nstrings. 
Use `findtb.find_tarballs()` to get a set of `findtb.Tarball`\nobjects instead.\n\nA `findtb.Tarball` object contains useful properties such as `name`\n(tarball file name) and `version` (a `packaging.version.Version`\nobject):\n\n[source,python]\n----\nimport findtb\n\n# find tarballs\ntarballs = findtb.find_tarballs('python',\n                                'https://www.python.org/ftp/python/')\n\n# print names and versions\nfor tarball in tarballs:\n    print(tarball.name, tarball.version.release)\n----\n\n----\nPython-3.7.0b2.tgz (3, 7, 0)\nPython-3.4.0a1.tgz (3, 4, 0)\nPython-2.4.6.tgz (2, 4, 6)\nPython-3.3.4rc1.tar.xz (3, 3, 4)\nPython-3.1.3rc1.tar.bz2 (3, 1, 3)\npython-3.6.0b4-embed-win32.zip None\nPython-3.7.0a1.tgz (3, 7, 0)\npython-3.5.0rc3-embed-amd64.zip None\n...\nPython-2.2.1.tgz (2, 2, 1)\npython-3.9.0a4-embed-win32.zip None\nPython-2.7.1rc1.tgz (2, 7, 1)\nPython-3.4.0a3.tgz (3, 4, 0)\nPython-3.5.2.tar.xz (3, 5, 2)\nPython-3.0a2.tgz (3, 0)\nPython-3.6.6rc1.tgz (3, 6, 6)\n----\n\nAny networking/parsing error raises `findtb.Error`.\n\n== Limitations\n\nfindtb is not guaranteed to work for all projects. Its very\nsophisticated algorithms rely on nasty regular expressions and there are\ndozens of software versioning schemes in the wild.\n\nFeel free to contribute if findtb does not work for you.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feepp%2Ffindtb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feepp%2Ffindtb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feepp%2Ffindtb/lists"}