{"id":21035872,"url":"https://github.com/archiveteam/ftp-grab","last_synced_at":"2025-05-15T14:31:25.478Z","repository":{"id":140744202,"uuid":"45554309","full_name":"ArchiveTeam/ftp-grab","owner":"ArchiveTeam","description":"Save all FTP sites!","archived":false,"fork":false,"pushed_at":"2016-03-04T00:06:07.000Z","size":19,"stargazers_count":8,"open_issues_count":1,"forks_count":2,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-05-07T10:32:48.365Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ArchiveTeam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2015-11-04T17:08:35.000Z","updated_at":"2025-02-13T03:31:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"6e9736bb-9a9b-4ba6-bd76-a0b99b895560","html_url":"https://github.com/ArchiveTeam/ftp-grab","commit_stats":{"total_commits":20,"total_committers":1,"mean_commits":20.0,"dds":0.0,"last_synced_commit":"57d732e881b6a26fd874631f712a24d23497d71a"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fftp-grab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fftp-grab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fftp-grab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fftp-grab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ArchiveTeam","download_url":"http
s://codeload.github.com/ArchiveTeam/ftp-grab/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254358720,"owners_count":22057961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T13:16:53.806Z","updated_at":"2025-05-15T14:31:25.158Z","avatar_url":"https://github.com/ArchiveTeam.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"ftp-grab\n=============\n\nMore information about the archiving project can be found on the ArchiveTeam wiki: [FTP](http://archiveteam.org/index.php?title=FTP)\n\nSetup instructions\n=========================\n\nBe sure to replace `YOURNICKHERE` with the nickname that you want to be shown as, on the tracker. You don't need to register it, just pick a nickname you like.\n\nIn most of the below cases, there will be a web interface running at http://localhost:8001/. If you don't know or care what this is, you can just ignore it—otherwise, it gives you a fancy view of what's going on.\n\n**If anything goes wrong while running the commands below, please scroll down to the bottom of this page. 
There's troubleshooting information there.**\n\nRunning with a warrior\n-------------------------\n\nFollow the [instructions on the ArchiveTeam wiki](http://archiveteam.org/index.php?title=Warrior) for installing the Warrior, and select the \"FTP\" project in the Warrior interface.\n\nRunning without a warrior\n-------------------------\nTo run this outside the warrior, clone this repository, cd into its directory and run:\n\n    pip install --upgrade seesaw\n\nGrab a copy of Wpull 1.2 from https://launchpad.net/wpull/+download:\n\n    wget https://launchpad.net/wpull/trunk/v1.2/+download/wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip\n    python -c \"import zipfile; f=zipfile.ZipFile('wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip'); f.extractall('./')\"\n    chmod +x ./wpull\n\nthen start downloading with:\n\n    run-pipeline pipeline.py --concurrent 2 YOURNICKHERE\n\nFor more options, run:\n\n    run-pipeline --help\n\nIf you don't have root access and/or your version of pip is very old, you can replace \"pip install --upgrade seesaw\" with:\n\n    wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py ; python get-pip.py --user ; ~/.local/bin/pip install --user seesaw\n\nso that pip and seesaw are installed in your home, then run\n\n    ~/.local/bin/run-pipeline pipeline.py --concurrent 2 YOURNICKHERE\n\nRunning multiple instances on different IPs\n-------------------------------------------\n\nThis feature requires seesaw version 0.0.16 or greater. 
Use `pip install --upgrade seesaw` to upgrade.\n\nUse the `--context-value` argument to pass in `bind_address=123.4.5.6` (replace the IP address with your own).\n\nExample of running 2 threads, no web interface, and Wget binding of IP address:\n\n    run-pipeline pipeline.py --concurrent 2 YOURNICKHERE --disable-web-server --context-value bind_address=123.4.5.6\n\nDistribution-specific setup\n-------------------------\n### For Debian/Ubuntu:\n\n    adduser --system --group --shell /bin/bash archiveteam\n    apt-get update \u0026\u0026 apt-get install -y git-core libgnutls-dev screen python-dev python-pip bzip2 zlib1g-dev unzip\n    pip install --upgrade seesaw\n    su -c \"cd /home/archiveteam; git clone https://github.com/ArchiveTeam/ftp-grab.git\" archiveteam\n    su -c \"cd /home/archiveteam/ftp-grab/; wget https://launchpad.net/wpull/trunk/v1.2/+download/wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip; unzip wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip; chmod +x ./wpull\" archiveteam\n    screen su -c \"cd /home/archiveteam/ftp-grab/; run-pipeline pipeline.py --concurrent 2 --address '127.0.0.1' YOURNICKHERE\" archiveteam\n    [... ctrl+A D to detach ...]\n\n\n### For CentOS:\n\nEnsure that you have the CentOS equivalent of bzip2 installed as well. You might need the EPEL repository to be enabled.\n\n    yum -y install gnutls-devel python-pip zlib-devel unzip\n    pip install --upgrade seesaw\n    [... pretty much the same as above ...]\n\n### For openSUSE:\n\n    zypper install screen python-pip libgnutls-devel bzip2 python-devel gcc make unzip\n    pip install --upgrade seesaw\n    [... pretty much the same as above ...]\n\n### For OS X:\n\nYou need Homebrew. Ensure that you have the OS X equivalent of bzip2 installed as well.\n\n    brew install python gnutls unzip\n    pip install --upgrade seesaw\n    [... pretty much the same as above ...]\n\n**There is a known issue with some packaged versions of rsync. 
If you get errors during the upload stage, ftp-grab will not work with your rsync version.**\n\nThis supposedly fixes it:\n\n    alias rsync=/usr/local/bin/rsync\n\n### For Arch Linux:\n\nEnsure that you have the Arch equivalent of bzip2 installed as well.\n\n1. Make sure you have `python2-pip` installed.\n2. Run `pip2 install seesaw`.\n3. Modify the run-pipeline script in seesaw to point at `#!/usr/bin/python2` instead of `#!/usr/bin/python`.\n4. `useradd --system --group users --shell /bin/bash --create-home archiveteam`\n5. `su -c \"cd /home/archiveteam; git clone https://github.com/ArchiveTeam/ftp-grab.git\" archiveteam`\n6. `su -c \"cd /home/archiveteam/ftp-grab/; wget https://launchpad.net/wpull/trunk/v1.2/+download/wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip; unzip wpull-1.2-linux-x86_64-3.4.3-20150508185423.zip; chmod +x ./wpull\" archiveteam`\n7. `screen su -c \"cd /home/archiveteam/ftp-grab/; run-pipeline pipeline.py --concurrent 2 --address '127.0.0.1' YOURNICKHERE\" archiveteam`\n\n### For FreeBSD:\n\nNo FreeBSD-specific steps are known. If you find that some are needed, please let us know on IRC (irc.efnet.org #archiveteam).\n\nTroubleshooting\n=========================\n\nBroken? These are some of the possible solutions:\n\n### Wpull not successfully running\n\nIf you have trouble getting Wpull running, please see http://wpull.readthedocs.org/en/master/install.html.\n\n### Problem with gnutls or openssl during building\n\nPlease ensure that gnutls-dev(el) and openssl-dev(el) are installed.\n\n### ImportError: No module named seesaw\n\nIf you're sure that you followed the steps to install `seesaw`, permissions on your module directory may be set incorrectly. 
Try the following:\n\n    chmod o+rX -R /usr/local/lib/python2.7/dist-packages\n\n### run-pipeline: command not found\n\nInstall `seesaw` using `pip2` instead of `pip`.\n\n    pip2 install seesaw\n\n### Issues in the code\n\nIf you notice a bug and want to file a bug report, please use the GitHub issue tracker.\n\nAre you a developer? Help write code for us! Look at our [developer documentation](http://archiveteam.org/index.php?title=Dev) for details.\n\n### Other problems\n\nHave an issue not listed here? Join us on IRC and ask! We can be found at irc.efnet.org #effteepee.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchiveteam%2Fftp-grab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farchiveteam%2Fftp-grab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchiveteam%2Fftp-grab/lists"}