https://github.com/spencermountain/remote-work

script to crawl and download files from open-directories

remote-work


crawl and download files from an open-directory

`npm install remote-work`

**work in progress!**

Sometimes you'll open a webpage, and it will look like this:
![screenshot of an open directory listing](https://github.com/spencermountain/remote-work/assets/399657/0849ff32-d9f6-4776-a7d3-dd02ba6bc1c5)

This is called an **open directory**, or sometimes an **autoindexer**.

It's a server that's configured to show you all its files, which is nice. It used to be more common.

This is a tool to download all the files from a page like this, from the command line.

```bash
npx remote-work http://us.archive.ubuntu.com/ubuntu/pool/multiverse/y
```

(you'll need to have [Node.js installed](https://nodejs.dev/en/download/))
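
An open directory page is plain HTML with one link per file or folder, so the first job of any crawler is pulling those links out. Here is a minimal sketch of that step. It is not how remote-work itself parses pages: `extractLinks` is a hypothetical helper, the sample HTML is made up, and the regex assumes double-quoted `href` attributes like the ones Apache and nginx listings usually produce.

```js
// Hypothetical helper: split an autoindex page into file links and folder links.
// A regex-based sketch, not the parser that remote-work uses.
function extractLinks(html, baseUrl) {
  // make sure relative links resolve inside the directory, not beside it
  const base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/'
  const files = []
  const dirs = []
  const hrefPattern = /<a\s[^>]*href="([^"]+)"/gi
  for (const match of html.matchAll(hrefPattern)) {
    const href = match[1]
    // column-sorting links look like ?C=N;O=D
    if (href.startsWith('?')) continue
    const absolute = new URL(href, base).href
    // skip the parent-directory link and anything else outside this folder
    if (absolute === base || !absolute.startsWith(base)) continue
    if (href.endsWith('/')) dirs.push(absolute)
    else files.push(absolute)
  }
  return { files, dirs }
}

// made-up listing, for illustration only
const sample = `
<a href="?C=N;O=D">Name</a>
<a href="/pool/">Parent Directory</a>
<a href="yagf/">yagf/</a>
<a href="notes.txt">notes.txt</a>
`
console.log(extractLinks(sample, 'http://example.com/pool/y'))
```

Folder links end in a slash, which is what lets a crawler decide whether to download a link or descend into it.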

### Features

- **async** - downloads files 3 at a time, by default
- **configurable** - download only the files you'd like, using a _[glob](https://www.digitalocean.com/community/tools/glob)_
- **stoppable** - gets files _[depth-first](https://www.codecademy.com/article/tree-traversal)_
- **resumable** - doesn't re-download files that you already have
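
To give a feel for what "3 at a time" means, here is a small sketch of a concurrency limit. It is an illustration only, not remote-work's internals: `runLimited` is a hypothetical helper, and the fake downloads are timers standing in for real requests.

```js
// Hypothetical helper: run async tasks with at most `limit` in flight at once.
async function runLimited(tasks, limit = 3) {
  const results = new Array(tasks.length)
  let next = 0
  // each worker keeps claiming the next unstarted task until none are left
  async function worker() {
    while (next < tasks.length) {
      const index = next++
      results[index] = await tasks[index]()
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker())
  await Promise.all(workers)
  return results
}

// fake downloads that record how many are running at the same time
let active = 0
let peak = 0
const fakeDownloads = ['a.iso', 'b.iso', 'c.iso', 'd.iso', 'e.iso'].map((name) => async () => {
  active++
  peak = Math.max(peak, active)
  await new Promise((resolve) => setTimeout(resolve, 10))
  active--
  return name
})

const finished = runLimited(fakeDownloads, 3).then((names) => {
  console.log(names, 'peak concurrency:', peak)
  return names
})
```

A small limit keeps the load on the remote server low, at the cost of a slower crawl.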

### Node API

You can also use this library in a script. Install it first with `npm install remote-work`:

```js
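import remoteWork from 'remote-work'

// the open directory to crawl, and the local folder to save into
const url = 'http://us.archive.ubuntu.com/ubuntu/pool/multiverse/y'
const dir = './output'
const opts = {
  n: 1, // only download one file at a time
  match: '*.mp3' // only download mp3 files
}
// top-level await needs an ES module (a .mjs file, or "type": "module" in package.json)
await remoteWork(url, dir, opts)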
```

Please be considerate when downloading files from a remote server.
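
For a rough idea of how a glob like `'*.mp3'` can filter files, here is a sketch that turns a simple glob into a regular expression and tests it against the file name in a URL. It is an assumption-heavy illustration: `globToRegExp` and `matchesGlob` are hypothetical helpers, only `*` and `?` are supported, and remote-work may well apply `match` differently (for example, against the whole path).

```js
// Hypothetical helper: convert a simple glob into a RegExp.
// Supports only `*` (any run of non-slash characters) and `?` (one character).
function globToRegExp(glob) {
  // escape regex metacharacters, leaving * and ? for the next step
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, '\\$&')
  const pattern = escaped.replace(/\*/g, '[^/]*').replace(/\?/g, '[^/]')
  return new RegExp(`^${pattern}$`)
}

// Hypothetical helper: test a glob against the last path segment of a URL.
function matchesGlob(url, glob) {
  const fileName = decodeURIComponent(new URL(url).pathname.split('/').pop())
  return globToRegExp(glob).test(fileName)
}

// made-up URLs, for illustration only
const urls = [
  'http://example.com/music/song.mp3',
  'http://example.com/music/cover.jpg',
  'http://example.com/music/track%2001.mp3'
]
const wanted = urls.filter((u) => matchesGlob(u, '*.mp3'))
console.log(wanted)
```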

---

### See also

- [wget-wizard](https://www.whatismybrowser.com/developers/tools/wget-wizard/) - do it all w/ a CLI script
- [reddit.com/r/opendirectories](http://reddit.com/r/opendirectories)
- [directory_downloader](https://github.com/SuperVegetoo/directory_downloader) - python directory parser/crawler
- [autoindex](https://github.com/weisjohn/autoindex) - javascript open directory parser by John Weis

MIT