https://github.com/atlassubbed/atlas-concurrent-queue

Host: GitHub
URL: https://github.com/atlassubbed/atlas-concurrent-queue
Owner: atlassubbed
License: other
Created: 2018-07-01T04:07:36.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2018-07-01T23:32:36.000Z (about 7 years ago)
Last Synced: 2025-03-16T21:05:18.362Z (4 months ago)
Language: JavaScript
Size: 15.6 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md

# atlas-concurrent-queue

Async job queue that limits the number of concurrent jobs.

[![Travis](https://img.shields.io/travis/atlassubbed/atlas-concurrent-queue.svg)](https://travis-ci.org/atlassubbed/atlas-concurrent-queue)

---

## install

```
npm install --save atlas-concurrent-queue
```

## why

I was writing a totally legal file downloader and needed to run the downloads in parallel, but not *all* at once; otherwise we'd run into performance and spamming problems.

## examples

#### single queue

Let's assume we have some file downloading API and we're trying to upload the downloaded files to our personal server. The queue's API is dead simple -- you instantiate a queue and then push jobs onto it:

```javascript
const ConcurrentQueue = require("atlas-concurrent-queue");
const downloadFile = require("./my-file-downloader");
const uploadFile = require("./my-file-uploader");
const urls = require("./url-list");
const destinationUrl = require("./dest-url");

const concurrency = 10;
const queue = new ConcurrentQueue(concurrency);

// urls.length === 2000
for (let i = urls.length; i--;){
  queue.push(done => {
    downloadFile(urls[i], contents => {
      // free the queue slot as soon as the download finishes
      done();
      uploadFile(`${destinationUrl}?index=${i}`, contents, () => {
        // no-op, don't care about result of write
      });
    });
  });
}
```

In the example above, we have 2000 download jobs, but no more than 10 are running at any given time. This helps keep us under the radar and prevents us from overloading our system. You might notice that we called `done()` *before* we started uploading the files to our server. This means the uploads aren't limited in concurrency at all; if our personal server is slower than the download source, we could easily have more than 10 uploads in flight at once. This could be fixed by calling `done` in the `uploadFile` callback, but then every slot stays occupied for the download *and* the upload, so whichever server is slower ends up throttling the other.
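Here's a rough sketch of that trade-off, with timers standing in for the network. Everything in it is an assumption made for illustration: `StandInQueue` is a made-up minimal queue that only mimics the `push(done => ...)` shape used above (it is not this library's source), and `downloadFile`, `uploadFile`, the timings and the `example.invalid` URL are all fake.

```javascript
// Made-up stand-in with the same push(done => ...) shape as the examples above.
// It is NOT the library's actual implementation.
class StandInQueue {
  constructor(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
  }
  push(job) {
    this.waiting.push(job);
    this.drain();
  }
  drain() {
    // start waiting jobs until the limit is reached
    while (this.running < this.limit && this.waiting.length) {
      this.running++;
      this.waiting.shift()(() => {
        this.running--;
        this.drain();
      });
    }
  }
}

// Fake I/O (assumption): downloads take ~5ms, uploads take ~100ms.
const downloadFile = (url, cb) => setTimeout(() => cb(`contents of ${url}`), 5);

// Pushes one job per url and resolves with the peak number of uploads in flight.
function run(urls, concurrency, releaseEarly) {
  return new Promise(resolve => {
    const queue = new StandInQueue(concurrency);
    let active = 0, peak = 0, remaining = urls.length;
    const uploadFile = (dest, contents, cb) => {
      peak = Math.max(peak, ++active);
      setTimeout(() => { active--; cb(); }, 100);
    };
    urls.forEach((url, i) => {
      queue.push(done => {
        downloadFile(url, contents => {
          if (releaseEarly) done(); // free the slot before uploading
          uploadFile(`https://example.invalid/upload?index=${i}`, contents, () => {
            if (!releaseEarly) done(); // hold the slot until the upload ends
            if (--remaining === 0) resolve(peak);
          });
        });
      });
    });
  });
}

const urls = ["a", "b", "c", "d", "e", "f"];
const peaks = run(urls, 2, true).then(early =>
  run(urls, 2, false).then(late => ({ early, late }))
);
peaks.then(({ early, late }) => {
  console.log(`done() before upload: peak uploads = ${early}`);
  console.log(`done() after upload:  peak uploads = ${late}`);
});
```

With `done()` after the upload, the peak can never exceed the queue's limit, because every upload holds a slot. With `done()` first, the slow uploads pile up past the limit.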

#### multiple queues

The example above can run into problems because we aren't pacing the upload jobs, so let's fix it by adding a second queue:

```javascript
// ... (same requires as in the single-queue example)

const downloadConcurrency = 10;
const uploadConcurrency = 5;
const downloadQueue = new ConcurrentQueue(downloadConcurrency);
const uploadQueue = new ConcurrentQueue(uploadConcurrency);

// urls.length === 2000
for (let i = urls.length; i--;){
  downloadQueue.push(downloadDone => {
    downloadFile(urls[i], contents => {
      // free the download slot, then let the upload wait for its own slot
      downloadDone();
      uploadQueue.push(uploadDone => {
        uploadFile(`${destinationUrl}?index=${i}`, contents, uploadDone);
      });
    });
  });
}
```

Now, we won't be running more than 5 upload jobs at any given time, in addition to limiting the concurrency of the download jobs.

## todo

#### dynamic concurrency

It might be interesting to implement a dynamic concurrency that can react to changes in bandwidth. For example, we might want to run only `N` downloads at a given time based on network factors.
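One possible shape for this, purely as a sketch: a queue whose limit can be changed while jobs are running. Neither `AdjustableQueue` nor `setLimit` exists in this library; both names are hypothetical, and the part that decides *when* to change the limit (measuring bandwidth) is left out entirely.

```javascript
// Hypothetical sketch: neither this class nor setLimit is part of the library.
class AdjustableQueue {
  constructor(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
  }
  push(job) {
    this.waiting.push(job);
    this.drain();
  }
  setLimit(limit) {
    this.limit = limit;
    this.drain(); // a higher limit may let waiting jobs start right away
  }
  drain() {
    while (this.running < this.limit && this.waiting.length) {
      this.running++;
      this.waiting.shift()(() => {
        this.running--;
        this.drain();
      });
    }
  }
}

const queue = new AdjustableQueue(2);
const finishers = []; // done callbacks of started jobs, called by hand below
for (let i = 0; i < 6; i++) queue.push(done => finishers.push(done));
const runningAtLimit2 = queue.running;

queue.setLimit(4); // two more waiting jobs start immediately
const runningAtLimit4 = queue.running;

queue.setLimit(1);   // running jobs are not cancelled...
finishers.shift()(); // ...but a job that finishes is not replaced
const runningAfterLowering = queue.running;

console.log(runningAtLimit2, runningAtLimit4, runningAfterLowering);
```

In this sketch, lowering the limit never interrupts work in progress; the queue just stops starting new jobs until enough running ones have finished.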

#### capturing errors and data

Should this be implemented? See caveats below.

## caveats

#### capturing errors and data

There's no way to capture errors or results through the `done` callback. I wanted this queue to do as little work as possible. If you need to capture errors or results, do it at the scope you're writing your jobs in.
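A minimal sketch of what "at the scope you're writing your jobs in" could look like: the jobs write into plain variables that live next to the loop, and `done` is only used to signal completion. `StandInQueue` is a made-up stand-in that mimics the `push(done => ...)` shape (not this library's source), and `fakeFetch` is a hypothetical Node-style function that fails for any url containing `"bad"`.

```javascript
// Made-up stand-in with the same push(done => ...) shape; NOT the library's code.
class StandInQueue {
  constructor(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
  }
  push(job) {
    this.waiting.push(job);
    this.drain();
  }
  drain() {
    while (this.running < this.limit && this.waiting.length) {
      this.running++;
      this.waiting.shift()(() => {
        this.running--;
        this.drain();
      });
    }
  }
}

// Hypothetical async function: errors for urls containing "bad".
const fakeFetch = (url, cb) =>
  setImmediate(() =>
    url.includes("bad")
      ? cb(new Error(`failed: ${url}`))
      : cb(null, `body of ${url}`)
  );

const urls = ["a", "bad-b", "c"];
const queue = new StandInQueue(2);

// Results and errors live out here, not in the queue.
const results = {};
const errors = [];

const finished = new Promise(resolve => {
  let remaining = urls.length;
  urls.forEach(url => {
    queue.push(done => {
      fakeFetch(url, (err, body) => {
        if (err) errors.push({ url, message: err.message });
        else results[url] = body;
        done(); // the queue only learns that the job is over
        if (--remaining === 0) resolve({ results, errors });
      });
    });
  });
});

finished.then(({ results, errors }) => {
  console.log("results:", results);
  console.log("errors:", errors);
});
```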

#### `done` callback

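Every job you push must be a function that accepts a `done` callback and calls it once its async work is finished, because that's how the queue knows when to spin up the next job in line. If `done` is never called, the job's slot is never handed to the next job.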
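If your async functions return promises instead of taking callbacks, a small adapter can do the wrapping. This is only a sketch: `toJob` is a hypothetical helper (not part of this library), `StandInQueue` is a made-up stand-in that mimics the `push(done => ...)` shape, and `sleep` fakes the async work.

```javascript
// Made-up stand-in with the same push(done => ...) shape; NOT the library's code.
class StandInQueue {
  constructor(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
  }
  push(job) {
    this.waiting.push(job);
    this.drain();
  }
  drain() {
    while (this.running < this.limit && this.waiting.length) {
      this.running++;
      this.waiting.shift()(() => {
        this.running--;
        this.drain();
      });
    }
  }
}

// Hypothetical helper: turns a promise-returning task into a done-accepting job.
// done() is called with no arguments whether the task resolves or rejects.
const toJob = (task, onError = () => {}) => done => {
  Promise.resolve()
    .then(task)
    .catch(onError)
    .finally(() => done());
};

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

const queue = new StandInQueue(2);
let active = 0;
let peak = 0;

const finished = new Promise(resolve => {
  const total = 5;
  let completed = 0;
  for (let i = 0; i < total; i++) {
    queue.push(toJob(async () => {
      active++;
      peak = Math.max(peak, active);
      await sleep(5); // pretend to download something
      active--;
      if (++completed === total) resolve({ peak, completed });
    }));
  }
});

finished.then(({ peak, completed }) => {
  console.log(`completed ${completed} tasks, at most ${peak} at a time`);
});
```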

#### streams

You might have noticed we aren't using streams in the examples above. This is for simplicity. With tasks like this, it's better to use streams to limit your memory usage.
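For what it's worth, a streaming job could look roughly like the sketch below, which uses Node's built-in `stream.pipeline` and calls `done` from its completion callback. The stream factories `openDownload` and `openUpload` are hypothetical stand-ins for real network streams, and `StandInQueue` is a made-up queue that only mimics the `push(done => ...)` shape (not this library's source).

```javascript
const { Readable, Writable, pipeline } = require("stream");

// Made-up stand-in with the same push(done => ...) shape; NOT the library's code.
class StandInQueue {
  constructor(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
  }
  push(job) {
    this.waiting.push(job);
    this.drain();
  }
  drain() {
    while (this.running < this.limit && this.waiting.length) {
      this.running++;
      this.waiting.shift()(() => {
        this.running--;
        this.drain();
      });
    }
  }
}

// Hypothetical stand-ins for real network streams.
const openDownload = url => Readable.from([`${url}:part1;`, `${url}:part2;`]);
const uploaded = {}; // bytes received per destination
const openUpload = dest => new Writable({
  write(chunk, encoding, callback) {
    uploaded[dest] = (uploaded[dest] || 0) + chunk.length;
    callback();
  }
});

const queue = new StandInQueue(2);
const urls = ["a", "b", "c"];

const finished = new Promise(resolve => {
  let remaining = urls.length;
  urls.forEach((url, i) => {
    queue.push(done => {
      // chunks flow straight from source to destination, one piece at a time
      pipeline(openDownload(url), openUpload(`dest-${i}`), err => {
        if (err) console.error(`transfer ${i} failed:`, err.message);
        done();
        if (--remaining === 0) resolve(uploaded);
      });
    });
  });
});

finished.then(bytes => console.log(bytes));
```

Note that a piped transfer occupies one slot for the whole download-plus-upload, so this variant paces both ends with a single queue.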
