https://github.com/bentsherman/nf-boost
Experimental features for Nextflow
https://github.com/bentsherman/nf-boost
nextflow
Last synced: about 1 year ago
JSON representation
Experimental features for Nextflow
- Host: GitHub
- URL: https://github.com/bentsherman/nf-boost
- Owner: bentsherman
- License: apache-2.0
- Created: 2024-03-23T00:28:50.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-19T17:08:36.000Z (about 1 year ago)
- Last Synced: 2025-03-27T23:51:09.527Z (about 1 year ago)
- Topics: nextflow
- Language: Groovy
- Homepage:
- Size: 198 KB
- Stars: 40
- Watchers: 8
- Forks: 0
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: COPYING
Awesome Lists containing this project
README
# nf-boost
while we wait for four five two
a very special plugin, just for you
Nextflow plugin for experimental features that want to become core features!
Currently includes the following features:
- automatic deletion of temporary files (`boost.cleanup`)
- `fromJson` and `toJson` functions to read and write JSON
- `fromYaml` and `toYaml` functions to read and write YAML
- `mergeCsv` function for saving records to a CSV file
- `mergeText` function for saving items to a text file (similar to `collectFile` operator)
- `request` function for making HTTP requests
- `template` function for rendering templates
- `scan` operator for, well, scan operations
- `then` operator for defining custom operators in your pipeline
## Getting Started
To use `nf-boost`, include it in your Nextflow config and add any desired settings:
```groovy
plugins {
id 'nf-boost'
}
boost {
cleanup = true
}
```
The plugin requires Nextflow version `23.10.0` or later.
*New in version `0.4.0`: requires Nextflow `24.04.0` or later.*
*New in version `0.5.0`: requires Nextflow `24.10.0` or later.*
If a release hasn't been published to the main registry yet, you can still use it by specifying the following environment variable so that Nextflow can find the plugin:
```bash
export NXF_PLUGINS_TEST_REPOSITORY="https://github.com/bentsherman/nf-boost/releases/download/0.3.2/nf-boost-0.3.2-meta.json"
```
## Examples
Check out the `examples` directory for example pipelines that demonstrate how to use the features in this plugin.
## Reference
### Configuration
**`boost.cleanup`**
Set to `true` to enable automatic cleanup (default: `false`). Temporary files will be automatically deleted as soon as they are no longer needed.
The default cleanup observer uses `publishDir` directives to determine whether a file should be published before it is deleted. Setting `boost.cleanup = 'v2'` will use an alternate cleanup observer which uses the new workflow publish definition instead of `publishDir` to track publishing.
Limitations:
- Resume is not supported with automatic cleanup at this time. Deleted tasks will be re-executed on a resumed run. Resume will be supported when this feature is finalized in Nextflow.
- Helper files and log files created by Nextflow (e.g. `.command.run`, `.command.log`) are not deleted. Consider using a cleanup policy on the underlying filesystem or object storage to delete these files automatically over time.
- Input files that are staged into the work directory (e.g. from an HTTP/FTP server or S3 bucket) are not deleted.
- Files created by operators (e.g. `collectFile`, `splitFastq`) cannot be tracked and so are not deleted. For optimal performance, consider refactoring such operators into processes:
- Splitter operators such as `splitFastq` can also be used as functions in a native process:
```groovy
process SPLIT_FASTQ {
input:
val(fastq)
output:
path(chunks)
exec:
chunks = splitFastq(fastq, file: true)
}
```
- The `collectFile` operator can be replaced with `mergeText` (in this plugin) in a native process. See the `examples` directory for example usage.
**`boost.cleanupInterval`**
Specify how often to scan for cleanup (default: `'60s'`).
### Functions
**`fromJson( source: Path | String )`**
Load a value from JSON.
**`toJson( value, pretty: boolean = false ) -> String`**
Convert a value to JSON.
**`fromYaml( source: Path | String )`**
Load a value from YAML.
**`toYaml( value ) -> String`**
Convert a value to YAML.
**`mergeCsv( records: List, path: Path, [opts] )`**
Save a list of records (i.e. tuples or maps) to a CSV file.
Available options:
- `header: boolean | List`
When `true`, the keys of the first record are used as the column names (default: `false`). Can also be a list of column names.
- `sep: String`
The character used to separate values (default: `','`).
**`mergeText( items: List | List, path: Path, [opts] )`**
Save a list of items (i.e. files or strings) to a text file.
Available options:
- `keepHeader: boolean`
Prepend the resulting file with the header of the first file (default: `false`). The number of header lines can be specified using the `skip` option, to determine how many lines to remove from each file.
- `newLine: boolean`
Append a newline character after each entry (default: `false`).
- `skip: int`
The number of lines to skip at the beginning of each entry (default: `1` when `keepHeader` is true, `0` otherwise).
**`request( url: String, [opts] )`**
Make an HTTP request. Returns the underlying [HttpURLConnection](https://docs.oracle.com/javase/8/docs/api/java/net/HttpURLConnection.html).
Available options:
- `body: String`
The request body.
- `headers: Map`
Map of request headers.
- `method: String`
The request method, can be `'get'`, `'post'`, `'head'`, `'options'`, `'put'`, `'delete'`, or `'trace'` (default: `'get'`).
**`template( source: Path | String, binding: Map ) -> String`**
Render a template with the given binding.
### Operators
**`exec( name, body )`**
The `exec` operator was removed in version 0.5.0. Use an `exec` process instead.
**`scan( [seed], accumulator )`**
The `scan` operator is similar to `reduce` -- it applies an accumulator function sequentially to each value in a channel -- however, whereas `reduce` only emits the final result, `scan` emits each partially accumulated value.
**`then( onNext, [opts] )`**
**`then( [others...], opts )`**
The `then` operator is a generic operator that can be used to implement nearly any operator you can imagine.
It accepts any of three event handlers: `onNext`, `onComplete`, and `onError` (similar to `subscribe`). Each event handler has access to the following methods:
- `emit( value )`: emit a value to the output channel
- `done()`: signal that no more values will be emitted
When there is only one source channel, the `done()` method will be called automatically when the source channel sends the `onComplete` event. You can still call it manually, e.g. to finalize the output earlier. When there are multiple source channels, you are responsible for calling `done()` at the appropriate time -- if you don't call it, your operator will wait forever.
When there are multiple source channels, `onNext` and `onComplete` events are *synchronized*. This way, you don't need to worry about making your event handlers thread-safe, because they will be invoked on one event at a time.
Available options:
- `onNext( value, [i] )`: Closure that is invoked when a value is emitted by a source channel. Equivalent to providing a closure as the first argument. When there are multiple source channels, the closure is invoked with a second argument corresponding to the index of the source channel.
- `onComplete( [i] )`: Closure that is invoked after the last value is emitted by a source channel. When there are multiple source channels, the closure is invoked with the index of the source channel.
- `onError( error )`: Closure that is invoked when an exception is raised while handling an `onNext` event. It is invoked the exception that caused the error. No further calls will be made to `onNext` or `onComplete` after this event. By default, the error is logged and the workflow is terminated.
- `singleton`: Whether the output channel should be a value (i.e. *singleton*) channel. By default, it is determined by the source channel, i.e. if the source is a value channel then the output will also be a value channel and vice versa.
## Development
Build and install the plugin to your local environment:
```bash
make install
```
Run with Nextflow as usual:
```bash
nextflow run hello -plugins nf-boost@
```