Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/sonots/dummer

Generates dummy log data
https://github.com/sonots/dummer

Last synced: 7 days ago
JSON representation

Generates dummy log data

Awesome Lists containing this project

README

        

# Dummer

Dummer is a set of tools to generate dummy log data. I made this for Fluentd benchmark.

This gem includes three executable commands

1. dummer
2. dummer\_simple
3. dummer\_yes

## Installation

Add this line to your application's Gemfile:

gem 'dummer'

And then execute:

$ bundle

Or install it yourself as:

$ gem install dummer

Run as

$ dummer -c dummer.conf
$ dummer_simple [options]
$ dummer_yes [options]

## dummer

`dummer` allows you to

1. specify a rate of generating messages per second,
2. determine a log format, and
3. generate logs randomly

### Usage (1) - Write to a file

Create a configuration file. A sample configuration is as follows:

```ruby
# dummer.conf
configure 'sample' do
output "dummy.log"
rate 500
delimiter "\t"
labeled true
field :id, type: :integer, countup: true, format: "%04d"
field :time, type: :datetime, format: "[%Y-%m-%d %H:%M:%S]", random: false
field :level, type: :string, any: %w[DEBUG INFO WARN ERROR]
field :method, type: :string, any: %w[GET POST PUT]
field :uri, type: :string, any: %w[/api/v1/people /api/v1/textdata /api/v1/messages]
field :reqtime, type: :float, range: 0.1..5.0
field :foobar, type: :string, length: 8
end
```

Running

```
$ dummer -c dummer.conf
```

Outputs to the `dummy.log` (specified by `output` parameter) file like:

```
id:0422 time:[2013-11-19 02:34:58] level:INFO method:POST uri:/api/v1/textdata reqtime:3.9726677258569842 foobar:LFK6XV1N
id:0423 time:[2013-11-19 02:34:58] level:DEBUG method:GET uri:/api/v1/people reqtime:0.49912949125272277 foobar:DcOYrONH
id:0424 time:[2013-11-19 02:34:58] level:WARN method:POST uri:/api/v1/textdata reqtime:2.930590441869852 foobar:XEZ5bQsh
```

### Usage (2) - Post to Fluentd process

(experimental)

Create a configuration file. Assume that a fluentd process is running on localhost:24224.
A sample configuration is as follows:

```ruby
# dummer.conf
configure 'sample' do
host "localhost" # define `host` and `port` instead of `output`
port 24224
rate 500
tag type: :string, any: %w[raw.syslog raw.message raw.nginx] # configure tag
field :id, type: :integer, countup: true, format: "%04d"
field :level, type: :string, any: %w[DEBUG INFO WARN ERROR]
field :method, type: :string, any: %w[GET POST PUT]
field :uri, type: :string, any: %w[/api/v1/people /api/v1/textdata /api/v1/messages]
field :reqtime, type: :float, range: 0.1..5.0
field :foobar, type: :string, length: 8
end
```

Running

```
$ dummer -c dummer.conf
```

Data is posted to fluentd process like (below is the fluentd log generated by out_stdout)

```
2014-01-31 00:55:32 +0900 raw.message: {"id":"1377","level":"INFO","method":"POST","uri":"/api/v1/people","reqtime":1.678867810409548,"foobar":"paOIWxhQ"}
2014-01-31 00:55:32 +0900 raw.syslog: {"id":"1378","level":"INFO","method":"GET","uri":"/api/v1/people","reqtime":4.8412816521873445,"foobar":"kUvnC0MK"}
2014-01-31 00:55:32 +0900 raw.message: {"id":"1379","level":"WARN","method":"GET","uri":"/api/v1/people","reqtime":3.584494903998221,"foobar":"KD78mpjX"}
```

### CLI Options

You can specify some configuration parameters on CLI without writing them on a configuration file.

```
$ dummer help start
Usage:
dummer start

Options:
-c, [--config=CONFIG] # Config file
# Default: dummer.conf
-r, [--rate=N] # Number of generating messages per second
-o, [--output=OUTPUT] # Output file
-h, [--host=HOST] # Host of fluentd process
-p, [--port=N] # Port of fluentd process
-m, [--message=MESSAGE] # Output message
-d, [--daemonize] # Daemonize. Stop with `dummer stop`
-w, [--workers=N] # Number of parallels
[--worker-type=WORKER_TYPE]
# Default: process
-p, [--pid-path=PID_PATH]
# Default: dummer.pid
```

### Configuration Parameters

Following parameters in the configuration file are available:

* output

Specify a filename to output, or IO object (STDOUT, STDERR)

* host

Post a data to a fluentd process on the specified host. Either of `output` or `host` can be specified.

* port

Post a data to a fluentd process on the specified post. Default is 24224.

* rate

Specify how many messages to generate per second. Default: 500 msgs / sec

* workers

Specify number of processes for parallel processing.

* delimiter

Specify the delimiter between each field. Default: "\t" (Tab)

* labeled

Whether add field name as a label or not. Default: true

* label_delimiter

Specify the delimiter between the label and the value. Default: ":" (column)

* tag

Define tag field to generate. This is effective only for posting data to fluentd process with `host` and `port`.

* field

Random field generator mode. Define data fields to generate. `message` and `input` options are ignored. See also `Field Data Types` section below.

* message

Specific message generation mode. See [message.conf](./example/message.conf) as an example. This mode works pretty fast because it does not require to generate values randomly.

* input

Messages taken from an input file mode. Use this if you want to write messages by reading lines of an input file in rotation. `message` option is ignored. See [input.conf](./example/input.conf) as an example. This mode also works fast.

### Field Data Types

You can specify following data types to your `tag` and `field` parameters:

* :datetime

* :format

You can specify format of datetime as `%Y-%m-%d %H:%M:%S`. See [Time#strftime](http://www.ruby-doc.org/core-2.0.0/Time.html#method-i-strftime) for details.

* :random

Generate datetime randomly. Default: false (Time.now)

* :value

You can specify a fixed Time object.

* :string

* :any

You can specify an array of strings, then the generator picks one from them randomly

* :length

You can specify the length of string to generate randomly

* :value

You can specify a fixed string

* :integer

* :format

You can specify a format of string as `%03d`.

* :range

You can specify a range of integers, then the generator picks one in the range (uniform) randomly

* :countup

Generate countup data. Default: false

* :value

You can specify a fixed integer

* :float

* :format

You can specify a format of string as `%03.1f`.

* :range

You can specify a range of float numbers, then the generator picks one in the range (uniform) randomly

* :value

You can specify a fixed float number

## dummer\_simple

I created a simple version of `dummer` since `dummer` could not achieve the maximum system I/O throughputs because of its rich features.
This simple version, `dummer_simple` could achieve the system I/O limit in my environment.

Sorry, but this simple script cannot post data to fluentd process, supports only writing to a file.

### Usage

```
$ dummer_simple [options]
```

### Options

```
Usage:
dummer_simple

Options:
[--sync] # Set `IO#sync=true`
-s, [--second=N] # Duration of running in second
# Default: 1
-p, [--parallel=N] # Number of processes to run in parallel
# Default: 1
-o, [--output=OUTPUT] # Output file
# Default: dummy.log
-i, [--input=INPUT] # Input file (Output messages by reading lines of the file in rotation)
-m, [--message=MESSAGE] # Output message
# Default: time:2013-11-20 23:39:42 +0900 level:ERROR method:POST uri:/api/v1/people reqtime:3.1983877060667103
```

## dummer\_yes

I created a wrapped version of `yes` command, `dummer_yes`, to confrim that `dummer_simple` achieves the maximum system I/O throughputs.

I do not use `dummer_yes` command anymore because I verified that `dummer_simple` achieves the I/O limit, but I will keep this command so that users can do verification experiments with it.

### Usage

```
$ dummer_yes [options]
```

### Options

```
Usage:
dummer_yes

Options:
-s, [--second=N] # Duration of running in second
# Default: 1
-p, [--parallel=N] # Number of processes to run in parallel
# Default: 1
-o, [--output=OUTPUT] # Output file
# Default: dummy.log
-m, [--message=MESSAGE] # Output message
# Default: time:2013-11-20 23:39:42 +0900 level:ERROR method:POST uri:/api/v1/people reqtime:3.1983877060667103
```

## Relatives

There is a [fluent-plugin-dummydata-producer](https://github.com/tagomoris/fluent-plugin-dummydata-producer), but I wanted to output dummy data to a log file,
and I wanted a standalone separated tool to do benchmark.

## Related Articles

* [Fluentd のベンチマークテストに使える dummer (旧称 dummy_log_generator)](http://qiita.com/sonots/items/750da77a18e62852a02f)

## ToDO

1. write tests

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request

## Licenses

See [LICENSE.txt](LICENSE.txt)