https://github.com/yokawasa/fluent-plugin-azuresearch
Azure Search Output Plugin for Fluentd
- Host: GitHub
- URL: https://github.com/yokawasa/fluent-plugin-azuresearch
- Owner: yokawasa
- Created: 2016-01-31T03:51:13.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2020-03-18T18:28:11.000Z (about 5 years ago)
- Last Synced: 2025-03-21T23:33:44.809Z (2 months ago)
- Topics: azure, azure-search, fluent-plugin, fluentd, ruby
- Language: Ruby
- Homepage: https://rubygems.org/gems/fluent-plugin-azuresearch
- Size: 57.6 KB
- Stars: 2
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: ChangeLog.md
# Azure Search output plugin for Fluentd
fluent-plugin-azuresearch is a Fluentd output plugin that writes events to Azure Search.
## Requirements
| fluent-plugin-azuresearch | fluentd | ruby |
|------------------------|---------|------|
| >= 0.2.0 | >= v0.14.15 | >= 2.1 |
| < 0.2.0 | >= v0.12.0 | >= 1.9 |

## Installation

    $ gem install fluent-plugin-azuresearch

## Configuration
### Azure Search
To use Microsoft Azure Search, you must create an Azure Search service in the Azure Portal. You also need an index, the persisted store of documents to which fluent-plugin-azuresearch writes the event stream. See the following instructions:
* [Create a service](https://azure.microsoft.com/en-us/documentation/articles/search-create-service-portal/)
* [Create an index](https://azure.microsoft.com/en-us/documentation/articles/search-what-is-an-index/)

### Fluentd - fluent.conf

    <match azuresearch.*>
        @type azuresearch
        @log_level info
        endpoint https://AZURE_SEARCH_ACCOUNT.search.windows.net
        api_key AZURE_SEARCH_API_KEY
        search_index messages
        column_names id,user_name,message,tag,created_at
        key_names postid,user,content,tag,posttime
    </match>

* **endpoint (required)** - Azure Search service endpoint URI
* **api\_key (required)** - Azure Search API key
* **search\_index (required)** - Name of the Azure Search index to insert records into
* **column\_names (required)** - Column names in the target Azure Search index, separated by commas.
* **key\_names (optional)** - Default: nil. Key names in the incoming record to insert, separated by commas. `${time}` is a placeholder for `Time.at(time).strftime("%Y-%m-%dT%H:%M:%SZ")`, and `${tag}` is a placeholder for the event tag. By default, **key\_names** is the same as **column\_names**.

[note] `@log_level` is a Fluentd built-in parameter (optional) that controls logging verbosity: fatal|error|warn|info|debug|trace (see also [Logging of Fluentd](http://docs.fluentd.org/articles/logging#log-level)).

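The relationship between **column\_names** and **key\_names** described above can be sketched in a few lines of Ruby. This is only an illustration of the documented behavior, not the plugin's actual code: `build_document`, its arguments, and the length check are hypothetical.

```ruby
# Sketch only: build_document and its arguments are illustrative names,
# not the plugin's internals.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def build_document(tag, time, record, column_names, key_names = nil)
  columns = column_names.split(",")
  # key_names falls back to column_names when it is not set.
  keys = key_names.nil? ? columns : key_names.split(",")
  # Assumption: both lists are expected to have the same number of entries.
  raise ArgumentError, "column_names and key_names differ in length" unless columns.size == keys.size

  # Pair each index column with the record key (or placeholder) that feeds it.
  columns.zip(keys).each_with_object({}) do |(column, key), doc|
    doc[column] =
      case key
      when '${time}' then Time.at(time).strftime(TIME_FORMAT) # event time placeholder
      when '${tag}'  then tag                                 # event tag placeholder
      else record[key]                                        # plain record key
      end
  end
end

record = { "postid" => "1", "user" => "taylorswift13", "content" => "post by taylorswift13" }
doc = build_document("azuresearch.msg", 1454274221, record,
                     'id,user_name,message,tag,created_at',
                     'postid,user,content,${tag},${time}')
puts doc
```

Each entry in **column\_names** is paired by position with the entry at the same position in **key\_names**, which is why the two lists need to line up.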
## Sample Configurations
### Case1 - column_names is the same as key_names

Suppose you have the following fluent.conf and Azure Search index schema:

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,created_at
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "id": "1", "user_name": "taylorswift13", "message":"post by taylorswift13", "created_at":"2016-01-29T00:00:00Z" },
    { "id": "2", "user_name": "katyperry", "message":"post by katyperry", "created_at":"2016-01-30T00:00:00Z" },
    { "id": "3", "user_name": "ladygaga", "message":"post by ladygaga", "created_at":"2016-01-31T00:00:00Z" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
    ]

### Case2 - column_names is NOT the same as key_names

Suppose you have the following fluent.conf and Azure Search index schema:

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,created_at
        key_names postid,user,content,posttime
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "postid": "1", "user": "taylorswift13", "content":"post by taylorswift13", "posttime":"2016-01-29T00:00:00Z" },
    { "postid": "2", "user": "katyperry", "content":"post by katyperry", "posttime":"2016-01-30T00:00:00Z" },
    { "postid": "3", "user": "ladygaga", "content":"post by ladygaga", "posttime":"2016-01-31T00:00:00Z" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
    ]

### Case3 - column_names is NOT the same as key_names, and key_names includes ${time} and ${tag}

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,tag,created_at
        key_names postid,user,content,${tag},${time}
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"tag", "type":"Edm.String" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "postid": "1", "user": "taylorswift13", "content":"post by taylorswift13" },
    { "postid": "2", "user": "katyperry", "content":"post by katyperry" },
    { "postid": "3", "user": "ladygaga", "content":"post by ladygaga" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" }
    ]

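
The `created_at` values in these results come from the `${time}` placeholder. The short Ruby sketch below shows how a timestamp of that shape arises from the documented format string. The epoch value is an illustrative choice that matches the sample, and the `.utc` variant is an assumption added here for comparison, not something taken from the plugin.

```ruby
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Illustrative event time in epoch seconds (chosen to match the sample results).
epoch = 1454274221

# Expression as documented for ${time}: Time.at returns a local-time object,
# so the digits depend on the time zone of the process.
local_stamp = Time.at(epoch).strftime(TIME_FORMAT)

# Variation (assumption): convert to UTC first so the literal "Z" suffix
# is accurate regardless of the host's time zone.
utc_stamp = Time.at(epoch).utc.strftime(TIME_FORMAT)

puts local_stamp
puts utc_stamp # => 2016-01-31T21:03:41Z
```

If the expression is applied exactly as documented, the two values agree only when the Fluentd process runs with a UTC time zone, which may be worth checking before relying on `created_at` for range filters.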
[note] The value of created_at above is the time when Fluentd actually receives the corresponding input event.

## Tests
### Running test code
    $ git clone https://github.com/yokawasa/fluent-plugin-azuresearch.git
    $ cd fluent-plugin-azuresearch

    # edit CONFIG params of test/plugin/test_azuresearch.rb
    $ vi test/plugin/test_azuresearch.rb

    # run test
    $ rake test

### Creating package, running and testing locally

    $ rake build
    $ rake install:local

    # running fluentd with your fluent.conf
    $ fluentd -c fluent.conf -vv &

    # send test input event to test plugin using fluent-cat
    $ echo ' { "postid": "100", "user": "ladygaga", "content":"post by ladygaga"}' | fluent-cat azuresearch.msg

Please don't forget that you need a forward input configuration to receive the message from fluent-cat:

    <source>
        @type forward
    </source>

## TODOs
* Input validation for Azure Search - check total size of columns to add

## Change log
* [Changelog](ChangeLog.md)

## Links
* http://yokawasa.github.io/fluent-plugin-azuresearch
* https://rubygems.org/gems/fluent-plugin-azuresearch

## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/yokawasa/fluent-plugin-azuresearch.
## Copyright
* Copyright: Copyright (c) 2016- Yoichi Kawasaki
* License: Apache License, Version 2.0