https://github.com/topfreegames/eventsgateway
Receive events and send them encoded as avro to kafka
- Host: GitHub
- URL: https://github.com/topfreegames/eventsgateway
- Owner: topfreegames
- License: mit
- Created: 2018-02-01T12:52:21.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-11-28T17:57:23.000Z (11 months ago)
- Last Synced: 2025-04-12T17:23:04.406Z (6 months ago)
- Topics: analytics, avro, data, events, golang, kafka
- Language: Go
- Size: 438 KB
- Stars: 4
- Watchers: 12
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
TFGCo EventsGateway
===================

## [Development](#development-readme)
## [Client](#client-readme)
# Client
## Important Notice
The Events Gateway service follows at-least-once semantics, which means it does not prevent duplicates.
If the client loses its connection (for example, because of network instability) after sending an event to the Events Gateway server, the server will complete the
request successfully and send the data to Kafka, but will not be able to acknowledge the client. If the client has retry logic (built in for async mode), it will
resend the same event and duplicate it.

If you really need a guarantee of distinct values, consider handling it in the downstream pipelines.
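
As an illustration of handling duplicates downstream, here is a minimal sketch in Go. It assumes each event carries a unique ID assigned by the producer before the first send attempt; the `Event` type, its `ID` field, and the `Dedup` helper are hypothetical and not part of the Events Gateway API.

```go
package main

import "fmt"

// Event is a hypothetical downstream record. ID is an assumed unique key
// that the producer sets once, so retries carry the same value.
type Event struct {
	ID   string
	Name string
}

// Dedup returns the events with repeated IDs dropped,
// keeping the first occurrence of each ID in its original order.
func Dedup(events []Event) []Event {
	seen := make(map[string]struct{}, len(events))
	out := make([]Event, 0, len(events))
	for _, e := range events {
		if _, ok := seen[e.ID]; ok {
			continue // already delivered once, skip the retry copy
		}
		seen[e.ID] = struct{}{}
		out = append(out, e)
	}
	return out
}

func main() {
	in := []Event{
		{ID: "a1", Name: "login"},
		{ID: "b2", Name: "purchase"},
		{ID: "a1", Name: "login"}, // duplicate caused by a client retry
	}
	for _, e := range Dedup(in) {
		fmt.Println(e.ID, e.Name)
	}
}
```

An in-memory set like this only works for a bounded batch; a long-running pipeline would need a windowed or persistent store for the seen IDs.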
## Configuration
About client.NewClient(...) arguments:
* configPrefix: whatever comes just before `client` portion of your config
* config: a viper config with at least a `client` key holding Events Gateway settings
* logger: a logger.Logger instance
* client: should be nil for most cases, except for unit testing
* opts: extra []grpc.DialOption objects

`client` config format, with defaults:
```yaml
client:
  async: false # if you want to use the async or sync dispatch
  channelBuffer: 500 # (async-only) size of the channel that holds events
  lingerInterval: 500ms # (async-only) how long to wait before sending messages, in the hopes of filling the batch
  batchSize: 10 # (async-only) maximum number of messages to send in a batch
  maxRetries: 3 # (async-only) how many times to retry a dispatch if it fails
  retryInterval: 1s # (async-only) first wait time before a retry, formula => 2^retryNumber * retryInterval
  numRoutines: 2 # (async-only) number of go routines that read from events channel and send batches
  kafkatopic: default-topic # default topic to send messages
  grpc:
    serverAddress: localhost:5000
    timeout: 500ms
```

Code example:
```go
package main

import (
	"context"

	"github.com/spf13/viper"
	"github.com/topfreegames/eventsgateway/v4"
	"github.com/topfreegames/eventsgateway/v4/logger"
)

func ConfigureEventsGateway() (*eventsgateway.Client, error) {
	config := viper.New() // empty Viper config
	config.Set("eventsgateway.client.async", true)
	config.Set("eventsgateway.client.kafkatopic", "my-client-default-topic")
	log := &logger.NullLogger{} // initialize your logger.Logger implementation here
	client, err := eventsgateway.NewClient("eventsgateway", config, log, nil)
	if err != nil {
		return nil, err
	}
	return client, nil
}

func main() {
	client, err := ConfigureEventsGateway()
	if err != nil {
		panic(err)
	}
	// Pass along the context.Context you received here;
	// DON'T pass a bare context.Background() if you already have a context.Context.

	// Sync clients should handle errors accordingly.
	err = client.Send(context.Background(), "event-name", map[string]string{"some": "value"})
	if err != nil {
		panic(err)
	}

	// Async client error handling is transparent to the user.
	client.Send(context.Background(), "event-name", map[string]string{"some": "value"})
}
```
# Development
## Running locally
All dependencies required to produce and consume events locally are bundled in this project.
1. `make build-dev` builds a development docker image for the client.
2. `make deps-start` starts docker containers for `zookeeper`, `kafka`, and `eventsgateway-api`.
These are the necessary dependencies for the EventsGateway server.
3. `make run` starts the EventsGateway server.
4. `make producer` executes a client that sends one dummy event.
5. `make spark-notebook` runs a jupyter-notebook container with a mounted notebook to consume from Kafka and write ORC files to S3.
Check the container logs for the localhost address of the Web UI.
6. `make hive-start` starts the hive stack containers necessary to create tables in hive-metastore and to query from a presto client.
### Bootstrapping Localstack bucket and prefixes
After `make deps-start` and before `make gobblin` you'll need to bootstrap localstack's s3 to transfer data from kafka.
`aws --endpoint-url=http://localhost:4572 s3 mb s3://eventsgateway-local`
`aws --endpoint-url=http://localhost:4572 s3api put-bucket-acl --bucket eventsgateway-local --acl public-read-write`
`aws --endpoint-url=http://localhost:4572 s3api put-object --bucket eventsgateway-local --key output/sv-uploads-default-topic/daily/`
Note that `default-topic` should be replaced by the topic you're using in your client; `default-topic` is the one used by `make testclient`.
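
To make the topic substitution explicit, the following sketch prints the three commands above for an arbitrary topic. The `bootstrapCommands` helper is hypothetical. The endpoint, bucket name, and key layout are copied from the commands in this section, and the topic passed in `main` is only an example.

```go
package main

import "fmt"

// bootstrapCommands returns the three aws CLI invocations from this section,
// with the given topic substituted into the object key.
func bootstrapCommands(topic string) []string {
	const endpoint = "http://localhost:4572"
	const bucket = "eventsgateway-local"
	return []string{
		fmt.Sprintf("aws --endpoint-url=%s s3 mb s3://%s", endpoint, bucket),
		fmt.Sprintf("aws --endpoint-url=%s s3api put-bucket-acl --bucket %s --acl public-read-write", endpoint, bucket),
		fmt.Sprintf("aws --endpoint-url=%s s3api put-object --bucket %s --key output/sv-uploads-%s/daily/", endpoint, bucket, topic),
	}
}

func main() {
	// Example topic; use the one configured as kafkatopic in your client.
	for _, c := range bootstrapCommands("my-client-default-topic") {
		fmt.Println(c)
	}
}
```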
### Creating topic table in Hive metastore
Run the commands inside a beeline session, opened with `docker exec -it hive_hive-server_1 sh -c "/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000"`.
To get the commands necessary to create the database and table, run the respective cell at the end of the `eventsgateway-streaming-orc` notebook.
After creation, you need to run `msck repair table table_name;` from the hive server container to be able to query recent data.
### Querying with Presto
Install the presto client; on macOS, run `brew install presto`.
`presto --catalog hive --schema default`
presto:default >> `show tables;`
presto:default >> `select * from defaulttopic;`