{"id":21482815,"url":"https://github.com/ibmstreams/streamsx.protobuf","last_synced_at":"2025-12-31T14:14:23.156Z","repository":{"id":74839685,"uuid":"120911808","full_name":"IBMStreams/streamsx.protobuf","owner":"IBMStreams","description":"IBM Streams toolkit for parsing and creating Google Protocol Buffers","archived":false,"fork":false,"pushed_at":"2020-07-10T12:27:23.000Z","size":128,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":7,"default_branch":"develop","last_synced_at":"2025-03-17T09:22:36.378Z","etag":null,"topics":["ibmstreams","protobuf","protobuffer","streams"],"latest_commit_sha":null,"homepage":"https://ibmstreams.github.io/streamsx.protobuf/","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IBMStreams.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-09T13:55:39.000Z","updated_at":"2018-04-03T18:40:46.000Z","dependencies_parsed_at":null,"dependency_job_id":"d9b943b5-7c50-48a8-bf54-87b754347710","html_url":"https://github.com/IBMStreams/streamsx.protobuf","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/IBMStreams/streamsx.protobuf","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fstreamsx.protobuf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fstreamsx.protobuf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fstreamsx.protobuf/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fstreamsx.protobuf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IBMStreams","download_url":"https://codeload.github.com/IBMStreams/streamsx.protobuf/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fstreamsx.protobuf/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265866256,"owners_count":23840937,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ibmstreams","protobuf","protobuffer","streams"],"created_at":"2024-11-23T12:37:31.110Z","updated_at":"2025-12-31T14:14:23.113Z","avatar_url":"https://github.com/IBMStreams.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"# streamsx.protobuf\n\nThe `streamsx.protobuf` toolkit contains operators for interacting with serialized protocol\nbuffer messages. It contains two conversion operators and two simple source operators.\n\nCurrently, this toolkit only supports `proto2` syntax.\n\nExamples can be found in the `streamsx.protobuf.samples` directory.\n\n## Conversion operators\n\n1. `ProtobufParse` takes a tuple with a `blob` field and emits a tuple matching the `protoMessage`\n   parameter type it is given.\n\n2. 
`ProtobufBuild` takes a tuple as generated by the `spl-schema-from-protobuf` script (see below)\n   and emits a serialized version in the Protobuf serialization format as a blob.\n\n*Important!!* To compile these operators, your Makefile should include the following flag passed to `sc`:\n\n    APPDIR = $(shell basename `pwd`)\n\n    sc -M \u003cmain composite\u003e -t \u003cpath to streamsx.protobuf\u003e -w -Wl,-rpath=\"'\\$$\\$$ORIGIN/../toolkits/$(APPDIR)/impl/lib'\"\n\nNote that `-Wl` is W followed by lowercase L. If this flag is not included, the sab bundle does not properly set\nthe runtime library path to include the generated `libcustomproto.so` that is placed in the application directory.\n\n## Source operators\n\n1. `ProtobufTCPSource` creates a TCP server that will accept connections which can pass 1 or\n   more Protobuf messages, each prefixed with a 4-byte record length.\n\n2. `ProtobufFileSource` reads binary files that contain Protobuf messages, each prefixed with\n   a 4-byte record length.\n\n## Configuration\n\nThe `streamsx.protobuf` toolkit requires that the Protobuf libraries be installed on the compiling\nmachine.\n\nThe easiest way to install them is from the CentOS base yum repository:\n```\nyum install protobuf.x86_64\nyum install protobuf-devel.x86_64\n```\n\nTwo environment variables are required:\n:`$STREAMSX_PROTOBUF_LIBPATH`\n:`$STREAMSX_PROTOBUF_INCLUDEPATH`.\n\nThe following statements will set them for protobuf and protobuf-devel that are available with CentOS:\n\n```\nexport STREAMSX_PROTOBUF_LIBPATH=/usr/lib64\nexport STREAMSX_PROTOBUF_INCLUDEPATH=/usr/include/google/protobuf\n```\n\n## Generating SPL schemas from .proto files\n\nThis toolkit contains a script under `streamsx.protobuf/bin` called `spl-schema-from-protobuf`. 
This script\nwill generate tuples in SPL to match the Protobuf messages in .proto files.\n\nThis generated schema is required to use the\nconversion operators.\n\n`ProtobufParse` emits the tuple generated by the script corresponding to the\nProtobuf message it is receiving.\n\n`ProtobufBuild` receives the tuple generated by the script\ncorresponding to the Protobuf message it is producing.\n\n### Naming Conventions\nFor all message and enum names, `_pb` is appended to the identifier.\n\nFor all field names or enum values, `_` (underscore)\nis appended to the identifier. An example can be seen in `streamsx.protobuf.samples`.\n\n## Usage\n\nTo use this toolkit, create an empty application. Place your `.proto` file inside your `\u003capplication\u003e/impl` directory.\n\nRun the command:\n```\n\u003cpath to streamsx.protobuf toolkit\u003e/bin/spl-schema-from-protobuf impl \u003cyour protobuf file name\u003e\n```\n\nThis will generate the SPL schema to use with the conversion operators. The files will be placed in your `\u003capplication\u003e` directory within a nested directory structure based on the .proto message structure.  
For example, if your .proto file contains a package directive: `package tutorial`, then the generated files will be in `\u003capplication\u003e/tutorial`.\n\nThe output will also include a console message providing the code snippet to add to your .spl application to use the generated SPL types.\n\nAs an example, if you use the protobuf tutorial file (address.proto), the directory structure created will include:\n```\ntutorial\n├── AddressBook_pb.spl\n├── Person_pb.spl\n├── Person_PhoneNumber_pb.spl\n└── Person_PhoneType_pb.spl\n```\n\nNext, you will use the operators within your application or composite operator.\n\n## Simple Example\n\nIf you have a Protobuf message named `my.package.MyMessage`, the files will look like this:\n\n### MyMessage.proto\n\n    syntax = \"proto2\";\n\n    package my.package;\n\n    message MyMessage {\n        required string field = 1;\n    }\n\n### my.package/MyMessage\\_pb.spl\n\n    namespace my.package;\n\n    use my.package::*;\n\n    type MyMessage_pb = tuple\u003c\n        rstring field_\n    \u003e;\n\n### ProtobufParse invocation\n\n    stream\u003cblob recordData\u003e serializedRecords = ProtobufFileSource() {\n        param\n            file: \"\u003cbinary file\u003e\";\n    }\n\n    stream\u003cmy.package::MyMessage_pb\u003e myMessages = ProtobufParse(serializedRecords) {\n        param\n            dataAttribute: recordData;\n            protoMessage: \"my.package.MyMessage\";\n            protoDirectory: \"impl\";\n            protoRootFile: \"MyMessage.proto\";\n    }\n\n### ProtobufBuild invocation\n\n    stream\u003cmy.package::MyMessage_pb\u003e myMessages = Beacon() {\n        param\n            period: 1.0;\n        output\n            myMessages: field_ = \"\u003cvalue\u003e\";\n    }\n\n    stream\u003cblob recordData\u003e serializedRecords = ProtobufBuild(myMessages) {\n        param\n            protoMessage: \"my.package.MyMessage\";\n            protoDirectory: \"impl\";\n            protoRootFile: 
\"MyMessage.proto\";\n    }\n\n## Under the hood\n\nHow do the converters work?\n\nThey utilize a grammar file in `yapp`, which is a Perl port of `yacc`. The grammar defines the `proto2` syntax\naccording to the Google language specification sheet. The `yapp` grammar is compiled into a Perl module, which\ngenerates a parse tree containing all message and enum definitions within the file. For each import from the\nroot file, this process is repeated until all files have been processed.\n\nThe Build/Parse operators iterate through this parse tree to map Protobuf message values into and out of Streams\ntuples. At compile time, these operators run this parser to create the tree, and then they run the `protoc`\ncommand to generate the C++ code that the messages will use. The C++ is compiled into a shared object library\nnamed `libcustomproto.so`, which is stored in the application directory's `impl/lib`. This means that if more\nthan one Build and/or Parse operator exists in the same composite, they cannot be compiled in parallel and they\nmust use the same Protobuf definitions. Otherwise, race conditions will occur and one or both will be\nnon-functional at run time.\n\nVariable mapping is generated recursively, so arbitrarily deeply nested messages can be handled. There are two limitations:\nthese operators cannot handle `group` fields or `oneof` fields. Oneof fields are planned for future implementation,\nbut group fields have been deprecated by Google in favor of nested messages.\n\nEvery effort was made to keep the generated code readable, as this makes debugging issues\nmuch easier. Feel free to take a look. However, all variable names are randomly generated to reduce the likelihood\nof a name collision. 
Name collisions are not checked beforehand, as the likelihood of not having a name collision in\na message with 100 fields is (1-1/52^20)^100, which is infinitesimally close to 1.\n\nSome older versions of the `protoc` compiler do not require the first line to state `syntax = \"proto2\";`, but this\nparser requires the statement to be present regardless of the version of `protoc` installed.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fstreamsx.protobuf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fibmstreams%2Fstreamsx.protobuf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fstreamsx.protobuf/lists"}