NiFi Examples
=================

Apache NiFi example flows.

#### collect-stream-logs

This [flow](./collect-stream-logs/) shows a workflow for log collection, aggregation, storage, and display.

1. Ingest logs from folders.
2. Listen for syslogs on a UDP port.
3. Merge syslogs and drop-in logs, and persist the merged logs to Solr for historical search.
4. Dashboard: stream real-time log events to a dashboard and enable cross-filter search on historical log data.

#### iot-activity-tracker

This [flow](./iot-activity-tracker/) shows how to bring IoT data into the enterprise.

1. Ingest IoT data over WebSocket and HTTP.
2. Store all data to Hadoop (HDFS) and summary data to NoSQL (MarkLogic) for historical data search.
3. Route data based on pre-set thresholds (vital signs like `pulse rate` and `blood pressure`) to alert users and physicians.
4. Inactivity reporting.

#### oltp-to-olap

A low-latency *Change Data Capture* [flow](./oltp-cdc-olap/) that continuously replicates data from OLTP (MySQL) to OLAP (NoSQL) systems with no impact on the source.

1. Multi-tenant: can contain data from many different databases and support multiple consumers.
2. Flexible CDC: capture changes from many data sources and types.
    1. Source consistency preservation; no impact on the source.
    2. Both DML (INSERT/UPDATE/DELETE) and DDL (ALTER/CREATE/DROP) are captured non-invasively.
    3. Produces Logical Change Records (LCRs) in JSON format.
    4. Commits at the source are grouped by transaction.
3. Flexible consumer dataflows: consumer dataflows can be implemented in Apache NiFi, Flink, Spark, or Apex.
    1. Parallel data filtering, transformation, and loading.
4. Flexible databus: store LCRs in **Kafka** streams for durability and pub-sub semantics.
    1. Use *only* Kafka as the input for all consumer dataflows.
    2. Feed data to many client types (real-time, slow/catch-up, full bootstrap).
    3. Consumption from an arbitrary point in the change stream, including full bootstrap of the entire data set.
    4. Guaranteed in-commit-order and at-least-once delivery.
    5. Partitioned consumption (data is partitioned to different Kafka topics by database name, table, or any field of the LCR).
5. Both batch and near-real-time delivery.

#### csv-to-json

This [flow](./csv-to-json/) shows how to convert a CSV entry to a JSON document using ExtractText and ReplaceText.

#### decompression

This [flow](./decompression/) demonstrates taking an archive created with several levels of compression and continuously decompressing it in a loop until the archived file is extracted.

#### http-get-route

This [flow](./http-get-route/) pulls from a web service (the example uses NiFi itself), extracts text from a specific section, makes a routing decision on that extracted value, and prepares to write to disk using PutFile.

#### invoke-http-route

This [flow](./invoke-http-route/) demonstrates how to call an HTTP service based on an incoming FlowFile, and route the original FlowFile based on the status code returned from the invocation.
In this example, a FlowFile is produced every 30 seconds, an attribute is added to the FlowFile that sets q=nifi, google.com is invoked for that FlowFile, and any response with a 200 status is routed to a relationship called 200.

#### retry-count-loop

This [process group](./retry/) can be used to maintain a count of how many times a FlowFile passes through it. If the count reaches a configured threshold, the FlowFile is routed to a 'Limit Exceeded' relationship; otherwise it is routed to 'retry'. Great for processes that you only want to run X times before giving up.

#### split-route

This [flow](./split-route/) demonstrates splitting a file on line boundaries, routing the splits based on a regex in the content, merging the less important files together for storage somewhere, and sending the higher-priority files down another path for immediate action.

#### twitter-garden-hose

This [flow](./twitter-garden-hose/) pulls from Twitter using the garden hose setting; it pulls out some basic attributes from the JSON and then routes only those items that are actually tweets.

#### twitter-solr

This [flow](./twitter-solr/) shows how to index tweets with Solr using NiFi. Prerequisites for this flow are NiFi 0.3.0 or later, a Twitter application, and a running instance of Solr 5.1 or later with a tweets collection.

### Install NiFi

1. Manual: download the [Apache NiFi](https://nifi.apache.org/download.html) binaries and unpack them to a folder.
2. On Mac: `brew install nifi`

### Run NiFi

```bash
cd /Developer/Applications/nifi
./bin/nifi.sh start
./bin/nifi.sh stop
```

On Mac:

```bash
# nifi start|stop|run|restart|status|dump|install
nifi start
nifi status
nifi stop
# Working directory: /usr/local/Cellar/nifi/0.3.0/libexec
```
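The csv-to-json flow above boils down to a regex capture (ExtractText) followed by a template substitution (ReplaceText). The same transformation can be sketched in plain shell, with `sed` standing in for the two processors; the field names (`name`, `age`, `city`) are illustrative, not taken from the flow:

```shell
# Convert one CSV line into a JSON document.
# The sed capture groups play the role of ExtractText; the replacement
# template plays the role of ReplaceText. Field names are hypothetical.
csv_line='alice,30,paris'
json=$(printf '%s' "$csv_line" | \
  sed -E 's/^([^,]*),([^,]*),([^,]*)$/{"name":"\1","age":\2,"city":"\3"}/')
echo "$json"
```

In the real flow the capture groups land in FlowFile attributes and the template references them via NiFi Expression Language, but the shape of the transformation is the same.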
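The retry-count-loop's routing decision reduces to a single predicate: once the pass count reaches the configured threshold, route to 'Limit Exceeded', otherwise to 'retry'. A minimal shell sketch of that decision (the function and relationship names here are illustrative, not NiFi's):

```shell
# route_flowfile COUNT THRESHOLD
# Echoes the relationship a FlowFile would take after COUNT passes
# through the process group, given the configured THRESHOLD.
# Names are hypothetical; NiFi tracks the count in a FlowFile attribute.
route_flowfile() {
  if [ "$1" -ge "$2" ]; then
    echo "limit-exceeded"
  else
    echo "retry"
  fi
}

route_flowfile 2 3   # prints "retry" (still under the threshold)
route_flowfile 3 3   # prints "limit-exceeded" (threshold reached)
```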