https://github.com/ibm-cloud-architecture/refarch-asset-analytics
Present a complete solution for manufacturing assets analytics with real time event processing.
- Host: GitHub
- URL: https://github.com/ibm-cloud-architecture/refarch-asset-analytics
- Owner: ibm-cloud-architecture
- License: apache-2.0
- Created: 2018-06-08T17:21:45.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-01-07T02:24:08.000Z (almost 2 years ago)
- Last Synced: 2023-08-08T22:14:25.348Z (over 1 year ago)
- Language: JavaScript
- Size: 45.1 MB
- Stars: 6
- Watchers: 10
- Forks: 9
- Open Issues: 27
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Manufacturing Asset Predictive Maintenance
This project is part of the 'IBM Hybrid Analytics and Big Data Architecture' reference architecture implementation, available at https://github.com/ibm-cloud-architecture/refarch-analytics. This set of projects presents an end to end solution to enable predictive maintenance capabilities on manufacturing assets.
The problem space is the continuous operation and servicing of manufacturing assets like [Electrical Submersible Pumps](https://en.wikipedia.org/wiki/Submersible_pump), but any asset with sensors can be addressed with the same kind of solution. The use case is also adaptable: the architecture and solution components can be used for security threat analysis on a fleet of devices or assets connected to a company intranet. Real time events coming from the devices need to be aggregated and correlated, analytics can be run on historical data to assess security risk, and alerts can be published to a dashboard user interface.
We present best practices around data management, real-time streaming, Cassandra high availability, microservices and serverless implementation.
## Table of contents
- Use case
- System Context to present the solution components
- Deployment to kubernetes cluster like IBM Cloud Private
- Related sub projects - repositories
  - Event consumers
  - Event producers simulator to simulate pump events for demonstration purpose.
  - Angular 6 user interface to present dashboard to present a mix of static and real time data.
  - Dashboard BFF to present a mix of static and real time data.
  - Asset management microservice used to expose REST api in front of Cassandra persistence
- Demonstration script
- Analytics model
- Future readings
## Use Case
### The challenge
The adoption of IoT and smart devices in the manufacturing industry brings the opportunity to predict maintenance needs on high-cost equipment before failure. Controlling the cost of maintenance operations while optimizing device utilization is a continuous challenge for engineers. For IT architects, the challenge is how to prepare for adopting artificial intelligence capabilities that support new business opportunities, and how to support real-time data analytics at scale combined with big data and a microservice architecture.
### The solution
A set of geographically distributed electrical submersible pumps (the approach can apply to any manufacturing IoT equipment) send a data stream of important physical measurements that need to be processed in real time and persisted in big data storage. By combining traditional analytics with unstructured data such as field engineers' reports, it is possible to build a solution that delivers a risk of failure ratio, in real time, from the measurements received.
The solution combines key performance indicator aggregation, real-time reporting to a web-based dashboard user interface component, a risk scoring microservice, and a big data sink used by data scientists to develop and tune analytics and machine learning models.
Data are continuously persisted in a wide-column database: we selected [Cassandra](http://cassandra.apache.org/) as the data sink. The real time event processing is supported by [Kafka](http://kafka.apache.org/) and Kafka Streams. The microservices are written in Java, one of them with MicroProfile. The data science work is done using [ICP for Data](https://www.ibm.com/analytics/cloud-private-for-data) and Data Science Experience.
### The Benefits
The solution gives analysts and executives visibility into the real time status of the devices in the grids, with an aggregate view of the ones at risk. Device operation increased by 15% and the unpredictable failure rate decreased by 85%. The model added diagnostic capabilities that help field engineers deliver better maintenance.
## System Context
The processing starts with the continuous event flow emitted by a set of monitored devices. The event platform offers pub-sub capabilities and flow processing to aggregate and correlate events, so dashboard monitoring can be implemented on top of the stateful operators. Data sinks are used to keep asset information, physical measurements over time and maintenance reports. The risk scoring service is deployed as a REST operation, built from a model developed with analytics and machine learning capabilities. Assets are exposed via a microservice.
![System Context](docs/system-ctx.png)
1. The application logic is split between the backend for frontend components, and the different microservices or streaming operators. The BFF is a web app exposing a user interface and the business logic to serve end users. For example when a new device or pump is added to the grid, a record is pushed to the user interface. All the real time metrics per device are also visible on the user interface. The supporting project is [the Dashboard BFF](asset-dashboard-bff/README.md) and ...
1. The user interface is done using Angular 6 and aims to present a dashboard of the pump allocation and real time metrics. The wireframe looks like:
![Dashboard Wireframe](docs/dashboard-wireframe.png)
and the project is under the [asset-dashboard-ui](./asset-dashboard-ui) folder.
1. The asset manager microservice manages CRUD operations on the assets. See [the Asset manager microservice code.](https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice)
1. **Pump Simulator** is a Java program running on a developer laptop or on a server, but external to ICP. The approach is to address communication to brokers deployed on ICP. The code and guidance are in the [asset-event-producer project](https://github.com/ibm-cloud-architecture/refarch-asset-analytics/tree/master/asset-event-producer#pump-simulator).
The following diagram illustrates the IBM Cloud Private deployment we are using in this solution. You will find the same components as in the system context above, with added elements for data management and data scientists using [ICP for Data](https://www.ibm.com/analytics/cloud-private-for-data).
![ICP deployment](docs/icp-deployment.png)
The pump simulator is a standalone Java application you can run on your computer to simulate n pumps. See [the producer project.](asset-event-producer/README.md)
## Deployment
We propose two possible deployments: one for quick validation on a development laptop (tested on Mac) and one on an IBM Cloud Private cluster (tested on 2.1.0.3 and 3.1).
### Pre-requisites
* Clone this project to get all the kubernetes deployment files and source code of the different components.
* Clone the Event Driven Architecture reference project: [EDA](https://github.com/ibm-cloud-architecture/refarch-eda) to get Zookeeper and Kafka deployment manifests. As an alternate solution you can use the IBM Event Streams Helm chart from the ICP Catalog.
```shell
git clone https://github.com/ibm-cloud-architecture/refarch-eda.git
```
* Clone the [asset management microservice using the microprofile branch](https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice) implementation.
```shell
git clone https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice.git
# Move into the cloned repository before switching branch
cd refarch-asset-manager-microservice
git checkout microprofile
```
* Access to a kubernetes deployment for development. For example, on Mac we use the [Docker Edge](https://docs.docker.com/docker-for-mac/install/#download-docker-for-mac) distribution. You can use [this article](https://rominirani.com/tutorial-getting-started-with-kubernetes-with-docker-on-mac-7f58467203fd) to install Docker Edge and enable kubernetes.
For test and 'production' deployments you need to have access to a kubernetes cluster like IBM Cloud Private.
* Log in to your cluster
We provide multiple scripts to support connection and configuration validation:
* To connect to your cluster: `./scripts/connectToCluster.sh`
* To validate your dependencies: `./scripts/validateConfig.sh`. It may create the 'greencompute' namespace if it does not exist.
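The content of `validateConfig.sh` is not reproduced here. As a minimal sketch of what such a namespace check could look like, assuming only the standard `kubectl get namespace` and `kubectl create namespace` commands, consider the following. The function name `ensure_namespace` is hypothetical and not part of the project scripts.

```shell
# Hypothetical helper, not the project's validateConfig.sh:
# create the namespace only when it does not exist yet.
ensure_namespace() {
  ns="$1"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}
```

For this solution it would be called as `ensure_namespace greencompute` once you are logged in to the cluster.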
### Installing the Event Backbone
For your local environment, install Zookeeper and Kafka using our development manifests:
#### Install Zookeeper for development
[Read instructions in this article](https://github.com/ibm-cloud-architecture/refarch-eda/tree/master/deployments/zookeeper/README.md)
#### Install Kafka for development
[Read instructions in this article](https://github.com/ibm-cloud-architecture/refarch-eda/tree/master/deployments/kafka/README.md)
For production we recommend installing IBM Event Streams.
#### Install IBM Event Streams
[Read instructions in this article](https://github.com/ibm-cloud-architecture/refarch-eda/tree/master/deployments/eventstreams/README.md)
### Asset and Event Datasource
There is currently no Cassandra helm chart delivered with the ICP Helm catalog. The asset manager microservice project provides a helm chart to install Cassandra on ICP and a helm chart for the microservice itself, plus an umbrella chart to deploy both in one install. The instructions can be read [here](https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice/blob/microprofile/docs/icp.md). In short, run the following helm install from the refarch-asset-manager-microservice folder:
```shell
helm install --name assetmanager asset/assetmanager --namespace greencompute --tls
```
We are also summarizing Cassandra concepts and some installation considerations in [this article](./docs/cassandra/readme.md). In the readme, we also describe the potential architecture decisions around deploying Cassandra for high availability.
When the pods are up and running use the [following commands](https://github.com/ibm-cloud-architecture/refarch-asset-analytics/blob/master/docs/cassandra/readme.md#define-assets-table-structure-with-cql) to create the needed keyspace and tables for the solution to run.
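To check that the pods are up before creating the keyspace, one option is to filter the default `kubectl get pods` listing. The sketch below assumes the usual column order of that listing (NAME, READY, STATUS, ...) and only looks at the STATUS column, so it does not prove that every container is ready. The function name `pods_not_running` is hypothetical.

```shell
# Reads `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE) and prints the name
# of every pod whose STATUS is not "Running".
pods_not_running() {
  awk '$3 != "Running" { print $1 }'
}
```

A possible usage is `kubectl get pods -n greencompute --no-headers | pods_not_running`; an empty output means every pod in the namespace reports the Running status.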
### Deploy the solution
The following steps are manual. We need to deploy the asset manager microservice first, then the different event consumers.
#### Deploy the asset manager microservice
The docker image is pushed to docker hub, therefore the manifests under `refarch-asset-manager-microservice/manifests` use this image.
See instructions [in the project repository](https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice/blob/microprofile/docs/icp.md#run-the-app) to deploy it using helm.
#### Deploy UI with BFF
To deploy the BFF, use the scripts and manifests under the asset-dashboard-bff folder of this project. We document the build and deployment process in [these instructions.](https://github.com/ibm-cloud-architecture/refarch-asset-analytics/tree/master/asset-dashboard-bff#build)
#### Populate Cassandra with some assets
The project https://github.com/ibm-cloud-architecture/refarch-asset-manager-microservice has one script that does a `curl post` with a json file representing a pump. Change the URL to match your asset manager microservice endpoint, then add the pumps once the asset manager is deployed.
```shell
$ cd scripts
$ ./addAsset.sh pumpDAL01.json
$ ./addAsset.sh pumpHOU1.json
$ ./addAsset.sh pumpLA1.json
$ ./addAsset.sh pumpLA2.json
$ ./addAsset.sh pumpLA3.json
$ ./addAsset.sh pumpND1.json
$ ./getAssets.sh
```
The dashboard reports the imported pumps:
![Pump added](./docs/somepumps.png)
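The `addAsset.sh` script itself lives in the asset manager repository and is not shown here. As a rough sketch of the kind of `curl` POST it wraps, assuming the asset manager accepts a JSON document on a REST endpoint, the call could look like the following. The function name `post_asset` and the default URL are hypothetical: use the endpoint of your own deployment.

```shell
# Hypothetical endpoint: replace it with the URL of your asset manager microservice.
ASSET_MANAGER_URL="${ASSET_MANAGER_URL:-http://localhost:9080/assetmanager/assets}"

# Post one pump definition (a JSON file) to the asset manager.
# With DRY_RUN=1 the curl command is printed instead of executed.
post_asset() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -X POST -H 'Content-Type: application/json' -d @$1 $ASSET_MANAGER_URL"
  else
    curl -X POST -H "Content-Type: application/json" -d "@$1" "$ASSET_MANAGER_URL"
  fi
}
```

All pump files of a folder can then be loaded with `for f in pump*.json; do post_asset "$f"; done`. Setting `DRY_RUN=1` first lets you review the requests before sending them.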
#### Deploy Asset Injector
We have developed two types of asset injector. We propose to use a function as a service as the base one: see the instructions to install kubeless and the consume asset function in [this note](./asset-consumer-function/README.md).
The second asset injector is part of the [asset-consumer](./asset-consumer) project and is a Spring Boot app which can be built and executed with the commands:
```shell
mvn compile
mvn exec:java@AssetInjector
```
Be sure to modify the config/config.properties file to match the server endpoint for Kafka and Asset Manager Microservice.
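The property names used in `config/config.properties` are not listed in this document. To review a value before starting the injector, a small helper like the sketch below can print it. The function name `get_prop` is hypothetical, and the helper only handles plain `key=value` lines.

```shell
# Print the value of a key from a Java-style properties file.
# Only plain "key=value" lines are handled (no escapes, no line continuations).
get_prop() {
  awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2) }' "$2"
}
```

It is called as `get_prop <key> config/config.properties`, where `<key>` is one of the keys actually present in your file.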
#### Start Pump Simulator to add one asset
Finally the [pump simulator](asset-event-producer/readme.md) is a standalone Java application used to produce different types of events. It does not need to be deployed to kubernetes. It can run in two modes: adding a new asset, which is when a pump is added to a location, and running pump metrics.
To add a new asset:
```shell
mvn compile
mvn exec:java@Simulator
```
#### Start Pump Simulator to generate metrics event on existing pumps
To run pump metrics:
```shell
mvn compile
mvn exec:java@Simulator -Dexec.args="asset-topic gc-kafka-0.gc-kafka-hl-svc.greencompute.svc.cluster.local event PDrop 2000 4 pump01"
```
### ICP Deployment
The diagram below presents the deployment of the solution components as well as the Zookeeper, Kafka and Cassandra products deployed inside kubernetes:
![icp deployment](docs/asset-sol-k8s-depl.png)
* For high availability we need three masters, three proxies, three managers and at least six workers.
* Cassandra is deployed with 3 replicas and uses NFS-based persistent volumes to leverage shareable filesystems.
* Kafka is deployed with 3 replicas, with anti-affinity rules so that two Kafka pods never run on the same node, nor on the same node as a Zookeeper pod.
* Zookeeper is deployed with 3 replicas, with anti-affinity rules so that two Zookeeper pods never run on the same node, nor on the same node as a Kafka pod. This constraint explains the 6 workers.
* The following components of the solution are deployed with at least 3 replicas: Asset manager microservice, dashboard BFF.
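To verify that the anti-affinity constraints described above hold on a running cluster, one approach is to list each pod with its node and flag any node that hosts more than one of them. The sketch below assumes input lines of the form `POD NODE`. The function name `nodes_with_multiple_pods` is hypothetical.

```shell
# Reads "POD NODE" pairs on stdin and prints, sorted, each node
# that hosts more than one of the listed pods.
nodes_with_multiple_pods() {
  awk '{ count[$2]++ } END { for (n in count) if (count[n] > 1) print n }' | sort
}
```

The input can be produced with `kubectl get pods -n greencompute -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName --no-headers | grep -E 'kafka|zookeeper' | nodes_with_multiple_pods`. The `grep` pattern assumes that the Kafka and Zookeeper pod names contain those words. An empty output means no worker hosts two of these pods.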
### Troubleshooting
As we are deploying different solutions into kubernetes, we group [Troubleshooting notes here](https://github.com/ibm-cloud-architecture/refarch-integration/blob/master/docs/icp/troubleshooting.md) and [a technology summary here](https://jbcodeforce.github.io/#/studies).
## Contributors
* Lead development [Jerome Boyer](https://www.linkedin.com/in/jeromeboyer/)
* [Amaresh Rajasekharan for the data science part](https://www.linkedin.com/in/amaresh-rajasekharan/)
* [Hemankita Perabathini for the asset management microservice and ICP](https://www.linkedin.com/in/hemankita-perabathini/)
* [Zach Silverstein](https://www.linkedin.com/in/zsilverstein/)
Please [contact me](mailto:[email protected]) for any questions.