{"id":21482840,"url":"https://github.com/ibmstreams/sample.edge-mnist-flows","last_synced_at":"2025-03-17T09:22:43.125Z","repository":{"id":74839417,"uuid":"281596597","full_name":"IBMStreams/sample.edge-mnist-flows","owner":"IBMStreams","description":"MNIST digit recognition sample for Streams Flows","archived":false,"fork":false,"pushed_at":"2020-11-04T18:06:19.000Z","size":9770,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-01-23T18:50:34.329Z","etag":null,"topics":["edge-computing","samples","stream-processing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IBMStreams.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-22T06:39:26.000Z","updated_at":"2020-11-04T18:06:21.000Z","dependencies_parsed_at":"2023-05-09T17:16:42.569Z","dependency_job_id":null,"html_url":"https://github.com/IBMStreams/sample.edge-mnist-flows","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.edge-mnist-flows","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.edge-mnist-flows/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.edge-mnist-flows/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.edge-mnist-flows/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IBMStreams","download_url":"https://codeload.github.com/IBMStreams/sample.edge-mnist-flows/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244006303,"owners_count":20382443,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["edge-computing","samples","stream-processing"],"created_at":"2024-11-23T12:38:13.167Z","updated_at":"2025-03-17T09:22:43.117Z","avatar_url":"https://github.com/IBMStreams.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sample.edge-mnist-flows\n\nThe sample.edge-mnist is a sample that demonstrates IBM’s Cloud Pak for Data (CP4D) Edge Analytics on the popular MNIST dataset. \n\nThe sample consists of 2 main parts:\n\n\u003cimg src=\"https://github.com/IBMStreams/sample.edge-mnist-flows/blob/main/screenshots/arch.png\" width=\"720\"\u003e \n\n1. The Streams Flows - An IBM Streams app (developed using Streams Flows) running on remote edge systems, where real-time data from IOT devices (or in our case the static MNIST dataset) is processed and scored, with metrics being sent back to the CP4D hub for further analysis. For an overview of Streams Flows, click [here](https://www.youtube.com/watch?v=rVTOnt0nbDA) \n\n2. 
The Notebook - A Python Jupyter notebook that creates an IBM Streams app on Cloud Pak for Data (CP4D) to receive data from the remote edge systems (through Apache Kafka) for further processing of the data.\n\t\n## Requirements \n- IBM Cloud Pak for Data (CP4D) Cluster v3.0.1 with [Streams / Streams Flows](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/svc-welcome/streams.html) and [Watson Machine Learning](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/svc-welcome/wml.html) installed \n- [CP4D Edge Analytics beta](https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.0.1/svc-welcome/edge.html)\n- [CP4D Streams instance](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/cpd/svc/streams/provision.html)\n- Kafka / [IBM Event Streams](https://www.ibm.com/cloud/event-streams) implementation with at least 1 topic\n- Familiarity with Python, IBM Streams, WML and Kafka/IBM Event Streams\n\n## Instructions \nTo get started with building the edge-mnist sample application, clone this repo and follow the steps below.\n\n#### 1. Set up CP4D Project \n- Create a CP4D project with Git integration and JupyterLab IDE\n  - On the CP4D new project page, create a new empty project, choose a name and check the Git integration box underneath\n  - On GitHub, create a [personal access token](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) with repo access (first checkbox) and copy the token\n  - On the new project page, click 'New Token', paste the access token, give it a name, and finally click \"Continue\"\n  - Back on GitHub, create an empty Git repo (e.g. 
mnist) and copy the HTTPS url for the repo\n  - Paste the URL in the Repository URL field, select the branch, check the box labeled \"Edit notebooks only with the JupyterLab IDE\", and finally click \"Create\"\n  - Once the project is created, open the project\n  - Launch the JupyterLab IDE by clicking the \"Launch IDE\" dropdown and select 'JupyterLab'\n  - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/getting-started/projects.html)\n \n- Upload files to Jupyterlab IDE\n  - After launching Jupyterlab, there should be 2 folders in the left panel - `project_data_assets` and `project_git_repo`\n  - Select `project_git_repo`, then the folder with your github repo name, then `assets` and finally `jupyterlab`\n  - At this point, you should be in the `project_git_repo/\u003cgithub-repo-name\u003e/assets/jupyterlab/` directory\n  - Upload the following files to the current directory:\n    - `Build-metro-app.ipynb`\n    - `digit-support.py`\n    - `metrorender.py`\n    - `render-metro-view.ipynb`\n    - `model_upload.ipynb`\n    - `sklearn-023-SVC-model`\n \n - Upload model \u0026 create deployment\n   - While still in Jupyterlab, open and follow the instructions in the script `model_upload.ipynb` to populate it with the required details. 
Finally run it to upload the model\n   - Go back to your project and under the Settings tab, click \"Associate a deployment space\"\n   - Create a new deployment space by entering a name and clicking \"Associate\"\n   - Click on the \"Assets\" tab of the project and under \"Models\" section, click \"Promote\" of the sklearn-SVC-model options and finally click \"Promote to space\"\n   - Navigate to your deployment space, and click the \"Deploy\" button of the sklearn-SVC-model, select \"Online\" for the deployment type, enter a name and click 'Create'\n   - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/analyze-data/ml-spaces_local.html )\n\n\n#### 2. Streams Flows app\n- Upload flow app to CP4D (the `sample.edge-mnist.stp` file)\n  - Exit out of Jupyterlab IDE and go back to your project\n  - Click the assets tab, select the blue \"Add to project\" button, select Streams Flows, from file. \n  - From there, choose a name, select an existing Streams instance, then click create\n  - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/streaming-pipelines/creating-pipeline-import.html)\n  \n- Complete Streams Flows app \n  - To complete the Streams Flow app, we need to connect the app to the model deployment created earlier. \n  - The WML node connects to the WML service, retrieves the model we uploaded earlier, and uses it to score the MNIST dataset images\n  - With the Streams Flow open, click \"Edit the streams flow\"\n  - To connect the app to the WML deployment, select the WML MNIST model node, and in the settings panel on the right, change the model to the model deployment created earlier under the WML MODELS dropdown.\n  - Make sure the \"Inline Scoring\" checkbox under Runtime is ticked. 
This tells WML to download the model to do scoring on-device (in this case, the remote edge systems)\n  - Configure the output\n    - In the same settings panel, under \"Schema\", click the \"Edit\" button, then click \"Reset schema\"\n    - Click \"Add attributes from incoming schema\". These attributes will be greyed out. \n    - Click \"Add Attributes\" twice. For the first attribute name enter \"result_class\" with \"Number\" as the type, and \"prediction\" as the model field. For the second attribute name enter \"predictions\" with \"Text\" as the type, and \"probability\" as the model field. The attributes list should now appear similar to the image below.\n    - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/streaming-pipelines/wml_operator.html)\n    - \u003cimg src=\"https://github.com/IBMStreams/sample.edge-mnist-flows/blob/main/screenshots/output_schema.png\" width=\"500\"\u003e\n  - Set up Kafka\n    - As mentioned earlier, the edge nodes send metrics back for further analysis. 
For this sample, we use 1 kafka topic to send both metrics on low scoring images, and aggregate metrics from all the data being scored.\n    - If you already have your own kafka implementation, feel free to use that, if not, you can sign up for a free IBM cloud account by clicking [here](https://www.ibm.com/cloud/free)\n      - After you sign up, you can create an Event Streams instance by clicking the blue \"Create Resource\" button and searching for \"Event Streams\", and following the given instructions.\n      - Be sure to create a topic and service credentials to be used in the Streams flow\n    - Back in the Streams Flows, the kafka node titled \"Low confidence metrics - Kafka\" is the one used to send low scoring images, while the node titled \"Aggregate metrics - Kafka\" sends aggregate metrics from all the data being scored\n    - To add a connection click on either kafka node, and in the right panel select \"Add connection\" and populate it with the appropriate details. (e.g. if you're using Event Streams, populate the brokers, username, and password). When done, click \"Create\"\n    - In the Settings panel, select the topic you have created. 
Now select the other node and fill in the same connection and topic settings.\n    - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/streaming-pipelines/Kafka.html)\n  - Ensure necessary packages are installed and parameters are populated\n    - While still in the Streams Flows edit page, select the settings button in the top left of the toolbar\n    - Make sure that both panels have parameters and python packages set as in the image below\n    - \u003cimg src=\"https://github.com/IBMStreams/sample.edge-mnist-flows/blob/main/screenshots/settings.png\" width=\"250\"\u003e | \u003cimg src=\"https://github.com/IBMStreams/sample.edge-mnist-flows/blob/main/screenshots/packages.png\" width=\"250\"\u003e \n  - As a final check to make sure everything is working, you can run the Streams Flows app\n    - To do this first ensure that there are no red dots on any of the nodes. This indicates an error that requires fixing.\n    - If there are no red dots, select the \"save and run\" button in the top left of the toolbar\n    - Let the application run for a few minutes. Be sure to check the bell icon in the top right corner to see if there are any errors with the application.\n    - If there are any errors, view the error log by selecting the bell icon, and then edit the app to fix the error\n    - For more information, please click [here](https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/wsj/streaming-pipelines/running-monitoring-streaming-pipeline.html)\n   \n    \n#### 3. 
Build and deploy to the edge \n  - Build the edge app\n    - If there are no errors, click the stop icon to quit the app\n    - Now click the \"Edit the Streams Flows\" button, and in the top left corner click the \"Build as Edge Analytics Application\" button\n    - For further instruction, see [here](https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.0.1/svc-edge/developing-build.html)\n  - Package the edge app\n    - Once the edge app is built, we need to package it and prepare it for edge deployment. The link [here](https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.0.1/svc-edge/usage-register-app.html) has more information on this\n  - Deploying to the edge\n    - Finally, we can deploy the app to the edge. For more information, please click [here](https://www.ibm.com/support/knowledgecenter/SSQNUZ_3.0.1/svc-edge/usage-deploy.html)\n\n\n#### 4. Notebook app\n- After deploying the Streams Flow app to the edge, we can now create the IBM Streams app in Cloud Pak for Data that receives and processes the scored images from the edge\n- To do this, go back into your CP4D project, and launch Jupyterlab IDE\n- In the same directory as before `project_git_repo/\u003cgithub-repo-name\u003e/assets/jupyterlab/`, open `build-metro-app.ipynb`\n  - Fill in the Streams instance name, Kafka topic name, and follow the instructions in the cells to input Kafka credentials. Run the rest of the cells to build and start the application.\n- Once the application is up and running, open `render-metro-view.ipynb`, fill in the Streams instance name and run the cells to preview the data.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fsample.edge-mnist-flows","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fibmstreams%2Fsample.edge-mnist-flows","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fsample.edge-mnist-flows/lists"}