{"id":21482850,"url":"https://github.com/ibmstreams/sample.forecast_with_r","last_synced_at":"2025-03-17T09:22:54.194Z","repository":{"id":74839418,"uuid":"281407955","full_name":"IBMStreams/sample.forecast_with_r","owner":"IBMStreams","description":"Sample application using Streams Flow demonstrating the usage of R","archived":false,"fork":false,"pushed_at":"2020-09-01T16:35:29.000Z","size":39476,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-01-23T18:50:34.757Z","etag":null,"topics":["r","samples","stream-processing","watson-studio"],"latest_commit_sha":null,"homepage":"https://medium.com/ibm-watson/real-time-forecasting-using-r-and-watson-studio-513c45abd1a9","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IBMStreams.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-21T13:38:46.000Z","updated_at":"2022-03-19T11:41:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"cd1928aa-3e8c-4fa7-8a66-0bb318e9d5cd","html_url":"https://github.com/IBMStreams/sample.forecast_with_r","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.forecast_with_r","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.forecast_with_r/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.forecast_with_r/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IBMStreams%2Fsample.forecast_with_r/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IBMStreams","download_url":"https://codeload.github.com/IBMStreams/sample.forecast_with_r/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244006303,"owners_count":20382443,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["r","samples","stream-processing","watson-studio"],"created_at":"2024-11-23T12:38:17.239Z","updated_at":"2025-03-17T09:22:54.165Z","avatar_url":"https://github.com/IBMStreams.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n## Score streaming data with R and Streams flows\n\nThis is the sample code for the video [Score streaming data in R with Watson Studio Streams Flows](https://youtu.be/gZAoDOus0vc).\nThe code in the video is discussed in [this blog post on Medium](https://medium.com/ibm-watson/real-time-forecasting-using-r-and-watson-studio-513c45abd1a9).\nThis repo includes the full application from the video and a modifiable template so you can use it with your own data.\n\n#### Sample from the video\n- Precompiled versions of the microservice are in the `bin` folder, so you can try the application without having to compile any code.\n- [Sample flow from the demo](PredictHotspotUsage.stp) (uses generated data).\n\n* Running the sample\n  * [Import and run the sample flow](#import-the-sample-flow-into-watson-studio)\n  * [Launch the 
microservice](#launch-the-forecasting-microservice)\n* [Modify the sample to use your own model](#modifying-the-sample)\n\n\n#### Example streams flow and SPL code.\n- The `example` folder contains a template flow and template SPL code so that you can run the sample with your own data.\n\n## Running the sample\nTo use the sample, you need to\n- Import and run the sample flow in Watson Studio Streams flows.\n- Run the forecasting microservice in the Streaming analytics service.\n\n\n### Import the sample flow into Watson Studio\nFirst, create an account in Watson Studio and an instance of the Streaming analytics service.\n\n- [Sign up for Watson Studio](https://dataplatform.cloud.ibm.com)  if you haven't already.\n- Upload the sample flow:\n  - Download the [PredictHotspotUsage.stp](LINK) file\n  - From Watson Studio, [create a project](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/projects.html) if you do not have one.\n  - Follow [these steps to upload the sample flow you just downloaded](https://dataplatform.cloud.ibm.com/docs/content/wsj/streaming-pipelines/creating-pipeline-import.html?audience=wdp\u0026linkInPage=true).\n  - Start the flow:\n  ![](img/startflow.png)\n\n\n### Launch the forecasting microservice\nOnce the flow is running, submit the forecasting microservice. 
There are two versions of the application:\n - `bin/score.Forecast_With_R.sab` only scores the data and publishes the results.\n - `bin/score.Forecast_With_ModelUpdates.sab` is configured to connect to Cloud Object Storage (COS) for updates to the model.\n\nTo use the application that has model updates enabled, first follow the steps below to configure Cloud Object Storage.\n\n#### Launching the microservice\n- From the metrics page of the flow, open the Streaming analytics dashboard by clicking  **Show notifications** \u003e **Manage [your instance name]  in the cloud**.\n  ![](img/openconsole.png)\n- Once it opens, click the submit button:\n   ![](img/submitjob.png)\n- Upload the `bin/score.Forecast_With_R.sab` file.\n- Once the application is running, the streams graph should look like this:\n  ![](img/app-graph.png)\n\nYou can return to the flow and see that the forecast results are being ingested correctly.\n  ![](img/runningflow.png)\n\n\n#### Configure Streaming analytics service to connect to Cloud object storage\nIf you want to run the version of the application that has model updates, you must:\n1. Create an instance of Cloud Object Storage in IBM Cloud.\n2. Configure your [Streaming analytics service to connect to your Cloud Object Storage instance as described here](https://ibmstreams.github.io/streamsx.objectstorage/doc/spldoc/html/tk$com.ibm.streamsx.objectstorage/tk$com.ibm.streamsx.objectstorage$9.html).\n3. Create a bucket called `models-demo` in your COS instance.\n\n\n## Modifying the sample\nThis repo also includes a sample flow and template SPL code so you can try out forecasting with your own data and model.\n\nTo modify the sample to use your own model:\n1. Modify the template flow to publish your own data\n2. Modify the template SPL to ingest the right data\n3. 
Change the R scripts to load and score your models\n\n### Modify the template flow\nUnlike the flow in the video, [the template flow](example/StreamsFlowForecastTemplate.stp) converts the stream of data to JSON before publishing it to the Streams instance. This makes it easy for you to set the data schema in the forecasting microservice.\n\n- Download the template flow and upload it to Streams flows. Connect the data stream you wish to publish to the `ConvertToJson` node.\n\n  ![](img/openconsole.png)\n  \n   \n- Run the flow. Once it is running, you can see the list of attributes that will be expected by clicking the stream being published:\n  ![](img/publish.png) \n\nIn this case the data has the following attributes:\n`{\"id\": 443.0, \"address\": \"A717_M\", \"time\": 1568742876.0}` \n\nThese attributes will be used in the SPL code later.\n\n### Modify the SPL template to ingest your data\n\nIf you are completely new to SPL, please follow [the SPL development guide](http://ibmstreams.github.io/streamsx.documentation/docs/spl/atom/atom-apps/) first so you can learn some basics.\n\n1. First, configure your development environment (Atom or VS Code) to develop with SPL. [Install the Streams plugins for VS Code or Atom](https://developer.ibm.com/streamsdev/docs/develop-run-streams-applications-using-atom-visual-studio-code/) and then import the source code into the editor.\n2. Edit `example/ForecastingTemplate.spl` to subscribe to the data you published from Streams flows:\n   1. 
At line 10, edit the schema to match the list of attributes:\n       ``` \n       //Change this type to match the type of the attributes you expect\n        type InputDataSchema = float64 id, float64 time, int32 unique_users, int32 total_users;\n        ```\n   In our example, the published data was `{\"id\": 443.0, \"address\": \"A717_M\", \"time\": 1568742876.0}`.\n\n   Change the `InputDataSchema` to match those types, e.g.\n       ` type InputDataSchema = float64 id, rstring address, int64 time; `\n\n\n   See the [doc for a full list of SPL types](https://www.ibm.com/support/knowledgecenter/SSCRJU_4.3.0/com.ibm.streams.ref.doc/doc/primitivetypes.html). \n\n\n   1. Make sure the topic you are subscribing to matches the topic published from the flow. The default is `inputData`:\n    ```\n        stream\u003cJsonData\u003e JsonDataToScore = Subscribe()\n\t\t{\n\t\t\tparam\n\t\t\t\ttopic : \"inputData\" ;\n                                \n\t\t} \n    ```\n\n3. Edit the `initialize.r` script to load your own model, at line 6.  Make sure the model you are using is in the `etc/R` folder of your project.\n   \n4. Edit the scoring function in `predict.r` to use your own variables.\n   \n5. Change the `RScript` operator to send the data to the R script so that the attributes in your input match the variable names expected by the R script `predict.r`. Modify these lines as needed. E.g. if you wanted to map the `id` attribute of your input to a variable in the R script called `sensorId`, you would have: \n    ```\n   stream\u003cForecastResult\u003e RScriptResult =\tRScript(DataToScore)\n    {\n      param\n        initializationScriptFileName :  $appDir + \"/etc/R/initialize.r\" ; //edit this file to load your model\n        rScriptFileName :  $appDir + \"/etc/R/predict.r\" ; //edit this file to use your own variables\n        streamAttributes :  id;\n        rObjects : \"sensorId\" ; \n    ```\n\n6. 
Change the `RScript` operator to retrieve the results from your R script:\n  \n    ```\n    stream\u003cForecastResult\u003e RScriptResult =\tRScript(DataToScore)\n    {\n      param\n        initializationScriptFileName :  $appDir + \"/etc/R/initialize.r\" ; //edit this file to load your model\n        rScriptFileName :  $appDir + \"/etc/R/predict.r\" ; //edit this file to use your own variables\n        streamAttributes :  id;\n        rObjects : \"sensorId\" ; \n      output\n        RScriptResult : forecastedValue1 = fromR(\"value1\"),\n                        forecastedValue2 = fromR(\"value2\") ;\n    }\n    ```\n    In the example above, `predict.r` sets two variables, `value1` and `value2`, which we want to retrieve. \n\n\n7. Add the names of the output attributes to the `ForecastResult` type definition on line 12 of `ForecastingTemplate.spl`:\n    ```\n    type ForecastResult = tuple \u003cfloat32 forecastedValue1, int32 forecastedValue2\u003e, InputDataSchema;\n    ```\n8. Compile and run the sample: right-click the SPL file in the tree view and select \"Build and submit job\":\n   \n   ![](http://ibmstreams.github.io/streamsx.documentation/images/atom/jpg/build-submit.png)\n\n9.  Once the application has been built and submitted successfully, go back to the flow to see the results.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fsample.forecast_with_r","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fibmstreams%2Fsample.forecast_with_r","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibmstreams%2Fsample.forecast_with_r/lists"}