{"id":21028112,"url":"https://github.com/leonjessen/keras_tensorflow_on_iris","last_synced_at":"2025-06-13T21:05:20.194Z","repository":{"id":145190200,"uuid":"130107908","full_name":"leonjessen/keras_tensorflow_on_iris","owner":"leonjessen","description":"A minimal tutorial on how to build a neural network classifier based on the iris data set using Keras/TensorFlow in R/RStudio","archived":false,"fork":false,"pushed_at":"2018-05-04T06:06:10.000Z","size":751,"stargazers_count":18,"open_issues_count":0,"forks_count":14,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-05-15T10:40:50.670Z","etag":null,"topics":["classification","datascience","deep-learning","ggplot","iris-dataset","keras","machine-learning","neural-network","r","rstudio","tensorflow","tensorflow-tutorials","tutorial"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leonjessen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-04-18T18:51:00.000Z","updated_at":"2022-11-13T00:59:22.000Z","dependencies_parsed_at":null,"dependency_job_id":"1e9249dd-3e76-475c-b5bd-df509566a9b4","html_url":"https://github.com/leonjessen/keras_tensorflow_on_iris","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/leonjessen/keras_tensorflow_on_iris","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leonjessen%2Fkeras_tensorflow_on_iris","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leonjessen%2Fkeras_ten
sorflow_on_iris/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leonjessen%2Fkeras_tensorflow_on_iris/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leonjessen%2Fkeras_tensorflow_on_iris/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/leonjessen","download_url":"https://codeload.github.com/leonjessen/keras_tensorflow_on_iris/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leonjessen%2Fkeras_tensorflow_on_iris/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259719716,"owners_count":22901238,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","datascience","deep-learning","ggplot","iris-dataset","keras","machine-learning","neural-network","r","rstudio","tensorflow","tensorflow-tutorials","tutorial"],"created_at":"2024-11-19T11:53:56.596Z","updated_at":"2025-06-13T21:05:20.167Z","avatar_url":"https://github.com/leonjessen.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"Building a simple neural network using Keras and Tensorflow\n================\n\nA minimal example for building your first simple artificial neural network using [Keras and TensorFlow for R](https://tensorflow.rstudio.com/keras/) - Right, let's get to it!\n\n### Data\n\n[The famous Iris flower data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) contains data to quantify the morphologic variation of Iris flowers of three related species. 
In other words, a total of 150 observations of 4 input features (`Sepal.Length`, `Sepal.Width`, `Petal.Length` and `Petal.Width`) and 3 output classes (`setosa`, `versicolor` and `virginica`), with 50 observations in each class. The distributions of the feature values look like so:\n\n``` r\niris %\u003e% as_tibble %\u003e% gather(feature, value, -Species) %\u003e%\n  ggplot(aes(x = feature, y = value, fill = Species)) +\n  geom_violin(alpha = 0.5, scale = \"width\", position = position_dodge(width = 0.9)) +\n  geom_boxplot(alpha = 0.5, width = 0.2, position = position_dodge(width = 0.9)) +\n  theme_bw()\n```\n\n\u003cimg src=\"README_files/figure-markdown_github/see_iris-1.png\" style=\"display: block; margin: auto;\" /\u003e\n\n### Aim\n\nOur aim is to connect the 4 input features (`Sepal.Length`, `Sepal.Width`, `Petal.Length` and `Petal.Width`) to the correct output class (`setosa`, `versicolor` or `virginica`) using an artificial neural network. For this task, we have chosen the following simple architecture with one input layer with 4 neurons (one for each feature), one hidden layer with 4 neurons and one output layer with 3 neurons (one for each class), all fully connected:\n\n\u003cimg src=\"img/architecture_visualisation.png\" width=\"500px\" style=\"display: block; margin: auto;\" /\u003e\n\nOur artificial neural network will have a total of 35 parameters: 4 weights for each of the 4 input neurons connected to the hidden layer, plus 4 for the associated first bias neuron, and 3 weights for each of the 4 hidden neurons connected to the output layer, plus 3 for the associated second bias neuron. That is, 4 ⋅ 4 + 4 + 4 ⋅ 3 + 3 = 35.\n\n### Install Keras and TensorFlow for R\n\nBefore we begin, we need to install [Keras and TensorFlow for R](https://tensorflow.rstudio.com/keras/) as follows:\n\n``` r\ninstall.packages(\"keras\")\n```\n\nTensorFlow is the default backend engine. 
TensorFlow and Keras can be installed as follows:\n\n``` r\nlibrary(keras)\ninstall_keras()\n```\n\nWe also need to install [`TidyVerse`](https://www.tidyverse.org/):\n\n``` r\ninstall.packages(\"tidyverse\")\n```\n\n### Load libraries\n\n``` r\nlibrary(\"keras\")\nlibrary(\"tidyverse\")\n```\n\n### Prepare data\n\nWe start with slightly wrangling the iris data set by renaming and scaling the features and converting character labels to numeric:\n\n``` r\nnn_dat = iris %\u003e% as_tibble %\u003e%\n  mutate(sepal_l_feat = scale(Sepal.Length),\n         sepal_w_feat = scale(Sepal.Width),\n         petal_l_feat = scale(Petal.Length),\n         petal_w_feat = scale(Petal.Width),          \n         class_num    = as.numeric(Species) - 1, # factor, so = 0, 1, 2\n         class_label  = Species) %\u003e%\n  select(contains(\"feat\"), class_num, class_label)\nnn_dat %\u003e% head(3)\n```\n\n    ## # A tibble: 3 x 6\n    ##   sepal_l_feat sepal_w_feat petal_l_feat petal_w_feat class_num\n    ##          \u003cdbl\u003e        \u003cdbl\u003e        \u003cdbl\u003e        \u003cdbl\u003e     \u003cdbl\u003e\n    ## 1       -0.898        1.02         -1.34        -1.31        0.\n    ## 2       -1.14        -0.132        -1.34        -1.31        0.\n    ## 3       -1.38         0.327        -1.39        -1.31        0.\n    ## # ... 
with 1 more variable: class_label \u003cfct\u003e\n\nThen, we split the iris data into a training and a test data set, setting aside roughly 20% of the data as a left-out partition, to be used for the final performance evaluation:\n\n``` r\ntest_f = 0.20\nnn_dat = nn_dat %\u003e%\n  mutate(partition = sample(c('train','test'), nrow(.), replace = TRUE, prob = c(1 - test_f, test_f)))\n```\n\nBased on the partition, we can now create the training and test data:\n\n``` r\nx_train = nn_dat %\u003e% filter(partition == 'train') %\u003e% select(contains(\"feat\")) %\u003e% as.matrix\ny_train = nn_dat %\u003e% filter(partition == 'train') %\u003e% pull(class_num) %\u003e% to_categorical(3)\nx_test  = nn_dat %\u003e% filter(partition == 'test')  %\u003e% select(contains(\"feat\")) %\u003e% as.matrix\ny_test  = nn_dat %\u003e% filter(partition == 'test')  %\u003e% pull(class_num) %\u003e% to_categorical(3)\n```\n\n### Set Architecture\n\nWith the data in place, we now set the architecture of our artificial neural network:\n\n``` r\nmodel = keras_model_sequential()\nmodel %\u003e% \n  layer_dense(units = 4, activation = 'relu', input_shape = 4) %\u003e% \n  layer_dense(units = 3, activation = 'softmax')\nmodel %\u003e% summary\n```\n\n    ## ___________________________________________________________________________\n    ## Layer (type)                     Output Shape                  Param #     \n    ## ===========================================================================\n    ## dense_1 (Dense)                  (None, 4)                     20          \n    ## ___________________________________________________________________________\n    ## dense_2 (Dense)                  (None, 3)                     15          \n    ## ===========================================================================\n    ## Total params: 35\n    ## Trainable params: 35\n    ## Non-trainable params: 0\n    ## ___________________________________________________________________________\n\nAs 
expected, we see 35 trainable parameters. Next, the architecture set in the model needs to be compiled:\n\n``` r\nmodel %\u003e% compile(\n  loss      = 'categorical_crossentropy',\n  optimizer = optimizer_rmsprop(),\n  metrics   = c('accuracy')\n)\n```\n\n### Train the Artificial Neural Network\n\nNow we fit the model and save the training progress in the `history` object:\n\n``` r\nhistory = model %\u003e% fit(\n  x = x_train, y = y_train,\n  epochs           = 200,\n  batch_size       = 20,\n  validation_split = 0\n)\nplot(history)\n```\n\n\u003cimg src=\"README_files/figure-markdown_github/fit_model-1.png\" style=\"display: block; margin: auto;\" /\u003e\n\n### Evaluate Network Performance\n\nThe final performance can be obtained like so:\n\n``` r\nperf = model %\u003e% evaluate(x_test, y_test)\nprint(perf)\n```\n\n    ## $loss\n    ## [1] 0.3843208\n    ## \n    ## $acc\n    ## [1] 1\n\nThen we can augment `nn_dat` with the predictions for plotting:\n\n``` r\nplot_dat = nn_dat %\u003e% filter(partition == 'test') %\u003e%\n  mutate(class_num = factor(class_num),\n         y_pred    = factor(predict_classes(model, x_test)),\n         Correct   = factor(ifelse(class_num == y_pred, \"Yes\", \"No\")))\nplot_dat %\u003e% select(-contains(\"feat\")) %\u003e% head(3)\n```\n\n    ## # A tibble: 3 x 5\n    ##   class_num class_label partition y_pred Correct\n    ##   \u003cfct\u003e     \u003cfct\u003e       \u003cchr\u003e     \u003cfct\u003e  \u003cfct\u003e  \n    ## 1 0         setosa      test      0      Yes    \n    ## 2 0         setosa      test      0      Yes    \n    ## 3 0         setosa      test      0      Yes\n\nFinally, we can visualise the confusion matrix like so:\n\n``` r\ntitle     = \"Classification Performance of Artificial Neural Network\"\nsub_title = str_c(\"Accuracy = \", round(perf$acc, 3) * 100, \"%\")\nx_lab     = \"True iris class\"\ny_lab     = \"Predicted iris class\"\nplot_dat %\u003e% ggplot(aes(x = class_num, y = y_pred, colour = Correct)) +\n  
geom_jitter() +\n  scale_x_discrete(labels = levels(nn_dat$class_label)) +\n  scale_y_discrete(labels = levels(nn_dat$class_label)) +\n  theme_bw() +\n  labs(title = title, subtitle = sub_title, x = x_lab, y = y_lab)\n```\n\n\u003cimg src=\"README_files/figure-markdown_github/conf_mat_vis-1.png\" style=\"display: block; margin: auto;\" /\u003e\n\n### Conclusion\n\nHere, we created a 3-class predictor with an accuracy of 100% on a left-out data partition. I hope this little post illustrated how you can get started building artificial neural networks using [Keras and TensorFlow in R](https://keras.rstudio.com/). This was a basic, minimal example. It should be noted that the network can be expanded into full deep learning networks and, furthermore, that the entire TensorFlow API is available. It also goes to show how important it is for a data scientist that the tools needed to go efficiently from idea to implementation are available. Available and accessible technology is the cornerstone of modern data science.\n\nEnjoy and Happy Learning!\n\nLeon\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleonjessen%2Fkeras_tensorflow_on_iris","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleonjessen%2Fkeras_tensorflow_on_iris","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleonjessen%2Fkeras_tensorflow_on_iris/lists"}