{"id":22636712,"url":"https://github.com/daroczig/ceu-r-mastering","last_synced_at":"2025-07-19T15:04:24.571Z","repository":{"id":139896781,"uuid":"183962078","full_name":"daroczig/CEU-R-mastering","owner":"daroczig","description":"Materials for the \"Mastering R\" class at CEU","archived":false,"fork":false,"pushed_at":"2023-06-05T17:03:47.000Z","size":594,"stargazers_count":7,"open_issues_count":0,"forks_count":10,"subscribers_count":1,"default_branch":"2022-2023","last_synced_at":"2025-07-14T05:23:41.827Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/daroczig.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-04-28T21:44:54.000Z","updated_at":"2025-02-13T23:06:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"87da566d-71ac-42dc-a1c5-da79f25a2b62","html_url":"https://github.com/daroczig/CEU-R-mastering","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/daroczig/CEU-R-mastering","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FCEU-R-mastering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FCEU-R-mastering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FCEU-R-mastering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FCEU-R-mastering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/owners/daroczig","download_url":"https://codeload.github.com/daroczig/CEU-R-mastering/tar.gz/refs/heads/2022-2023","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FCEU-R-mastering/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265950420,"owners_count":23853755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-09T03:29:55.374Z","updated_at":"2025-07-19T15:04:24.560Z","avatar_url":"https://github.com/daroczig.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"This is the R script/materials repository of the \"[Mastering R Skills](https://courses.ceu.edu/courses/2022-2023/mastering-r-skills)\" course in the 2022/2023 Spring term, part of the [MSc in Business Analytics](https://courses.ceu.edu/programs/ms/master-science-business-analytics) at CEU. 
For the previous edition, see [2018/2019 Spring](https://github.com/daroczig/CEU-R-mastering/tree/2018-2019), [2019/2020 Spring](https://github.com/daroczig/CEU-R-mastering/tree/2019-2020), and [2020/2021 Spring](https://github.com/daroczig/CEU-R-mastering/tree/2020-2021).\n\n## Table of Contents\n\n* [Schedule](#schedule)\n* [Location](#location)\n* [Syllabus](#syllabus)\n* [Technical Prerequisites](#technical-prerequisites)\n* [Class materials](#class-materials)\n  * [Report on the current price of 0.42 BTC](#report-on-the-current-price-of-042-btc)\n  * [Report on the current price of 0.42 BTC in EUR](#report-on-the-current-price-of-042-btc-in-eur)\n  * [Move helpers to a new R package](#move-helpers-to-a-new-r-package)\n  * [Recap of week 1](#recap-of-week-1)\n  * [Homework for week 1 gotchas](#homework-for-week-1-gotchas)\n  * [Replace the home-brew retry with something better maintained](#replace-the-home-brew-retry-with-something-better-maintained)\n  * [Speed up flaky API calls with caching](#speed-up-flaky-api-calls-with-caching)\n  * [Report on the price of 0.42 BTC in the past 30 days](#report-on-the-price-of-042-btc-in-the-past-30-days)\n  * [Make sure our helper functions work!](#make-sure-our-helper-functions-work)\n  * [Recap of week 2](#recap-of-week-2)\n  * [Homework for week 2 gotchas](#homework-for-week-2-gotchas)\n  * [Report on the price of 0.42 BTC and 1.2 ETH in the past 30 days](#report-on-the-price-of-042-btc-and-12-eth-in-the-past-30-days)\n  * [Report on the price of cryptocurrency assets read from a database](#report-on-the-price-of-cryptocurrency-assets-read-from-a-database)\n  * [Report on the price of cryptocurrency assets based on the transaction history read from a database](#report-on-the-price-of-cryptocurrency-assets-based-on-the-transaction-history-read-from-a-database)\n* [Home assignments](#homeworks)\n  * [Week 1](#week-1)\n  * [Week 2](#week-2)\n  * [Week 3](#week-3)\n* [References](#references)\n\n## Schedule\n\n2 x 150 mins 
on May 22, 31:\n\n* 13:30 - 15:00 session 1\n* 15:00 - 15:15 break\n* 15:15 - 16:15 session 2\n\n1 x 300 mins on June 5:\n\n* 13:30 - 15:10 session 1\n* 15:10 - 15:40 break\n* 15:40 - 17:20 session 2\n* 17:20 - 17:40 break\n* 17:40 - 19:20 session 3\n\n## Location\n\nIn-person at the Vienna campus (QS B-421).\n\n## Syllabus\n\nPlease find it in the `syllabus` folder of this repository.\n\n## Technical Prerequisites\n\n0. Bookmark, watch or star this repository so that you can easily find it later.\n1. Please bring your own laptop and make sure to install R and RStudio **before** attending the first class!\n\n    💪 R packages to be installed from CRAN via `install.packages`:\n\n    * `data.table`\n    * `httr`\n    * `jsonlite`\n    * `lubridate`\n    * `ggplot2`\n    * `scales`\n    * `zoo`\n    * `RMySQL`\n    * `RSQLite`\n    * `openxlsx`\n    * `googlesheets4`\n    * `devtools`\n    * `roxygen2`\n    * `pander`\n    * `logger`\n    * `botor` (requires Python and the `boto3` Python module)\n    * `purrr`\n    * `memoise`\n\n    💪 R packages to be installed from GitHub via `remotes::install_github`:\n\n    * `daroczig/binancer`\n    * `daroczig/logger`\n    * `daroczig/dbr`\n\n    If you get stuck, feel free to use the preconfigured, shared RStudio Server at http://mr.ceudata.net/rstudio (I will share the usernames and passwords at the start of the class). In that case, you can skip all the steps prefixed with \"💪\" as the server already has that configured.\n\n2. Join the #ba-mr-2022 Slack channel in the `ceu-bizanalytics` Slack group.\n3. If you do not already have a GitHub account, create one.\n4. Optionally create a new GitHub repository called `mastering-r` (or similar), but this can be done later as well for the R package (see below).\n5. 💪 Install `git` from https://git-scm.com/\n6. 
💪 Verify that in RStudio, you can see the path of the `git` executable binary in the Tools/Global Options menu's \"Git/Svn\" tab -- if not, then you might have to restart RStudio (if you installed git after starting RStudio), or you installed git without adding it to the PATH on Windows. Either way, browse to the \"git executable\" manually (look for the `git` executable file in some `bin` folder).\n7. Create an RSA key via the Tools/Global options/Git/Create RSA Key button (optionally with a passphrase for increased security -- that you have to enter every time you push and pull to and from GitHub), then copy the public key (from `~/.ssh/id_rsa.pub`) and add that to your SSH keys on your [GitHub profile](https://github.com/settings/ssh/new).\n8. Create a new project in RStudio choosing \"version control\", then \"git\" and paste the SSH version of the repo URL copied from GitHub (from point 4) in the pop-up -- now RStudio should be able to download the repo. If it asks you to accept GitHub's fingerprint, say \"Yes\".\n9. If RStudio/git is complaining that you have to set your identity, click on the \"Git\" tab in the top-right panel, then click on the Gear icon and then \"Shell\" -- here you can set your username and e-mail address in the command line, so that RStudio/git integration can work. Use the following commands:\n\n    ```sh\n    $ git config --global user.name \"Your Name\"\n    $ git config --global user.email \"Your e-mail address\"\n    ```\n\n    Close this window, commit, push changes, all set.\n\nFind more resources in Jenny Bryan's \"[Happy Git and GitHub for the useR](http://happygitwithr.com/)\" tutorial if in doubt or [contact me](#contact).\n\n## Class materials\n\n### Report on the current price of 0.42 BTC\n\nWe have 0.42 Bitcoin. Let's write an R script reporting on the current value of this asset in USD.\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a hint ...\u003c/summary\u003e\n\n  We installed the `binancer` package for a reason! 
Look up the related functions via `help(package = binancer)`.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ...\u003c/summary\u003e\n\n```r\n## library(devtools)\n## install_github('daroczig/binancer')\n\nlibrary(binancer)\ncoin_prices \u003c- binance_coins_prices()\ncoin_prices[symbol == 'BTC', usd]\n\n## don't forget that we need to report on the price of 0.42 BTC instead of 1 BTC\ncoin_prices[symbol == 'BTC', usd * 0.42]\n```\n\n\u003c/details\u003e\n\n### Report on the current price of 0.42 BTC in EUR\n\nLet's do the same report as above, but instead of USD, now let's report in Euros.\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ...\u003c/summary\u003e\n\n```r\n## How to get the USD/EUR rate?\n## See eg https://exchangerate.host for free API access\n\n## Loading data without any dependencies\n## https://api.exchangerate.host/latest\n## https://api.exchangerate.host/latest?base=USD\n\nreadLines('https://api.exchangerate.host/latest?base=USD')\n\n## Parse JSON\nlibrary(jsonlite)\nfromJSON(readLines('https://api.exchangerate.host/latest?base=USD'))\nfromJSON('https://api.exchangerate.host/latest?base=USD')\n\n## Extract the USD/EUR exchange rate from the list\nusdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\ncoin_prices[symbol == 'BTC', 0.42 * usd * usdeur]\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... 
after cleaning up\u003c/summary\u003e\n\n```r\n## loading required packages at the top of the script\nlibrary(binancer)\nlibrary(httr)\nlibrary(jsonlite)\n\n## constants\nBITCOINS \u003c- 0.42\n\n## get Bitcoin price in USD\ncoin_prices \u003c- binance_coins_prices()\nbtcusdt \u003c- coin_prices[symbol == 'BTC', usd]\n\n## get USD/EUR exchange rate\nusdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\n\n## report\nBITCOINS * btcusdt * usdeur\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... with logging\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(httr)\nlibrary(jsonlite)\nlibrary(data.table)\nlibrary(logger)\n\nBITCOINS \u003c- 0.42\n\ncoin_prices \u003c- binance_coins_prices()\nlog_info('Found {coin_prices[, .N]} coins on Binance')\nbtcusdt \u003c- coin_prices[symbol == 'BTC', usd]\nlog_info('The current Bitcoin price is ${btcusdt}')\n\nusdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\nlog_info('1 USD currently costs {usdeur} EUR')\n\nlog_eval(BITCOINS * btcusdt * usdeur, level = INFO)\nlog_info('{BITCOINS} Bitcoins now worth {round(btcusdt * usdeur * BITCOINS)} EUR')\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... 
with validating values received from the API\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(httr)\nlibrary(jsonlite)\nlibrary(data.table)\nlibrary(logger)\nlibrary(checkmate)\n\nBITCOINS \u003c- 0.42\n\ncoin_prices \u003c- binance_coins_prices()\nlog_info('Found {coin_prices[, .N]} coins on Binance')\nbtcusdt \u003c- coin_prices[symbol == 'BTC', usd]\nlog_info('The current Bitcoin price is ${btcusdt}')\nassert_number(btcusdt, lower = 1000)\n\nusdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\nlog_info('1 USD currently costs {usdeur} EUR')\nassert_number(usdeur, lower = 0.9, upper = 1.1)\n\nlog_info('{BITCOINS} Bitcoins now worth {round(btcusdt * usdeur * BITCOINS)} EUR')\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... with auto-retries for API errors\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(httr)\nlibrary(jsonlite)\nlibrary(data.table)\nlibrary(logger)\nlibrary(checkmate)\n\nBITCOINS \u003c- 0.42\n\nget_usdeur \u003c- function() {\n  ## return from inside tryCatch, so that a retried call's value is passed back\n  tryCatch({\n    usdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\n    assert_number(usdeur, lower = 0.9, upper = 1.1)\n    log_info('1 USD={usdeur} EUR')\n    usdeur\n  }, error = function(e) {\n    ## str(e)\n    log_error(e$message)\n    Sys.sleep(1)\n    get_usdeur()\n  })\n}\n\nget_bitcoin_price \u003c- function() {\n  tryCatch({\n      btcusdt \u003c- binance_coins_prices()[symbol == 'BTC', usd]\n      assert_number(btcusdt, lower = 1000)\n      log_info('The current Bitcoin price is ${btcusdt}')\n      btcusdt\n  },\n  error = function(e) {\n    log_error(e$message)\n    Sys.sleep(1)\n    get_bitcoin_price()\n  })\n}\n\nlog_info('{BITCOINS} Bitcoins now worth {round(get_bitcoin_price() * get_usdeur() * BITCOINS)} EUR')\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... 
with auto-retries for API errors with exponential backoff\u003c/summary\u003e\n\n```r\nget_usdeur \u003c- function(retried = 0) {\n  ## return from inside tryCatch, so that a retried call's value is passed back\n  tryCatch({\n    ## httr\n    usdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\n    assert_number(usdeur, lower = 0.9, upper = 1.1)\n    log_info('1 USD={usdeur} EUR')\n    usdeur\n  }, error = function(e) {\n    ## str(e)\n    log_error(e$message)\n    if (retried \u003e 3) {\n      stop('Gave up')\n    }\n    Sys.sleep(1 + retried ^ 2)\n    get_usdeur(retried = retried + 1)\n  })\n}\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... with better currency formatter\u003c/summary\u003e\n\n```r\nround(btcusdt * usdeur * BITCOINS)\nformat(btcusdt * usdeur * BITCOINS, big.mark = ',', digits = 10)\nformat(btcusdt * usdeur * BITCOINS, big.mark = ',', digits = 6)\n\nlibrary(scales)\ndollar(btcusdt * usdeur * BITCOINS)\ndollar(btcusdt * usdeur * BITCOINS, prefix = '€')\ndollar(btcusdt * usdeur * BITCOINS, prefix = '', suffix = ' EUR')\n\neuro \u003c- function(x) {\n  dollar(x, prefix = '€')\n}\neuro(get_bitcoin_price() * get_usdeur() * BITCOINS)\n```\n\n\u003c/details\u003e\n\n### Move helpers to a new R package\n\n1. Click File / New Project / New folder and create a new R package (maybe call it `mr`, also create a git repo for it) -- that will fill in your newly created folder with a package skeleton delivering the `hello` function in the `hello.R` file.\n\n2. Get familiar with:\n\n    * the `DESCRIPTION` file\n\n        * semantic versioning: https://semver.org\n        * open-source license, see eg http://r-pkgs.had.co.nz/description.html#license or https://rstats-pkgs.readthedocs.io/en/latest/licensing.html\n\n    * the `R` subfolder\n    * the `man` subfolder\n    * the `NAMESPACE` file\n\n3. Install the package (in the Build menu), load it and try `hello()`, then `?hello`\n4. 
Create a git repo (if you have not done that already) and add/commit this package skeleton\n5. Add a new function called `euro` in the `R` subfolder:\n\n    \u003cdetails\u003e\n      \u003csummary\u003e\u003ccode\u003eeuro.R\u003c/code\u003e\u003c/summary\u003e\n\n    ```r\n    euro \u003c- function(x) {\n      dollar(x, prefix = '€')\n    }\n    ```\n\n    \u003c/details\u003e\n\n6. Install the package, re-load it, and try running `euro`, e.g. by calling it on `42` -- realize it's failing\n7. After loading the `scales` package (that delivers the `dollar` function), it works ... we need to prepare our package to load `scales::dollar` without user intervention\n8. Also, look at the docs of `euro` -- realize they are missing, so let's learn about `roxygen2` and update the `euro.R` file to explicitly list the function to be exported and note that `dollar` is to be imported from the `scales` package:\n\n    \u003cdetails\u003e\n      \u003csummary\u003e\u003ccode\u003eeuro.R\u003c/code\u003e\u003c/summary\u003e\n\n    ```r\n    #' Formats number in EUR currency\n    #' @param x number\n    #' @return string\n    #' @export\n    #' @importFrom scales dollar\n    #' @examples\n    #' euro(1000)\n    #' euro(10.3241245125125)\n    euro \u003c- function(x) {\n      dollar(x, prefix = '€')\n    }\n    ```\n\n    \u003c/details\u003e\n\n9. Run `roxygen2` on the package by enabling it in the \"Build\" menu's \"Configure Build Tools\", then \"Document\" it (if there's no such option, you probably need to install the `roxygen2` package first), and make sure to check what changes happened in the `man`, `NAMESPACE` (note that you might need to delete the original one) and `DESCRIPTION` files. It's also a good idea to automatically run `roxygen2` before each install, so I'd suggest marking that option as well. 
The resulting files should look something like:\n\n    \u003cdetails\u003e\n      \u003csummary\u003e\u003ccode\u003eDESCRIPTION\u003c/code\u003e\u003c/summary\u003e\n\n    ```\n    Package: mr\n    Type: Package\n    Title: Demo R package for the Mastering R class\n    Version: 0.1.0\n    Author: Gergely \u003c***@***.***\u003e\n    Maintainer: Gergely \u003c***@***.***\u003e\n    Description: Demo R package for the Mastering R class\n    License: AGPL\n    Encoding: UTF-8\n    LazyData: true\n    RoxygenNote: 7.1.0\n    Imports: scales\n    ```\n\n    \u003c/details\u003e\n\n    \u003cdetails\u003e\n      \u003csummary\u003e\u003ccode\u003eNAMESPACE\u003c/code\u003e\u003c/summary\u003e\n\n    ```\n    # Generated by roxygen2: do not edit by hand\n\n    export(euro)\n    importFrom(scales,dollar)\n    ```\n\n    \u003c/details\u003e\n\n10. Keep committing to the git repo\n11. Delete `hello.R` and rerun `roxygen2` / reinstall the package\n12. Add a new function that gets the most recent exchange rate for USD/EUR:\n\n    \u003cdetails\u003e\n      \u003csummary\u003e\u003ccode\u003econverter.R\u003c/code\u003e\u003c/summary\u003e\n\n    ```r\n    #' Look up the value of a US Dollar in Euros\n    #' @param retried number of times the function already failed\n    #' @return number\n    #' @export\n    #' @importFrom jsonlite fromJSON\n    #' @importFrom logger log_error log_info\n    #' @importFrom checkmate assert_number\n    get_usdeur \u003c- function(retried = 0) {\n      ## return from inside tryCatch, so that a retried call's value is passed back\n      tryCatch({\n        ## httr\n        usdeur \u003c- fromJSON('https://api.exchangerate.host/latest?base=USD\u0026symbols=EUR')$rates$EUR\n        assert_number(usdeur, lower = 0.9, upper = 1.1)\n        log_info('1 USD={usdeur} EUR')\n        usdeur\n      }, error = function(e) {\n        ## str(e)\n        log_error(e$message)\n        if (retried \u003e 3) {\n          stop('Gave up')\n        }\n        Sys.sleep(1 + retried ^ 2)\n        get_usdeur(retried = retried + 1)\n      })\n    }\n    ```\n\n    
\u003c/details\u003e\n\n13. Now you can run the original R script hitting the Binance and exchangerate.host APIs by using these helper functions:\n\n```r\nlibrary(binancer)\nlibrary(logger)\nlog_threshold(TRACE)\nlibrary(scales)\nlibrary(mr)\n\nBITCOINS \u003c- 0.42\nlog_info('Number of Bitcoins: {BITCOINS}')\n\nusdeur \u003c- get_usdeur()\n\nbtcusd \u003c- binance_coins_prices()[symbol == 'BTC', usd]\nlog_info('1 BTC={dollar(btcusd)}')\n\nlog_info('My crypto fortune is {euro(BITCOINS * btcusd * usdeur)}')\n```\n\n14. Make sure that the R package works as intended, and then push to GitHub.\n\n### Recap of week 1\n\n* writing helper functions\n* API integrations\n* documenting helper functions\n* creating an R package from helper functions\n\n### Homework for week 1 gotchas\n\n* easy to mess up copy/paste\n* make sure to test your function in a clean environment\n* [import `data.table`](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-importing.html) if a package needs it!\n\nThe homework has been published at https://github.com/daroczig/CEU-R-mastering-demo-pkg/tree/76b283914380f05e0ddfdb44b98fe6560d86dc02\n\nLet's fork the above repository and continue working on that from now on,\nso that later we can also prepare a pull request for the main repo!\n\nYou can also install the above version of `mr` via:\n\n```r\ndevtools::install_github('daroczig/CEU-R-mastering-demo-pkg')\n```\n\n### Replace the home-brew retry with something better maintained\n\nCheck out how `purrr::insistently` works!\n\n1. Import the `insistently` function from `purrr` with a roxygen tag\n2. Add `purrr` to the `Imports` of your `DESCRIPTION` file\n3. Drop the `tryCatch` handler and let the function fail on error\n4. Wrap your function with `insistently`\n5. 
Optionally enable reporting on errors via setting the `quiet` flag to `FALSE`\n\n```r\n#' Look up the current price of a Bitcoin in USD\n#' @return number\n#' @export\n#' @importFrom binancer binance_coins_prices\n#' @importFrom logger log_error log_info\n#' @importFrom checkmate assert_number\n#' @import data.table\n#' @importFrom purrr insistently\nget_bitcoin_price \u003c- insistently(function() {\n    if (runif(1) \u003e 0.5) stop('oh nooo') # simulates a flaky API; TODO drop\n    btcusdt \u003c- binance_coins_prices()[symbol == 'BTC', usd]\n    assert_number(btcusdt, lower = 1000)\n    log_info('The current Bitcoin price is ${btcusdt}')\n    btcusdt\n}, quiet = FALSE)\n```\n\n### Speed up flaky API calls with caching\n\nCheck out how `memoise::memoise` works! Make sure to set a TTL (time to live) for the cached value ... crypto markets are changing rapidly :)\n\n1. Import the `memoise` function from `memoise` with a roxygen tag\n2. Add `memoise` to the `Imports` of your `DESCRIPTION` file\n3. Wrap your function with `memoise`\n4. Look up the `cache_mem` function of the `cachem` package mentioned in the `memoise` docs\n5. Set up a custom cache with a 5-second TTL by passing `cache_mem(max_age = 5)` as the `cache` argument of `memoise`, and make sure to do the related imports properly: add a roxygen tag to import `cache_mem` from `cachem` and add `cachem` in the `DESCRIPTION` file\n6. 
Indent your code so that it is clear which argument belongs to which function\n\n```r\n#' Look up the current price of a Bitcoin in USD\n#' @return number\n#' @export\n#' @importFrom binancer binance_coins_prices\n#' @importFrom logger log_error log_info\n#' @importFrom checkmate assert_number\n#' @import data.table\n#' @importFrom purrr insistently\n#' @importFrom memoise memoise\n#' @importFrom cachem cache_mem\nget_bitcoin_price \u003c- memoise(\n    insistently(\n        function() {\n            btcusdt \u003c- binance_coins_prices()[symbol == 'BTC', usd]\n            assert_number(btcusdt, lower = 1000)\n            log_info('The current Bitcoin price is ${btcusdt}')\n            btcusdt\n        },\n        quiet = FALSE),\n    cache = cache_mem(max_age = 5)\n)\n```\n\n### Report on the price of 0.42 BTC in the past 30 days\n\nLet's do the same report as above, but instead of reporting the most recent value of the asset, let's report on the daily values from the past 30 days, e.g. on a line plot.\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... 
with fixed USD/EUR exchange rate\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(httr)\nlibrary(data.table)\nlibrary(logger)\nlibrary(ggplot2)\nlibrary(mr)\n\n## ########################################################\n## CONSTANTS\n\nBITCOINS \u003c- 0.42\n\n## ########################################################\n## Loading data\n\nusdeur \u003c- get_usdeur()\n\nbtcusdt \u003c- binance_klines('BTCUSDT', interval = '1d', limit = 30)\nstr(btcusdt)\n\nbalance \u003c- btcusdt[, .(date = as.Date(close_time), btcusd = close)]\nstr(balance)\n\nbalance[, btceur := btcusd * usdeur]\nbalance[, btc := BITCOINS]\nbalance[, value := btc * btceur]\nstr(balance)\n\n## ########################################################\n## Report\n\nggplot(balance, aes(date, value)) +\n  geom_line() +\n  xlab('') +\n  ylab('') +\n  scale_y_continuous(labels = euro) +\n  theme_bw() +\n  ggtitle('My crypto fortune',\n          subtitle = paste(BITCOINS, 'BTC'))\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ... 
with daily corrected USD/EUR exchange rate\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(httr)\nlibrary(jsonlite)\nlibrary(data.table)\nlibrary(logger)\nlibrary(scales)\nlibrary(ggplot2)\nlibrary(mr)\n\n## ########################################################\n## CONSTANTS\n\nBITCOINS \u003c- 0.42\n\n## ########################################################\n## Loading data\n\nusdeur \u003c- get_usdeur()\n\n## try with a single date?\nfromJSON('https://api.exchangerate.host/2023-05-01?base=USD\u0026symbols=EUR')\n## no, it's just a single day\n# fromJSON('https://api.exchangerate.host/timeseries?start_date=2023-05-01\u0026base=USD\u0026symbols=EUR')\n## need end\nfromJSON('https://api.exchangerate.host/timeseries?start_date=2023-05-01\u0026end_date=2023-05-05\u0026base=USD\u0026symbols=EUR')\n## we can do a much better job!\n\nlibrary(httr)\nresponse \u003c- GET(\n  'https://api.exchangerate.host/timeseries',\n  query = list(\n    start_date = Sys.Date() - 30,\n    end_date   = Sys.Date(),\n    base       = 'USD',\n    symbols    = 'EUR'\n  ))\nexchange_rates \u003c- content(response)\nstr(exchange_rates)\nexchange_rates \u003c- exchange_rates$rates\n\nlibrary(data.table)\nusdeur \u003c- data.table(\n  date = as.Date(names(exchange_rates)),\n  usdeur = as.numeric(unlist(exchange_rates)))\nstr(usdeur)\n## NOTE last element might be an empty list if early in the day ...\n##      query yesterday or drop last row when this occurs\n\n## Bitcoin price in USD\nbtcusdt \u003c- binance_klines('BTCUSDT', interval = '1d', limit = 30)\nstr(btcusdt)\n\nbalance \u003c- btcusdt[, .(date = as.Date(close_time), btcusd = close)]\nstr(balance)\nstr(usdeur)\n\nbalance \u003c- merge(balance, usdeur, by = 'date')\nbalance[, btceur := btcusd * usdeur]\nbalance[, btc := BITCOINS]\nbalance[, value := btc * btceur]\n\n## ########################################################\n## Report\n\nggplot(balance, aes(date, value)) +\n  geom_line() +\n  xlab('') +\n  ylab('') +\n  scale_y_continuous(labels 
= euro) +\n  theme_bw() +\n  ggtitle('My crypto fortune',\n          subtitle = paste(BITCOINS, 'BTC'))\n```\n\n\u003c/details\u003e\n\nNow let's create the `get_usdeurs` function (similar to `get_usdeur`) to take start and end dates! Although we can default both the start and end dates to today, so that it covers the same day as `get_usdeur` and the latter could be deprecated, note that this new function returns a `data.frame` or `data.table` object instead of a single number, so there's value in keeping the previous one as well.\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003ccode\u003eexchange_rates.R\u003c/code\u003e\u003c/summary\u003e\n\n```r\n#' Look up the value of a US Dollar in Euro\n#' @param start_date date\n#' @param end_date date\n#' @return \\code{data.table} object with dates and values\n#' @export\n#' @importFrom httr GET content\n#' @importFrom logger log_error log_info\n#' @importFrom checkmate assert_numeric\n#' @importFrom data.table data.table\n#' @importFrom purrr insistently\n#' @importFrom memoise memoise\nget_usdeurs \u003c- memoise(\n    insistently(\n        function(start_date = Sys.Date(), end_date = Sys.Date()) {\n            response \u003c- GET(\n                'https://api.exchangerate.host/timeseries',\n                query = list(\n                    start_date = start_date,\n                    end_date   = end_date,\n                    base       = 'USD',\n                    symbols    = 'EUR'\n                )\n            )\n            exchange_rates \u003c- content(response)$rates\n            usdeur \u003c- data.table(\n                date = as.Date(names(exchange_rates)),\n                usdeur = as.numeric(unlist(exchange_rates)))\n            assert_numeric(usdeur$usdeur, lower = 0.9, upper = 1.1)\n            usdeur\n        }\n    )\n)\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003ccode\u003eCleaned up R script using the above helper 
function\u003c/code\u003e\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(data.table)\nlibrary(ggplot2)\nlibrary(mr)\n\n## ########################################################\n## CONSTANTS\n\nBITCOINS \u003c- 0.42\n\n## ########################################################\n## Loading data\n\nusdeurs \u003c- get_usdeurs(Sys.Date() - 30, Sys.Date())\nbtcusdt \u003c- binance_klines('BTCUSDT', interval = '1d', limit = 30)\nbalance \u003c- btcusdt[, .(date = as.Date(close_time), btcusd = close)]\n\nbalance \u003c- merge(balance, usdeurs, by = 'date')\nbalance[, btceur := btcusd * usdeur]\nbalance[, btc := BITCOINS]\nbalance[, value := btc * btceur]\n\n## ########################################################\n## Report\n\nggplot(balance, aes(date, value)) +\n  geom_line() +\n  xlab('') +\n  ylab('') +\n  scale_y_continuous(labels = euro) +\n  theme_bw() +\n  ggtitle('My crypto fortune',\n          subtitle = paste(BITCOINS, 'BTC'))\n```\n\n\u003c/details\u003e\n\n### Make sure our helper functions work!\n\nMake sure to consult the related chapter of Hadley Wickham's \"R packages\" book at http://r-pkgs.had.co.nz, but in short:\n\n0. Load the `usethis` package to scaffold the boring parts of setting up unit tests.\n1. Run `use_testthat` to configure the package for unit testing with `testthat`. This will update the `DESCRIPTION` file, create the `tests/testthat` folder and the `tests/testthat.R` file.\n2. Run `use_test('euro')` to generate `tests/testthat/test-euro.R`.\n3. Edit the `test-euro.R` file to write an actual test:\n\n    ```r\n    test_that(\"euro sign added\", {\n      expect_equal(euro(2), '€2')\n    })\n    ```\n\n4. Run the test via `devtools::test()`\n5. Check test coverage via `devtools::test_coverage()`\n\nCheck out some of the relevant advanced topics, e.g.\n\n* implementing automatically running the tests via GitHub Actions for future pushes in the repo\n* mock API calls, see e.g. 
https://r-pkgs.org/testing-advanced.html#mocking\n\n### Recap of week 2\n\n* revisit retries and caching\n* further API integrations\n* unit testing\n\nIt is recommended to install the current version of `mr`:\n\n```r\ndevtools::install_github('daroczig/CEU-R-mastering-demo-pkg@week2')\n```\n\n### Homework for week 2 gotchas\n\n[![A QA engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 99999999999 beers. Orders a lizard. Orders -1 beers. Orders a ueicbksjdhd. First real customer walks in and asks where the bathroom is. The bar bursts into flames, killing everyone.](https://github.com/daroczig/CEU-R-mastering/assets/495736/18d88b52-fd09-4ee0-91a1-8b96e062f89c)](https://twitter.com/brenankeller/status/1068615953989087232)\n\n### Report on the price of 0.42 BTC and 1.2 ETH in the past 30 days\n\nLet's do the same report as above, but now we not only have 0.42 Bitcoin, but 1.2 Ethereum as well.\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution ...\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(data.table)\nlibrary(ggplot2)\nlibrary(mr)\n\n## ########################################################\n## CONSTANTS\n\nBITCOINS  \u003c- 0.42\nETHEREUMS \u003c- 1.2\n\n## ########################################################\n## Loading data\n\nusdeurs \u003c- get_usdeurs(start_date = Sys.Date() - 30, end_date = Sys.Date())\n\n## Cryptocurrency prices in USD\nbtcusdt \u003c- binance_klines('BTCUSDT', interval = '1d', limit = 30)\nethusdt \u003c- binance_klines('ETHUSDT', interval = '1d', limit = 30)\ncoinusdt \u003c- rbind(btcusdt, ethusdt)\nstr(coinusdt)\n## oh no, how to keep the symbol??\n## (placeholder only, commented out as `???` is not valid R)\n## coinusdt[, .(date = as.Date(close_time), btcusd = close, symbol = ???)]\n\n## DRY (don't repeat yourself)\nbalance \u003c- rbindlist(lapply(c('BTC', 'ETH'), function(s) {\n  binance_klines(paste0(s, 'USDT'), interval = '1d', limit = 30)[, .(\n    date = as.Date(close_time),\n    usdt = close,\n    symbol = s\n  )]\n}))\n\nbalance 
\u003c- balance[, amount := switch(\n  symbol,\n  'BTC' = BITCOINS,\n  'ETH' = ETHEREUMS,\n  stop('Unsupported coin')),\n  by = symbol]\nstr(balance)\n\nbalance \u003c- merge(balance, usdeurs, by = 'date')\nbalance[, value := amount * usdt * usdeur]\nstr(balance)\n\n## ########################################################\n## Report\n\nggplot(balance, aes(date, value, fill = symbol)) +\n  geom_col() +\n  xlab('') +\n  ylab('') +\n  scale_y_continuous(labels = euro) +\n  theme_bw() +\n  ggtitle(\n    'My crypto fortune',\n    subtitle = balance[date == max(date), paste(paste(amount, symbol), collapse = ' + ')])\n```\n\n\u003c/details\u003e\n\n### Report on the price of cryptocurrency assets read from a database\n\n1. 💪 Create a new MySQL database at Amazon AWS, and don't forget to set an \"initial database name\" and make it publicly accessible.\n\n2. Log in and give it a try with the MySQL client:\n\n    ```shell\n    mysql -h mr.cf27iwlo5bzr.eu-west-1.rds.amazonaws.com -u admin -p\n    ```\n\n    Look around:\n\n    ```sql\n    show databases;\n    use crypto;\n    show tables;\n    desc coins;\n    select * FROM coins;\n    ```\n\n3. 💪 Install `dbr` from GitHub:\n\n    ```r\n    library(devtools)\n    install_github('daroczig/logger')\n    install_github('daroczig/dbr')\n    ```\n\n4. 💪 Install `botor` as well to be able to use encrypted credentials (note that this requires you to install Python first and then `pip install boto3` as well):\n\n    ```r\n    install_github('daroczig/botor')\n    ```\n\n5. Set up a YAML file (menu: new file/text file, save as `databases.yml`) for the database connection, something like:\n\n   ```yaml\n   remotemysql:\n     host: ...\n     port: 3306\n     dbname: ...\n     user: ...\n     drv: !expr RMySQL::MySQL()\n     password: ...\n   ```\n\n6. Set up `dbr` to use that YAML file:\n\n    ```r\n    options('dbr.db_config_path' = '/path/to/databases.yml')\n    ```\n\n7. 
💪 Create a table for the balances and insert some records:\n\n    ```r\n    library(dbr)\n    db_config('remotemysql')\n    db_query('CREATE TABLE coins (symbol VARCHAR(3) NOT NULL, amount DOUBLE NOT NULL DEFAULT 0)', 'remotemysql')\n    db_query('TRUNCATE TABLE coins', 'remotemysql')\n    db_query('INSERT INTO coins VALUES (\"BTC\", 0.42)', 'remotemysql')\n    db_query('INSERT INTO coins VALUES (\"ETH\", 1.2)', 'remotemysql')\n    ```\n\n8. Write the reporting script, something like:\n\n    \u003cdetails\u003e\n      \u003csummary\u003eClick here for a potential solution ...\u003c/summary\u003e\n\n    ```r\n    library(binancer)\n    library(data.table)\n    library(logger)\n    library(ggplot2)\n    library(mr)\n\n    library(dbr)\n    options('dbr.db_config_path' = '/path/to/databases.yml')\n    options('dbr.output_format' = 'data.table')\n\n    ## ########################################################\n    ## Loading data\n\n    ## Read actual balances from the DB\n    balance \u003c- db_query('SELECT * FROM coins', 'remotemysql')\n\n    ## Look up cryptocurrency prices in USD and merge balances\n    balance \u003c- rbindlist(lapply(balance$symbol, function(s) {\n      binance_klines(paste0(s, 'USDT'), interval = '1d', limit = 30)[, .(\n        date = as.Date(close_time),\n        usdt = close,\n        symbol = s,\n        amount = balance[symbol == s, amount]\n      )]\n    }))\n\n    ## USD in EUR\n    usdeurs \u003c- get_usdeurs(start_date = Sys.Date() - 30, end_date = Sys.Date())\n\n    ## join USD/EUR exchange rate to balances\n    balance \u003c- merge(balance, usdeurs, by = 'date')\n    balance[, value := amount * usdt * usdeur]\n\n    ## ########################################################\n    ## Report\n\n    ggplot(balance, aes(date, value, fill = symbol)) +\n      geom_col() +\n      xlab('') +\n      ylab('') +\n      scale_y_continuous(labels = euro) +\n      theme_bw() +\n      ggtitle(\n        'My crypto fortune',\n        subtitle = 
balance[date == max(date), paste(paste(amount, symbol), collapse = ' + ')])\n    ```\n\n    \u003c/details\u003e\n\n9. Rerun the above report after inserting two new records to the table:\n\n    ```r\n    db_query(\"INSERT INTO coins VALUES ('NEO', 100)\", 'remotemysql')\n    db_query(\"INSERT INTO coins VALUES ('LTC', 25)\", 'remotemysql')\n    ```\n\n### Report on the price of cryptocurrency assets based on the transaction history read from a database\n\n💪 Let's prepare the transactions table:\n\n```r\nlibrary(dbr)\noptions('dbr.db_config_path' = '/path/to/database.yml')\noptions('dbr.output_format' = 'data.table')\n\ndb_query('\n  CREATE TABLE transactions (\n    date TIMESTAMP NOT NULL,\n    symbol VARCHAR(3) NOT NULL,\n    amount DOUBLE NOT NULL DEFAULT 0)',\n  db = 'remotemysql')\n\ndb_query('TRUNCATE TABLE transactions', 'remotemysql')\ndb_query('INSERT INTO transactions VALUES (\"2023-05-11 10:42:02\", \"BTC\", 1.42)', 'remotemysql')\ndb_query('INSERT INTO transactions VALUES (\"2023-05-11 10:45:20\", \"ETH\", 1.2)', 'remotemysql')\ndb_query('INSERT INTO transactions VALUES (\"2023-05-18\", \"BTC\", -1)', 'remotemysql')\ndb_query('INSERT INTO transactions VALUES (\"2023-05-23\", \"NEO\", 100)', 'remotemysql')\ndb_query('INSERT INTO transactions VALUES (\"2023-05-30 12:12:21\", \"LTC\", 25)', 'remotemysql')\n```\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here for a potential solution for the report ...\u003c/summary\u003e\n\n```r\nlibrary(binancer)\nlibrary(data.table)\nlibrary(logger)\nlibrary(ggplot2)\nlibrary(zoo)\nlibrary(mr)\n\n## ########################################################\n## Loading data\n\n## Read transactions from the DB\ntransactions \u003c- db_query('SELECT * FROM transactions', 'remotemysql')\n\n## Prepare daily balance sheets\nbalance \u003c- transactions[, .(date = as.Date(date), amount = cumsum(amount)), by = symbol]\nbalance\n\n## Transform long table into wide\nbalance \u003c- dcast(balance, date ~ 
symbol)\nbalance\n\n## Add missing dates\ndates \u003c- data.table(date = seq(from = Sys.Date() - 30, to = Sys.Date(), by = '1 day'))\nbalance \u003c- merge(balance, dates, by = 'date', all.x = TRUE, all.y = TRUE)\nbalance\n\n## Fill in missing values between actual balances\nbalance \u003c- na.locf(balance, na.rm = FALSE)\n\n## Fill in remaining missing values with zero\nbalance[is.na(balance)] \u003c- 0\n\n## Transform wide table back to long format\nbalance \u003c- melt(balance, id.vars = 'date', variable.name = 'symbol', value.name = 'amount')\nbalance\n\n## Get crypto prices\nprices \u003c- rbindlist(lapply(as.character(unique(balance$symbol)), function(s) {\n    binance_klines(paste0(s, 'USDT'), interval = '1d', limit = 30)[\n      , .(date = as.Date(close_time), symbol = s, usdt = close)]\n}))\nbalance \u003c- merge(balance, prices, by = c('date', 'symbol'), all.x = TRUE, all.y = FALSE)\n\n## USD in EUR\nusdeurs \u003c- get_usdeurs(start_date = Sys.Date() - 30, end_date = Sys.Date())\n\n## join USD/EUR exchange rate to balances and compute daily values in EUR\nbalance \u003c- merge(balance, usdeurs, by = 'date')\nbalance[, value := amount * usdt * usdeur]\n\n## ########################################################\n## Report\n\nggplot(balance, aes(date, value, fill = symbol)) +\n    geom_col() +\n    ylab('') + scale_y_continuous(labels = euro) +\n    xlab('') +\n    theme_bw() +\n    ggtitle(\n        'My crypto fortune',\n        subtitle = balance[date == max(date), paste(paste(amount, symbol), collapse = ' + ')])\n```\n\n\u003c/details\u003e\n\n## Profiling, benchmarks\n\nBreaking down a single run of the `get_usdeur` function to see which component is slow and takes up resources:\n\n```r\nlibrary(mr)\n\nlibrary(profvis)\nprofvis({\n  get_usdeur()\n})\n\n## same call, sampled at a finer interval\nprofvis({\n  get_usdeur()\n}, interval = 0.005)\n```\n\nA more realistic example: is `ggplot2` indeed slow when generating scatter plots on a dataset 
with a larger number of observations?\n\nNote that the first run also includes the time spent on the `library` call, so run it again for a fair picture.\n\n```r\nprofvis({\n  library(ggplot2)\n  x \u003c- ggplot(diamonds, aes(price, carat)) + geom_point()\n  print(x)\n})\n\nsystem.time(x \u003c- ggplot(diamonds, aes(price, carat)) + geom_point())\n```\n\nPipe VS Bracket:\n\n```r\nlibrary(data.table)\nlibrary(dplyr)\ndt \u003c- data.table(diamonds)\nprofvis({\n  dt[, sum(carat), by = color][order(color)]\n  group_by(dt, color) %\u003e% summarise(price = sum(carat))\n})\n## runs too quickly for profiling ...\n\nlibrary(microbenchmark)\nresults \u003c- microbenchmark(\n  aggregate(dt$carat, by = list(dt$color), FUN = sum),\n  dt[, sum(carat), by = color][order(color)],\n  group_by(dt, color) %\u003e% summarise(price = sum(carat)),\n  times = 100)\n\nresults\nplot(results)\nautoplot(results)\n\nlibrary(bench)\n## bench::mark requires the results of all expressions to be equal\nresults \u003c- bench::mark(\n  as.data.frame(dt[, .(price = sum(carat)), by = color][order(color)]),\n  as.data.frame(group_by(dt, color) %\u003e% summarize(price = sum(carat)))\n)\n\nresults\nautoplot(results)\n\n## revisit benchmarking creating and printing ggplot\nresults \u003c- microbenchmark(\n  x \u003c- ggplot(diamonds, aes(price, carat)) + geom_point(),\n  print(x),\n  times = 10)\n```\n\nAlso check out `dtplyr`!\n\nMore examples at https://rstudio.github.io/profvis/examples.html\n\n## Reporting exercises\n\n### Connecting to and exploring the SQLite database\n\nDownload and extract the database file:\n\n```r\n## download database file\ndownload.file('http://bit.ly/CEU-R-ecommerce', 'ecommerce.zip', mode = 'wb')\nunzip('ecommerce.zip')\n```\n\nInstall the SQLite client on your operating system and then use the `sqlite3 ecommerce.sqlite3` command to enter the command-line SQLite client to browse the database:\n\n```sql\n-- list tables in the database\n.tables\n-- show the structure of the sales table\n.schema sales\n-- show the first 5 rows of the 
table\nselect * from sales limit 5;\n-- tweak how the rows are shown\n.headers on\n.mode column\nselect * from sales limit 5;\n\n-- count number of rows in the table\nSELECT COUNT(*) FROM sales;\n\n-- count number of rows in January 2011 (SQLite lacks proper date/time handling, so compare rebuilt YYYYMMDD strings)\nSELECT COUNT(*)\nFROM sales\nWHERE SUBSTR(InvoiceDate, 7, 4) || SUBSTR(InvoiceDate, 1, 2) || SUBSTR(InvoiceDate, 4, 2)\n      BETWEEN '20110101' AND '20110131';\n\n-- check on the date format\nSELECT InvoiceDate FROM sales ORDER BY random() LIMIT 25;\n\n-- count the number of rows per month\nSELECT\n  SUBSTR(InvoiceDate, 7, 4) || SUBSTR(InvoiceDate, 1, 2) AS month,\n  COUNT(*)\nFROM sales\nGROUP BY month\nORDER BY month;\n```\n\nLet's switch to R!\n\n### Connect to SQLite from R\n\nCreate a database config file for the `dbr` package:\n\n```yaml\necommerce:\n  drv: !expr RSQLite::SQLite()\n  dbname: /path/to/ecommerce.sqlite3\n```\n\nUpdate your `dbr` settings to use the config file:\n\n```r\nlibrary(dbr)\noptions('dbr.db_config_path' = '/path/to/database.yml')\noptions('dbr.output_format' = 'data.table')\n\nsales \u003c- db_query('SELECT * FROM sales', 'ecommerce')\nstr(sales)\n\n## explore and fix the invoice date column\nsales[, sample(InvoiceDate, 25)]\nsales[, InvoiceDate := as.POSIXct(InvoiceDate, format = '%m/%d/%Y %H:%M')]\n## see fasttime::fastPOSIXct\n\n## number of sales per month like in SQL\nlibrary(lubridate)\nsales[, .N, by = month(InvoiceDate)]\nsales[, .N, by = year(InvoiceDate)]\nsales[, .N, by = paste(year(InvoiceDate), month(InvoiceDate))]\n# slow\nsales[, .N, by = as.character(InvoiceDate, format = '%Y %m')]\n# smart\nsales[, .N, by = floor_date(InvoiceDate, 'month')]\n\nsystem.time(sales[, .N, by = as.character(InvoiceDate, format = '%Y %m')])\nsystem.time(sales[, .N, by = floor_date(InvoiceDate, 'month')])\n\nlibrary(microbenchmark)\nmicrobenchmark(\n  sales[, .N, by = as.character(InvoiceDate, format = '%Y %m')],\n  sales[, .N, by = floor_date(InvoiceDate, 
'month')],\n  times = 10)\n\n## number of items per country\nsales[, .N, by = Country]\nsales[, .N, by = Country][order(-N)]\n```\n\n### Aggregate transaction items into invoice summary\n\n```r\ninvoices \u003c- sales[, .(date = min(as.Date(InvoiceDate)),\n                      value  = sum(Quantity * UnitPrice)),\n                  by = .(invoice = InvoiceNo, customer = CustomerID, country = Country)]\n\ndb_insert(invoices, 'invoices', 'ecommerce')\n```\n\nCheck the structure of the newly (and automatically) created table using the command-line SQLite client:\n\n```sql\n.schema invoices\n```\n\nCheck the date column after reading back from the database:\n\n```r\ninvoices \u003c- db_query('SELECT * FROM invoices', 'ecommerce')\nstr(invoices)\n\ninvoices[, date := as.Date(date, origin = '1970-01-01')]\n```\n\n### Report the daily revenue in Excel\n\n```r\nrevenue \u003c- invoices[, .(revenue = sum(value)), by = date]\n\nlibrary(openxlsx)\nwb \u003c- createWorkbook()\nsheet \u003c- 'Revenue'\naddWorksheet(wb, sheet)\nwriteData(wb, sheet, revenue)\n\n## open for quick check\nopenXL(wb)\n\n## write to a file to be sent in an e-mail, uploaded to Slack or as a Google Spreadsheet etc.\nfilename \u003c- tempfile(fileext = '.xlsx')\nsaveWorkbook(wb, filename)\nunlink(filename)\n\n## static file name\nfilename \u003c- 'report.xlsx'\nsaveWorkbook(wb, filename)\n```\n\nTweak that spreadsheet:\n\n```r\nfreezePane(wb, sheet, firstRow = TRUE)\n\nsetColWidths(wb, sheet, 1:ncol(revenue), 'auto')\n\npoundStyle \u003c- createStyle(numFmt = '£0,000.00')\naddStyle(wb, sheet = sheet, poundStyle,\n         gridExpand = TRUE, cols = 2, rows = (1:nrow(revenue)) + 1, stack = TRUE)\n\ngreenStyle \u003c- createStyle(fontColour = \"#00FF00\") # previously? 
fgFill = \"#00FF00\"\nconditionalFormatting(wb, sheet, cols = 2,\n                      rows = 2:(nrow(revenue) + 1),\n                      rule = '$B2\u003e66788.35', style = greenStyle)\n\nstandardStyle \u003c- createStyle()\nconditionalFormatting(wb, sheet, cols = 2,\n                      rows = 2:(nrow(revenue) + 1),\n                      rule = '$B2\u003c=66788.35', style = standardStyle)\n```\n\nAdd a plot:\n\n```r\naddWorksheet(wb, 'Plot')\n\nlibrary(ggplot2)\nlibrary(ggthemes)\nggplot(revenue, aes(date, revenue)) + geom_line() + theme_excel()\n\ninsertPlot(wb, 'Plot')\n\nsaveWorkbook(wb, filename)\nsaveWorkbook(wb, filename,  overwrite = TRUE)\n```\n\n### Report the monthly revenue and daily breakdowns in Excel\n\n```r\nlibrary(lubridate)\nmonthly \u003c- invoices[, .(value = sum(value)), by = .(month = floor_date(date, 'month'))]\n\nlibrary(openxlsx)\nwb \u003c- createWorkbook()\nsheet \u003c- 'Summary'\naddWorksheet(wb, sheet)\nwriteData(wb, sheet, monthly)\n\nfor (month in as.character(monthly$month)) {\n  revenue \u003c- invoices[floor_date(date, 'month') == month,\n                      .(revenue = sum(value)), by = date]\n  addWorksheet(wb, as.character(month))\n  writeData(wb, month, revenue)\n}\n\nsaveWorkbook(wb, 'monthly-report.xlsx')\n```\n\n### Report on the top 10 customers in a Google Spreadsheet\n\n```r\ntop10 \u003c- sales[!is.na(CustomerID),\n               .(revenue = sum(UnitPrice * Quantity)), by = CustomerID][order(-revenue)][1:10]\n\nlibrary(openxlsx)\nwb \u003c- createWorkbook()\nsheet \u003c- 'Top Customers'\naddWorksheet(wb, sheet)\nwriteData(wb, sheet, top10)\nt \u003c- tempfile(fileext = '.xlsx')\nsaveWorkbook(wb, t)\n\n## upload file\nlibrary(googledrive)\ndrive_auth()\n## NOTE you can clean up credentials in ~/.R/gargle/gargle-oauth\ndrive_upload(media = t, name = 'top customers', path = 'ceu')\ndrive_update(media = t, file = 'top customers')\n\n## instead of top10, let's do top25 ... 
so appending a few rows to an already existing spreadsheet\ntop25 \u003c- sales[\n  !is.na(CustomerID),\n  .(revenue = sum(UnitPrice * Quantity)), by = CustomerID][order(-revenue)][1:25]\nlibrary(googlesheets4)\ngs4_auth()\nfor (i in 11:25) {\n  sheet_append('your.spreadsheet.id', data = top25[i])\n}\n```\n\n## Homeworks\n\n### Week 1\n\nAdd the `get_usdeur` and `get_bitcoin_price` functions to your `mr` R package (including documentation and all required imports), and push to your GitHub repo, so that you can install the package on any computer via `remotes::install_github`. Submit the URL to your GitHub repo in Moodle.\n\n### Week 2\n\nWrite unit tests for the `get_usdeurs` function, e.g. what happens when the end date is earlier than the start date, or when the dates are not valid dates. Create a pull request for the main repo, and share the URL on Moodle.\n\n### Week 3\n\nUse GitHub Actions to either run the unit tests of the package after each push to GitHub, or to build documentation and publish using GitHub Pages.\n\nReferences:\n\n* https://r-pkgs.org/software-development-practices.html#sec-sw-dev-practices-ci\n* https://r-pkgs.org/website.html\n\nShare the URL of a successful GitHub Actions run on Moodle.\n\n## Contact\n\nFile a [GitHub ticket](https://github.com/daroczig/CEU-R-mastering/issues).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaroczig%2Fceu-r-mastering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaroczig%2Fceu-r-mastering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaroczig%2Fceu-r-mastering/lists"}