{"id":23526086,"url":"https://github.com/mattcowgill/readabs","last_synced_at":"2025-05-16T06:04:20.194Z","repository":{"id":39028899,"uuid":"127828221","full_name":"MattCowgill/readabs","owner":"MattCowgill","description":"Download and tidy time series data from the Australian Bureau of Statistics in R","archived":false,"fork":false,"pushed_at":"2025-04-07T20:37:33.000Z","size":12965,"stargazers_count":104,"open_issues_count":13,"forks_count":23,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-12T03:45:37.876Z","etag":null,"topics":["abs","australia","australian-bureau-of-statistics","australian-data","statistics","tidy-data","time-series"],"latest_commit_sha":null,"homepage":"https://mattcowgill.github.io/readabs/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MattCowgill.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-04-03T00:28:46.000Z","updated_at":"2025-04-10T14:07:22.000Z","dependencies_parsed_at":"2023-02-16T12:31:50.904Z","dependency_job_id":"77d31bc6-deb1-40fd-ac84-9700e2e189ed","html_url":"https://github.com/MattCowgill/readabs","commit_stats":{"total_commits":351,"total_committers":11,"mean_commits":31.90909090909091,"dds":"0.18518518518518523","last_synced_commit":"9b2d4d7b48c9e135ea712fc2706bac9ee252ca86"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MattCowgill%2Freadabs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MattCowgil
l%2Freadabs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MattCowgill%2Freadabs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MattCowgill%2Freadabs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MattCowgill","download_url":"https://codeload.github.com/MattCowgill/readabs/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248514214,"owners_count":21116899,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abs","australia","australian-bureau-of-statistics","australian-data","statistics","tidy-data","time-series"],"created_at":"2024-12-25T19:13:34.444Z","updated_at":"2025-04-12T03:45:46.370Z","avatar_url":"https://github.com/MattCowgill.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\neditor_options: \n  chunk_output_type: console\n---\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.retina = 2,\n  fig.path = \"man/figures/README-\"\n)\n\nversion \u003c- as.vector(read.dcf(\"DESCRIPTION\")[, \"Version\"])\nversion \u003c- gsub(\"-\", \".\", version)\n```\n\n# readabs \u003cimg src=\"man/figures/logo.png\" align=\"right\" height=\"139\" /\u003e\n\u003c!-- badges: start --\u003e\n[![R build status](https://github.com/mattcowgill/readabs/workflows/R-CMD-check/badge.svg)](https://github.com/mattcowgill/readabs/actions)\n[![codecov status](https://img.shields.io/codecov/c/github/mattcowgill/readabs.svg)](https://app.codecov.io/gh/MattCowgill/readabs)\n[![CRAN status](https://www.r-pkg.org/badges/version/readabs)](https://cran.r-project.org/package=readabs)\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\n{readabs} helps you easily download, import, and tidy data from the Australian Bureau of Statistics within R. 
\nThis saves you the time spent manually downloading and tediously tidying data and allows you to spend more time on your analysis.\n\n## Installing {readabs}\n\nInstall the latest CRAN version of {readabs} with:\n\n```{r cran-installation, eval = FALSE}\ninstall.packages(\"readabs\")\n```\n\nYou can install the development version of {readabs} from GitHub with:\n```{r gh-installation, eval = FALSE}\n# if you don't have devtools installed, first run:\n# install.packages(\"devtools\")\ndevtools::install_github(\"mattcowgill/readabs\")\n```\n\n## Using {readabs}\n\nThe ABS releases data in many different formats, through many different dissemination channels.\n\nThe {readabs} package contains functions for working with three different types of ABS data:\n\n - `read_abs()` and related functions download, import, and tidy ABS time series data.\n - `download_abs_data_cube()` and related functions find and download ABS data cubes, which\n   are spreadsheets on the ABS website that are not in the standard time series format.\n - `read_api()` and related functions find, filter, and import data from the [ABS.Stat](https://dataexplorer.abs.gov.au) API.\n\n\n### ABS time series data\n\nA key function in {readabs} is `read_abs()`, which downloads, imports, and tidies time series data from the ABS website. 
**Note that `read_abs()` only works with spreadsheets in the standard ABS time series format.**\n\nFirst we'll load {readabs} and the {tidyverse}:\n```{r load-package-dev, results = FALSE, warning = FALSE, eval = TRUE, include = FALSE}\ndevtools::load_all()\nlibrary(tidyverse)\n```\n```{r load-packages, results=FALSE, warning=FALSE, eval = FALSE}\nlibrary(readabs)\nlibrary(tidyverse)\n```\n\nNow we'll create one data frame that contains all the time series data from the Wage Price Index, catalogue number 6345.0:\n\n```{r all-wpi}\nall_wpi \u003c- read_abs(\"6345.0\")\n```\n\nThis is what it looks like:\n\n```{r str-wpi}\nstr(all_wpi)\n```\n\nIt only takes you a few lines of code to make a graph from your data:\n\n```{r all-in-one-example}\nall_wpi %\u003e%\n  filter(\n    series == \"Percentage Change From Corresponding Quarter of Previous Year ;  Australia ;  Total hourly rates of pay excluding bonuses ;  Private and Public ;  All industries ;\",\n    !is.na(value)\n  ) %\u003e%\n  ggplot(aes(x = date, y = value, col = series_type)) +\n  geom_line() +\n  theme_minimal() +\n  labs(y = \"Annual wage growth (per cent)\")\n```\n\nIn the example above we downloaded all the time series from a catalogue number. This will often be overkill. If you know the data you need is in a particular table, you can just get that table like this:\n\n```{r wpi1}\nwpi_t1 \u003c- read_abs(\"6345.0\", tables = 1)\n```\n\nIf you want multiple tables, but not the whole catalogue, that's easy too:\n\n```{r wpi1_5}\nwpi_t1_t5 \u003c- read_abs(\"6345.0\", tables = c(\"1\", \"5a\"))\n```\n\nFor more examples, please see the vignette on working with time series data (run `browseVignettes(\"readabs\")`).\n\nSome other functions that may come in handy when working with ABS time series data:\n\n* `read_abs_local()` imports and tidies time series data from ABS spreadsheets stored on a local drive. 
Thanks to Hugh Parsonage for contributing to this functionality.\n* `separate_series()` splits the `series` column of a tidied ABS time series spreadsheet into multiple columns, reducing the manual wrangling that's needed to work with the data. Thanks to David Diviny for writing this function.\n\n#### Convenience functions for loading time series data\nThere are several functions that load specific ABS time series data:\n\n* `read_cpi()` imports the Consumer Price Index numbers as a two-column tibble: `date` and `cpi`. This is useful for joining to other series to adjust data for changes in consumer prices.\n* `read_awe()` returns a long time series of Average Weekly Earnings data. \n* `read_job_mobility()` downloads, imports and tidies tables from the ABS Job Mobility dataset.\n\n### ABS data cubes\n\nThe ABS (generally) releases time series data in a standard format, which allows `read_abs()` to download, import and tidy it (see above). But not all ABS data is time series data - the ABS also releases data as 'data cubes'. These are all formatted in their own, unique way. \n\nUnfortunately, because data cubes are all formatted in their own way, there is no single function that can import and tidy data cubes for you in the same way that `read_abs()` works with all time series. But `{readabs}` still has functions that can help. Thanks to David Diviny for writing these functions.\n\nThe `download_abs_data_cube()` function can download an ABS data cube for you. It works with any data cube on the ABS website. To use this function, we need two things: a `catalogue_string` (the short name of the release) and `cube`, the filename (or a unique fragment of it) of the file within the catalogue you wish to download.\n\nFor example, let's say you wanted to download table 4 from _Weekly Payroll Jobs and Wages in Australia_. 
We can find the catalogue name like this:\n\n```{r cat-name}\nsearch_catalogues(\"payroll\")\n```\n\nNow we know that the string `\"weekly-payroll-jobs\"` is the `catalogue_string` for this release. We can now see what files are available to download from this catalogue:\n\n```{r files}\nshow_available_files(\"weekly-payroll-jobs\")\n```\n\nWe want Table 4, which has the filename `6160055001_DO004.xlsx`. \n\nWe can download the file as follows:\n\n```{r download-data-cube}\npayrolls_t4_path \u003c- download_abs_data_cube(\"weekly-payroll-jobs\", \"004\")\n\npayrolls_t4_path\n```\n\nThe `download_abs_data_cube()` function downloads the file and returns the full file path to the saved file. You can then pipe that into another function:\n\n```{r read-payrolls-manual, eval = FALSE}\npayrolls_t4_path %\u003e%\n  readxl::read_excel(\n    sheet = \"Payroll jobs index\",\n    skip = 5\n  )\n```\n\n\n#### Convenience functions for data cubes\n\nAs it happens, if you want the ABS Weekly Payrolls data, you don't need to use `download_abs_data_cube()` directly. Instead, there is a convenience function available that downloads, imports, and tidies the data for you:\n\n```{r read-payrolls-fn, eval = FALSE}\nread_payrolls()\n```\n\nThere is also a convenience function available for data cube GM1 from the monthly Labour Force data, which contains labour force gross flows:\n\n```{r read-lfs-grossflows, eval = FALSE}\nread_lfs_grossflows()\n```\n\n\n### Finding and loading data from the ABS.Stat API\n\nThe ABS has created a new site to access its data, called the ABS Data Explorer, also known as ABS.Stat. As at early 2023, this site is in Beta mode. The site provides an API.\n\nThe {readabs} package includes functions to query the ABS.Stat API. Thank you to Kinto Behr for writing these functions. 
The functions are:\n\n* `read_api_dataflows()` lists available dataflows (roughly equivalent to 'tables')\n* `read_api_datastructure()` lists variables within a particular dataflow and the levels of those variables, which you can use to filter the data server-side in an API query\n* `read_api()` downloads data from the ABS.Stat API.\n\nLet's list available dataflows:\n```{r api-flows}\nflows \u003c- read_api_dataflows()\n```\n\nSay that, from this list, I am interested in the first dataflow, the projected population of \nAboriginal and Torres Strait Islander Australians. The id for this dataflow is\n`\"ABORIGINAL_POP_PROJ\"`, which I can use to download the data. \n\nIn this case, I could download the entire dataflow with:\n```{r all-aboriginal-pop}\nread_api(\"ABORIGINAL_POP_PROJ\")\n```\n\nLet's say I'm only interested in the population projections for males, not females or all persons. In that case, I can filter the data on the ABS server before downloading it. I can use `read_api_datastructure()` to help with this.\n\n\n```{r datastructure}\nread_api_datastructure(\"ABORIGINAL_POP_PROJ\")\n```\n\nFrom this, I can see that there's a variable (`var`) called `sex_abs`, which can take the value `1`, `2`, or `3`, corresponding to `Males`, `Females` and `Persons`. If I only want the data for Males, I can obtain this by supplying a datakey:\n\n```{r}\nread_api(\"ABORIGINAL_POP_PROJ\", datakey = list(sex_abs = 1))\n```\n\nNote that in some cases, querying the API without filtering the data will return an error, as the table will be too big. In this case, you will need to supply a datakey that reduces the size of the data.\n\n## Resolving network issues by manually setting the download method\n\nCertain corporate networks restrict your ability to download files in an R session. On some of these networks, the `\"wininet\"` method must be used when downloading files. 
Users can now specify the method that will be used to download files by setting the `\"R_READABS_DL_METHOD\"` environment variable. \n\nFor example, the following code sets the environment variable for your current session: \n\n```{r, eval = FALSE}\nSys.setenv(\"R_READABS_DL_METHOD\" = \"wininet\")\n```\n\nYou can add `R_READABS_DL_METHOD = \"wininet\"` to your .Renviron to have this persist across sessions.\n\nIf you have other issues using `{readabs}` in your corporate environment, I would appreciate you opening an issue on GitHub.\n\n## Bug reports and feedback\nGitHub issues containing error reports or feature requests are welcome. Please try to make a [reprex](https://reprex.tidyverse.org) (a minimal, reproducible example) if possible.\n\nAlternatively you can email the package maintainer at mattcowgill at gmail dot com.\n\n## Disclaimer\nThe `{readabs}` package is not associated with the Australian Bureau of Statistics.\nAll data is provided subject to any restrictions and licensing arrangements\nnoted on the ABS website.\n\n## Awesome Official Statistics Software\n\n[![Mentioned in Awesome Official Statistics ](https://awesome.re/mentioned-badge.svg)](https://github.com/SNStatComp/awesome-official-statistics-software)\n\nWe're pleased to be included in a [list of software](https://github.com/SNStatComp/awesome-official-statistics-software) that can be used to work with official statistics.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmattcowgill%2Freadabs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmattcowgill%2Freadabs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmattcowgill%2Freadabs/lists"}