https://github.com/gvegayon/googlepublicdata

An R package to build Google's Public Data Explorer DSPL Metadata files

googlePublicData
================

An *R* package for building the *Dataset Publishing Language* (DSPL) metadata files used by *Google's* *Public Data Explorer*.

[![Downloads](http://cranlogs.r-pkg.org/badges/googlePublicData?color=brightgreen)](http://cran.rstudio.com/package=googlePublicData) [![Travis-CI Build Status](https://travis-ci.org/gvegayon/googlePublicData.svg?branch=master)](https://travis-ci.org/gvegayon/googlePublicData) [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/gvegayon/googlePublicData?branch=master&svg=true)](https://ci.appveyor.com/project/gvegayon/googlePublicData) [![codecov](https://codecov.io/gh/gvegayon/googlePublicData/branch/master/graph/badge.svg)](https://codecov.io/gh/gvegayon/googlePublicData)
[![Grand total](http://cranlogs.r-pkg.org/badges/grand-total/googlePublicData)](https://cran.r-project.org/package=googlePublicData)

Features:

- Reads tab-delimited, csv, xls and xlsx files from a folder.

- Identifies data types and distinguishes between dimensional and metric concepts.

- Identifies dimension tables.

- Auto-generates concept IDs.

- Automatically sorts data on dimensional (non-time) concepts.

- Prints the XML and csv files to upload to Public Data Explorer.

- Runs some checks for errors before the final XML is printed.

- Builds a ZIP file containing the CSV and XML files.

So you don't need to mess with the XML coding at all!
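To make the idea of dimensional versus metric concepts more concrete, here is a minimal base-R sketch. It is not the package's internal logic: the folder, the two demo files, the `classify()` helper and the rule it applies (non-integer numeric columns are metrics, everything else is a dimension) are all assumptions made for illustration.

``` r
# Hypothetical illustration only; not how googlePublicData works internally.
tmp <- file.path(tempdir(), "dspl_demo")
dir.create(tmp, showWarnings = FALSE)

# Two small semicolon-separated files: a dimension table and a slice
write.table(
  data.frame(country = c("CL", "US"), name = c("Chile", "United States")),
  file.path(tmp, "countries.csv"), sep = ";", row.names = FALSE, quote = FALSE
)
write.table(
  data.frame(country = c("CL", "US"), year = c(2010L, 2010L),
             population = c(17.1, 309.3)),
  file.path(tmp, "country_slice.csv"), sep = ";", row.names = FALSE, quote = FALSE
)

# Read every csv file found in the folder
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv, sep = ";", stringsAsFactors = FALSE)
names(tables) <- sub("\\.csv$", "", basename(files))

# Naive rule (an assumption): double columns are metrics, the rest dimensions
classify <- function(df) {
  vapply(df, function(x) if (is.double(x)) "metric" else "dimension",
         character(1))
}
roles <- lapply(tables, classify)
roles
```

A real classifier would need more care (for example, integer-valued metrics such as counts), which is the kind of work the package is meant to take off your hands.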

``` r
library(googlePublicData)

# This path has some csv files that we will use.
# system.file() finds the installed package in whichever library holds it.
data.path <- system.file("data", package = "googlePublicData")
data.path
```

    ## [1] "/home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data"

``` r
# The dspl function looks for csv files in that path and analyzes them
mydspl <- dspl(path=data.path, sep=";")
```

    ## 6 files found...

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/countries.csv analyzed correctly

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/country_slice.csv analyzed correctly

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/gender_country_slice.csv analyzed correctly

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/genders.csv analyzed correctly

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/states.csv analyzed correctly

    ## /home/george/R/x86_64-pc-linux-gnu-library/3.4/googlePublicData/data/state_slice.csv analyzed correctly
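Because the `sep` argument has to match the delimiter actually used in the files, it can be worth checking a folder before handing it to `dspl()`. The sketch below uses only base R; `check_sep()` and the demo folder are hypothetical names introduced here and are not part of googlePublicData.

``` r
# Hypothetical helper (not part of googlePublicData): for each csv file in a
# folder, report whether every line splits into the same number (> 1) of
# fields under the given separator.
check_sep <- function(path, sep = ";") {
  files <- list.files(path, pattern = "\\.csv$", full.names = TRUE)
  ok <- vapply(files, function(f) {
    n <- count.fields(f, sep = sep)
    length(unique(n)) == 1L && n[1] > 1L
  }, logical(1))
  names(ok) <- basename(files)
  ok
}

# Demo on a throwaway folder with one good file and one using commas
demo_dir <- file.path(tempdir(), "sep_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("country;year;population", "CL;2010;17.1"),
           file.path(demo_dir, "good.csv"))
writeLines(c("country,year,population", "CL,2010,17.1"),
           file.path(demo_dir, "wrong_sep.csv"))

result <- check_sep(demo_dir, sep = ";")
result
```

A file that comes back `FALSE` most likely uses a different delimiter than the one passed in `sep`.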

``` r
# If we wanted to write the zip file... ready to be uploaded to
# http://publicdata.google.com
# dspl(path=data.path, sep=";", output= "mydspl.zip")

# Printing the data
mydspl
```

    ## [DSPL XML is printed here; only its text content survives in this copy]
    ## Dataset: No name; No description
    ## Provider: No provider
    ## Concept names: Country, Population, Gender, State, Unemployment Rate
    ## Table data files: countries.csv, country_slice.csv, gender_country_slice.csv,
    ##   genders.csv, states.csv, state_slice.csv

``` r
# Summary of the dspl class object
summary(mydspl)
```

    ## Attributes
    ## $names
    ## [1] "dspl"              "concepts.by.table" "dimtabs"
    ## [4] "slices"            "concepts"          "dimentions"
    ## [7] "statistics"
    ##
    ## $class
    ## [1] "dspl"
    ##
    ## Dataset contents

    ## $dimtabs
    ## [1] "countries" "genders"   "states"
    ##
    ## $slices
    ## [1] "countries"            "country_slice"        "gender_country_slice"
    ## [4] "genders"              "states"               "state_slice"
    ##
    ## $concepts
    ## [1] "Country"           "name"              "latitude"
    ## [4] "longitude"         "Year"              "Population"
    ## [7] "Gender"            "State"             "Unemployment Rate"
    ##
    ## $dimentions
    ##      label
    ## 1  Country
    ## 12  Gender
    ## 14   State
    ##
    ## $statistics
    ##      slices concepts dimentions
    ## [1,]      6        9          3
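
The counts under `$statistics` appear to match the sizes of the other components in the summary. The sketch below rebuilds them from a mock list whose values are copied from the output above. The `mock` object is a stand-in made up for illustration, and it is an assumption that a real object returned by `dspl()` stores these components in the same shape.

``` r
# Mock stand-in built from the summary output above (not a real dspl object)
mock <- list(
  dimtabs = c("countries", "genders", "states"),
  slices = c("countries", "country_slice", "gender_country_slice",
             "genders", "states", "state_slice"),
  concepts = c("Country", "name", "latitude", "longitude", "Year",
               "Population", "Gender", "State", "Unemployment Rate"),
  dimentions = data.frame(label = c("Country", "Gender", "State"),
                          row.names = c("1", "12", "14"))
)

# Recompute the statistics row from the component sizes
stats <- cbind(
  slices     = length(mock$slices),
  concepts   = length(mock$concepts),
  dimentions = nrow(mock$dimentions)
)
stats
```

Note that the spelling `dimentions` is kept as it appears in the package output.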