Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/chrismuir/baidugeo

R interface to Baidu Maps API geocoding services
https://github.com/chrismuir/baidugeo

api baidu baidumap geocoding r rstats

Last synced: about 1 month ago
JSON representation

R interface to Baidu Maps API geocoding services

Awesome Lists containing this project

README

        

---
output:
github_document: default
html_document: default
---

```{r, echo = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

# baidugeo

[![Travis-CI Build Status](https://travis-ci.org/ChrisMuir/baidugeo.svg?branch=master)](https://travis-ci.org/ChrisMuir/baidugeo)

R interface for the Baidu Maps API geocoding services. API docs can be found [here](http://lbsyun.baidu.com/index.php?title=jspopular), [here](http://developer.baidu.com/map/skins/MySkin/resources/iframs/webapi-geocoding.html), and [here](http://lbsyun.baidu.com/index.php?title=car/api/geocoding). Functions are provided for both forward and reverse geocoding. The Baidu Maps API is useful for geolocation of Chinese strings (addresses, business names, locations, etc).

This project initially started as a fork of the [baidumap](https://github.com/badbye/baidumap) package by [Yalei Du](https://github.com/badbye). This package is meant to be a stripped down, simplified version of that package, useful for when all you need is geocoding from Baidu, and with a couple additional features built in. Major differences include:

- All API return data is cached as internal package data. The cache is persistent across R sessions.
- API rate limiting is handled automatically.
- Removes multicore processing, so no parallel API calls.
- Fewer dependencies.
- Uses the [rapidjson](https://github.com/Tencent/rapidjson) C++ header files (via the R packge [rapidjsonr](https://github.com/SymbolixAU/rapidjsonr)) for fast parsing of JSON API return data.

For geolocation of Chinese terms using basic string parsing (no API required), check out my package [geolocChina](https://github.com/ChrisMuir/geolocChina).

## Install from this repo

```{r, eval=FALSE}
#install.packages("devtools")
devtools::install_github("ChrisMuir/baidugeo")
```

## Usage and API Credentialing

A valid key is required to make queries with this package. Apply for an application key from [lbsyun.baidu.com](http://lbsyun.baidu.com/apiconsole/key). Then register your key using function `bmap_set_key()`.
```{r}
library(baidugeo)
bmap_set_key("some_valid_key_str")
```
The Baidu Maps API features daily rate limits. By default, any key registered using `bmap_set_key()` will be assigned a daily rate limit of 5950. This can be adjusted using function `bmap_set_daily_rate_limit()`
```{r}
bmap_set_daily_rate_limit(20000)
```

The function `bmap_rate_limit_info()` can be used to get info related to the currently registered key.
```{r}
bmap_rate_limit_info()
```

## Geocoding
Use function `bmap_get_coords()` to get lat and lon coordinates given a vector of strings.

```{r, echo=FALSE}
res_df <- data.frame(
"location" = c("中百超市有限公司长堤街二分店", "浙江省杭州市余杭区径山镇小古城村", "成都高梁红餐饮管理有限公司"),
"lon" = c(114.2729, 119.8783, 104.0679),
"lat" = c(30.61617, 30.39625, 30.67994),
"status" = c(0, 0, 0),
"precise" = c(1, 0, 0),
"confidence" = c(80, 30, 12),
"comprehension" = c(95, 100, 29),
"level" = c("UNKNOWN", "乡镇", "城市")
)

res_json <- c(
"{\"status\":0,\"result\":{\"location\":{\"lng\":114.27287244473057,\"lat\":30.616167082550779},\"precise\":1,\"confidence\":80,\"comprehension\":95,\"level\":\"UNKNOWN\"}}",
"{\"status\":0,\"result\":{\"location\":{\"lng\":119.87833669326516,\"lat\":30.39624844375698},\"precise\":0,\"confidence\":30,\"comprehension\":100,\"level\":\"乡镇\"}}",
"{\"status\":0,\"result\":{\"location\":{\"lng\":104.06792346330406,\"lat\":30.679942845419565},\"precise\":0,\"confidence\":12,\"comprehension\":29,\"level\":\"城市\"}}"
)
```

```{r}
locs <- c(
"中百超市有限公司长堤街二分店",
"浙江省杭州市余杭区径山镇小古城村",
"成都高梁红餐饮管理有限公司"
)
```
`bmap_get_coords()` returns a data frame by default.
```{r, eval=FALSE}
coords_df <- bmap_get_coords(locs)
knitr::kable(coords_df)
```

```{r, echo=FALSE}
knitr::kable(res_df)
```

Use arg `type` to return a vector of json strings.
```{r, eval=FALSE}
bmap_get_coords(locs, type = "json")
```

```{r, echo=FALSE}
print(res_json)
```

## Reverse Geocoding
Use function `bmap_get_location()` to get full address details given vectors of lat/lon coordinate pairs.

```{r, echo=FALSE}
res_df <- data.frame(
"input_lon" = c(114.2729, 119.8783, 104.0679),
"input_lat" = c(30.61617, 30.39625, 30.67994),
"return_lon" = c(114.2729, 119.8783, 104.0679),
"return_lat" = c(30.61617, 30.39625, 30.67994),
"status" = c(0, 0, 0),
"formatted_address" = c("湖北省武汉市江汉区新华路630号", "浙江省杭州市余杭区潘金线", "四川省成都市青羊区王家塘街84号"),
"business" = c("汉口火车站,常青路,北湖", NA, "骡马市,新华西路,八宝街"),
"country" = c("中国", "中国", "中国"),
"country_code" = c(0, 0, 0),
"country_code_iso" = c("CHN", "CHN", "CHN"),
"country_code_iso2" = c("CN", "CN", "CN"),
"province" = c("湖北省", "浙江省", "四川省"),
"city" = c("武汉市", "杭州市", "成都市"),
"city_level" = c(2, 2, 2),
"district" = c("江汉区", "余杭区", "青羊区"),
"town" = c(NA_character_, NA_character_, NA_character_),
"ad_code" = c(420103, 330110, 510105),
"street" = c("新华路", "潘金线", "王家塘街"),
"street_number" = c("630号", NA, "84号"),
"direction" = c("附近", NA, "附近"),
"distance" = c(9, NA, 6),
"city_code" = c(218, 179, 75)
)

res_json <- c(
"{\"status\":0,\"result\":{\"location\":{\"lng\":114.27287244473092,\"lat\":30.61616696729939},\"formatted_address\":\"湖北省武汉市江汉区新华路630号\",\"business\":\"汉口火车站,常青路,北湖\",\"addressComponent\":{\"country\":\"中国\",\"country_code\":0,\"country_code_iso\":\"CHN\",\"country_code_iso2\":\"CN\",\"province\":\"湖北省\",\"city\":\"武汉市\",\"city_level\":2,\"district\":\"江汉区\",\"town\":\"\",\"adcode\":\"420103\",\"street\":\"新华路\",\"street_number\":\"630号\",\"direction\":\"附近\",\"distance\":\"9\"},\"pois\":[],\"roads\":[],\"poiRegions\":[],\"sematic_description\":\"锦江之星酒店(武汉菱角湖万达店)南53米\",\"cityCode\":218}}",
"{\"status\":0,\"result\":{\"location\":{\"lng\":119.87833669326493,\"lat\":30.39624841472855},\"formatted_address\":\"浙江省杭州市余杭区潘金线\",\"business\":\"\",\"addressComponent\":{\"country\":\"中国\",\"country_code\":0,\"country_code_iso\":\"CHN\",\"country_code_iso2\":\"CN\",\"province\":\"浙江省\",\"city\":\"杭州市\",\"city_level\":2,\"district\":\"余杭区\",\"town\":\"\",\"adcode\":\"330110\",\"street\":\"潘金线\",\"street_number\":\"\",\"direction\":\"\",\"distance\":\"\"},\"pois\":[],\"roads\":[],\"poiRegions\":[],\"sematic_description\":\"阳坞山西北444米\",\"cityCode\":179}}",
"{\"status\":0,\"result\":{\"location\":{\"lng\":104.06792346330394,\"lat\":30.67994271533221},\"formatted_address\":\"四川省成都市青羊区王家塘街84号\",\"business\":\"骡马市,新华西路,八宝街\",\"addressComponent\":{\"country\":\"中国\",\"country_code\":0,\"country_code_iso\":\"CHN\",\"country_code_iso2\":\"CN\",\"province\":\"四川省\",\"city\":\"成都市\",\"city_level\":2,\"district\":\"青羊区\",\"town\":\"\",\"adcode\":\"510105\",\"street\":\"王家塘街\",\"street_number\":\"84号\",\"direction\":\"附近\",\"distance\":\"6\"},\"pois\":[],\"roads\":[],\"poiRegions\":[{\"direction_desc\":\"内\",\"name\":\"青羊区政府\",\"tag\":\"政府机构;各级政府\",\"uid\":\"96b672aa58335874cf04ef80\"}],\"sematic_description\":\"青羊区政府内,成都华氏陶瓷艺术博物馆附近1米\",\"cityCode\":75}}"
)
```

`bmap_get_location()` returns a data frame by default.
```{r, eval=FALSE}
addrs_df <- bmap_get_location(coords_df$lat, coords_df$lon)
knitr::kable(addrs_df)
```

```{r, echo=FALSE}
knitr::kable(res_df)
```

Use arg `type` to return a vector of json strings.
```{r, eval=FALSE}
bmap_get_location(coords_df$lat, coords_df$lon, type = "json")
```

```{r, echo=FALSE}
print(res_json)
```

## Package Data
Functions `bmap_get_coords()` and `bmap_get_location()` both cache API return data. The functions will first look for the return data in the cached package datasets, if it's not found there they will execute an API request.

Use function `bmap_clear_cache()` to clear either of the package cache data sets (or both at once).
```{r, eval=FALSE}
bmap_clear_cache(coordinate_cache = TRUE, address_cache = TRUE)
```

Use function `bmap_get_cached_coord_data()` to load all of the cached lat/lon return data as a tidy data frame.
```{r, eval=FALSE}
df <- bmap_get_cached_coord_data()
```

Use function `bmap_get_cached_address_data()` to load the cached address return data as a tidy data frame.
```{r, eval=FALSE}
df <- bmap_get_cached_address_data()
```