https://github.com/devinit/fts_precodinglists
Parsing and coding FTS data from API
https://github.com/devinit/fts_precodinglists
Last synced: about 1 year ago
JSON representation
Parsing and coding FTS data from API
- Host: GitHub
- URL: https://github.com/devinit/fts_precodinglists
- Owner: devinit
- Created: 2020-03-18T12:31:39.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-06-22T11:31:54.000Z (over 5 years ago)
- Last Synced: 2025-01-07T21:46:51.363Z (about 1 year ago)
- Language: R
- Homepage:
- Size: 669 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# fts_precodinglists
The fts_finalcode.R is used to join the FTS database extracted from API with the CSV files that contain new variables to be added to the new modified FTS database. The fts_precodinglists.R was only used to select new flows in the Donors, Recipient.Organisations and Destination.Country that were not incorporated in the CVS coding lists.
# Prepare and clean the the database before joining the csv files
We selected only the appropriate flows from status column ('paid' and 'commitment' contributions, excluding the pledges).
```R
data <- subset(data, data$status %in% c('paid','commitment'))
```
We also selected the columns from the original database that are required in the final visualisation of the database
```R
data <- subset(data, select = c("id","amountUSD","budgetYear","description",
"flowType","newMoney","originalAmount","originalCurrency",
"method","status","boundary","onBoundary","source_Organization_name",
"source_Location_name","source_UsageYear_name","destination_Organization_name",
"destination_GlobalCluster_name","destination_Location_name","destination_UsageYear_name",
"destination_Plan_name","destination_Project_name","parentFlowId","grandBargainEarmarkingType",
"source_Plan_id","source_Plan_name","destination_Cluster_name","destination_Emergency_name",
"exchangeRate","source_Emergency_name","source_GlobalCluster_name"))
```
We renamed the source_Organization_name, destination_Organization_name and destination_Location_name to Donor, Recipient.Organization and Destination.Country, respectively. The cvs files will be join to the FTS database by Donor, Recipient.Organization and Destination.Country columns.
```R
if("source_Organization_name" %in% names(data)){
names(data)[which(names(data)=="source_Organization_name")] = "Donor"
}
```
We Removed ", Government of" of Donor and Recipient.Organization columns to harmonise the name of the organisations with the cvs files.
```R
data$Donor <- gsub(", Government of","",data$Donor)
unique(data$Donor)
```
We substituted the special characters to the foreign characters from Donor and Recipient.Organization columns
```R
character_replacements = list(
c("á","á"), c("â","â"), c("ä","ä"),
c("é","é"), c("ë","ë"), c("ó","ó"),
c("ô","ô"), c("ö","ö"), c("ú","ú"),
c("ü","ü"), c("Äf","ã"), c("ÄT","ê"),
c('Å"','ñ'), c("Å^","ò"), c("Å'","õ"),
c("Å.","à"), c("Å>","o"), c("ů","ù"),
c("ű","û"), c("è","è"), c("ç","ç"),
c("ø","ø"), c('â???"','-'), c("â???T","'"),
c("î","î"), c("ñ","ñ"), c("ê","ê"),
c("Ã???","Ç"), c("î","î"), c("ñ","ñ"),
c('â???"','-'), c('Ã"','Ä'),
c("Ã","í"), c("ï","ï"), c("í£","ã"),
c("í¯","ï"), c("í¦","æ"), c("í¥","å"),
c("â???T","'"),c('â???"',"-"), c("í¦","à"),
c('Å"',"o"), c("Å,","à"), c('Å"',"à"),
c('â???"','-'),c("â???T","'"),c('â???"',"-"),
c('â???T',"'"), c('Å"',"à"), c('â???"',"-"),
c('â???T',"'")
)
for(character_replacement in character_replacements){
from_character = character_replacement[1]
to_character = character_replacement[2]
data$Donor <- gsub(
from_character,
to_character,
data$Donor,
ignore.case = FALSE, perl = FALSE,
fixed = TRUE, useBytes = FALSE
)
data$Recipient.Organization <- gsub(
from_character,
to_character,
data$Recipient.Organization,
ignore.case = FALSE, perl = FALSE,
fixed = TRUE, useBytes = FALSE
)
}
```
# Join the CVS to the FTS database
The following CVS files are linked to the FTS database by the Donor column: codenames2020, privatemoney2020, dacregion2020 and donorscountryid2020.
The following CVS files are linked to the FTS database by the Recipient.Organization column: recipientcodename2020, ngotype2020, deliverychannels2020 and recipientcountryid2020.
The following CVS files are linked to the FTS database by the Destination.Country column: odaeligible2020, destinationcountryid2020 and incomegroups2020.
```R
ngotype <- read.csv("Final lists coded/ngotype2020.csv",na.strings="",as.is=TRUE)
ngotype$lower.Recipient.Organization <- lowerConv(ngotype$Recipient.Organization)
ngotype <- ngotype[!duplicated(ngotype$Recipient.Organization),]
ngotype$Recipient.Organization <- NULL
data <- join(data, ngotype, by='lower.Recipient.Organization', type='left', match='first')
withoutngos <- subset(data,is.na(ngotype))
unique(withoutngos$Recipient.Organization)
```
# Create extra columns based on the existing final columns
Domestic response is TRUE when donor country id is equal to the destination country id. When both columns are balnk, I consider it as non-domestic response (FALSE)
```R
data$domesticresponse <- ifelse(data$donorcountryid==data$destinationcountryid,TRUE,FALSE)
data$domesticresponse[is.na(data$domesticresponse)] <- FALSE
```
We created the column Deflator type merging the data from the columns Source_UsageYear_name and donorcountryid
```R
data$deflatortype <- paste(data$donorcountryid,data$source_UsageYear_name)
data$deflatortype[is.na(data$deflatortype)] <- FALSE
```
# Merge deflator file and deflate amount USD values
The deflators2020 (csv file) is linked to the deflatortype column and the calculation are conduct according to the following code line:
```R
deflators <- read.csv("Final lists coded/deflators2020.csv",na.strings="",as.is=TRUE)
data <- join(data, deflators, by='deflatortype', type='left', match='all')
data <- transform(data,amountDeflated=as.numeric(amountUSD)/as.numeric(Deflators))
data <- transform(data,amountDeflatedMillions=amountDeflated/1000000)
withoutdeflators <- subset(data,is.na(deflators))
unique(withoutdeflators$Donor)
```
# Exclude the columns that are not necessary in the final file
We excluded the following columns: deflatortype, lower.destinationcountrytype, destinationcountrytype, lower.deflatortype, lower.Donor and lower.Recipient.Organization.