
[![Identifier](https://img.shields.io/badge/doi-10.5438%2Fn138--z3mk-fca709.svg)](https://doi.org/10.5438/n138-z3mk)
[![Gem Version](https://badge.fury.io/rb/bolognese.svg)](https://badge.fury.io/rb/bolognese)
![Build Ruby Gem](https://github.com/datacite/bolognese/workflows/Build%20Ruby%20Gem/badge.svg)
[![Code Climate](https://codeclimate.com/github/datacite/bolognese/badges/gpa.svg)](https://codeclimate.com/github/datacite/bolognese)
[![Test Coverage](https://codeclimate.com/github/datacite/bolognese/badges/coverage.svg)](https://codeclimate.com/github/datacite/bolognese/coverage)

# Bolognese: a Ruby library for conversion of DOI Metadata

Ruby gem and command-line utility for conversion of DOI metadata from and to different metadata formats, including [schema.org](https://schema.org).

## Features

Bolognese reads and/or writes these metadata formats:



| Format | Name | Content Type | Read | Write |
| --- | --- | --- | --- | --- |
| CrossRef Unixref XML | crossref | application/vnd.crossref.unixref+xml | Yes | No |
| DataCite XML | datacite | application/vnd.datacite.datacite+xml | Yes | Yes |
| DataCite JSON | datacite_json | application/vnd.datacite.datacite+json | Yes | Yes |
| Schema.org in JSON-LD | schema_org | application/vnd.schemaorg.ld+json | Yes | Yes |
| RDF XML | rdf_xml | application/rdf+xml | No | Yes |
| RDF Turtle | turtle | text/turtle | No | Yes |
| Citeproc JSON | citeproc | application/vnd.citationstyles.csl+json | Yes | Yes |
| Formatted text citation | citation | text/x-bibliography | No | Yes |
| Codemeta | codemeta | application/vnd.codemeta.ld+json | Yes | Yes |
| JATS | jats | application/vnd.jats+xml | No | Yes |
| CSV | csv | text/csv | No | Yes |
| BibTeX | bibtex | application/x-bibtex | Yes | Yes |
| RIS | ris | application/x-research-info-systems | Yes | Yes |
| Crosscite | crosscite | application/vnd.crosscite.crosscite+json | Yes | Yes |

**Crosscite** is the format used internally by bolognese.
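
For scripts that need to check a format name before calling the tool, the table above can be transcribed as plain data. This is only a sketch: `FORMATS`, `readable_formats`, `writable_formats` and `content_type_for` are illustrative names chosen here, not part of the bolognese API.

```ruby
# Format table from this README, transcribed as data.
# All names below are illustrative and not provided by bolognese.
FORMATS = {
  "crossref"      => { content_type: "application/vnd.crossref.unixref+xml",     read: true,  write: false },
  "datacite"      => { content_type: "application/vnd.datacite.datacite+xml",    read: true,  write: true },
  "datacite_json" => { content_type: "application/vnd.datacite.datacite+json",   read: true,  write: true },
  "schema_org"    => { content_type: "application/vnd.schemaorg.ld+json",        read: true,  write: true },
  "rdf_xml"       => { content_type: "application/rdf+xml",                      read: false, write: true },
  "turtle"        => { content_type: "text/turtle",                              read: false, write: true },
  "citeproc"      => { content_type: "application/vnd.citationstyles.csl+json",  read: true,  write: true },
  "citation"      => { content_type: "text/x-bibliography",                      read: false, write: true },
  "codemeta"      => { content_type: "application/vnd.codemeta.ld+json",         read: true,  write: true },
  "jats"          => { content_type: "application/vnd.jats+xml",                 read: false, write: true },
  "csv"           => { content_type: "text/csv",                                 read: false, write: true },
  "bibtex"        => { content_type: "application/x-bibtex",                     read: true,  write: true },
  "ris"           => { content_type: "application/x-research-info-systems",      read: true,  write: true },
  "crosscite"     => { content_type: "application/vnd.crosscite.crosscite+json", read: true,  write: true }
}.freeze

# Format names the table marks as readable (usable as input).
def readable_formats
  FORMATS.select { |_name, info| info[:read] }.keys
end

# Format names the table marks as writable (usable as output).
def writable_formats
  FORMATS.select { |_name, info| info[:write] }.keys
end

# Content type for a format name, or nil if the name is not in the table.
def content_type_for(name)
  FORMATS.dig(name, :content_type)
end
```

A caller could, for example, reject an unknown `--to` value early by testing `writable_formats.include?(name)`.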

## Installation

Bolognese requires Ruby 2.2 or later. Add the following to your `Gemfile` to install the
latest version:

```ruby
gem 'bolognese'
```

Then run `bundle install` to install into your environment.

You can also install the gem system-wide in the usual way:

```bash
gem install bolognese
```

## Commands

Run the `bolognese` command with either an identifier (DOI or URL) or filename:

```
bolognese https://doi.org/10.7554/elife.01567
```

```
bolognese example.xml
```

Bolognese can read BibTeX files (file extension `.bib`), RIS files (file extension `.ris`), Crossref or DataCite XML files (file extension `.xml`), and DataCite JSON or Citeproc JSON files (file extension `.json`).

The input format (e.g. Crossref XML or BibTeX) is automatically detected, but
you can also provide the format with the `--from` or `-f` flag. The supported
input formats are listed in the table above.
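
A file extension alone does not always identify the format, which is one reason an explicit `--from` can help. The sketch below maps extensions to candidate format names. It is not bolognese's detection logic, which this README does not describe; `CANDIDATES_BY_EXTENSION` and `candidate_formats` are hypothetical names, and the `.json` entry is an assumption.

```ruby
# Hypothetical helper: candidate input formats for a filename, by extension.
# This mirrors the extensions listed in this README; it is not how bolognese
# itself detects formats.
CANDIDATES_BY_EXTENSION = {
  ".bib"  => ["bibtex"],
  ".ris"  => ["ris"],
  ".xml"  => ["crossref", "datacite"],
  ".json" => ["datacite_json", "citeproc"] # assumed extension for JSON inputs
}.freeze

# Returns an array of format names; empty when the extension is unknown.
def candidate_formats(filename)
  CANDIDATES_BY_EXTENSION.fetch(File.extname(filename).downcase, [])
end
```

When more than one candidate comes back, as for `.xml`, the content of the file has to decide, or the caller passes `--from`.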

The output format is determined by the `--to` or `-t` flag, and defaults to `schema_org`.
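
When calling the command from a Ruby script, it can be convenient to assemble the arguments as an array so that no shell quoting is needed. A minimal sketch, assuming only the `--to` and `--from` flags described above; `bolognese_argv` is a hypothetical helper name.

```ruby
require "shellwords"

# Hypothetical helper: build the argument list for a bolognese call.
# `to` defaults to "schema_org", matching the documented default.
def bolognese_argv(input, to: "schema_org", from: nil)
  argv = ["bolognese", input, "--to", to]
  argv += ["--from", from] if from
  argv
end

argv = bolognese_argv("https://doi.org/10.7554/elife.01567", to: "bibtex")

# Print the command as it would be typed in a shell.
puts Shellwords.join(argv)
```

With the gem installed, the array could be passed to `Open3.capture2(*argv)` from the standard library to run the command and capture its output.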

Show all commands with `bolognese help`:

```
Commands:
bolognese # convert metadata
bolognese --version, -v # print the version
bolognese help [COMMAND] # Describe available commands or one specific command
```
## Errors

Errors are returned to STDOUT.

All DataCite XML input is validated against the corresponding schema version (kernel 2.1, 2.2, 3, or 4).

## Examples

Read Crossref XML:

```
bolognese https://doi.org/10.7554/elife.01567 -t crossref





eLife
2050-084X



02
11
2014


3




Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth



Martial
Sankar


Kaisa
Nieminen


Laura
Ragni


Ioannis
Xenarios


Christian S
Hardtke



02
11
2014


10.7554/eLife.01567


1
eLifesciences


www.elifesciences.org


false

2013-09-20
2013-12-24
2014-02-11


SystemsX



EMBO
http://dx.doi.org/10.13039/501100003043




Swiss National Science Foundation
http://dx.doi.org/10.13039/501100001711




University of Lausanne
http://dx.doi.org/10.13039/501100006390




http://creativecommons.org/licenses/by/3.0/
http://creativecommons.org/licenses/by/3.0/
http://creativecommons.org/licenses/by/3.0/




10.7554/eLife.01567
http://elifesciences.org/lookup/doi/10.7554/eLife.01567


...

Sankar
2014
10.5061/dryad.b835k

...


...




```

Convert Crossref XML to schema.org/JSON-LD:
```
bolognese https://doi.org/10.7554/elife.01567

{
"@context": "http://schema.org",
"@type": "ScholarlyArticle",
"@id": "https://doi.org/10.7554/elife.01567",
"url": "http://elifesciences.org/lookup/doi/10.7554/eLife.01567",
"additionalType": "JournalArticle",
"name": "Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth",
"author": [{
"@type": "Person",
"givenName": "Martial",
"familyName": "Sankar"
}, {
"@type": "Person",
"givenName": "Kaisa",
"familyName": "Nieminen"
}, {
"@type": "Person",
"givenName": "Laura",
"familyName": "Ragni"
}, {
"@type": "Person",
"givenName": "Ioannis",
"familyName": "Xenarios"
}, {
"@type": "Person",
"givenName": "Christian S",
"familyName": "Hardtke"
}],
"license": "http://creativecommons.org/licenses/by/3.0/",
"datePublished": "2014-02-11",
"dateModified": "2015-08-11T05:35:02Z",
"isPartOf": {
"@type": "Periodical",
"name": "eLife",
"issn": "2050-084X"
},
"citation": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/nature02100",
"position": "1",
"datePublished": "2003"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1534/genetics.109.104976",
"position": "2",
"datePublished": "2009"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1034/j.1399-3054.2002.1140413.x",
"position": "3",
"datePublished": "2002"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1162/089976601750399335",
"position": "4",
"datePublished": "2001"
}, {
"@type": "CreativeWork",
"position": "5",
"datePublished": "1995"
}, {
"@type": "CreativeWork",
"position": "6",
"datePublished": "1993"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.semcdb.2009.09.009",
"position": "7",
"datePublished": "2009"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1242/dev.091314",
"position": "8",
"datePublished": "2013"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1371/journal.pgen.1002997",
"position": "9",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/msb.2010.25",
"position": "10",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.biosystems.2012.07.004",
"position": "11",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.pbi.2005.11.013",
"position": "12",
"datePublished": "2006"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1105/tpc.110.076083",
"position": "13",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1073/pnas.0808444105",
"position": "14",
"datePublished": "2008"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/0092-8674(89)90900-8",
"position": "15",
"datePublished": "1989"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1126/science.1066609",
"position": "16",
"datePublished": "2002"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1104/pp.104.040212",
"position": "17",
"datePublished": "2004"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/nbt1206-1565",
"position": "18",
"datePublished": "2006"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1073/pnas.77.3.1516",
"position": "19",
"datePublished": "1980"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1093/bioinformatics/btq046",
"position": "20",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1105/tpc.111.084020",
"position": "21",
"datePublished": "2011"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.b835k",
"position": "22",
"datePublished": "2014"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.cub.2008.02.070",
"position": "23",
"datePublished": "2008"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1111/j.1469-8137.2010.03236.x",
"position": "24",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1007/s00138-011-0345-9",
"position": "25",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.cell.2012.02.048",
"position": "26",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/ncb2764",
"position": "27",
"datePublished": "2013"
}],
"funder": [{
"@type": "Organization",
"name": "SystemsX"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100003043",
"name": "EMBO"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100001711",
"name": "Swiss National Science Foundation"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100006390",
"name": "University of Lausanne"
}],
"provider": {
"@type": "Organization",
"name": "Crossref"
}
}
```
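
The JSON-LD output can be consumed with Ruby's standard `json` library. The sketch below works on a trimmed copy of the output above and pulls out author names and cited DOIs. The variable names are illustrative, and the handling of a single-object `author` is an assumption based on the Maremma example later in this README.

```ruby
require "json"

# Trimmed copy of the schema.org output shown above.
json_ld = <<~JSON
  {
    "@type": "ScholarlyArticle",
    "@id": "https://doi.org/10.7554/elife.01567",
    "author": [
      { "@type": "Person", "givenName": "Martial", "familyName": "Sankar" },
      { "@type": "Person", "givenName": "Kaisa", "familyName": "Nieminen" }
    ],
    "citation": [
      { "@type": "CreativeWork", "@id": "https://doi.org/10.1038/nature02100", "position": "1" },
      { "@type": "CreativeWork", "position": "5", "datePublished": "1995" },
      { "@type": "CreativeWork", "@id": "https://doi.org/10.5061/dryad.b835k", "position": "22" }
    ]
  }
JSON

record = JSON.parse(json_ld)

# "author" can be one object or an array of objects, so wrap and flatten.
authors = [record["author"]].flatten.compact
names = authors.map { |a| [a["givenName"], a["familyName"]].compact.join(" ") }

# Some citations have no "@id" (positions 5 and 6 above); drop those.
cited_dois = record.fetch("citation", []).map { |c| c["@id"] }.compact

puts names.join(", ")
puts cited_dois
```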

Convert Crossref XML to DataCite XML:
```
bolognese https://doi.org/10.7554/elife.01567 -t datacite

10.7554/eLife.01567


Sankar, Martial
Martial
Sankar


Nieminen, Kaisa
Kaisa
Nieminen


Ragni, Laura
Laura
Ragni


Xenarios, Ioannis
Ioannis
Xenarios


Hardtke, Christian S
Christian S
Hardtke



Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth

eLife
2014
JournalArticle


SystemsX


EMBO
https://doi.org/10.13039/501100003043


Swiss National Science Foundation
https://doi.org/10.13039/501100001711


University of Lausanne
https://doi.org/10.13039/501100006390



2014-02-11
2015-08-11T05:35:02Z


https://doi.org/10.1038/nature02100
https://doi.org/10.1534/genetics.109.104976
https://doi.org/10.1034/j.1399-3054.2002.1140413.x
https://doi.org/10.1162/089976601750399335
https://doi.org/10.1016/j.semcdb.2009.09.009
https://doi.org/10.1242/dev.091314
https://doi.org/10.1371/journal.pgen.1002997
https://doi.org/10.1038/msb.2010.25
https://doi.org/10.1016/j.biosystems.2012.07.004
https://doi.org/10.1016/j.pbi.2005.11.013
https://doi.org/10.1105/tpc.110.076083
https://doi.org/10.1073/pnas.0808444105
https://doi.org/10.1016/0092-8674(89)90900-8
https://doi.org/10.1126/science.1066609
https://doi.org/10.1104/pp.104.040212
https://doi.org/10.1038/nbt1206-1565
https://doi.org/10.1073/pnas.77.3.1516
https://doi.org/10.1093/bioinformatics/btq046
https://doi.org/10.1105/tpc.111.084020
https://doi.org/10.5061/dryad.b835k
https://doi.org/10.1016/j.cub.2008.02.070
https://doi.org/10.1111/j.1469-8137.2010.03236.x
https://doi.org/10.1007/s00138-011-0345-9
https://doi.org/10.1016/j.cell.2012.02.048
https://doi.org/10.1038/ncb2764


Creative Commons Attribution 3.0 (CC-BY 3.0)

```
Convert Crossref XML to BibTeX:

```
bolognese https://doi.org/10.7554/elife.01567 -t bibtex

@article{https://doi.org/10.7554/elife.01567,
doi = {10.7554/eLife.01567},
url = {http://elifesciences.org/lookup/doi/10.7554/eLife.01567},
author = {Sankar, Martial and Nieminen, Kaisa and Ragni, Laura and Xenarios, Ioannis and Hardtke, Christian S},
title = {Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth},
journal = {eLife},
year = {2014}
}
```

Read DataCite XML:
```
bolognese 10.5061/DRYAD.8515 -t datacite

10.5061/DRYAD.8515
1


Ollomo, Benjamin


Durand, Patrick


Prugnolle, Franck


Douzery, Emmanuel J. P.


Arnathau, Céline


Nkoghe, Dieudonné


Leroy, Eric


Renaud, François



Data from: A new malaria agent in African hominids.

Dryad Digital Repository
2011

Phylogeny
Malaria
Parasites
Taxonomy
Mitochondrial genome
Africa
Plasmodium

DataPackage

Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.


10.5061/DRYAD.8515/1
10.5061/DRYAD.8515/2
10.1371/JOURNAL.PPAT.1000446
19478877



```

Convert DataCite XML to schema.org/JSON-LD:
```sh
bolognese 10.5061/DRYAD.8515

{
"@context": "http://schema.org",
"@type": "Dataset",
"@id": "https://doi.org/10.5061/dryad.8515",
"additionalType": "DataPackage",
"name": "Data from: A new malaria agent in African hominids.",
"alternateName": "Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.",
"author": [{
"@type": "Person",
"givenName": "Benjamin",
"familyName": "Ollomo"
}, {
"@type": "Person",
"givenName": "Patrick",
"familyName": "Durand"
}, {
"@type": "Person",
"givenName": "Franck",
"familyName": "Prugnolle"
}, {
"@type": "Person",
"givenName": "Emmanuel J. P.",
"familyName": "Douzery"
}, {
"@type": "Person",
"givenName": "Céline",
"familyName": "Arnathau"
}, {
"@type": "Person",
"givenName": "Dieudonné",
"familyName": "Nkoghe"
}, {
"@type": "Person",
"givenName": "Eric",
"familyName": "Leroy"
}, {
"@type": "Person",
"givenName": "François",
"familyName": "Renaud"
}],
"license": "http://creativecommons.org/publicdomain/zero/1.0/",
"version": "1",
"keywords": "Phylogeny, Malaria, Parasites, Taxonomy, Mitochondrial genome, Africa, Plasmodium",
"datePublished": "2011",
"hasPart": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.8515/1"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.8515/2"
}],
"citation": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.1371/journal.ppat.1000446"
}],
"schemaVersion": "http://datacite.org/schema/kernel-3",
"publisher": {
"@type": "Organization",
"name": "Dryad Digital Repository"
},
"provider": {
"@type": "Organization",
"name": "DataCite"
}
}
```
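
In the examples, an input such as `10.5061/DRYAD.8515` appears in the output as `https://doi.org/10.5061/dryad.8515`. The sketch below reproduces that normalization for bare DOIs and DOI URLs. It is a hypothetical helper written for illustration, not code taken from bolognese, and the pattern is a simplification that may not cover every valid DOI.

```ruby
# Hypothetical helper, not bolognese's own code: bring a DOI or DOI URL into
# the lowercase https://doi.org/ form seen in the "@id" values above.
DOI_PATTERN = %r{\A(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/.+)\z}i

# Returns the normalized DOI URL, or nil when the input is not a DOI.
def normalize_doi(input)
  match = DOI_PATTERN.match(input.to_s.strip)
  return nil unless match

  "https://doi.org/#{match[1].downcase}"
end
```

Inputs that are not DOIs, such as a blog URL, return `nil`, so a caller can fall back to treating them as ordinary URLs or filenames.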

Convert DataCite XML to schema version 4.0:
```
bolognese 10.5061/DRYAD.8515 -t datacite --schema_version http://datacite.org/schema/kernel-4

10.5061/DRYAD.8515


Ollomo, Benjamin
Benjamin
Ollomo


Durand, Patrick
Patrick
Durand


Prugnolle, Franck
Franck
Prugnolle


Douzery, Emmanuel J. P.
Emmanuel J. P.
Douzery


Arnathau, Céline
Céline
Arnathau


Nkoghe, Dieudonné
Dieudonné
Nkoghe


Leroy, Eric
Eric
Leroy


Renaud, François
François
Renaud



Data from: A new malaria agent in African hominids.

Dryad Digital Repository
2011
DataPackage

Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.


Phylogeny
Malaria
Parasites
Taxonomy
Mitochondrial genome
Africa
Plasmodium


2011


https://doi.org/10.5061/dryad.8515/1
https://doi.org/10.5061/dryad.8515/2
https://doi.org/10.1371/journal.ppat.1000446

1

Public Domain (CC0 1.0)

```

Convert DataCite XML to Codemeta:

```
bolognese https://doi.org/10.5063/f1m61h5x -t codemeta

{
"@context":"https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
"@type":"SoftwareSourceCode",
"@id":"https://doi.org/10.5063/f1m61h5x",
"identifier":"https://doi.org/10.5063/f1m61h5x",
"title":"dataone: R interface to the DataONE network of data repositories",
"agents":{
"@type":"Person",
"givenName":"Matthew B.",
"familyName":"Jones"
},
"datePublished":"2016",
"publisher":{
"@type":"Organization",
"name":"KNB Data Repository"
}
}
```

Convert DataCite XML to BibTeX:

```
bolognese 10.5061/DRYAD.8515 -t bibtex

@misc{https://doi.org/10.5061/dryad.8515,
doi = {10.5061/DRYAD.8515},
author = {Ollomo, Benjamin and Durand, Patrick and Prugnolle, Franck and Douzery, Emmanuel J. P. and Arnathau, Céline and Nkoghe, Dieudonné and Leroy, Eric and Renaud, François},
keywords = {Phylogeny, Malaria, Parasites, Taxonomy, Mitochondrial genome, Africa, Plasmodium},
title = {Data from: A new malaria agent in African hominids.},
publisher = {Dryad Digital Repository},
year = {2011}
}
```

Convert schema.org/JSON-LD to DataCite XML:

```
bolognese https://blog.datacite.org/eating-your-own-dog-food -t datacite

10.5438/4k3m-nyvg


Fenner, Martin
Martin
Fenner
http://orcid.org/0000-0003-1419-2405



Eating your own Dog Food

DataCite
2016
BlogPosting

MS-49-3632-5083


datacite
doi
metadata
featured


2016-12-20
2016-12-20
2016-12-20


https://doi.org/10.5438/0000-00ss
https://doi.org/10.5438/0012
https://doi.org/10.5438/55e5-t5c0

1.0




Eating your own dog food is a slang term to describe that an organization should itself use the products and services it provides. For DataCite this means that we should use DOIs with appropriate metadata and strategies for long-term preservation for...

```

Convert schema.org/JSON-LD to BibTeX:

```
bolognese https://blog.datacite.org/eating-your-own-dog-food -t bibtex

@article{https://doi.org/10.5438/4k3m-nyvg,
doi = {10.5438/4k3m-nyvg},
url = {https://blog.datacite.org/eating-your-own-dog-food},
author = {Fenner, Martin},
keywords = {datacite, doi, metadata, featured},
title = {Eating your own Dog Food},
publisher = {DataCite},
year = {2016}
}
```

Convert Codemeta to schema.org/JSON-LD:

```
bolognese https://github.com/datacite/maremma

{
"@context":"http://schema.org",
"@type":"SoftwareSourceCode",
"@id":"https://doi.org/10.5438/qeg0-3gm3",
"url":"https://github.com/datacite/maremma",
"name":"Maremma: a Ruby library for simplified network calls",
"author":{
"@type":"person",
"@id":"http://orcid.org/0000-0003-0077-4738",
"name":"Martin Fenner"
},
"description":"Simplifies network calls, including json/xml parsing and error handling. Based on Faraday.",
"keywords":"faraday, excon, net/http",
"dateCreated":"2015-11-28",
"datePublished":"2017-02-24",
"dateModified":"2017-02-24",
"publisher":{
"@type":"Organization",
"name":"DataCite"
}
}
```

Convert Codemeta to DataCite XML:

```
bolognese https://github.com/datacite/maremma -t datacite

10.5438/qeg0-3gm3


Martin Fenner
http://orcid.org/0000-0003-0077-4738



Maremma: a Ruby library for simplified network calls

DataCite
2017
SoftwareSourceCode

faraday
excon
net/http


2015-11-28
2017-02-24
2017-02-24


Simplifies network calls, including json/xml parsing and error handling. Based on Faraday.

```

## Development

We use RSpec for unit testing:

```
bundle exec rspec
```

Follow along via [GitHub Issues](https://github.com/datacite/bolognese/issues).
Please open an issue if conversion fails or metadata are not properly supported.

### Note on Patches/Pull Requests

* Fork the project
* Write tests for your new feature or a test that reproduces a bug
* Implement your feature or make a bug fix
* Do not mess with Rakefile, version or history
* Commit, push and make a pull request. Bonus points for topical branches.

## License
**bolognese** is released under the [MIT License](https://github.com/datacite/bolognese/blob/master/LICENSE.md).