https://github.com/caltechlibrary/dataset-instruction

Instructional content for the dataset package

Training on Dataset tools
=======

*Content Contributors: Robert Doiel, Tom Morrell*

*Lesson Maintainers: Robert Doiel, Tom Morrell*

**Lesson status: In Development**

## What you will learn:

* Identify the structure of a JSON file
* Gather data from an API
* Use the basic functions of dataset
* Combine data using dataset to collect citations for a publications list
* Export to and import from a Google Sheet
* Index and search over a large collection of data
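
To make the first objective concrete, here is a minimal sketch of what "identifying the structure" of a JSON document can look like. It is written in Python purely for illustration (the lesson itself works in the bash shell with _datatools_ and _dataset_), and the sample record, its field names, and the `describe` helper are all hypothetical, invented for this example.

```python
import json

# Hypothetical sample record, invented for illustration only;
# it is not taken from any real API response.
SAMPLE = """
{
  "title": "An example article",
  "authors": [
    {"family": "Doe", "given": "Jane"},
    {"family": "Roe", "given": "Richard"}
  ],
  "year": 2018,
  "open_access": true
}
"""


def describe(value, indent=0):
    """Return lines naming the JSON type of each element, indented by depth."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict):
        lines.append(f"{pad}object")
        for key, child in value.items():
            lines.append(f"{pad}  {key}:")
            lines.extend(describe(child, indent + 2))
    elif isinstance(value, list):
        lines.append(f"{pad}array of {len(value)}")
        for child in value:
            lines.extend(describe(child, indent + 1))
    elif isinstance(value, bool):
        # bool must be checked before int, because bool is a subclass of int
        lines.append(f"{pad}boolean")
    elif isinstance(value, (int, float)):
        lines.append(f"{pad}number")
    elif value is None:
        lines.append(f"{pad}null")
    else:
        lines.append(f"{pad}string")
    return lines


if __name__ == "__main__":
    record = json.loads(SAMPLE)
    print("\n".join(describe(record)))
```

Running the script prints an indented outline of the record: an object whose `authors` field is an array of two objects, alongside string, number and boolean fields. JSON has only these six value types (object, array, string, number, boolean, null), so the same walk applies to any valid JSON document.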

## Topics:

1. [Intro](00-intro-json-apis.html)
2. [Basic Dataset](01-basic-dataset.html)
3. [Working with Larger Amounts of Data](02-large-data.html)

## Requirements

This lesson requires basic familiarity with the bash shell, similar to the
experience gained through the
[Software Carpentry shell lesson](http://swcarpentry.github.io/shell-novice/).
You'll need a bash shell installed; to set one up, follow
[these instructions](https://swcarpentry.github.io/workshop-template/#setup).

This lesson also uses two tool collections developed at Caltech Library: [datatools](https://caltechlibrary.github.io/datatools/)
and [dataset](https://caltechlibrary.github.io/dataset/). _datatools_ is a collection
of tools for working with CSV, XLSX and JSON content; from it we will use a program called
_jsonmunge_ to extract and re-format JSON content. _dataset_ is a data management tool.
The latest release of _datatools_ is available [here](https://github.com/caltechlibrary/datatools/releases/latest),
and the latest release of _dataset_ is available
[here](https://github.com/caltechlibrary/dataset/releases/latest).

## References

+ data formats
    + JSON documentation, https://www.json.org
    + simple index maps, https://caltechlibrary.github.io/dataset/docs/dsindexer/defining-index.html
+ data sources
    + Dimension API presentation (see slide 3), https://figshare.com/s/3c8f0284e8e51718c1b2
    + CrossRef REST API, https://github.com/CrossRef/rest-api-doc
    + CrossRef Content Negotiation, https://citation.crosscite.org/docs.html
+ programs
    + curl documentation, https://curl.haxx.se/
    + dataset, https://caltechlibrary.github.io/dataset/
    + datatools, https://caltechlibrary.github.io/datatools/