https://github.com/robjg/dido
Data In/Data Out in many formats
csv-parser data etl java json-parser
- Host: GitHub
- URL: https://github.com/robjg/dido
- Owner: robjg
- License: other
- Created: 2011-01-18T20:47:57.000Z (over 15 years ago)
- Default Branch: master
- Last Pushed: 2025-12-02T19:00:55.000Z (7 months ago)
- Last Synced: 2025-12-05T18:56:50.507Z (7 months ago)
- Topics: csv-parser, data, etl, java, json-parser
- Language: Java
- Homepage:
- Size: 13 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
Dido
====
- [Overview](#overview)
- [Some Java Examples](#some-java-examples)
- [No Code Dido](#no-code-dido)
- [More Info](#more-info)
- [Building](#building)
- [Background](#background)
### Overview
Dido stands for Data-In/Data-Out. It is a framework for making data from different sources
look the same so that it can be copied, processed and compared.
Dido is available as a Maven artifact. To get started, simply include [dido-all](https://mvnrepository.com/artifact/uk.co.rgordon/dido-all),
which provides all the stable modules in one dependency.
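As a rough sketch, the dependency might look like this in a `pom.xml`. The `groupId` and `artifactId` are taken from the mvnrepository link above; the `dido.version` property is a placeholder assumption, to be set to whichever released version you choose from that page.

```xml
<!-- dido.version is a placeholder: define it under <properties> with a released version -->
<dependency>
    <groupId>uk.co.rgordon</groupId>
    <artifactId>dido-all</artifactId>
    <version>${dido.version}</version>
</dependency>
```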
### Some Java Examples
Given this CSV:
```
Apple,5,19.50
Orange,2,35.24
Pear,3,26.84
```
We can read it in:
```java
List<DidoData> didoData;

// The DataIn is closed automatically by try-with-resources.
try (DataIn in = DataInCsv.fromPath(Path.of("Fruit.csv"))) {
    didoData = in.stream().collect(Collectors.toList());
}

// With no schema supplied, every field is read as a String.
assertThat(didoData, contains(
        DidoData.of("Apple", "5", "19.50"),
        DidoData.of("Orange", "2", "35.24"),
        DidoData.of("Pear", "3", "26.84")));
```
And we can write it out as JSON:
```java
try (DataOut out = DataOutJson.with()
        .outFormat(JsonDidoFormat.LINES)
        .toOutputStream(System.out)) {
    didoData.forEach(out);
}
```
Giving us:
```
{"f_1":"Apple","f_2":"5","f_3":"19.50"}
{"f_1":"Orange","f_2":"2","f_3":"35.24"}
{"f_1":"Pear","f_2":"3","f_3":"26.84"}
```
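The `f_1`, `f_2`, `f_3` names in the output above appear to be defaults used when no schema is supplied, and every value stays a string. The JDK-only sketch below is not Dido's implementation; it only illustrates that mapping for simple CSV lines. The class and method names are hypothetical, and it assumes no quoted fields, embedded commas, or characters needing JSON escaping.

```java
import java.util.StringJoiner;

public class DefaultFieldNamesSketch {

    // Turns one simple CSV line into one JSON line, naming the fields
    // f_1, f_2, ... by position and keeping every value as a string.
    // Assumption: no quoting or escaping is needed in the input.
    static String toJsonLine(String csvLine) {
        String[] values = csvLine.split(",", -1);
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < values.length; i++) {
            json.add("\"f_" + (i + 1) + "\":\"" + values[i] + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        String[] lines = {"Apple,5,19.50", "Orange,2,35.24", "Pear,3,26.84"};
        for (String line : lines) {
            System.out.println(toJsonLine(line));
        }
    }
}
```

Running `main` prints one JSON object per CSV line, in the same shape as the output shown above.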
We can give our data a schema:
```java
DataSchema schema = DataSchema.builder()
        .addNamed("Fruit", String.class)
        .addNamed("Qty", int.class)
        .addNamed("Price", double.class)
        .build();
```
And now when we copy from CSV to JSON:
```java
try (DataIn in = DataInCsv.with()
        .schema(schema)
        .fromPath(Path.of("Fruit.csv"));
     DataOut out = DataOutJson.with()
        .outFormat(JsonDidoFormat.LINES)
        .toOutputStream(System.out)) {
    in.forEach(out);
}
```
We get:
```
{"Fruit":"Apple","Qty":5,"Price":19.5}
{"Fruit":"Orange","Qty":2,"Price":35.24}
{"Fruit":"Pear","Qty":3,"Price":26.84}
```
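With the schema in place, the field names come from the schema and `Qty` and `Price` are written as JSON numbers rather than strings. The JDK-only sketch below illustrates that effect without using Dido. It is not how Dido is implemented: the ordered `Map` standing in for a schema and all class and method names are hypothetical, and it assumes simple CSV with at least as many values per line as there are schema fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class SchemaConversionSketch {

    // Hypothetical stand-in for a schema: field names in order, mapped to types.
    static Map<String, Class<?>> fruitSchema() {
        Map<String, Class<?>> schema = new LinkedHashMap<>();
        schema.put("Fruit", String.class);
        schema.put("Qty", int.class);
        schema.put("Price", double.class);
        return schema;
    }

    // Converts one simple CSV line to a JSON line, parsing each value
    // according to the type the schema gives for its position.
    static String toJsonLine(String csvLine, Map<String, Class<?>> schema) {
        String[] values = csvLine.split(",", -1);
        StringJoiner json = new StringJoiner(",", "{", "}");
        int i = 0;
        for (Map.Entry<String, Class<?>> field : schema.entrySet()) {
            String raw = values[i++];
            Class<?> type = field.getValue();
            String rendered;
            if (type == int.class) {
                rendered = String.valueOf(Integer.parseInt(raw));
            } else if (type == double.class) {
                rendered = String.valueOf(Double.parseDouble(raw));
            } else {
                // Anything else is written as a quoted string (no escaping).
                rendered = "\"" + raw + "\"";
            }
            json.add("\"" + field.getKey() + "\":" + rendered);
        }
        return json.toString();
    }

    public static void main(String[] args) {
        Map<String, Class<?>> schema = fruitSchema();
        String[] lines = {"Apple,5,19.50", "Orange,2,35.24", "Pear,3,26.84"};
        for (String line : lines) {
            System.out.println(toJsonLine(line, schema));
        }
    }
}
```

Parsing `19.50` as a `double` is why the trailing zero disappears and the price is written as `19.5`.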
### No Code Dido
Dido comes with Jobs and Types for creating data processing pipelines in [Oddjob](https://github.com/robjg/oddjob/)
without code, using Oddjob's UI, *Oddjob Explorer*.
Here is Oddjob Explorer running the first example above.

See [Dido in Oddjob](docs/DIDO-ODDJOB.md) for getting started with Dido in Oddjob.
See [The Reference](docs/reference/README.md) for details of all the Oddjob configurations in Dido.
### More Info
[dido-data](docs/DIDO-DATA.md) provides the definition of Data on which the rest of Dido is based.
[dido-operators](docs/DIDO-OPERATORS.md) provides functions for processing data.
For reading data in and writing it out in different formats:
- [dido-csv](docs/DIDO-CSV.md) - For reading and writing CSV data.
- [dido-json](docs/DIDO-JSON.md) - For reading and writing JSON.
- [dido-sql](docs/DIDO-SQL.md) - For reading and writing to Databases.
- [dido-poi](docs/DIDO-POI.md) - For reading and writing to Excel sheets.
- [dido-text](docs/DIDO-TEXT.md) - For writing to ASCII formatted text tables.
[dido-objects](docs/DIDO-OBJECTS.md) is for converting to and from Java objects.
### Building
See [Building](BUILDING.md).
### Background
For more information on why Dido was created, please see
[Background](BACKGROUND.md).