https://github.com/pachevalier/datatools

# Awesome Data Tools

A simple list of awesome open source tools to deal with data

## Graphics and charts

* [Datawrapper](https://www.datawrapper.de/)
* [RAWgraphs](https://rawgraphs.io/)
  * Created by DensityDesign
* [Chartbuilder](http://quartz.github.io/Chartbuilder/)
  * Created by [Quartz](https://github.com/Quartz)
* [Fastcharts](https://fastcharts.io/)
  * Created by [FTGraphics](https://github.com/ft-interactive)
* [Palladio](http://hdlab.stanford.edu/palladio/)
  * Map, graph, table and gallery
  * Can be connected to a SPARQL endpoint (e.g. Wikidata.org)
* [ObservableHQ](https://observablehq.com/)
* [Voyager2](https://vega.github.io/voyager2/): data explorer based on Vega

## Maps

* [uMap](https://umap.openstreetmap.fr/fr/)
* [Kepler.gl](https://kepler.gl/)
* [Magrit](http://magrit.cnrs.fr/)
  * Static maps
* [MapShaper](https://mapshaper.org/)
* [mviewer](https://mviewer.netlify.app/fr/)
* [cocarto](https://cocarto.com/): an app developed by Codeurs en liberté for collaboratively editing points, lines and shapes on a map
* [khartis](https://www.sciencespo.fr/cartographie/khartis/): developed by Medialab Sciences Po

## Dashboards

* [Metabase](https://github.com/metabase/metabase)
* [Superset](https://superset.apache.org/)

## Database/SQL client

* [Falcon](https://github.com/plotly/falcon)
* [Metabase](https://github.com/metabase/metabase)

## Analyse documents

* [Datashare](https://github.com/ICIJ/datashare)
* [Aleph](https://github.com/alephdata/aleph)
* [Ambar](https://ambar.cloud/)

## Extract tables from PDF

* [Tabula](https://tabula.technology/)
* [Excalibur](https://www.tryexcalibur.com/)

## Information extraction

* [grobid](https://github.com/kermitt2/grobid): information extraction for scholarly documents

## Browse CSV

* [dplyr-cli](https://github.com/coolbutuseless/dplyr-cli)
  * Type: command line
* [xsv](https://github.com/BurntSushi/xsv)
  * Type: command line
* [tabview](https://github.com/TabViewer/tabview)
  * Type: command line
* [Miller](https://github.com/johnkerl/miller)
  * Like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON (documentation: http://johnkerl.org/miller/doc)
* [VisiData](https://www.visidata.org/)
* [Modern CSV](https://www.moderncsv.com/)
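
The "name-indexed" idea in the Miller description can be illustrated in plain Python: fields are addressed by header name rather than by column position. This is only a sketch using the standard `csv` module, not how Miller itself works; the sample data and the `sort_by_field` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
SAMPLE = "name,city,age\nAda,London,36\nLinus,Helsinki,28\nGrace,New York,45\n"


def sort_by_field(text, field, numeric=False):
    """Parse CSV text into dicts and sort them by a field addressed by name."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if numeric:
        return sorted(rows, key=lambda r: float(r[field]))
    return sorted(rows, key=lambda r: r[field])


rows = sort_by_field(SAMPLE, "age", numeric=True)
for row in rows:
    # Columns are looked up by header name, so column order does not matter.
    print(row["name"], row["age"])
```

Because rows are dictionaries keyed by header, the same code keeps working if the columns of the input file are reordered.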

## Browse JSON

* [fx](https://github.com/antonmedv/fx)
  * Type: command line
* [jq](https://stedolan.github.io/jq/)
  * Type: command line

## Conversion

* [Knead](https://github.com/hay/dataknead)
* [Flatterer](https://flatterer.opendata.coop/): converts JSON to tabular data
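
To make "JSON to tabular data" concrete, here is a minimal Python sketch of the general idea: nested objects are flattened into single-level rows that share a common set of columns. It is an assumption-laden illustration, not Flatterer's actual algorithm; the `flatten` helper, the underscore naming scheme and the sample records are all hypothetical, and arrays are left untouched.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining key names with underscores."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical records, invented for illustration only.
RAW = (
    '[{"id": 1, "owner": {"name": "Ada", "city": "London"}},'
    ' {"id": 2, "owner": {"name": "Linus"}}]'
)

records = [flatten(record) for record in json.loads(RAW)]

# Union of all column names, in first-seen order.
columns = list(dict.fromkeys(key for record in records for key in record))

# One list per record; missing values become empty strings.
table = [[record.get(column, "") for column in columns] for record in records]

print(columns)
for line in table:
    print(line)
```

The resulting `columns` and `table` can be passed directly to `csv.writer` to produce a CSV file.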

## Data wrangling

* [OpenRefine](https://openrefine.org/)
* [TadViewer](https://www.tadviewer.com/)
* [OpenDataEditor](https://opendataeditor.okfn.org/)

## Fuzzy matching

* [Crosswalker](https://crosswalker.washingtonpost.com/)

## Workflows

* [Airflow](https://airflow.apache.org/)
* [Prefect](https://www.prefect.io/)

## API

* [csvapi](https://github.com/etalab/csvapi)
* [datasette](https://github.com/simonw/datasette)
* [fastapi-csv](https://github.com/jrieke/fastapi-csv)

## Annotation

* [Doccano](https://github.com/chakki-works/doccano/wiki)
* [WebAnno](https://webanno.github.io/webanno/)
* [BRAT](http://brat.nlplab.org/)
* [YEDDA](https://github.com/jiesutd/YEDDA)
* [PIAF](https://github.com/etalab/piaf)
* [Label Studio](https://labelstud.io/)
* [Tesselle](https://medialab.github.io/tesselle/#/)

## Open data portals

* [csvbase](https://github.com/calpaterson/csvbase): a very simple framework to share and preview CSV files
* [udata](https://github.com/opendatateam/udata)
* [metaclic](https://github.com/datakode/metaclic): front-end interface for udata
* [jkan](https://github.com/timwis/jkan): static open data portal, made with Jekyll
* [ckan](https://ckan.org/)
* [datafair](https://github.com/data-fair/data-fair)

## HCR

* [kraken](https://github.com/mittagessen/kraken)

## References

* [Data science at the command line](https://www.datascienceatthecommandline.com/)
* [Awesome Production Machine Learning](https://github.com/EthicalML/awesome-production-machine-learning)
* [Awesome Data Labeling](https://github.com/heartexlabs/awesome-data-labeling)