# frictionless-py

[![Build](https://img.shields.io/github/actions/workflow/status/frictionlessdata/frictionless-py/general.yaml?branch=main)](https://github.com/frictionlessdata/frictionless-py/actions)
[![Coverage](https://img.shields.io/codecov/c/github/frictionlessdata/frictionless-py/main)](https://codecov.io/gh/frictionlessdata/frictionless-py)
[![Release](https://img.shields.io/pypi/v/frictionless.svg)](https://pypi.python.org/pypi/frictionless)
[![Citation](https://zenodo.org/badge/28409905.svg)](https://zenodo.org/badge/latestdoi/28409905)
[![Codebase](https://img.shields.io/badge/codebase-github-brightgreen)](https://github.com/frictionlessdata/frictionless-py)
[![Support](https://img.shields.io/badge/support-slack-brightgreen)](https://join.slack.com/t/frictionlessdata/shared_invite/zt-17kpbffnm-tRfDW_wJgOw8tJVLvZTrBg)

```markdown remark type=primary
Migrating from an older version? Please read the **[v5](blog/2022/08-22-frictionless-framework-v5.html)** announcement and migration guide.
```

Frictionless is a data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data (DEVT Framework). It supports a wide range of data sources and formats, and provides integrations with popular platforms. The framework is powered by the lightweight yet comprehensive [Frictionless Standards](https://specs.frictionlessdata.io/).

## Purpose

- **Describe your data**: You can infer, edit, and save metadata for your data tables. This is the first step toward ensuring data quality and usability. Frictionless metadata includes general information about your data, such as a textual description, as well as field types and other tabular data details.
- **Extract your data**: You can read your data using a unified tabular interface. Data quality and consistency are guaranteed by a schema. Frictionless supports various file schemes like HTTP, FTP, and S3 and data formats like CSV, XLS, JSON, SQL, and others.
- **Validate your data**: You can validate data tables, resources, and datasets. Frictionless generates a unified validation report and supports many options to customize the validation process.
- **Transform your data**: You can clean, reshape, and transfer your data tables and datasets. Frictionless provides a pipeline capability and a lower-level interface to work with the data.

## Features

- Open Source (MIT)
- Powerful Python framework
- Convenient command-line interface
- Low memory consumption for data of any size
- Reasonable performance on big data
- Support for compressed files
- Custom checks and formats
- Fully pluggable architecture
- More than 1000 tests

## Installation

```bash
$ pip install frictionless
```

## Example

```bash
$ frictionless validate data/invalid.csv
[invalid] data/invalid.csv

  row    field  code              message
-----  -------  ----------------  --------------------------------------------
             3  blank-header      Header in field at position "3" is blank
             4  duplicate-header  Header "name" in field "4" is duplicated
    2        3  missing-cell      Row "2" has a missing cell in field "field3"
    2        4  missing-cell      Row "2" has a missing cell in field "name2"
    3        3  missing-cell      Row "3" has a missing cell in field "field3"
    3        4  missing-cell      Row "3" has a missing cell in field "name2"
    4           blank-row         Row "4" is completely blank
    5        5  extra-cell        Row "5" has an extra value in field "5"
```
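
To make the error codes in the report above more concrete, here is a minimal standard-library sketch of what such structural checks amount to. It is not how Frictionless implements validation: the function `check_table` and the `SAMPLE` text are hypothetical, and the sample is only an assumed file content chosen so that it yields the same row, field, and code combinations as the report.

```python
import csv
import io

# Hypothetical sample, assumed to resemble data/invalid.csv (not taken from the repo).
SAMPLE = "id,name,,name\n1,english\n1,english\n\n2,german,1,2,3\n"


def check_table(text):
    """Return (row, field, code) tuples for structural problems in CSV text.

    Rows and fields are 1-based; the header counts as row 1, so the first
    data row is row 2. Header problems have no row number (None).
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    header = rows[0]
    errors = []

    # Header checks: blank labels and labels already seen earlier.
    seen = set()
    for position, label in enumerate(header, start=1):
        if label.strip() == "":
            errors.append((None, position, "blank-header"))
        elif label in seen:
            errors.append((None, position, "duplicate-header"))
        seen.add(label)

    # Row checks: blank rows, then cells missing or extra relative to the header.
    for row_number, cells in enumerate(rows[1:], start=2):
        if all(cell.strip() == "" for cell in cells):
            errors.append((row_number, None, "blank-row"))
            continue
        for position in range(len(cells) + 1, len(header) + 1):
            errors.append((row_number, position, "missing-cell"))
        for position in range(len(header) + 1, len(cells) + 1):
            errors.append((row_number, position, "extra-cell"))
    return errors


if __name__ == "__main__":
    for row, field, code in check_table(SAMPLE):
        print(row, field, code)
```

The sketch only covers the five structural codes shown in the report; it performs no type or schema checks.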

## Documentation

Please visit our documentation portal:
- https://framework.frictionlessdata.io