An open API service indexing awesome lists of open source software.

https://github.com/koma-private/csvs

csvs (CSV Sql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries through an embedded SQLite engine. Binaries for macOS, Windows & Linux.
https://github.com/koma-private/csvs

csv csv-converter csv-export csv-import csv-parser cui linux macos sql sqlite tsv tsv-parser windows

Last synced: about 1 month ago
JSON representation

csvs (CSV Sql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries through an embedded SQLite engine. Binaries for macOS, Windows & Linux.

Awesome Lists containing this project

README

        

# csvs

**csvs** (**CSV** **S**ql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries
through an embedded [SQLite](https://www.sqlite.org/) engine. It is ideal for data analysts and developers who need
SQL's flexibility to manage text-based data efficiently.

[![GitHub Release](https://img.shields.io/github/actions/workflow/status/koma-private/csvs/release.yml)](https://github.com/koma-private/csvs)
[![GitHub Tag](https://img.shields.io/github/v/tag/koma-private/csvs)](https://github.com/koma-private/csvs)
[![Crates.io Version](https://img.shields.io/crates/v/csvs)](https://crates.io/crates/csvs)

![Banner of executing csvs](assets/usage.banner.png)
![Banner of interactive mode](assets/interactive.banner1.png)
![Banner of interactive mode](assets/interactive.banner2.png)

- [Features](#features)
- [SQL Power for CSV Files](#sql-power-for-csv-files)
- [Automatic Encoding Detection](#automatic-encoding-detection)
- [Decide Data Type for Each Column](#decide-data-type-for-each-column)
- [Multi-File Handling](#multi-file-handling)
- [Customizable Output](#customizable-output)
- [Interactive Mode](#interactive-mode)
- [Multi-Statement Query Support](#multi-statement-query-support)
- [Usage](#usage)
- [Interactive Mode](#interactive-mode-1)
- [Command Options](#command-options)
- [Example](#example)
- [SQL Query Notes](#sql-query-notes)
- [Mapping CSVs to Table Names](#mapping-csvs-to-table-names)
- [Quoting Columns with Special Characters](#quoting-columns-with-special-characters)
- [`--in-no-header` Option](#--in-no-header-option)
- [Execute Multiple Statements in a Single Query](#execute-multiple-statements-in-a-single-query)
- [Error Handling](#error-handling)
- [Build](#building-csvs)
- [Limitations](#limitations)
- [Acknowledgments](#acknowledgments)
- [License](#license)

---

## Features

### SQL Power for CSV Files

Run advanced SQL queries, including `JOIN`, `GROUP BY`, `SUM()`, or `COUNT()` on CSV data.
Gain unparalleled flexibility to query, filter, sort, group, and combine data compared to traditional spreadsheet tools.
**csvs** also supports regular expressions in SQL queries. Refer to [Regular Expressions Document](regexp.md).

### Automatic Encoding Detection

Eliminate encoding issues with automatic detection of character encodings. Avoid garbled text and broken queries
effortlessly.

### Decide Data Type for Each Column

**csvs** scans CSV rows to determine the most appropriate data types for SQLite tables. This dynamic analysis ensures
compatibility and precision, even when handling nullable fields.

See [Decide Data Type for Each Column](decide_data_type.md) and [Validating Number Document](validating_number.md) for
details.

### Multi-File Handling

Combine data from multiple CSV or TSV files by creating a temporary SQLite database using `--in-file`. Easily perform
SQL joins across files in seconds.

Common use cases:

- Merging datasets from separate files.
- Cross-referencing data using SQL joins.

### Customizable Output

Export query results as:

- **CSV or TSV**: Ideal for data sharing or further processing.
- **SQLite Database**: Retain results as `.db` files for future queries.

Control delimiters, headers, and quoting styles to suit your needs.

### Interactive Mode

Explore datasets interactively without specifying queries upfront. Features include:

- Browsing imported tables.
- Ad-hoc query execution.
- Previewing and saving query results.

### Multi-Statement Query Support

Execute multiple SQL statements in a single command. Transform and query data across multiple steps, with only the final
result displayed.

---

## Usage

- Display help:

```shell
csvs --help
```

### Interactive Mode

Start **csvs** in *interactive mode* when neither `--query` nor `--source` is specified. This mode allows you to:

- View imported tables.
- Preview table content.
- Save query results to files interactively.

See [Interactive Mode Guide](interactive_mode.md)

![Animation of interactive mode](assets/interactive.gif)

### Command Options

See [Command Options Guide](command_options.md)

![Animation of executing csvs](assets/usage.gif)

---

## Example

- Display version:

```shell
csvs --version
```

- Select specific fields from `./data/address.csv` and save the results to `picked.csv`:

```shell
csvs -i ./data/address.csv -q 'SELECT "city","town","phone" FROM "address.csv"' -o picked.csv
```

- Process CSV data from `STDIN`:

```shell
csvs -q 'SELECT "city","town","phone" FROM "stdin"' < ./data/address.csv > picked.csv
```

or

```shell
cat ./data/address.csv | csvs -q 'SELECT "city","town","phone" FROM "stdin"' > picked.csv
```

- Perform SQL joins across multiple files:

```shell
csvs -i ./left.csv -i ./right.tsv -q 'SELECT * FROM "left.csv" AS l JOIN "right.tsv" AS r ON l."name"=r."name"'
```

- Leverage SQLite functions like `UPPER()`, `COUNT()`, etc., and export results to a SQLite database:

```shell
csvs -i people.csv -q 'SELECT "city",COUNT(*) FROM "people.csv" GROUP BY "city" ORDER BY COUNT(*) DESC' --out-database out.db
```

- Start in *interactive mode*:

```shell
csvs -i MOCK_DATA.csv
```

---

## SQL Query Notes

### Mapping CSVs to Table Names

File names provided with `--in-file` map directly to SQLite tables (e.g. `./sample/address.csv` becomes `"address.csv"`,
and `data.2024.csv` becomes `"data.2024.csv"`).

### Quoting Columns with Special Characters

Columns or table names with spaces, punctuation, or reserved words must be quoted. Example:

```sql
SELECT "first name", "last name"
FROM "contacts.csv"
```

### `--in-no-header` Option

If specified, column names default to "c1", "c2", "c3", etc., for header-less CSV files.

### Execute Multiple Statements in a Single Query

Separate SQL statements with semicolons to execute multiple queries in sequence. Only the result of the final query is
displayed.

Example:

```sql
SELECT "first name"
FROM "contacts.csv";
SELECT "age"
FROM "contacts.csv";
```

---

## Error Handling

See [Error Handling Guide](error_handling.md)

---

## Building csvs

See [Build Guide](build.md)

---

## Limitations

- *Interactive mode* cannot be invoked when CSV data is provided via `STDIN`. Use `--in-file` to specify CSV files
instead.
- Large files may require significant RAM since **csvs** loads entire files into memory when `--out-database` is not
specified.
- CSV files with names starting with `sqlite_` cannot be used with `--in-file` due to SQLite's reserved naming
convention.

## Acknowledgments

**csvs** relies on the following open-source projects:

- [SQLite](https://www.sqlite.org/)
- [anyhow](https://github.com/dtolnay/anyhow)
- [chardetng](https://github.com/hsivonen/chardetng)
- [clap](https://github.com/clap-rs/clap)
- [clap-help](https://github.com/Canop/clap-help)
- [csv](https://github.com/BurntSushi/rust-csv)
- [encoding_rs](https://github.com/hsivonen/encoding_rs)
- [encoding_rs_rw](https://github.com/LiosK/encoding_rs_rw)
- [indicatif](https://github.com/console-rs/indicatif)
- [lazy-regex](https://github.com/Canop/lazy-regex)
- [ratatui](https://github.com/ratatui/ratatui)
- [r2d2](https://github.com/sfackler/r2d2)
- [r2d2_sqlite](https://github.com/ivanceras/r2d2-sqlite)
- [rusqlite](https://github.com/rusqlite/rusqlite)
- [smashquote](https://github.com/ionizedgirl/smashquote)
- [sqlparser](https://github.com/apache/datafusion-sqlparser-rs)
- [tracing](https://github.com/tokio-rs/tracing)
- [tracing-logfmt](https://github.com/EmbarkStudios/tracing-logfmt)
- [tuirealm](https://github.com/veeso/tui-realm)
- And many more! See the full list in the source repository.

## License

**csvs** is licensed under the [MIT license](LICENSE).