https://github.com/zeddo123/csvcat
Lightning-fast CSV file concatenation and filtering, written in Go.
clitool csv csvfile go golang
- Host: GitHub
- URL: https://github.com/zeddo123/csvcat
- Owner: zeddo123
- License: gpl-3.0
- Created: 2023-05-06T13:56:58.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2023-08-31T20:14:01.000Z (almost 3 years ago)
- Last Synced: 2025-10-25T21:58:54.450Z (8 months ago)
- Topics: clitool, csv, csvfile, go, golang
- Language: Go
- Homepage:
- Size: 32.2 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Csvcat
`csvcat` is a very fast CSV file concatenation tool (with column filtering) written in Go. Using concurrency, `csvcat` can concatenate and filter
a huge number of files without costing too much in memory or processing time.
On a dummy dataset generated with `generate_set.py` (`100` files of `100000` lines each, around 1.4 GB), filtering 5 columns out of 10
in concurrent and non-concurrent modes takes roughly the following on an i7-7820HQ (8):
```sh
$ ./csvcat --columns "B,A,E,C,F" --delimiter "," --directory "csvset" --concurrency=true
Number of files found: 100
============ Total 3.933102857s ===================
$ ./csvcat --columns "B,A,E,C,F" --delimiter "," --directory "csvset" --concurrency=false
Number of files found: 100
============ Total 10.588019261s ===================
```
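The README does not describe how the concurrent mode is implemented. As a rough illustration only, the sketch below shows one common way to process files in fixed-size batches of goroutines while keeping results in input order. The name `processInBatches`, the `work` callback, and the batching strategy are assumptions made for this example, not code taken from `csvcat`.

```go
package main

import (
	"fmt"
	"sync"
)

// processInBatches is a hypothetical helper: it applies work to every item,
// running at most batchSize goroutines at a time, and returns the results
// in the same order as the input.
func processInBatches(items []string, batchSize int, work func(string) string) []string {
	if batchSize < 1 {
		batchSize = 1
	}
	results := make([]string, len(items))
	for start := 0; start < len(items); start += batchSize {
		end := start + batchSize
		if end > len(items) {
			end = len(items)
		}
		var wg sync.WaitGroup
		for i := start; i < end; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				results[i] = work(items[i]) // each goroutine writes only its own slot
			}(i)
		}
		wg.Wait() // finish the whole batch before starting the next one
	}
	return results
}

func main() {
	files := []string{"a.csv", "b.csv", "c.csv"}
	out := processInBatches(files, 2, func(name string) string {
		return "filtered:" + name // stand-in for reading and filtering one file
	})
	fmt.Println(out)
}
```

Writing each result into its own slice slot avoids the need for a mutex and preserves file order in the output, at the cost of holding every result in memory until the end.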
## Usage of `csvcat`
```
Usage of ./csvcat:
  -batch int
        Batch size (default 30)
  -c    Set to false to ignore checking extension (default true)
  -columns string
        Columns to be selected
  -concurrency
        Set flag to disable concurrency (default true)
  -delimiter string
        Csv delimiter of files (default ",")
  -directory string
        Directory containing the files to be compilled (default ".")
  -output string
        Output filename (default "output.csv")
  -v    Set to true to have verbose output
```
Here's an example of how you might run `csvcat` with its flags:
```
./csvcat --batch 20 --columns "A,B,C" --delimiter "," --directory files
```
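The usage text has the shape produced by Go's standard `flag` package. Assuming that is what `csvcat` uses, the sketch below declares a subset of the same flags to show how such a command line is parsed. The `options` struct and `parseOptions` function are hypothetical names for this example. Note that with the `flag` package a boolean flag has to be written as `--concurrency=false`; `--concurrency false` would treat `false` as a positional argument.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// options mirrors a subset of the flags listed in the usage text.
// The struct itself is an assumption for this sketch.
type options struct {
	batch       int
	columns     []string
	delimiter   string
	directory   string
	concurrency bool
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var opts options
	var columns string
	fs := flag.NewFlagSet("csvcat", flag.ContinueOnError)
	fs.IntVar(&opts.batch, "batch", 30, "Batch size")
	fs.StringVar(&columns, "columns", "", "Columns to be selected")
	fs.StringVar(&opts.delimiter, "delimiter", ",", "Csv delimiter of files")
	fs.StringVar(&opts.directory, "directory", ".", "Directory containing the files")
	fs.BoolVar(&opts.concurrency, "concurrency", true, "Set to false to disable concurrency")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	if columns != "" {
		opts.columns = strings.Split(columns, ",") // "A,B,C" -> [A B C]
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```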
`csvcat` expects every CSV file to have a header on its first line in which all columns are labeled, so that
it can select the correct columns. If a CSV file is not correctly formatted (some lines have more or fewer columns than the header),
it will try to add an empty column in the correct location.
## Building `csvcat`
To build `csvcat` you need to run:
```sh
go build .
# or
go install .
```