{"id":46349765,"url":"https://github.com/ribbondz/gsv","last_synced_at":"2026-03-04T22:32:57.157Z","repository":{"id":62973572,"uuid":"174927746","full_name":"ribbondz/gsv","owner":"ribbondz","description":"CSV toolkit written in golang","archived":false,"fork":false,"pushed_at":"2021-09-24T15:38:51.000Z","size":259,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-06-20T22:34:09.685Z","etag":null,"topics":["cli","command-line","csv","golang","large-files"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ribbondz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-11T04:50:57.000Z","updated_at":"2021-09-24T15:38:54.000Z","dependencies_parsed_at":"2022-11-10T05:02:31.624Z","dependency_job_id":null,"html_url":"https://github.com/ribbondz/gsv","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/ribbondz/gsv","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribbondz%2Fgsv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribbondz%2Fgsv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribbondz%2Fgsv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribbondz%2Fgsv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ribbondz","download_url":"https://codeload.github.com/ribbondz/gsv/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribbondz%2Fgsv/sbo
m","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30096808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T21:59:23.547Z","status":"ssl_error","status_checked_at":"2026-03-04T21:57:50.415Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","command-line","csv","golang","large-files"],"created_at":"2026-03-04T22:32:56.983Z","updated_at":"2026-03-04T22:32:57.145Z","avatar_url":"https://github.com/ribbondz.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# csv toolkit written in golang\r\n\r\ngsv is a command line program to deal with CSV files. 
It has the following features:\r\n\r\n- fast and parallel processing\r\n- real-time progress bar\r\n- simple usage\r\n\r\n## Usage\r\nDownload gsv.exe from the release tab, then choose either option:\r\n- put gsv.exe in the system path\r\n- put gsv.exe and the data in the same folder\r\n\r\n## Available commands\r\n- **head** - Show the first n lines of a CSV file.\r\n- **header** - Show the header of a CSV file.\r\n- **count** - Count the lines in a CSV file.\r\n- **cat** - Concatenate CSV files by row **(with progress bar)**.\r\n- **frequency** - Show a frequency table on columns.\r\n- **partition** - Split a CSV file based on a column value **(with progress bar)**.\r\n- **select** - Select rows and columns from a CSV file.\r\n- **stats** - Show statistics (e.g., min, max, average, unique count, null) on every column.\r\n\r\nTip: you can always check the usage of a command with **gsv command --help**, \r\nfor example, gsv frequency --help.\r\n\r\n## Examples\r\n\r\n- gsv head\r\n```shell\r\ngsv head a.txt        // default to first 20 rows\r\ngsv head -l 30 a.txt  // first 30 rows\r\ngsv head --help       // help info on all flags\r\n```\r\n\r\n- gsv header \r\n```\r\ngsv header a.txt         // separator \",\" (default)\r\ngsv header -s \\t a.txt   // separator tab\r\n```\r\n\r\n- gsv count\r\n```shell\r\ngsv count a.txt      // default to have a header\r\ngsv count -n a.txt   // no header\r\ngsv count --help     // help info on all flags\r\n```\r\nTip: **gsv count dirname** can also count the number of files in a directory.\r\n\r\n- gsv cat\r\n```shell\r\ngsv cat data_dir            // concatenate all files in data_dir directory, \r\n                            // assume a header for all files,\r\n                            // output file is named data_dir-current-time.txt\r\ngsv cat -n data_dir         // no header\r\ngsv cat -p * data_dir       // file pattern, default to all files\r\ngsv cat -p *.csv data_dir   // all csv files in the directory\r\ngsv cat --help              // help info on 
all flags\r\n```\r\n\r\n- gsv frequency\r\n```shell\r\ngsv frequency a.txt           // first column, has header, separator \",\" (default)\r\ngsv frequency -n a.txt        // no header\r\ngsv frequency -s \\t a.txt     // tab separator\r\ngsv frequency -c 0 a.txt      // frequency table on first column (default)\r\ngsv frequency -c 1 a.txt      // frequency table on second column\r\ngsv frequency -c 0,1 a.txt    // frequency table on first and second columns\r\ngsv frequency -l 10 a.txt     // keep top 10 records\r\ngsv frequency -a a.txt        // frequency table in ascending order, default to descending\r\ngsv frequency -o a.txt        // Print the frequency table to output file named \"a-current-time.txt\"\r\ngsv frequency --help          // help info on all flags\r\n\r\ncolumn selection syntax:\r\n-c \"1,2\"   --\u003e    cols [1,2]\r\n-c \"1-3,6\" --\u003e    cols [1,2,3,6]\r\n-c \"!1\"    --\u003e    cols [all except col 1]\r\n-c \"-1\"    --\u003e    cols [all]\r\n\r\nfrequency table:\r\n+-------+-------+-------+\r\n|  COL  | VALUE | COUNT |\r\n+-------+-------+-------+\r\n| col_1 |     a |     2 |\r\n| col_1 |     b |     2 |\r\n| col_2 |     3 |     2 |\r\n| col_2 |     2 |     1 |\r\n| col_2 |     4 |     1 |\r\n+-------+-------+-------+\r\n```\r\n\r\n- gsv partition\r\n```shell\r\ngsv partition a.txt            // default to split by first column, separator \",\", with file header\r\ngsv partition -n a.txt         // no header\r\ngsv partition -c 0 a.txt       // split by first column (default)\r\ngsv partition -c 1 a.txt       // split by second column\r\ngsv partition -s , a.txt       // row separator is \",\" (default) \r\ngsv partition -s \\t a.txt      // row separator is tab\r\ngsv partition -summary a.txt   // generate a summary file tabling the number of lines for unique column values\r\ngsv partition --help           // help info on all flags\r\n```\r\n\r\n- gsv select\r\n```shell\r\ngsv select -f 0=abc a.txt                       // has 
header, separator \",\", first column is \"abc\"\r\n                                                // set FILTER criterion using -f flag\r\ngsv select -f \"0=abc|0=de\" a.txt                // first column is \"abc\" or \"de\"\r\ngsv select -f \"0=abc\u00261=de\" a.txt                // first column is \"abc\" and second column is \"de\"\r\ngsv select -f 0=abc -c 0,1,2 a.txt              // output keeps only columns 0, 1, and 2\r\ngsv select -f 0=abc -o a.txt                    // save result to a-filter-current-time.txt\r\ngsv select -n -s \\t -f 0=abc -c 0,1,2 -o a.txt  // all options\r\ngsv select -c 0,1 -o a.txt                      // no filter, select columns only\r\ngsv select --help                               // help info on other options\r\n\t\r\ncolumn filter syntax:\r\n-f \"0=abc\"        --\u003e  first column equal to string \"abc\"\r\n-f \"1=5.0\"        --\u003e  second column equal to number 5.0\r\n-f \"1=5\"          --\u003e  same as the previous command, second column equal to number 5.0\r\n-f \"0=abc\u00261=5.0\"  --\u003e  first column is \"abc\" AND second column is 5.0\r\n-f \"0=abc|1=5.0\"  --\u003e  first column is \"abc\" OR second column is 5.0\r\n\r\nNOTE: 1. More complex syntax with brackets \r\n\t     such as \"(0=abc|1=5.0)\u0026c=1\" is not supported.\r\n      2. One filter can only have \u0026 or |, but never both. \r\n\t     This feature may be added in the future.\r\n      3. 
The filter option can be omitted to select all rows.\r\n\t     \r\ncolumn selection syntax:\r\n-c \"1,2\"   --\u003e    cols [1,2]\r\n-c \"1-3,6\" --\u003e    cols [1,2,3,6]\r\n-c \"!1\"    --\u003e    cols [all except col 1]\r\n-c \"-1\"    --\u003e    cols [all]\r\n```\r\n\r\n- gsv stats\r\n```shell\r\ngsv stats a.txt           // has header, separator \",\" (default)\r\ngsv stats -n a.txt        // no header\r\ngsv stats -s \\t a.txt     // tab separator\r\ngsv stats --help          // help info on all flags\r\n\r\nstatistics table.\r\n+------+--------+------+--------+---------------------+---------------------+----------+------------+------------+\r\n| COL  |  TYPE  | NULL | UNIQUE |         MIN         |          MAX        |   MEAN   | MIN LENGTH | MAX LENGTH |\r\n+------+--------+------+--------+---------------------+---------------------+----------+------------+------------+\r\n| col1 | string |    0 | 965304 | 00000208bb80146803f | ffffebf8245861dd564 |        - |         32 |         32 |\r\n| col2 |  float |    0 |      - |             30.1054 |             31.3370 |  30.6524 |          2 |          9 |\r\n| col3 |  float |    0 |      - |            103.0818 |            104.8750 | 104.0399 |          3 |         10 |\r\n| col4 |  float |    0 |      - |             30.1041 |             31.3370 |  30.6522 |          2 |          9 |\r\n| col5 |  float |    0 |      - |            103.0839 |            104.8742 | 104.0392 |          3 |         10 |\r\n| col6 | string |    0 | 566252 | 2016-11-07 00:00:00 | 2016-11-14 00:00:00 |        - |         23 |         23 |\r\n| col7 | string |    0 | 586711 | 1900-01-01 00:00:00 | 2021-09-24 13:52:23 |        - |         23 |         23 |\r\n| col8 |  float |    0 |      - |              0.0000 |             84.9298 |   2.0013 |          1 |         22 |\r\n+------+--------+------+--------+---------------------+---------------------+----------+------------+------------+\r\nTotal records: 9703035\r\nTime 
consumed: 6s\r\n```\r\n\r\n# Next\r\nNew features will be added in the future.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fribbondz%2Fgsv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fribbondz%2Fgsv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fribbondz%2Fgsv/lists"}