{"id":17490868,"url":"https://github.com/stastnypremysl/lsql-csv","last_synced_at":"2026-02-25T07:40:40.386Z","repository":{"id":55735306,"uuid":"266841594","full_name":"stastnypremysl/lsql-csv","owner":"stastnypremysl","description":"lsql-csv is a tool for small CSV file data querying from a shell with short queries. It makes it possible to work with small CSV files like with a read-only relational databases. The tool implements a new language LSQL similar to SQL, specifically designed for working with CSV files in a shell. LSQL aims to be a more lapidary language than SQL.","archived":false,"fork":false,"pushed_at":"2025-04-27T13:29:30.000Z","size":2924,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-08T07:10:56.150Z","etag":null,"topics":["csv","data-analysis","data-processing","haskell","language","linux-shell","lsql","lsql-csv","new-language","query-language","relational-database","sql","unix-command","unix-philosophy","unix-shell"],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stastnypremysl.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-05-25T17:36:14.000Z","updated_at":"2025-04-27T13:29:34.000Z","dependencies_parsed_at":"2024-04-15T15:18:55.391Z","dependency_job_id":"895915af-5943-46c2-ad63-03a86d03dddd","html_url":"https://github.com/stastnypremysl/lsql-csv","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/stastnypremysl/lsql-csv","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stastnypremysl%2Flsql-csv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stastnypremysl%2Flsql-csv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stastnypremysl%2Flsql-csv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stastnypremysl%2Flsql-csv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stastnypremysl","download_url":"https://codeload.github.com/stastnypremysl/lsql-csv/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stastnypremysl%2Flsql-csv/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29814179,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-25T05:36:42.804Z","status":"ssl_error","status_checked_at":"2026-02-25T05:36:31.934Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-analysis","data-processing","haskell","language","linux-shell","lsql","lsql-csv","new-language","query-language","relational-database","sql","unix-command","unix-philosophy","unix-shell"],"created_at":"2024-10-19T07:04:42.846Z","updated_at":"2026-02-25T07:40:40.350Z","avatar_url":"https://github.com/stastnypremysl.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `lsql-csv`\n`lsql-csv` is a tool for CSV file data querying from a shell with short queries. It makes it possible to work with small CSV files like with a read-only relational database.\n\nThe tool implements a new language LSQL similar to SQL, specifically designed for working with CSV files in a shell. LSQL aims to be a more lapidary language than SQL. Its design purpose is to enable its user to quickly write simple queries directly\n  to the terminal - its design purpose is therefore different from SQL, where the readability of queries is more taken into account than in LSQL.\n\n## Installation\nIt is necessary, you had GHC (`\u003e=8 \u003c9.29`) and Haskell packages Parsec (`\u003e=3.1 \u003c3.2`), Glob (`\u003e=0.10 \u003c0.11`), base (`\u003e=4.9 \u003c4.20`), text (`\u003e=1.2 \u003c2.2`) and containers (`\u003e=0.5 \u003c0.8`)\n installed. (The package boundaries given are identical to the boundaries in Cabal package file.) For a build and an installation run:\n\n    make\n    sudo make install\n    \nNow the `lsql-csv` is installed in `/usr/local/bin`. If you want, you can specify `INSTALL_DIR` like:\n\n    sudo make INSTALL_DIR=/custom/install-folder install\n\nThis will install the package into `INSTALL_DIR`.\n\nIf you have installed `cabal`, you can alternatively run:\n\n    cabal install\n   \nIt will also install the Haskell package dependencies for you.    \n\nThe package is also published at https://hackage.haskell.org/package/lsql-csv in the Hackage public repository. You can therefore also install it directly without the repository cloned with:\n\n    cabal install lsql-csv\n\n### Running the integration tests\nIf you want to verify, that the package has been compiled correctly, it is possible to test it by running:\n\n    make test\n\nThis will run all tests for you.\n\n\n## `lsql-csv` - quick introduction \n\n\n### Examples\nOne way to learn a new programming language is by understanding concrete examples of its usage. The following examples are written explicitly for the purpose of teaching a reader, how to use the tool `lsql-csv` by showing him many examples of its usage. \n\nThe following examples might not be enough for readers, who don't know enough Unix/Linux scripting. If this is the case, please consider learning Unix/Linux scripting first before LSQL.\n\nIt is also advantageous to know SQL.\n\nThe following examples will be mainly about parsing of `/etc/passwd` and parsing of `/etc/group`. To make example reading more comfortable, we have added `/etc/passwd` and `/etc/group` column descriptions from man pages to the text.\n\nFile `/etc/passwd` has the following columns:\n* login name;\n* optional encrypted password;\n* numerical user ID;\n* numerical group ID;\n* user name or comment field;\n* user home directory;\n* optional user command interpreter.\n\nFile `/etc/group` has the following columns:\n* group name;\n* password;\n* numerical group ID;\n* user list separated by commas.\n\n#### Hello World\n\n    lsql-csv '-, \u00261.2 \u00261.1'\n\nThis will print the second (`\u00261.2`) and the first column (`\u00261.1`) of a CSV file on the standard input. If you know SQL, you can read it like `SELECT S.second, S.first FROM stdio S;`. \n\nCommands are split by commas into blocks. The first block is (*and always is*) the from block. There are file names or `-` (the standard input) separated by whitespaces. \nThe second block in the example is the select block, also separated by whitespaces.\n\nFor example:\n\n    lsql-csv '-, \u00261.2 \u00261.1' \u003c\u003c- EOF\n    World,Hello\n    EOF\n    \nIt returns:\n    \n    Hello,World\n\n#### Simple filtering \n    lsql-csv -d: '-, \u00261.*, if \u00261.3 \u003e= 1000' \u003c/etc/passwd\n\nThis will print lines of users whose `UID \u003e= 1000`. It can also be written as:\n  \n    lsql-csv -d: 'p=/etc/passwd, p.*, if p.3 \u003e= 1000'\n    \n    lsql-csv -d: 'p=/etc/passwd, \u00261.*, if \u00261.3 \u003e= 1000'\n\n    lsql-csv -d: '/etc/passwd, \u00261.*, if \u00261.3 \u003e= 1000'\n    \nThe `-d:` optional argument means the primary delimiter is `:`. In the few examples we used overnaming, which allows us to give a data source file `/etc/passwd` a name `p`.\n\nIf you know SQL, you can read it as `SELECT * FROM /etc/passwd P WHERE P.UID \u003e= 1000;`. As you can see, the LSQL style is more compressed than standard SQL.\n    \nThe output might be:\n\n    nobody:x:65534:65534:nobody:/var/empty:/bin/false\n    me:x:1000:1000::/home/me:/bin/bash\n\nIf you specify delimiter specifically for `/etc/passwd`, the output will be a comma delimited.\n    \n    lsql-csv '/etc/passwd -d:, \u00261.*, if \u00261.3 \u003e= 1000'\n\nIt might return:\n\n    nobody,x,65534,65534,nobody,/var/empty,/bin/false\n    me,x,1000,1000,,/home/me,/bin/bash\n\nThis happens because the default global delimiter, which is used for the output generation, is a comma.\nThe global delimiter changes by usage of the command-line optional argument, but remains unchanged by the usage of the attribute inside the from block.\n\n\n#### Named columns\nLet's suppose we have a file `people.csv`:\n   \n    name,age\n    Adam,21\n    Petra,23\n    Karel,25\n\nNow, let's get all the names of people in `people.csv` using the `-n` named optional argument:\n\n    lsql-csv -n 'people.csv, \u00261.name'\n\nThe output will be:\n\n    Adam\n    Petra\n    Karel\n\nAs you can see, we can reference named columns by the name. The named optional argument `-n` enables first-line headers.\nIf first-line headers are enabled by the argument, each column has two names under `\u0026X` - the number name `\u0026X.Y` and the actual name `\u0026X.NAME`.\n\nNow, we can select all columns with a wildcard `\u00261.*`:\n\n    lsql-csv -n 'people.csv, \u00261.*'\n\nAs the output, we get:\n\n    Adam,21,21,Adam\n    Petra,23,23,Petra\n    Karel,25,25,Karel\n\nThe output contains each column twice because the wildcard `\u00261.*` was evaluated to `\u00261.1, \u00261.2, \u00261.age, \u00261.name`.\nHow to fix it?\n\n    lsql-csv -n 'people.csv, \u00261.[1-9]*'\n\nThe output is now:\n\n    Adam,21\n    Petra,23\n    Karel,25\n\nThe command can also be written as\n\n    lsql-csv -n 'people.csv, \u00261.{1,2}'\n    lsql-csv -n 'people.csv, \u00261.{1..2}'\n    lsql-csv 'people.csv -n, \u00261.{1..2}'\n\nThe output will be in all cases still the same.\n\n\n#### Simple join\n\nLet's say, I am interested in the default group names of users. We need to join tables `/etc/passwd` and `/etc/group`. Let's do it.\n\n    lsql-csv -d: '/etc/{passwd,group}, \u00261.1 \u00262.1, if \u00261.4 == \u00262.3'\n    \nWhat does `/etc/{passwd,group}` mean? Basically, there are three types of expressions. The select, the from, and the arithmetic expression. \nIn all select and from expressions, you can use the curly expansion and wildcards just like in `bash`.\n    \nFinally, the output can be something like this:\n\n    root:root\n    bin:bin\n    daemon:daemon\n    me:me\n\nThe first column is the name of a user and the second column is the name of its default group.\n    \n#### Basic grouping\nLet's say, I want to count users using the same shell. \n\n    lsql-csv -d: 'p=/etc/passwd, p.7 count(p.3), by p.7'\n    \nAnd the output?\n\n    /bin/bash:7\n    /bin/false:7\n    /bin/sh:1\n    /bin/sync:1\n    /sbin/halt:1\n    /sbin/nologin:46\n    /sbin/shutdown:1\n    \nYou can see here the first usage of the by block, which is equivalent to `GROUP BY` in SQL. \n\n#### Basic sorting\nLet's say, you want to sort your users by `UID` with `UID` greater than or equal to 1000 ascendingly.\n\n    lsql-csv -d: '/etc/passwd, \u00261.*, if \u00261.3 \u003e= 1000, sort \u00261.3'\n\nThe output might look like:\n  \n    me1:x:1000:1000::/home/me1:/bin/bash\n    me2:x:1001:1001::/home/me2:/bin/bash\n    me3:x:1002:1002::/home/me3:/bin/bash\n    nobody:x:65534:65534:nobody:/var/empty:/bin/false\n\nThe sort block is the equivalent of `ORDER BY` in SQL.\n\nIf we wanted descendingly sorted output, we might create a pipe to the `tac` command - the `tac` command prints the lines in reverse order:\n    \n    lsql-csv -d: '/etc/passwd, \u00261.*, if \u00261.3 \u003e= 1000, sort \u00261.3' | tac\n\n    \n#### About nice outputs\nThere is a trick, how to concatenate two values in the select expression: Write them without space.\n\nBut how will the interpreter know the ends of the values in a command? If the interpreter sees a char, that can't be part of the currently parsed value, it tries to parse it as a new value concatenated to the current one.\nYou can use quotes for it - quotes themselves can't be part of most value types like the column name or numerical constant.\n\nAs an example, let's try to format our basic grouping example.\n\n    lsql-csv -d: 'p=/etc/passwd, \"The number of users of \"p.7\" is \"count(p.3)\".\", by p.7\n    \nThe output might be:\n \n    The number of users of /bin/bash is 7.\n    The number of users of /bin/false is 7.\n    The number of users of /bin/sh is 1.\n    The number of users of /bin/sync is 1.\n    The number of users of /sbin/halt is 1.\n    The number of users of /sbin/nologin is 46.\n    The number of users of /sbin/shutdown is 1.\n   \nAs you can see, string formatting is sometimes very simple with LSQL.\n\n\n#### Arithmetic expression\n\nSo far, we just met all kinds of blocks, and only the if block accepts the arithmetic expression, and the other accepts the select expression. \nWhat if we needed to run the arithmetic expression inside the select expression? There is a special syntax `$(...)` for it.\n\nFor example:\n\n    lsql-csv -d: '/etc/passwd, $(sin(\u00261.3)^2 + cos(\u00261.3)^2)'\n    \nIt returns something like:\n    \n    1.0\n    1.0\n    1.0\n    0.9999999999999999\n    ...\n    1.0\n\nIf we run:\n    \n    lsql-csv -d: '/etc/passwd, $(\u00261.3 \u003e= 1000), sort $(\u00261.3 \u003e= 1000)'\n\nWe get something like:\n\n    false\n    false\n    ...\n    false\n    true\n    true\n    ...\n    true\n\n#### More complicated join\n\nLet's see more complicated examples.\n\n    lsql-csv -d: 'p=/etc/passwd g=/etc/group, p.1 g.1, if p.1 in g.4'\n    \nThis will print all pairs of users and its group excluding the default group. \nIf you know SQL, you can read it as `SELECT P.1, G.1 FROM /etc/passwd P, /etc/group G WHERE G.4 LIKE '%' + P.1 + '%';` with operator `LIKE` case-sensitive and columns named by their column number.\n\nHow does `in` work? It's one of the basic string level \"consist\". If some string `A` is a substring of `B`, then `A in B` is `true`. Otherwise, it is `false`.\n\nAnd the output?\n\n    root:root\n    root:wheel\n    root:floppy\n    root:tape\n    lp:lp\n    halt:root\n    halt:wheel\n\nThe example will work under the condition, that there isn’t any username, which is an infix of any other username.\n\n#### More complicated...\n\nThe previous example doesn't give a very readable output. We can use `group by` to improve it (shortened as `by`).\n\n    lsql-csv -d: 'p=/etc/passwd g=/etc/group, p.1 cat(g.1\",\"), if p.1 in g.4, by p.1'\n\nThe output will be something like:\n    \n    adm:adm,disk,sys,\n    bin:bin,daemon,sys,\n    daemon:adm,bin,daemon,\n    lp:lp,\n    mythtv:audio,cdrom,tty,video,\n    news:news,\n    \nIt groups all non-default groups of a user to a one line and concatenates it delimited by `,`. \n\nHow can we add default groups too?\n\n    lsql-csv -d: 'p=/etc/passwd g=/etc/group, p.1 cat(g.1\",\"), if p.1 in g.4, by p.1' |\n    lsql-csv -d: '- /etc/passwd /etc/group, \u00261.1 \u00261.2\"\"\u00263.1, if \u00261.1 == \u00262.1 \u0026\u0026 \u00262.4 == \u00263.3'\n    \nThis will output something like:\n\n    adm:adm,disk,sys,adm\n    bin:bin,daemon,sys,bin\n    daemon:adm,bin,daemon,daemon\n    lp:lp,lp\n    mythtv:audio,cdrom,tty,video,mythtv\n    news:news,news\n\nThe first part of the command is the same as in the previous example. The second part inner joins the output\nof the first part with `/etc/passwd` on the username and `/etc/group` on the default GID number and prints\nthe output of the first part with an added default group name.\n\nThe examples will also work under the condition, that there isn’t any username,\nwhich is an infix of any other username.\n\n## Usage\nNow, if you understand the examples, it is time to move forward to a more abstract description of the language and tool usage.\n\n### Options\n\n    -h\n    --help\n\nShows a short command line help and exits before doing anything else.\n\n    -n\n    --named\n\nEnables the first-line naming convention in CSV files. \nWith this option, the first lines of CSV files will be interpreted as a list of column names.\n\nThis works only on input files. Output is always without first-line column\nnames.\n    \n    -dCHAR\n    --delimiter=CHAR\n\nChanges the default primary delimiter. The default value is `,`.\n\n    -sCHAR\n    --secondary-delimiter=CHAR\n    \nChanges the default quote char (secondary delimiter). The default value is `\"`.\n\n### Datatypes\nThere are 4 datatypes considered: `Bool`, `Int`, `Double`, and `String`. \n`Bool` is either `true`/`false`, `Int` is at least a 30-bit integer, `Double` is a double-precision floating point number, and `String` is an ordinary char string.\n\nDuring CSV data parsing, the following logic of datatype selection is used: \n* `Bool`, if `true` or `false`;\n* `Int`, if the POSIX ERE `[0-9]+` fully matches;\n* `Double`, if the POSIX ERE `[0-9]+\\.[0-9]+(e[0-9]+)?` fully matches;\n* `String`, if none of the above matches.\n\n### Joins\nJoin means, that you put multiple input files into the from block.\n\nJoins always have the time complexity O(nm). \nThere is no optimization made based on if conditions when you put multiple files into the from block.\n\n### Documentation of language\n\n    lsql-csv [OPTIONS] COMMAND\n    \n    Description of the grammar:\n    \n      COMMAND -\u003e FROM_BLOCK, REST\n\n    \n      REST -\u003e SELECT_BLOCK, REST\n      REST -\u003e BY_BLOCK, REST\n      REST -\u003e SORT_BLOCK, REST\n      REST -\u003e IF_BLOCK, REST\n      REST -\u003e LAST_BLOCK\n        \n      LAST_BLOCK -\u003e SELECT_BLOCK   \n      LAST_BLOCK -\u003e BY_BLOCK\n      LAST_BLOCK -\u003e SORT_BLOCK\n      LAST_BLOCK -\u003e IF_BLOCK\n\n    \n      FROM_BLOCK -\u003e FROM_EXPR\n\n      FROM_EXPR -\u003e FROM_SELECTOR FROM_EXPR\n      FROM_EXPR -\u003e FROM_SELECTOR\n\n      // Wildcard and brace expansion\n      FROM_SELECTOR ~~\u003e FROM ... FROM \n\n\n      // Standard input\n      FROM -\u003e ASSIGN_NAME=- OPTIONS\n      FROM -\u003e - OPTIONS\n  \n      FROM -\u003e ASSIGN_NAME=FILE_PATH OPTIONS\n      FROM -\u003e FILE_PATH OPTIONS\n\n\n      OPTIONS -\u003e -dCHAR OPTIONS\n      OPTIONS -\u003e --delimiter=CHAR OPTIONS\n\n      OPTIONS -\u003e -sCHAR OPTIONS\n      OPTIONS -\u003e --secondary-delimiter=CHAR OPTIONS\n\n      OPTIONS -\u003e -n OPTIONS\n      OPTIONS -\u003e --named OPTIONS\n\n      OPTIONS -\u003e -N OPTIONS\n      OPTIONS -\u003e --not-named OPTIONS\n\n      OPTIONS -\u003e\n\n      \n      SELECT_BLOCK -\u003e SELECT_EXPR\n      BY_BLOCK -\u003e by SELECT_EXPR\n      SORT_BLOCK -\u003e sort SELECT_EXPR\n      IF_BLOCK -\u003e if ARITHMETIC_EXPR\n    \n \n      ARITHMETIC_EXPR -\u003e ATOM\n\n      ARITHMETIC_EXPR -\u003e ARITHMETIC_EXPR OPERATOR ARITHMETIC_EXPR\n      ARITHMETIC_EXPR -\u003e (ARITHMETIC_EXPR)\n\n      // Logical negation\n      ARITHMETIC_EXPR -\u003e ! ARITHMETIC_EXPR\n      // Number negation\n      ARITHMETIC_EXPR -\u003e - ARITHMETIC_EXPR\n      \n\n      SELECT_EXPR -\u003e ATOM_SELECTOR SELECT_EXPR\n      SELECT_EXPR -\u003e ATOM_SELECTOR\n      \n      // Wildcard and brace expansion\n      ATOM_SELECTOR ~~\u003e ATOM ... ATOM \n      \n\n      ATOM -\u003e pi\n      ATOM -\u003e e\n      ATOM -\u003e true\n      ATOM -\u003e false\n\n      // e.g. 1.0, \"text\", 'text', 1\n      ATOM -\u003e CONSTANT\n      // e.g. \u00261.1\n      ATOM -\u003e SYMBOL_NAME\n\n      ATOM -\u003e $(ARITHMETIC_EXPR)\n      ATOM -\u003e AGGREGATE_FUNCTION(SELECT_EXPR)\n      ATOM -\u003e ONEARG_FUNCTION(ARITHMETIC_EXPR)\n\n\n      // # is not a char:       \n      // Two atoms can be written without whitespace   \n      // and their values will be String appended    \n      // if the right atom begins with a char, \n      // which can't be a part of the left atom.     \n      //\n      // E.g. if the left atom is a number constant, \n      // and the right atom is a String constant \n      // beginning with a quote char,\n      // the left atom value will be converted to the String \n      // and prepended to the right atom value.\n      //\n      // This rule doesn't apply inside ARITHMETIC_EXPR \n      ATOM ~~\u003e ATOM#ATOM     \n\n      \n      // Converts all values to the String type and appends them.\n      AGGREGATE_FUNCTION -\u003e cat\n\n      // Returns the number of values.\n      AGGREGATE_FUNCTION -\u003e count\n\n      AGGREGATE_FUNCTION -\u003e min\n      AGGREGATE_FUNCTION -\u003e max\n      AGGREGATE_FUNCTION -\u003e sum\n      AGGREGATE_FUNCTION -\u003e avg\n      \n\n      // All trigonometric functions are in radians.\n      ONEARG_FUNCTION -\u003e sin\n      ONEARG_FUNCTION -\u003e cos\n      ONEARG_FUNCTION -\u003e tan\n      \n      ONEARG_FUNCTION -\u003e asin\n      ONEARG_FUNCTION -\u003e acos\n      ONEARG_FUNCTION -\u003e atan\n      \n      ONEARG_FUNCTION -\u003e sinh\n      ONEARG_FUNCTION -\u003e cosh\n      ONEARG_FUNCTION -\u003e tanh\n      \n      ONEARG_FUNCTION -\u003e asinh\n      ONEARG_FUNCTION -\u003e acosh\n      ONEARG_FUNCTION -\u003e atanh\n      \n      ONEARG_FUNCTION -\u003e exp\n      ONEARG_FUNCTION -\u003e sqrt\n      \n      // Converts a value to the String type and returns its length.\n      ONEARG_FUNCTION -\u003e size\n\n      ONEARG_FUNCTION -\u003e to_string\n      \n      ONEARG_FUNCTION -\u003e negate\n      ONEARG_FUNCTION -\u003e abs\n      ONEARG_FUNCTION -\u003e signum\n      \n      ONEARG_FUNCTION -\u003e truncate\n      ONEARG_FUNCTION -\u003e ceiling\n      ONEARG_FUNCTION -\u003e floor\n      \n      ONEARG_FUNCTION -\u003e even\n      ONEARG_FUNCTION -\u003e odd\n\n      \n      // A in B means A is a substring of B.\n      OPERATOR -\u003e in\n      \n      OPERATOR -\u003e *\n      OPERATOR -\u003e /\n\n      // General power\n      OPERATOR -\u003e **    \n      // Natural power\n      OPERATOR -\u003e ^     \n      \n      // Integer division truncated towards minus infinity\n      // (x div y)*y + (x mod y) == x\n      OPERATOR -\u003e div\n      OPERATOR -\u003e mod\n      \n      // Integer division truncated towards 0\n      // (x quot y)*y + (x rem y) == x  \n      OPERATOR -\u003e quot\n      OPERATOR -\u003e rem\n\n      // Greatest common divisor\n      OPERATOR -\u003e gcd\n      // Least common multiple\n      OPERATOR -\u003e lcm\n      \n      // String append\n      OPERATOR -\u003e ++    \n      \n      OPERATOR -\u003e +\n      OPERATOR -\u003e -\n      \n      OPERATOR -\u003e \u003c=\n      OPERATOR -\u003e \u003e=\n      OPERATOR -\u003e \u003c\n      OPERATOR -\u003e \u003e\n      OPERATOR -\u003e !=\n      OPERATOR -\u003e ==\n      \n      OPERATOR -\u003e ||\n      OPERATOR -\u003e \u0026\u0026\n\nEach command is made from blocks separated by a comma. There are these types of blocks.\n* From block\n* Select block\n* If block\n* By block\n* Sort block\n\nThe first block is always the from block. If the block after the first block is without a specifier (`if`, `by`, or `sort`), then it is the select block. Otherwise, it is a block specified by the specifier.\n\nThe from block accepts a specific grammar (as specified in the grammar description), the select, the by, and the sort block accept the select expression (`SELECT_EXPR` in the grammar), \nand the if block accepts the arithmetic expression (`ARITHMETIC_EXPR` in the grammar).\n\nEvery source data file has a reference number based on its position in the from block and may have multiple names - the assign name, the name given to the source data file by `ASSIGN_NAME=FILE_PATH` syntax in the from block, and \nthe default name, which is given by the path to the file or `-` in the case of the standard input in the from block.\n\nEach column of a source data file has a reference number based on its position in it and may have a name (if the named option is enabled for the given source file). \n\nIf a source data file with the reference number `M` (numbering input files from 1) has a name `XXX`, its columns can be addressed by `\u0026M.N` or `XXX.N`, where `N` is the reference number of a column (numbering columns from 1). \nIf the named option is enabled for the input file and a column has the name `NAME`, it can also be addressed by `\u0026M.NAME` or `XXX.NAME`.\n\nWe call the address a symbol name - `SYMBOL_NAME` in the grammar description.\n\nIf there is a collision in naming (some symbol name addresses more than one column), then the behavior is undefined.\n\n\n#### Exotic chars\nSome chars cannot be in unquoted symbol names - exotic chars. For simplicity, we can suppose, they are all non-alphanumerical chars excluding `-`, `.`, `\u0026`, and `_`. \nAlso the first char of a symbol name must be non-numerical and must not be `-` or `.` to not be considered as an exotic char.\n\nIt is possible to use a symbol name with exotic chars using \\` quote - like \\`EXOTIC SYMBOL NAME\\`. \n\n#### Quote chars\nThere are 3 quote chars (\\`, \" and ') used in LSQL. \" and ' are always quoting a `String`. The \\` quote char is used for quoting symbol names.\n\nThese chars can be used for `String` appending. If two atoms inside SELECT_EXPR are written consecutively without whitespace and the left atom ends by a quote char or the right begins by a quote char, \nthey will be converted to the `String` and will be `String` appended. \nFor example, `\u00261.1\"abc\"` means: convert the value of `\u00261.1` to the `String` and append it to the `String` constant `abc`.\n\n#### Constants\nConstants are in the grammar description as `CONSTANT`. In the following section, we speak only about these constants and not about built-in constant values like `pi` or `true`.\n\nThere are 3 datatypes of constants. `String`, `Double`, and `Int`. \nEvery string quoted in \" chars or ' chars in an LSQL command is always tokenized as a `String` constant. \nNumbers fully matching the POSIX ERE `[0-9]+` are considered `Int` constants and numbers fully matching the POSIX ERE `[0-9]+\\.[0-9]+` `Double` constants.\n\n#### Operator associativity and precedence\nAll operators are right-to-left associative.\n\nThe following list outlines the precedence of the `lsql-csv` infix operators. The lower the precedence number, the higher the priority.\n* 1: `in`, `**`, `^`\n* 2: `*`, `/`, `div`, `quot`, `rem`, `mod`, `gcd`, `lcm`\n* 3: `++`, `+`, `-`\n* 4: `\u003c=`, `\u003e=`, `\u003c`, `\u003e`, `!=`, `==`\n* 5: `||`, `\u0026\u0026`\n\n\n\n#### Select expression\nSelect expressions are in the grammar description as `SELECT_EXPR`.\nThey are similar to the `bash` expressions. They are made by atom selector expressions (`ATOM_SELECTOR`) separated by whitespaces. \nThese expressions are wildcard and brace expanded to atoms (`ATOM`) and are further processed as they were separated by whitespace.\n(In `bash` brace expansion is a mechanism by which arbitrary strings are generated. For example, `a{b,c,d}e` is expanded to `abe ace ade`, see [the `bash` reference manual](https://www.gnu.org/software/bash/manual/bash.html) for details.)\n\nWildcards and brace expansion expressions are only evaluated and expanded in unquoted parts of the atom selector expression,\nwhich aren’t part of an inner arithmetic expression.\n\nFor example, if we have an LSQL command with symbol names `\u00261.1` and `\u00261.2`, then\n* the atom selector expression `\u00261.{1,2}` will be expanded to `\u00261.1` and `\u00261.2`;\n* the atom selector expression `\u00261.*` will be expanded to `\u00261.1` and `\u00261.2`;\n* the atom selector expression `` `\u00261.*` `` will be expanded to `` `\u00261.*` ``;\n* the atom selector expression `\"\u00261.*\"` will be expanded to `\"\u00261.*\"`;\n* the atom selector expression `$(\u0026?.1)` will be expanded to `$(\u0026?.1)`;\n* the atom selector expression `\u0026*$(\u00261.1)` will be expanded to `\u00261.1$(\u00261.1)` and `\u00261.2$(\u00261.1)`.\n\n\nEvery atom selector expression can consist:\n* A wildcard (Each wildcard is expanded against the symbol name list. If no symbol name matching the wildcard is found, the wildcard is expanded to itself.);\n* A `bash` brace expansion expression (e.g. `{22..25}` -\u003e `22 23 24 25`);\n* An arithmetic expression in `$(expr)` format;\n* A call of an aggregate function `AGGREGATE_FUNCTION(SELECT_EXPR)` - there cannot be any space after `FUNCTION`;\n* A call of a one-argument function `ONEARG_FUNCTION(ARITHMETIC_EXPR)` — there cannot be any space after `FUNCTION`;\n* A constant;\n* A symbol name;\n* A built-in constant value;\n* A reference to a column name.\n\nIf you want to concatenate strings without `++` operator, you can write: `a.1\",\"a.2`.\n\nPlease, keep in mind, that operators must be put inside arithmetic expressions, or they will be matched to a column name or aggregate function.\n\n#### Arithmetic expression\nArithmetic expressions are in the grammar description as `ARITHMETIC_EXPR`.\n\nThe expressions use mainly the classical `awk` style of expressions.\nYou can use here operators `OPERATOR` keywords `\u003e`, `\u003c`, `\u003c=`, `\u003e=`, `==`, `||`, `\u0026\u0026`, `+`, `-`, `*`, `/`... \n\nWildcards and brace expansion expressions are not evaluated inside the arithmetic expression.\n\n#### Select blocks\nSelect blocks are referred in the grammar description as `SELECT_BLOCK`.\nThese blocks determine the output. They accept the select expression.\n\nThere must be at least one select block in an LSQL command, which refers to at least one symbol name, or the behavior is undefined.\n\nExamples of select blocks:\n\n    \u00261.[3-6]\n\nThis will print the 3rd, 4th, 5th, and 6th columns from the first file if the first file has at least 6 columns. \n\n    ax*.{6..4}\n\nThis will print the 6th, the 5th, and the 4th columns from all files whose name begins with ax if the files have at least 6 columns.\n\n\n\n#### From blocks\nThese blocks are in the grammar description as `FROM_BLOCK`.\nThere must be exactly one from block at the beginning of an LSQL command. \n\nThe from block contains input file paths (or `-` in the case of the standard input), and optionally their assign name `ASSIGN_NAME`. \n\nYou can use the wildcards and the curly bracket expansion as you were in the `bash` to refer input files. \nIf there is a wildcard with an assign name `NAME` matching more than one input file, the input files will be given assign names `NAME`, `NAME1`, `NAME2`...\nIf there is a wildcard, that matches to no file, it is expanded to itself. \n\nIf `FILE_PATH` is put inside \\` quotes, no wildcard or expansion logic applies to it.\n\nYou can also add custom attributes to input files in the format `FILE_PATH -aX --attribute=X -b`. The attributes will be applied to all files which will be matched against `FILE_PATH`.\nThe custom attributes are referred to as `OPTIONS` in the grammar description.\n\nExamples:\n\n    /etc/{passwd,group}\n\n\nThis will select `/etc/passwd` and `/etc/group` files. They can be addressed either as `\u00261` or `/etc/passwd`, and `\u00262` or `/etc/group`.\n\n    passwd=/etc/passwd\n\nThis will select `/etc/passwd` and set its assign name to `passwd`. It can be addressed as `\u00261`, `passwd`, or `/etc/passwd`.\n\n\n\n\n##### Possible attributes\n    \n    -n\n    --named\n\nEnables the first-line naming convention for an input CSV file.\nWith this option, the first line of a CSV file will be interpreted as a list of column names.\n\n\n    -N\n    --not-named\n\nYou can also set the exact opposite to an input file. This can be useful if you change the default behavior.\n\n    -dCHAR\n    --delimiter=CHAR\n    \nThis changes the primary delimiter of an input file.\n\n    -sCHAR\n    --secondary-delimiter=CHAR\n    \nThis changes the secondary delimiter of an input file.\n\nExample:\n\n    /etc/passwd -d:\n\nThis will select `/etc/passwd` and set its delimiter to `:`.\n\nCurrently, commas and `CHAR`s, which are also quotes in LSQL, are not supported as a delimiter or a secondary delimiter in `FILE_PATH` custom attributes.\n\n\n#### If blocks\nThese blocks are in the grammar description as `IF_BLOCK`.\nThey always begin with the `if` keyword.        \nThey accept arithmetic expressions, which should be convertible to `Bool`: \neither `String` `false`/`true`, `Int` `0` `false`, anything else `true`), or `Bool`. \n\nRows with the arithmetic expression converted to `Bool` `true` are printed or aggregated, and\nrows with the arithmetic expression converted to `Bool` `false` are skipped.\n\nFiltering is done before the aggregation.\n\nYou can imagine the if block as the `WHERE` clause in SQL.\n\n\n#### By blocks\nBy blocks are referred in the grammar description as `BY_BLOCK`.\nThese blocks always begin with the `by` keyword. They accept the select expression.\n\nThere can be only one by block in a whole LSQL command.\n\nThe by block is used to group the resulting set by the given atoms for the evaluation by an aggregate function.\nThe by block is similar to the `GROUP BY` clause in SQL.\n\nThere must be at least one aggregate function in the select block if the by block is present. Otherwise, the behavior is undefined.\n\nIf there is an aggregate function present without the by block present in an LSQL command, the aggregate function runs over all rows at once.\n\n\n#### Sort blocks\nThese blocks are in the grammar description as `SORT_BLOCK`.\nIt begins with the `sort` keyword. They accept the select expression.\n\nThe sort block determines the order of the final output - given atoms are sorted in ascending order.\nIf there is more than one atom in the sort block (`A`, `B`, `C`...), the data is first sorted by `A` \nand in the case of ties, the atoms (`B`, `C`...) are used to further refine the order of the final output.\n\nYou can imagine the sort block as the `ORDER BY` clause in SQL.\n\nThere can be only one sort block in the whole command.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstastnypremysl%2Flsql-csv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstastnypremysl%2Flsql-csv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstastnypremysl%2Flsql-csv/lists"}