https://github.com/tst2005/lua-csv-parser
CSV parser with LPeg.re
- Host: GitHub
- URL: https://github.com/tst2005/lua-csv-parser
- Owner: tst2005
- Created: 2017-04-25T11:45:43.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-07-08T13:01:28.000Z (over 8 years ago)
- Last Synced: 2025-04-06T02:47:03.876Z (6 months ago)
- Topics: csv-parser, experiments, lpeg, lua, re
- Language: Lua
- Homepage:
- Size: 6.84 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# csv-parser
# For me
This is an experimental project to improve my LPeg/LPeg.re/parser/lexer/... skills.
I started from something simple: parsing a CSV file!

# For you
I am publishing my own code samples, built step by step.
I hope they will be useful to someone else.

# my tries
# 1 with LPeg
## 1.a. LPeg sample from doc
See ["the Comma-Separated Values (CSV)" sample](http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#CSV).
This gives the first sample ([try1a.lua](try1/try1a.lua)):
```lua
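local lpeg = require "lpeg"

-- quoted field ("" stands for one literal quote) or plain unquoted field
local field = '"' * lpeg.Cs(((lpeg.P(1) - '"') + lpeg.P'""' / '"')^0) * '"' +
              lpeg.C((1 - lpeg.S',\n"')^0)
-- comma-separated fields, ended by a newline or the end of input
local record = field * (',' * field)^0 * (lpeg.P'\n' + -1)

function csv (s)
  return lpeg.match(record, s)
end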
```
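
To see how this grammar treats quoted fields, here is a minimal sketch that repeats the two patterns so it can run on its own. The helper name `parse_line` and the input strings are made up for illustration; they are not part of the repository.

```lua
local lpeg = require "lpeg"

-- Same patterns as in the sample above, repeated so this snippet is self-contained.
local field = '"' * lpeg.Cs(((lpeg.P(1) - '"') + lpeg.P'""' / '"')^0) * '"' +
              lpeg.C((1 - lpeg.S',\n"')^0)
local record = field * (',' * field)^0 * (lpeg.P'\n' + -1)

-- parse_line is a hypothetical helper name used only in this sketch.
local function parse_line(s)
  return lpeg.match(record, s)
end

-- A comma inside quotes stays part of the field.
local a, b = parse_line('"x,y",z')
print(a, b)

-- A doubled quote inside a quoted field becomes one literal quote.
local fields = { parse_line('foo,"say ""hi""",baz') }
print(fields[1], fields[2], fields[3])

-- An unterminated quoted field makes the whole match fail (nil).
print(parse_line('"oops'))
```

Each field comes back as a separate return value, which is why the next step wraps the captures in a table.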
Run: `lua try1/test.try1a.lua`

And get:
```lua
foo bar baz
```

# 1.b. LPeg sample from doc (bis)
The docs say we can capture the values into a table. Just wrap the fields of the record definition in `lpeg.Ct`:
```diff
-local record = field * (',' * field)^0 * (lpeg.P'\n' + -1)
+local record = lpeg.Ct(field * (',' * field)^0) * (lpeg.P'\n' + -1)
```

We get ([try1b.lua](try1/try1b.lua)):
```lua
local lpeg = require "lpeg"

local field = '"' * lpeg.Cs(((lpeg.P(1) - '"') + lpeg.P'""' / '"')^0) * '"' +
              lpeg.C((1 - lpeg.S',\n"')^0)
-- lpeg.Ct collects all field captures into a single table
local record = lpeg.Ct(field * (',' * field)^0) * (lpeg.P'\n' + -1)

function csv (s)
  return lpeg.match(record, s)
end
```

Run: `lua try1/test.try1b.lua`

And get:
```lua
{
[1] = "foo",
[2] = "bar",
[3] = "baz",
}
```

# 2. with LPeg.re
For now, we run our experiments on this [CSV sample](sample.csv):
```csv
foo,bar,baz
1,2,"trois"
11,22,"trois trois"
```

We need the [LPeg.re documentation](http://www.inf.puc-rio.br/~roberto/lpeg/re.html#basic).
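
The grammars in this section rely on three `re` capture forms: `{ p }` (simple capture), `{| p |}` (table capture) and `{~ p ~}` (substitution capture, combined with `->`). The following is a small sketch of each one in isolation, assuming a reasonably recent LPeg; the input strings and variable names are invented for the example.

```lua
local re = require "re"

-- { p }: simple capture, returns the text matched by p
local first = re.match("abc,def", "{ [^,]* }")
print(first)                      --> abc

-- {| p |}: table capture, gathers every capture made inside p into one table
local list = re.match("a,b,c", "{| { [^,]* } (',' { [^,]* })* |}")
print(#list, list[1], list[3])    --> 3  a  c

-- {~ p ~}: substitution capture; '""' -> '"' replaces each doubled quote
local unescaped = re.match('say ""hi""', [[ {~ ([^"] / '""' -> '"')* ~} ]])
print(unescaped)                  --> say "hi"
```

These are the `re` counterparts of `lpeg.C`, `lpeg.Ct` and `lpeg.Cs` used in part 1.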
# 2.a. LPeg.re parse only one line at a time
Plain Lua code reads the input line by line and appends each parsed record to a table.
```lua
local re = require"re"
local input = io.stdin

local record = re.compile[[
	record <- {| field (',' field)* |} (%nl / !.)
	field <- escaped / nonescaped
	nonescaped <- { [^,"%nl]* }
	escaped <- '"' {~ ([^"] / '""' -> '"')* ~} '"'
]]

local parsed = {}
while true do
	local line = input:read("*l")
	if not line then break end
	parsed[#parsed+1]= record:match(line)
end

-- show the result
print("return "..require"tprint"(parsed)) -- in lua
--print(require"json".encode(parsed)) -- in json
```

Run: `lua csv-parser-1.lua < sample.csv`

See files:
* [csv-parser-1.lua](csv-parser-1.lua)
* [sample.csv](sample.csv)

Get the result:
```lua
return {
[1] = {
[1] = "foo",
[2] = "bar",
[3] = "baz",
},
[2] = {
[1] = "1",
[2] = "2",
[3] = "trois",
},
[3] = {
[1] = "11",
[2] = "22",
[3] = "trois trois",
},
}
```

# 2.b.
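We add a `records` rule, a repetition of `record`, so that the grammar parses the entire file without extra Lua code. Each `record` must now end with a newline (`%nl`), and the final `!.` checks that the whole input was consumed.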
```lua
local re = require"re"
local input = io.stdin

local csvfile = re.compile[[
	records <- {| (record)* |} !.
	record <- {| field (',' field)* |} %nl
	field <- escaped / nonescaped
	nonescaped <- { [^,"%nl]* }
	escaped <- '"' {~ ([^"] / '""' -> '"')* ~} '"'
]]

local parsed = csvfile:match(input:read("*a"))

-- show the result
print("return "..require"tprint"(parsed)) -- in lua
--print(require"json".encode(parsed)) -- in json
```

Run: `lua csv-parser-2.lua < sample.csv`

See files:
* [csv-parser-2.lua](csv-parser-2.lua)
* [sample.csv](sample.csv)

Get the result:
```lua
return {
[1] = {
[1] = "foo",
[2] = "bar",
[3] = "baz",
},
[2] = {
[1] = "1",
[2] = "2",
[3] = "trois",
},
[3] = {
[1] = "11",
[2] = "22",
[3] = "trois trois",
},
}
```

# 2.c. result in AST
```lua
local re = require"re"
local input = io.stdin

local csvfile = re.compile[[
	csvfile <- {| {:tag: '' -> "csvfile":} hdr (row)+ |} !.
	hdr <- row
	row <- {| {:tag: '' -> "row" :} field (',' field)* |} %nl
	eol <- %nl -- end of line (%nl is newline, "\n")

	field <- escaped / nonescaped
	nonescaped <- { [^,"%nl]* }
	escaped <- '"' {~ ([^"] / '""' -> '"')* ~} '"'
]]

local parsed = csvfile:match(input:read("*a"))

-- show the result
print("return "..require"tprint"(parsed)) -- in lua
--print(require"json".encode(parsed)) -- in json
```

Run: `lua csv-parser-3.lua < sample.csv`

See files:
* [csv-parser-3.lua](csv-parser-3.lua)
* [sample.csv](sample.csv)

Get the result:
```lua
return {
[1] = {
[1] = "foo",
[2] = "bar",
[3] = "baz",
["tag"] = "row",
},
[2] = {
[1] = "1",
[2] = "2",
[3] = "trois",
["tag"] = "row",
},
[3] = {
[1] = "11",
[2] = "22",
[3] = "trois trois",
["tag"] = "row",
},
["tag"] = "csvfile",
}
```

# 2.d.