{"id":20244308,"url":"https://github.com/protonmail/go-rfc5322","last_synced_at":"2025-04-10T20:43:56.903Z","repository":{"id":68953030,"uuid":"311374338","full_name":"ProtonMail/go-rfc5322","owner":"ProtonMail","description":"An RFC5322 address/date parser written in Go","archived":false,"fork":false,"pushed_at":"2022-08-16T13:44:17.000Z","size":288,"stargazers_count":14,"open_issues_count":1,"forks_count":5,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-24T18:13:14.894Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ProtonMail.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-09T15:02:02.000Z","updated_at":"2024-11-20T11:49:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"91214ebd-ded6-4547-935e-2844590776cb","html_url":"https://github.com/ProtonMail/go-rfc5322","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProtonMail%2Fgo-rfc5322","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProtonMail%2Fgo-rfc5322/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProtonMail%2Fgo-rfc5322/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProtonMail%2Fgo-rfc5322/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ProtonMail","download_url":"https://codeload.github.com/ProtonMail/go
-rfc5322/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248294181,"owners_count":21079799,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T09:14:22.333Z","updated_at":"2025-04-10T20:43:56.883Z","avatar_url":"https://github.com/ProtonMail.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Outline\nThe `rfc5322` package implements a parser for `address-list` and `date-time` strings, as defined in RFC5322.\nIt also supports encoded words (RFC2047) and has international tokens (RFC6532).\n\n# Generated code\nThe lexer and parser are generated using ANTLR4.\nThe grammar is defined in the g4 files:\n- RFC5322Parser.g4 defines the parser grammar,\n- RFC5322Lexer.g4 defines the lexer grammar.\n\nThese grammars are derived from the ABNF grammar provided in the RFCs mentioned above, \nalbeit with some relaxations added to support \"nonstandard\" (and in some cases, bad) input.\n\nRunning `go generate` generates a parser which recognises strings conforming to the grammar:\n- parser/rfc5322_lexer.go\n- parser/rfc5322parser_base_listener.go\n- parser/rfc5322_parser.go\n- parser/rfc5322parser_listener.go\n\nThe generated parser can then be used to convert a valid address/date into an abstract syntax tree.\n\n# Parsing\nOnce we have an abstract syntax tree, we must turn it into something usable, namely a `mail.Address` or `time.Time`.\n\nThe generated code in the `parser` directory implements a walker.\nThis walker walks over the abstract syntax tree, \ncalling 
a callback when entering and another when exiting each node.\nBy default, the callbacks are no-ops, unless they are overridden.\n\n## `walker.go`\nThe `walker` type extends the base walker, overriding the default no-op callbacks\nto do something specific when entering and exiting certain nodes. \n\nThe goal of the walker is to traverse the syntax tree, picking out relevant information from each node's text.\nFor example, when parsing a `mailbox` node, the relevant information to pick out from the parse tree is the\nname and address of the mailbox. This information can appear in a number of different ways, e.g. it might be\nRFC2047 word-encoded, it might be a string with escaped chars that need to be handled, it might have comments\nthat should be ignored, and so on.\n\nSo while walking the syntax tree, each node needs to ask its children what their \"value\" is.\nThe `mailbox` needs to ask its child nodes (either a `nameAddr` node or an `addrSpec` node)\nwhat the name and address are.\nIf the child node is a `nameAddr`, it needs to ask its `displayName` child what the name is\nand the `angleAddr` what the address is; these in turn ask `word` nodes, `addrSpec` nodes, etc.\n\nEach child node is responsible for telling its parent what its own value is.\nThe parent is responsible for assembling the children into something useful.\n\nIdeally, this would be done with the visitor pattern. But unfortunately, the generated parser only\nprovides a walker interface. 
So we need to make use of a stack, pushing on nodes when we enter them\nand popping off nodes when we exit them, to turn the walker into a kind of visitor.\n\n## `parser.go`\nThis file implements two methods, \n`ParseAddressList(string) ([]*mail.Address, error)` \nand\n`ParseDateTime(string) (time.Time, error)`.\n\nThese methods set up a parser from the raw input, start the walker, and convert the walker result\ninto an object of the correct type.\n\n\n# Example: Parsing `dateTime`\nParsing a date-time is rather simple. The implementation begins in `date_time.go`. The abridged code is below:\n\n```\ntype dateTime struct {\n\tyear   int\n\t...\n}\n\nfunc (dt *dateTime) withYear(year *year) {\n\tdt.year = year.value\n}\n\n...\n\nfunc (w *walker) EnterDateTime(ctx *parser.DateTimeContext) {\n\tw.enter(\u0026dateTime{\n\t\tloc: time.UTC,\n\t})\n}\n\nfunc (w *walker) ExitDateTime(ctx *parser.DateTimeContext) {\n\tdt := w.exit().(*dateTime)\n\tw.res = time.Date(dt.year, ...)\n}\n```\n\nAs you can see, when the walker reaches a `dateTime` node, it pushes a `dateTime` object onto the stack:\n```\nw.enter(\u0026dateTime{\n\tloc: time.UTC,\n})\n```\n\nand when it leaves a `dateTime` node, it pops it off the stack, \nconverting it from `interface{}` to the concrete type,\nand uses the parsed `dateTime` values like day, month, year etc \nto construct a go `time.Time` object to set the walker result:\n```\ndt := w.exit().(*dateTime)\nw.res = time.Date(dt.year, ...)\n```\n\nThese parsed values were discovered while the walker continued to walk across the date-time node.\n\nLet's see how the walker discovers the `year`.\nHere is the abridged code of what happens when the walker enters a `year` node:\n```\ntype year struct {\n\tvalue int\n}\n\nfunc (w *walker) EnterYear(ctx *parser.YearContext) {\n\tvar text string\n\n\tfor _, digit := range ctx.AllDigit() {\n\t\ttext += digit.GetText()\n\t}\n\n\tval, err := strconv.Atoi(text)\n\tif err != nil {\n\t\tw.err = 
err\n\t}\n\n\tw.enter(\u0026year{\n\t\tvalue: val,\n\t})\n}\n```\n\nWhen entering the `year` node, it collects all the raw digits, which are strings, then\nconverts them to an integer, and sets that as the year's integer value while pushing it onto the stack.\n\nWhen exiting, it pops the year off the stack and gives itself to the parent (now on the top of the stack).\nIt doesn't know what type of object the parent is; it just checks whether the parent\nis expecting a `year` node:\n```\nfunc (w *walker) ExitYear(ctx *parser.YearContext) {\n\ttype withYear interface {\n\t\twithYear(*year)\n\t}\n\n\tres := w.exit().(*year)\n\n\tif parent, ok := w.parent().(withYear); ok {\n\t\tparent.withYear(res)\n\t}\n}\n```\n\nIn our case, the `dateTime` is expecting a `year` node because it implements `withYear`,\n```\nfunc (dt *dateTime) withYear(year *year) {\n\tdt.year = year.value\n}\n```\nand that is how the `dateTime` data members are collected.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprotonmail%2Fgo-rfc5322","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprotonmail%2Fgo-rfc5322","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprotonmail%2Fgo-rfc5322/lists"}