{"id":19652596,"url":"https://github.com/wangchunsen/parser","last_synced_at":"2025-10-31T12:02:37.588Z","repository":{"id":165623523,"uuid":"172004592","full_name":"wangchunsen/parser","owner":"wangchunsen","description":"A pure function parser library for general purpose","archived":false,"fork":false,"pushed_at":"2020-05-08T08:39:48.000Z","size":36,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-10T00:14:50.934Z","etag":null,"topics":["parser","scala"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wangchunsen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-22T06:10:16.000Z","updated_at":"2020-05-08T08:39:50.000Z","dependencies_parsed_at":"2023-07-30T04:16:32.809Z","dependency_job_id":null,"html_url":"https://github.com/wangchunsen/parser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangchunsen%2Fparser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangchunsen%2Fparser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangchunsen%2Fparser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangchunsen%2Fparser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wangchunsen","download_url":"https://codeload.github.com/wangchunsen/parser/tar.g
z/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240961058,"owners_count":19885249,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser","scala"],"created_at":"2024-11-11T15:11:24.644Z","updated_at":"2025-10-31T12:02:37.547Z","avatar_url":"https://github.com/wangchunsen.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# parser\nA purely functional, general-purpose parser library\n\nHere is an example of parsing HTML using this library:\n\n```scala\n  type AttrValue = (String, Option[String])\n  \n  val voidElements = Array(\"area\", \"base\", \"br\", \"col\", \"embed\",\n    \"hr\", \"img\", \"input\", \"link\", \"meta\", \"param\", \"source\", \"track\", \"wbr\")\n\n  val textElement = Array(\"script\", \"style\", \"textarea\", \"title\")\n\n  val spaceChars = Array(' ', '\\t', '\\n', '\\r', '\\f')\n\n  def isWhiteSpace(char: Char) = spaceChars.contains(char)\n\n  def tagName: Parser[String] =\n    p(charsWhileIn(\"a-z0-9A-Z_\").cap)\n\n  def attributeName: Parser[String] = p {\n    val illegalChars = spaceChars ++ Array('=', '/', '\u003e', '\"', '\\'')\n    charsWhile(char =\u003e !illegalChars.contains(char)).min(1).cap\n  }\n\n  def maybeSpace: PUnit = charsWhileIn(spaceChars)\n\n  def mustSpace: PUnit = charsWhileIn(spaceChars) min 1\n\n  def attribute: Parser[AttrValue] =\n    mustSpace ~ attributeName ~ (maybeSpace ~\u003e \"=\" ~\u003e maybeSpace ~\u003e attrValue).opt\n\n  def attributes: Parser[Seq[AttrValue]] = attribute.rep\n\n  def attrValue: Parser[String] = p 
{\n    def quotedValue(quote: Char): Parser[String] = p(charsWhile(c =\u003e c != quote).cap \u003c~ quote)\n\n    def noQuote: Parser[String] = p {\n      val illegalChars = spaceChars ++ Array('\\'', '\"', '\u003e', '\u003c', '=', '`')\n      charsWhile(c =\u003e !illegalChars.contains(c)).cap\n    }\n\n    (\"\\\"\" | \"'\").cap.opt flatMap { quote =\u003e\n      quote\n        .map { q =\u003e quotedValue(q.charAt(0)) }\n        .getOrElse(noQuote)\n    }\n  }\n\n  def text: Parser[Text] = charsWhile(c =\u003e c != '\u003c').cap map Text\n\n  def comment: Parser[Comment] =\n    p(allBetween(\"\u003c!--\", \"--\u003e\") map Comment)\n\n  def node: Parser[Node] = comment | element | text\n\n  def closeType: Parser[Boolean] =\n    maybeSpace ~\u003e (\"/\u003e\" ~\u003e pass(true) | (\"\u003e\" ~\u003e pass(false))).!!\n\n  def allBetween(start: String, end: String): Parser[String] = {\n    val content: Parser[String] = matchAll.cap ~: charsUntil(end)\n    start ~\u003e content \u003c~ end\n  }\n\n  def scriptElement: Parser[Node] =\n    p(allBetween(\"\u003cscript\u003e\", \"\u003c/script\u003e\") map Text)\n\n  def closeTag: Parser[String] = p(\"\u003c/\" ~\u003e tagName.cap \u003c~ maybeSpace \u003c~ \"\u003e\")\n\n  def element: Parser[Element] =\n    \"\u003c\" ~\u003e tagName ~ attributes ~ closeType flatMap (t =\u003e {\n      val (tagName, attrs, closed) = t\n\n      // Assumes Element declares a children field.\n      def element(children: Seq[Node] = Seq.empty): Element =\n        Element(tagName = tagName, attributes = ListMap(attrs: _*), children = children)\n\n      // Self-closed tags and void elements have no children to parse.\n      if (closed || voidElements.contains(tagName)) pass(element())\n      else {\n        val childrenNodes = node.rep\n        childrenNodes \u003c~ s\"\u003c/$tagName\u003e\" map { nodes =\u003e\n          element(children = nodes)\n        }\n      }\n    
})\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangchunsen%2Fparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwangchunsen%2Fparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangchunsen%2Fparser/lists"}