{"id":18351189,"url":"https://github.com/storyicon/graphquery","last_synced_at":"2025-04-06T10:32:25.508Z","repository":{"id":52676377,"uuid":"153610629","full_name":"storyicon/graphquery","owner":"storyicon","description":"GraphQuery is a query language and execution engine tied to any backend service. ","archived":false,"fork":false,"pushed_at":"2021-05-16T15:52:53.000Z","size":447,"stargazers_count":125,"open_issues_count":2,"forks_count":19,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-21T21:51:19.427Z","etag":null,"topics":["crawler","css","graph","html","jsonpath","query","regexp","sql","xml","xpath"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/storyicon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-10-18T11:08:47.000Z","updated_at":"2024-10-21T15:16:58.000Z","dependencies_parsed_at":"2022-08-20T14:30:52.355Z","dependency_job_id":null,"html_url":"https://github.com/storyicon/graphquery","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/storyicon%2Fgraphquery","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/storyicon%2Fgraphquery/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/storyicon%2Fgraphquery/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/storyicon%2Fgraphquery/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/storyicon","download_url":"https://codeload.github.com/sto
ryicon/graphquery/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247470339,"owners_count":20944146,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","css","graph","html","jsonpath","query","regexp","sql","xml","xpath"],"created_at":"2024-11-05T21:29:49.578Z","updated_at":"2025-04-06T10:32:25.042Z","avatar_url":"https://github.com/storyicon.png","language":"Go","readme":"**We are looking for contributors**! Please check the [ROADMAP](https://github.com/storyicon/graphquery/blob/master/ROADMAP.md) to see how you can help ❤️          \n\n---\n\n❤️ This is a brand new project, let's give it some star and watch to see how it will develop, the open source community is driven by you !\n\n# GraphQuery [![CircleCI](https://circleci.com/gh/storyicon/graphquery/tree/master.svg?style=svg)](https://circleci.com/gh/storyicon/graphquery/tree/master) [![Go Report Card](https://goreportcard.com/badge/github.com/storyicon/graphquery)](https://goreportcard.com/report/github.com/storyicon/graphquery)  [![Build Status](https://travis-ci.org/storyicon/graphquery.svg?branch=master)](https://travis-ci.org/storyicon/graphquery)  [![Coverage Status](https://coveralls.io/repos/github/storyicon/graphquery/badge.svg)](https://coveralls.io/github/storyicon/graphquery) [![GoDoc](https://godoc.org/github.com/storyicon/graphquery?status.svg)](https://godoc.org/github.com/storyicon/graphquery) [![Gitter 
chat](https://badges.gitter.im/gitterHQ/gitter.png)](https://gitter.im/storyicon/Lobby)\n![GraphQuery](https://raw.githubusercontent.com/storyicon/graphquery/master/docs/screenshot/graphquery.png)\n\n \nGraphQuery is a query language and execution engine tied to any backend service. It is `back-end language independent`.    \n\nRelated Projects:        \n* [GraphQuery-PlayGround](https://github.com/storyicon/graphquery-playground) : Learn and test GraphQuery in an interactive walkthrough       \n* [Document](https://github.com/storyicon/graphquery/wiki) : Detailed documentation of GraphQuery\n* [GraphQuery-http](https://github.com/storyicon/graphquery-http) : Cross language solution for GraphQuery\n## Catalog\n- [Overview](#overview)\n    - [Language-independent](#language-independent)\n    - [Multiple selector syntax support](#multiple-selector-syntax-support)\n    - [Complete function](#complete-function)\n    - [Clear data structure \u0026 Concise grammar](#clear-data-structure--concise-grammar)\n    - [Mature error message](#mature-error-message)\n- [Getting Started](#getting-started)\n    - [1. First example](#1-first-example)\n    - [2. Pipeline](#2-pipeline)\n- [Install](#install)\n    - [1. Golang:](#1-golang)\n    - [2. Other language](#2-other-language)\n- [Update Log](#update-log)\n    - [2018-12-20](#2018-12-20)\n\n## Overview \nGraphQuery is an easy-to-use query language with built-in `Xpath/CSS/Regex/JSONpath` selectors and a set of built-in `text processing functions`.    \nBest of all, you can use the minimalist GraphQuery syntax to get `any data structure you want`.         \n\n### Language-independent\nGraphQuery lets you unify text parsing logic across any backend language.    
\nYou won't need to hunt for implementations of Xpath/CSS/Regex/JSONpath selectors in each language, learn their syntax, or investigate their compatibility.\n \n### Multiple selector syntax support\nYou can use GraphQuery to parse any text with the selector you know best. GraphQuery currently supports the following selectors:\n1. `Jsonpath` for parsing JSON strings\n2. `Xpath` and `CSS` for parsing XML/HTML\n3. `Regular expressions` for parsing any text.    \n\nYou can use these selectors in any combination in GraphQuery. The rich set of built-in selectors gives you great flexibility when parsing.\n\n### Complete function\nGraphQuery has some built-in text processing functions like `trim`, `template`, `replace`. If these functions don't meet your needs, you can register new custom functions in the pipeline.\n\n\n### Clear data structure \u0026 Concise grammar\nWith GraphQuery, you don't need to look for parsing libraries or write complex nesting and traversal code. Its simple syntax gives you a clear picture of the data structure.      \n\n![compare](https://raw.githubusercontent.com/storyicon/graphquery/master/docs/screenshot/compare.png) \n           \nAs you can see from the comparison chart above, the syntax of GraphQuery is simple enough that you can understand it and get started quickly even if you are seeing it for the first time.            \n\n\n### Mature error message          \n\nWhether it is a syntax error or a function call error, GraphQuery returns a detailed error message to help you debug.          \n\n![compare](https://raw.githubusercontent.com/storyicon/graphquery/master/docs/screenshot/error.gif)       \n\nThe above example is executed in [GraphQuery-PlayGround](https://github.com/storyicon/graphquery-playground). When an error occurs, GraphQuery returns detailed error information to help developers locate the problem.        
\n\nGraphQuery is also easy to integrate into your backend data system, whatever backend language you use. Read on to see how.                  \n\n## Getting Started \nGraphQuery consists of a query language and pipelines. To guide you through each of these components, we've written an example designed to illustrate the various pieces of GraphQuery. This example is not comprehensive, but it is designed to quickly introduce the core concepts of GraphQuery. The premise of the example is that we want to use GraphQuery to query for information about library books.          \n\n\n### 1. First example\n\n```html\n\u003clibrary\u003e\n\u003c!-- Great book. --\u003e\n\u003cbook id=\"b0836217462\" available=\"true\"\u003e\n    \u003cisbn\u003e0836217462\u003c/isbn\u003e\n    \u003ctitle lang=\"en\"\u003eBeing a Dog Is a Full-Time Job\u003c/title\u003e\n    \u003cquote\u003eI'd dog paddle the deepest ocean.\u003c/quote\u003e\n    \u003cauthor id=\"CMS\"\u003e\n        \u003c?echo \"go rocks\"?\u003e\n        \u003cname\u003eCharles M Schulz\u003c/name\u003e\n        \u003cborn\u003e1922-11-26\u003c/born\u003e\n        \u003cdead\u003e2000-02-12\u003c/dead\u003e\n    \u003c/author\u003e\n    \u003ccharacter id=\"PP\"\u003e\n        \u003cname\u003ePeppermint Patty\u003c/name\u003e\n        \u003cborn\u003e1966-08-22\u003c/born\u003e\n        \u003cqualification\u003ebold, brash and tomboyish\u003c/qualification\u003e\n    \u003c/character\u003e\n    \u003ccharacter id=\"Snoopy\"\u003e\n        \u003cname\u003eSnoopy\u003c/name\u003e\n        \u003cborn\u003e1950-10-04\u003c/born\u003e\n        \u003cqualification\u003eextroverted beagle\u003c/qualification\u003e\n    \u003c/character\u003e\n\u003c/book\u003e\n\u003c/library\u003e\n```\nGiven this text, we would naturally want to extract the following data structure:\n```\n{\n    bookID\n    title\n    isbn\n    quote\n    language\n    author{\n        name\n        born\n      
  dead\n    }\n    character [{\n        name\n        born\n        qualification\n    }]\n}\n```\nOnce you know the data structure you want to extract, you are already 80% of the way there. For now, we call the structure above the DDL (Data Definition Language). Let's see how GraphQuery does it:\n```\n{\n    bookID `css(\"book\");attr(\"id\")`\n    title `css(\"title\")`\n    isbn `xpath(\"//isbn\")`\n    quote `css(\"quote\")`\n    language `css(\"title\");attr(\"lang\")`\n    author `css(\"author\")` {\n        name `css(\"name\")`\n        born `css(\"born\")`\n        dead `css(\"dead\")`\n    }\n    character `xpath(\"//character\")` [{\n        name `css(\"name\")`\n        born `css(\"born\")`\n        qualification `xpath(\"qualification\")`\n    }]\n}\n```\nAs you can see, the syntax of GraphQuery adds some strings wrapped in \u003cb\u003e\\`\u003c/b\u003e to the DDL. Each string wrapped in \u003cb\u003e\\`\u003c/b\u003e is called a `Pipeline`. We will introduce Pipeline later.\nFirst, let's take a look at the data the GraphQuery engine returns:\n```json\n{\n    \"bookID\": \"b0836217462\",\n    \"title\": \"Being a Dog Is a Full-Time Job\",\n    \"isbn\": \"0836217462\",\n    \"quote\": \"I'd dog paddle the deepest ocean.\",\n    \"language\": \"en\",\n    \"author\": {\n        \"born\": \"1922-11-26\",\n        \"dead\": \"2000-02-12\",\n        \"name\": \"Charles M Schulz\"\n    },\n    \"character\": [\n        {\n            \"born\": \"1966-08-22\",\n            \"name\": \"Peppermint Patty\",\n            \"qualification\": \"bold, brash and tomboyish\"\n        },\n        {\n            \"born\": \"1950-10-04\",\n            \"name\": \"Snoopy\",\n            \"qualification\": \"extroverted beagle\"\n        }\n    ]\n}\n```\nThis is exactly what we wanted.    \nWe will call the above example Example1. Now let's take a brief look at what a pipeline is.\n\n### 2. 
Pipeline\nA pipeline is a sequence of functions. It takes the parent element's text as its input and executes the functions in order, passing the output of each function to the next.\nFor example, the language field in our previous example is defined as follows:\n```graphquery\nlanguage `css(\"title\");attr(\"lang\")`\n```\nHere `language` is the field name and `css(\"title\");attr(\"lang\")` is the pipeline. In this pipeline, GraphQuery first uses the CSS selector to find the `title` node in the document, then passes that node into the `attr()` function to get its `lang` attribute. The whole process is as follows:\n\n![language: document-\u003ecss(\"title\")-\u003eattr(\"lang\")-\u003een](https://raw.githubusercontent.com/storyicon/graphquery/master/docs/screenshot/pipeline-getlang.png) \n\nIn Example1, we used not only the `css` and `attr` functions but also `xpath()`. As you might guess, `xpath()` selects elements with an Xpath selector.\nThe following is a list of the pipeline functions built into the current version of GraphQuery:    \n\n| pipeline | prototype | example | description |\n| ------ | ------ | ------ | ----- |\n| css | css(CSSSelector) | css(\"title\") | Use CSS selector to select elements | \n| json | json(JSONSelector) | json(\"title\") | Use json path to select elements | \n| xpath | xpath(XpathSelector) |  xpath(\"//title\") |Use Xpath selector to select elements |\n| regex | regex(RegexSelector) | regex(\"\u003ctitle\u003e(.*?)\u003c/title\u003e\") | Use Regex selector to select elements |\n| trim | trim() | trim() | Remove spaces and line breaks at the start and end of the string|\n| template | template(TemplateStr) | template(\"[{$}]\") | Add characters before and after variables|\n| attr | attr(AttributeName) | attr(\"lang\") | Extract an attribute of the current node|\n| eq | eq(Index) | eq(\"0\") | Take the nth element in the current node collection|\n| string | string() | string() | Extract the current node native string|\n| text | 
text() | text() | Extract the text of the current node|\n| link | link(KeyName) | link(\"title\") | Returns the current text of the specified key|\n| replace | replace(A, B) | replace(\"a\", \"b\") | Replace all occurrences of A in the current node with B|\n| absolute | absolute(A) | absolute(\"https://google.com\") | Take A as the reference URL and convert the current text into an absolute URL | \n\nFor a more detailed introduction to pipelines and functions, please see the docs.\n\n## Install \nGraphQuery is currently implemented natively only in Golang, but other languages can invoke it as a service.     \n\n### 1. Golang:\n```\ngo get -u github.com/storyicon/graphquery\n```\nCreate a new Go file:\n```golang\npackage main\n\nimport (\n\t\"encoding/json\"\n\t\"log\"\n\n\t\"github.com/storyicon/graphquery\"\n)\n\nfunc main() {\n\t// The document to parse.\n\tdocument := `\n        \u003chtml\u003e\n            \u003cbody\u003e\n                \u003ca href=\"01.html\"\u003ePage 1\u003c/a\u003e\n                \u003ca href=\"02.html\"\u003ePage 2\u003c/a\u003e\n                \u003ca href=\"03.html\"\u003ePage 3\u003c/a\u003e\n            \u003c/body\u003e\n        \u003c/html\u003e\n    `\n\t// Select every anchor element and collect its text into the anchor array.\n\texpr := \"{ anchor `css(\\\"a\\\")` [ content `text()` ] }\"\n\tresponse := graphquery.ParseFromString(document, expr)\n\n\t// Serialize the parsed data to JSON.\n\tdata, err := json.Marshal(response.Data)\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\tlog.Println(string(data))\n}\n```\nRun the Go file. The output (after the log timestamp) is as follows:\n```\n{\"anchor\":[\"Page 1\",\"Page 2\",\"Page 3\"]}\n```\n\n### 2. Other language\nWe provide a cross-language solution over HTTP: after starting the service, you can query GraphQuery from any backend language by sending requests to the specified port.      \n\n\u003e [GraphQuery-http](https://github.com/storyicon/graphquery-http) : Cross language solution for GraphQuery        \n\nYou can also use RPC for communication, but for now you may need to implement this yourself, because the RPC project for GraphQuery is still under development.           
\nWe also welcome contributors who want to write native GraphQuery support for other languages.      \n\n\n## Update Log\n\n#####  2018-12-20\n\n1. `graphquery.Response` is now an alias of `kernel.Response`           \n2. Changed the type of `Response.Errors` from `[]string` to `kernel.Errors`; `kernel.Errors` implements some common interfaces for `error` and `json`      \n3. Added three methods to `kernel.Response`:       \n    - `MarshalData() (string, error)`: outputs `Response.Data` as a JSON string\n    - `JSON() string`: outputs `Response` as a JSON string; it is now equivalent to `String()`\n    - `Decode(obj interface{})`: maps parsing results to a given data structure at a lower cost. You can use it like this:\n\n```go\npackage main\n\nimport \"github.com/storyicon/graphquery\"\n\n// Anchor mirrors one item of the parsed result.\ntype Anchor struct {\n\tTitle string `json:\"title\"`\n\tURL   string `json:\"url\"`\n}\n\nfunc main() {\n\tdocument := `\n        \u003ca href=\"1.html\"\u003eanchor 1\u003c/a\u003e\n        \u003ca href=\"2.html\"\u003eanchor 2\u003c/a\u003e\n        \u003ca href=\"3.html\"\u003eanchor 3\u003c/a\u003e\n    `\n\tquery := \"a `css(\\\"a\\\")` [{ title `text();trim()` url  `attr(\\\"href\\\")` }]\"\n\tresponse := graphquery.ParseFromString(document, query)\n\n\tanchors := []*Anchor{}\n\tresponse.Decode(\u0026anchors)\n\t// anchors now holds the parsing results as []*Anchor\n}\n```\n\n4. If you have any questions while using GraphQuery, please feel free to open an issue :)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstoryicon%2Fgraphquery","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstoryicon%2Fgraphquery","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstoryicon%2Fgraphquery/lists"}