{"id":13367141,"url":"https://github.com/andrewstuart/Goq","last_synced_at":"2025-03-12T18:32:00.398Z","repository":{"id":57480229,"uuid":"82510193","full_name":"andrewstuart/goq","owner":"andrewstuart","description":"A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library","archived":false,"fork":false,"pushed_at":"2021-09-02T04:20:26.000Z","size":101,"stargazers_count":259,"open_issues_count":1,"forks_count":20,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-10-25T05:25:09.870Z","etag":null,"topics":["decoder","golang","goquery","html","html-unmarshaling","scrape","selector","selectors","struct","unmarshaling","unmarshall","unmarshaller"],"latest_commit_sha":null,"homepage":"https://godoc.org/astuart.co/goq","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andrewstuart.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-20T02:54:40.000Z","updated_at":"2024-10-19T05:15:52.000Z","dependencies_parsed_at":"2022-09-18T05:49:02.826Z","dependency_job_id":null,"html_url":"https://github.com/andrewstuart/goq","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewstuart%2Fgoq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewstuart%2Fgoq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewstuart%2Fgoq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewstuart%2Fgoq/manifests","owner_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/owners/andrewstuart","download_url":"https://codeload.github.com/andrewstuart/goq/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243271476,"owners_count":20264463,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decoder","golang","goquery","html","html-unmarshaling","scrape","selector","selectors","struct","unmarshaling","unmarshall","unmarshaller"],"created_at":"2024-07-30T00:01:39.564Z","updated_at":"2025-03-12T18:32:00.116Z","avatar_url":"https://github.com/andrewstuart.png","language":"Go","readme":"# goq\n[![Build Status](https://travis-ci.org/andrewstuart/goq.svg?branch=master)](https://travis-ci.org/andrewstuart/goq)\n[![GoDoc](https://godoc.org/astuart.co/goq?status.svg)](https://godoc.org/astuart.co/goq)\n[![Coverage Status](https://coveralls.io/repos/github/andrewstuart/goq/badge.svg?branch=master)](https://coveralls.io/github/andrewstuart/goq?branch=master)\n[![Go Report Card](https://goreportcard.com/badge/astuart.co/goq)](https://goreportcard.com/report/astuart.co/goq)\n\n## Example\n\n```go\nimport (\n\t\"log\"\n\t\"net/http\"\n\n\t\"astuart.co/goq\"\n)\n\n// Structured representation for github file name table\ntype example struct {\n\tTitle string `goquery:\"h1\"`\n\tFiles []string `goquery:\"table.files tbody tr.js-navigation-item td.content,text\"`\n}\n\nfunc main() {\n\tres, err := http.Get(\"https://github.com/andrewstuart/goq\")\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\tdefer res.Body.Close()\n\n\tvar ex example\n\t\n\terr = 
goq.NewDecoder(res.Body).Decode(\u0026ex)\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\tlog.Println(ex.Title, ex.Files)\n}\n```\n\n## Details\n\n# goq\n--\n    import \"astuart.co/goq\"\n\nPackage goq was built to allow users to declaratively unmarshal HTML into Go\nstructs using struct tags composed of CSS selectors.\n\nI've made a best effort to behave very similarly to JSON and XML decoding, and\nto expose as much information as possible in the event of an error to help you\ndebug your unmarshaling issues.\n\nWhen creating struct types to be unmarshaled into, the following general rules\napply:\n\n- Any type that implements the Unmarshaler interface will be passed a slice of\n*html.Node so that manual unmarshaling may be done. This takes the highest\nprecedence.\n\n- Any struct fields may be annotated with goquery metadata, which takes the form\nof an element selector followed by arbitrary comma-separated \"value selectors.\"\n\n- A value selector may be one of `html`, `text`, or `[someAttrName]`. `html` and\n`text` will result in the methods of the same name being called on the\n`*goquery.Selection` to obtain the value. `[someAttrName]` will result in\n`*goquery.Selection.Attr(\"someAttrName\")` being called for the value.\n\n- A primitive value type will default to the text value of the resulting nodes\nif no value selector is given.\n\n- At least one value selector is required for maps, to determine the map key.\nThe key type must follow both the rules applicable to Go map indexing, as well\nas these unmarshaling rules. The value of each key will be unmarshaled in the\nsame way the element value is unmarshaled.\n\n- For maps, keys will be retrieved from the *same level* of the DOM. The key\nselector may be arbitrarily nested, though. 
The first level of children with any\nnumber of matching elements will be used.\n\n- For maps, any values *must* be nested *below* the level of the key selector.\nParents or siblings of the element matched by the key selector will not be\nconsidered.\n\n- Once used, a \"value selector\" will be shifted off of the comma-separated list.\nThis allows you to nest arbitrary levels of value selectors. For example, the\ntype `[]map[string][]string` would require one selector for the map key, and\ntake an optional second selector for the values of the string slice.\n\n- Any struct type encountered in nested types (e.g. map[string]SomeStruct) will\noverride any remaining \"value selectors\" that had not been used. For example,\ngiven:\n\n    type S struct {\n    \tF string `goquery:\",[bang]\"`\n    }\n\n    struct {\n    \tT map[string]S `goquery:\"#someId,[foo],[bar],[baz]\"`\n    }\n\n`[foo]` will be used to determine the string map key, but `[bar]` and `[baz]`\nwill be ignored, and the `[bang]` tag on the S struct type takes precedence.\n\n## Usage\n\n#### func  NodeSelector\n\n```go\nfunc NodeSelector(nodes []*html.Node) *goquery.Selection\n```\nNodeSelector is a quick utility function to get a goquery.Selection from a slice\nof *html.Node. Useful for performing unmarshaling, since the decision was made\nto use []*html.Node for maximum flexibility.\n\n#### func  Unmarshal\n\n```go\nfunc Unmarshal(bs []byte, v interface{}) error\n```\nUnmarshal takes a byte slice and a destination pointer to any interface{}, and\nunmarshals the document into the destination based on the rules above. 
Any error\nreturned here will likely be of type CannotUnmarshalError, though an initial\ngoquery error will pass through directly.\n\n#### func  UnmarshalSelection\n\n```go\nfunc UnmarshalSelection(s *goquery.Selection, iface interface{}) error\n```\nUnmarshalSelection will unmarshal a goquery.Selection into an interface\nappropriately annotated with goquery tags.\n\n#### type CannotUnmarshalError\n\n```go\ntype CannotUnmarshalError struct {\n\tErr      error\n\tVal      string\n\tFldOrIdx interface{}\n}\n```\n\nCannotUnmarshalError represents an error returned by the goquery Unmarshaler and\nhelps consumers programmatically diagnose the cause of their error.\n\n#### func (*CannotUnmarshalError) Error\n\n```go\nfunc (e *CannotUnmarshalError) Error() string\n```\n\n#### type Decoder\n\n```go\ntype Decoder struct {\n}\n```\n\nDecoder implements the same API you will see in encoding/xml and encoding/json\nexcept that we do not currently support proper streaming decoding as it is not\nsupported by goquery upstream.\n\n#### func  NewDecoder\n\n```go\nfunc NewDecoder(r io.Reader) *Decoder\n```\nNewDecoder returns a new decoder given an io.Reader.\n\n#### func (*Decoder) Decode\n\n```go\nfunc (d *Decoder) Decode(dest interface{}) error\n```\nDecode will unmarshal the contents of the decoder when given an instance of an\nannotated type as its argument. 
It will return any errors encountered during\neither parsing the document or unmarshaling into the given object.\n\n#### type Unmarshaler\n\n```go\ntype Unmarshaler interface {\n\tUnmarshalHTML([]*html.Node) error\n}\n```\n\nUnmarshaler allows for custom implementations of unmarshaling logic\n\n## TODO\n\n- Callable goquery methods with args, via reflection\n","funding_links":[],"categories":["文本处理","文本處理"],"sub_categories":["高级控制台界面","高級控制台界面"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewstuart%2FGoq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrewstuart%2FGoq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewstuart%2FGoq/lists"}