{"id":38814356,"url":"https://github.com/aafeher/go-microdata-extract","last_synced_at":"2026-01-17T12:58:12.470Z","repository":{"id":260653312,"uuid":"873191292","full_name":"aafeher/go-microdata-extract","owner":"aafeher","description":"Go language library for extracting structured microdata from websites","archived":false,"fork":false,"pushed_at":"2025-02-02T11:14:42.000Z","size":61,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-02T12:20:26.501Z","etag":null,"topics":["go","golang","json-ld","jsonld","jsonld-extractor","microdata","microdata-extractor","opengraph","opengraphprotocol","structured-data","structured-data-extractor","xcard"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aafeher.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-15T18:43:59.000Z","updated_at":"2025-02-02T11:10:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"c0f0482a-164d-4eef-8fe9-916292ff83d2","html_url":"https://github.com/aafeher/go-microdata-extract","commit_stats":null,"previous_names":["aafeher/go-microdata-extract"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/aafeher/go-microdata-extract","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aafeher%2Fgo-microdata-extract","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aafeher%2Fgo-microdata-extract/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/aafeher%2Fgo-microdata-extract/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aafeher%2Fgo-microdata-extract/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aafeher","download_url":"https://codeload.github.com/aafeher/go-microdata-extract/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aafeher%2Fgo-microdata-extract/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28508888,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T11:50:55.898Z","status":"ssl_error","status_checked_at":"2026-01-17T11:50:55.569Z","response_time":85,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","golang","json-ld","jsonld","jsonld-extractor","microdata","microdata-extractor","opengraph","opengraphprotocol","structured-data","structured-data-extractor","xcard"],"created_at":"2026-01-17T12:58:12.372Z","updated_at":"2026-01-17T12:58:12.452Z","avatar_url":"https://github.com/aafeher.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
go-microdata-extract\n\n[![codecov](https://codecov.io/gh/aafeher/go-microdata-extract/graph/badge.svg?token=BD1QYCZESR)](https://codecov.io/gh/aafeher/go-microdata-extract)\n[![Go](https://github.com/aafeher/go-microdata-extract/actions/workflows/go.yml/badge.svg)](https://github.com/aafeher/go-microdata-extract/actions/workflows/go.yml)\n[![Go Reference](https://pkg.go.dev/badge/github.com/aafeher/go-microdata-extract.svg)](https://pkg.go.dev/github.com/aafeher/go-microdata-extract)\n[![Go Report Card](https://goreportcard.com/badge/github.com/aafeher/go-microdata-extract)](https://goreportcard.com/report/github.com/aafeher/go-microdata-extract)\n\nA Go package for extracting structured data from HTML.\n\n## Formats supported\n\nFor currently supported formats, see [Statistics](#statistics)\n\n## Statistics\n\nUsage statistics of structured data formats for websites\n\n(from https://w3techs.com/technologies/overview/structured_data, 2025-07-01)\n\n\n| Format                                                                                 | Usage | Supported |\n| -------------------------------------------------------------------------------------- |-------| :-------: |\n| None                                                                                   | 22.5% |          |\n| [OpenGraph](https://ogp.me/)                                                           | 68.7% |    ✔    |\n| [X Cards](https://developer.x.com/en/docs/x-for-websites/cards/guides/getting-started) | 53.8% |    ✔    |\n| [JSON-LD](https://www.w3.org/TR/json-ld/)                                              | 51.0% |    ✔    |\n| [RDFa](https://www.w3.org/TR/rdfa-primer/)                                             | 39.4% |     -     |\n| [Microdata](https://html.spec.whatwg.org/multipage/microdata.html)                     | 23.6% |    ✔    |\n| [Dublin Core](https://www.dublincore.org/specifications/dublin-core/dc-html/)          | 0.9%  |     -     |\n| 
[Microformats](https://microformats.org/wiki/Main_Page)                                | 0.4%  |     -     |\n\n## Installation\n\n```bash\ngo get github.com/aafeher/go-microdata-extract\n```\n\n```go\nimport \"github.com/aafeher/go-microdata-extract\"\n```\n\n## Usage\n\n### Create instance\n\nTo create a new instance with default settings, call the `New()` function.\n\n```go\ne := extract.New()\n```\n\n### Configuration defaults\n\n- syntaxes: `[]Syntax{extract.SyntaxOpenGraph, extract.SyntaxXCards, extract.SyntaxJSONLD, extract.SyntaxMicrodata}`\n- userAgent: `\"go-microdata-extract (+https://github.com/aafeher/go-microdata-extract/blob/main/README.md)\"`\n- fetchTimeout: `3` seconds\n\n### Overwrite defaults\n\n#### Syntaxes\n\nTo set the syntaxes whose results you want to retrieve after processing, use the `SetSyntaxes()` function.\n\n```go\ne := extract.New()\ne = e.SetSyntaxes([]extract.Syntax{extract.SyntaxOpenGraph, extract.SyntaxJSONLD})\n```\n... or ...\n```go\ne := extract.New().SetSyntaxes([]extract.Syntax{extract.SyntaxOpenGraph, extract.SyntaxJSONLD})\n```\n\n#### User Agent\n\nTo set the user agent, use the `SetUserAgent()` function.\n\n```go\ne := extract.New()\ne = e.SetUserAgent(\"YourUserAgent\")\n```\n... or ...\n```go\ne := extract.New().SetUserAgent(\"YourUserAgent\")\n```\n\n#### Fetch timeout\n\nTo set the fetch timeout, use the `SetFetchTimeout()` function. The timeout is specified in seconds as a **uint8** value.\n\n```go\ne := extract.New()\ne = e.SetFetchTimeout(10)\n```\n... 
or ...\n\n```go\ne := extract.New().SetFetchTimeout(10)\n```\n\n#### Chaining methods\n\nEach of these setters returns a pointer to the main object of the package, so the calls can be chained in a fluent interface style. In Go, the dot must end a line rather than start the next one:\n\n```go\ne := extract.New().\n    SetSyntaxes([]extract.Syntax{extract.SyntaxOpenGraph, extract.SyntaxJSONLD}).\n    SetUserAgent(\"YourUserAgent\").\n    SetFetchTimeout(10)\n```\n\n### Extract\n\nOnce your instance is initialized and configured, you can extract structured data using the `Extract()` function.\n\nThe `Extract()` function takes two parameters:\n\n- `url`: the URL of the webpage\n- `urlContent`: an optional string pointer to the content of the URL\n\nTo provide the content yourself, pass it as the second parameter. Otherwise, pass `nil` and the function will fetch the content on its own.\nThe `Extract()` function fetches and extracts concurrently, using goroutines and Go's `sync` package.\n\n```go\ne, err := e.Extract(\"https://github.com/aafeher/go-microdata-extract\", nil)\n```\n\nIn this example, structured data is extracted from \"https://github.com/aafeher/go-microdata-extract\". The function fetches the content itself because `nil` was passed as `urlContent`.\n\n## Examples\n\nExamples can be found in [/examples](https://github.com/aafeher/go-microdata-extract/tree/main/examples).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faafeher%2Fgo-microdata-extract","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faafeher%2Fgo-microdata-extract","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faafeher%2Fgo-microdata-extract/lists"}