{"id":15632792,"url":"https://github.com/edsu/microdata","last_synced_at":"2025-10-28T03:03:10.908Z","repository":{"id":45888005,"uuid":"1863359","full_name":"edsu/microdata","owner":"edsu","description":"python library for extracting html microdata","archived":false,"fork":false,"pushed_at":"2023-05-08T08:49:06.000Z","size":61,"stargazers_count":166,"open_issues_count":10,"forks_count":37,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-29T08:07:16.508Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/edsu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2011-06-08T03:15:14.000Z","updated_at":"2024-11-22T20:47:56.000Z","dependencies_parsed_at":"2024-06-18T21:40:20.894Z","dependency_job_id":null,"html_url":"https://github.com/edsu/microdata","commit_stats":{"total_commits":98,"total_committers":17,"mean_commits":5.764705882352941,"dds":0.2857142857142857,"last_synced_commit":"999ea63d4e6072b06d1d56f4890441baa9c1947d"},"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmicrodata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmicrodata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmicrodata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmicrodata/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/edsu","download_url":"https://codeload.github.com/edsu/microdata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247312077,"owners_count":20918344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T10:45:20.840Z","updated_at":"2025-10-28T03:03:05.888Z","avatar_url":"https://github.com/edsu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"microdata\n=========\n\n[![Build Status](https://github.com/edsu/microdata/actions/workflows/ci.yml/badge.svg)](https://github.com/edsu/microdata/actions/workflows/ci.yml)\n \nmicrodata.py is a small utility library for extracting [HTML5\nMicrodata](http://dev.w3.org/html5/md/) from HTML. It depends on\n[html5lib](http://code.google.com/p/html5lib/) to do the heavy lifting of\nbuilding the DOM. 
For more about HTML5 Microdata check out Mark Pilgrim's\n[chapter](http://diveintohtml5.org/extensibility.html) on it in [Dive Into\nHTML5](http://diveintohtml5.org/).\n\nCommand Line\n------------\n\nWhen you install microdata via pip it will also install a command line utility: \n\n```\n$ microdata https://www.youtube.com/watch?v=dQw4w9WgXcQ\nhttps://www.youtube.com/watch?v=dQw4w9WgXcQ\n{\n  \"items\": [\n    {\n      \"type\": [\n        \"http://schema.org/VideoObject\"\n      ],\n      \"properties\": {\n        \"url\": [\n          \"https://www.youtube.com/watch?v=dQw4w9WgXcQ\"\n        ],\n        \"name\": [\n          \"Rick Astley - Never Gonna Give You Up (Official Music Video)\"\n        ],\n        \"description\": [\n          \"The official video for \\u00e2\\u20ac\\u0153Never Gonna Give You Up\\u00e2\\u20ac\\ufffd by Rick Astley \\u00e2\\u20ac\\u0153Never Gonna Give You Up\\u00e2\\u20ac\\ufffd was a global smash on its release in July 1987, topping the charts ...\"\n        ],\n        \"paid\": [\n          \"False\"\n        ],\n        \"channelId\": [\n          \"UCuAXFkgsw1L7xaCfnd5JJOw\"\n        ],\n        \"videoId\": [\n          \"dQw4w9WgXcQ\"\n        ],\n        \"duration\": [\n          \"PT3M33S\"\n        ],\n        \"unlisted\": [\n          \"False\"\n        ],\n        \"author\": [\n          {\n            \"type\": [\n              \"http://schema.org/Person\"\n            ],\n            \"properties\": {\n              \"url\": [\n                \"http://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw\"\n              ],\n              \"name\": [\n                \"\"\n              ]\n            }\n          }\n        ],\n        \"thumbnailUrl\": [\n          \"https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg\"\n        ],\n        \"thumbnail\": [\n          {\n            \"type\": [\n              \"http://schema.org/ImageObject\"\n            ],\n            \"properties\": {\n              \"url\": 
[\n                \"https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg\"\n              ],\n              \"width\": [\n                \"1280\"\n              ],\n              \"height\": [\n                \"720\"\n              ]\n            }\n          }\n        ],\n        \"embedUrl\": [\n          \"https://www.youtube.com/embed/dQw4w9WgXcQ\"\n        ],\n        \"playerType\": [\n          \"HTML5 Flash\"\n        ],\n        \"width\": [\n          \"1280\"\n        ],\n        \"height\": [\n          \"720\"\n        ],\n        \"isFamilyFriendly\": [\n          \"true\"\n        ],\n        \"regionsAllowed\": [\n          \"AD,AE,AF,AG,AI,AL,AM,AO,AQ,AR,AS,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BJ,BL,BM,BN,BO,BQ,BR,BS,BT,BV,BW,BY,BZ,CA,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EE,EG,EH,ER,ES,ET,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,IO,IQ,IR,IS,IT,JE,JM,JO,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MF,MG,MH,MK,ML,MM,MN,MO,MP,MQ,MR,MS,MT,MU,MV,MW,MX,MY,MZ,NA,NC,NE,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,PA,PE,PF,PG,PH,PK,PL,PM,PN,PR,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,SS,ST,SV,SX,SY,SZ,TC,TD,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TR,TT,TV,TW,TZ,UA,UG,UM,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,YE,YT,ZA,ZM,ZW\"\n        ],\n        \"interactionCount\": [\n          \"1141688870\"\n        ],\n        \"datePublished\": [\n          \"2009-10-24\"\n        ],\n        \"uploadDate\": [\n          \"2009-10-24\"\n        ],\n        \"genre\": [\n          \"Music\"\n        ]\n      }\n    }\n  ]\n}\n```\n\n\nLibrary\n-------\n\nHere's the basic usage from Python using https://raw.github.com/edsu/microdata/master/test-data/example.html as an example:\n\n```python\n\u003e\u003e\u003e import microdata\n\u003e\u003e\u003e import urllib.request\n\u003e\u003e\u003e url = 
\"https://raw.github.com/edsu/microdata/master/test-data/example.html\"\n\u003e\u003e\u003e items = microdata.get_items(urllib.request.urlopen(url))\n\u003e\u003e\u003e item = items[0]\n\u003e\u003e\u003e item.itemtype\n[http://schema.org/Person]\n\u003e\u003e\u003e item.name\nu\"Jane Doe\"\n\u003e\u003e\u003e item.colleagues\nu\"http://www.xyz.edu/students/alicejones.html\"\n\u003e\u003e\u003e item.get_all('colleagues')\n[u\"http://www.xyz.edu/students/alicejones.html\", u\"http://www.xyz.edu/students/bobsmith.html\"]\n\u003e\u003e\u003e print(item.json())\n{\n  \"type\": [\n    \"http://schema.org/Person\"\n  ],\n  \"id\": \"http://www.xyz.edu/~jane\",\n  \"properties\": {\n    \"colleagues\": [\n      \"http://www.xyz.edu/students/alicejones.html\",\n      \"http://www.xyz.edu/students/bobsmith.html\"\n    ],\n    \"name\": [\n      \"Jane Doe\"\n    ],\n    \"url\": [\n      \"http://www.janedoe.com\"\n    ],\n    \"jobTitle\": [\n      \"Professor\"\n    ],\n    \"image\": [\n      \"janedoe.jpg\"\n    ],\n    \"telephone\": [\n      \"(425) 123-4567\"\n    ],\n    \"address\": [\n      {\n        \"type\": [\n          \"http://schema.org/PostalAddress\"\n        ],\n        \"properties\": {\n          \"addressLocality\": [\n            \"Seattle\"\n          ],\n          \"addressRegion\": [\n            \"WA\"\n          ],\n          \"streetAddress\": [\n            \"\\n          20341 Whitworth Institute\\n          405 N. Whitworth\\n        \"\n          ],\n          \"postalCode\": [\n            \"98052\"\n          ]\n        }\n      }\n    ],\n    \"email\": [\n      \"mailto:jane-doe@xyz.edu\"\n    ]\n  }\n}\n```\n\nLicense\n-------\n\n* CC0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedsu%2Fmicrodata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fedsu%2Fmicrodata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedsu%2Fmicrodata/lists"}