{"id":14036081,"url":"https://github.com/edsu/metaweb","last_synced_at":"2025-05-08T00:06:27.900Z","repository":{"id":26857020,"uuid":"111224817","full_name":"edsu/metaweb","owner":"edsu","description":"get metadata for a web page","archived":false,"fork":false,"pushed_at":"2023-07-19T09:12:53.000Z","size":185,"stargazers_count":8,"open_issues_count":3,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-04-26T10:05:24.953Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/edsu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-11-18T17:27:20.000Z","updated_at":"2024-01-03T18:49:43.000Z","dependencies_parsed_at":"2022-07-25T16:48:15.030Z","dependency_job_id":"aeb01a29-c8a2-4b8f-8f0d-f23d66a87aa6","html_url":"https://github.com/edsu/metaweb","commit_stats":{"total_commits":23,"total_committers":1,"mean_commits":23.0,"dds":0.0,"last_synced_commit":"1495706bbfbfdfba3510949236006d10a66badbe"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmetaweb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmetaweb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmetaweb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edsu%2Fmetaweb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/edsu","download_url":"https://codeload.github.com/edsu/metaweb/tar.gz/refs/h
eads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252133336,"owners_count":21699537,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-12T03:01:49.776Z","updated_at":"2025-05-08T00:06:27.880Z","avatar_url":"https://github.com/edsu.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# metaweb\n\n[![build-status](https://travis-ci.org/edsu/metaweb.svg?branch=master)](https://travis-ci.org/edsu/metaweb)\n\n*metaweb* will extract metadata for a web page. Only metadata for the webpage\nitself is extracted, not metadata for items within the page. *metaweb* attempts\nto extract common metadata from standard HTML, Twitter Cards and Facebook's\n[Open Graph Protocol](http://opengraphprotocol.org/). It is not meant to be\nperfect, or adhere to any particular overarching standard, but just to scratch a\nparticular itch I had at the time. 
If you've got your own itch to scratch, please\nadd an [issue](https://github.com/edsu/metaweb/issues).\n\nThe name metaweb pays homage to one of the more forward-looking startups of the\n[same name](https://en.wikipedia.org/wiki/Metaweb), which created one of the first\ncommunity-driven entity databases on the web.\n\n## Install\n\n    npm install metaweb\n\n## Command Line\n\nWhen you install *metaweb* you will get a command-line program:\n\n```\n% metaweb http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\n{\n  \"url\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\",\n  \"canonical\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\",\n  \"status\": 200,\n  \"content_type\": \"text/html\",\n  \"title\": \"NSA slides explain the PRISM data-collection program - The Washington Post\",\n  \"description\": \"Through a Top-Secret program authorized by federal judges working under the Foreign Intelligence Surveillance Act (FISA), the U.S. intelligence community can gain access to the servers of nine internet companies for a wide range of digital data. Documents describing the previously undisclosed program, obtained by The Washington Post, show the breadth of U.S. 
electronic surveillance capabilities.\",\n  \"image\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/images/upstream-promo-296.jpg\"\n}\n```\n\nUse the `--includeRaw` parameter to include all the raw `meta` and `link` \ncontent.\n\n```\n% metaweb http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/ --includeRaw\n{\n  \"url\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\",\n  \"canonical\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\",\n  \"status\": 200,\n  \"content_type\": \"text/html\",\n  \"title\": \"NSA slides explain the PRISM data-collection program - The Washington Post\",\n  \"description\": \"Through a Top-Secret program authorized by federal judges working under the Foreign Intelligence Surveillance Act (FISA), the U.S. intelligence community can gain access to the servers of nine internet companies for a wide range of digital data. Documents describing the previously undisclosed program, obtained by The Washington Post, show the breadth of U.S. 
electronic surveillance capabilities.\",\n  \"image\": \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/images/upstream-promo-296.jpg\",\n  \"raw\": {\n    \"link\": {\n      \"canonical\": [\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\"\n      ],\n      \"shorturl\": [\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\"\n      ],\n      \"stylesheet\": [\n        \"http://css.wpdigital.net/wpost/css/combo?context=eidos\u0026c=true\u0026m=true\u0026r=/2.0.0/reset.css\u0026r=/2.0.0/structure.css\u0026r=/2.0.0/header.css\u0026r=/2.0.0/footer.css\u0026r=/2.0.0/right-rail.css\u0026r=/2.0.0/rules.css\u0026r=/2.0.0/forms.css\u0026r=/2.0.0/base.css\u0026r=/2.0.0/flipper.css\u0026r=/2.0.0/modules.css\u0026r=/2.0.0/wsodEWA.css\u0026r=/2.0.0/ads.css\u0026r=/2.0.0/fonts/font_FranklinITCProBold.css\",\n        \"http://css.wpdigital.net/wp-srv/graphics/css/pretty-comments.css\",\n        \"http://css.wpdigital.net/wp-srv/graphics/css/staticbase-2.0.css\",\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/css/prism.css\"\n      ]\n    },\n    \"meta\": {\n      \"twitter:title\": [\n        \"NSA slides explain the PRISM data-collection program\"\n      ],\n      \"description\": [\n        \"Through a Top-Secret program authorized by federal judges working under the Foreign Intelligence Surveillance Act (FISA), the U.S. intelligence community can gain access to the servers of nine internet companies for a wide range of digital data. Documents describing the previously undisclosed program, obtained by The Washington Post, show the breadth of U.S. electronic surveillance capabilities.\"\n      ],\n      \"twitter:description\": [\n        \"Through a Top-Secret program authorized by federal judges working under the Foreign Intelligence Surveillance Act (FISA), the U.S. 
intelligence community can gain access to the servers of nine internet companies for a wide range of digital data. Documents describing the previously undisclosed program, obtained by The Washington Post, show the breadth of U.S. electronic surveillance capabilities.\"\n      ],\n      \"keywords\": [\n        \"nsa, security, privacy, government data collection, nsa data collection, nsa prism program, prism data collection, prism program\"\n      ],\n      \"twitter:url\": [\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/\"\n      ],\n      \"og:image\": [\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/images/upstream-promo-296.jpg\"\n      ],\n      \"twitter:image\": [\n        \"http://www.washingtonpost.com/wp-srv/special/politics/prism-collection-documents/images/upstream-promo-296.jpg\"\n      ],\n      \"twitter:site\": [\n        \"@postgraphics\"\n      ],\n      \"twitter:card\": [\n        \"summary\"\n      ],\n      \"fb:app_id\": [\n        \"41245586762\"\n      ],\n      \"og:site_name\": [\n        \"The Washington Post\"\n      ]\n    },\n    \"title\": \"NSA slides explain the PRISM data-collection program - The Washington Post\"\n  }\n}\n```\n\n## JavaScript\n\nYou will usually want to use *metaweb* as a library in your own\nJavaScript applications:\n\n```javascript\nconst metaweb = require('metaweb')\n\nmetaweb.get(url).then((metadata) =\u003e {\n  // do something with the metadata\n})\n```\n\nTo also get the raw `link` and `meta` content, pass `true` as the\n`includeRaw` parameter:\n\n```javascript\n// JavaScript has no keyword arguments, so includeRaw is passed positionally\nmetaweb.get(url, true)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedsu%2Fmetaweb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fedsu%2Fmetaweb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedsu%2Fmetaweb/lists"}