{"id":13845891,"url":"https://github.com/Josue87/MetaFinder","last_synced_at":"2025-07-12T03:32:44.926Z","repository":{"id":48481203,"uuid":"319952333","full_name":"Josue87/MetaFinder","owner":"Josue87","description":"Search for documents in a domain through Search Engines (Google, Bing and Baidu). The objective is to extract metadata","archived":false,"fork":false,"pushed_at":"2024-01-19T23:22:13.000Z","size":55,"stargazers_count":195,"open_issues_count":4,"forks_count":32,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-11-07T04:19:41.325Z","etag":null,"topics":["crawler","metadata","osint"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Josue87.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-09T12:38:58.000Z","updated_at":"2024-10-20T21:50:43.000Z","dependencies_parsed_at":"2024-06-19T00:12:54.512Z","dependency_job_id":"d467a6b4-d498-42b9-b7c1-0716a05ec886","html_url":"https://github.com/Josue87/MetaFinder","commit_stats":{"total_commits":33,"total_committers":5,"mean_commits":6.6,"dds":0.2727272727272727,"last_synced_commit":"2caed73a8820ddcafea334b9bb3c941d4a0b52e9"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Josue87%2FMetaFinder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Josue87%2FMetaFinder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Josue87%2FMe
taFinder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Josue87%2FMetaFinder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Josue87","download_url":"https://codeload.github.com/Josue87/MetaFinder/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225791438,"owners_count":17524783,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","metadata","osint"],"created_at":"2024-08-04T17:03:39.759Z","updated_at":"2024-11-21T19:30:48.884Z","avatar_url":"https://github.com/Josue87.png","language":"Python","readme":"\u003ch1 align=\"center\"\u003e\n  \u003cb\u003eMetaFinder\u003c/b\u003e\n  \u003cbr\u003e\n\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://www.python.org/\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/python-3.6+-blue.svg?style=flat-square\u0026logo=python\"\u003e \n  \u003c/a\u003e\n   \u003ca href=\"https://www.gnu.org/licenses/gpl-3.0.en.html\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/license-GNU-green.svg?style=square\u0026logo=gnu\"\u003e\n   \u003ca href=\"https://twitter.com/JosueEncinar\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/author-@JosueEncinar-orange.svg?style=square\u0026logo=twitter\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\nSearch for documents in a domain through Search Engines. The objective is to extract metadata. 
\n\u003c/p\u003e\n\u003cbr/\u003e\n\n## Installation:\n\n```\n\u003e pip3 install metafinder\n```\n\nUpgrades are also available using:\n\n```\n\u003e pip3 install metafinder --upgrade\n```\n\n## Usage \n\nMetaFinder can be used in 2 ways:\n\n### CLI\n```\nmetafinder -d domain.com -l 20 -o folder [-t 10] -go -bi -ba\n```\n\nParameters:\n* d: Specifies the target domain.\n* l: Specifies the maximum number of results to retrieve from the search engines.\n* o: Specifies the path where the report is saved.\n* t: Optional. Sets the number of threads (4 by default).\n* v: Shows the MetaFinder version.\n* Search Engines to select (Google by default):\n  * go: Optional. Search in Google.\n  * bi: Optional. Search in Bing.\n  * ba: Optional. Search in Baidu. (Experimental)\n\n### In Code\n```\nimport metafinder.extractor as metadata_extractor\n\ndocuments_limit = 5\ndomain = \"target_domain\"\nresult = metadata_extractor.extract_metadata_from_google_search(domain, documents_limit)\n# result = metadata_extractor.extract_metadata_from_bing_search(domain, documents_limit)\n# result = metadata_extractor.extract_metadata_from_baidu_search(domain, documents_limit)\nauthors = result.get_authors()\nsoftware = result.get_software()\nfor k,v in result.get_metadata().items():\n    print(f\"{k}:\")\n    print(f\"|_ URL: {v['url']}\")\n    for metadata,value in v['metadata'].items():\n        print(f\"|__ {metadata}: {value}\")\n\ndocument_name = \"test.pdf\"\ntry:\n    metadata_file = metadata_extractor.extract_metadata_from_document(document_name)\n    for k,v in metadata_file.items():\n        print(f\"{k}: {v}\")\nexcept FileNotFoundError:\n    print(\"File not found\")\n```\n\n## Example\n\n![image](https://user-images.githubusercontent.com/16885065/118243158-69ee7600-b49e-11eb-9562-2dc1fab59d67.png)\n\n# Author\n\nThis project has been developed by:\n\n* **Josué Encinar García** -- [@JosueEncinar](https://twitter.com/JosueEncinar)\n\n\n# Contributors\n\n\n* **Félix Brezo Fernández** -- 
[@febrezo](https://twitter.com/febrezo)\n\n\n# Disclaimer!\n\nThe software is designed to leave no trace in the documents we upload to a domain. The author is not responsible for any illegitimate use.\n","funding_links":[],"categories":["[](#table-of-contents) Table of contents","Python"],"sub_categories":["[](#websites-files-metadata-analyze-and-files-downloads)Website's files metadata analyze and files downloads"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJosue87%2FMetaFinder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJosue87%2FMetaFinder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJosue87%2FMetaFinder/lists"}