{"id":13478938,"url":"https://github.com/soxoj/marple","last_synced_at":"2025-04-12T21:26:29.379Z","repository":{"id":45389891,"uuid":"428809032","full_name":"soxoj/marple","owner":"soxoj","description":"📖 Collect links to profiles by username through search engines and analyze with various plugins","archived":false,"fork":false,"pushed_at":"2024-11-24T00:55:31.000Z","size":127,"stargazers_count":260,"open_issues_count":12,"forks_count":22,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-04T01:06:58.136Z","etag":null,"topics":["namecheck","namechecker","osint","scraper","search-engine","username-checker","username-search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soxoj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"patreon":"soxoj","github":"soxoj","buy_me_a_coffee":"soxoj"}},"created_at":"2021-11-16T20:48:54.000Z","updated_at":"2025-04-03T11:45:10.000Z","dependencies_parsed_at":"2025-02-15T02:11:00.590Z","dependency_job_id":"2a02413a-2a2a-4119-9b04-83fdc7e073a9","html_url":"https://github.com/soxoj/marple","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fmarple","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fmarple/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fmarple/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/soxoj%2Fmarple/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soxoj","download_url":"https://codeload.github.com/soxoj/marple/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248633499,"owners_count":21136880,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["namecheck","namechecker","osint","scraper","search-engine","username-checker","username-search"],"created_at":"2024-07-31T16:02:06.193Z","updated_at":"2025-04-12T21:26:29.353Z","avatar_url":"https://github.com/soxoj.png","language":"Python","readme":"# Marple\n\n**Collect links to profiles by username through 10+ search engines ([see the full list below](#supported-sources)).**\n\n\u003cimg src=\"https://github.com/user-attachments/assets/b8d64e54-56fa-4805-b1ab-ff39ebead753\" height=\"300\"/\u003e\n\n*Idea by [Cyber Detective](https://cybdetective.com/)*\n\nFeatures:\n- multiple engines\n- proxy support\n- CSV file export\n- plugins\n  - pdf metadata extraction\n  - social media info [extraction](socid_extractor)\n\n## Quick Start\n\n```\n./marple.py soxoj\n```\n\n\u003cimg src=\"https://raw.githubusercontent.com/soxoj/marple/main/example.png\" height=\"300\"/\u003e\n\n### Results\n```\nhttps://t.me/soxoj\nContact @soxoj - Telegram\n\nhttps://github.com/soxoj\nsoxoj - GitHub\n\nhttps://coder.social/soxoj\nsoxoj - Coder Social\n\nhttps://gitmemory.com/soxoj\nsoxoj\n\n...\n\nPDF files\nhttps://codeby.net/attachments/v-0-0-1-social-osint-fundamentals-pdf.45770\nSocial OSINT fundamentals - 
Codeby.net\n/Creator: Google\n\n...\n\nLinks: total collected 111 / unique with username in URL 97 / reliable 38 / documents 3\n```\n\nAdvanced usage:\n```\n./marple.py soxoj --plugins metadata\n\n./marple.py smirnov --engines google baidu -v\n```\n\n## Installation\n\nAll you need is Python3. And pip. And requirements, of course.\n\n```\npip3 install -r requirements.txt\n```\n\nYou need API keys for some search engines (see requirements in [Supported sources](#supported-sources)). Export the keys as environment variables like this:\n```\nexport YANDEX_KEY=key\n```\n\n## Options\n\nYou can set the 'junk threshold' with the `-t` or `--threshold` option (default 300) to get more or less reliable results.\n\nThe junk score adds up the length of the link URL and the symbols next to the username within the URL.\n\nYou can also increase the number of results taken from search engines with the `--results-count` option (default 1000). Currently this limit applies only to Google.\n\nOther options:\n```\n  -h, --help            show this help message and exit\n  -t THRESHOLD, --threshold THRESHOLD\n                        Threshold to discard junk search results\n  --results-count RESULTS_COUNT\n                        Count of results parsed from each search engine\n  --no-url-filter       Disable filtering results by usernames in URLs\n\n  --engines {baidu,dogpile,google,bing,ask,aol,torch,yandex,naver,paginated,yahoo,startpage,duckduckgo,qwant}\n                        Engines to run (you can choose more than one)\n\n  --plugins {socid_extractor,metadata,maigret} [{socid_extractor,metadata,maigret} ...]\n                        Additional plugins to analyze links\n\n  -v, --verbose         Display junk score for each result\n  -d, --debug           Display all the results from sources and debug messages\n  -l, --list            Display only list of all the URLs\n  --proxy PROXY         Proxy string (e.g. 
https://user:pass@1.2.3.4:8080)\n  --csv CSV             Save results to the CSV file\n```\n\n## Supported sources\n\n| Name                | Method                                | Requirements      |\n| ------------------- | --------------------------------------| ----------------- |\n| [Google](http://google.com/)              | scraping                              | None, works out of the box; frequent captcha  |\n| [DuckDuckGo](https://duckduckgo.com/)     | scraping                              | None, works out of the box                    |\n| [Yandex](https://yandex.ru/)              | XML API                               | [Register and get YANDEX_USER/YANDEX_KEY tokens](https://github.com/fluquid/yandex-search)   |\n| [Naver](https://www.naver.com/)           | SerpApi                               | [Register and get SERPAPI_KEY token](https://serpapi.com/)   |\n| [Baidu](https://www.baidu.com/)           | SerpApi                               | [Register and get SERPAPI_KEY token](https://serpapi.com/)   |\n| [Aol](https://search.aol.com/)            | scraping                              | None, scrapes with pagination  |\n| [Ask](https://www.ask.com/)               | scraping                              | None, scrapes with pagination  |\n| [Bing](https://www.bing.com/)             | scraping                              | None, scrapes with pagination  |\n| [Startpage](https://www.startpage.com/)   | scraping                              | None, scrapes with pagination  |\n| [Yahoo](https://yahoo.com/)               | scraping                              | None, scrapes with pagination  |\n| [Mojeek](https://www.mojeek.com)          | scraping                              | None, scrapes with pagination  |\n| [Dogpile](https://www.dogpile.com/)       | scraping                              | None, scrapes with pagination  |\n| [Torch](http://torchdeedp3i2jigzjdmfpn5ttjhthh5wbmda2rr3jvqjg5p77c54dqd.onion)               | scraping               
               | Tor proxies (socks5://localhost:9050 by default), scrapes with pagination  |\n| [Qwant](https://www.qwant.com/)           | Qwant API                              | Check [if search available](https://www.qwant.com/) in your exit IP country, scrapes with pagination  |\n\n\n## Development \u0026 testing\n\n```sh\n$ python3 -m pytest tests\n```\n\n## TODO\n\n- [x] Proxy support\n- [ ] Engines choose through arguments\n- [ ] Exact search filter\n- [ ] Engine-specific filters\n- [ ] 'Username in title' check\n\n## Mentions and articles\n\n[Sector035 - Week in OSINT #2021-50](https://sector035.nl/articles/2021-50)\n\n[OS2INT - MARPLE: IDENTIFYING AND EXTRACTING SOCIAL MEDIA USER LINKS](https://os2int.com/toolbox/identifying-and-extracting-social-media-user-links-with-marple/)\n\n[Cyber Detective - X](https://threadreaderapp.com/thread/1532094437027102721.html)\n\n[OSINT Ambition - X post](https://twitter.com/osintambition/status/1725011306947006797)\n\n[Offensive Security Cheatsheet - Usernames](https://cheatsheet.haax.fr/open-source-intelligence-osint/human-recon/username/)\n","funding_links":["https://patreon.com/soxoj","https://github.com/sponsors/soxoj","https://buymeacoffee.com/soxoj"],"categories":["Python","search-engine","[](#table-of-contents) Table of contents"],"sub_categories":["[](#warc)Tools for working with WARC (WebARChive) files"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoxoj%2Fmarple","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoxoj%2Fmarple","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoxoj%2Fmarple/lists"}