{"id":13910802,"url":"https://github.com/soxoj/socid-extractor","last_synced_at":"2025-05-14T05:10:49.709Z","repository":{"id":40572570,"uuid":"222293272","full_name":"soxoj/socid-extractor","owner":"soxoj","description":"⛏️ Extract accounts info from personal pages on various sites for OSINT purpose","archived":false,"fork":false,"pushed_at":"2025-04-15T17:12:41.000Z","size":375,"stargazers_count":804,"open_issues_count":14,"forks_count":81,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-05-12T01:58:31.167Z","etag":null,"topics":["identifiers","osint","parsing","privacy","socid-extractor","socmint","uid"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soxoj.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"patreon":"soxoj","github":"soxoj","buy_me_a_coffee":"soxoj"}},"created_at":"2019-11-17T18:30:06.000Z","updated_at":"2025-05-09T15:42:59.000Z","dependencies_parsed_at":"2023-02-10T04:55:12.796Z","dependency_job_id":"6ed2bb61-b3cc-4a60-8871-c5b55c4a8eb3","html_url":"https://github.com/soxoj/socid-extractor","commit_stats":{"total_commits":210,"total_committers":5,"mean_commits":42.0,"dds":0.2761904761904762,"last_synced_commit":"076178aaafb943e662f71241c6aa02228a320d64"},"previous_names":["soxoj/socid_extractor"],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fsocid-extractor","tags_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/soxoj%2Fsocid-extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fsocid-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soxoj%2Fsocid-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soxoj","download_url":"https://codeload.github.com/soxoj/socid-extractor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254076850,"owners_count":22010611,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["identifiers","osint","parsing","privacy","socid-extractor","socmint","uid"],"created_at":"2024-08-07T00:01:46.221Z","updated_at":"2025-05-14T05:10:49.668Z","avatar_url":"https://github.com/soxoj.png","language":"Python","funding_links":["https://patreon.com/soxoj","https://github.com/sponsors/soxoj","https://buymeacoffee.com/soxoj","https://www.patreon.com/musemercier'","https://www.patreon.com/annetlovart'"],"categories":["Python"],"sub_categories":[],"readme":"# socid_extractor\n\nExtract information about a user from profile webpages / API responses and save it in machine-readable format.\n\n## Usage\n\nAs a command-line tool:\n```\n$ socid_extractor --url https://www.deviantart.com/muse1908\ncountry: France\ncreated_at: 2005-06-16 18:17:41\ngender: female\nusername: Muse1908\nwebsite: www.patreon.com/musemercier\nlinks: ['https://www.facebook.com/musemercier', 'https://www.instagram.com/muse.mercier/', 'https://www.patreon.com/musemercier']\ntagline: Nothing worth 
having is easy...\n```\n\nWithout installing: \n```\n$ ./run.py --url https://www.deviantart.com/muse1908\n```\n\nAs a Python library:\n```\n\u003e\u003e\u003e import socid_extractor, requests\n\u003e\u003e\u003e r = requests.get('https://www.patreon.com/annetlovart')\n\u003e\u003e\u003e socid_extractor.extract(r.text)\n{'patreon_id': '33913189', 'patreon_username': 'annetlovart', 'fullname': 'Annet Lovart', 'links': \"['https://www.facebook.com/322598031832479', 'https://www.instagram.com/annet_lovart', 'https://twitter.com/annet_lovart', 'https://youtube.com/channel/UClDg4ntlOW_1j73zqSJxHHQ']\"}\n```\n\n## Installation\n\n    $ pip3 install socid-extractor\n\nThe latest development version can be installed directly from GitHub:\n\n    $ pip3 install -U git+https://github.com/soxoj/socid_extractor.git\n\n## Sites and methods\n\n[More than 100 methods](https://github.com/soxoj/socid-extractor/blob/master/METHODS.md) for different sites and platforms are supported!\n\n- Google (all documents pages, maps contributions), cookies required\n- Yandex (disk, albums, znatoki, music, realty, collections), cookies required to prevent captcha blocks\n- Mail.ru (my.mail.ru user mainpage, photo, video, games, communities)\n- Facebook (user \u0026 group pages)\n- VK.com (user page)\n- OK.ru (user page)\n- Instagram\n- Reddit\n- Medium\n- Flickr\n- Tumblr\n- TikTok\n- GitHub\n\n...and many others.\n\nYou can also check the [tests file](https://github.com/soxoj/socid-extractor/blob/master/tests/test_e2e.py) for data examples and the [schemes file](https://github.com/soxoj/socid-extractor/blob/master/socid_extractor/schemes.py) to explore all the methods.\n\n## When it may be useful\n\n- Getting all available info by the username and/or account UID. 
Examples: [Week in OSINT](https://medium.com/week-in-osint/getting-a-grasp-on-googleids-77a8ab707e43), [OSINTCurious](https://osintcurio.us/2019/10/01/searching-instagram-part-2/)\n- User tracking: checking that an account was previously known (by ID) even if all its public info has changed. Example: [Aware Online](https://www.aware-online.com/en/importance-of-user-ids-in-social-media-investigations/)\n- Searching by commonly used cross-service UIDs (GAIA ID, Facebook UID, Yandex Public ID, etc.)\n  - DB leaks of forums and platforms in SQL format\n  - Indexed links that contain the target profile ID\n- Searching for tracking data by comparison with other IDs - [how it works](https://www.eff.org/wp/behind-the-one-way-mirror), [how it can be used](https://www.nytimes.com/interactive/2019/12/19/opinion/location-tracking-cell-phone.html).\n- Law enforcement investigations\n\n## SOWEL classification\n\nThis tool uses the following OSINT techniques:\n- [SOTL-1.4. Analyze Internal Identifiers](https://sowel.soxoj.com/internal-identifiers)\n- [SOTL-11.1. 
Check Outdated And Unused Functionality](https://sowel.soxoj.com/outdated-unused-functionality)\n\n\n## Tools using socid_extractor\n\n- [Maigret](https://github.com/soxoj/maigret) - a powerful username checker that generates a report with all available info from the accounts found.\n\n- [TheScrapper](https://github.com/champmq/TheScrapper) - scrapes emails, phone numbers and social media accounts from a website.\n\n- [InfoHunter](https://github.com/sweetnight19/InfoHunter) - an open source OSINT tool that allows you to search, collect and analyze information online to get a complete picture of the person or company you are interested in.\n\n- [YaSeeker](https://github.com/HowToFind-bot/YaSeeker) - a tool to gather all available information about a Yandex account by login/email.\n\n- [Marple](https://github.com/soxoj/marple) - scrapes search engine results for a given username.\n\n## Testing\n\n```sh\npython3 -m pytest tests/test_e2e.py -n 10  -k 'not cookies' -m 'not github_failed and not rate_limited'\n```\n\n## Contributing\n\nCheck the [separate page](https://github.com/soxoj/socid-extractor/blob/master/CONTRIBUTING.md) if you want to add a new method or fix anything.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoxoj%2Fsocid-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoxoj%2Fsocid-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoxoj%2Fsocid-extractor/lists"}