{"id":13467970,"url":"https://github.com/DavidJacobson/SafeText","last_synced_at":"2025-03-26T03:31:22.307Z","repository":{"id":93150158,"uuid":"115885563","full_name":"DavidJacobson/SafeText","owner":"DavidJacobson","description":"Script to remove homoglyphs and zero-width characters to allow for safe distribution of documents from anonymous sources.","archived":false,"fork":false,"pushed_at":"2019-07-04T16:40:54.000Z","size":28,"stargazers_count":135,"open_issues_count":3,"forks_count":11,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-10-29T21:59:13.627Z","etag":null,"topics":["forensic-analysis"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DavidJacobson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-12-31T21:11:45.000Z","updated_at":"2024-09-17T08:25:09.000Z","dependencies_parsed_at":"2023-03-22T06:02:39.912Z","dependency_job_id":null,"html_url":"https://github.com/DavidJacobson/SafeText","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidJacobson%2FSafeText","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidJacobson%2FSafeText/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidJacobson%2FSafeText/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidJacobson%2FSafeText/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DavidJacobs
on","download_url":"https://codeload.github.com/DavidJacobson/SafeText/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245584774,"owners_count":20639626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["forensic-analysis"],"created_at":"2024-07-31T15:01:03.415Z","updated_at":"2025-03-26T03:31:21.407Z","avatar_url":"https://github.com/DavidJacobson.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# SafeText\nTool to sanitize text to allow for safe distribution of documents from anonymous sources by removing zero-width characters and homoglyphs.\n\nIndividuals attempting to leak an email or other text file face the risk of identification through fingerprinting.\nFingerprinting often occurs when the original distributor of the document has embedded some form of a canary.\nFor example, Elon Musk's [email](https://web.archive.org/web/20131020092330/http://gawker.com/5164035/tesla-ceo-in-digital-witch-hunt) in 2008 in response to leaks featured slightly different \nwording for each employee. The employees noticed this tactic, and it failed. An easier\ntactic that is also employed is to make nearly invisible changes to the text. 
\nSafeText is designed to identify and remove these changes.\nSpecifically, this tool removes homoglyphs, zero-width characters, and other subtle characters.\nIt also attempts to flag region-specific spellings (for example, colour) that could give away an individual's location.\n\n## Usage\n\nTo use SafeText, call:\n```shell\npython safetext.py inputfile\n```\nExample output:\n```shell\nλ python safetext.py TestFile.txt\n[*] Cleaning TestFile.txt to TestFile.txt.safe ...\n[!] FOUND HOMOGLYPHIC CHARACTER CYRILLIC_large_H ON LINE 1\nThe message said: \"(Н)ey, let's hang out!\"\n[!] FOUND a SPACE ON LINE # 2\nLorem*Ipsum*Dolor*Sit\n[!] WARNING - Use of spelling (colour) that identifies country on line 3\n[!] FOUND HOMOGLYPHIC CHARACTER GREEK_B ON LINE 5\n[!] FOUND HOMOGLYPHIC CHARACTER GREEK_C ON LINE 5\nSubject: (Β)udget (Ϲ)uts\n[*] Output file closed\n\n```\nNote: In the actual output, the relevant characters are underlined, not enclosed in parentheses.\nSafeText writes its output to inputfile.safe.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDavidJacobson%2FSafeText","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDavidJacobson%2FSafeText","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDavidJacobson%2FSafeText/lists"}