{"id":34079912,"url":"https://github.com/jeremyctrl/whatenc","last_synced_at":"2025-12-30T07:49:50.705Z","repository":{"id":320792426,"uuid":"1083346137","full_name":"jeremyctrl/whatenc","owner":"jeremyctrl","description":"Text encoding type classifier","archived":false,"fork":false,"pushed_at":"2025-11-02T21:05:41.000Z","size":607,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-02T23:07:41.028Z","etag":null,"topics":["classifier","encoding","text"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jeremyctrl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-25T20:37:25.000Z","updated_at":"2025-11-02T21:05:44.000Z","dependencies_parsed_at":"2025-10-25T23:25:30.598Z","dependency_job_id":"56b225cc-ae25-4ddf-b0af-be1330ae82f4","html_url":"https://github.com/jeremyctrl/whatenc","commit_stats":null,"previous_names":["jeremyctrl/whatenc"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/jeremyctrl/whatenc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeremyctrl%2Fwhatenc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeremyctrl%2Fwhatenc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeremyctrl%2Fwhatenc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/jeremyctrl%2Fwhatenc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jeremyctrl","download_url":"https://codeload.github.com/jeremyctrl/whatenc/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeremyctrl%2Fwhatenc/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27726965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-14T02:00:11.348Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classifier","encoding","text"],"created_at":"2025-12-14T11:04:29.397Z","updated_at":"2025-12-30T07:49:50.698Z","avatar_url":"https://github.com/jeremyctrl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# whatenc\n\n\u003cdiv\u003e\n   \u003ca href=\"https://pypi.org/project/whatenc/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/whatenc.svg\" alt=\"PyPI\"\u003e\u003c/a\u003e\n   \u003ca href=\"https://opensource.org/licenses/MIT\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"License\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\u003ca href=\"https://heyjeremy.vercel.app/blog/classifying-encoded-text\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/blog-how%20can%20a%20program%20recognize%20encoded%20text%20without%20decoding%20it%3F-blue\" alt=\"Blog\"\u003e\u003c/a\u003e\n\nText encoding type classifier.\n\n\u003c/div\u003e\n\n`whatenc` is a command-line tool that identifies the encoding or transformation of a given string or file.\n\nThe model is trained on text samples from the English, Greek, Russian, Hebrew, and Arabic Wikipedia corpora, chosen to represent a diverse set of writing systems (Latin, Greek, Cyrillic, Hebrew, and Arabic scripts). Each line is encoded using multiple encoding schemes to generate labeled examples.\n\n## How It Works\n\n`whatenc` uses a character-level 1D convolutional neural network trained directly on sequences of character-bigram tokens.\n\nEach training sample is represented as:\n- a sequence of character bigrams, padded to a fixed maximum length\n- a scalar feature holding the sample's true length, allowing the network to learn relative string lengths\n\nThis neural approach achieves near-perfect classification accuracy after only a few epochs.\n\n### Supported Encodings\n\n`whatenc` currently recognizes the following formats and transformations:\n\n| Category | Encodings |\n| :------- | :-------- |\n| Base encodings | `base32`, `base64`, `base85`, `hex`, `url` |\n| Text ciphers | `morse` |\n| Compression | `gzip64` |\n| Hash digests | `md5`, `sha1`, `sha224`, `sha256`, `sha384`, `sha512` |\n\n## Installation\n\nYou can install `whatenc` using [pipx](https://pypa.github.io/pipx):\n\n```bash\npipx install whatenc\n```\n\n## Usage\n\n```bash\nwhatenc hello\nwhatenc samples.txt\n```\n\n### Examples\n\n```bash\n[+] input: ZW5jb2RlIHRvIGJhc2U2NCBmb3JtYXQ=\n   [~] top guess   = base64\n      [=] base64   = 1.000\n      [=] base85   = 0.000\n      [=] plain    = 0.000\n\n[+] input: hello\n   [~] top guess   = plain\n      [=] plain    = 1.000\n      [=] md5      = 0.000\n      [=] base64   = 0.000\n\n[*] loading model\n[+] input: האקדמיה ללשון העברית\n   [~] top guess   = plain\n      [=] 
plain    = 1.000\n      [=] base64   = 0.000\n      [=] base85   = 0.000\n\n[*] loading model\n[+] input: bfa99df33b137bc8fb5f5407d7e58da8\n   [~] top guess   = md5\n      [=] md5      = 0.999\n      [=] sha1     = 0.001\n      [=] sha224   = 0.000\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeremyctrl%2Fwhatenc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjeremyctrl%2Fwhatenc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeremyctrl%2Fwhatenc/lists"}