{"id":24185175,"url":"https://github.com/bitbynik/substitution_cipher","last_synced_at":"2025-08-16T11:07:27.165Z","repository":{"id":272020105,"uuid":"914261825","full_name":"BitByNIK/substitution_cipher","owner":"BitByNIK","description":"SIL765 Assignment-1","archived":false,"fork":false,"pushed_at":"2025-01-19T10:44:34.000Z","size":2343,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-19T11:29:22.387Z","etag":null,"topics":["cryptanalysis","decipher","iitd","security","substitution-cipher"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BitByNIK.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-09T08:59:41.000Z","updated_at":"2025-01-19T10:44:36.000Z","dependencies_parsed_at":"2025-01-19T11:24:42.869Z","dependency_job_id":null,"html_url":"https://github.com/BitByNIK/substitution_cipher","commit_stats":null,"previous_names":["bitbynik/substitution_cipher"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BitByNIK%2Fsubstitution_cipher","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BitByNIK%2Fsubstitution_cipher/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BitByNIK%2Fsubstitution_cipher/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BitByNIK%2Fsubstitution_cipher/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/BitByNIK","download_url":"https://codeload.github.com/BitByNIK/substitution_cipher/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241587799,"owners_count":19986628,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cryptanalysis","decipher","iitd","security","substitution-cipher"],"created_at":"2025-01-13T11:19:01.696Z","updated_at":"2025-03-03T00:24:53.189Z","avatar_url":"https://github.com/BitByNIK.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Substitution Cipher Decoder\n\nThis project is a Python-based implementation of a substitution cipher decoder. It uses **hill climbing**, a search optimization algorithm, to iteratively improve decryption keys. By analyzing patterns in the ciphertext and comparing them with English language characteristics, the script deciphers encrypted text.\n\n---\n\n## How It Works\n\nA substitution cipher replaces each letter in plaintext with a specific character (e.g., numbers, symbols, or other letters). To decrypt such a cipher, the script finds the mapping between ciphertext characters and plaintext letters.\n\n### Steps in the Decryption Process\n\n1. **Initial Setup:**\n\n   - The script starts with a **random key**, mapping ciphertext characters to plaintext letters.\n   - A key represents how each ciphertext character translates into a plaintext letter.\n\n2. 
**Fitness Calculation:**\n\n   - The script evaluates how good the current key is by calculating a **fitness score**.\n   - This score measures how similar the decrypted text is to English using the frequency of groups of four letters (called **quadgrams**).\n   - Formula used for fitness:\n     $$\\text{Fitness} = \\sum_{\\text{quadgram}} \\log(\\text{QF}[\\text{quadgram}])$$\n     - **QF** represents the frequency of quadgrams in standard English text.\n     - Summing logarithms instead of multiplying raw frequencies avoids floating-point underflow, because individual quadgram frequencies can be very small.\n\n3. **Heuristic-Based Key Selection:**\n\n   - A **heuristic function** is used to prioritize which parts of the key to adjust.\n   - The heuristic identifies ciphertext characters whose frequencies differ the most from expected English frequencies.\n   - Formula for the heuristic:\n     $$H(b) = \\max(1, |\\text{FC}(b) - \\text{FE}(\\text{E}^{-1}(b))|)$$\n     Where:\n\n     - $$H(b)$$: Heuristic score for a ciphertext letter $$b$$.\n     - $$\\text{FC}(b)$$: Rank of $$b$$ in ciphertext frequencies.\n     - $$\\text{FE}(a)$$: Rank of the corresponding plaintext letter $$a = \\text{E}^{-1}(b)$$ in English frequencies.\n\n   - A ciphertext letter $$b$$ with a large heuristic value is **more likely to be swapped** because it is far from its expected frequency rank. The goal is to quickly minimize these differences.\n   - Once a character $$b$$ is selected, it is swapped with another randomly chosen character in the key.\n\n4. **Key Evaluation:**\n\n   - The script decrypts the text using the modified key.\n   - If the fitness score improves, the new key is kept. Otherwise, it is discarded.\n\n5. 
**Stopping Condition:**\n   - If the fitness score does not improve after $$T$$ iterations, the process stops.\n   - The entire procedure is repeated multiple times, and the best key across all attempts is chosen.\n\n---\n\n## Why Use a Heuristic?\n\nThe heuristic focuses on ciphertext letters whose observed frequencies differ the most from expected English frequencies. This:\n\n- **Speeds Up Convergence:** Quickly minimizes large frequency mismatches.\n- **Improves Accuracy:** Guides the algorithm to better keys, especially for longer ciphertexts where frequency distributions are more reliable.\n\n---\n\n## Features\n\n- **Hill Climbing Optimization:** Iteratively improves the decryption key using fitness scoring.\n- **Quadgram-Based Scoring:** Evaluates how \"English-like\" the decrypted text is.\n- **Heuristic Guidance:** Targets the most impactful key adjustments to improve efficiency.\n\n---\n\n## Requirements\n\n- **Python 3.6 or Higher**\n- A file containing a large sample of English text is required to calculate the letter and quadgram frequencies. Here, we have used `big.txt`, which should be a plain text file with diverse English content (e.g., books, articles). The quality of decryption improves with a larger and more representative reference text.\n\n### Dependencies:\n\n- Built-in Python libraries: `collections`, `math`, `random`.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbitbynik%2Fsubstitution_cipher","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbitbynik%2Fsubstitution_cipher","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbitbynik%2Fsubstitution_cipher/lists"}