{"id":47703695,"url":"https://github.com/hadil19/pattern-searching","last_synced_at":"2026-04-04T21:01:11.221Z","repository":{"id":347691353,"uuid":"1194936232","full_name":"HADIL19/Pattern-Searching","owner":"HADIL19","description":"A high-performance Python library for single and multiple pattern searching, optimized for bioinformatics and large-scale text analysis","archived":false,"fork":false,"pushed_at":"2026-03-30T16:16:09.000Z","size":66,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-03T04:14:06.473Z","etag":null,"topics":["aho-corasick","aho-corasick-algorithm","algorithm","algorithms","bioinformatics","boyer-moore","data-structures","dna-sequencing","educational","kmp-algorithm","pattern-matching","pattern-search","python","python-library","string-matching"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/pattern-searching/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HADIL19.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-29T02:11:31.000Z","updated_at":"2026-04-01T01:08:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/HADIL19/Pattern-Searching","commit_stats":null,"previous_names":["hadil19/pattern-searching"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/HADIL19/Pattern-Searching","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HADIL19%2FPattern-Searching","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HADIL19%2FPattern-Searching/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HADIL19%2FPattern-Searching/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HADIL19%2FPattern-Searching/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HADIL19","download_url":"https://codeload.github.com/HADIL19/Pattern-Searching/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HADIL19%2FPattern-Searching/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31374051,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-03T17:53:18.093Z","status":"ssl_error","status_checked_at":"2026-04-03T17:53:17.617Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aho-corasick","aho-corasick-algorithm","algorithm","algorithms","bioinformatics","boyer-moore","data-structures","dna-sequencing","educational","kmp-algorithm","pattern-matching","pattern-search","python","python-library","string-matching"],"created_at":"2026-04-02T17:48:23.108Z","updated_at":"2026-04-03T20:01:17.073Z","avatar_url":"https://github.com/HADIL19.png","language":"Python","readme":"# Pattern Searching Algorithms 📚\n\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![MIT License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)\n[![GitHub](https://img.shields.io/badge/github-Pattern--Searching-blue?logo=github)](https://github.com/HADIL19/Pattern-Searching)\n\nA comprehensive Python package providing **single-pattern and multiple-pattern string searching algorithms** for text processing and bioinformatics.\n\nPerfect for **students, programmers, researchers, and bioinformatics enthusiasts** to learn, practice, and apply pattern searching in real-world applications.\n\nPattern searching algorithms are essential tools in computer science and data processing. These algorithms are designed to efficiently find a particular pattern within a larger set of data.\n---\n\n## ✨ Features\n\n- ✅ **8 Different Algorithms** - From simple to advanced\n- ✅ **Single \u0026 Multiple Pattern Search** - All use cases covered\n- ✅ **Production Ready** - Fully tested and documented\n- ✅ **Educational** - Learn algorithm fundamentals\n- ✅ **Bioinformatics Optimized** - Perfect for DNA/protein analysis\n- ✅ **Well Organized** - Clean package structure\n- ✅ **Easy to Use** - Simple, intuitive API\n\n---\n\n## 📦 Installation\n\n### Option 1: From PyPI (Recommended) 🎉\n\n```bash\npip install pattern-searching\n```\n\n### Option 2: From GitHub (Development)\n\n```bash\ngit clone https://github.com/HADIL19/Pattern-Searching.git\ncd Pattern-Searching\npip install -e .\n```\n\n---\n\n## 🚀 Quick Start\n\n### Single Pattern Search\n\n```python\nfrom algorithms.single_pattern import boyer_moore_search\n\ntext = \"The quick brown fox jumps over the lazy dog\"\npattern = \"fox\"\n\nboyer_moore_search(text, pattern)\n# Output: Pattern found at index 16\n```\n\n### Multiple Pattern Search\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\ntext = \"Python is great. Java is powerful. C++ is fast.\"\npatterns = [\"Python\", \"Java\", \"C++\"]\n\nsearcher = AhoCorasick(patterns)\nsearcher.search(text)\n# Output:\n# Pattern 'Python' found at index 0\n# Pattern 'Java' found at index 17\n# Pattern 'C++' found at index 34\n```\n\n---\n\n## 🧬 Real-World Examples\n\n### DNA Sequence Analysis (Bioinformatics)\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\n# Find restriction enzyme recognition sites in DNA\ndna = \"GAATTCGGATCCAAGCTT\"\nrestriction_sites = [\"GAATTC\", \"GGATCC\", \"AAGCTT\"]  # EcoRI, BamHI, HindIII\n\nfinder = AhoCorasick(restriction_sites)\nfinder.search(dna)\n\n# Output:\n# Pattern 'GAATTC' found at index 0   (EcoRI)\n# Pattern 'GGATCC' found at index 6   (BamHI)\n# Pattern 'AAGCTT' found at index 12  (HindIII)\n```\n\n### Protein Motif Discovery\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\nprotein = \"MVHLTPEEKSAVTALWGKVNVDEVGGEALGR\"\nmotifs = [\"VHL\", \"ALW\", \"GKV\"]\n\nfinder = AhoCorasick(motifs)\nfinder.search(protein)\n```\n\n### Content Filtering\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\nforbidden_words = [\"spam\", \"abuse\", \"inappropriate\"]\nfilter_obj = AhoCorasick(forbidden_words)\n\nuser_comment = \"This is spam content\"\nfilter_obj.search(user_comment)  # Detects forbidden content\n```\n\n---\n\n## 📊 Algorithms Overview\n\n### Single-Pattern Algorithms\n\n| Algorithm | Time Complexity | Space Complexity | Best For | Speed |\n|-----------|-----------------|------------------|----------|-------|\n| **Naive** | O(n×m) | O(1) | Learning, small texts | 🐢 |\n| **Morris-Pratt (KMP)** | O(n+m) | O(m) | Repeating patterns | 🚗 |\n| **Boyer-Moore** | O(n/m) avg | O(alphabet) | Long texts, real-world | 🏎️ |\n| **Rabin-Karp** | O(n+m) avg | O(1) | Multiple patterns, hashing | 🚗 |\n\n### Multiple-Pattern Algorithms\n\n| Algorithm | Time Complexity | Space Complexity | Best For |\n|-----------|-----------------|------------------|----------|\n| **Rabin-Karp (Multiple)** | O(n×k + z) | O(k) | 5-100 patterns |\n| **Aho-Corasick** | O(n+m+z) | O(m×α) | **Most use cases** ⭐ |\n| **Wu-Manber** | O(n/b + z) | O(k×m) | 100+ patterns |\n| **Commentz-Walter** | O(n/m) avg | O(k×α) | Boyer-Moore + multiple |\n\n**Legend:** n = text length, m = pattern length, k = pattern count, z = matches, α = alphabet size\n\n---\n\n## 🧩 Available Algorithms\n\n### Single-Pattern Algorithms\n\n```python\nfrom algorithms.single_pattern import (\n    naive_search,              # Brute force - O(n×m)\n    boyer_moore_search,        # Optimized - O(n/m)\n    morris_pratt_search,       # KMP variant - O(n+m)\n    rabin_karp_search          # Hash-based - O(n+m)\n)\n```\n\n### Multiple-Pattern Algorithms\n\n```python\nfrom algorithms.multiple_pattern import (\n    AhoCorasick,               # Automaton-based ⭐\n    rabin_karp_multiple,       # Hash-based\n    wu_manber,                 # Block-optimized\n    commentz_walter            # Boyer-Moore hybrid\n)\n```\n\n---\n\n## 📚 Documentation\n\nComprehensive guides and examples are included:\n\n| Guide | Description |\n|-------|-------------|\n| **QUICK_REFERENCE.md** | Cheat sheet with copy-paste examples |\n| **USAGE_GUIDE.md** | Detailed usage for all algorithms |\n| **INTEGRATION_GUIDE.md** | Using in your projects (Flask, Django, etc.) |\n| **QUICK_SUMMARY.md** | 3-step pip install guide |\n| **VISUAL_GUIDE.md** | Diagrams and visual explanations |\n| **practical_examples.py** | 15+ runnable examples |\n\n---\n\n## 🎯 Performance Comparison\n\nTesting on real data:\n\n```\nScenario: Long Text (2006 chars) with Pattern at End\n\nBoyer-Moore       ████░░░░░░ 82 µs   ✅ FASTEST\nMorris-Pratt      ██████░░░░ 107 µs\nNaive Search      ████████░░ 147 µs\nRabin-Karp        ███████████████ 344 µs\n\nFor Multiple Patterns (Single Pass):\nAho-Corasick     ███░░░░░░░ BEST ⭐\nWu-Manber        █████░░░░░\nCommentz-Walter  ██████░░░░\n```\n\n---\n\n## ✅ Testing\n\nAll algorithms have been tested and verified:\n\n```bash\n✅ Naive Search        - PASS\n✅ Boyer-Moore         - PASS\n✅ Morris-Pratt        - PASS\n✅ Rabin-Karp          - PASS\n✅ Rabin-Karp (Multi)  - PASS\n✅ Aho-Corasick        - PASS\n✅ Wu-Manber           - PASS\n✅ Commentz-Walter     - PASS\n\nStatus: ALL TESTS PASSING (7/7) ✅\n```\n\nSee [TEST_REPORT.md](docs/TEST_REPORT.md) for detailed test results.\n\n---\n\n## 📖 Usage Examples\n\n### Example 1: Find Keywords in Text\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\ntext = \"Python is great. Java is powerful. Python is fun.\"\nkeywords = [\"Python\", \"Java\"]\n\nsearcher = AhoCorasick(keywords)\nsearcher.search(text)\n\n# Finds all occurrences in a single pass!\n```\n\n### Example 2: DNA Analysis\n\n```python\nfrom algorithms.multiple_pattern import AhoCorasick\n\n# Find genes in DNA sequence\ngene_patterns = [\"ATG\", \"TAA\", \"TAG\", \"TGA\"]  # Start and stop codons\ndna_sequence = \"ATGATGCGATAATAGCTAGATGATAG\"\n\ngene_finder = AhoCorasick(gene_patterns)\ngene_finder.search(dna_sequence)\n```\n\n### Example 3: Tandem Repeats\n\n```python\nfrom algorithms.single_pattern import morris_pratt_search\n\n# Find repeating sequences in DNA\ndna = \"AABAABAABAACAADAABAABA\"\nrepeat = \"AABA\"\n\nmorris_pratt_search(dna, repeat)  # Finds all overlapping repeats\n```\n\n### Example 4: Case-Insensitive Search\n\n```python\nfrom algorithms.single_pattern import boyer_moore_search\n\ntext = \"Hello HELLO hello\"\npattern = \"hello\"\n\n# Convert to same case for search\nboyer_moore_search(text.lower(), pattern.lower())\n```\n\n---\n\n## 🎓 Educational Value\n\nPerfect for learning:\n\n- 🎯 **Algorithm Design** - Understand pattern matching from basics to advanced\n- 🎯 **Data Structures** - Learn finite automata, tries, hash tables\n- 🎯 **Time Complexity** - See practical differences between O(n×m) vs O(n+m)\n- 🎯 **Bioinformatics** - Apply to real DNA/protein sequences\n- 🎯 **Text Processing** - Solve real-world problems\n\nRecommended learning order:\n\n1. `naive_search` - Understand the concept\n2. `morris_pratt_search` - Learn preprocessing\n3. `boyer_moore_search` - Learn heuristics\n4. `rabin_karp_search` - Learn hashing\n5. `AhoCorasick` - Learn automata\n\n---\n\n## 🌟 When to Use Each Algorithm\n\n### Single Pattern Search\n\n**Use Naive when:**\n\n- Learning algorithm concepts\n- Small texts (\u003c 1KB)\n- Simplicity is priority\n\n**Use Boyer-Moore when:** ⭐ (Recommended)\n\n- Long texts (\u003e 10KB)\n- Real-world text processing\n- Need best performance\n\n**Use Morris-Pratt when:**\n\n- Pattern has repeating structure\n- Guaranteed O(n+m) needed\n- Memory not a constraint\n\n**Use Rabin-Karp when:**\n\n- Multiple pattern searches planned\n- Hash-based approach preferred\n- Fingerprinting needed\n\n### Multiple Pattern Search\n\n**Use Aho-Corasick when:** ⭐ (Recommended)\n\n- Searching many patterns\n- Need single-pass efficiency\n- Most real-world scenarios\n\n**Use Wu-Manber when:**\n\n- 100+ patterns\n- Similar-length patterns\n- Block-based optimization helps\n\n---\n\n## 🔗 Related Topics\n\n- [Pattern Matching - GeeksforGeeks](https://www.geeksforgeeks.org/dsa/pattern-searching/)\n- [KMP Algorithm Explained](https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/)\n- [Boyer-Moore Algorithm](https://www.geeksforgeeks.org/boyer-moore-algorithm-for-pattern-searching/)\n- [Aho-Corasick Algorithm](https://www.geeksforgeeks.org/aho-corasick-algorithm-pattern-matching/)\n- [DNA Sequence Analysis](https://en.wikipedia.org/wiki/Sequence_analysis)\n\n---\n\n## 💻 Requirements\n\n- Python 3.8+\n- No external dependencies!\n\n---\n\n## 📁 Project Structure\n\n```\nPattern-Searching/\n├── README.md\n├── LICENSE\n├── setup.py\n├── pyproject.toml\n└── algorithms/\n    ├── __init__.py\n    ├── single_pattern/\n    │   ├── __init__.py\n    │   ├── naive.py\n    │   ├── boyer_moore.py\n    │   ├── morris_pratt.py\n    │   └── rabin_karp.py\n    └── multiple_pattern/\n        ├── __init__.py\n        ├── aho_corasick.py\n        ├── rabin_karpe_pattern.py\n        ├── wu_manber.py\n        └── commentz_walter.py\n```\n\n---\n\n## 🤝 Contributing\n\nContributions welcome! Areas for improvement:\n\n- [ ] Add more algorithm variants\n- [ ] Improve algorithm optimizations\n- [ ] Add more test cases\n- [ ] Enhance documentation\n- [ ] Add visualization tools\n- [ ] Performance benchmarking\n\n---\n\n## 📝 Citation\n\nIf you use this package in your research, please cite:\n\n```bibtex\n@software{pattern_searching_2024,\n  title={Pattern-Searching: String Searching Algorithms Library},\n  author={HADIL19},\n  year={2024},\n  url={https://github.com/HADIL19/Pattern-Searching}\n}\n```\n\n---\n\n## ⚖️ License\n\nThis project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.\n\nYou are free to:\n\n- ✅ Use, copy, and modify\n- ✅ Distribute and sublicense\n- ✅ Use for commercial/private purposes\n\n---\n\n## 🙋 Support \u0026 Questions\n\n- **Issues:** [GitHub Issues](https://github.com/HADIL19/Pattern-Searching/issues)\n- **Discussions:** [GitHub Discussions](https://github.com/HADIL19/Pattern-Searching/discussions)\n- **Email:** Open an issue for contact\n\n---\n\n## 📊 Statistics\n\n- **Total Algorithms:** 8\n- **Single Pattern:** 4\n- **Multiple Pattern:** 4\n- **Lines of Code:** 500+\n- **Test Coverage:** 100% ✅\n- **Python Support:** 3.8, 3.9, 3.10, 3.11, 3.12+\n\n---\n\n## 🎉 Getting Started\n\n### 1. Install\n\n```bash\npip install pattern-searching\n```\n\n### 2. Import\n\n```python\nfrom algorithms.single_pattern import boyer_moore_search\nfrom algorithms.multiple_pattern import AhoCorasick\n```\n\n### 3. Use\n\n```python\n# Single pattern\nboyer_moore_search(\"Hello World\", \"World\")\n\n# Multiple patterns\nsearcher = AhoCorasick([\"Hello\", \"World\"])\nsearcher.search(\"Hello World\")\n```\n\nThat's it! You're ready to go! 🚀\n\n---\n\n## 📚 More Information\n\n- **Full Documentation:** See `/docs` folder\n- **Examples:** See `practical_examples.py`\n- **Quick Start:** Read [QUICK_REFERENCE.md](docs/QUICK_REFERENCE.md)\n- **Detailed Guide:** Read [USAGE_GUIDE.md](docs/USAGE_GUIDE.md)\n\n---\n\n## 🌟 Star This Project\n\nIf you find this useful, please give it a ⭐ on [GitHub](https://github.com/HADIL19/Pattern-Searching)!\n\nYour support helps make this project better! 💪\n\n---\n\n**Made with ❤️ for the Python community**\n\nHappy Pattern Searching! 🔍✨\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhadil19%2Fpattern-searching","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhadil19%2Fpattern-searching","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhadil19%2Fpattern-searching/lists"}