{"id":42186622,"url":"https://github.com/paceaux/mechon-mamre-scraper","last_synced_at":"2026-01-26T22:27:46.373Z","repository":{"id":142964508,"uuid":"202363266","full_name":"paceaux/mechon-mamre-scraper","owner":"paceaux","description":"Python scraper that converts mechon-mamre.org into JSON","archived":false,"fork":false,"pushed_at":"2019-08-14T14:08:17.000Z","size":7,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-05-02T00:40:25.835Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paceaux.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-14T14:07:40.000Z","updated_at":"2024-05-02T00:40:25.836Z","dependencies_parsed_at":"2023-06-11T08:45:28.021Z","dependency_job_id":null,"html_url":"https://github.com/paceaux/mechon-mamre-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/paceaux/mechon-mamre-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paceaux%2Fmechon-mamre-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paceaux%2Fmechon-mamre-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paceaux%2Fmechon-mamre-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paceaux%2Fmechon-mamre-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/paceaux","download_url":"https://codeload.github.com/paceaux/mechon-mamre-scraper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paceaux%2Fmechon-mamre-scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28789738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-26T21:49:50.245Z","status":"ssl_error","status_checked_at":"2026-01-26T21:48:29.455Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-26T22:27:45.793Z","updated_at":"2026-01-26T22:27:46.366Z","avatar_url":"https://github.com/paceaux.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Mechon-Mamre HTML to JSON Converter\n\nA utility to convert [Mechon-Mamre](https://www.mechon-mamre.org/p/pt/pt0.htm) content from HTML to JSON, useful for building an API. **This project is not endorsed by Mechon-Mamre**.\n\n## Prerequisites\n\n- **Python 3.x**\n- **Beautiful Soup** - for parsing HTML content. Install via `pip`:\n\n  ```bash\n  pip install beautifulsoup4 requests\n  ```\n\n## Command Line Usage\n\n### 1. 
Convert a Single Book to JSON\n\nTo create a JSON file for a single book:\n\n```bash\npython bookScraper.py -u https://mechon-mamre.org/p/pt/pt0101.htm\n```\n\nThis command finds all chapters in the specified book and generates a single JSON file containing the book's content.\n\n### 2. Generate a JSON List of All Books\n\nTo create a JSON file that lists all books in the Tanakh:\n\n```bash\npython tanakScraper.py -u https://www.mechon-mamre.org/p/pt/pt0.htm\n```\n\n### 3. Generate JSON Files for Selected or All Books from the Book List\n\nTo scrape books from the Tanakh JSON list and create individual JSON files:\n\n- Use `-g` to specify the group (`torah`, `prophets`, or `writings`).\n- Use `-b` to name individual books (comma-separated).\n- Use `-a` to scrape *all* books in the group.\n\n#### Scrape Specific Books\n\n```bash\npython scrapeAllBooks.py -g prophets -b Zephaniah,Haggai\n```\n\nThis example scrapes and saves JSON files for *Zephaniah* and *Haggai* from the *prophets* group.\n\n#### Scrape All Books\n\n```bash\npython scrapeAllBooks.py -g writings -a\n```\n\nThis command scrapes and saves JSON files for *all books* in the *writings* group.\n\n## File Structure\n\nThe script saves HTML files to a `data/html` directory to prevent re-downloading content on repeated runs. This caching speeds up the process and reduces unnecessary server requests.\n\n## Important Notes\n\n- **Copyright**: Mechon-Mamre states that their content is copyrighted with all rights reserved. This project aims to respect these rights, and permission has been sought to perform this scraping; however, no response has been received.\n\n- **Use Responsibly**: This tool is intended for educational and non-commercial use. 
Please ensure your usage aligns with Mechon-Mamre’s terms.\n\n---\n\n**Disclaimer**: This utility is independently created and is not affiliated with or endorsed by Mechon-Mamre.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaceaux%2Fmechon-mamre-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaceaux%2Fmechon-mamre-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaceaux%2Fmechon-mamre-scraper/lists"}