{"id":28386964,"url":"https://github.com/netwrix/flarewell","last_synced_at":"2026-02-14T14:33:19.798Z","repository":{"id":293180805,"uuid":"983209640","full_name":"netwrix/flarewell","owner":"netwrix","description":"Say goodbye to MadCap Flare and convert your project to markdown!","archived":false,"fork":false,"pushed_at":"2025-06-06T13:24:23.000Z","size":822784,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-27T23:48:25.408Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/netwrix.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-14T03:42:17.000Z","updated_at":"2025-06-06T13:24:26.000Z","dependencies_parsed_at":"2025-05-14T05:08:46.248Z","dependency_job_id":null,"html_url":"https://github.com/netwrix/flarewell","commit_stats":null,"previous_names":["netwrix/flarewell"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/netwrix/flarewell","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netwrix%2Fflarewell","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netwrix%2Fflarewell/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netwrix%2Fflarewell/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netwrix%2Fflarewell/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/netwrix","download_url":"https://codeload.github.com/netwrix/flarewell/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netwrix%2Fflarewell/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29447274,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-14T14:10:32.461Z","status":"ssl_error","status_checked_at":"2026-02-14T14:09:49.945Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-30T16:09:01.554Z","updated_at":"2026-02-14T14:33:19.785Z","avatar_url":"https://github.com/netwrix.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HTML to Markdown Converter - Claude Instructions\n\n\u003cproject_overview\u003e\nA Python tool that converts HTML documentation (particularly from MadCap Flare) to Markdown format while preserving folder structure and centralizing images with intelligent deduplication.\n\u003c/project_overview\u003e\n\n## Core Functionality\n\n\u003cconversion_rules\u003e\n- **Input**: HTML files (`.html`, `.htm`, `.xhtml`)\n- **Output**: Markdown files (`.md`)\n- **Directory Structure**: Preserved except for images\n- **Image Handling**: Centralized in `static/img/{productname}` directory\n- **Filename Convention**: All lowercase with underscores replacing spaces\n- **Path References**: Absolute paths from parent output directory\n\u003c/conversion_rules\u003e\n\n## Key Features\n\n\u003cfeatures\u003e\n\u003cfeature name=\"intelligent_deduplication\"\u003e\n- Detects identical images using content hashing\n- Stores only one copy of duplicate images\n- Tracks usage in `image-manifest.json`\n\u003c/feature\u003e\n\n\u003cfeature name=\"link_preservation\"\u003e\n- Updates all internal `.html` links to `.md`\n- Maintains anchor links between documents\n- Resolves cross-file references automatically\n\u003c/feature\u003e\n\n\u003cfeature name=\"image_centralization\"\u003e\n- All images stored in `/static/img/{mirror_doc_directory}`\n- One image folder per product\n- Only referenced images are copied\n\u003c/feature\u003e\n\u003c/features\u003e\n\n## Installation \u0026 Setup\n\n\u003csetup_instructions\u003e\n```bash\n# 1. Clone repository\ngit clone [repository_url]\n\n# 2. Create virtual environment\npython3 -m venv venv\nsource venv/bin/activate  # Windows: venv\\Scripts\\activate\n\n# 3. Install dependencies\npip install beautifulsoup4 markdownify\n```\n\u003c/setup_instructions\u003e\n\n## Usage\n\n\u003cusage_examples\u003e\n\u003cexample name=\"basic\"\u003e\n```bash\npython app.py /path/to/html/docs /path/to/output\n```\n\u003c/example\u003e\n\n\u003cexample name=\"verbose\"\u003e\n```bash\npython app.py /path/to/html/docs /path/to/output --verbose\n```\n\u003c/example\u003e\n\u003c/usage_examples\u003e\n\n## Output Structure\n\n\u003coutput_structure\u003e\n```\noutput/                    # Specified output directory\n├── Product1/             # Markdown files (structure preserved)\n│   ├── guide/\n│   │   └── intro.md\n│   └── api/\n│       └── reference.md\n└── Product2/\n    └── docs/\n        └── overview.md\n\nstatic/                   # Parallel to output directory\n└── img/                 # Centralized images (not 'images')\n    ├── image-manifest.json  # Deduplication tracking\n    ├── Product1/\n    │   ├── guide/\n    │   │   └── screenshot.png\n    │   └── api/\n    │       └── diagram.png\n    └── Product2/\n        └── docs/\n            └── logo.png\n```\n\u003c/output_structure\u003e\n\n## Implementation Details\n\n\u003cprocessing_phases\u003e\n\u003cphase number=\"1\" name=\"scanning\"\u003e\n- Scan for images and build reference map\n- Create anchor mappings for cross-references\n- Build deduplication hash table\n\u003c/phase\u003e\n\n\u003cphase number=\"2\" name=\"conversion\"\u003e\n- Convert HTML to Markdown\n- Update all link references\n- Copy unique images to static directory\n- Generate image-manifest.json\n\u003c/phase\u003e\n\u003c/processing_phases\u003e\n\n## Critical Requirements\n\n\u003crequirements\u003e\n\u003crequirement priority=\"high\"\u003e\n- Never modify source files\n- Preserve all internal links\n- Handle MadCap Flare-specific HTML structures\n\u003c/requirement\u003e\n\n\u003crequirement priority=\"medium\"\u003e\n- Maintain readable Markdown output\n- Optimize image storage through deduplication\n- Generate comprehensive image manifest\n\u003c/requirement\u003e\n\u003c/requirements\u003e\n\n## Error Handling\n\n\u003cerror_scenarios\u003e\n\u003cscenario name=\"missing_images\"\u003e\n- Log warning but continue processing\n- Record in image-manifest.json\n- Preserve image reference in Markdown\n\u003c/scenario\u003e\n\n\u003cscenario name=\"invalid_html\"\u003e\n- Attempt best-effort conversion\n- Log parsing errors with file path\n- Continue with next file\n\u003c/scenario\u003e\n\n\u003cscenario name=\"duplicate_output\"\u003e\n- Check for existing files\n- Option to overwrite or skip\n- Log conflicts\n\u003c/scenario\u003e\n\u003c/error_scenarios\u003e\n\n## Performance Considerations\n\n\u003cperformance\u003e\n- **Expected Speed**: ~1-2 seconds per file\n- **Memory Usage**: Scales with image deduplication table\n- **Disk Usage**: Reduced through image deduplication\n- **Large Documentation Sets**: Two-pass processing for efficiency\n\u003c/performance\u003e\n\n## Troubleshooting Guide\n\n\u003ctroubleshooting\u003e\n\u003cissue name=\"broken_images\"\u003e\n\u003ccause\u003eImage not referenced in HTML or missing from source\u003c/cause\u003e\n\u003csolution\u003e\n1. Verify image exists in source\n2. Check if referenced in HTML\n3. Review image-manifest.json\n4. Confirm static/img structure\n\u003c/solution\u003e\n\u003c/issue\u003e\n\n\u003cissue name=\"broken_links\"\u003e\n\u003ccause\u003eCross-reference anchors not found\u003c/cause\u003e\n\u003csolution\u003e\n1. Check anchor mappings in verbose output\n2. Verify target document exists\n3. Confirm anchor ID consistency\n\u003c/solution\u003e\n\u003c/issue\u003e\n\u003c/troubleshooting\u003e\n\n## Command Reference\n\n\u003ccli_options\u003e\n| Option | Type | Description | Default |\n|--------|------|-------------|---------|\n| `input_dir` | Required | Source HTML directory | - |\n| `output_dir` | Required | Destination for Markdown | - |\n| `--verbose, -v` | Flag | Show detailed progress | False |\n| `--overwrite` | Flag | Overwrite existing files | False |\n| `--skip-images` | Flag | Convert without copying images | False |\n\u003c/cli_options\u003e\n\n## Testing Checklist\n\n\u003ctesting\u003e\n- [ ] Basic HTML to Markdown conversion\n- [ ] Image deduplication across multiple files\n- [ ] Cross-file link resolution\n- [ ] MadCap Flare specific elements\n- [ ] Large documentation set performance\n- [ ] Edge cases (empty files, broken HTML)\n\u003c/testing\u003e\n\n## Future Enhancements\n\n\u003cenhancements\u003e\n- Support for custom CSS preservation\n- Batch processing with progress bar\n- Configuration file support\n- Plugin system for custom transformations\n\u003c/enhancements\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnetwrix%2Fflarewell","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnetwrix%2Fflarewell","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnetwrix%2Fflarewell/lists"}