https://github.com/ergut/md-hierarchy
Transform large markdown files into hierarchical folder structures for better navigation and AI-assisted editing.
- Host: GitHub
- URL: https://github.com/ergut/md-hierarchy
- Owner: ergut
- Created: 2025-10-15T16:47:44.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-10-16T07:33:45.000Z (3 months ago)
- Last Synced: 2025-10-16T15:10:07.459Z (3 months ago)
- Topics: cli, command-line-tool, document-management, documentation, file-organization, hierarchy, markdown, markdown-merge, markdown-parser, markdown-split, markdown-tools, static-site-generator
- Language: Python
- Homepage:
- Size: 1.46 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
# md-hierarchy
A CLI tool that splits markdown files into hierarchical folder structures based on heading levels, and can reconstruct the original markdown from the split pieces.
## Features
- **Split** markdown files into navigable folder hierarchies
- **Merge** folder structures back into single markdown files
- Preserves all markdown elements (code blocks, lists, tables, links, etc.)
- Handles edge cases (duplicate headings, empty headings, skipped levels)
- Round-trip compatible (split → merge produces equivalent content)
- Dry-run mode to preview operations
## Installation
```bash
# From PyPI
pip install md-hierarchy
# From source
pip install -e .
# With development dependencies
pip install -e ".[dev]"
```
## Usage
### Split Command
Split a markdown file into a hierarchical folder structure:
```bash
md-hierarchy split input.md output_dir --level 3
```
**Options:**
- `--level, -l`: Heading level to extract as files (1-4, default: 3)
- `--overwrite`: Overwrite output directory if it exists
- `--verbose, -v`: Print detailed operation log
- `--dry-run`: Show what would be done without writing files
**Example:**
```bash
# Split at level 3 (H3 headings become files)
md-hierarchy split proposal.md ./output --level 3
# Split with overwrite
md-hierarchy split proposal.md ./output --level 2 --overwrite
# Preview without creating files
md-hierarchy split proposal.md ./output --dry-run
```
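To illustrate what `--level` selects, here is a minimal sketch of locating headings up to a given depth. It is an assumption-laden simplification: the tool itself depends on `markdown-it-py` for parsing, whereas this sketch uses a plain regular expression, recognizes only ATX (`#`) headings, and treats fenced code blocks naively. The name `find_headings` is hypothetical and not part of the tool.

```python
import re
from typing import List, Tuple

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
# Opening or closing fence of a code block (simplified: any fence toggles).
FENCE = re.compile(r"^\s*(```|~~~)")


def find_headings(text: str, max_level: int = 3) -> List[Tuple[int, int, str]]:
    """Return (line_number, level, title) for headings at or above max_level.

    Hypothetical helper: lines inside fenced code blocks are skipped so that
    '#' comments in code are not mistaken for headings.
    """
    headings = []
    in_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match and len(match.group(1)) <= max_level:
            headings.append((number, len(match.group(1)), match.group(2)))
    return headings
```

With `max_level=3`, an H4 heading is not reported; presumably such deeper headings stay inside the content of the enclosing H3 file.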
### Merge Command
Merge a folder structure back into a single markdown file:
```bash
md-hierarchy merge input_dir output.md
```
**Options:**
- `--verbose, -v`: Print detailed operation log
**Example:**
```bash
# Merge folder structure
md-hierarchy merge ./output merged.md
# Merge with verbose output
md-hierarchy merge ./split-docs final.md --verbose
```
## Output Structure
When splitting at level 3, the tool creates this structure:
```
output-dir/
├── 00-__frontmatter__.md              # Content before first heading (if exists)
├── 01-Introduction/
│   ├── 00-__intro__.md                # H1 heading + intro content (always created)
│   ├── 01-Background/
│   │   ├── 00-__intro__.md            # H2 heading + intro content (always created)
│   │   ├── 01-Problem-Statement.md    # H3 section
│   │   └── 02-Research-Gap.md         # H3 section
│   └── 02-Objectives/
│       ├── 00-__intro__.md            # H2 heading (even if no intro content)
│       └── 01-Primary-Goals.md
└── 02-Methodology/
    └── 00-__intro__.md                # H1 heading + content
```
## File Naming Convention
- **Folders:** `NN-Sanitized-Title/` (e.g., `01-Introduction/`)
- **Intro files:** `00-__intro__.md` (always created for every heading folder)
- **Frontmatter:** `00-__frontmatter__.md` (at root, only if content exists before first heading)
- **Section files:** `NN-Sanitized-Title.md` (e.g., `01-Problem-Statement.md`)
- Numbers are zero-padded (01, 02, ..., 99)
- Special characters (`/ \ : * ? " < > |`) are removed
- Spaces are replaced with hyphens
- Maximum length: 50 characters
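The naming rules above can be sketched as a small function. This is an illustration rather than the tool's actual code: `sanitize_title` and `numbered_name` are hypothetical names, and it assumes the 50-character limit applies to the sanitized title without the numeric prefix, which the README does not specify.

```python
import re

# Characters listed above as removed from titles: / \ : * ? " < > |
FORBIDDEN = r'[/\\:*?"<>|]'
MAX_LEN = 50  # assumption: limit applies to the title part only


def sanitize_title(title: str) -> str:
    """Hypothetical helper: drop forbidden characters, hyphenate spaces, truncate."""
    cleaned = re.sub(FORBIDDEN, "", title).strip()
    cleaned = re.sub(r"\s+", "-", cleaned)
    # Avoid a dangling hyphen if truncation cuts right after a space.
    return cleaned[:MAX_LEN].rstrip("-")


def numbered_name(index: int, title: str, is_file: bool) -> str:
    """Build 'NN-Sanitized-Title.md' for files or 'NN-Sanitized-Title/' for folders."""
    name = f"{index:02d}-{sanitize_title(title)}"
    return name + ".md" if is_file else name + "/"
```

For example, `numbered_name(1, "Problem Statement", is_file=True)` gives `01-Problem-Statement.md`, matching the section file shown in the convention above.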
### Key Design Decisions
- `00-__intro__.md` is **always created** for every heading folder, even if empty
  - This provides a consistent structure and an easy place to add intro text later
  - It contains the heading declaration and any content before child sections
- The `00-` prefix ensures intro files sort first in directory listings
- The `__intro__` naming (double underscore) clearly marks these as special/meta files
- Frontmatter files are created at the root only when pre-heading content exists
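Because every name carries a zero-padded numeric prefix, a plain lexicographic sort recovers document order, with `00-` files first in each folder. The sketch below shows one way a merge could walk such a tree. It is an assumption about the traversal, not the tool's confirmed implementation, and `merge_order` is a hypothetical name.

```python
from pathlib import Path
from typing import Iterator


def merge_order(root: Path) -> Iterator[Path]:
    """Yield markdown files depth-first, sorted by name within each folder.

    Hypothetical helper: relies on the NN- prefixes so that, for example,
    00-__intro__.md is visited before 01-Background/ in the same folder.
    """
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        if entry.is_dir():
            yield from merge_order(entry)
        elif entry.suffix == ".md":
            yield entry
```

Concatenating the files in this order would place each heading's intro before its child sections, which is the property the `00-` prefix is meant to guarantee.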
## Edge Cases Handled
1. **Empty headings** → `Untitled-Section-N`
2. **Duplicate titles** → Append `-2`, `-3`, etc.
3. **Skipped levels** (H1 → H3) → Insert `00-Content/` folder
4. **Content before first heading** → `00-__frontmatter__.md` at root
5. **Heading attributes** (e.g., `{#id .class}`) → Preserved in content
6. **Headings with no intro content** → `00-__intro__.md` still created (with just the heading)
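Cases 1 and 2 can be sketched together as a pass over sibling titles. This is a hedged illustration: `unique_titles` is a hypothetical helper, and it assumes the `N` in `Untitled-Section-N` counts empty headings in order of appearance, which the README does not state.

```python
from collections import Counter
from typing import List


def unique_titles(titles: List[str]) -> List[str]:
    """Hypothetical helper: name empty headings and disambiguate duplicates."""
    seen: Counter = Counter()
    untitled = 0
    result = []
    for title in titles:
        if not title.strip():
            # Assumption: N is a running count of empty headings.
            untitled += 1
            title = f"Untitled-Section-{untitled}"
        seen[title] += 1
        # The first occurrence keeps its name; later ones get -2, -3, ...
        result.append(title if seen[title] == 1 else f"{title}-{seen[title]}")
    return result
```

This sketch does not handle the rarer collision where a generated name such as `Setup-2` matches a heading that is literally titled `Setup-2`.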
## Round-Trip Compatibility
The tool is designed for round-trip use: merging the output of a split should reproduce content equivalent to the original file:
```bash
# Split
md-hierarchy split original.md ./split --level 3
# Merge
md-hierarchy merge ./split reconstructed.md
# Content should be equivalent
diff original.md reconstructed.md
```
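Since the README promises equivalent rather than byte-identical content, a strict `diff` may report whitespace-only differences. The sketch below compares two documents after normalizing whitespace. What counts as "equivalent" here (ignoring trailing spaces and runs of blank lines) is an assumption, and `normalize` and `equivalent` are hypothetical names.

```python
from typing import List


def normalize(text: str) -> List[str]:
    """Strip trailing whitespace and collapse runs of blank lines.

    Note: this also collapses blank lines inside code blocks, so it is a
    loose check rather than a strict markdown comparison.
    """
    lines: List[str] = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line == "" and (not lines or lines[-1] == ""):
            continue  # skip leading blanks and repeated blank lines
        lines.append(line)
    while lines and lines[-1] == "":
        lines.pop()  # drop trailing blank lines
    return lines


def equivalent(a: str, b: str) -> bool:
    """Return True when two documents match after whitespace normalization."""
    return normalize(a) == normalize(b)
```

It could be applied to the two files with `equivalent(Path("original.md").read_text(), Path("reconstructed.md").read_text())` after importing `Path` from `pathlib`.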
## Development
### Setup
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install in development mode
pip install -e ".[dev]"
```
### Run Tests
```bash
pytest
```
### Run Tests with Coverage
```bash
pytest --cov=md_hierarchy --cov-report=html
```
## Requirements
- Python 3.8+
- Dependencies:
  - `markdown-it-py` - Markdown parsing
  - `click` - CLI framework
## License
MIT