{"id":15130820,"url":"https://github.com/devin-liu/excel-to-markdown","last_synced_at":"2025-04-12T09:15:10.562Z","repository":{"id":256876150,"uuid":"856692689","full_name":"devin-liu/excel-to-markdown","owner":"devin-liu","description":"A Python tool that converts Excel sheets into Markdown tables with automatic table detection, multi-sheet processing, and interactive mode for complex layouts.","archived":false,"fork":false,"pushed_at":"2025-03-31T18:28:05.000Z","size":181,"stargazers_count":21,"open_issues_count":2,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-12T09:14:58.365Z","etag":null,"topics":["excel","markdown","openpyxl","pandas","python","spreadsheets"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devin-liu.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-13T02:57:01.000Z","updated_at":"2025-04-06T10:38:47.000Z","dependencies_parsed_at":"2024-10-31T11:25:35.679Z","dependency_job_id":"b5055220-ddbb-4aef-ad35-922171bac285","html_url":"https://github.com/devin-liu/excel-to-markdown","commit_stats":{"total_commits":40,"total_committers":2,"mean_commits":20.0,"dds":"0.025000000000000022","last_synced_commit":"c7d69dd88944e4b7c561f56f9a96c0ef5dbd070e"},"previous_names":["devin-liu/excel-to-markdown"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devin-liu%2Fexcel-to-markdown","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/devin-liu%2Fexcel-to-markdown/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devin-liu%2Fexcel-to-markdown/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devin-liu%2Fexcel-to-markdown/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devin-liu","download_url":"https://codeload.github.com/devin-liu/excel-to-markdown/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248543839,"owners_count":21121838,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["excel","markdown","openpyxl","pandas","python","spreadsheets"],"created_at":"2024-09-26T03:06:51.002Z","updated_at":"2025-04-12T09:15:10.536Z","avatar_url":"https://github.com/devin-liu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EXCEL-TO-MARKDOWN\n\n![License](https://img.shields.io/badge/license-GPLv3-blue)\n![Python](https://img.shields.io/badge/python-3.1%2B-blue.svg)\n\n**EXCEL-TO-MARKDOWN** is a robust Python tool designed to convert Excel files (`.xlsx` and `.xls`) into well-formatted Markdown tables. 
Leveraging a modular architecture, this tool offers enhanced table detection capabilities, interactive prompts for handling complex Excel layouts, and seamless integration with various project workflows.\n\n## 🛠️ Features\n\n- **Automated Table Detection:** Identifies the first fully populated row as the table header, ensuring accurate Markdown conversion.\n- **Interactive Mode:** Prompts users to specify table regions when automatic detection fails, handling complex and irregular Excel structures.\n- **Modular Design:** Organized into distinct modules for detection, parsing, Markdown generation, and utilities, promoting maintainability and scalability.\n- **Supports Multiple Sheets:** Processes all sheets within an Excel file, generating separate Markdown files for each.\n- **Flexible Column Specification:** Allows users to define column ranges using both letter-based (e.g., `A:D`) and number-based (e.g., `1-4`) inputs.\n- **Unit Tested:** Comprehensive unit tests ensure reliability and facilitate future enhancements.\n- **Easy Integration:** Compatible with Poetry for dependency management and can be integrated into larger projects or CI/CD pipelines.\n\n## 📁 Project Structure\n\n```\nEXCEL-TO-MARKDOWN\n│\n├── .venv\n├── data\n│   ├── input\n│   └── output\n├── docs\n├── excel_to_markdown\n│   ├── __init__.py\n│   ├── main.py\n│   ├── detector.py\n│   ├── parser.py\n│   ├── markdown_generator.py\n│   └── utils.py\n├── src\n├── tests\n│   ├── test_detector.py\n│   ├── test_parser.py\n│   ├── test_markdown_generator.py\n│   └── test_main.py\n├── .gitignore\n├── LICENSE\n├── poetry.lock\n├── pyproject.toml\n└── readme.md\n```\n\n### **Module Breakdown**\n\n- **`excel_to_markdown/`**\n  - **`main.py`**: Entry point of the application. 
Handles argument parsing, orchestrates the workflow, and manages file I/O.\n  - **`detector.py`**: Detects where the table starts within an Excel sheet.\n  - **`parser.py`**: Parses user input, such as column specifications.\n  - **`markdown_generator.py`**: Converts pandas DataFrames to Markdown format.\n  - **`utils.py`**: Utility functions such as column-letter-to-index conversion and filename sanitization.\n\n- **`tests/`**\n  - **`test_detector.py`**\n  - **`test_parser.py`**\n  - **`test_markdown_generator.py`**\n  - **`test_main.py`**\n  \n  *Each test file contains unit tests for its respective module.*\n\n## 🚀 Installation\n\n### **Prerequisites**\n\n- **Python 3.7+**: Ensure you have Python installed. You can download it from [python.org](https://www.python.org/downloads/).\n- **Poetry**: Python dependency management tool. Install it using the following command:\n\n  ```bash\n  curl -sSL https://install.python-poetry.org | python3 -\n  ```\n\n### **Clone the Repository**\n\n```bash\ngit clone https://github.com/devin-liu/excel-to-markdown.git\ncd excel-to-markdown\n```\n\n### **Set Up the Virtual Environment**\n\nPoetry manages virtual environments automatically. To install dependencies:\n\n```bash\npoetry install\n```\n\nTo activate the virtual environment:\n\n```bash\npoetry shell\n```\n\n## 📋 Usage\n\n### **Preparing Your Data**\n\n1. **Input Directory:** Place all your Excel files (`.xlsx` or `.xls`) in the `data/input` directory.\n\n2. **Output Directory:** The converted Markdown files will be saved in the `data/output` directory by default. If this directory doesn't exist, the script will create it.\n\n- **`data/input`**: Directory containing your Excel files.\n- **`data/output`**: (Optional) Directory where Markdown files will be saved. 
If not specified, an `output` folder will be created inside the input directory.\n\n### **Running the Localhost Server**\n\nYou can also start a local server for real-time editing:\n\n```bash\npoetry run app\n```\n\nThis starts a server on localhost so you can edit your spreadsheets locally and see updates immediately.\n\n### **Running the CLI Script**\n\nRun the main script from the command line, passing the input directory and, optionally, the output directory:\n\n```bash\npython -m excel_to_markdown.main data/input data/output\n```\n\n### **Interactive Prompts**\n\nFor each sheet in each Excel file:\n\n1. **Automatic Detection:**\n   - The script looks for the first fully populated row and treats it as the header row.\n   - If detection succeeds, the sheet is converted without prompts.\n\n2. **Manual Specification:**\n   - If automatic detection fails, you'll be prompted to enter:\n     - **Header Row Number:** The row where your table headers are located (1-based index).\n     - **Columns to Include:** The range of columns to convert, e.g., `A:D` or `1-4`.\n\n**Sample Interaction:**\n\n```\nProcessing sheet: 'Sales Data' in file 'report1.xlsx'\nAutomatically detected table starting at row 2.\nMarkdown file 'report1_Sales_Data.md' for sheet 'Sales Data' has been created successfully.\n\nProcessing sheet: 'Summary' in file 'report1.xlsx'\nAutomatic table detection failed.\nEnter the header row number (1-based index): 5\nEnter the columns to include (e.g., A:D or 1-4): B:E\nMarkdown file 'report1_Summary.md' for sheet 'Summary' has been created successfully.\n```\n\n## 🧩 Contributing\n\nContributions are welcome! To contribute:\n\n1. **Fork the Repository**\n\n2. **Create a Feature Branch**\n\n   ```bash\n   git checkout -b feature/YourFeatureName\n   ```\n\n3. **Commit Your Changes**\n\n   ```bash\n   git commit -m \"Add some feature\"\n   ```\n\n4. 
**Push to the Branch**\n\n   ```bash\n   git push origin feature/YourFeatureName\n   ```\n\n5. **Open a Pull Request**\n\nPlease ensure that your contributions adhere to the existing code style and include relevant tests.\n\n## 🧪 Testing\n\nUnit tests are located in the `tests/` directory. To run the tests:\n\n```bash\npoetry run pytest\n```\n\nEnsure that you have the virtual environment activated via Poetry.\n\n## 📜 License\n\nThis project is licensed under the [GPLv3](LICENSE).\n\n## 📧 Contact\n\nFor any inquiries or support, please contact [devin.r.liu@gmail.com](mailto:devin.r.liu@gmail.com).\n\n---\n\n**Happy Converting! 🚀**\n\n---","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevin-liu%2Fexcel-to-markdown","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevin-liu%2Fexcel-to-markdown","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevin-liu%2Fexcel-to-markdown/lists"}