{"id":22620437,"url":"https://github.com/solrikk/magicxml","last_synced_at":"2026-04-18T00:02:00.066Z","repository":{"id":234472486,"uuid":"787264458","full_name":"Solrikk/MagicXML","owner":"Solrikk","description":"MagicXML is a high-performance web application built with FastAPI that converts data between XML, CSV, Excel, JSON, PDF, and image formats. Designed for data analysts, developers, and e-commerce professionals, MagicXML handles complex structures with advanced parsing capabilities, asyncio-powered processing, and intelligent data classification.","archived":false,"fork":false,"pushed_at":"2025-08-13T13:47:54.000Z","size":6537,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-13T15:43:04.993Z","etag":null,"topics":["async","convert","converter","converter-app","csv","csv-export","data-extraction","data-processing","data-transformation","excel","fastapi-template","multilingual","open-source","python","web-application","xml","xml-parser","xml-processing"],"latest_commit_sha":null,"homepage":"https://magic-xml.replit.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Solrikk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-04-16T07:43:33.000Z","updated_at":"2025-08-13T13:47:57.000Z","dependencies_parsed_at":"2025-07-01T09:27:41.912Z","dependency_job_id":"18732ffd-e6a8-4039-b563-4d306271c070","html_url":"https://github.com/Solrikk/MagicXML","commit_stats":null,"previous_names":["solrikk/magicxml","solrikk/magicdata","solrikk/magicxml-url"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/Solrikk/MagicXML","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Solrikk%2FMagicXML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Solrikk%2FMagicXML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Solrikk%2FMagicXML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Solrikk%2FMagicXML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Solrikk","download_url":"https://codeload.github.com/Solrikk/MagicXML/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Solrikk%2FMagicXML/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31950891,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T17:29:20.459Z","status":"ssl_error","status_checked_at":"2026-04-17T17:28:47.801Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["async","convert","converter","converter-app","csv","csv-export","data-extraction","data-processing","data-transformation","excel","fastapi-template","multilingual","open-source","python","web-application","xml","xml-parser","xml-processing"],"created_at":"2024-12-08T22:13:45.192Z","updated_at":"2026-04-18T00:02:00.053Z","avatar_url":"https://github.com/Solrikk.png","language":"Python","readme":"\n\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003eMagicXML 🧙‍♂️\u003c/h1\u003e\n  \u003cp\u003e\u003cstrong\u003eAdvanced XML to CSV Conversion Tool\u003c/strong\u003e\u003c/p\u003e\n  \n  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n  [![FastAPI](https://img.shields.io/badge/FastAPI-0.95.0-009688.svg)](https://fastapi.tiangolo.com/)\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e \n  \u003ch3\u003e\n    \u003ca href=\"https://github.com/Solrikk/MagicXML/blob/main/README.md\"\u003e⭐English⭐\u003c/a\u003e | \n    \u003ca href=\"https://github.com/Solrikk/MagicXML/blob/main/README_RU.md\"\u003eRussian\u003c/a\u003e | \n    \u003ca href=\"https://github.com/Solrikk/MagicXML/blob/main/README_GE.md\"\u003eGerman\u003c/a\u003e | \n    \u003ca href=\"https://github.com/Solrikk/MagicXML/blob/main/README_JP.md\"\u003eJapanese\u003c/a\u003e | \n    \u003ca href=\"README_KR.md\"\u003eKorean\u003c/a\u003e | \n    \u003ca href=\"README_CN.md\"\u003eChinese\u003c/a\u003e \n  \u003c/h3\u003e \n\u003c/div\u003e\n\n-----------------\n\n## 🚀 Overview\n\n**MagicXML** is a high-performance web application built with FastAPI that converts data between XML, CSV, Excel, JSON, PDF, and image formats. Designed for data analysts, developers, and e-commerce professionals, MagicXML handles complex structures with advanced parsing capabilities, asyncio-powered processing, and intelligent data classification.\n\n### Supported Conversions\n\n- Convert CSV to XML\n- Convert CSV to Excel\n- Convert Excel to CSV\n- Convert JSON to CSV\n- Convert CSV to JSON\n- Convert XML to JSON\n- JPEG↔PNG image conversion\n- Convert PDF to CSV\n- Convert PDF to Excel\n- Convert PDF to JSON\n- Convert CSV to PDF\n- Convert Excel to PDF\n\n🔗 **Live Demo**: [https://magic-xml.replit.app](https://magic-xml.replit.app)\n\n## ✨ Key Features\n\n- **High-Performance Processing**: Asynchronous architecture for efficient handling of large XML files\n- **Intelligent Data Extraction**: Contextual parsing of complex nested XML structures\n- **Data Cleaning \u0026 Sanitization**: Automatic cleaning of HTML tags and special characters\n- **Multilingual Support**: Interface available in English, Russian, and more languages\n- **RESTful API**: Programmatic access for seamless integration with your systems\n- **Callback Support**: Optional webhook notifications when processing is complete\n- **Robust Error Handling**: Comprehensive error management with detailed reporting\n\n- **Versatile Format Conversions**: Convert between CSV, XML, Excel, JSON, PDF, and JPEG/PNG images\n\n## 🛠️ Technical Architecture\n\nMagicXML leverages several advanced technologies to deliver exceptional performance:\n\n- **FastAPI Backend**: High-performance asynchronous API framework\n- **Asyncio \u0026 Aiohttp**: Non-blocking I/O operations for concurrent processing\n- **XML ElementTree**: Efficient XML parsing and traversal\n- **BeautifulSoup**: Intelligent HTML content cleaning\n- **Modern Frontend**: Responsive design with custom CSS and JavaScript\n\n## 📊 Use Cases\n\n- **E-commerce Data Processing**: Convert product feeds from XML to CSV\n- **Data Analysis**: Transform XML datasets into analysis-ready CSV format\n- **System Integration**: Bridge XML-based systems with CSV-compatible tools\n- **Catalog Management**: Process large product catalogs efficiently\n- **Automated Workflows**: Integrate with data pipelines via API\n\n## 🔧 Installation \u0026 Setup\n\n### Prerequisites\n\n- Python 3.8+\n- Git\n\n### Quick Start\n\n```bash\n# Clone the repository\ngit clone https://github.com/Solrikk/MagicXML.git\ncd MagicXML\n\n# Install dependencies\npoetry install\n\n# Run the application\npoetry run uvicorn main:app --host 0.0.0.0 --port 8080 --reload\n```\n\nAlternatively, install dependencies with `pip`:\n\n```bash\npip install -r requirements.txt\n```\n\n## 🔌 API Reference\n\n### Convert XML to CSV\n\n```bash\ncurl -X 'POST' \\\n  'https://magic-xml.replit.app/process_link' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n    \"link_url\": \"https://example.com/data.xml\",\n    \"preset_id\": \"optional-tracking-id\",\n    \"return_url\": \"https://your-callback-url.com/webhook\"\n  }'\n```\n\n#### Response\n\n```json\n{\n  \"file_url\": \"https://magic-xml.replit.app/download/data_files/example_com.csv\",\n  \"preset_id\": \"optional-tracking-id\",\n  \"status\": \"completed\"\n}\n```\n\n### Check Processing Status\n\n```bash\ncurl -X 'GET' 'https://magic-xml.replit.app/status/{preset_id}'\n```\n\n### Download Generated CSV\n\n```bash\ncurl -X 'GET' 'https://magic-xml.replit.app/download/data_files/{filename}'\n```\n\n## 📝 Implementation Details\n\n### Asynchronous Processing\n\nMagicXML processes XML files asynchronously using Python's `asyncio` and `aiohttp`:\n\n```python\nasync def process_offers_chunk(offers_chunk, build_category_path, format_type):\n    offers = []\n    for offer_elem in offers_chunk:\n        offer_data = await process_offer(offer_elem, build_category_path, format_type)\n        offers.append(offer_data)\n    return {\"offers\": offers}\n```\n\nThis approach enables efficient concurrent processing, drastically reducing conversion time for large XML files.\n\n### Text Processing \u0026 Data Cleaning\n\nThe application implements sophisticated text processing to ensure data quality:\n\n```python\ndef clean_description(description):\n    if not description:\n        return ''\n    soup = BeautifulSoup(description, 'html5lib')\n    allowed_tags = ['p', 'br']\n    for tag in soup.find_all(True):\n        if tag.name not in allowed_tags:\n            tag.unwrap()\n    # Additional cleaning logic...\n    return str(soup)\n```\n\n\u003cdiv align=\"center\"\u003e\n  \u003cp\u003e© 2025 MagicXML - Advanced XML to CSV Converter\u003c/p\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/Solrikk/MagicXML\"\u003eGitHub\u003c/a\u003e •\n    \u003ca href=\"https://magic-xml.replit.app\"\u003eLive Demo\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsolrikk%2Fmagicxml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsolrikk%2Fmagicxml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsolrikk%2Fmagicxml/lists"}