https://github.com/pcpp94/raw_etl_pipeline

A streamlined ETL solution for ingesting and processing legacy data formats with minimal resources. Includes daily and weekly .bat scripts on Task Scheduler for automated extraction, cleaning, and normalization, turning complex files into structured data effortlessly.

etl legacy proof-of-concept

**Automated ETL Pipeline for Legacy Data Formats on Windows VM**
An ETL repository for ingesting legacy data formats automatically with minimal resources. It combines extraction and cleaning tools for complex formats, including Excel parsing, secure web scraping, and custom data normalization. Its .bat scripts run daily and weekly via Task Scheduler on a Windows VM, converting legacy documents into structured, analysis-ready formats with little manual effort.
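
To make the cleaning and normalization step concrete, here is a minimal Python sketch of the kind of transformation such a pipeline might apply to a delimited legacy export. It is not taken from the repository: the function names, the `date` and `value` column names, and the list of date formats are all assumptions chosen for illustration, and the real files may need different rules.

```python
import csv
import io
import re
from datetime import datetime

# Assumed date layouts for illustration; the real legacy files may use others.
DATE_FORMATS = ("%d/%m/%Y", "%d-%b-%y", "%Y%m%d")


def normalize_header(name: str) -> str:
    """Turn a header such as ' Meter  Reading (kWh) ' into 'meter_reading_kwh'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def parse_date(text: str) -> str:
    """Return an ISO date string, trying each assumed legacy format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def parse_number(text: str) -> float:
    """Parse numbers written with thousands separators, e.g. '1,234.50'."""
    return float(text.strip().replace(",", ""))


def normalize(raw_text: str) -> list[dict]:
    """Read a delimited legacy export and return cleaned rows as dicts.

    Assumes (hypothetically) that the export has 'date' and 'value' columns
    once the headers have been normalized.
    """
    reader = csv.reader(io.StringIO(raw_text))
    headers = [normalize_header(h) for h in next(reader)]
    rows = []
    for record in reader:
        if not any(cell.strip() for cell in record):
            continue  # skip blank spacer lines common in legacy exports
        row = dict(zip(headers, record))
        row["date"] = parse_date(row["date"])
        row["value"] = parse_number(row["value"])
        rows.append(row)
    return rows
```

A scheduled .bat script could then invoke a Python entry point that reads each raw file, passes its text to `normalize`, and writes the resulting rows to a structured store. That wiring is likewise an assumption, not a description of the repository's scripts.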