https://github.com/nitor-infotech-oss/large-datafile-process-spark
- Host: GitHub
- URL: https://github.com/nitor-infotech-oss/large-datafile-process-spark
- Owner: nitor-infotech-oss
- Created: 2023-06-16T04:36:38.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-14T09:55:23.000Z (about 2 years ago)
- Last Synced: 2025-03-27T04:16:40.522Z (7 months ago)
- Language: Python
- Size: 506 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Handle Multiple-Header Files with Azure Key Vault Integration
#### Overview
This script downloads a file from Azure Blob Storage, authenticating with credentials retrieved from Azure Key Vault, and splits the file's data into multiple CSV files based on their headers.
#### Prerequisites
- An Azure Blob Storage account
- An Azure Key Vault holding the storage credentials as secrets
#### Instructions
1. Set up the environment configuration for the Azure Blob Storage account and the Key Vault credentials.
2. Install the Azure client libraries for Python (Blob Storage, Key Vault secrets, and identity).
3. Use the Blob Storage client to download the file, authenticating with the credentials retrieved from Key Vault (see the sketch after this list).
4. Read the downloaded file and split its rows into groups, one per header.
5. Upload the separated CSV files back to Azure Blob Storage.
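The download step might look like the minimal sketch below, after installing the client libraries (`pip install azure-identity azure-keyvault-secrets azure-storage-blob`). It assumes the storage connection string is stored in Key Vault under a secret; the environment variable and secret names here are illustrative, not taken from the repository:

```python
import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobServiceClient

# Illustrative environment variables; the repository may use different names.
VAULT_URL = os.environ["KEY_VAULT_URL"]  # e.g. https://<vault-name>.vault.azure.net
SECRET_NAME = os.environ.get("SECRET_NAME", "storage-connection-string")
CONTAINER = os.environ["BLOB_CONTAINER"]
BLOB_NAME = os.environ["BLOB_NAME"]

# Fetch the storage connection string from Key Vault.
credential = DefaultAzureCredential()
secret_client = SecretClient(vault_url=VAULT_URL, credential=credential)
conn_str = secret_client.get_secret(SECRET_NAME).value

# Download the blob to a local file using the retrieved connection string.
blob_service = BlobServiceClient.from_connection_string(conn_str)
blob_client = blob_service.get_blob_client(container=CONTAINER, blob=BLOB_NAME)
with open("input.csv", "wb") as fh:
    fh.write(blob_client.download_blob().readall())
```

`DefaultAzureCredential` resolves credentials from the environment (service principal variables, managed identity, or a local Azure CLI login), so the same code can run locally and in Azure without changes.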
#### Usage
1. Set the environment configuration variables for Azure Blob Storage and the Key Vault credentials.
2. Run the script to download the file and separate its data into multiple CSV files (a splitting sketch follows this list).
3. Upload the separated CSV files back to Azure Blob Storage.
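The splitting and re-upload steps might look like the sketch below, which reuses the `blob_service` client and `CONTAINER` variable from the download sketch above. The header-detection heuristic (a row with no purely numeric fields starts a new section) is an assumption for illustration; the repository does not document its actual rule:

```python
import csv

def split_by_header(path):
    """Group rows under each header row. A header is assumed to be a
    non-empty row containing no purely numeric fields (illustrative heuristic)."""
    sections, current = [], None
    with open(path, newline="") as fh:
        for row in csv.reader(fh):
            if row and not any(cell.replace(".", "", 1).isdigit() for cell in row):
                current = {"header": row, "rows": []}
                sections.append(current)
            elif current is not None:
                current["rows"].append(row)
    return sections

# Write each section to its own CSV and upload it back to Blob Storage.
# `blob_service` and `CONTAINER` are reused from the download sketch above.
for i, section in enumerate(split_by_header("input.csv")):
    out_name = f"part_{i}.csv"
    with open(out_name, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(section["header"])
        writer.writerows(section["rows"])
    with open(out_name, "rb") as fh:
        blob_service.get_blob_client(container=CONTAINER, blob=out_name).upload_blob(
            fh, overwrite=True
        )
```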