Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gsy0911/azfs
AzFS provides convenient Python read/write functions for Azure Storage Account.
azure azure-storage-account dataframe pandas-dataframe python
Last synced: 24 days ago
- Host: GitHub
- URL: https://github.com/gsy0911/azfs
- Owner: gsy0911
- License: mit
- Created: 2020-04-28T14:00:21.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-07-17T04:33:06.000Z (over 3 years ago)
- Last Synced: 2025-01-15T22:38:50.350Z (about 1 month ago)
- Topics: azure, azure-storage-account, dataframe, pandas-dataframe, python
- Language: Python
- Homepage: https://azfs.readthedocs.io/en/latest/?badge=latest
- Size: 532 KB
- Stars: 0
- Watchers: 2
- Forks: 2
- Open Issues: 14
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# AzFS
[![pytest](https://github.com/gsy0911/azfs/workflows/pytest/badge.svg)](https://github.com/gsy0911/azfs/actions?query=workflow%3Apytest)
[![codecov](https://codecov.io/gh/gsy0911/azfs/branch/master/graph/badge.svg)](https://codecov.io/gh/gsy0911/azfs)
[![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/gsy0911/azfs.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/gsy0911/azfs/context:python)
[![Documentation Status](https://readthedocs.org/projects/azfs/badge/?version=latest)](https://azfs.readthedocs.io/en/latest/?badge=latest)
[![PythonVersion](https://img.shields.io/badge/python-3.6|3.7|3.8-blue.svg)](https://www.python.org/downloads/release/python-377/)
[![PyPI](https://img.shields.io/pypi/v/azfs.svg)](https://pypi.org/project/azfs/)
[![Downloads](https://pepy.tech/badge/azfs)](https://pepy.tech/project/azfs)

AzFS provides convenient Python read/write functions for Azure Storage Account.
`AzFS` can

* list files in blob storage (also with the wildcard `*`),
* check whether a file exists,
* read CSV as `pd.DataFrame` and JSON as `dict` from a blob,
* write `pd.DataFrame` as CSV and `dict` as JSON to a blob.

## install

```bash
$ pip install azfs
```

## usage
For `Blob` Storage.
```python
import azfs
from azure.identity import DefaultAzureCredential
import pandas as pd

# credential is not required if your environment is on AAD (Azure Active Directory)
azc = azfs.AzFileClient()

# credential is required if your environment is not on AAD
credential = "[your storage account credential]"
# or
credential = DefaultAzureCredential()
azc = azfs.AzFileClient(credential=credential)

# connection_string is also supported
connection_string = "DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=xxxx;EndpointSuffix=core.windows.net"
azc = azfs.AzFileClient(connection_string=connection_string)

# data paths
csv_path = "https://testazfs.blob.core.windows.net/test_caontainer/test_file.csv"

# read csv as pd.DataFrame
df = azc.read_csv(csv_path, index_col=0)
# or
with azc:
    df = pd.read_csv_az(csv_path, header=None)

# write csv
azc.write_csv(path=csv_path, df=df)
# or
with azc:
    df.to_csv_az(path=csv_path, index=False)

# you can read multiple files
csv_pattern_path = "https://testazfs.blob.core.windows.net/test_caontainer/*.csv"
df = azc.read().csv(csv_pattern_path)

# to apply additional filter or another process
df = azc.read().apply(function=lambda x: x[x['id'] == 'AAA']).csv(csv_pattern_path)

# in addition, you can use multiprocessing
df = azc.read(use_mp=True).apply(function=lambda x: x[x['id'] == 'AAA']).csv(csv_pattern_path)
```

For `Queue` Storage
```python
import azfs

queue_url = "https://{storage_account}.queue.core.windows.net/{queue_name}"

azc = azfs.AzFileClient()
queue_message = azc.get(queue_url)
# message will not be deleted if `delete=False`
# queue_message = azc.get(queue_url, delete=False)

# get message content
queue_content = queue_message.get('content')
```
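Both the Blob and Queue examples above address resources by URL, in the form `https://{account}.{service}.core.windows.net/{container_or_queue}/{name}`. The sketch below shows one way such a URL could be split into its parts, and how a `*` pattern could be matched against blob names. It uses only the standard library; `StoragePath`, `split_storage_url` and `filter_by_pattern` are hypothetical names for illustration and are not part of `azfs`, whose own parsing and wildcard rules may differ.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple
from urllib.parse import urlparse


class StoragePath(NamedTuple):
    account: str    # storage account name, e.g. "testazfs"
    service: str    # service part of the host, e.g. "blob" or "queue"
    container: str  # container name (or queue name for a queue URL)
    name: str       # path inside the container ("" when absent)


def split_storage_url(url: str) -> StoragePath:
    """Split an Azure Storage URL into its parts (hypothetical helper)."""
    parsed = urlparse(url)
    # The host looks like "{account}.{service}.core.windows.net".
    account, service, *_ = parsed.netloc.split(".")
    # The first path segment is the container or queue; the rest is the name.
    container, _, name = parsed.path.lstrip("/").partition("/")
    return StoragePath(account, service, container, name)


def filter_by_pattern(pattern_url: str, blob_names: List[str]) -> List[str]:
    """Keep the blob names that match the wildcard part of pattern_url.

    Assumption: shell-style matching via fnmatch, where "*" also matches "/".
    """
    pattern = split_storage_url(pattern_url).name
    return [n for n in blob_names if fnmatchcase(n, pattern)]
```

For example, `split_storage_url(queue_url)` on a queue URL yields a `StoragePath` whose `service` is `"queue"` and whose `name` is empty, since a queue URL has no path below the queue name.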
For `Table` Storage
```python
import azfs

cons = {
    "account_name": "{storage_account_name}",
    "account_key": "{credential}",
    "database_name": "{database_name}"
}

table_client = azfs.TableStorageWrapper(**cons)
# put data, according to the keyword you put
table_client.put(id_="1", message="hello_world")

# get data
table_client.get(id_="1")
```
See the documentation for more details: [![Documentation Status](https://readthedocs.org/projects/azfs/badge/?version=latest)](https://azfs.readthedocs.io/en/latest/?badge=latest)
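The `connection_string` passed to `AzFileClient` in the Blob example is a semicolon-separated list of `Key=Value` settings. As a rough illustration of what it carries, the sketch below parses such a string and derives the blob endpoint it implies. `parse_connection_string` and `blob_endpoint` are hypothetical helpers written for this example; they are not `azfs` functions, and `azfs` itself is assumed to hand the string to the Azure SDK rather than parse it this way.

```python
from typing import Dict


def parse_connection_string(connection_string: str) -> Dict[str, str]:
    """Parse "Key=Value;Key=Value" settings into a dict (hypothetical helper)."""
    settings: Dict[str, str] = {}
    for pair in connection_string.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing ";"
        # Split on the first "=" only: base64 account keys may end with "=" padding.
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {pair!r}")
        settings[key.strip()] = value.strip()
    return settings


def blob_endpoint(settings: Dict[str, str]) -> str:
    """Build the blob endpoint URL implied by the parsed settings."""
    protocol = settings.get("DefaultEndpointsProtocol", "https")
    suffix = settings.get("EndpointSuffix", "core.windows.net")
    return f"{protocol}://{settings['AccountName']}.blob.{suffix}"
```

Splitting on the first `=` only is the one detail worth noting: a naive `pair.split("=")` would break on account keys that end in padding characters.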
### types of authorization
Supported authentication types are:

* [Azure Active Directory (AAD) token credential](https://docs.microsoft.com/azure/storage/common/storage-auth-aad),
* connection_string, like `DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=xxxx;EndpointSuffix=core.windows.net`.

### types of storage account kind
The table below shows whether `AzFS` provides read/write functions for each storage type.

| account kind | Blob | Data Lake | Queue | File | Table |
|:--|:--:|:--:|:--:|:--:|:--:|
| StorageV2 | O | O | O | X | O |
| StorageV1 | O | O | O | X | O |
| BlobStorage | O | - | - | - | - |

* O: basic functions provided
* X: not provided
* -: storage type unavailable

## dependencies
```
pandas
azure-identity >= "1.3.1"
azure-storage-blob >= "12.3.0"
azure-storage-file-datalake >= "12.0.0"
azure-storage-queue >= "12.1.1"
azure-cosmosdb-table
```

## references
* [azure-sdk-for-python/storage](https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/storage)
* [filesystem_spec](https://github.com/intake/filesystem_spec)