https://github.com/danielgp/tableau-hyper-management
Manage importing any CSV file into Tableau-Hyper format (to be used with Tableau Desktop/Server) with minimal configuration (as column detection, content type detection and reinterpretation of content are part of the included logic), with an additional script to publish to Tableau Server as well
column-detection csv detection tableau tableau-extract tableau-hyper tableau-server
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/danielgp/tableau-hyper-management
- Owner: danielgp
- License: lgpl-3.0
- Created: 2019-11-15T19:11:18.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2024-10-30T23:14:36.000Z (2 months ago)
- Last Synced: 2024-10-31T00:19:22.856Z (2 months ago)
- Topics: column-detection, csv, detection, tableau, tableau-extract, tableau-hyper, tableau-server
- Language: Python
- Homepage:
- Size: 498 KB
- Stars: 8
- Watchers: 3
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGE_LOG.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
# Tableau-Hyper-Management
[![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/danielgp/tableau-hyper-management/badges/quality-score.png?b=master)](https://scrutinizer-ci.com/g/danielgp/tableau-hyper-management/?branch=master)
[![Build Status](https://scrutinizer-ci.com/g/danielgp/tableau-hyper-management/badges/build.png?b=master)](https://scrutinizer-ci.com/g/danielgp/tableau-hyper-management/build-status/master)
[![Crowdin](https://badges.crowdin.net/tableau-hyper-management/localized.svg)](https://crowdin.com/project/tableau-hyper-management)
## What is this repository for?
Based on the [Tableau Hyper API](https://help.tableau.com/current/api/hyper_api/en-us/), this repository is intended to manage importing any CSV file into Tableau-Hyper format (to be used with Tableau Desktop/Server) with minimal configuration (as column detection, content type detection and reinterpretation of content are part of the included logic), thereby speeding up the process of building an extract.
Also, a data source publishing script takes the resulting Tableau Hyper file and publishes it to a Tableau Server. This is possible thanks to the excellent Tableau-supported [Tableau Server Client (Python)](https://github.com/tableau/server-client-python) package.
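The publishing step can be sketched roughly as below. This is a minimal illustration and not the code of `publish_data_source.py`: the helper names `resolve_publish_mode` and `publish_hyper` are hypothetical, username/password authentication is an assumption (the script itself reads a credentials file), and the `tableauserverclient` package is imported lazily so that the mode check works without it.

```python
# Publishing modes named in the usage section of this README.
PUBLISH_MODES = ("Append", "CreateNew", "Overwrite")


def resolve_publish_mode(name=None):
    """Return a validated publishing mode; Overwrite when omitted."""
    if name is None:
        return "Overwrite"
    if name not in PUBLISH_MODES:
        raise ValueError(f"unsupported publishing mode: {name!r}")
    return name


def publish_hyper(hyper_file, server_url, site, project_name,
                  username, password, mode=None):
    """Publish a .hyper file as a data source (hypothetical helper)."""
    # Lazy import: only needed when actually talking to a server.
    import tableauserverclient as TSC

    publish_mode = getattr(TSC.Server.PublishMode, resolve_publish_mode(mode))
    auth = TSC.TableauAuth(username, password, site_id=site)
    server = TSC.Server(server_url, use_server_version=True)
    with server.auth.sign_in(auth):
        # Find the target project by name, walking all result pages.
        project = next(
            (p for p in TSC.Pager(server.projects) if p.name == project_name),
            None,
        )
        if project is None:
            raise LookupError(f"project not found: {project_name!r}")
        item = TSC.DatasourceItem(project.id)
        return server.datasources.publish(item, str(hyper_file), publish_mode)
```

Calling `publish_hyper("snapshot.hyper", "https://tableau.example.com", "my-site", "Quality", user, secret)` would overwrite an existing data source of the same name; the server URL, site and project shown here are placeholders.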
> This feature allows you to automate tedious tasks that refresh data on the server side (one real-life example could be a daily/weekly snapshot of dynamically changing content, to capture big variations over time in the Development or Quality layer before reaching the Production environment).
## Who do I talk to?
Repository owner is: [Daniel Popiniuc](mailto:[email protected])
## Implemented features
- intake of data for conversion from a single or multiple CSV files based on a single input parameter (can be a specific file or contain a file pattern);
- dynamic field detection based on the content of the first line and the provided field separator (strategic advantage);
- dynamic advanced content type detection covering the following data types: integer, float-dot, date-iso8601, date-DMY-dash, date-DMY-dot, date-DMY-slash, date-MDY, date-MDY-medium, date-MDY-long, time-12, time-12-micro-sec, time-24, time-24-micro-sec, datetime-iso8601, datetime-iso8601-micro-sec, datetime-MDY, datetime-MDY-micro-sec, datetime-MDY-medium, datetime-MDY-medium-micro-sec, datetime-MDY-long, datetime-MDY-long-micro-sec, string;
- support for empty field content for any data type (this required re-interpreting the CSV so that it is accepted by the Hyper Inserter and data types INT or DOUBLE are still considered);
- use of the pandas package to benefit from the speed and flexibility of DataFrames;
- log file capturing the entire logic details (very useful for both traceability and debugging);
- most of the logic actions are now timed for performance measuring, so you can better plan your needs;
- publishing a Tableau Extract (Hyper format) to a Tableau Server (specifying Site and Project);
- detection of the operating system's current region language, with all feedback details logged in that language.
## Combinations of file types supported
| File Type/Format: Input (down), Output (right) | Comma Separated Values | Excel | JSON | Parquet | Pickle | Tableau Extract (Hyper) |
|:-------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|
| Comma Separated Values | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Excel | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :no_entry: |
| JSON | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :no_entry: |
| Parquet | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Pickle | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Tableau Extract (Hyper) | :heavy_check_mark: | :no_entry: | :no_entry: | :no_entry: | :heavy_check_mark: | :soon: |
## Installation
Installation can be completed in a few steps as follows:
* Ensure you have git available on your system:
```
$ git --version
```
> If you get an error, depending on your system, you need to install it.
>> For Windows, you can do so from [Git for Windows](https://github.com/git-for-windows/git/releases/);
* Download this project from GitHub:
```
$ git clone https://github.com/danielgp/tableau-hyper-management
```
> conventions used:
>> = variables to be replaced with user values relevant strings
* Create a Python virtual environment using the following command executed from the project root folder:
```
$ python(.exe) -m venv /virtual_environment/
```
* Upgrade pip (the package manager for Python packages) using the following command executed from the Scripts sub-folder of the newly created virtual environment:
```
$ /virtual_environment/Scripts/python(.exe) -m pip install --upgrade pip
```
* Install the project prerequisites using the following command executed from the project root folder:
```
$ /virtual_environment/Scripts/pip install -r requirements.txt
```
* Ensure all localization source files are compiled, as the package needs them in order to work properly:
```
$ /virtual_environment/Scripts/python(.exe) /sources/localizations_compile.py
```
## Maintaining local package up-to-date
Once the package is installed, it is important to keep up with the latest releases, as they bring important code improvements and address potential security issues. This can be achieved with the following command:
```
$ git --work-tree= --git-dir=/.git/ --no-pager pull origin master
```
- conventions used:
    - = variables to be replaced with user values relevant strings
## Usage
### Converting CSV file into Tableau Extract (Hyper format)
```
$ /virtual_environment/Scripts/python(.exe) /tableau_hyper_management/converter.py --input-file --input-file-format csv|excel|json|pickle --input-file-compression infer|bz2|gzip|xz|zip --csv-field-separator ,|; --output-file (.hyper) --output-file-format csv|excel|hyper|json|pickle --output-file-compression infer|bz2|gzip|xz|zip (--output-log-file ) (--unique-values-to-analyze-limit 100|200=default_value_if_omitted|500|1000)
```
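To give a feel for what `--csv-field-separator` and `--unique-values-to-analyze-limit` control, here is a minimal, standard-library-only sketch of header and content type detection. It is an illustration under assumptions, not the converter's actual logic: the function names are hypothetical, only four of the supported type names are covered, and it assumes that the limit caps how many distinct values per column are inspected.

```python
import csv
import io
import re
from datetime import datetime

INTEGER = re.compile(r"[+-]?\d+")
FLOAT_DOT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def is_iso_date(value):
    """True when the value parses as a YYYY-MM-DD date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


# Ordered from narrowest to widest; the first check that every value passes wins.
CHECKS = (
    ("integer", lambda v: INTEGER.fullmatch(v) is not None),
    ("float-dot", lambda v: FLOAT_DOT.fullmatch(v) is not None),
    ("date-iso8601", is_iso_date),
)


def detect_type(values):
    """Return a type name fitting all non-empty values, else 'string'."""
    non_empty = [v for v in values if v != ""]  # empty fields fit any type
    if not non_empty:
        return "string"
    for name, check in CHECKS:
        if all(check(v) for v in non_empty):
            return name
    return "string"


def detect_columns(csv_text, separator=",", unique_limit=200):
    """Map each header field to a detected type (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=separator)
    header = next(reader)  # field names come from the first line
    samples = [set() for _ in header]
    for row in reader:
        for seen, value in zip(samples, row):
            if len(seen) < unique_limit:  # cap the distinct values analysed
                seen.add(value)
    return {name: detect_type(seen) for name, seen in zip(header, samples)}


sample = "id;price;day;note\n1;2.5;2024-01-31;ok\n2;;2024-02-01;\n"
detected = detect_columns(sample, separator=";")
```

Here `detected` maps `id` to `integer`, `price` to `float-dot`, `day` to `date-iso8601` and `note` to `string`; the empty `price` in the second row does not demote the column, mirroring the empty-field support listed among the implemented features.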
- conventions used:
    - (content_within_round_parenthesis) = optional
    - = variables to be replaced with user values relevant strings
    - single vertical pipe (|) = separator for alternative options
### Publishing a Tableau Extract (Hyper format) to a Tableau Server
```
$ /virtual_environment/Scripts/python(.exe) /tableau_hyper_management/publish_data_source.py --input-file (.hyper) --tableau-server --tableau-site --tableau-project --publishing-mode Append|CreateNew|Overwrite==default_if_omitted --input-credentials-file %credentials_file% (--output-log-file )
```
- conventions used:
    - (content_within_round_parenthesis) = optional
    - = variables to be replaced with user values relevant strings
    - single vertical pipe (|) = separator for alternative options
## Change Log / Releases detailed
see [CHANGE_LOG.md](CHANGE_LOG.md)
## Planned features to add (of course, when time permits; help would be appreciated, votes and feedback are welcome)
- additional formats to be recognized, like:
    - float-USA-thousand-separator,
    - float-EU,
    - float-EU-thousand-separator;
- geographical identifiers (Country, US - Zip Codes)
## Features to request template
Use [feature_request.md](.github/ISSUE_TEMPLATE/feature_request.md)
## Required software/drivers/configurations
see [readme_software.md](readme_software.md)
## Used references
see [readme_reference.md](readme_reference.md)