https://github.com/c-3lab/dim
📦 dim: Manage the open data in your project like a package manager.
https://github.com/c-3lab/dim
cli commads command-line-tool data dataops dim gpt gpt-3 llm opendata package-manager public-data public-dataset
Last synced: 4 months ago
JSON representation
📦 dim: Manage the open data in your project like a package manager.
- Host: GitHub
- URL: https://github.com/c-3lab/dim
- Owner: c-3lab
- License: mit
- Created: 2021-08-20T06:42:41.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-01-21T12:09:03.000Z (almost 2 years ago)
- Last Synced: 2024-12-06T19:18:01.837Z (12 months ago)
- Topics: cli, commads, command-line-tool, data, dataops, dim, gpt, gpt-3, llm, opendata, package-manager, public-data, public-dataset
- Language: TypeScript
- Homepage:
- Size: 294 MB
- Stars: 127
- Watchers: 5
- Forks: 21
- Open Issues: 32
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# dim
[](https://codecov.io/gh/c-3lab/dim)
[](https://github.com/c-3lab/dim/releases/latest)
[](https://github.com/c-3lab/dim/releases/latest)
Data Installation Manager: Manage the open data in your project like a package manager.

## Join community
We are looking for members to develop together as an open source community.
[Slack](https://join.slack.com/t/c3lab-hq/shared_invite/zt-v6zz66n9-1VYkVXC4zoQViWSMdzMTLg)
# Features
- [📀 Record the source url and post-processing, etc., of downloaded open-data](#install)
- [🔧 Prepare all open data needed for the project in one command by using the `dim.json` recorded by someone else](#install-the-all-data)
- [🚀 General post-processing, such as unzip, encoding, etc., is available from the start](#specify-the-installation-post-process)
- [🔍 Search open-data from CKAN](#search)
- [🧠 Generate code to process data using **GPT-3**](#generate)
# Document
For more information about how to use it, please refer to [this document](/doc).
# Quick Start
## Install the dim
[Install the dim from binary files](#install-the-dim-from-binary-files) or
[Run the dim using Deno](#run-the-dim-using-deno)
### Install the dim from binary files
Download the dim from binary files.
[aarch64-apple-darwin](https://github.com/c-3lab/dim/releases/latest/download/aarch64-apple-darwin-dim)
```
curl -L https://github.com/c-3lab/dim/releases/latest/download/aarch64-apple-darwin-dim -o /usr/local/bin/dim
```
[x86_64-apple-darwin](https://github.com/c-3lab/dim/releases/latest/download/x86_64-apple-darwin-dim)
```
curl -L https://github.com/c-3lab/dim/releases/latest/download/x86_64-apple-darwin-dim -o /usr/local/bin/dim
```
[x86_64-pc-windows-msvc](https://github.com/c-3lab/dim/releases/latest/download/x86_64-pc-windows-msvc-dim.exe)
```
curl https://github.com/c-3lab/dim/releases/latest/download/x86_64-pc-windows-msvc-dim.exe -o C:\Users\user-name\dim.exe
```
[x86_64-unknown-linux-gnu](https://github.com/c-3lab/dim/releases/latest/download/x86_64-unknown-linux-gnu-dim)
```
curl -L https://github.com/c-3lab/dim/releases/latest/download/x86_64-unknown-linux-gnu-dim -o /usr/local/bin/dim
```
#### Grant user execution permission
```
chmod u+x /usr/local/bin/dim
```
## New Project
1. init the project
Generate `dim.json`, `dim-lock.json` and `data_files/` by init command.
```
$ dim init
```
2. Install a data
This command stores information about installed data in `dim.json` and `dim-lock.json`.
```
$ dim install https://example.com -n "example"
```
3. Installed data is saved in `data_files/`.
```
$ ls ./data_files
```
## Install all data written to dim.json shared by members
Install all data written to `dim.json` shared by members.

1. Make sure existing the dim.json in current directory
```
$ ls ./
dim.json ....
```
2. Install all data written in the dim.json
```
$ dim install
```
3. Installed data is saved in `data_files/`.
```
$ ls ./data_files
```
# Command Usage
## [Init](doc/en/init.md)
Generate `dim.json`, `dim-lock.json` and `data_files/`.
```
$ dim init
```
## [Install](doc/en/install.md)
### Install the all data.
Install all data written to `dim.json`.
```
$ dim install
```
Install from a specified local `dim.json`.
```
$ dim install -f ./path/dim.json
```
Install from a specified remote `dim.json` in internet.
```
$ dim install -f https://raw.githubusercontent.com/xxxx/xxxx/main/dim.json
```
### Install the specified data.
```
$ dim install https://example.com -n "example"
```
#### Specify headers.
```
$ dim install https://example.com -n "example" -H "Authorization: 1234567890abc" -H "Fiware-Service: example"
```
#### Specify the installation post-process
Postprocess unzip
```
$ dim install https://example.com -n "example" -p unzip
```
Postprocess encoding
```
$ dim install https://example.com -n "example" -p "encode utf-8"
```
Postprocess xlsx-to-csv
```
$ dim install https://example.com -n "example" -p xlsx-to-csv
```
Postprocess csv-to-json
```
$ dim install https://example.com -n "example" -p csv-to-json
```
Postprocess custom command
You can specify a custom command after **"cmd"**.
```
$ dim install https://example.com -n "example" -p "cmd ******"
```
The file path will be passed as an argument at the end of the specified command.
```
$ dim install https://example.com -n "example" -p "cmd python ./tests/test_custom_command.py"
```
Command to be executed during postprocessing.
```
$ python ./tests/test_custom_command.py ./data_files/***/***.xx
```
#### Install by specifying the HTML page
Install data by specifying regular expressions for links within a specified page.
```
$ dim install -P https://example.com -e ".pdf" -n "example"
```
#### Forced execution
Forced install. Overwrite already exist data file.
```
$ dim install https://example.com -n "example" -F
```
## [Uninstall](doc/en/uninstall.md)
Delete data with the specified name from `dim.json`, `dim-lock.json` and `data_files/`.
```
$ dim uninstall [name]
```
## [List](doc/en/list.md)
Display the information described in `dim-lock.json`.
```
$ dim list
```
### Simple List
```
$ dim list -s
```
## [Verify](doc/en/verify.md)
Verify the data
```
$ dim verify
```
## [Update](doc/en/update.md)
Update the all data.
```
$ dim update
```
Update the specified data.
```
$ dim update [name]
```
## Clean
Clean the project. Delete the data_files and init the project.
```
$ dim clean
```
## [Search](doc/en/search.md)
Search data from package_search CKAN API.
Use [データカタログ横断検索システム](https://search.ckan.jp/) by default to do the search.
```
$ dim search 避難所
```
Specify the number of data to get by option -n (default 10).
```
$ dim search 避難所 -n 3
```
### Interactive installation
Write data information to `dim.json` from ckan.
Store the data to `data_files`.
```
$ dim search -i "東京 避難所"
131105_東京都_目黒区_大地震時における地域避難所
- Catalog URL : https://www.geospatial.jp/ckan/dataset/131105-002
- Catalog Description: ####大地震時における地域避難所のデータです。
####東京都目黒区のオープンデータです。【リソース】大地震時における地域避難所 / ####大地震時における地域避難所のXLSXです。
【キーワード】東京都 / 目黒区 / 避難所
- Catalog License : クリエイティブ・コモンズ 表示
1. 大地震時における地域避難所
* Resource URL : https://www.geospatial.jp/ckan/dataset/1e07b569-80a5-4c31-8a7b-be88d1e8f327/resource/8d8de117-2342-4c61-a98d-8f7a9c5b71a2/download/131105evacuationspace.xlsx
* Resource Description: ####大地震時における地域避難所のXLSXです。
* Created : 2018-10-30T02:55:40.179726
* Format : XLSX
131059_東京都_文京区_緊急避難場所・避難所
- Catalog URL : https://www.geospatial.jp/ckan/dataset/131059-025
- Catalog Description: ####緊急避難場所・避難所のデータです。
####東京都文京区のオープンデータです。【リソース】緊急避難場所・避難所 / ####文京区の避難所・緊急避難場所の一覧データのCSVです。####更新日:2018年10月23日 / ####文京区の避難所・緊急避難場所の一覧データのXLSXです。
####更新日:2018年10月23日【キーワード】文京区 / 東京都 / 避難場所 / 避難所
- Catalog License : CC-BY2.1
2. 緊急避難場所・避難所
* Resource URL : https://www.geospatial.jp/ckan/dataset/b17c1f51-ce1c-4e6a-8ff9-5ff0203b1e43/resource/008d34ad-61a5-4dbd-8996-fa6d647c2986/download/kinkyuhinanbasyo-hinanjo.csv
* Resource Description: ####文京区の避難所・緊急避難場所の一覧データのCSVです。
####更新日:2018年10月23日
* Created : 2018-10-30T05:44:44.623645
* Format : CSV
3. 緊急避難場所・避難所
* Resource URL : https://www.geospatial.jp/ckan/dataset/b17c1f51-ce1c-4e6a-8ff9-5ff0203b1e43/resource/0c4942d4-a149-4091-a52f-69b7da8fa143/download/kinkyuhinanbasyo-hinanjo.xlsx
* Resource Description: ####文京区の避難所・緊急避難場所の一覧データのXLSXです。
####更新日:2018年10月23日
* Created : 2018-10-30T05:44:46.127915
* Format : XLSX
...
? Enter the number of data to install > 1
? Enter the name. Enter blank if want to use CKAN resource name. >
? Enter the post-processing you wish to add. Enter blank if not required. > xlsx-to-csv
? Is there a post-processing you would like to add next? (Y/n) > No
Convert xlsx to csv.
Installed to ./data_files/131105_東京都_目黒区_大地震時における地域避難所_大地震時における地域避難所/131105evacuationspace.xlsx
```
## [Generate](doc/en/generate.md)
Auto-generate code about target data using **GPT-3**. For example, conversion processing, visualization processing,
etc..

Export APIKey of OpenAI to `OPENAI_API_KEY`.
```
$ export OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxx
```
You can get APIKey URL: https://platform.openai.com/account/api-keys
Combine the specified target data and prompt, send it to GPT-3 API, output the code, and save it.
```
$ dim generate -t "./data.csv" "Python code that converts this csv data to geojson"
```
Specify the data name managed by dim using `-t`
```
$ dim generate -t "shelter" "Python code that converts this csv data to geojson"
```
### Example prompt List
```
Python code that converts this csv data to geojson
```
```
Python code that remove id column from this csv data
```
```
Python code that visualizes this csv data as a map
```
```
Python code that visualizes this csv data as a map
```
```
Python code that visualizes this csv data as HTML page
```
```
Python code that saves this csv data to PostgreSQL
```
```
Python code that converts full-width numbers in this csv file to half-width numbers
```
## [Help](doc/en/help.md)
```
$ dim help
```
# Run the dim using Deno
1. Install Deno
- Deno == 1.28.2
```
$ curl -fsSL https://deno.land/install.sh | sh
$ echo 'export DENO_INSTALL=~/.deno' >> ~/.bashrc
$ echo 'export PATH="$DENO_INSTALL/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc
```
2. Clone the repository
```
$ git clone https://github.com/c-3lab/dim.git
```
```
$ cd dim
```
3. Run the dim commands
```
$ deno run -A dim.ts init
$ deno run -A dim.ts install https://xxxxxx/data.json -n 'data_name'
```
4. Install dim
```
$ deno install --unstable --allow-read --allow-write --allow-run --allow-net --allow-env dim.ts
```
# Run test and display coverage
1. Run test
```
$ deno test -A --coverage=tests/coverage
```
2. Display coverage
```
$ deno coverage ./tests/coverage
```
# Upgrade the dim version
You need to be able to run Deno in a local environment.\
If you don't have an execution environment of Deno, please re-install.\
[Install the dim from binary files](#install-the-dim-from-binary-files)
```
$ dim upgrade
```
# Contributers
Made with [contributors-img](https://contributors-img.web.app).
# LICENSE
[MIT LICENSE](./LICENSE)