{"id":13553566,"url":"https://github.com/datopian/data-cli","last_synced_at":"2025-07-02T11:06:14.942Z","repository":{"id":40295446,"uuid":"92135715","full_name":"datopian/data-cli","owner":"datopian","description":"data - command line tool for working with data, Data Packages and the DataHub","archived":false,"fork":false,"pushed_at":"2022-12-07T09:48:56.000Z","size":790,"stargazers_count":64,"open_issues_count":23,"forks_count":7,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-05-28T07:22:48.008Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://datahub.io/docs/features/data-cli","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datopian.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-23T06:08:07.000Z","updated_at":"2024-08-17T22:14:46.000Z","dependencies_parsed_at":"2023-01-24T07:01:21.930Z","dependency_job_id":null,"html_url":"https://github.com/datopian/data-cli","commit_stats":null,"previous_names":["datahq/data-cli","datopian/datahub-cli"],"tags_count":53,"template":false,"template_full_name":null,"purl":"pkg:github/datopian/data-cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datopian%2Fdata-cli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datopian%2Fdata-cli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datopian%2Fdata-cli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datopian%2Fdata-cli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datopian","download_url":"https://codeload.github.co
m/datopian/data-cli/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datopian%2Fdata-cli/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260183347,"owners_count":22971204,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T12:02:28.444Z","updated_at":"2025-07-02T11:06:14.873Z","avatar_url":"https://github.com/datopian.png","language":"JavaScript","readme":"## Overview\n\n**\"Data-cli\"** is an important part of the [DataHub](https://datahub.io/docs/about) project. This is a command line tool, that helps you to manipulate your data (as `git` manipulates the code).\n\nFor example you have a set of data as a result of your work, let it be few data-files and a description. And you want to share it with your colleagues. With the **\"data-cli\"** you just need to:\n\n```shell\ncd data-folder\ndata init  # convert my data files into the data-package\n\u003e \"Answer a few questions here, e.g. dataset name, files to include, etc\"\ndata push  # upload the dataset onto a DataHub\n\u003e \"As a result you'll got a link to share:\nhttp://datahub.io/user-name/data-package-name\n```\n\nThat's it! Your data is online. You can make your data unlisted or private, add some pretty graphics, and many more. 
Please read http://datahub.io/docs for details.\n\nWith `data-cli` you can also:\n\n* Get data from online sources\n* Get info about data files and datasets (local and remote)\n* Validate your data to ensure its quality\n* Initialize a new dataset (as a Data Package)\n\n## Usage examples:\n\nHere are usage examples for common `data` commands. To see the full documentation for a command, click on the command name or go to the [help pages](https://github.com/datahq/data-cli/tree/master/docs).\n\n### data login\n\nYou should log in the first time you use data-cli:\n```bash\n$ data login\n? Login with... Github\n\u003e Opening browser and waiting for you to authenticate online\n\u003e You are logged in!\n```\n\n### [data push](https://github.com/datahq/data-cli/blob/master/docs/push.md)\n\nUpload a dataset or a single file to the DataHub:\n```bash\n$ data push mydata.csv\n? Please, confirm name for this dataset:\n0-selfish-cougar-7 mydataset\n? Please, confirm title for this dataset:\nMydataset Mydataset\n  Uploading [******************************] 100% (0.0s left)\n  your data is published!\n🔗  https://datahub.io/myname/mydataset/v/1 (copied to clipboard)\n```\n\nAlternatively, you can set the name non-interactively:\n```bash\n$ data push mydata.csv --name=mydataset\n  Uploading [******************************] 100% (0.0s left)\n  your data is published!\n🔗  https://datahub.io/myname/mydataset/v/1 (copied to clipboard)\n```\n\n**Note:** by default, the findability flag for your dataset is set to `--public`. 
Use `--unlisted` flag if you want it to not appear in the search results.\n\n### [data get](https://github.com/datahq/data-cli/blob/master/docs/get.md)\n\nGet a dataset from the DataHub or GitHub:\n```bash\n$ data get http://datahub.io/core/gold-prices\nTime elapsed: 1.72 s\nDataset/file is saved in \"core/gold-prices\"\n```\n\n### [data info](https://github.com/datahq/data-cli/blob/master/docs/info.md)\n\nShows info about the dataset (local or remote):\n```bash\n$ data info http://datahub.io/core/gold-prices\n# Gold Prices (Monthly in USD)\n\nMonthly gold prices since 1950 in USD (London market). Data is sourced from the Bundesbank.\n\n## Data\n    * [Bundesbank statistic ... [see more below]\n\n## RESOURCES\n┌───────────────────┬────────┬───────┬───────┐\n│ Name              │ Format │ Size  │ Title │\n├───────────────────┼────────┼───────┼───────┤\n│ data_csv          │ csv    │ 16172 │       │\n├───────────────────┼────────┼───────┼───────┤\n│ data_json         │ json   │ 32956 │       │\n├───────────────────┼────────┼───────┼───────┤\n│ gold-prices_zip   │ zip    │ 17755 │       │\n├───────────────────┼────────┼───────┼───────┤\n│ data              │ csv    │ 16170 │       │\n└───────────────────┴────────┴───────┴───────┘\n\n## README\nMonthly gold prices since 1950 in USD (London market). 
Data is sourced from the Bundesbank.\n...\n\n### Licence\n...\n```\n\n### [data cat](https://github.com/datahq/data-cli/blob/master/docs/cat.md)\n\nWorks similar as Unix `cat` command but works with remote resources and can convert tabular data into different formats:\n```bash\n$ data cat http://datahub.io/core/gold-prices/r/0.csv\n┌──────────────────────────────────────┬──────────────────────────────────────┐\n│ date                                 │ price                                │\n├──────────────────────────────────────┼──────────────────────────────────────┤\n│ 1950-02-01                           │ 34.730                               │\n├──────────────────────────────────────┼──────────────────────────────────────┤\n│ 1950-03-01                           │ 34.730                               │\n\n...........\n```\nYou can also convert tabular data into different formats (the source could be remote as well):\n```bash\n$ data cat prices.csv prices.md\n\u003e All done! Your data is saved in \"prices.md\"\nuser@pc:~/Downloads$ cat prices.md\n| date       | price    |\n| ---------- | -------- |\n| 1950-02-01 | 34.730   |\n| 1950-03-01 | 34.730   |\n```\n\n### [data init](https://github.com/datahq/data-cli/blob/master/docs/init.md)\n\nData-cli has an `init` command that will automatically generate Data Package metadata including scanning the current directory for data files and inferring [table schema] for tabular files:\n```bash\n$ data init\nThis process initializes a new datapackage.json file.\nOnce there is a datapackage.json file, you can still run `data init`\nto update/extend it.\nPress ^C at any time to quit.\n\n? Enter Data Package name prices\n? Enter Data Package title prices\n? Do you want to add following file as a resource \"prices.csv\" - y/n? y\nprices.csv is just added to resources\n? Do you want to add following file as a resource \"prices.xls\" - y/n? y\nprices.xls is just added to resources\n\n? 
Going to write to /home/user/Downloads/datapackage.json:\n{\n  \"name\": \"prices\",\n  \"title\": \"prices\",\n  \"resources\": [\n    {\n      \"path\": \"prices.csv\",\n      \"name\": \"prices\",\n      \"format\": \"csv\",\n....\n    },\n      \"schema\": {\n        \"fields\": [\n          {\n            \"name\": \"date\",\n            \"type\": \"date\",\n            \"format\": \"default\"\n          },\n          {\n........\n    {\n      \"path\": \"prices.xls\",\n      \"pathType\": \"local\",\n      \"name\": \"prices\",\n      \"format\": \"xls\",\n      \"mediatype\": \"application/vnd.ms-excel\",\n      \"encoding\": \"windows-1250\"\n    }\n  ]\n}\n\n\nIs that OK - y/n? y\ndatapackage.json file is saved in /home/user/Downloads/datapackage.json\n```\n\n### [data validate](https://github.com/datahq/data-cli/blob/master/docs/validate.md)\n\n```bash\n$ data validate path/to/correct/datapackage\n\u003e Your Data Package is valid!\n```\n```bash\n$ data validate path/to/invalid-data\n\u003e Error! Validation has failed for \"missing-column\"\n\u003e Error! 
The column header names do not match the field names in the schema on line 2\n\n```\n\n### data help\n\nYou can also run the \"help\" command in your terminal to see the command docs:\n```shell\n$ data help\n'General description'\n$ data help push\n\u003e 'push command description'\n\n# data help get\n# data help init\n# etc ...\n```\n\n## Installation\n\n```\nnpm install data-cli --global\n```\nAfter installation, you can run `data-cli` under the name `data`:\n```\ndata --version\n\u003e 0.8.9\n```\n\nIf you're not using npm, you can install the `data-cli` binaries by following [these instructions](https://datahub.io/docs/getting-started/installing-data#installing-binaries).\n\n# For developers\n\n[![Build Status](https://travis-ci.org/datahq/data-cli.svg?branch=master)](https://travis-ci.org/datahq/data-cli)\n[![XO code style](https://img.shields.io/badge/code_style-XO-5ed9c7.svg)](https://github.com/sindresorhus/xo)\n[![Issues](https://img.shields.io/badge/issue-tracker-orange.svg)](https://github.com/datahq/data-cli/issues)\n\n## Configuration\n\nConfiguration is stored in `~/.config/datahub/config.json`. In general, you should not need to edit this by hand. 
You can also override any variable in it with an environment variable or a command-line option of the same name, e.g.:\n\n```\n$ data login --api https://api-testing.datahub.io\n```\n\nNB: you can set a custom location for the `config.json` config file using the `DATAHUB_JSON` environment variable, e.g.:\n\n```\nexport DATAHUB_JSON=~/.config/datahub/my-special-config.json\n```\n\n## Environment\n\n*You need to have Node.js version \u003e7.6*\n\n**NOTE:** if you're a developer, you need to set the `datahub=dev` environment variable so your usage of the CLI isn't tracked in the analytics.\n\nIt is recommended that you set this up permanently, e.g., macOS users can add the following lines to their `~/.bash_profile`:\n\n```bash\n# The next line sets 'datahub' env var so data-cli doesn't send tracking data to Analytics\nexport datahub=dev\n```\n\nand then restart your terminal.\n\n## Install\n\n```\n$ npm install\n```\n\n## Running tests\n\nWe use Ava for our tests. To run the tests, use:\n\n```\n$ [sudo] npm test\n```\n\nTo run tests in watch mode:\n\n```\n$ [sudo] npm run watch:test\n```\n\nWe also have tests for the `push` command that publish some test datasets to DataHub. While Travis runs all tests on every commit, the `push` tests are run only on tagged commits. 
To run these tests locally, you need credentials for the 'test' user; then use the following command:\n\n```\n$ [sudo] npm test test/push/push.test.js\n```\n\n## Lint\n\nWe use XO to check our code against JS standards, conventions, and style:\n\n```bash\n# Running the tests runs lint first:\n$ npm test\n\n# To run lint separately:\n$ npm run lint # shows errors only\n\n# To fix errors automatically:\n$ xo --fix\n```\n","funding_links":[],"categories":["JavaScript","others"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatopian%2Fdata-cli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatopian%2Fdata-cli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatopian%2Fdata-cli/lists"}