{"id":13582563,"url":"https://github.com/caltechlibrary/datatools","last_synced_at":"2025-04-09T10:06:03.008Z","repository":{"id":48491092,"uuid":"81131065","full_name":"caltechlibrary/datatools","owner":"caltechlibrary","description":"A set of tools for working with JSON, CSV and Excel workbooks","archived":false,"fork":false,"pushed_at":"2025-01-31T18:29:26.000Z","size":4271,"stargazers_count":78,"open_issues_count":0,"forks_count":10,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-04-02T08:51:34.873Z","etag":null,"topics":["csv","data-munging","excel-workbook","json","shell-scripting","structured-data","xlsx"],"latest_commit_sha":null,"homepage":"https://caltechlibrary.github.io/datatools","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/caltechlibrary.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":"codemeta.json"}},"created_at":"2017-02-06T20:39:26.000Z","updated_at":"2025-03-20T17:41:16.000Z","dependencies_parsed_at":"2024-07-11T02:04:57.727Z","dependency_job_id":"fe487081-316d-44dc-96e2-8d53755c6e0e","html_url":"https://github.com/caltechlibrary/datatools","commit_stats":null,"previous_names":[],"tags_count":51,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Fdatatools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Fdatatools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Fdatatools/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Fdatatools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/caltechlibrary","download_url":"https://codeload.github.com/caltechlibrary/datatools/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248018060,"owners_count":21034048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-munging","excel-workbook","json","shell-scripting","structured-data","xlsx"],"created_at":"2024-08-01T15:02:50.785Z","updated_at":"2025-04-09T10:06:02.987Z","avatar_url":"https://github.com/caltechlibrary.png","language":"Go","readme":"\ndatatools\n=========\n\n_datatools_ is a rich collection of command line programs targeting\ndata conversion, cleanup and analysis directly from your favorite\nPOSIX shell. It has proven useful for data collaborations where\nindividual members of a project may prefer different toolsets in their\nanalysis (e.g. Julia, R, Python) but want to work from a common baseline.\nIt has also been used intensively for internal reporting from various\nCaltech Library metadata sources.\n\nThe tools fall into three broad categories:\n\n- data transformation and conversion\n- shell scripting helpers\n- \"string\", a tool providing the common string operations missing from shell\n\nSee the [user manual](user-manual.md) for a complete list of the command line\nprograms. 
The data transformation tools include support for formats such as\nExcel XML, CSV, tab delimited files, JSON, YAML and TOML.\n\nCompiled versions of the datatools collection are provided for Linux\n(amd64), Mac OS X (amd64), Windows 10 (amd64) and Raspbian (ARM7).\nSee https://github.com/caltechlibrary/datatools/releases.\n\nUse the \"-help\" option for a full list of options for each utility (e.g. `csv2json -help`).\n\nData transformation\n-------------------\n\nThe tooling around transformation includes data conversion, with\ntools that work with CSV, tab delimited, JSON, TOML, YAML\nand Excel XML.\n\nThere is also tooling to change data shapes using JSON as the\nintermediate data format.\n\nFor the shell\n-------------\n\nVarious utilities for simplifying work on the command line.\n\n+ [findfile](docs/findfile/) - find files based on prefix, suffix or contained string\n+ [finddir](docs/finddir/) - find directories based on prefix, suffix or contained string\n+ [mergepath](docs/mergepath/) - prefix, append, clip path variables\n+ [range](docs/range/) - emit a range of integers (useful for numbered loops in Bash)\n+ [reldate](docs/reldate/) - display a relative date in YYYY-MM-DD format\n+ [reltime](docs/reltime/) - display a relative time in 24 hour notation, HH:MM:SS format\n+ [timefmt](docs/timefmt/) - format a time value based on Golang's time format language\n+ [urlparse](docs/urlparse/) - split a URL into parts\n\nFor strings\n-----------\n\n_datatools_ provides the [string](docs/string/) command for working with \ntext strings (limited to memory available).  This is commonly needed when \ncleaning up data for analysis. The _string_ command was created for when the \nold Unix standbys (grep, awk, sed, tr) are unwieldy or inconvenient. \n_string_ provides operations common in most languages, such as trimming, \nsplitting, and transforming letter case.  
The _string_ command also makes \nit easy to join JSON string arrays into a single string using a delimiter \nor split a string into a JSON array based on a delimiter. The form of the \ncommand is `string [OPTIONS] [ACTION] [ACTION_PARAMETERS...]`\n\n```shell\n    string toupper \"one two three\"\n```\n\nWould yield \"ONE TWO THREE\".\n\nSome of the features included:\n\n+ change case (upper, lower, title, English title)\n+ length, position and count of substrings\n+ has prefix, suffix or contains\n+ trim prefix, suffix and cutsets\n+ split and join to/from JSON string arrays\n\nSee [string](docs/string/) for full details.\n\nInstallation\n------------\n\nSee [INSTALL.md](install.html) for details on installing pre-compiled \nversions of the programs.\n\n","funding_links":[],"categories":["Data","Misc","Go","Go (134)"],"sub_categories":["Javascript \u0026 Typescript"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaltechlibrary%2Fdatatools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcaltechlibrary%2Fdatatools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaltechlibrary%2Fdatatools/lists"}