{"id":18265131,"url":"https://github.com/nasa-ammos/datadrive-commandline","last_synced_at":"2025-04-09T01:42:41.040Z","repository":{"id":224624777,"uuid":"727859809","full_name":"NASA-AMMOS/DataDrive-CommandLine","owner":"NASA-AMMOS","description":"DataDrive-CommandLine","archived":false,"fork":false,"pushed_at":"2025-03-24T21:00:50.000Z","size":269,"stargazers_count":0,"open_issues_count":2,"forks_count":1,"subscribers_count":7,"default_branch":"develop","last_synced_at":"2025-03-24T22:19:43.796Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NASA-AMMOS.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-05T18:13:55.000Z","updated_at":"2025-03-24T21:00:43.000Z","dependencies_parsed_at":"2025-03-24T22:19:41.339Z","dependency_job_id":"f287395c-0021-4994-b1c3-cf093db36618","html_url":"https://github.com/NASA-AMMOS/DataDrive-CommandLine","commit_stats":null,"previous_names":["nasa-ammos/datadrive-commandline"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NASA-AMMOS%2FDataDrive-CommandLine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NASA-AMMOS%2FDataDrive-CommandLine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NASA-AMMOS%2FDataDrive-CommandLine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NASA-AMMOS%2FDataDri
ve-CommandLine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NASA-AMMOS","download_url":"https://codeload.github.com/NASA-AMMOS/DataDrive-CommandLine/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247958710,"owners_count":21024821,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T11:17:12.852Z","updated_at":"2025-04-09T01:42:41.020Z","avatar_url":"https://github.com/NASA-AMMOS.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DataDrive-CommandLine\n\nDataDrive Command Line client\n\nThis project provides command-line applications to interface with Datadrive/OCS. The DataDrive command line supplements tools within the OCS suite by providing a way of listening to OCS events without having to interface with AWS. It is compatible with CSSO or CAM authentication systems (for OCS and AOCS respectively).\n\n## Installation\n\n### From Github Release\n\nThis is the typical way to install the released software.\n\nPlease visit https://github.com/NASA-AMMOS/DataDrive-CommandLine/releases to get the latest and past releases. 
Each zip contains binaries for Linux, Windows and OSX.\n\n### From Source\n\nIf you are a developer or want to make changes, or try an unreleased development version of the software, you will need to install from the source code.\n\n#### Prerequisites\n\n-   NodeJS: tested with v14.15.0, versions newer should also work.\n\n#### Building with Docker (recommended)\n\n-   Install Docker if you do not have it (https://www.docker.com/). This allows you to run apps in a sandbox and avoids a variety of configuration and compatibility issues that can arise from building software directly\n-   Run Docker\n-   Clone this repository and go to `[cloned_folder]`\n-   Run `make build`\n    -   Note: Make sure you have artifactory set as another npm registry by creating or updating `~/.npmrc` file with the following content added or appended.\n        -   `@gov.nasa.jpl.m2020.cs3:registry=https://cae-artifactory.jpl.nasa.gov:443/artifactory/api/npm/npm-release-local/`\n        -   `@gov.nasa.jpl.ammos.ids:registry=https://artifactory.jpl.nasa.gov/artifactory/api/npm/npm-develop-local/`\n    -   Please reference \u003chttps://www.jfrog.com/confluence/display/RTF/Npm+Registry\u003e for a more detailed read up.\n-   A ZIP of built binaries will be located in `[cloned_folder]/dist`. 
Unzip that file and you will get folders for each supported platform.\n\n#### Building with NPM\n\n-   Clone this repository and go to `[cloned_folder]`.\n-   Run `make package`\n    -   Note: Make sure Artifactory is configured as an additional npm registry by creating or updating your `~/.npmrc` file so that it contains the following lines.\n        -   `@gov.nasa.jpl.m2020.cs3:registry=https://cae-artifactory.jpl.nasa.gov:443/artifactory/api/npm/npm-release-local/`\n        -   `@gov.nasa.jpl.ammos.ids:registry=https://artifactory.jpl.nasa.gov/artifactory/api/npm/npm-develop-local/`\n    -   See \u003chttps://www.jfrog.com/confluence/display/RTF/Npm+Registry\u003e for more details.\n-   The built binary will be located in `[cloned_folder]/deployment/dist`\n\n## Getting Started\n\n### Getting an authentication token (Logging in)\n\n-   For an OCS (CSSO) environment (such as M2020), download [credss](https://github.jpl.nasa.gov/CS3/credss) from the mission-support area, or one of the [latest releases](https://github.jpl.nasa.gov/CS3/credss/releases), and follow the instructions.\n-   For an AOCS (CAM) environment, `cd` into the [etc](./etc) directory of the cloned repository, and run [request-ssotoken.sh](./etc/request-ssotoken.sh)\n    -   Example: `./request-ssotoken.sh -login_method=LDAP_CLI -access_manager_uri=https://cam.jpl.nasa.gov/cam` where `access_manager_uri` may be different\n    -   Run `request-ssotoken.sh` with no arguments to see the help.\n\n### Configure CLI\n\nTo configure the CLI, use the `ddrv config` command:\n\n-   `ddrv config --help` will show the help screen and available options\n-   The basic required configuration is the Datadrive Middleware server and the PEP server, e.g.: `./ddrv config -d [datadrive_middleware_hostname] -p [pep_hostname]`\n-   Other typical options include a custom log path and the time interval to roll the log files: `./ddrv config --dd-host datadrive-middle-dev.dev.m20.jpl.nasa.gov --pep-host 
data.dev.m20.jpl.nasa.gov --logdir log_output_folder --log-date-pattern daily`\n    \n    The command above will create a configuration JSON file in `~/.datadrive/datadrive.json`. Note: You should only need to run this once unless `~/.datadrive/datadrive.json` file is deleted or you are using the ddrv CLI with multiple environments.\n-   Your mission-support team will typically give you the server names, or potentially even a\n    configuration file\n-   Check your configuration by running `ddrv config` with no arguments\n\n### Basic Subscriptions\n\nTo listen for new files that have been added in a specific package and download the file to a specified folder.\n\n-   Ex:\n    -   `./ddrv subscribe -p [ocs_package_name] -o [output_directory_path_here] -r`\n        -   Note: `-r` flag is to retain the ocs path\n        -   Note: `--help` flag displays help information\n    -   Note: Underneath the covers, this CLI uses the web socket API from the DataDrive Middleware, which itself polls a SQS queue that receives update events from OCS's SNS.\n\n### Subscriptions with Wildcard filtering\n\nSimilar to **Basic Subscriptions** above, but only download files that match a wildcard filter.\n\n-   Ex: `./ddrv subscribe -p [ocs_package_name] -o [output_directory_path_here] -r -f [wildcard_filter]`\n-   `-f` flag should be followed by a wildcard filter. For example, `nick*.txt` will match `nick_09-24-2019.txt` and not `nick_09-24-2019.gif`.\n-   **Basic Subscriptions** with no `-f` flag are essentially running the `-f` flag with `*` as its value.\n\n### Subscriptions with Regex filtering\n\nSimilar to **Basic Subscriptions** above, but filter for only files that match a regex filter.\n\n-   Ex: `./ddrv subscribe -p [ocs_package_name] -o [output_directory_path_here] -r -x [regex_filter]`\n-   `-x` flag should be followed by a regex filter. 
For example, `nick.*\\.txt` will match `nick_09-24-2019.txt` and not `nick_09-24-2019.gif`.\n-   The regex filter uses Elasticsearch's regex syntax and does not accept the anchors `^` (beginning) or `$` (end). Values put into the filter are always anchored, so the regex filter provided must match the entire string. For more information on the regular expression syntax, please visit \u003chttps://www.elastic.co/guide/en/elasticsearch/reference/6.4/query-dsl-regexp-query.html#regexp-syntax\u003e.\n-   To match against a path, the regex filter must contain a \"/\"; otherwise, the expression is matched against the file basename.\n\n### Skip Unchanged\n\n`-S` or `--skip-unchanged`: Only download files that are new or have changed.\n\nThis option uses the OCS index event field `s3_object_changed`, which indicates whether the associated file is considered changed in S3.\nIf a user specifies the skip-unchanged flag, then for each OCS file event, the CLI will:\n\n-   check whether the OCS event indicates that the file is unchanged;\n-   check whether the local filesystem already contains a file with the same name in the output location.\n-   If the file is unchanged in S3 and the local file exists, the OCS file event is skipped.\n\nNote: Skipping only applies to OCS events. When performing Playback, all files are assumed to be changed, even if they actually weren't.\n\n### Subscriptions with Saved Search\n\nInstead of specifying an `ocs_package_name` with the `-p` flag, you can use the `-s` flag to subscribe to a saved search created in the DataDrive UI. The saved search will include a package and some filters.\n\n-   Ex: `./ddrv subscribe -s [saved_search_name] -o [output_directory_path_here] -r`\n-   Note: The example command uses the `-r` flag, which retains the OCS path.\n\n### Playback events\n\nYou can \"play back\" events that have occurred since a certain date. 
When \"playback\" is enabled, the CLI will query OCS for documents (that have not been deleted) since the last saved checkpoint date.\n\n-   `-P` flag is used to denote the use of the playback option.\n-   The checkpoint date is saved in `YYYY-MM-DDTHH:mm:ss.SSSZZ` format in the `\u003coutput_dir\u003e/.datadrive/checkpoint.txt` file in the specified output folder.\n\n### Configuring Logging\n\nBy default, all log entries are stored in a single file in the configured log directory (`combined.log`).\n\nTo utilize rolling logfiles, use the `--log-date-pattern` option with `ddrv config`. Simple options are `monthly`, `weekly`, and `daily`. You can also specify a custom log rotation value using this format: (https://momentjs.com/docs/#/parsing/string-format/). This format will define the frequency of the log rotation, with the logs rotating on the smallest date/time increment specified in the format.\n\nFor example, `ddrv config --log-date-pattern YYYYMMDDhhmm` will create a new log file every minute, named something like `combined-202406131233.log`.\n\nBy default, logs are archived in a gzip file. To disable this functionality, set the `--no_gzip_rolling_logs` option when running `ddrv config`.\n\nThe `--log-date-pattern`, `--no_gzip_rolling_logs`, and `--logdir` options can also be used with the `ddrv subscribe` command to determine the logging options for that particular subscription session.\n\n### Subscriptions that auto-execute a command on each downloaded file\n\nYou can run a single shell command that will be called for every file downloaded. 
You can use this to automatically run post-processing, for example.\n\n-   Ex: `./ddrv subscribe -p [ocs_package_name] -o [output_directory_path_here] -r -x [regex_filter] --callback 'command_to_execute'`\n-   `--callback` flag's value should be in single quotes\n-   `--callback` flag's value is a shell command and can use these special variables: `$FILENAME` `$SRC_PATH` `$DEST_PATH`\n-   For example: `ddrv subscribe -if --retain-path -p sample-package -o output_folder -f 'filter-string*' -P --callback 'echo $FILENAME $SRC_PATH $DEST_PATH'`\n\n\n## Unit Test\n\nTo run unit tests, go to the `src/` folder and run the command `npm test`. Note: To avoid confusion, the commands below are for manual testing without packaging files into a single binary.\n\n## Manual Tests\n\nPlease note, these test cases are for **developers** performing manual tests from the source code. If you are running these tests with the published binary, start the commands below with `ddrv` instead of `node ddrv.js`.\n\n### Test 1 - Subscriptions Only\n\n#### Steps\n\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1`\n-   Upload a file into the specified package (\"DataDrive-Dev-Congruent-1\" in the example above).\n\n#### Expected Results\n\n-   The file should be downloaded into the specified folder. 
In the example above, this is the `tmp` folder.\n-   A date should be written into the `\u003coutput_dir\u003e/.datadrive/checkpoint.txt` file under the specified folder.\n    -   Ex: `2019-09-08T11:51:28.000-0700`\n\n### Test 2 - Subscriptions and Playback Events\n\n#### Steps\n\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o /tmp -p DataDrive-Dev-Congruent-1 -P` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1 -P`\n\n#### Expected Results\n\n-   Files uploaded since the date listed in `\u003coutput_dir\u003e/.datadrive/checkpoint.txt` will be downloaded into the specified folder.\n-   The CLI will now be listening for new files uploaded into the specified package.\n-   See **Test 1 - Subscriptions Only** for the expected results for file subscriptions.\n\n### Test 3 - Playback Events on Deleted File\n\n#### Steps\n\n-   Upload files into the specified package, then delete these files.\n-   Make sure the `\u003coutput_dir\u003e/.datadrive/checkpoint.txt` file has a date/time that is earlier than the date/time of the files uploaded above.\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1 -P` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1 -P`\n\n#### Expected Results\n\n-   Files uploaded above will **NOT** be downloaded by the CLI.\n\n### Test 4 - Adjusting checkpoint.txt\n\n#### Steps\n\n-   Upload files into the specified package.\n-   Open the `\u003coutput_dir\u003e/.datadrive/checkpoint.txt` file and change the date to a time earlier than when the files were uploaded.\n    -   Note: The time format must be in a valid format such as... 
Ex: `2019-09-08T11:51:28.000-0700`.\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1 -P` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1 -P`\n\n#### Expected Results\n\n-   Files since given date will be downloaded.\n\n### Test 5 - Same file name\n\n#### Steps\n\n-   CLI tries to download a file with the same file name from the middleware (even if the contents are different)\n\n#### Expected Results\n\n-   An error will be given as the file already exists.\n\n### Test 6 - Wildcard Filter\n\n#### Steps\n\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1 -f 'msl*.txt'` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1 -f 'msl*.txt'`\n-   Upload a file with a file name \"msl_sol_123.txt\".\n-   Upload a file with a file name \"europa_day_2345.txt\".\n\n#### Expected Results\n\n-   \"msl_sol_123.txt\" file will be downloaded to the `tmp` directory.\n\n#### Cleanup\n\n-   Delete \"msl_sol_123.txt\" from both DataDrive web app and files in the `tmp` directory.\n\n### Test 7 - Regex Filter\n\n#### Steps\n\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1 -x 'msl.*\\.txt'` or `./ddrv subscribe -o tmp -p DataDrive-Dev-Congruent-1 -x 'msl.*\\.txt'`\n-   Upload a file with a file name \"msl_sol_123.txt\".\n-   Upload a file with a file name \"europa_day_2345.txt\".\n\n#### Expected Results\n\n-   \"msl_sol_123.txt\" file will be downloaded to the `tmp` directory.\n\n#### Cleanup\n\n-   Delete \"msl_sol_123.txt\" from both DataDrive web app and files in the `tmp` directory.\n\n### Test 8 - Regex Filter and Wildcard Filter\n\n#### Steps\n\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp -p DataDrive-Dev-Congruent-1 -x 'msl.*\\.txt' -f 'msl*.txt'` or `ddrv subscribe -o tmp -p 
DataDrive-Dev-Congruent-1 -x 'msl.*\\.txt' -f 'msl*.txt'`\n\n#### Expected Results\n\n-   An error will be returned to the user, as you cannot specify a regex filter and a wildcard filter together.\n\n### Test 9 - Subscriptions for a saved search\n\n#### Steps\n\n-   Ensure that there is a folder called `tmp_ss` where the files will be downloaded.\n-   Ensure that a saved search has been created. This can be done using the DataDrive UI. Create a saved search that filters on a directory path in the `ocs_path` field with a value of `/jeff`.\n-   Run the command line such as the example below. Note: the `-s` flag is for the saved search name created above.\n    -   Ex: `node ddrv.js subscribe -o tmp_ss -s 'jeffliu_saved_search_2' -r`\n-   Upload a file named \"seeme123.txt\" into the `/jeff` folder.\n-   Upload a file named \"seemenot123.txt\" outside the `/jeff` folder.\n\n#### Expected Results\n\n-   \"seeme123.txt\" file will be downloaded to the `tmp_ss` directory under the `/jeff` folder.\n\n#### Cleanup\n\n-   Delete \"seeme123.txt\" from files in the `tmp_ss` directory.\n-   Delete \"seemenot123.txt\" from files in the `tmp_ss` directory.\n\n### Test 10 - Adjusting checkpoint.txt for a saved search\n\n#### Steps\n\n-   This test case depends on test 9 above. Make sure that you run test 9 first.\n-   Open the `tmp_ss/.datadrive/checkpoint.txt` file and change the date to a time earlier than when the files were uploaded in test 9.\n    -   Note: The time format must be in a valid format such as... Ex: `2019-09-08T11:51:28.000-0700`.\n-   Run the command line such as the example below.\n    -   Ex: `node ddrv.js subscribe -o tmp_ss -s 'jeffliu_saved_search_2' -P`\n\n#### Expected Results\n\n-   Files uploaded since the given date will be downloaded. 
In this case, \"seeme123.txt\" should be downloaded.\n\n#### Cleanup\n\n-   Delete \"seeme123.txt\" from files in the `tmp_ss` directory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasa-ammos%2Fdatadrive-commandline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnasa-ammos%2Fdatadrive-commandline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasa-ammos%2Fdatadrive-commandline/lists"}