{"id":28829115,"url":"https://github.com/forcedotcom/analytics-cloud-dataset-utils","last_synced_at":"2025-08-15T22:14:41.610Z","repository":{"id":22210919,"uuid":"25543574","full_name":"forcedotcom/Analytics-Cloud-Dataset-Utils","owner":"forcedotcom","description":"Friendly utility to load your on-prem data, whether large or small,  to Einstein Analytics Datasets, with useful features such as autoloading, dataflow control and dataset inspection.","archived":false,"fork":false,"pushed_at":"2023-06-13T22:54:58.000Z","size":2568,"stargazers_count":130,"open_issues_count":39,"forks_count":64,"subscribers_count":42,"default_branch":"master","last_synced_at":"2025-07-12T18:52:58.419Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/forcedotcom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2014-10-21T20:10:56.000Z","updated_at":"2025-01-27T01:40:54.000Z","dependencies_parsed_at":"2024-01-02T20:53:02.707Z","dependency_job_id":null,"html_url":"https://github.com/forcedotcom/Analytics-Cloud-Dataset-Utils","commit_stats":null,"previous_names":[],"tags_count":37,"template":false,"template_full_name":null,"purl":"pkg:github/forcedotcom/Analytics-Cloud-Dataset-Utils","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/forcedotcom%2FAnalytics-Cloud-Dataset-Utils","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/forcedotcom%2FAnalytics-Cloud-Dataset-Utils/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/forcedotcom%2FAnalytics-Cloud-Dataset-Utils/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/forcedotcom%2FAnalytics-Cloud-Dataset-Utils/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/forcedotcom","download_url":"https://codeload.github.com/forcedotcom/Analytics-Cloud-Dataset-Utils/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/forcedotcom%2FAnalytics-Cloud-Dataset-Utils/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270637448,"owners_count":24620426,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-15T02:00:12.559Z","response_time":110,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-19T05:11:51.006Z","updated_at":"2025-08-15T22:14:41.570Z","avatar_url":"https://github.com/forcedotcom.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# DatasetUtils\n\nDatasetUtils is a reference implementation of the Einstein Analytics External Data API. This tool is free to use, but it is not officially supported by Salesforce.\nThis is a community project that has not been officially tested or documented. 
Please do not contact Salesforce for support when using this application.\n\n\n## Log4j2 Issues (CVE-2021-44228 and CVE-2021-45046)\n\nThese vulnerabilities allow an attacker to inject malicious bytecode through JNDI lookups or the JDBC Appender. A review of DatasetUtils does not show the JDBC Appender or JNDI lookups being used. In addition, we are using an old version of Log4j, 1.2, for which these vulnerabilities have not been reported.\n\n## DatasetUtils current status\nDatasetUtils has considerable scope for rewriting and for removing deprecated code, including reducing or removing third-party libraries as prompted by the Log4j issues. We are therefore looking at end of life for this code and actively considering a replacement in both functionality and technology.\n\n\n## Downloading DatasetUtils\n\nDownload the latest version from [releases](https://github.com/forcedotcom/Analytics-Cloud-Dataset-Utils/releases) and follow the examples below:\n\n## Running DatasetUtils\n\n## Prerequisite\n\nDownload and install the Java JDK (not the JRE) from Zulu Open JDK:\n\n* [Zulu Open JDK](https://www.azul.com/downloads/zulu-community/?\u0026architecture=x86-64-bit\u0026package=jdk)\n\nDifferent versions of DatasetUtils require different versions of the JDK; the latest release, API 48.1.1, requires JDK 11. After installation is complete, open a console and check that the Java version is correct for your DatasetUtils version by running the following command:\n\n\n``java -version``\n\n\n### Server mode with Web UI\n\n\n**Windows**: \n\nUnzip datasetutils.zip to a local folder. To start the jar in server mode, double-click run.bat.\n\n**Mac**: \n\nInstall datasetutils.dmg by double-clicking it and dragging and dropping it into the Applications folder.\n\nRun datasetutils.app by double-clicking it.\n\n### Console Mode\n\nIt is best to run in interactive mode. 
Open a terminal, type in the following command, and follow the prompts on the console:\n\n    java -jar datasetutil-\u003cversion\u003e.jar --server false\n\nOr you can pass in all the parameters on the command line and let it run uninterrupted.\n\n    java -jar datasetutil-\u003cversion\u003e.jar --action \u003caction\u003e --u \u003cuser@domain.com\u003e --dataset \u003cdataset\u003e --app \u003capp\u003e --inputFile \u003cinputFile\u003e --endpoint \u003cendPoint\u003e\n\nInput Parameters\n\n--action  : \"load\" OR \"downloadxmd\" OR \"uploadxmd\" OR \"detectEncoding\" OR \"downloadErrorFile\"\n\n**load**: to load a csv  \n\n**downloadxmd**: to download existing xmd files  \n\n**uploadxmd**: to upload user.xmd.json  \n\n**detectEncoding**: to detect the encoding of the inputFile  \n\n**downloadErrorFile**: to download the error file for csv upload jobs\n\n--u       : Salesforce.com login\n\n--p       : (Optional) Salesforce.com password. If omitted, you will be prompted\n\n--token   : (Optional) Salesforce.com token\n\n--endpoint: (Optional) the Salesforce SOAP API endpoint (test/prod). Default: https://login.salesforce.com\n\n--dataset : (Optional) the dataset alias. Required if action=load\n\n--datasetLabel : (Optional) the dataset label. Defaults to the dataset alias\n\n--app     : (Optional) the app/folder name for the dataset\n\n--operation     : (Optional) the operation for load (Overwrite/Upsert/Append/Delete). Default: Overwrite\n\n--inputFile : (Optional) the input csv file. Required if action=load\n\n--rootObject: (Optional) the root SObject for the extract\n\n--rowLimit: (Optional) the number of rows to extract, -1=all, default=1000\n\n--sessionId : (Optional) the Salesforce sessionId. If specified, also specify the endpoint\n\n--fileEncoding : (Optional) the encoding of the inputFile. Default: UTF-8\n\n--CodingErrorAction: (Optional) what to do if input characters are not UTF-8: IGNORE|REPORT|REPLACE. Default: REPORT. 
If you change this option, you risk importing garbage characters.\n\n--uploadFormat : (Optional) whether to upload as binary or csv. Default: binary\n\n--mode : (Optional) incremental upload. It can be \"Incremental\" or \"None\" (default)\n\n**OR**\n\n--server  : set this to true if you want to run in server mode and use the UI. **If you give this param, all other params will be ignored**\n\n## Usage Example 1: Start the server to use the UI\n    java -jar datasetutils-48.1.1.jar --server true\n\n## Usage Example 2: Upload a local csv to a dataset in production\n    java -jar datasetutils-48.1.1.jar --action load --u pgupta@force.com --p @#@#@# --inputFile Opportunity.csv --dataset puntest\n\n## Usage Example 3: Append a local csv to a dataset\n    java -jar datasetutils-48.1.1.jar --action load --operation append --u pgupta@force.com --p @#@#@# --inputFile Opportunity.csv --dataset puntest\n\n## Usage Example 4: Upload a local csv to a dataset in sandbox\n    java -jar datasetutils-48.1.1.jar --action load --u pgupta@force.com --p @#@#@# --inputFile Opportunity.csv --dataset puntest --endpoint https://test.salesforce.com/services/Soap/u/56.0\n\n## Usage Example 5: Download dataset main xmd json file\n    java -jar datasetutils-48.1.1.jar --action downloadxmd --u pgupta@force.com --p @#@#@# --dataset puntest\n\n## Usage Example 6: Upload user.xmd.json\n    java -jar datasetutils-48.1.1.jar --action uploadxmd --u pgupta@force.com --p @#@#@# --inputFile user.xmd.json --dataset puntest\n\n## Usage Example 7: Detect inputFile encoding\n    java -jar datasetutils-48.1.1.jar --action detectEncoding --inputFile Opportunity.csv\n\n## Usage Example 8: Download the error log file for csv uploads\n    java -jar datasetutils-48.1.1.jar --action downloadErrorFile --u pgupta@force.com --p @#@#@# --dataset puntest\n\n## Building DatasetUtils\n    git clone https://github.com/forcedotcom/Analytics-Cloud-Dataset-Utils.git\n    mvn clean 
install\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fforcedotcom%2Fanalytics-cloud-dataset-utils","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fforcedotcom%2Fanalytics-cloud-dataset-utils","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fforcedotcom%2Fanalytics-cloud-dataset-utils/lists"}