{"id":20038534,"url":"https://github.com/yahoo/validatar","last_synced_at":"2025-05-05T06:32:17.093Z","repository":{"id":31625007,"uuid":"35190085","full_name":"yahoo/validatar","owner":"yahoo","description":"Functional testing framework for Big Data pipelines.","archived":false,"fork":false,"pushed_at":"2023-07-06T20:06:50.000Z","size":1088,"stargazers_count":56,"open_issues_count":8,"forks_count":29,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-04-08T18:51:36.953Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yahoo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"Contributing.md","funding":null,"license":"LICENSE","code_of_conduct":"Code-of-Conduct.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-05-07T00:32:11.000Z","updated_at":"2025-03-05T07:36:06.000Z","dependencies_parsed_at":"2024-11-13T10:40:32.259Z","dependency_job_id":null,"html_url":"https://github.com/yahoo/validatar","commit_stats":null,"previous_names":[],"tags_count":30,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yahoo%2Fvalidatar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yahoo%2Fvalidatar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yahoo%2Fvalidatar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yahoo%2Fvalidatar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yahoo","download_url":"https://codeload.github.com/yahoo/validatar/tar.gz/refs/heads/
master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252451807,"owners_count":21749987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T10:29:49.449Z","updated_at":"2025-05-05T06:32:16.730Z","avatar_url":"https://github.com/yahoo.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Validatar\n\n[![Build Status](https://cd.screwdriver.cd/pipelines/7218/badge)](https://cd.screwdriver.cd/pipelines/7218)\n[![Coverage Status](https://coveralls.io/repos/yahoo/validatar/badge.svg?branch=master)](https://coveralls.io/r/yahoo/validatar?branch=master)\n[![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.yahoo.validatar/validatar/badge.svg)](https://maven-badges.herokuapp.com/maven-central/com.yahoo.validatar/validatar/)\n\n## Table of Contents\n\n* [What is Validatar?](#what-is-validatar)\n* [Using Validatar](#using-validatar)\n\t* [Test File Format](#test-file-format)\n\t* [Assertions](#assertions)\n\t\t* [Assertion Format](#assertion-format)\n\t\t* [Examples](#examples)\n\t* [Report Generation](#report-generation)\n\t* [Parameter Substitution](#parameter-substitution)\n* [Execution Engines](#execution-engines)\n\t* [Hive](#hive)\n\t* [Pig](#pig)\n\t* [REST](#rest)\n\t* [CSV](#csv-and-other-delimited-text-data)\n* [How to Install](#how-to-install)\n\t* [Direct Download](#direct-download)\n\t* [Maven](#maven)\n\t* [Gradle](#gradle)\n* [How to Run](#how-to-run)\n\t* [Running Hive Tests](#running-hive-tests)\n\t* [Running Pig Tests](#running-pig-tests)\n* 
[Pluggability](#pluggability)\n* [Help](#help)\n* [Contributing](#contributing)\n* [Changelog](#changelog)\n\n## What is Validatar?\n\n* A Functional Testing Framework for Big Data\n* Lets you define how to read your data and what the tests are using a simple YAML file (or folder of files)\n* Talks to various data sources through Hive, Pig, or REST based endpoints\n* Reads and models data from highly variable datasources as a standard columnar (table) format\n* Lets you write powerful assertions on this data. You can join, filter and run comparisons on your data\n* Is fully typed and preserves the types of your data sources\n* Generates test reports that can be published in various environments or emailed (currently the JUnit XML format or emailing reports is supported)\n* Is completely modular and pluggable. You can easily extend and add new datasources, input sources, output reports etc.\n\nThe data sources we currently support:\n\n* Hive (HiveServer2)\n* Pig (PigServer)\n* Generic REST endpoint (for datasources like Druid etc)\n* Static data (CSV, TSV, etc)\n\n## Using Validatar\n\n### Test file format\n\nTest files are written in the YAML format. See examples of all different datasources in src/test/resources/. The schema is as follows:\n\n```\nname: String describing the Test Suite name\ndescription: String describing the Test Suite\nqueries:\n   - name: String containing the unique name for the query\n     engine: String telling Validatar what execution engine to use such as \"hive\" or \"pig\" or \"rest\"\n     value: String describing the engine specific method to get the data such as \"SELECT COUNT(*) AS pv_count FROM page_data\" for hive\n     priority: Optional integer telling Validatar what order and groups to run queries in when executing in parallel. Queries with a higher priority (lower value) \n               and the same priority run first and together. If missing, the query is assigned the lowest priority (INT_MAX) and runs last. 
\n     metadata:\n        - key: String key of the metadata entry containing query specific options for the engine\n          value: String value of the metadata entry containing query specific options for the engine\n   ...\ntests:\n   - name: String describing the Test name\n     description: String describing the Test\n     asserts:\n        - A String assertion statement referencing data from the queries. See below for exact details on Validatar assert statements.\n   ...\n```\n\nQueries must have unique names. This name is used as a namespace for all the values returned from the query. In the above example, if the name of the query was \"Analytics\" and it stored a column called \"count\", then you would be able to use this in your test asserts as \"Analytics.count\".\n\nIf you want a test in the tests section to warn only instead of failing, you can set the optional key ```warnOnly``` to ```true```. See [here](https://github.com/yahoo/validatar/blob/master/src/test/resources/sample-tests/tests.yaml) for an example.\n\nValidatar can run a single test file or a folder of test files. Use the --help option to see more details or refer to the Help section below.\n\n### Assertions\n\nThis section describes the asserts that you can write in the test section in a Validatar test.  
Validatar assertions are quite flexible, allowing for the following operations on your data:\n\n```\n                   \u003e  : greater than\n                   \u003e= : greater or equal to\n                   \u003c  : less than\n                   \u003c= : less or equal to\n                   == : equal to\n                   != : not equal to\n                   +  : add\n                   -  : subtract\n                   *  : multiply\n                   /  : divide\n                   \u0026\u0026 : boolean and\n                   || : boolean or\napprox(a, b, percent) : true if a and b are within percent difference (0.0 to 1.0) of each other.\n```\n\n#### Assertion format\n\nA Validatar assertion is an expression similar to ones in C or Java, where the binary operations above can be combined with parentheses etc. to produce an expression that evaluates to true or false. An assertion can optionally contain a ```where``` clause that can filter or join multiple datasets. This where clause is provided after the expression and its syntax is the same as the assert itself. So you can leverage the full power of Validatar's assertion expressions to filter and join your datasets as well. See below for [examples](#examples).\n\nValidatar detects the datasets used in your assertion statement and performs automatic **cartesian products** for them. The resulting dataset is what is used for your asserts. The where section can be used to perform a filter on this resulting cartesian product. In other words, if you have a single dataset used in your assert, then including a where lets you perform a **filter** on the dataset. If you have multiple datasets, the where clause lets you perform a **join** on the datasets.\n\nYour assertion can omit the ```where``` clause and simply assert using the operations above. 
For the examples below, let us pretend we had the following two queries, A and B, that were run against Hive and produced the data as below.\n\n#### Examples\n\nQuery: A\n\n|   date   | country | views | clicks |\n|----------|---------|-------|--------|\n| 20170101 | us      | 10000 | 124    |\n| 20170101 | uk      | 4340  | 14     |\n| 20170101 | fr      | 4520  | 0      |\n| 20170101 | cn      | 99999 | 1024   |\n| 20170101 | eg      | 100   | 24     |\n| 20170102 | us      | 9900  | 328    |\n| 20170102 | uk      | 2340  | 13     |\n| 20170102 | fr      | 4313  | 20     |\n| 20170102 | cn      | 97345 | 2034   |\n| 20170102 | eg      | 100   | 24     |\n| 20170102 | sa      | 0     | 2      |\n\nQuery: B\n\n| country | continent | threshold | expected |\n|---------|-----------|-----------|----------|\n| us      | na        | 0.01      | 10090    |\n| uk      | eu        | 0.1       | 4100     |\n| fr      | eu        | 0.0       | 4500     |\n| cn      | as        | 0.05      | 100000   |\n| eg      | af        | 0.15      | 110      |\n| sa      | af        | 0.1       | 10       |\n| au      | au        | 0.2       | 5        |\n\nValidatar would model the data as two tables, A and B, with the columns and their values as shown above.\n\n##### Example 1\n\n```\n    A.clicks \u003e= 0 \u0026\u0026 A.views \u003e= 0\n```\n\nThis assert makes sure that our views and clicks columns contain only non-negative values. It fails if any cell contains a negative number.\n\n##### Example 2\n\n```\n    (A.clicks / (A.views + A.clicks)) * 100.0 \u003c= 5 || A.clicks \u003e 100 where A.date == \"20170101\"\n```\n\nThis assert checks whether the ratio of clicks to the sum of clicks and views is at most 5%, or the clicks are greater than 100, for the 20170101 date. Only the A dataset is used, and the where clause filters the dataset down to the rows where A.date is \"20170101\". 
For those rows, the assertion is applied.\n\n##### Example 3\n\n```\n    approx(A.views, B.expected, B.threshold) where A.country == B.country \u0026\u0026 B.continent != \"as\"\n```\n\nThis assert uses both A and B, so Validatar takes their cartesian product, and the where clause keeps only the rows where the country is the same (an inner join on country) and the continent is not \"as\". For these rows, it checks that the value of A.views is within the corresponding B.threshold (a fraction from 0.0 to 1.0) of the corresponding value in B.expected. For example, \"us\" will have approx(10000, 10090, 0.01) performed, which is true because the two values differ by less than 1%.\n\nIf you are interested in playing around with it, you can find this failing test suite here ([src/test/resources/csv-tests/test.yaml](https://github.com/yahoo/validatar/blob/master/src/test/resources/csv-tests/test.yaml)).\n\nThe Validatar assertion grammar is written in ANTLR and can be found [here](https://github.com/yahoo/validatar/blob/master/src/main/antlr4/com/yahoo/validatar/assertion/Grammar.g4) if you're interested in the exact syntax.\n\n### Report Generation\n\nValidatar by default uses the JUnit XML report format to write your test results in a JUnit XML file that you can publish. If you have an SMTP server, you can also generate a pretty HTML E-Mail report to mail out to a list of recipients by changing the ```report-format``` setting to ```--report-format email```.\n\n![Report E-Mail](https://user-images.githubusercontent.com/1041753/34065062-2ad8586c-e1b3-11e7-82d6-875427c4cd2d.png)\n\nIf you want to generate a report only if there were failures in running your queries or tests (including tests that were set to warn only), pass the ```--report-on-failure-only true``` flag when launching Validatar.\n\n### Parameter Substitution\n\nYou may want queries, asserts or query metadata to use a specific date column, or some other changing parameter. 
For this, we have a parameter substitution feature.\n\nSimply pass `--parameter KEY=VALUE` in the CLI and every `${KEY}` will be replaced with `VALUE` in all queries, query metadata and test assertions. For example, to query June 23rd 2015, you could use `--parameter DATE=2015-06-23`. If the query uses `${DATE}`, it will be replaced with `2015-06-23` before execution.\n\n### Query Parallelism\n\nYou may want to run queries in parallel rather than sequentially, especially if you have many time-consuming queries.\n\nTo enable this feature, pass in `--query-parallel-enable true` when launching Validatar. By default, this will run all queries in parallel. If the number of concurrent queries needs to be limited, pass in `--query-parallel-max VALUE` where `VALUE` is the max number of queries that should run concurrently.\n\n## Execution Engines\n\n### Hive\n\nThe query part of a Hive test is just a HiveSQL statement. We recommend that you push all the heavy lifting to the query - joins, aggregate results etc. We use Hive JDBC underneath to execute against HiveServer2 and fetch the results. We support Hive settings at the execution level by passing in --hive-setting arguments to Validatar.\n\nSome mock tests can be found in [src/test/resources/sample-tests/tests.yaml](https://github.com/yahoo/validatar/blob/master/src/test/resources/sample-tests/tests.yaml).\n\n### Pig\n\nThe query part of a Pig test is a PigLatin script. You can register your UDFs etc as long as you register them with their full path at runtime. We use PigServer underneath to run the query. You can provide the alias in the script to fetch your results from (or leave it to the default). Setting the exec mode and other Pig settings is supported.\n\nValidatar is currently compiled against *Pig-0.14*. Running against an older or newer version may result in issues if interfaces have changed. In our experience these issues are minor and can be resolved with small fixes to the engine code if absolutely needed. 
Feel free to raise issues, or you can always tweak the Pig engine and plug it into Validatar.\n\nSome mock tests can be found in [src/test/resources/pig-tests/sample.yaml](https://github.com/yahoo/validatar/blob/master/src/test/resources/pig-tests/sample.yaml).\n\n### REST\n\nThe query part of a REST test is a Javascript function that processes the response from your HTTP endpoint into a standard table-like format - a Javascript object (dictionary) where the keys are the column names and each value is an array of that column's values.\nWe execute the native Javascript via Nashorn. The function takes a single argument - the string response from your endpoint. The name of this function is customizable if desired.\n\nThe metadata for the query is used to define the REST call. We currently support setting the method (defaults to GET), the body (if POST), timeout, retry and custom headers.\n\nThis execution engine exists essentially as a catch-all for any other type of Big Data datasource that has a REST interface but is not natively supported in Validatar. But if you feel like it should be in Validatar, feel free to create an issue and we'll look into supporting it.\n\nSome mock tests and examples can be found in [src/test/resources/rest-tests/sample.yaml](https://github.com/yahoo/validatar/blob/master/src/test/resources/rest-tests/sample.yaml).\n\n### CSV (and other delimited text data)\n\nThis execution engine lets you load static data from a file or by defining it in your test YAML file. This is provided to make it easy to load expected data to run assertions against your actual data. 
For instance, in the [examples shown above](#examples), Query B with the thresholds for the various countries could be defined as a static dataset and Query A could actually be the result of a query on your Big Data that you are validating.\n\nSome mock tests and examples can be found in [src/test/resources/csv-tests/sample.yaml](https://github.com/yahoo/validatar/blob/master/src/test/resources/csv-tests/sample.yaml).\n\n## How to install\n\n### Direct Download\n\nValidatar is available on JCenter/Bintray. You can download the artifacts (you will need the jar-with-dependencies artifact to run Validatar) directly from [JCenter](http://jcenter.bintray.com/com/yahoo/validatar/validatar/).\n\nThe JARs should be sufficient for usage, but if you need to depend on Validatar directly in your build, you will need to point your Maven or other build tools to JCenter.\n\n### Maven\n\n```\n\u003crepositories\u003e\n    \u003crepository\u003e\n        \u003csnapshots\u003e\n            \u003cenabled\u003efalse\u003c/enabled\u003e\n        \u003c/snapshots\u003e\n        \u003cid\u003ecentral\u003c/id\u003e\n        \u003cname\u003ebintray\u003c/name\u003e\n        \u003curl\u003ehttp://jcenter.bintray.com\u003c/url\u003e\n    \u003c/repository\u003e\n\u003c/repositories\u003e\n```\n\n```\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.yahoo.validatar\u003c/groupId\u003e\n  \u003cartifactId\u003evalidatar\u003c/artifactId\u003e\n  \u003cversion\u003e${validatar.version}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### Gradle\n\n```\nrepositories {\n    maven {\n        url  \"http://jcenter.bintray.com\"\n    }\n}\n```\n\n```\ncompile 'com.yahoo.validatar:validatar:${validatar.version}'\n```\n\n## How to Run\n\nFor Hadoop based engines like Hive or Pig, it is recommended you run Validatar with ```hadoop jar``` since that sets up most of the classpath for you. 
Otherwise, you can launch Validatar with ```java -cp /PATH/TO/JARS com.yahoo.validatar.App ...```, where ```com.yahoo.validatar.App``` is the main class.\n\nUse ```hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App --help``` (or -h) for help.\n\n### Running Hive Tests\n\n    export HADOOP_CLASSPATH=\"$HADOOP_CLASSPATH:/path/to/hive/jdbc/lib/jars/*\"\n    hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App -s tests/ --report report.xml --hive-jdbc ...\n\nHive needs the JDBC URI of HiveServer2. Note that the DB is in the URI.\n\n```\n--hive-jdbc \"jdbc:hive2://\u003cURI\u003e/\u003cDB\u003e;\u003cOptional params: E.g. sasl.qop=auth;principal=hive/\u003cPRINCIPAL_URL\u003e etc\u003e\"\n```\n\nDo not add the DB to the URI if your queries use the\n\n```\n... FROM DB.TABLE WHERE ...\n```\n\nformat. Instead, you should leave it out and have **ALL** your queries specify the database.\n\n### Running Pig Tests\n\n    # Add other jars here depending on your pig exec type or if hive/hcat is used in Pig\n    export HADOOP_CLASSPATH=\"$HADOOP_CLASSPATH:/path/to/pig/lib/*\"\n    hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App -s tests/ --report report.xml --pig-exec-type mr --pig-setting 'mapreduce.job.acl-view-job=*' ...\n\nPig parameters are not supported in the Pig query. Instead, use our [parameter substitution](#parameter-substitution).\n\nRunning REST tests requires no other dependencies, and they can be launched with Java instead of hadoop jar.\n\n## Pluggability\n\nEngines, report format generators and test suite parsers are all pluggable. You can write your own by implementing the appropriate interfaces below and have Validatar load them at run time by placing them in the classpath. If you wished to have a report generated and posted to a web service, you could do that! Or, going the other way, you could read test suites off of a web service or a queue somewhere. 
Refer to the options below to see how to pass in the custom implementations using the ```custom-engine```, ```custom-parser```, ```custom-formatter``` options.\n\n| Module | Interface to implement |\n| ------ | ---------------------- |\n| Parser | [Parser.java](https://github.com/yahoo/validatar/blob/master/src/main/java/com/yahoo/validatar/parse/Parser.java) |\n| Engine | [Engine.java](https://github.com/yahoo/validatar/blob/master/src/main/java/com/yahoo/validatar/execution/Engine.java) |\n| Formatter | [Formatter.java](https://github.com/yahoo/validatar/blob/master/src/main/java/com/yahoo/validatar/report/Formatter.java) |\n\n## Help\n\nFeel free to reach out to us if you run into issues. You are welcome to open any issues. Pull requests are welcome!\n\nWe list the complete help output from Validatar for reference here:\n\n```\nApplication options:\nOption (* = required)             Description\n---------------------             -----------\n-h, --help                        Shows help message.\n--parameter \u003cParameter\u003e           Parameter to replace all '${VAR}' in\n                                    the query string. Ex: --parameter\n                                    DATE=2014-07-24\n* --test-suite \u003cFile: Test suite  File or folder that contains the test\n  file/folder\u003e                      suite file(s).\n\n\nAdvanced Parsing Options:\nOption                                 Description\n------                                 -----------\n--custom-parser \u003cAdditional custom     Additional custom parser to load.\n  fully qualified classes to plug in\u003e\n\n\nEngine Options:\nOption                              Description\n------                              -----------\n--query-parallel-enable \u003cBoolean:   Whether or not queries should run in\n  Query parallelism option\u003e           parallel. 
(default: false)\n--query-parallel-max \u003cInteger: Max  The max number of queries that will\n  query parallelism\u003e                  run concurrently. If non-positive or\n                                      unspecified, all queries will run at\n                                      once. (default: 0)\n\n\nHive engine options:\nOption (* = required)                   Description\n---------------------                   -----------\n--hive-driver \u003cHive driver\u003e             Fully qualified package name to the\n                                          hive driver. (default: org.apache.\n                                          hive.jdbc.HiveDriver)\n* --hive-jdbc \u003cHive JDBC connector\u003e     JDBC string to the HiveServer2 with an\n                                          optional database. If the database\n                                          is provided, the queries must NOT\n                                          have one. Ex: 'jdbc:hive2:\n                                          //HIVE_SERVER:PORT/\n                                          [DATABASE_FOR_ALL_QUERIES]'\n--hive-password \u003cHive server password\u003e  Hive server password. (default: anon)\n--hive-setting \u003cHive generic settings   Settings and their values. Ex: 'hive.\n  to use.\u003e                                execution.engine=mr'\n--hive-username \u003cHive server username\u003e  Hive server username. 
(default: anon)\n\n\nREST Engine options:\nOption                               Description\n------                               -----------\n--rest-function \u003cREST Javascript     The name of the Javascript function\n  method name\u003e                         used in all queries (default:\n                                       process)\n--rest-retry \u003cInteger: REST Query    The default number of times to retry\n  retry limit\u003e                         each HTTP request (default: 3)\n--rest-timeout \u003cInteger: REST Query  The default time to wait for each HTTP\n  timeout\u003e                             request (default: 60000)\n\nThis REST Engine works by making a HTTP GET or POST, parsing the response (JSON is best)\nusing your provided native JavaScript into a common format.\nThe query part of the engine is a JavaScript function that takes your response from your\nrequest and transforms it to a columnar JSON object with the columns as keys and values\nas arrays of values. You may need to iterate over your output and pull out your columns\nand return it as a JSON string using JSON stringify. Example: Suppose you extracted\ncolumns called 'a' and 'b', you would create and return the following JSON string :\n{\"a\": [a1, a2, ... an], \"b\": [b1, b2, ... bn]}\nThis engine will inspect these elements and convert them to the proper typed objects.\nThe metadata part of the query contains the required key/value pairs for making the REST\ncall. The url to make the request to can be set using the url. You can use a\ncustom timeout in ms for the call using rest-timeout. The HTTP method can be set\nusing the method - currently support GET and POST\nThe string body for the POST can be set using the body. The number of\ntimes to retry can be set using rest-retry. If you wish to change the name of the\nJavascript function you are using, use the rest-function. Default name is\nprocess. 
Any other key/value pair is added as headers to the REST call,\nwith the key being the header name and the value, its value.\n\nCSV Engine options:\nOption                                 Description\n------                                 -----------\n--csv-delimiter \u003cThe field delimiter\u003e  The delimiter to use while parsing\n                                         fields within a record. Defaults to\n                                         ',' or CSV (default: ,)\n\nThis Engine lets you load delimited text data from files or to specify it directly as a query.\nIt follows the RFC 4180 CSV specification: https://tools.ietf.org/html/rfc4180\n\nYour data MUST contain a header row naming your columns.\nThe types of all fields will be inferred as STRINGS. However, you can provide mappings\nfor each column name by adding entries to the metadata section of the query, where\nthe key is the name of your column and the value is the type of the column.\nThe values can be BOOLEAN, STRING, LONG, DECIMAL, DOUBLE, and TIMESTAMP.\nDECIMAL is used for really large numbers that cannot fit inside a long (2^63). TIMESTAMP is\nused to interpret a whole number as a timestamp field - millis from epoch. Use to load dates.\nThis engine primarily exists to let you easily load expected data in as a dataset. You can\nthen use the data by joining it with some other data and performing asserts on the joined\ndataset.\n\nPig engine options:\nOption                                  Description\n------                                  -----------\n--pig-exec-type \u003cPig execution type\u003e    The exec-type for Pig to use.  This is\n                                          the -x argument used when running\n                                          Pig. 
Ex: local, mr, tez etc.\n                                          (default: mr)\n--pig-output-alias \u003cPig default output  The default name of the alias where\n  alias\u003e                                  the result is.This should contain\n                                          the data that will be collected\n                                          (default: validatar_results)\n--pig-setting \u003cPig generic settings to  Settings and their values. The -D\n  use.\u003e                                   params that would have been sent to\n                                          Pig. Ex: 'mapreduce.job.acl-view-\n                                          job=*'\n\n\nAdvanced Engine Options:\nOption                                 Description\n------                                 -----------\n--custom-engine \u003cAdditional custom     Additional custom engine to load.\n  fully qualified classes to plug in\u003e\n\nReporting options:\nOption                              Description\n------                              -----------\n--report-format \u003cReport formats\u003e     Which report formats to use. (default:\n                                      junit)\n--report-on-failure-only \u003cBoolean:  Should the reporter be only run on\n  Report on failure\u003e                  failure. 
(default: false)\n\n\nJunit report options:\nOption                       Description\n------                       -----------\n--report-file \u003cReport file\u003e  File to store the test reports.\n                               (default: report.xml)\nEmail report options:\nOption (* = required)             Description\n---------------------             -----------\n* --email-from                    Email shown to recipients as 'from'\n* --email-recipient, --email-     Comma-separated or multi-option emails\n  recipients \u003cReport recipients'    to send reports\n  emails\u003e\n* --email-reply-to                Email to which replies will be sent\n--email-sender-name               Name of sender displayed to report\n                                    recipients (default: Validatar)\n* --email-smtp-host               Email SMTP host name\n* --email-smtp-port               Email SMTP port\n--email-smtp-strategy             Email SMTP transport strategy -\n                                    SMTP_PLAIN, SMTP_TLS, SMTP_SSL\n                                    (default: SMTP_TLS)\n--email-subject-prefix            Prefix for the subject of the email\n                                    (default: [VALIDATAR] Test Status - ))\n\nAdvanced Reporting Options:\nOption                                 Description\n------                                 -----------\n--custom-formatter \u003cAdditional custom  Additional custom formatter to load.\n  fully qualified classes to plug in\u003e\n```\n\n## Contributing\n\nAll contributions, ideas and feedback are welcome! To run and build Validatar, you need Maven 3 and JDK (1.8.60+ for Nashorn). You can\nuse the make commands in the Makefile to run tests and see coverage (need a clover license) etc.\n\n## Changelog\n\nVersion | Notes\n------- | -----\n0.1.4   | Initial release with Hive\n0.1.5   | Typesystem, metadata support\n0.1.6   | No feature release. 
Source and Javadoc bundled in artifact\n0.1.7   | Multiple Hive databases across Queries\n0.1.8   | Null types in Hive results fix\n0.1.9   | Empty results handling bug fix\n0.2.0   | Internal switch to Java 8. hive-queue is no longer a setting. Use hive-setting.\n0.3.0   | Pig support added.\n0.4.0   | Rest API datasource added.\n0.4.1   | Classloader and reflections library removal [#19](https://github.com/yahoo/validatar/issues/19)\n0.4.2   | Parameter Expansion in metadata [#21](https://github.com/yahoo/validatar/issues/21)\n0.4.3   | Parameter Expansion in asserts [#24](https://github.com/yahoo/validatar/issues/24). Hive NULL type bug fix.\n0.5.1   | Vector support, join and filter clauses using where [#26](https://github.com/yahoo/validatar/issues/26). CSV static datasource from file or String [#27](https://github.com/yahoo/validatar/issues/27).\n0.5.2   | Validatar exits with an exit code of 1 if there are failures. Added a warnOnly parameter for tests. JUnit reporter now uses CDATA in XML for additional information.\n0.5.3   | Added an Email reporter that sends an HTML formatted email test report contributed by [Mogball](https://github.com/mogball).\n0.5.4   | Added a flag ```--report-on-failure-only``` to only generate reports if there were failures in tests (including warnOnly) or queries\n0.5.5   | Shaded ```org.objectweb.asm``` to not clash with asm in Hadoop environments\n0.5.6   | Fixed a bug with pretty-printing results with nulls\n0.6.0   | Better reporting (show data with only the assertion columns with the assertion result column, columns now in sorted order, Email Subject Prefix). Can now write multiple reports per invocation (specify more than one report formatter using --report-format)\n0.6.1   | Added a flag to configure the Email reporter SMTP strategy. Use ```--email-smtp-strategy``` to pass in ```SMTP_PLAIN```, ```SMTP_TLS``` or ```SMTP_SSL```.\n0.6.2   | Bintray EOL. 
First rerelease of 0.6.1 on Maven Central instead\n0.6.3   | Screwdriver migration. First rerelease of 0.6.1 using Screwdriver instead of Travis.\n0.7.0   | Added support for running queries in parallel\n0.7.1   | Fixed a bug with parallel query execution in REST and Hive engines. Upgraded to Apache HttpClient 5.x which supports HTTP2 in REST engine.\n0.7.2   | Added query priority to allow executing parallel queries in groups\n0.7.3   | Log4j2\n0.7.4   | Shaded ```org.antlr``` to not clash with other versions in Hadoop environments\n\n## Members\n\nAkshai Sarma, akshaisarma@gmail.com\nJosh Walters, josh@joshwalters.com\n[0aix](https://github.com/0aix)\n\n## Contributors\n\n[Mogball](https://github.com/mogball) - Email Reporter [0.5.3]\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyahoo%2Fvalidatar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyahoo%2Fvalidatar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyahoo%2Fvalidatar/lists"}