{"id":15161774,"url":"https://github.com/romans-weapon/spear-framework","last_synced_at":"2025-10-24T22:31:56.322Z","repository":{"id":50792663,"uuid":"370235957","full_name":"romans-weapon/spear-framework","owner":"romans-weapon","description":"Rapid ETL/ELT-connectors/pipeline development leveraged on top of Apache Spark","archived":false,"fork":false,"pushed_at":"2021-12-16T07:37:49.000Z","size":2534,"stargazers_count":19,"open_issues_count":0,"forks_count":22,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-07T03:51:15.264Z","etag":null,"topics":["docker-compose","hadoop","kafka","scala","shell-script","spark"],"latest_commit_sha":null,"homepage":"https://romans-weapon.github.io/spear-framework/","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/romans-weapon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-05-24T05:18:17.000Z","updated_at":"2025-03-21T04:41:20.000Z","dependencies_parsed_at":"2022-09-26T22:10:53.918Z","dependency_job_id":null,"html_url":"https://github.com/romans-weapon/spear-framework","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/romans-weapon/spear-framework","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romans-weapon%2Fspear-framework","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romans-weapon%2Fspear-framework/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romans-weapon%2Fspear-framework/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romans-weapon%2
Fspear-framework/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/romans-weapon","download_url":"https://codeload.github.com/romans-weapon/spear-framework/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romans-weapon%2Fspear-framework/sbom","scorecard":{"id":783783,"data":{"date":"2025-08-11","repo":{"name":"github.com/romans-weapon/spear-framework","commit":"856db3a60cce00a697bc819767930735a170fb0a"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.8,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/spear-framework-build.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/spear-framework-build.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/romans-weapon/spear-framework/spear-framework-build.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/spear-framework-build.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/romans-weapon/spear-framework/spear-framework-build.yml/main?enable=pin","Info:   0 out of   2 GitHub-owned GitHubAction dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases 
found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-23T05:33:45.303Z","repository_id":50792663,"created_at":"2025-08-23T05:33:45.303Z","updated_at":"2025-08-23T05:33:45.303Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280878369,"owners_count":26406641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-24T02:00:06.418Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker-compose","hadoop","kafka","scala","shell-script","spark"],"created_at":"2024-09-27T00:44:55.781Z","updated_at":"2025-10-24T22:31:55.738Z","avatar_url":"https://github.com/romans-weapon.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spear Framework\n\n[![Build 
Status](https://github.com/romans-weapon/spear-framework/workflows/spear-framework-build/badge.svg)](https://github.com/romans-weapon/spear-framework/actions)\n[![Code Quality Grade](https://api.codiga.io/project/23492/score/svg)](https://app.codiga.io/public/user/github/AnudeepKonaboina)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Maven Central](https://img.shields.io/maven-central/v/io.github.romans-weapon/spear-framework_2.11.svg?label=Maven%20Central)](https://search.maven.org/search?q=g:%22io.github.romans-weapon%22%20AND%20a:%22spear-framework_2.11%22)\n[![Website shields.io](https://img.shields.io/website-up-down-green-red/http/shields.io.svg)](https://romans-weapon.github.io/spear-framework/)\n\nThe spear-framework provides scope to write simple ETL/ELT-connectors/pipelines for moving data from different sources to different destinations which greatly minimizes the effort of writing complex codes for data ingestion. 
Connectors that extract and load (ETL or ELT) any kind of data from a source, with custom transformations applied, can be written and executed seamlessly using spear connectors.\n\n# Table of Contents\n- [Introduction](#introduction)\n- [Design and Code Quality](#design-and-code-quality)\n- [Getting started with Spear](#getting-started-with-spear)\n    * [SBT dependency for Spear](#sbt-dependency-for-spear)\n    * [Maven dependency for Spear](#maven-dependency-for-spear)\n    * [Spark shell package for Spear](#spark-shell-package-for-spear)\n    * [Docker container setup for Spear](#docker-container-setup-for-spear)\n- [Develop your first connector using Spear](#develop-your-first-connector-using-spear)\n- [Example Connectors](#example-connectors)\n    * [Target JDBC](#target-jdbc)\n        - [File Source](#file-source)\n            + [CSV to JDBC Connector](#csv-to-jdbc-connector)\n        - [JDBC Source](#jdbc-source)\n            + [Oracle to Postgres Connector](#oracle-to-postgres-connector)\n        - [Streaming Source](#streaming-source)\n            + [kafka to Postgres Connector](#kafka-to-postgres-connector)\n    * [Target FS (HDFS)](#target-fs-hdfs)\n        - [JDBC Source](#jdbc-source)\n            + [Postgres to Hive Connector](#postgres-to-hive-connector)\n        - [Streaming Source](#streaming-source)\n            + [kafka to Hive Connector](#kafka-to-hive-connector)\n    * [Target FS (Cloud)](#target-fs-cloud)\n        + [Oracle to S3 Connector](#oracle-to-s3-connector)\n    * [Target NOSQL](#target-nosql)\n        - [File Source](#file-source)\n            + [CSV to MongoDB Connector](#csv-to-mongodb-connector)\n    * [Target GraphDB](#target-graphdb)\n        - [File Source](#file-source)\n            + [CSV to neo4j Connector](#csv-to-neo4j-connector)\n- [Other Functionalities of Spear](#other-functionalities-of-spear)\n    * [Merge using executeQuery API](#merge-using-executequery-api)\n    * [Write to multi-targets using branch 
API](#write-to-multi-targets-using-branch-api)\n- [Contributions and License](#contributions-and-license)\n- [Visit Website](#visit-website)\n\n# Introduction\n\nSpear Framework gives developers the ability to write connectors (ETL/ELT jobs) from a source to a target, applying business logic/transformations over the source data and ingesting it into the corresponding destination with very minimal code.\n\n![image](https://user-images.githubusercontent.com/59328701/122396653-d412a300-cf95-11eb-8bd5-bef400c07de8.png)\n\n# Design and Code Quality\n\n![image](https://user-images.githubusercontent.com/59328701/122229966-d661f800-ced6-11eb-839a-c77ca7cca610.png)\n\n\n# Getting Started with Spear\nThe master version of the framework supports **spark-2.4.x** with **scala 2.11.x**. For spear-framework with **spark 3.x** support [click here](https://github.com/romans-weapon/spear-framework/blob/spark-3.1.1/README.md#getting-started-with-spear)\n\nYou can get started with spear using any of the methods below:\n### SBT dependency for Spear\n\nYou can add spear-framework as a dependency in your project's build.sbt file as shown below:\n```commandline\nlibraryDependencies += \"io.github.romans-weapon\" %% \"spear-framework\" % \"2.4-3.0.3\"\n```\n\n### Maven dependency for Spear\nThe Maven dependency for spear is shown below:\n```commandline\n\u003cdependency\u003e\n  \u003cgroupId\u003eio.github.romans-weapon\u003c/groupId\u003e\n  \u003cartifactId\u003espear-framework_2.11\u003c/artifactId\u003e\n  \u003cversion\u003e2.4-3.0.3\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### Spark shell package for Spear\n\nYou can also add it as a package while starting spark-shell, along with other packages.\n```commandline\nspark-shell --packages \"io.github.romans-weapon:spear-framework_2.11:2.4-3.0.3\"\n```\n\n### Docker container setup for Spear\nBelow are the steps to set up spear on any machine that has docker and docker-compose installed:\n\n1. 
Clone the repository from git and navigate to the project directory\n```commandline\ngit clone https://github.com/AnudeepKonaboina/spear-framework.git \u0026\u0026 cd spear-framework\n```\n\n2. Run the setup.sh script using the command\n```commandline\nsh setup.sh\n```\n\n3. Once the setup is complete, run the command below to enter the container that contains spear\n```commandline\nuser@node~$ docker exec -it spear bash\n```\n\n4. Run `spear-shell` inside the container to start a spark shell integrated with spear.\n```\nroot@hadoop # spear-shell\n```\n5. Once you are inside the container, a default hadoop/hive environment is readily available for reading data from any source and writing it to HDFS, giving you a complete environment to create your own data pipelines using spear-framework.\\\nServices and their corresponding versions available within the container are shown below:\n\n| Service      | Version     |\n| -----------  | ----------- |\n| Spark        | 2.4.7       |\n| Hadoop       | 2.10.1      |\n| Hive         | 2.1.1       |\n\nThe container also has a postgres database and a NoSQL database (mongodb), which you can use as a source or as a destination for writing and testing your connector.\n\n6. Start writing your own connectors and explore. To understand how to write a connector [click here](#develop-your-first-connector-using-spear)\n\n\n# Develop your first connector using Spear\n\nBelow are the steps to write any connector:\n\n1. Get a suitable connector object using SpearConnector by providing the source and destination details as shown below:\\\n\na. 
For connectors from a single source to a single destination, create a connector object as shown below\n```commandline\nimport com.github.edge.roman.spear.SpearConnector\n\nval connector = SpearConnector\n  .createConnector(name=\"defaultconnector\")                //give a name to your connector(any name)\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")//source type and format for loading data  \n  .target(targetType = \"FS\", targetFormat = \"parquet\")     //target type and format for writing data to dest.\n  .getConnector \n```\nb. For connectors from a source to multiple targets, create a connector object as shown below\n\n```commandline\nval multiTargetConnector = SpearConnector\n  .createConnector(name=\"defaultconnector\")                //give a name to your connector(any name)\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")//source type and format for loading data  \n  .multitarget                                             //use multitarget in case of more than one destination\n  .getConnector \n```\n\n2. Below are the source and destination type combinations that spear-framework supports:\n\n#### Connector types (source and dest.) supported by spear:\n\n|source type  | dest. type    | description                                                | \n|------------ |:-------------:|:-----------------------------------------------------------:|\n| file        |  relational   |connector object with file source and database as dest.     |\n| relational  |  relational   |connector object with database source and database as dest. |\n| stream      |  relational   |connector object with stream source and database as dest.   |\n| nosql       |  relational   |connector object with nosql  source and relational as dest. |\n| graph       |  relational   |connector object with graph source and database as dest.    |\n| file        |  FS           |connector object with file source and FileSystem as dest.   
|\n| relational  |  FS           |connector object with database source and FileSystem as dest|\n| stream      |  FS           |connector object with stream source and FileSystem as dest. |\n| FS          |  FS           |connector object with FS     source and FileSystem as dest. |\n| graph       |  FS           |connector object with graph  source and FileSystem as dest. |\n| nosql       |  FS           |connector object with nosql  source and FileSystem as dest. |\n| file        |  nosql        |connector object with file   source and nosql      as dest. |\n| relational  |  nosql        |connector object with relational source and nosql  as dest. |\n| nosql       |  nosql        |connector object with nosql  source and nosql      as dest. |\n| graph       |  nosql        |connector object with graph  source and nosql      as dest. |\n| file        |  graph        |connector object with file  source and  Graph      as dest. |\n| relational  |  graph        |connector object with relational  source and Graph as dest. |\n| nosql       |  graph        |connector object with nosql  source and Graph      as dest. |\n\n\n\n3. 
Write the connector logic using the connector object from step 1.\n\n```commandline\n-\u003e Source object and connection profile need to be specified for reading data from source\nconnector\n  .source(sourceObject=\"\u003cfilename/tablename/topic/api\u003e\", \u003csource_connection_profile Map((key-\u003evalue))\u003e) \n  (or) \n  .sourceSql(\u003cconnection profile\u003e,\u003csql_text\u003e)\n  \n-\u003eThe saveAs api creates a temporary table on the source data with the given alias name which can be used for further transformations\n  .saveAs(\"\u003calias temporary table name\u003e\")\n\n-\u003eapply custom transformations on the loaded source data.(optional/can be applied only if necessary)\n  .transformSql(\"\u003ctransformation sql to be applied on source data\u003e\")\n\n-\u003etarget details where you want to load the data.\n  .targetFS(destinationFilePath = \"\u003chdfs /s3/gcs file path\u003e\", saveAsTable = \"\u003ctablename\u003e\", \u003cSavemode can be overwrite/append/ignore\u003e) \n  (or)\n  .targetJDBC(tableName=\u003ctable_name\u003e, targetProperties, \u003cSavemode can be overwrite/append/ignore\u003e)\n  (or)\n  .targetNoSQL(\u003cnosql_obj_name\u003e,targetProperties,\u003cSavemode can be overwrite/append/ignore\u003e)\n  \n -\u003efor multitarget use the branch api. The dest format is given within each target, as shown in the examples below.\n   .branch\n   .targets(\n   //target-1\n   //target-2\n   ..\n   //target-n\n   )\n```\n\n4. On completion, stop the connector.\n\n```commandline\n//stops the connector object and the underlying spark session\nconnector.stop()\n```\n\n5. 
Enable verbose logging\n   To see the output df at each stage of your connector, you can explicitly enable verbose logging as shown below, as soon as you create a connector object. This is completely optional.\n\n```commandline\nconnector.setVeboseLogging(true) //default value is false.\n```\n\n## Diagrammatic Representation:\n![image](https://user-images.githubusercontent.com/59328701/119258939-7afb5d80-bbe9-11eb-837f-02515cb7cf74.png)\n\n## Example Connectors\nA connector is the logic/code that creates a pipeline from a source to a target using the spear framework, letting you write data from any source to any destination.\n\n### Target JDBC\nSpear framework supports writing data to any RDBMS with jdbc as destination (postgres/oracle/msql etc.) from various sources such as a file (csv/json/filesystem etc.), a database (RDBMS/cloud db etc.), or a stream (kafka/dir path etc.). Given below are examples of a few connectors with JDBC as target. The examples are written for postgresql as the JDBC target, but they can be extended to any jdbc target.\n\n### File source\n\n#### CSV to JDBC Connector\nAn example connector that reads a csv file, applies transformations, and stores the result in a postgres table using spear:\\\n\nThe input data is available in data/us-election-2012-results-by-county.csv. 
Simply copy the connector below, paste it into your interactive shell, and watch your data move to a table in postgres with minimal code.\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.SaveMode\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\nval targetProps = Map(\n    \"driver\" -\u003e \"org.postgresql.Driver\",\n    \"user\" -\u003e \"postgres_user\",\n    \"password\" -\u003e \"mysecretpassword\",\n    \"url\" -\u003e \"jdbc:postgresql://postgres:5432/pgdb\"\n  )\n\nval csvJdbcConnector = SpearConnector\n    .createConnector(name=\"CSVtoPostgresConnector\")\n    .source(sourceType = \"file\", sourceFormat = \"csv\")\n    .target(targetType = \"relational\", targetFormat = \"jdbc\")\n    .getConnector   \n \ncsvJdbcConnector.setVeboseLogging(true)\ncsvJdbcConnector\n  .source(sourceObject=\"file:///opt/spear-framework/data/us-election-2012-results-by-county.csv\", Map(\"header\" -\u003e \"true\", \"inferSchema\" -\u003e \"true\"))\n  .saveAs(\"__tmp__\")\n  .transformSql(\n    \"\"\"select state_code,party,\n      |sum(votes) as total_votes\n      |from __tmp__\n      |group by state_code,party\"\"\".stripMargin)\n  .targetJDBC(objectName=\"mytable\", props=targetProps, saveMode=SaveMode.Overwrite)\ncsvJdbcConnector.stop()\n```\n\n##### Output:\n\n```\n21/06/17 08:04:03 INFO targetjdbc.FiletoJDBC: Connector:CSVtoPostgresConnector to Target:JDBC with Format:jdbc from Source:file:///opt/spear-framework/data/us-election-2012-results-by-county.csv with Format:csv started running !!\n21/06/17 08:04:09 INFO targetjdbc.FiletoJDBC: Reading source file: file:///opt/spear-framework/data/us-election-2012-results-by-county.csv with format: csv 
status:success\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\n|country_id|state_code|country_name|country_total_votes|party|first_name|last_name|votes|\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\n|1         |AK        |Alasaba     |220596             |Dem  |Barack    |Obama    |91696|\n|2         |AK        |Akaskak     |220596             |Dem  |Barack    |Obama    |91696|\n|3         |AL        |Autauga     |23909              |Dem  |Barack    |Obama    |6354 |\n|4         |AK        |Akaska      |220596             |Dem  |Barack    |Obama    |91696|\n|5         |AL        |Baldwin     |84988              |Dem  |Barack    |Obama    |18329|\n|6         |AL        |Barbour     |11459              |Dem  |Barack    |Obama    |5873 |\n|7         |AL        |Bibb        |8391               |Dem  |Barack    |Obama    |2200 |\n|8         |AL        |Blount      |23980              |Dem  |Barack    |Obama    |2961 |\n|9         |AL        |Bullock     |5318               |Dem  |Barack    |Obama    |4058 |\n|10        |AL        |Butler      |9483               |Dem  |Barack    |Obama    |4367 |\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\nonly showing top 10 rows\n\n21/06/17 08:04:10 INFO targetjdbc.FiletoJDBC: Saving data as temporary table:__tmp__ success\n21/06/17 08:04:12 INFO targetjdbc.FiletoJDBC: Executing transformation sql: select state_code,party,\nsum(votes) as total_votes\nfrom __tmp__\ngroup by state_code,party status :success\n+----------+-----+-----------+\n|state_code|party|total_votes|\n+----------+-----+-----------+\n|AL        |Dem  |793620     |\n|NY        |GOP  |2226637    |\n|MI        |CST  |16792      |\n|ID        |GOP  |420750     |\n|ID        |Ind  |2495       |\n|WA        |CST  |7772       |\n|HI        |Grn  |3121       |\n|MS        |RP   |969        |\n|MN        |Grn  |13045      |\n|ID        |Dem  
|212560     |\n+----------+-----+-----------+\nonly showing top 10 rows\n\n21/06/17 08:04:17 INFO targetjdbc.FiletoJDBC: Write data to table/object:mytable completed with status:success\n+----------+-----+-----------+\n|state_code|party|total_votes|\n+----------+-----+-----------+\n|AL        |Dem  |793620     |\n|NY        |GOP  |2226637    |\n|MI        |CST  |16792      |\n|ID        |GOP  |420750     |\n|ID        |Ind  |2495       |\n|WA        |CST  |7772       |\n|HI        |Grn  |3121       |\n|MS        |RP   |969        |\n|MN        |Grn  |13045      |\n|ID        |Dem  |212560     |\n+----------+-----+-----------+\nonly showing top 10 rows\n\n```\nMany more connectors from other file sources to a JDBC destination are available [here](https://romans-weapon.github.io/spear-framework/#file-source).\n\n\n### JDBC source\n\n#### Oracle to Postgres Connector\nThis example shows how to use the sourceSql api to read from the source with a query applied at the source, as shown below.\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.SaveMode\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\n\nval targetParams = Map(\n  \"driver\" -\u003e \"org.postgresql.Driver\",\n  \"user\" -\u003e \"postgres_user\",\n  \"password\" -\u003e \"mysecretpassword\",\n  \"url\" -\u003e \"jdbc:postgresql://localhost:5432/pgdb\"\n)\n\nval oracleTOPostgresConnector = SpearConnector\n  .createConnector(name=\"OracletoPostgresConnector\")\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")\n  .target(targetType = \"relational\", targetFormat = \"jdbc\")\n  .getConnector\n\noracleTOPostgresConnector.setVeboseLogging(true)\n\noracleTOPostgresConnector\n  .sourceSql(Map(\"driver\" -\u003e \"oracle.jdbc.driver.OracleDriver\", \"user\" -\u003e \"user\", \"password\" -\u003e \"pass\", \"url\" -\u003e \"jdbc:oracle:thin:@ora-host:1521:orcl\"),\n    \"\"\"\n      |SELECT\n      |        
to_char(sys_extract_utc(systimestamp), 'YYYY-MM-DD HH24:MI:SS.FF') as ingest_ts_utc,\n      |        to_char(TIMESTAMP_0, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_0,\n      |        to_char(TIMESTAMP_5, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_5,\n      |        to_char(TIMESTAMP_7, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_7,\n      |        to_char(TIMESTAMP_9, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_9,\n      |        to_char(TIMESTAMP0_WITH_TZ) as timestamp0_with_tz , to_char(sys_extract_utc(TIMESTAMP0_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS') as timestamp0_with_tz_utc,\n      |        to_char(TIMESTAMP5_WITH_TZ) as timestamp5_with_tz , to_char(sys_extract_utc(TIMESTAMP5_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp5_with_tz_utc,\n      |        to_char(TIMESTAMP8_WITH_TZ) as timestamp8_with_tz , to_char(sys_extract_utc(TIMESTAMP8_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp8_with_tz_utc,\n      |        to_char(TIMESTAMP0_WITH_LTZ) as timestamp0_with_ltz , to_char(sys_extract_utc(TIMESTAMP0_WITH_LTZ), 'YYYY-MM-DD HH24:MI:SS') as timestamp0_with_ltz_utc,\n      |        to_char(TIMESTAMP5_WITH_LTZ) as timestamp5_with_ltz , to_char(sys_extract_utc(TIMESTAMP5_WITH_LTZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp5_with_ltz_utc,\n      |        to_char(TIMESTAMP8_WITH_LTZ) as timestamp8_with_ltz , to_char(sys_extract_utc(TIMESTAMP8_WITH_LTZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp8_with_ltz_utc\n      |        from DBSRV.ORACLE_TIMESTAMPS\n      |\"\"\".stripMargin)\n  .saveAs(\"__source__\")\n  .transformSql(\n    \"\"\"\n      |SELECT\n      |        TO_TIMESTAMP(ingest_ts_utc) as ingest_ts_utc,\n      |        TIMESTAMP_0 as timestamp_0,\n      |        TIMESTAMP_5 as timestamp_5,\n      |        TIMESTAMP_7 as timestamp_7,\n      |        TIMESTAMP_9 as timestamp_9,\n      |        TIMESTAMP0_WITH_TZ as timestamp0_with_tz,TIMESTAMP0_WITH_TZ_utc as timestamp0_with_tz_utc,\n      |        TIMESTAMP5_WITH_TZ as timestamp5_with_tz,TIMESTAMP5_WITH_TZ_utc as 
timestamp5_with_tz_utc,\n      |        TIMESTAMP8_WITH_TZ as timestamp8_with_tz,TIMESTAMP8_WITH_TZ_utc as timestamp8_with_tz_utc,\n      |        TIMESTAMP0_WITH_LTZ as timestamp0_with_ltz,TIMESTAMP0_WITH_LTZ_utc as timestamp0_with_ltz_utc,\n      |        TIMESTAMP5_WITH_LTZ as timestamp5_with_ltz,TIMESTAMP5_WITH_LTZ_utc as timestamp5_with_ltz_utc,\n      |        TIMESTAMP8_WITH_LTZ as timestamp8_with_ltz,TIMESTAMP8_WITH_LTZ_utc as timestamp8_with_ltz_utc\n      |        from __source__\n      |\"\"\".stripMargin)\n  .targetJDBC(objectName = \"pgdb.ora_to_postgres\", params=targetParams, saveMode=SaveMode.Overwrite)\n\noracleTOPostgresConnector.stop()\n\n```\n\n### Output\n\n```commandline\n21/05/04 17:35:50 INFO targetjdbc.JDBCtoJDBC: Executing source sql query:\nSELECT\n        to_char(sys_extract_utc(systimestamp), 'YYYY-MM-DD HH24:MI:SS.FF') as ingest_ts_utc,\n        to_char(TIMESTAMP_0, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_0,\n        to_char(TIMESTAMP_5, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_5,\n        to_char(TIMESTAMP_7, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_7,\n        to_char(TIMESTAMP_9, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_9,\n        to_char(TIMESTAMP0_WITH_TZ) as timestamp0_with_tz , to_char(sys_extract_utc(TIMESTAMP0_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS') as timestamp0_with_tz_utc,\n        to_char(TIMESTAMP5_WITH_TZ) as timestamp5_with_tz , to_char(sys_extract_utc(TIMESTAMP5_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp5_with_tz_utc,\n        to_char(TIMESTAMP8_WITH_TZ) as timestamp8_with_tz , to_char(sys_extract_utc(TIMESTAMP8_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp8_with_tz_utc\n        from DBSRV.ORACLE_TIMESTAMPS\n\n21/05/04 17:35:50 INFO targetjdbc.JDBCtoJDBC: Data is saved as a temporary table by name: __source__\n21/05/04 17:35:50 INFO targetjdbc.JDBCtoJDBC: showing saved data from temporary table with name: 
__source__\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|INGEST_TS_UTC             |TIMESTAMP_0          |TIMESTAMP_5              |TIMESTAMP_7                |TIMESTAMP_9                  |TIMESTAMP0_WITH_TZ                 |TIMESTAMP0_WITH_TZ_UTC|TIMESTAMP5_WITH_TZ                       |TIMESTAMP5_WITH_TZ_UTC   |TIMESTAMP8_WITH_TZ                          |TIMESTAMP8_WITH_TZ_UTC      |\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|2021-05-04 17:35:50.620944|2021-04-07 15:15:16.0|2021-04-07 15:15:16.03356|2021-04-07 15:15:16.0335610|2021-04-07 15:15:16.033561000|07-APR-21 03.15.16 PM ASIA/CALCUTTA|2021-04-07 09:45:16   |07-APR-21 03.15.16.03356 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356|07-APR-21 03.15.16.03356100 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356100|\n|2021-05-04 17:35:50.620944|2021-04-07 15:16:51.6|2021-04-07 15:16:51.60911|2021-04-07 15:16:51.6091090|2021-04-07 15:16:51.609109000|07-APR-21 03.16.52 PM ASIA/CALCUTTA|2021-04-07 09:46:52   |07-APR-21 03.16.51.60911 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60911|07-APR-21 03.16.51.60910900 PM ASIA/CALCUTTA|2021-04-07 
09:46:51.60910900|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n\n21/05/04 17:35:50 INFO targetjdbc.JDBCtoJDBC: Data after transformation using the SQL :\nSELECT\n        TO_TIMESTAMP(ingest_ts_utc) as ingest_ts_utc,\n        TIMESTAMP_0 as timestamp_0,\n        TIMESTAMP_5 as timestamp_5,\n        TIMESTAMP_7 as timestamp_7,\n        TIMESTAMP_9 as timestamp_9,\n        TIMESTAMP0_WITH_TZ as timestamp0_with_tz,TIMESTAMP0_WITH_TZ_utc as timestamp0_with_tz_utc,\n        TIMESTAMP5_WITH_TZ as timestamp5_with_tz,TIMESTAMP5_WITH_TZ_utc as timestamp5_with_tz_utc,\n        TIMESTAMP8_WITH_TZ as timestamp8_with_tz,TIMESTAMP8_WITH_TZ_utc as timestamp8_with_tz_utc\n        from __source__\n\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|ingest_ts_utc             |timestamp_0          |timestamp_5              |timestamp_7                |timestamp_9                  |timestamp0_with_tz                 |timestamp0_with_tz_utc|timestamp5_with_tz                       |timestamp5_with_tz_utc   |timestamp8_with_tz                          |timestamp8_with_tz_utc      
|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|2021-05-04 17:35:50.818643|2021-04-07 15:15:16.0|2021-04-07 15:15:16.03356|2021-04-07 15:15:16.0335610|2021-04-07 15:15:16.033561000|07-APR-21 03.15.16 PM ASIA/CALCUTTA|2021-04-07 09:45:16   |07-APR-21 03.15.16.03356 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356|07-APR-21 03.15.16.03356100 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356100|\n|2021-05-04 17:35:50.818643|2021-04-07 15:16:51.6|2021-04-07 15:16:51.60911|2021-04-07 15:16:51.6091090|2021-04-07 15:16:51.609109000|07-APR-21 03.16.52 PM ASIA/CALCUTTA|2021-04-07 09:46:52   |07-APR-21 03.16.51.60911 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60911|07-APR-21 03.16.51.60910900 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60910900|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n\n21/05/04 17:35:50 INFO targetjdbc.JDBCtoJDBC: Writing data to target table: pgdb.ora_to_postgres\n21/05/04 17:35:56 INFO targetjdbc.JDBCtoJDBC: Showing data in target table  : pgdb.ora_to_postgres\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|ingest_ts_utc             |timestamp_0          |timestamp_5              |timestamp_7                
|timestamp_9                  |timestamp0_with_tz                 |timestamp0_with_tz_utc|timestamp5_with_tz                       |timestamp5_with_tz_utc   |timestamp8_with_tz                          |timestamp8_with_tz_utc      |\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|2021-05-04 17:35:52.709503|2021-04-07 15:15:16.0|2021-04-07 15:15:16.03356|2021-04-07 15:15:16.0335610|2021-04-07 15:15:16.033561000|07-APR-21 03.15.16 PM ASIA/CALCUTTA|2021-04-07 09:45:16   |07-APR-21 03.15.16.03356 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356|07-APR-21 03.15.16.03356100 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356100|\n|2021-05-04 17:35:52.709503|2021-04-07 15:16:51.6|2021-04-07 15:16:51.60911|2021-04-07 15:16:51.6091090|2021-04-07 15:16:51.609109000|07-APR-21 03.16.52 PM ASIA/CALCUTTA|2021-04-07 09:46:52   |07-APR-21 03.16.51.60911 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60911|07-APR-21 03.16.51.60910900 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60910900|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n```\nMore connectors from other JDBC sources to JDBC destinations are available [here](https://romans-weapon.github.io/spear-framework/#jdbc-source).\n\n\n### Streaming source\n\n#### Kafka to Postgres Connector\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.types.{StringType, StructField, StructType}\nimport 
org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}\n\nval targetParams = Map(\n  \"driver\" -\u003e \"org.postgresql.Driver\",\n  \"user\" -\u003e \"postgres_user\",\n  \"password\" -\u003e \"mysecretpassword\",\n  \"url\" -\u003e \"jdbc:postgresql://localhost:5432/pgdb\"\n)\n\nval streamTOPostgres=SpearConnector\n   .createConnector(name=\"StreamKafkaToPostgresconnector\")\n   .source(sourceType = \"stream\",sourceFormat = \"kafka\")\n   .target(targetType = \"relational\",targetFormat = \"jdbc\")\n   .getConnector\n\nval schema = StructType(\n    Array(StructField(\"id\", StringType),\n      StructField(\"name\", StringType)\n    ))\n\nstreamTOPostgres\n    .source(sourceObject = \"stream_topic\",Map(\"kafka.bootstrap.servers\"-\u003e \"kafka:9092\",\"failOnDataLoss\"-\u003e\"true\",\"startingOffsets\"-\u003e \"earliest\"),schema)\n    .saveAs(\"__tmp2__\")\n    .transformSql(\"select cast (id as INT) as id, name from __tmp2__\")\n    .targetJDBC(objectName=\"person\", params=targetParams, SaveMode.Append)\n\nstreamTOPostgres.stop()\n```\n\n### Target FS (HDFS)\n\n#### Postgres to Hive Connector\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.spark.sql.SaveMode\nimport org.apache.log4j.{Level, Logger}\nimport java.util.Properties\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\n\nval postgresToHiveConnector = SpearConnector\n  .createConnector(name=\"PostgrestoHiveConnector\")\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")\n  .target(targetType = \"FS\", targetFormat = \"parquet\")\n  .getConnector\n  \npostgresToHiveConnector.setVeboseLogging(true)  \n\npostgresToHiveConnector\n  .source(\"source_db.instance\", Map(\"driver\" -\u003e \"org.postgresql.Driver\", \"user\" -\u003e \"postgres\", \"password\" -\u003e \"test\", \"url\" -\u003e \"jdbc:postgresql://postgres-host:5433/source_db\"))\n  .saveAs(\"__tmp__\")\n  .transformSql(\n    \"\"\"\n      |select cast( uuid as string) as uuid,\n      
|cast( type_id as bigint ) as type_id, \n      |cast( factory_message_process_id as bigint) as factory_message_process_id,\n      |cast( factory_uuid as string ) as factory_uuid,\n      |cast( factory_id as bigint ) as factory_id,\n      |cast( engine_id as bigint ) as engine_id,\n      |cast( topic as string ) as topic,\n      |cast( status_code_id as int) as status_code_id,\n      |cast( cru_by as string ) as cru_by,cast( cru_ts as timestamp) as cru_ts \n      |from __tmp__\"\"\".stripMargin)\n  .targetFS(destinationFilePath = \"/tmp/ingest_test.db\", saveAsTable = \"ingest_test.postgres_data\", saveMode=SaveMode.Overwrite)\n\npostgresToHiveConnector.stop()\n```\n\n### Output\n\n```commandline\n21/05/01 10:39:20 INFO targetFS.JDBCtoFS: Reading source data from table: source_db.instance\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|uuid                                 |type_id    |factory_message_process_id   |factory_uuid                        |factory_id    |   engine_id      |topic                      |status_code_id|cru_by|cru_ts                    |\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|null                                |1          |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |factory_2_2                |5             |ABCDE |2021-04-27 10:17:37.529195|\n|null                                |1          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |factory_1_1                |5             |ABCDE |2021-04-27 10:17:37.533318|\n|null                                |1      
    |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |factory_3_3                |5             |ABCDE |2021-04-27 10:17:37.535323|\n|59d9b23e-ff93-4351-af7e-0a95ec4fde65|10         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_authtoken           |5             |ABCDE |2021-04-27 10:17:50.441147|\n|111eeff6-c61d-402e-9e70-615cf80d3016|10         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_authtoken           |5             |ABCDE |2021-04-27 10:18:02.439379|\n|2870ff43-73c9-424e-9f3c-c89ac4dda278|10         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_authtoken           |5             |ABCDE |2021-04-27 10:18:14.5242  |\n|58fe7575-9c4f-471e-8893-9bc39b4f1be4|18         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_error              |5             |ABCDE |2021-04-27 10:21:17.098984|\n|534a2af0-af74-4633-8603-926070afd76f|16         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_filter_resolver_jdbc|5             |ABCDE |2021-04-27 10:21:17.223042|\n|9971130b-9ae1-4a53-89ce-aa1932534956|18         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_error               |5             |ABCDE |2021-04-27 10:21:17.437489|\n|6db9c72f-85b0-4254-bc2f-09dc1e63e6f3|9          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_flowcontroller      |5             |ABCDE |2021-04-27 
10:21:17.780313|\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\nonly showing top 10 rows\n\n21/05/01 10:39:31 INFO targetFS.JDBCtoFS: Data is saved as a temporary table by name: __tmp__\n21/05/01 10:39:31 INFO targetFS.JDBCtoFS: showing saved data from temporary table with name: __tmp__\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|uuid                                |type_id    |factory_message_process_id    |factory_uuid                        |factory_id    | engine_id        |topic                      |status_code_id|cru_by|cru_ts                    |\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|null                                |1          |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |factory_2_2                |5             |ABCDE |2021-04-27 10:17:37.529195|\n|null                                |1          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |factory_1_1                |5             |ABCDE |2021-04-27 10:17:37.533318|\n|null                                |1          |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |factory_3_3                |5             |ABCDE |2021-04-27 10:17:37.535323|\n|59d9b23e-ff93-4351-af7e-0a95ec4fde65|10         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1      
           |bale_1_authtoken           |5             |ABCDE |2021-04-27 10:17:50.441147|\n|111eeff6-c61d-402e-9e70-615cf80d3016|10         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_authtoken           |5             |ABCDE |2021-04-27 10:18:02.439379|\n|2870ff43-73c9-424e-9f3c-c89ac4dda278|10         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_authtoken           |5             |ABCDE |2021-04-27 10:18:14.5242  |\n|58fe7575-9c4f-471e-8893-9bc39b4f1be4|18         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_error            |5             |ABCDE |2021-04-27 10:21:17.098984|\n|534a2af0-af74-4633-8603-926070afd76f|16         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_filter_resolver_jdbc|5             |ABCDE |2021-04-27 10:21:17.223042|\n|9971130b-9ae1-4a53-89ce-aa1932534956|18         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_error            |5             |ABCDE |2021-04-27 10:21:17.437489|\n|6db9c72f-85b0-4254-bc2f-09dc1e63e6f3|9          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_flowcontroller      |5             |ABCDE |2021-04-27 10:21:17.780313|\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\nonly showing top 10 rows\n\n21/05/01 10:39:33 INFO targetFS.JDBCtoFS: Data after transformation using the SQL :\nselect cast( uuid as string) as uuid,\ncast( type_id as bigint ) as type_id,\ncast( factory_message_process_id as bigint) as factory_message_process_id,\ncast( factory_uuid as string ) 
as factory_uuid,\ncast( factory_id as bigint ) as factory_id,\ncast( workflow_engine_id as bigint ) as workflow_engine_id,\ncast( topic as string ) as topic,\ncast( status_code_id as int) as status_code_id,\ncast( cru_by as string ) as cru_by,cast( cru_ts as timestamp) as cru_ts\nfrom __tmp__\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|uuid                                |type_id|factory_message_process_id        |factory_uuid                        |factory_id    |engine_id         |topic                      |status_code_id|cru_by|cru_ts                    |\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|null                                |1          |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |factory_2_2                |5             |ABCDE |2021-04-27 10:17:37.529195|\n|null                                |1          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |factory_1_1                |5             |ABCDE |2021-04-27 10:17:37.533318|\n|null                                |1          |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |factory_3_3                |5             |ABCDE |2021-04-27 10:17:37.535323|\n|59d9b23e-ff93-4351-af7e-0a95ec4fde65|10         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_authtoken           |5             |ABCDE |2021-04-27 10:17:50.441147|\n|111eeff6-c61d-402e-9e70-615cf80d3016|10         |1619518657679                 
|b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_authtoken           |5             |ABCDE |2021-04-27 10:18:02.439379|\n|2870ff43-73c9-424e-9f3c-c89ac4dda278|10         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_authtoken           |5             |ABCDE |2021-04-27 10:18:14.5242  |\n|58fe7575-9c4f-471e-8893-9bc39b4f1be4|18         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_error               |5             |ABCDE |2021-04-27 10:21:17.098984|\n|534a2af0-af74-4633-8603-926070afd76f|16         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_filter_resolver_jdbc|5             |ABCDE |2021-04-27 10:21:17.223042|\n|9971130b-9ae1-4a53-89ce-aa1932534956|18         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_error               |5             |ABCDE |2021-04-27 10:21:17.437489|\n|6db9c72f-85b0-4254-bc2f-09dc1e63e6f3|9          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_flowcontroller      |5             |ABCDE |2021-04-27 10:21:17.780313|\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\nonly showing top 10 rows\n\n21/05/01 10:39:35 INFO targetFS.JDBCtoFS: Writing data to target file: /tmp/ingest_test.db\n21/05/01 10:39:35 INFO targetFS.JDBCtoFS: Saving data to table:ingest_test.postgres_data\n21/05/01 10:39:35 INFO targetFS.JDBCtoFS: Target Data in 
table:ingest_test.postgres_data\n\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|uuid                                |type_id    |factory_message_process_id    |factory_uuid                        |factory_id    |        engine_id |topic                      |status_code_id|cru_by|cru_ts                    |\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\n|null                                |1          |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |factory_2_2                |5             |ABCDE |2021-04-27 10:17:37.529195|\n|null                                |1          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |factory_1_1                |5             |ABCDE |2021-04-27 10:17:37.533318|\n|null                                |1          |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |factory_3_3                |5             |ABCDE |2021-04-27 10:17:37.535323|\n|59d9b23e-ff93-4351-af7e-0a95ec4fde65|10         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_authtoken           |5             |ABCDE |2021-04-27 10:17:50.441147|\n|111eeff6-c61d-402e-9e70-615cf80d3016|10         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_authtoken           |5             |ABCDE |2021-04-27 10:18:02.439379|\n|2870ff43-73c9-424e-9f3c-c89ac4dda278|10         |1619518658043                 
|5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_authtoken           |5             |ABCDE |2021-04-27 10:18:14.5242  |\n|58fe7575-9c4f-471e-8893-9bc39b4f1be4|18         |1619518658043                 |5ef4bcb3-f064-4532-ad4f-5e8b68c33f70|3             |3                 |bale_3_error               |5             |ABCDE |2021-04-27 10:21:17.098984|\n|534a2af0-af74-4633-8603-926070afd76f|16         |1619518657679                 |b218b4a2-2723-4a51-a83b-1d9e5e1c79ff|2             |2                 |bale_2_filter_resolver_jdbc|5             |ABCDE |2021-04-27 10:21:17.223042|\n|9971130b-9ae1-4a53-89ce-aa1932534956|18         |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_error               |5             |ABCDE |2021-04-27 10:21:17.437489|\n|6db9c72f-85b0-4254-bc2f-09dc1e63e6f3|9          |1619518657481                 |ec65395c-fdbc-4697-ac91-bc72447ae7cf|1             |1                 |bale_1_flowcontroller      |5             |ABCDE |2021-04-27 10:21:17.780313|\n+------------------------------------+-----------+------------------------------+------------------------------------+--------------+------------------+---------------------------+--------------+------+--------------------------+\nonly showing top 10 rows\n```\n\nMore connectors to the HDFS target file system are available [here](https://romans-weapon.github.io/spear-framework/#target-fs-hdfs).\n\n### Streaming source\n\n#### Kafka to Hive Connector\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.types.{StringType, StructField, StructType}\nimport org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}\n\nval streamTOHdfs=SpearConnector\n  .createConnector(name=\"StreamKafkaToPostgresconnector\")\n  .source(sourceType = \"stream\",sourceFormat = \"kafka\")\n  .target(targetType = \"FS\",targetFormat = \"parquet\")\n  
.getConnector\n\nval schema = StructType(\n  Array(StructField(\"id\", StringType),\n    StructField(\"name\", StringType)\n  ))\n\nstreamTOHdfs\n  .source(sourceObject = \"stream_topic\",Map(\"kafka.bootstrap.servers\"-\u003e \"kafka:9092\",\"failOnDataLoss\"-\u003e\"true\",\"startingOffsets\"-\u003e \"earliest\"),schema)\n  .saveAs(\"__tmp2__\")\n  .transformSql(\"select cast (id as INT) as id, name from __tmp2__\")\n  .targetFS(destinationFilePath = \"/tmp/ingest_test.db\", saveAsTable = \"ingest_test.ora_data\", saveMode=SaveMode.Append)\n\nstreamTOHdfs.stop()\n```\n\n\n\n### Target FS (Cloud)\n\n#### Oracle To S3 Connector\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.SaveMode\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\n\nspark.sparkContext.hadoopConfiguration.set(\"fs.s3a.access.key\", \"*****\")\nspark.sparkContext.hadoopConfiguration.set(\"fs.s3a.secret.key\", \"*****\")\n\n\nval oracleTOS3Connector = SpearConnector\n  .createConnector(\"ORAtoS3\")\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")\n  .target(targetType = \"FS\", targetFormat = \"parquet\")\n  .getConnector\n\noracleTOS3Connector.setVeboseLogging(true)  \noracleTOS3Connector\n  .sourceSql(Map(\"driver\" -\u003e \"oracle.jdbc.driver.OracleDriver\", \"user\" -\u003e \"user\", \"password\" -\u003e \"pass\", \"url\" -\u003e \"jdbc:oracle:thin:@ora-host:1521:orcl\"),\n    \"\"\"\n      |SELECT\n      |        to_char(sys_extract_utc(systimestamp), 'YYYY-MM-DD HH24:MI:SS.FF') as ingest_ts_utc,\n      |        to_char(TIMESTAMP_0, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_0,\n      |        to_char(TIMESTAMP_5, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_5,\n      |        to_char(TIMESTAMP_7, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_7,\n      |        to_char(TIMESTAMP_9, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_9,\n      |        to_char(TIMESTAMP0_WITH_TZ) as timestamp0_with_tz , 
to_char(sys_extract_utc(TIMESTAMP0_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS') as timestamp0_with_tz_utc,\n      |        to_char(TIMESTAMP5_WITH_TZ) as timestamp5_with_tz , to_char(sys_extract_utc(TIMESTAMP5_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp5_with_tz_utc,\n      |        to_char(TIMESTAMP8_WITH_TZ) as timestamp8_with_tz , to_char(sys_extract_utc(TIMESTAMP8_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp8_with_tz_utc\n      |        from DBSRV.ORACLE_TIMESTAMPS\n      |\"\"\".stripMargin)\n  .saveAs(\"__source__\")\n  .transformSql(\n    \"\"\"\n      |SELECT\n      |        TO_TIMESTAMP(ingest_ts_utc) as ingest_ts_utc,\n      |        TIMESTAMP_0 as timestamp_0,\n      |        TIMESTAMP_5 as timestamp_5,\n      |        TIMESTAMP_7 as timestamp_7,\n      |        TIMESTAMP_9 as timestamp_9,\n      |        TIMESTAMP0_WITH_TZ as timestamp0_with_tz,TIMESTAMP0_WITH_TZ_utc as timestamp0_with_tz_utc,\n      |        TIMESTAMP5_WITH_TZ as timestamp5_with_tz,TIMESTAMP5_WITH_TZ_utc as timestamp5_with_tz_utc,\n      |        TIMESTAMP8_WITH_TZ as timestamp8_with_tz,TIMESTAMP8_WITH_TZ_utc as timestamp8_with_tz_utc\n      |        from __source__\n      |\"\"\".stripMargin)\n  .targetFS(destinationFilePath=\"s3a://destination/data\", saveMode=SaveMode.Overwrite)\n\noracleTOS3Connector.stop()\n```\n\n### Output\n\n```commandline\n21/05/08 08:46:11 INFO targetFS.JDBCtoFS: Executing source sql query:\nSELECT\n        to_char(sys_extract_utc(systimestamp), 'YYYY-MM-DD HH24:MI:SS.FF') as ingest_ts_utc,\n        to_char(TIMESTAMP_0, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_0,\n        to_char(TIMESTAMP_5, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_5,\n        to_char(TIMESTAMP_7, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_7,\n        to_char(TIMESTAMP_9, 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp_9,\n        to_char(TIMESTAMP0_WITH_TZ) as timestamp0_with_tz , to_char(sys_extract_utc(TIMESTAMP0_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS') as timestamp0_with_tz_utc,\n        
to_char(TIMESTAMP5_WITH_TZ) as timestamp5_with_tz , to_char(sys_extract_utc(TIMESTAMP5_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp5_with_tz_utc,\n        to_char(TIMESTAMP8_WITH_TZ) as timestamp8_with_tz , to_char(sys_extract_utc(TIMESTAMP8_WITH_TZ), 'YYYY-MM-DD HH24:MI:SS.FF') as timestamp8_with_tz_utc\n        from DBSRV.ORACLE_TIMESTAMPS\n\n21/05/08 08:46:11 INFO targetFS.JDBCtoFS: Data is saved as a temporary table by name: __source__\n21/05/08 08:46:11 INFO targetFS.JDBCtoFS: Showing saved data from temporary table with name: __source__\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|INGEST_TS_UTC             |TIMESTAMP_0          |TIMESTAMP_5              |TIMESTAMP_7                |TIMESTAMP_9                  |TIMESTAMP0_WITH_TZ                 |TIMESTAMP0_WITH_TZ_UTC|TIMESTAMP5_WITH_TZ                       |TIMESTAMP5_WITH_TZ_UTC   |TIMESTAMP8_WITH_TZ                          |TIMESTAMP8_WITH_TZ_UTC      |\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|2021-05-08 08:46:12.178719|2021-04-07 15:15:16.0|2021-04-07 15:15:16.03356|2021-04-07 15:15:16.0335610|2021-04-07 15:15:16.033561000|07-APR-21 03.15.16 PM ASIA/CALCUTTA|2021-04-07 09:45:16   |07-APR-21 03.15.16.03356 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356|07-APR-21 03.15.16.03356100 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356100|\n|2021-05-08 08:46:12.178719|2021-04-07 15:16:51.6|2021-04-07 15:16:51.60911|2021-04-07 
15:16:51.6091090|2021-04-07 15:16:51.609109000|07-APR-21 03.16.52 PM ASIA/CALCUTTA|2021-04-07 09:46:52   |07-APR-21 03.16.51.60911 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60911|07-APR-21 03.16.51.60910900 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60910900|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n\n21/05/08 08:46:12 INFO targetFS.JDBCtoFS: Data after transformation using the SQL :\nSELECT\n        TO_TIMESTAMP(ingest_ts_utc) as ingest_ts_utc,\n        TIMESTAMP_0 as timestamp_0,\n        TIMESTAMP_5 as timestamp_5,\n        TIMESTAMP_7 as timestamp_7,\n        TIMESTAMP_9 as timestamp_9,\n        TIMESTAMP0_WITH_TZ as timestamp0_with_tz,TIMESTAMP0_WITH_TZ_utc as timestamp0_with_tz_utc,\n        TIMESTAMP5_WITH_TZ as timestamp5_with_tz,TIMESTAMP5_WITH_TZ_utc as timestamp5_with_tz_utc,\n        TIMESTAMP8_WITH_TZ as timestamp8_with_tz,TIMESTAMP8_WITH_TZ_utc as timestamp8_with_tz_utc\n        from __source__\n\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|ingest_ts_utc             |timestamp_0          |timestamp_5              |timestamp_7                |timestamp_9                  |timestamp0_with_tz                 |timestamp0_with_tz_utc|timestamp5_with_tz                       |timestamp5_with_tz_utc   |timestamp8_with_tz                          |timestamp8_with_tz_utc      
|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n|2021-05-08 08:46:12.438578|2021-04-07 15:15:16.0|2021-04-07 15:15:16.03356|2021-04-07 15:15:16.0335610|2021-04-07 15:15:16.033561000|07-APR-21 03.15.16 PM ASIA/CALCUTTA|2021-04-07 09:45:16   |07-APR-21 03.15.16.03356 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356|07-APR-21 03.15.16.03356100 PM ASIA/CALCUTTA|2021-04-07 09:45:16.03356100|\n|2021-05-08 08:46:12.438578|2021-04-07 15:16:51.6|2021-04-07 15:16:51.60911|2021-04-07 15:16:51.6091090|2021-04-07 15:16:51.609109000|07-APR-21 03.16.52 PM ASIA/CALCUTTA|2021-04-07 09:46:52   |07-APR-21 03.16.51.60911 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60911|07-APR-21 03.16.51.60910900 PM ASIA/CALCUTTA|2021-04-07 09:46:51.60910900|\n+--------------------------+---------------------+-------------------------+---------------------------+-----------------------------+-----------------------------------+----------------------+-----------------------------------------+-------------------------+--------------------------------------------+----------------------------+\n21/05/08 08:46:12 INFO targetFS.JDBCtoFS: Writing data to target path: s3a://destination/data\n21/05/08 08:47:06 INFO targetFS.JDBCtoFS: Saving data to path:s3a://destination/data\n\nData at S3:\n===========\nuser@node:~$ aws s3 ls s3://destination/data\n2021-05-08 12:10:09          0 _SUCCESS\n2021-05-08 12:09:59       4224 part-00000-71fad52e-404d-422c-a6af-7889691bc506-c000.snappy.parquet\n\n```\n\nMore connectors targeting cloud file systems (S3, GCS, ADLS, etc.) are available [here](https://romans-weapon.github.io/spear-framework/#target-fs-cloud).\n\n## Target NOSQL\n\n### File source\n\n#### CSV to MongoDB Connector\n\n```scala\nimport 
com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.SaveMode\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\nval mongoProps = Map(\n  \"uri\" -\u003e \"mongodb://mongo:27017\"\n)\n\nval csvMongoConnector = SpearConnector\n  .createConnector(\"csv-mongo\")\n  .source(sourceType = \"file\", sourceFormat = \"csv\")\n  .target(targetType = \"nosql\", targetFormat = \"mongo\")\n  .getConnector\ncsvMongoConnector.setVeboseLogging(true)\ncsvMongoConnector\n  .source(sourceObject = \"file:///opt/spear-framework/data/us-election-2012-results-by-county.csv\", Map(\"header\" -\u003e \"true\", \"inferSchema\" -\u003e \"true\"))\n  .saveAs(\"__tmp__\")\n  .transformSql(\n    \"\"\"select state_code,party,\n      |sum(votes) as total_votes\n      |from __tmp__\n      |group by state_code,party\"\"\".stripMargin)\n  .targetNoSQL(objectName = \"ingest.csvdata\", props = mongoProps, saveMode = SaveMode.Overwrite)\ncsvMongoConnector.stop()\n```\n\n\n##### Output\n\n````commandline\n\n21/05/30 11:18:02 INFO targetNoSQL.FilettoNoSQL: Connector:csv-mongo to Target:NoSQL DB with Format:mongo from Source:file:///opt/spear-framework/data/us-election-2012-results-by-county.csv with Format:csv started running !!\n21/05/30 11:18:04 INFO targetNoSQL.FilettoNoSQL: Reading source file: file:///opt/spear-framework/data/us-election-2012-results-by-county.csv with format: csv status:success\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\n|country_id|state_code|country_name|country_total_votes|party|first_name|last_name|votes|\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\n|1         |AK        |Alasaba     |220596             |Dem  |Barack    |Obama    |91696|\n|2         |AK        |Akaskak     |220596             |Dem  |Barack    |Obama    |91696|\n|3         |AL        |Autauga     |23909              |Dem  |Barack    |Obama    |6354 
|\n|4         |AK        |Akaska      |220596             |Dem  |Barack    |Obama    |91696|\n|5         |AL        |Baldwin     |84988              |Dem  |Barack    |Obama    |18329|\n|6         |AL        |Barbour     |11459              |Dem  |Barack    |Obama    |5873 |\n|7         |AL        |Bibb        |8391               |Dem  |Barack    |Obama    |2200 |\n|8         |AL        |Blount      |23980              |Dem  |Barack    |Obama    |2961 |\n|9         |AL        |Bullock     |5318               |Dem  |Barack    |Obama    |4058 |\n|10        |AL        |Butler      |9483               |Dem  |Barack    |Obama    |4367 |\n+----------+----------+------------+-------------------+-----+----------+---------+-----+\nonly showing top 10 rows\n\n21/05/30 11:18:04 INFO targetNoSQL.FilettoNoSQL: Saving data as temporary table:__tmp__ success\n21/05/30 11:18:04 INFO targetNoSQL.FilettoNoSQL: Executing tranformation sql: select state_code,party,\nsum(votes) as total_votes\nfrom __tmp__\ngroup by state_code,party status :success\n+----------+-----+-----------+\n|state_code|party|total_votes|\n+----------+-----+-----------+\n|AL        |Dem  |793620     |\n|NY        |GOP  |2226637    |\n|MI        |CST  |16792      |\n|ID        |GOP  |420750     |\n|ID        |Ind  |2495       |\n|WA        |CST  |7772       |\n|HI        |Grn  |3121       |\n|MS        |RP   |969        |\n|MN        |Grn  |13045      |\n|ID        |Dem  |212560     |\n+----------+-----+-----------+\nonly showing top 10 rows\n\n21/05/30 11:18:08 INFO targetNoSQL.FilettoNoSQL: Write data to object ingest.csvdata completed with status:success\n+----------+-----+-----------+\n|state_code|party|total_votes|\n+----------+-----+-----------+\n|AL        |Dem  |793620     |\n|NY        |GOP  |2226637    |\n|MI        |CST  |16792      |\n|ID        |GOP  |420750     |\n|ID        |Ind  |2495       |\n|WA        |CST  |7772       |\n|HI        |Grn  |3121       |\n|MS        |RP   |969        |\n|MN        
|Grn  |13045      |\n|ID        |Dem  |212560     |\n+----------+-----+-----------+\nonly showing top 10 rows\n\n````\nOther connectors with NoSQL as the destination are available [here](https://romans-weapon.github.io/spear-framework/#target-nosql).\n\n\n\n## Target GraphDB\nSpear can also write connectors that target graph databases from different sources. This section has example connectors from different sources to graph databases. The target properties or options for writing to a graph database are documented [here](https://neo4j.com/developer/spark/writing/).\n\n### File Source\n\n#### CSV to Neo4j Connector\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.{Column, DataFrame, SaveMode}\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\n\nval neo4jParams = Map(\"url\" -\u003e \"bolt://host:7687\",\n    \"authentication.basic.username\" -\u003e \"neo4j\",\n    \"authentication.basic.password\" -\u003e \"****\"\n  )\n\nval csvtoNeo4j = SpearConnector\n    .createConnector(\"CSV-to-Neo4j\")\n    .source(\"file\", \"csv\")\n    .target(\"graph\", \"neo4j\")\n    .getConnector\ncsvtoNeo4j.setVeboseLogging(true)\n\ncsvtoNeo4j\n    .source(sourceObject = \"file:///opt/spear-framework/data/FinancialSample.csv\", Map(\"header\" -\u003e \"true\", \"inferSchema\" -\u003e \"true\"))\n    .saveAs(\"__STAGE__\")\n    .transformSql(\n      \"\"\"\n        |select Segment,Country,Product\n        |`Units Sold`,`Manufacturing Price`\n        |from __STAGE__\"\"\".stripMargin)\n    .targetGraphDB(objectName = \"finance\", props = neo4jParams, saveMode = SaveMode.Overwrite)\ncsvtoNeo4j.stop()    \n```\n\n##### Output\n```commandline\n21/06/17 13:03:51 INFO targetGraphDB.FiletoGraphDB: Connector:CSV-to-Neo4j to Target:GraphDB with Format:neo4j from Source:file:///opt/spear-framework/data/FinancialSample.csv with Format:csv started running !!\n21/06/17 
13:03:51 INFO targetGraphDB.FiletoGraphDB: Reading source file: file:///opt/spear-framework/data/FinancialSample.csv with format: csv status:success\n+----------------+-------+-----------+-------------+----------+-------------------+----------+------------+---------+------------+------------+------------+---------+------------+------------+----+\n|Segment         |Country|Product    |Discount Band|Units Sold|Manufacturing Price|Sale Price|Gross Sales |Discounts|Sales       |COGS        |Profit      |Date     |Month Number| Month Name |Year|\n+----------------+-------+-----------+-------------+----------+-------------------+----------+------------+---------+------------+------------+------------+---------+------------+------------+----+\n|Government      |Canada | Carretera | None        | 1,618.50 |3.0                |20.0      | 32,370.00  | -       | 32,370.00  | 16,185.00  | 16,185.00  |1/1/2014 |1           | January    |2014|\n|Government      |Germany| Carretera | None        | 1,321.00 |3.0                |20.0      | 26,420.00  | -       | 26,420.00  | 13,210.00  | 13,210.00  |1/1/2014 |1           | January    |2014|\n|Midmarket       |France | Carretera | None        | 2,178.00 |3.0                |15.0      | 32,670.00  | -       | 32,670.00  | 21,780.00  | 10,890.00  |6/1/2014 |6           | June       |2014|\n|Midmarket       |Germany| Carretera | None        | 888.00   |3.0                |15.0      | 13,320.00  | -       | 13,320.00  | 8,880.00   | 4,440.00   |6/1/2014 |6           | June       |2014|\n|Midmarket       |Mexico | Carretera | None        | 2,470.00 |3.0                |15.0      | 37,050.00  | -       | 37,050.00  | 24,700.00  | 12,350.00  |6/1/2014 |6           | June       |2014|\n|Government      |Germany| Carretera | None        | 1,513.00 |3.0                |350.0     | 529,550.00 | -       | 529,550.00 | 393,380.00 | 136,170.00 |12/1/2014|12          | December   |2014|\n|Midmarket       |Germany| Montana   | None        | 
921.00   |5.0                |15.0      | 13,815.00  | -       | 13,815.00  | 9,210.00   | 4,605.00   |3/1/2014 |3           | March      |2014|\n|Channel Partners|Canada | Montana   | None        | 2,518.00 |5.0                |12.0      | 30,216.00  | -       | 30,216.00  | 7,554.00   | 22,662.00  |6/1/2014 |6           | June       |2014|\n|Government      |France | Montana   | None        | 1,899.00 |5.0                |20.0      | 37,980.00  | -       | 37,980.00  | 18,990.00  | 18,990.00  |6/1/2014 |6           | June       |2014|\n|Channel Partners|Germany| Montana   | None        | 1,545.00 |5.0                |12.0      | 18,540.00  | -       | 18,540.00  | 4,635.00   | 13,905.00  |6/1/2014 |6           | June       |2014|\n+----------------+-------+-----------+-------------+----------+-------------------+----------+------------+---------+------------+------------+------------+---------+------------+------------+----+\nonly showing top 10 rows\n\n21/06/17 13:03:51 INFO targetGraphDB.FiletoGraphDB: Saving data as temporary table:__STAGE__ success\n21/06/17 13:03:51 INFO targetGraphDB.FiletoGraphDB: Executing transformation sql:\nselect Segment,Country,Product\n`Units Sold`,`Manufacturing Price`\nfrom __STAGE__ status :success\n+----------------+-------+-----------+-------------------+\n|Segment         |Country|Units Sold |Manufacturing Price|\n+----------------+-------+-----------+-------------------+\n|Government      |Canada | Carretera |3.0                |\n|Government      |Germany| Carretera |3.0                |\n|Midmarket       |France | Carretera |3.0                |\n|Midmarket       |Germany| Carretera |3.0                |\n|Midmarket       |Mexico | Carretera |3.0                |\n|Government      |Germany| Carretera |3.0                |\n|Midmarket       |Germany| Montana   |5.0                |\n|Channel Partners|Canada | Montana   |5.0                |\n|Government      |France | Montana   |5.0                |\n|Channel 
Partners|Germany| Montana   |5.0                |\n+----------------+-------+-----------+-------------------+\nonly showing top 10 rows\n\n21/06/17 13:03:52 INFO targetGraphDB.FiletoGraphDB: Write data to object:finance completed with status:success\n+----------------+-------+-----------+-------------------+\n|Segment         |Country|Units Sold |Manufacturing Price|\n+----------------+-------+-----------+-------------------+\n|Government      |Canada | Carretera |3.0                |\n|Government      |Germany| Carretera |3.0                |\n|Midmarket       |France | Carretera |3.0                |\n|Midmarket       |Germany| Carretera |3.0                |\n|Midmarket       |Mexico | Carretera |3.0                |\n|Government      |Germany| Carretera |3.0                |\n|Midmarket       |Germany| Montana   |5.0                |\n|Channel Partners|Canada | Montana   |5.0                |\n|Government      |France | Montana   |5.0                |\n|Channel Partners|Germany| Montana   |5.0                |\n+----------------+-------+-----------+-------------------+\nonly showing top 10 rows\n```\nOther connectors with a graph database as the destination are available [here](https://romans-weapon.github.io/spear-framework/#target-graphdb).\n\n\n## Other Functionalities of Spear\nThis section describes other functionality you can use with Spear.\n\n### Merge using executeQuery API\nWhen you want to merge or join two sources of the same type and then transform and load the resulting data, you can use Spear's executeQuery() function. An example is shown below.\n\n```scala\n\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport org.apache.spark.sql.SaveMode\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\nval postgresToHiveConnector = SpearConnector\n  .createConnector(\"PostgresToHiveConnector\")\n  .source(sourceType = \"relational\", sourceFormat = \"jdbc\")\n  .target(targetType = \"FS\", targetFormat = 
\"parquet\")\n  .getConnector\n\npostgresToHiveConnector\n  .source(\"table1\", Map(\"driver\" -\u003e \"org.postgresql.Driver\", \"user\" -\u003e \"postgres_user\", \"password\" -\u003e \"mysecretpassword\", \"url\" -\u003e \"jdbc:postgresql://postgres:5432/pgdb\"))\n  .saveAs(\"tmp\")\n\npostgresToHiveConnector\n  .source(\"table2\", Map(\"driver\" -\u003e \"org.postgresql.Driver\", \"user\" -\u003e \"postgres_user\", \"password\" -\u003e \"mysecretpassword\", \"url\" -\u003e \"jdbc:postgresql://postgres:5432/pgdb\"))\n  .saveAs(\"tmp2\")\n  \n\npostgresToHiveConnector.executeQuery(\n  \"\"\"\n  //execute join query between the loaded sources\n   \"\"\".stripMargin)\n  .saveAs(\"result\")\n  .transformSql(\n    \"\"\"\n    //transform sql\n    \"\"\".stripMargin)\n  .targetFS(destinationFilePath = \"/tmp/ingest\", saveAsTable = \"target\", saveMode = SaveMode.Overwrite)\n```\nSee a more detailed explanation of the executeQuery API, with diagrams, [here](https://romans-weapon.github.io/spear-framework/#merge-using-executequery-api).\n\n### Write to multi-targets using branch API\nWhen using the multi-target API, make sure you specify the target type when defining each destination.\n\n```scala\nimport com.github.edge.roman.spear.SpearConnector\nimport org.apache.log4j.{Level, Logger}\nimport java.util.Properties\nimport org.apache.spark.sql.{Column, DataFrame, SaveMode}\n\nLogger.getLogger(\"com.github\").setLevel(Level.INFO)\nval properties = new Properties()\nproperties.put(\"driver\", \"org.postgresql.Driver\")\nproperties.put(\"user\", \"postgres_user\")\nproperties.put(\"password\", \"mysecretpassword\")\nproperties.put(\"url\", \"jdbc:postgresql://postgres:5432/pgdb\")\n\nval mongoProps = new Properties()\nmongoProps.put(\"uri\", \"mongodb://mongodb:27017\")\n\nval csvMultiTargetConnector = SpearConnector\n  .createConnector(\"CSV-Any\")\n  .source(sourceType = \"file\", sourceFormat = \"csv\")\n  .multiTarget  //For multi target use multitarget instead of target and 
provide the dest-format along with the target definition in the connector logic\n  .getConnector\n\ncsvMultiTargetConnector.setVeboseLogging(true)\n\ncsvMultiTargetConnector\n  .source(sourceObject = \"file:///opt/spear-framework/data/us-election-2012-results-by-county.csv\", Map(\"header\" -\u003e \"true\", \"inferSchema\" -\u003e \"true\"))\n  .saveAs(\"_table_\")\n  .branch\n  .targets(\n     csvMultiTargetConnector.targetFS(destinationFilePath = \"\", destFormat = \"parquet\", saveAsTable = \"ingest.raw\", saveMode = SaveMode.Overwrite) -- //target -1\n    //target -2\n    ....\n    //target -n\n  )\n\n```\nSee a more detailed explanation of multi-destinations, with diagrams, [here](https://romans-weapon.github.io/spear-framework/#multi-targets-using-branch-api).\n\n\n## Contributions and License\n#### License\nSoftware licensed under the [Apache License 2.0](LICENSE)\n#### Author\nAnudeep Konaboina \u003ckrantianudeep@gmail.com\u003e\n#### Contributor\nKayan Deshi \u003ckalyan.mgit@gmail.com\u003e\n\n## Visit Website\nFor example connectors from different sources to different targets, visit the GitHub page [here](https://romans-weapon.github.io/spear-framework/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fromans-weapon%2Fspear-framework","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fromans-weapon%2Fspear-framework","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fromans-weapon%2Fspear-framework/lists"}