{"id":15222451,"url":"https://github.com/data-tools/big-data-types","last_synced_at":"2025-10-30T11:31:44.903Z","repository":{"id":37077097,"uuid":"310056962","full_name":"data-tools/big-data-types","owner":"data-tools","description":"A library to transform Scala product types and Schemes from different systems into other Schemes. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. E.g: a Spark Schema can be transformed into a BigQuery table.","archived":false,"fork":false,"pushed_at":"2025-01-06T14:50:32.000Z","size":3342,"stargazers_count":13,"open_issues_count":10,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-02T08:12:02.068Z","etag":null,"topics":["apache-spark","bigquery","bigquery-tables","cassandra","circe","database-types","scala","schemas","spark","typeclass","typeclass-derivation","typesafe"],"latest_commit_sha":null,"homepage":"https://data-tools.github.io/big-data-types/","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/data-tools.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-04T16:25:39.000Z","updated_at":"2025-01-06T14:49:29.000Z","dependencies_parsed_at":"2023-10-16T10:31:48.996Z","dependency_job_id":"2339eed8-90e1-47c5-ba8f-a4ad920282aa","html_url":"https://github.com/data-tools/big-data-types","commit_stats":{"total_commits":536,"total_committers":9,"mean_commits":59.55555555555556,"dds":0.7518656716417911,"last_synced_commit":"8a1e1d4ec7a17286527f5
c7a0d699d0546e9d327"},"previous_names":[],"tags_count":30,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-tools%2Fbig-data-types","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-tools%2Fbig-data-types/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-tools%2Fbig-data-types/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-tools%2Fbig-data-types/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/data-tools","download_url":"https://codeload.github.com/data-tools/big-data-types/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238960553,"owners_count":19559294,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-spark","bigquery","bigquery-tables","cassandra","circe","database-types","scala","schemas","spark","typeclass","typeclass-derivation","typesafe"],"created_at":"2024-09-28T15:12:05.746Z","updated_at":"2025-10-30T11:31:43.829Z","avatar_url":"https://github.com/data-tools.png","language":"Scala","readme":"# Big Data Types\n[![CI Tests](https://github.com/data-tools/big-data-types/workflows/ci-tests/badge.svg)](https://github.com/data-tools/big-data-types/actions/workflows/ci-tests.yml)\n[![BQ IT](https://github.com/data-tools/big-data-types/workflows/BigQuery-Integration/badge.svg)](https://github.com/data-tools/big-data-types/actions/workflows/bigquery-integration.yml)\n![Maven 
Central](https://img.shields.io/maven-central/v/io.github.data-tools/big-data-types-core_2.13)\n[![codecov](https://codecov.io/gh/data-tools/big-data-types/branch/main/graph/badge.svg?token=1DUBMIAEO8)](https://codecov.io/gh/data-tools/big-data-types)\n[![Scala Steward badge](https://img.shields.io/badge/Scala_Steward-helping-blue.svg?style=flat\u0026logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA4AAAAQCAMAAAARSr4IAAAAVFBMVEUAAACHjojlOy5NWlrKzcYRKjGFjIbp293YycuLa3pYY2LSqql4f3pCUFTgSjNodYRmcXUsPD/NTTbjRS+2jomhgnzNc223cGvZS0HaSD0XLjbaSjElhIr+AAAAAXRSTlMAQObYZgAAAHlJREFUCNdNyosOwyAIhWHAQS1Vt7a77/3fcxxdmv0xwmckutAR1nkm4ggbyEcg/wWmlGLDAA3oL50xi6fk5ffZ3E2E3QfZDCcCN2YtbEWZt+Drc6u6rlqv7Uk0LdKqqr5rk2UCRXOk0vmQKGfc94nOJyQjouF9H/wCc9gECEYfONoAAAAASUVORK5CYII=)](https://scala-steward.org)\n\nA type-safe library to transform Case Classes into Database schemas and to convert implemented types into other types\n\n\n# Documentation\nCheck the [Documentation website](https://data-tools.github.io/big-data-types) to learn more about how to use this library\n  \n\n# Available conversions:\n\n|  From \\ To   |                                                                                                                       | Scala Types |      BigQuery      |       Spark        |     Cassandra      | Circe (JSON) |\n|:------------:|:---------------------------------------------------------------------------------------------------------------------:|:-----------:|:------------------:|:------------------:|:------------------:|:------------:|\n|    Scala     |               \u003cimg src=\"./website/static/img/logos/scala.png\" style=\"max-height:50px;max-width:70px\" /\u003e               |      -      | :white_check_mark: | :white_check_mark: | :white_check_mark: |              |\n|   BigQuery   |             \u003cimg src=\"./website/static/img/logos/bigquery.png\" style=\"max-height:50px;max-width:70px\" /\u003e              |             |         -          | 
:white_check_mark: | :white_check_mark: |              |\n|    Spark     |  \u003cimg src=\"./website/static/img/logos/spark.png\" style=\"background-color:white;max-height:100px;max-width:100px\" /\u003e   |             | :white_check_mark: |         -          | :white_check_mark: |              |\n|  Cassandra   | \u003cimg src=\"./website/static/img/logos/cassandra.png\" style=\"background-color:white;max-height:50px;max-width:100px\" /\u003e |             | :white_check_mark: | :white_check_mark: |         -          |              |\n| Circe (JSON) |    \u003cimg src=\"./website/static/img/logos/circe.png\" style=\"background-color:gray;max-height:50px;max-width:70px\" /\u003e    |             | :white_check_mark: | :white_check_mark: | :white_check_mark: |       -      |\n\n\nVersions for Scala ![Scala 2.12](https://img.shields.io/badge/Scala-2.12-red), ![Scala_2.13](https://img.shields.io/badge/Scala-2.13-red) \nand ![Scala 3.x](https://img.shields.io/badge/Scala-3.x-red) are available on Maven Central.\n\n\n# Quick Start\nThe library is split into modules that can be imported separately:\n- BigQuery\n```\nlibraryDependencies += \"io.github.data-tools\" %% \"big-data-types-bigquery\" % \"{version}\"\n```\n- Spark\n```\nlibraryDependencies += \"io.github.data-tools\" %% \"big-data-types-spark\" % \"{version}\"\n```\n- Cassandra\n```\nlibraryDependencies += \"io.github.data-tools\" %% \"big-data-types-cassandra\" % \"{version}\"\n```\n- Circe (JSON)\n```\nlibraryDependencies += \"io.github.data-tools\" %% \"big-data-types-circe\" % \"{version}\"\n```\n- Core\n    - Provides support for the abstract SqlTypes. It is already included in the other modules, so you do not need to add it if you use any of them.\n```\nlibraryDependencies += \"io.github.data-tools\" %% \"big-data-types-core\" % \"{version}\"\n```\n\nTo transform one type into another, the modules for both types have to be imported.\n\n## How it works\n\nThe library internally uses a generic ADT 
([SqlType](https://github.com/data-tools/big-data-types/blob/main/core/src/main/scala_3/org/datatools/bigdatatypes/basictypes/SqlType.scala))\nthat can store any schema representation, and from there, it can be converted into any other.\nTransformations are done through two different type classes.\n\n### Quick examples\nCase Classes to other types\n```scala\n//Spark\nval s: StructType = SparkSchemas.schema[MyCaseClass]\n//BigQuery\nval bq: List[Field] = SqlTypeToBigQuery[MyCaseClass].bigQueryFields // just the schema\nBigQueryTable.createTable[MyCaseClass](\"myDataset\", \"myTable\") // Create a table in a real BigQuery environment\n//Cassandra\nval c: CreateTable = CassandraTables.table[MyCaseClass]\n```\n\nThere are also `extension methods` that make it easier to transform between types when instances are available\n```scala\n//from Case Class instance\nval foo: MyCaseClass = ???\nfoo.asBigQuery // List[Field]\nfoo.asSparkSchema // StructType\nfoo.asCassandra(\"TableName\", \"primaryKey\") // CreateTable\n```\n\n**Conversion between types** works in the same way\n```scala\n// From Spark to others\nval sparkSchema: StructType = myDataFrame.schema\nsparkSchema.asBigQuery // List[Field]\nsparkSchema.asCassandra(\"TableName\", \"primaryKey\") // CreateTable\n\n//From BigQuery to others\nval bqSchema: Schema = ???\nbqSchema.asSparkFields // List[StructField]\nbqSchema.asSparkSchema // StructType\nbqSchema.asCassandra(\"TableName\", \"primaryKey\") // CreateTable\n\n//From Cassandra to others\nval cassandraTable: CreateTable = ???\ncassandraTable.asSparkFields // List[StructField]\ncassandraTable.asSparkSchema // StructType\ncassandraTable.asBigQuery // List[Field]\ncassandraTable.asBigQuery.schema // Schema\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-tools%2Fbig-data-types","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdata-tools%2Fbig-data-types","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-tools%2Fbig-data-types/lists"}