{"id":13906349,"url":"https://github.com/typelevel/frameless","last_synced_at":"2025-05-14T11:12:19.335Z","repository":{"id":29485698,"uuid":"33022955","full_name":"typelevel/frameless","owner":"typelevel","description":"Expressive types for Spark.","archived":false,"fork":false,"pushed_at":"2025-04-19T20:43:34.000Z","size":3344,"stargazers_count":884,"open_issues_count":46,"forks_count":137,"subscribers_count":27,"default_branch":"master","last_synced_at":"2025-05-07T10:25:48.299Z","etag":null,"topics":["fp","functional-programming","scala","spark","typelevel"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/typelevel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-03-28T06:03:10.000Z","updated_at":"2025-04-19T20:40:05.000Z","dependencies_parsed_at":"2023-02-10T22:16:00.451Z","dependency_job_id":"f49a9bea-d961-4f15-ba85-116cbcf0ff7a","html_url":"https://github.com/typelevel/frameless","commit_stats":{"total_commits":842,"total_committers":62,"mean_commits":"13.580645161290322","dds":0.8444180522565321,"last_synced_commit":"724195eafa1651022ca362637d7795d7d944bb72"},"previous_names":["adelbertc/frameless"],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/typelevel%2Fframeless","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/typelevel%2Fframeless/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/typelevel%2Ffram
eless/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/typelevel%2Fframeless/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/typelevel","download_url":"https://codeload.github.com/typelevel/frameless/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253620346,"owners_count":21937355,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fp","functional-programming","scala","spark","typelevel"],"created_at":"2024-08-06T23:01:33.991Z","updated_at":"2025-05-14T11:12:14.322Z","avatar_url":"https://github.com/typelevel.png","language":"Scala","funding_links":[],"categories":["Scala"],"sub_categories":[],"readme":"# Frameless\n\n[![Workflow Badge](https://github.com/typelevel/frameless/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/typelevel/frameless/actions/workflows/ci.yml)\n[![Codecov Badge](https://codecov.io/gh/typelevel/frameless/branch/master/graph/badge.svg)](https://codecov.io/gh/typelevel/frameless)\n[![Discord Badge](https://img.shields.io/badge/chat-on%20discord-46BC99)](https://discord.gg/ZDZsxWcBJt)\n[![Maven Badge](https://img.shields.io/maven-central/v/org.typelevel/frameless-core_2.12?color=blue)](https://search.maven.org/search?q=g:org.typelevel%20and%20frameless)\n[![Snapshots Badge](https://img.shields.io/nexus/s/https/s01.oss.sonatype.org/org.typelevel/frameless-core_2.12)](https://s01.oss.sonatype.org/content/repositories/snapshots/org/typelevel/frameless-core_2.12/)\n\nFrameless is a Scala library for working 
with [Spark](http://spark.apache.org/) using more expressive types.\nIt consists of the following modules:\n\n* `frameless-dataset` for a more strongly typed `Dataset`/`DataFrame` API\n* `frameless-ml` for a more strongly typed Spark ML API based on `frameless-dataset`\n* `frameless-cats` for using Spark's `RDD` API with [cats](https://github.com/typelevel/cats)\n\nNote that while Frameless is still getting off the ground, it is very possible that breaking changes will be\nmade for at least the next few versions.\n\nThe Frameless project and contributors support the\n[Typelevel](http://typelevel.org/) [Code of Conduct](http://typelevel.org/code-of-conduct.html) and want all its\nassociated channels (e.g. GitHub, Discord) to be a safe and friendly environment for contributing and learning.\n\n## Versions and dependencies\n\nThe compatible versions of [Spark](http://spark.apache.org/) and\n[cats](https://github.com/typelevel/cats) are as follows:\n\n| Frameless | Spark                       | Cats     | Cats-Effect | Scala       |\n|-----------|-----------------------------|----------|-------------|-------------|\n| 0.16.0    | 3.5.0 / 3.4.0 / 3.3.0       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.15.0    | 3.4.0 / 3.3.0 / 3.2.2       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.14.1    | 3.4.0 / 3.3.0 / 3.2.2       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.14.0    | 3.3.0 / 3.2.2 / 3.1.3       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.13.0    | 3.3.0 / 3.2.2 / 3.1.3       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.12.0    | 3.2.1 / 3.1.3 / 3.0.3       | 2.x      | 3.x         | 2.12 / 2.13 |\n| 0.11.1    | 3.2.0 / 3.1.2 / 3.0.1       | 2.x      | 2.x         | 2.12 / 2.13 |\n| 0.11.0*   | 3.2.0 / 3.1.2 / 3.0.1       | 2.x      | 2.x         | 2.12 / 2.13 |\n| 0.10.1    | 3.1.0                       | 2.x      | 2.x         | 2.12        |\n| 0.9.0     | 3.0.0                       | 1.x      | 1.x         | 2.12        |\n| 0.8.0     | 2.4.0           
            | 1.x      | 1.x         | 2.11 / 2.12 |\n| 0.7.0     | 2.3.1                       | 1.x      | 1.x         | 2.11        |\n| 0.6.1     | 2.3.0                       | 1.x      | 0.8         | 2.11        |\n| 0.5.2     | 2.2.1                       | 1.x      | 0.8         | 2.11        |\n| 0.4.1     | 2.2.0                       | 1.x      | 0.8         | 2.11        |\n| 0.4.0     | 2.2.0                       | 1.0.0-IF | 0.4         | 2.11        |\n\n_\\* 0.11.0 has broken Spark 3.1.2 and 3.0.1 artifacts published._\n\nStarting 0.11 we introduced Spark cross published artifacts:\n\n* By default, frameless artifacts depend on the most recent Spark version\n* Suffix `-spark{major}{minor}` is added to artifacts that are released for the previous Spark version(s)\n\nArtifact names examples:\n\n* `frameless-dataset` (the latest Spark dependency)\n* `frameless-dataset-spark33` (Spark 3.3.x dependency)\n* `frameless-dataset-spark32` (Spark 3.2.x dependency)\n\nVersions 0.5.x and 0.6.x have identical features. 
The first is compatible with Spark 2.2.1 and the second with 2.3.0.\n\nThe **only** dependency of the `frameless-dataset` module is on [shapeless](https://github.com/milessabin/shapeless) 2.3.2.\nTherefore, depending on `frameless-dataset` adds minimal overhead to your Spark application's jar.\nOnly the `frameless-cats` module depends on cats and cats-effect, so if you prefer to work just with `Dataset`s and not with `RDD`s,\nyou may choose not to depend on `frameless-cats`.\n\nFrameless intentionally **does not** have a compile dependency on Spark.\nThis essentially allows you to use any version of Frameless with any version of Spark.\nThe table above simply lists the versions of Spark we officially compile\nand test Frameless with; other versions may work as well.\n\n### Breaking changes in 0.9\n\n* Spark 3 introduces a new ExpressionEncoder approach: the schema for single-value DataFrames is now [\"value\"](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala#L270), not \"_1\".\n\n## Why?\n\nFrameless introduces a new Spark API, called `TypedDataset`.\nThe benefits of using `TypedDataset` compared to the standard Spark `Dataset` API are as follows:\n\n* Typesafe column referencing (e.g., no more runtime errors when accessing non-existing columns)\n* Customizable, typesafe encoders (e.g., if a type does not have an encoder, it should not compile)\n* Enhanced type signatures for built-in functions (e.g., if you apply an arithmetic operation on a non-numeric column, you\nget a compilation error)\n* Typesafe casting and projections\n\nClick [here](http://typelevel.org/frameless/TypedDatasetVsSparkDataset.html) for a\ndetailed comparison of `TypedDataset` with Spark's `Dataset` API.\n\n## Documentation\n\n* [TypedDataset: Feature Overview](http://typelevel.org/frameless/FeatureOverview.html)\n* [Typed Spark ML](http://typelevel.org/frameless/TypedML.html)\n* 
[Comparing TypedDatasets with Spark's Datasets](http://typelevel.org/frameless/TypedDatasetVsSparkDataset.html)\n* [Typed Encoders in Frameless](http://typelevel.org/frameless/TypedEncoder.html)\n* [Injection: Creating Custom Encoders](http://typelevel.org/frameless/Injection.html)\n* [Job\\[A\\]](http://typelevel.org/frameless/Job.html)\n* [Using Cats with RDDs](http://typelevel.org/frameless/Cats.html)\n* [Proof of Concept: TypedDataFrame](http://typelevel.org/frameless/TypedDataFrame.html)\n\n## Quick Start\n\nSince the 0.9.x release, Frameless is compiled only against Scala 2.12.x.\n\nTo use Frameless in your project, add the following to your `build.sbt` file as needed:\n\n```scala\nval framelessVersion = \"\u003clatest version\u003e\"\n\nresolvers ++= Seq(\n  // for snapshot artifacts only\n  \"s01-oss-sonatype\" at \"https://s01.oss.sonatype.org/content/repositories/snapshots\"\n)\n\nlibraryDependencies ++= List(\n  \"org.typelevel\" %% \"frameless-dataset\" % framelessVersion,\n  \"org.typelevel\" %% \"frameless-ml\"      % framelessVersion,\n  \"org.typelevel\" %% \"frameless-cats\"    % framelessVersion\n)\n```\n\nAn easy way to bootstrap a Frameless sbt project:\n\n* if you have [Giter8][g8] installed, simply run:\n\n```bash\ng8 imarios/frameless.g8\n```\n\n* with sbt \u003e= 0.13.13:\n\n```bash\nsbt new imarios/frameless.g8\n```\n\nTyping `sbt console` inside your project will bring up a shell with Frameless\nand all its dependencies loaded (including Spark).\n\n## Need help?\n\nFeel free to message us on our [Discord](https://discord.gg/ZDZsxWcBJt)\nchannel with any issues or questions.\n\n## Development\n\nWe require at least _one_ sign-off (thumbs-up, +1, or similar) to merge pull requests. 
The current maintainers\n(people who can merge pull requests) are:\n\n* [adelbertc](https://github.com/adelbertc)\n* [imarios](https://github.com/imarios)\n* [kanterov](https://github.com/kanterov)\n* [non](https://github.com/non)\n* [OlivierBlanvillain](https://github.com/OlivierBlanvillain/)\n\n### Testing\n\nFrameless contains several property tests.  To avoid `OutOfMemoryError`s, we\ntune the default generator sizes.  The following environment variables may\nbe set to adjust the size of generated collections in the `TypedDataSet` suite:\n\n| Property                    | Default |\n|-----------------------------|--------:|\n| FRAMELESS_GEN_MIN_SIZE      |       0 |\n| FRAMELESS_GEN_SIZE_RANGE    |      20 |\n\n## License\n\nCode is provided under the Apache 2.0 license available at \u003chttp://opensource.org/licenses/Apache-2.0\u003e,\nas well as in the LICENSE file. This is the same license that Spark uses.\n\n[g8]: http://www.foundweekends.org/giter8/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftypelevel%2Fframeless","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftypelevel%2Fframeless","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftypelevel%2Fframeless/lists"}