{"id":13801590,"url":"https://github.com/fingo/spata","last_synced_at":"2025-05-13T11:31:27.269Z","repository":{"id":51122025,"uuid":"267963417","full_name":"fingo/spata","owner":"fingo","description":"Functional, stream-based CSV processor for Scala","archived":false,"fork":false,"pushed_at":"2025-04-05T19:28:04.000Z","size":920,"stargazers_count":36,"open_issues_count":0,"forks_count":8,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-05T20:18:53.684Z","etag":null,"topics":["cats","csv","fs2","scala","stream"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fingo.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":"contributing.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-29T22:01:35.000Z","updated_at":"2025-03-25T19:56:30.000Z","dependencies_parsed_at":"2023-12-22T17:24:00.964Z","dependency_job_id":"8da5482e-6303-45f4-9dc4-9acce1f96ae3","html_url":"https://github.com/fingo/spata","commit_stats":null,"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fingo%2Fspata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fingo%2Fspata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fingo%2Fspata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fingo%2Fspata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fingo","download_url":"https://codeload.g
ithub.com/fingo/spata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253932892,"owners_count":21986473,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cats","csv","fs2","scala","stream"],"created_at":"2024-08-04T00:01:24.730Z","updated_at":"2025-05-13T11:31:27.255Z","avatar_url":"https://github.com/fingo.png","language":"Scala","funding_links":[],"categories":["Table of Contents"],"sub_categories":["CSV"],"readme":"spata\n=====\n\n[![Build Status](https://github.com/fingo/spata/actions/workflows/ci.yaml/badge.svg)](https://github.com/fingo/spata/actions/workflows/ci.yaml)\n[![Code Coverage](https://codecov.io/gh/fingo/spata/branch/master/graph/badge.svg)](https://codecov.io/gh/fingo/spata)\n[![Maven Central](https://img.shields.io/maven-central/v/info.fingo/spata_3.svg?label=Maven%20Central)](https://search.maven.org/search?q=g:%22info.fingo%22%20AND%20a:%22spata_3%22)\n[![Scala Doc](https://javadoc.io/badge2/info.fingo/spata_3/javadoc.svg)](https://javadoc.io/doc/info.fingo/spata_3/latest/info/fingo/spata/index.html)\n[![Gitter](https://badges.gitter.im/fingo-spata/community.svg)](https://gitter.im/fingo-spata/community)\n\n**spata** is a functional tabular data (`CSV`) processor for Scala.\nThe library is backed by [FS2 - Functional Streams for Scala](https://github.com/functional-streams-for-scala/fs2).\n\n\u003e **spata 3** supports Scala 3 only.\n\u003e Scala 2 support is still available in [**spata 2**](https://github.com/fingo/spata/tree/spata2).\n\nThe main goal of the library is to provide handy, 
functional, stream-based API\nwith easy conversion between records and case classes, completed with precise information about possible flaws\nand their location in source data for parsing while maintaining good performance.\nProviding the location of the cause of a parsing error has been the main motivation to develop the library.\nIt is typically not that hard to parse a well-formatted `CSV` file,\nbut it could be a nightmare to locate the source of a problem in case of any distortions in a large data file.\n\nThe source (while parsing) and destination (while rendering) data format is assumed to conform basically to\n[RFC 4180](https://www.ietf.org/rfc/rfc4180.txt), but allows some variations - see `CSVConfig` for details.\n\n*   [Getting started](#getting-started)\n*   [Basic usage](#basic-usage)\n*   [Tutorial](#tutorial)\n*   [Alternatives](#alternatives)\n*   [Credits](#credits)\n\nGetting started\n---------------\n\nspata 3 is available for Scala 3.x and requires at least Java 11.\n\nTo use spata you have to add this single dependency to your `build.sbt`:\n```sbt\nlibraryDependencies += \"info.fingo\" %% \"spata\" % \"\u003cversion\u003e\"\n```\nThe latest version may be found on the badge above.\n\nLink to the current API version is available through the badge as well.\n\nBasic usage\n-----------\nThe whole parsing process in a simple case may look like this:\n```scala\nimport scala.io.Source\nimport cats.syntax.traverse.given // to get list.sequence\nimport cats.effect.IO\nimport cats.effect.unsafe.implicits.global  // default IORuntime\nimport fs2.Stream\nimport info.fingo.spata.CSVParser\nimport info.fingo.spata.io.Reader\n\ncase class Data(item: String, value: Double)\nval records = Stream\n  // get stream of CSV records while ensuring source cleanup\n  .bracket(IO(Source.fromFile(\"input.csv\")))(source =\u003e IO(source.close()))\n  .through(Reader.plain[IO].by) // produce stream of chars from source\n  .through(CSVParser[IO].parse)  // parse CSV file 
with default configuration and get CSV records\n  .filter(_.get[Double](\"value\").exists(_ \u003e 1000))  // do some operations using Record and Stream API\n  .map(_.to[Data]) // convert records to case class\n  .handleErrorWith(ex =\u003e Stream.emit(Left(ex))) // convert global (I/O, CSV structure) errors to Either\nval list = records.compile.toList.unsafeRunSync() // run everything while converting result to list\nval result = list.sequence  // convert List[Either[Throwable,Data]] to Either[Throwable,List[Data]]\n```\n\nAnother example may be taken from [FS2 readme](https://fs2.io/#/getstarted/example),\nassuming that the data is stored and written back in `CSV` format with two fields, `date` and `temp`:\n```scala\nimport java.nio.file.Paths\nimport scala.io.Codec\nimport cats.effect.{IO, IOApp}\nimport fs2.Stream\nimport info.fingo.spata.{CSVParser, CSVRenderer}\nimport info.fingo.spata.io.{Reader, Writer}\n\nobject Converter extends IOApp.Simple:\n\n  val converter: Stream[IO, Unit] =\n    given codec: Codec = Codec.UTF8\n    def fahrenheitToCelsius(f: Double): Double = (f - 32.0) * (5.0 / 9.0)\n\n    Reader[IO]\n      .read(Paths.get(\"testdata/fahrenheit.txt\"))\n      .through(CSVParser[IO].parse)\n      .filter(r =\u003e r(\"temp\").exists(!_.isBlank))\n      .map(_.altered(\"temp\")(fahrenheitToCelsius))\n      .rethrow\n      .through(CSVRenderer[IO].render)\n      .through(Writer[IO].write(Paths.get(\"testdata/celsius.txt\")))\n\n  def run: IO[Unit] = converter.compile.drain\n```\nModified versions of this sample may be found in [error handling](#error-handling)\nand [schema validation](#schema-validation) parts of the tutorial.\n\nMore examples of how to use the library may be found in `src/test/scala/info/fingo/sample/spata`.\n\nTutorial\n--------\n\n*   [Parsing](#parsing)\n*   [Rendering](#rendering)\n*   [Configuration](#configuration)\n*   [Reading and writing data](#reading-and-writing-data)\n*   [Getting actual data](#getting-actual-data)\n*   
[Creating and modifying records](#creating-and-modifying-records)\n*   [Text parsing and rendering](#text-parsing-and-rendering)\n*   [Schema validation](#schema-validation)\n*   [Error handling](#error-handling)\n*   [Logging](#logging)\n\n### Parsing\n\nCore spata operation is a transformation from a stream of characters into a stream of `Record`s.\nThis is available through `CSVParser.parse` method (supplying FS2 `Pipe`)\nand is probably the best way to include `CSV` parsing into any FS2 stream processing pipeline:\n```scala\nval input: Stream[IO, Char] = ???\nval parser: CSVParser[IO] = CSVParser[IO]\nval output: Stream[IO, Record] = input.through(parser.parse)\n```\nIn accordance with FS2, spata is polymorphic in the effect type and may be used with different effect implementations\n([Cats Effect IO](https://typelevel.org/cats-effect/docs/getting-started)\nor [ZIO](https://zio.dev/overview/getting-started),\nespecially with support for [ZIO-CE interop](https://zio.dev/guides/interop/with-cats-effect/);\ninteroperability with [Monix](https://monix.io/) is not possible yet,\nbut [this may change in the future](https://alexn.org/blog/2022/04/05/future-monix-typelevel/#the-future-of-monix)).\nPlease note, however, that Cats Effect [IO](https://typelevel.org/cats-effect/api/3.x/cats/effect/IO.html)\nis the only effect implementation used for testing and documentation purposes.\n\nType class dependencies are defined in terms of the\n[Cats Effect](https://typelevel.org/cats-effect/docs/typeclasses) class hierarchy.\nTo support effect suspension, spata requires in general `cats.effect.Sync` type class implementation for its effect type.\nSome methods need enhanced type classes to support asynchronous or concurrent computation.\nSome are satisfied with more general effects.\n\nLike in the case of any other FS2 processing, spata consumes only as much of the source stream as required,\ngive or take a chunk size.\n\nField and record delimiters are required to be single 
characters.\nThere are however no other assumptions about them - particularly the record delimiter does not have to be a line break\nand spata does not assume line break presence in the source data - it does not read the data by lines.\n\nIf newline (`LF`, `\\n`, `0x0A`) is used as the record delimiter,\ncarriage return character (`CR`, `\\r`, `0x0D`) is automatically skipped if not escaped, to support `CRLF` line breaks.\n\nFields containing delimiters (field or record) or quotes have to be wrapped in quotation marks.\nAs defined in [RFC 4180](https://www.ietf.org/rfc/rfc4180.txt),\nquotation marks in the content have to be escaped through double quotation marks.\n\nBy default, in accordance with the standard, whitespace characters are considered part of the field and are not ignored.\nNonetheless, it is possible to turn on trimming of leading and trailing whitespaces with a configuration option.\nThis differs from stripping whitespaces from resulting field content\nbecause it distinguishes between quoted and unquoted spaces. For example, having the following input:\n```csv\nX,Y,Z\nxxx,\" yyy \",zzz\nxxx, yyy ,zzz\n```\nwithout trimming the content of `Y` field will be `\" yyy \"` for both records.\nWith trimming on, we get `\" yyy \"` for the first record and `\"yyy\"` for the second.\n\nPlease also note, that the following content:\n```csv\nX,Y,Z\nxxx, \" yyy \" ,zzz\n```\nis correct with trimming on (and produces `\" yyy \"` for field `Y`), but will cause an error without it,\nas spaces are considered regular characters in this case and quote has to be put around the whole field.\n\nNot all invisible characters (notably non-breaking space, `'\\u00A0'`) are whitespaces.\nSee Java's\n[Char.isWhitespace](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Character.html#isWhitespace(char))\nfor details.\n\nIf we have to work with a stream of `String`s (e.g. 
from FS2 `text.utf8.decode` or `io.file.Files.readUtf8`),\nwe may use the string-oriented parse method:\n```scala\nval input: Stream[IO, String] = ???\nval output: Stream[IO, Record] = input.through(CSVParser[IO].parseS)\n```\nAlternatively, it is always possible to convert a stream of strings into a stream of characters:\n```scala\nval ss: Stream[IO, String] = ???\nval sc: Stream[IO, Char] = ss.through(text.string2char)\n```\n\nIn addition to `parse`, `CSVParser` provides other methods to read `CSV` data:\n*   `get`, to load data into `List[Record]`, which may be handy for small data sets,\n*   `process`, to deal with data record by record through a callback function, synchronously,\n*   `async`, to process data through a callback function in an asynchronous way.\n\nThe three functions above return the result (`List` or `Unit`) wrapped in an effect and require a call to one of the\n\"at the end of the world\" methods (`unsafeRunSync` or `unsafeRunAsync` for `cats.effect.IO`) to trigger computation.\n```scala\nval stream: Stream[IO, Char] = ???\nval parser: CSVParser[IO] = CSVParser[IO]\nval list: List[Record] = parser.get(stream).unsafeRunSync()\n```\nAlternatively, instead of calling an unsafe function,\nthe whole processing may run through [IOApp](https://typelevel.org/cats-effect/api/3.x/cats/effect/IOApp.html).\n\nSee [Reading and writing data](#reading-and-writing-data) for helper methods\nto get a stream of characters from various sources.\n\n### Rendering\n\nComplementary to parsing, spata offers a `CSV` rendering feature -\nit allows conversion from a stream of `Record`s to a stream of characters.\nThis is available through the `CSVRenderer.render` method (supplying FS2 `Pipe`):\n```scala\nval input: Stream[IO, Record] = ???\nval renderer: CSVRenderer[IO] = CSVRenderer[IO]\nval output: Stream[IO, Char] = input.through(renderer.render)\n```\nAs with parsing, rendering is polymorphic in the effect type and may be used with different effect implementations.\nThe 
renderer has weaker demands for its effect type than the parser and requires only the `MonadError` type class implementation.\n\nThe `render` method may encode only a subset of fields in a record. This is controlled by the `header` parameter,\nwhich may optionally be passed to the method:\n```scala\nval input: Stream[IO, Record] = ???\nval header: Header = ???\nval renderer: CSVRenderer[IO] = CSVRenderer[IO]\nval output: Stream[IO, Char] = input.through(renderer.render(header))\n```\nThe provided header is used to select fields and does not by itself add a header row to the output.\nHeader row output is controlled by the `CSVConfig.hasHeader` parameter and may be enabled even when `render` is called without a header argument.\n\nIf no explicit header is passed to `render`, it is extracted from the first record in the input stream.\n\nIf we have to create a stream of `String`s (e.g. to pass to FS2 `text.utf8.encode` or `io.file.Files.writeUtf8`),\nwe may use the string-oriented render method:\n```scala\nval input: Stream[IO, Record] = ???\nval output: Stream[IO, String] = input.through(CSVRenderer[IO].renderS)\n```\nAlternatively, it is always possible to convert a stream of characters into a stream of strings:\n```scala\nval sc: Stream[IO, Char] = ???\nval ss: Stream[IO, String] = sc.through(text.char2string)\n```\nPlease note that using `renderS` directly is more efficient than converting from characters,\nbecause rendering uses a stream of strings as an intermediary format anyway.\n\nThe main advantage of using `CSVRenderer` over `makeString` and `intersperse` methods\nis its ability to properly escape special characters (delimiters and quotation marks) in source data.\nThe escape policy is set through configuration and by default, the fields are quoted only when required.\n\nLike the parser, the renderer supports any single-character field and record delimiters.\nAs a result, the `render` method does not allow separating records with `CRLF`.\nIf this is required, the `rows` method has to be used:\n```scala\nval input: 
Stream[IO, Record] = ???\nval output: Stream[IO, String] = input.through(CSVRenderer[IO].rows).intersperse(\"\\r\\n\")\n```\nThe above stream of strings may be converted to a stream of characters as presented earlier in the [rendering](#rendering) part.\n\nUnlike `render`, the `rows` method outputs all fields from each record and never outputs the header row.\n\nSee [Reading and writing data](#reading-and-writing-data) for helper methods\nto transmit a stream of characters to various destinations.\n\n### Configuration\n\n`CSVParser` and `CSVRenderer` are configured through `CSVConfig`, which is a parameter to their constructors.\nIt may be provided through a builder-like method, which takes the defaults and allows altering selected parameters:\n```scala\nval parser = CSVParser.config.fieldDelimiter(';').noHeader.parser[IO]\nval renderer = CSVRenderer.config.fieldDelimiter(';').noHeader.renderer[IO]\n```\n\nIndividual configuration parameters are described in\n`CSVConfig`'s [Scaladoc](https://javadoc.io/doc/info.fingo/spata_3/latest/info/fingo/spata/CSVConfig.html).\n\nA specific setting is the header mapping, available through `CSVConfig.mapHeader`.\nIt allows replacing original header values with more convenient ones or even defining a header if none is present.\nWhen set for the parser, the new values are then used in all operations referencing individual fields,\nincluding automatic conversion to case classes or tuples.\nThe mapping may be defined only for a subset of fields, leaving the rest in their original form.\n```csv\ndate,max temperature,min temperature\n2020-02-02,13.7,-2.2\n```\n```scala\nval stream: Stream[IO, Char] = ???\nval parser: CSVParser[IO] =\n  CSVParser.config.mapHeader(Map(\"max temperature\" -\u003e \"tempMax\", \"min temperature\" -\u003e \"tempMin\")).parser[IO]\nval frosty: Stream[IO, Record] = stream.through(parser.parse).filter(_.get[Double](\"tempMin\").exists(_ \u003c 0))\n```\nIt may also be defined for more fields than there are present 
in any particular data source,\nwhich allows using a single parser for multiple datasets with different headers.\n\nThere is also index-based header mapping available. It may be used not only to define or redefine the header\nbut to remove duplicates as well:\n```csv\ndate,temperature,temperature\n2020-02-02,13.7,-2.2\n```\n```scala\nval stream: Stream[IO, Char] = ???\nval parser: CSVParser[IO] =\n  CSVParser.config.mapHeader(Map(1 -\u003e \"tempMax\", 2 -\u003e \"tempMin\")).parser[IO]\nval frosty: Stream[IO, Record] = stream.through(parser.parse).filter(_.get[Double](\"tempMin\").exists(_ \u003c 0))\n```\n\nHeader mapping may be used for the renderer too, to output different header values from those used by `Record` or case class:\n```scala\nval stream: Stream[IO, Record] = ???\nval renderer: CSVRenderer[IO] =\n  CSVRenderer.config.mapHeader(Map(\"tempMax\" -\u003e \"max temperature\", \"tempMin\" -\u003e \"min temperature\")).renderer[IO]\nval frosty: Stream[IO, Char] = stream.filter(_.get[Double](\"tempMin\").exists(_ \u003c 0)).through(renderer.render)\n```\n\nFS2 takes care of limiting the amount of processed data and consumed memory to the required level.\nThis works well to restrict the number of records; nevertheless, each record has to be fully loaded into memory,\nno matter how large it is.\nThis is not a problem if everything goes well - individual records are typically not that large.\nA record can, however, grow uncontrollably in case of incorrect configuration (e.g. wrong record delimiter)\nor malformed structure (e.g. 
unclosed quotation).\nTo prevent `OutOfMemoryError` in such situations,\nspata can be configured to limit the maximum size of a single field using `fieldSizeLimit`.\nIf this limit is exceeded during parsing, the processing stops with an error.\nBy default, no limit is specified.\n\n### Reading and writing data\n\nAs mentioned earlier, `CSVParser` requires a stream of characters as its input.\nTo simplify working with common data sources, like files or sockets, spata provides a few convenience methods,\navailable through its `io.Reader` object.\n\nSimilarly, `io.Writer` simplifies the process of writing a stream of characters produced by `CSVRenderer`\nto an external destination.\n\nThere are two groups of the `read` and `write` methods in `Reader` and `Writer`:\n\n*   basic ones, accessible through `Reader.plain` and `Writer.plain`,\n    where reading and writing is done synchronously on the current thread,\n\n*   with support for [thread shifting](https://typelevel.org/cats-effect/docs/thread-model#blocking),\n    accessible through `Reader.shifting` and `Writer.shifting`.\n\nIt is recommended to use the thread shifting version, especially for long reading or writing operations,\nfor better thread pool utilization.\nSee [Cats Effect schedulers](https://typelevel.org/cats-effect/docs/schedulers)\ndescription for thread pools configuration.\nMore information about threading may be found in\n[Cats Effect thread model](https://typelevel.org/cats-effect/docs/thread-model).\nThe plain (non-shifting) versions could be useful when (for any reason),\nthe underlying effect system is limited to `Sync` type class (the shifting versions require `Async`),\nor the data is read from `scala.io.Source` - the shifting version may be less performant in some scenarios.\n\nThe simplest way to read data from and write to a file is:\n```scala\nval stream: Stream[IO, Char] = Reader[IO].read(Path.of(\"data.csv\")) // Reader.apply is an alias for Reader.shifting\n// do some processing on 
stream\nval eff: Stream[IO, Unit] = stream.through(Writer[IO].write(Path.of(\"data.csv\"))) // Writer.apply is an alias for Writer.shifting\n```\nThere are explicit methods for reader and writer without thread shifting (doing i/o on current thread):\n```scala\nval stream: Stream[IO, Char] = Reader.plain[IO].read(Path.of(\"data.csv\"))\n// do some processing on stream\nval eff: Stream[IO, Unit] = stream.through(Writer.plain[IO].write(Path.of(\"data.csv\")))\n```\nThe thread shifting reader and writer provide similar methods:\n```scala\nval stream: Stream[IO, Char] = Reader.shifting[IO].read(Path.of(\"data.csv\"))\nval eff: Stream[IO, Unit] = stream.through(Writer.shifting[IO].write(Path.of(\"data.csv\")))\n```\nCats Effect 3 provides the thread shifting facility out of the box through `Async.blocking`,\nusing the built-in internal blocking thread pool.\n\nAll `read` operations load data in [chunks](https://fs2.io/guide.html#chunks) for better performance.\nChunk size may be supplied while creating a reader:\n```scala\nval stream: Stream[IO, Char] = Reader.plain[IO](1024).read(Path.of(\"data.csv\"))\n```\nIf not provided explicitly, a default chunk size will be used.\n\nExcept for `Source`, which is already character-based,\nother data sources and all data destinations require given instance of `Codec` to convert bytes into characters:\n```scala\ngiven codec: Codec = Codec.UTF8\n```\n\nThe caller to a `read` or a `write` method which takes a resource as a parameter (`Source`, `InputStream` or `OutputStream`)\nis responsible for its cleanup. 
This may be achieved through FS2 `Stream.bracket`:\n```scala\nval stream: Stream[IO, Unit] = for\n  source \u003c- Stream.bracket(IO(Source.fromFile(\"data.csv\")))(source =\u003e IO(source.close()))\n  destination \u003c- Stream.bracket(IO(new FileOutputStream(\"data.csv\")))(fos =\u003e IO(fos.close()))\n  out \u003c- Reader.shifting[IO].read(source)\n    // do some processing\n    .through(Writer[IO].write(destination))\nyield out\n```\nOther methods of resource acquisition and releasing are described in the\n[Cats Effect tutorial](https://typelevel.org/cats-effect/docs/tutorial#acquiring-and-releasing-resources).\n\nUnlike the `Reader.read` method, which creates a new stream, `Writer.write` operates on an existing stream.\nBeing often the last operation in the stream pipeline, it has to allow access to the final stream,\nbeing a handle to run the entire processing.\nThis is why `write` returns a `Pipe`, converting a stream of characters to a unit stream.\n\nThere is a `by` method in `Reader`, which returns a `Pipe` too.\nIt converts a single-element stream containing a data source into a stream of characters:\n```scala\nval stream: Stream[IO, Char] = Stream\n  .bracket(IO(Source.fromFile(\"data.csv\")))(source =\u003e IO(source.close()))\n  .through(Reader.shifting[IO].by)\n```\n\nThe parsing and rendering may also be backed by [FS2 I/O](https://fs2.io/#/io?id=files),\nwith possible support from FS2 text decoding and encoding.\nBecause those methods operate on a stream of strings instead of a stream of characters,\nspecial, string-oriented methods for parsing (`parseS`) and rendering (`renderS`) should be used.\nSee [parsing](#parsing) and [rendering](#rendering) respectively.\n\n### Getting actual data\n\nThe `CSV` parsing operation alone produces a stream of `Record`s.\nEach record may be seen as a map from `String` to `String`, where the keys, forming a header,\nare shared among all records.\nThe basic method to obtain individual values is through a call to 
`apply`, by key (taken from the header):\n```scala\nval record: Record = ???\nval value: Option[String] = record(\"some key\")\n```\nor index:\n```scala\nval record: Record = ???\nval value: Option[String] = record(0)\n```\n\nCSV `Record` supports retrieval of typed values.\nIn simple cases, when the value is serialized in its canonical form,\nlike ISO format for dates, which does not require any additional format information,\nor the formatting is fixed for all data, this may be done with single-parameter `get` function:\n```scala\nval record: Record = ???\nval num: Decoded[Double] = record.get[Double](\"123.45\")\n```\n`Decoded[A]` is an alias for `Either[ContentError, A]`.\nThis method requires a `text.StringParser[A]`, which is described in [Text parsing](#text-parsing-and-rendering).\n\n`get` has overloaded versions, which support formatting-aware parsing:\n```scala\nval record: Record = ???\nval df = new DecimalFormat(\"#,###\")\nval num: Decoded[Double] = record.get[Double](\"123,45\", df)\n```\nThis method requires a `text.FormattedStringParser[A, B]`,\nwhich is also described in [Text parsing](#text-parsing-and-rendering).\n(It uses an intermediary class `Field` to provide a nice syntax, but this should be transparent in most cases).\n\nAbove methods are available also in unsafe, exception-throwing version, accessible through `Record.unsafe` object:\n```scala\nval record: Record = ???\nval v1: String = record.unsafe(\"key\")\nval v2: String = record.unsafe(0)\nval n1: Double = record.unsafe.get[Double](\"123.45\")\nval df = new DecimalFormat(\"#,###\")\nval n2: Double = record.unsafe.get[Double](\"123,45\", df)\n```\nThey may throw `ContentError` exception.\n\nIn addition to retrieval of single fields, a `Record` may be converted to a case class or a tuple.\nAssuming a `CSV` data in the following form:\n```csv\nelement,symbol,melting,boiling\nhydrogen,H,13.99,20.271\nhelium,He,0.95,4.222\nlithium,Li,453.65,1603\n```\nthe data can be converted from a record 
directly into a case class:\n```scala\nval record: Record = ???\ncase class Element(symbol: String, melting: Double, boiling: Double)\nval element: Decoded[Element] = record.to[Element]\n```\nNotice that not all source fields have to be used for conversion.\nThe conversion is name-based - header keys have to match case class field names exactly, including their case.\nWe can use header mapping, described in [Configuration](#configuration), if they do not match.\n\nFor tuples, the header has to match traditional tuple field names (`_1`, `_2`, etc.)\nand is automatically generated in this form for source data without a header:\n```csv\nhydrogen,H,13.99,20.271\nhelium,He,0.95,4.222\nlithium,Li,453.65,1603\n```\n```scala\nval record: Record = ???\ntype Element = (String, String, Double, Double)\nval element: Decoded[Element] = record.to[Element]\n```\nNotice that in this case the first column has been included in the conversion to ensure header and tuple field matching.\n\nBoth forms of conversion require given instance of `StringParser`.\nParsers for common types and their default formats are provided through `StringParser` object\nand are automatically brought in scope.\nBecause it is not possible to explicitly provide custom formatter while converting a record into a case class,\na given `StringParser` instance has to be defined in case of specific formats or types:\n```csv\nelement,symbol,melting,boiling\nhydrogen,H,\"13,99\",\"20,271\"\nhelium,He,\"0,95\",\"4,222\"\nlithium,Li,\"453,65\",\"1603\"\n```\n```scala\nval record: Record = ???\ncase class Element(symbol: String, melting: Double, boiling: Double)\nval nf = NumberFormat.getInstance(new Locale(\"pl\", \"PL\"))\ngiven StringParser[Double] = (str: String) =\u003e nf.parse(str).doubleValue()\nval element: Decoded[Element] = record.to[Element]\n```\n\n### Creating and modifying records\n\nA `Record` is not only the result of parsing, it is also the source for `CSV` rendering.\nTo let the renderer do its work, 
we need to convert the data to records first.\nAs mentioned above, a `Record` is essentially a map from `String` (key) to `String` (value).\nThe keys form a header, which is, when only possible, shared among records.\nThis sharing is always in effect for records parsed by spata but requires some attention\nwhen records are created by application code, especially when performance and memory usage matter.\n\n#### Creating records\n\nThe basic way to create a record is to pass values as variable arguments:\n```scala\nval header = Header(\"symbol\", \"melting\", \"boiling\")\nval record = Record(\"H\",\"13.99\",\"20.271\")(header)\nval value = record(\"melting\")  // returns Some(\"13.99\")\n```\nThe header length is expected to match the number of values (arguments).\nIf it does not, it is reduced (the last keys are omitted) or extended (tuple-style keys are added).\n\nIt is possible to create a record without providing a header and rely on the header that is implicitly generated:\n```scala\nval record = Record.fromValues(\"H\",\"13.99\",\"20.271\")\nval header = record.header  // returns Header(\"_1\", \"_2\", \"_3\")\nval value = record(\"_2\")  // returns Some(\"13.99\")\n```\nBecause the record's header may be needless in some scenarios\n(e.g. 
while using the index-based `CSVRenderer.rows` method),\nits implicit creation is lazy - it is postponed until the header is accessed.\nIf the header is created, each record gets its own copy.\n\nA similar option is record creation from key-value pairs:\n```scala\nval record = Record.fromPairs(\"symbol\" -\u003e \"H\", \"melting\" -\u003e \"13.99\", \"boiling\" -\u003e \"20.271\")\nval value = record(\"melting\")  // returns Some(\"13.99\")\n```\nThis method creates a header per record and it should not be used with large data sets.\n\nAll three methods above require record values to be already converted to strings.\nHowever, what we often need is to create a record from typed data, with proper formatting / locale.\nThere are two methods to achieve that.\n\nThe first one is to employ a record builder, which allows adding typed values to the record one by one:\n```scala\nval record = Record.builder.add(\"symbol\", \"H\").add(\"melting\", 13.99).add(\"boiling\", 20.271).get\nval value = record(\"melting\")  // returns Some(\"13.99\")\n```\nTo convert a typed value to a string, this method requires a given `StringRenderer[A]` instance,\nwhich is described in [Text rendering](#text-parsing-and-rendering).\nSimilarly to `StringParser`, renderers for basic types and formats are provided out of the box\nand specific ones may be implemented:\n```scala\nval nf = NumberFormat.getInstance(new Locale(\"pl\", \"PL\"))\ngiven StringRenderer[Double] = (v: Double) =\u003e nf.format(v)\nval record = Record.builder.add(\"symbol\", \"H\").add(\"melting\", 13.99).add(\"boiling\", 20.271).get\nval value = record(\"melting\")  // returns Some(\"13,99\")\n```\n\nThe second method allows direct conversion of case classes or tuples to records:\n```scala\ncase class Element(symbol: String, melting: Double, boiling: Double)\nval element = Element(\"H\", 13.99, 20.271)\nval record = Record.from(element)\nval value = record(\"melting\")  // returns Some(\"13.99\")\n```\nThis approach 
relies on `StringRenderer` for data formatting as well.\nIt may be used even more comfortably by calling a method directly on a case class:\n```scala\nimport info.fingo.spata.Record.ProductOps\nval nf = NumberFormat.getInstance(new Locale(\"pl\", \"PL\"))\ngiven StringRenderer[Double] = (v: Double) =\u003e nf.format(v)\ncase class Element(symbol: String, melting: Double, boiling: Double)\nval element = Element(\"H\", 13.99, 20.271)\nval record = element.toRecord\nval value = record(\"melting\")  // returns Some(\"13,99\")\n```\n\nA disadvantage of both of the above methods operating on typed values is that a header is created for each record.\nThey may not be the optimal choice for large data sets when performance matters.\n\n#### Modifying records\n\nSometimes only a few fields of the original record have to be modified and the rest remains intact.\nIn such situations, it may be much more convenient to modify a record instead of creating a new one from scratch,\nespecially for large records.\nBecause `Record` is immutable,\nmodifying means creating a copy of the record with selected fields set to new values.\n\nThe simplest way is to provide a new string value for a record field, referenced by key or index:\n```scala\nval record: Record = ???\nval modified: Record = record.updated(0, \"new value\").updated(\"key\", \"another value\")\n```\n\nIt is also possible to access the existing value while updating a record:\n```scala\nval record: Record = ???\nval modified: Record = record.updatedWith(0)(v =\u003e v.toUpperCase).updatedWith(\"key\")(v =\u003e v.toUpperCase)\n```\n\nRecord provides a method to modify typed values too:\n```scala\nval record: Record = ???\nval altered: Either[ContentError, Record] = record.altered(\"int value\")((i: Int) =\u003e i % 2 == 0)\n```\nor in extended form:\n```scala\nval dateFormat = DateTimeFormatter.ofPattern(\"dd.MM.yy\")\ngiven StringParser[LocalDate] = (str: String) =\u003e LocalDate.parse(str, dateFormat)\ngiven StringRenderer[LocalDate] = (ld: 
LocalDate) =\u003e dateFormat.format(ld)\nval record: Record = ???\nval altered: Either[ContentError, Record] = for\n  r1 \u003c- record.altered(\"field 1\")((d: Double) =\u003e d.abs)\n  r2 \u003c- r1.altered(\"field 2\")((ld: LocalDate) =\u003e ld.plusDays(1))\nyield r2\n```\nPlease note, however, that this method may produce an error\nbecause the source values have to be parsed before being passed to the updating function.\nTo support value parsing and rendering,\ngiven instances of `StringParser[A]` and `StringRenderer[B]` have to be provided for specific data formats.\n\nAll the above methods preserve the record structure and keep the existing record header.\nIt is also possible to modify the structure, if necessary:\n```scala\nval record: Record = ???\nval modified: Record = record.patch.remove(\"field 1\").add(\"field 10\", 3.14).get\n```\n`Record.patch` employs `RecordBuilder` to extend or reduce the record.\nSee [Creating records](#creating-records) above for more information.\n\n### Text parsing and rendering\n\n`CSV` data is parsed as `String`s.\nWe often need typed values, e.g. 
numbers or dates, for further processing.\nThere is no standard, uniform interface available for Scala or Java to parse strings to different types.\nNumbers may be parsed using `java.text.NumberFormat`.\nDates and times are parsed through `parse` methods in `java.time.LocalDate` or `LocalTime`, which take the format as an argument.\nThis is awkward when providing a single interface for various types as `Record` does.\nThis is the place where spata's `text.StringParser` comes in handy.\n\nThe situation is similar when typed values have to be converted to strings to create `CSV` data.\nAlthough there is a `toString` method available for each value, it is often insufficient,\nbecause a specific format for dates, numbers, and other values may be required.\nAgain, there is no single interface for encoding different types into strings.\nspata provides `text.StringRenderer` to help with this.\n\nSimilar solutions in other libraries are often called `Decoder` and `Encoder` in place of `Parser` and `Renderer`.\n\n#### Parsing text\n\nThe `StringParser` object provides methods for parsing strings with default or implicitly provided format:\n```scala\nval num: ParseResult[Double] = StringParser.parse[Double](\"123.45\")\n```\nwhere `ParseResult[A]` is just an alias for `Either[ParseError, A]`.\n\nWhen a specific format has to be provided, an overloaded version of the above method is available:\n```scala\nval df = new DecimalFormat(\"#,###\")\nval num: ParseResult[Double] = StringParser.parse[Double](\"123,45\", df)\n```\n(It uses an intermediary class `Pattern` to provide nice syntax;\nthis should, however, be transparent in most cases).\n\nThese functions require a given `StringParser` or `FormattedStringParser` instance respectively.\nGiven instances for a few basic types are already available - see Scaladoc for `StringParser`.\nWhen additional parsers are required,\nthey may be easily provided by implementing `StringParser` or `FormattedStringParser` traits.\n\nLet's take `java.sql.Date` as an example. 
Having implemented `StringParser[Date]`:\n```scala\ngiven sdf: StringParser[Date] = (s: String) =\u003e Date.valueOf(s)\n```\nwe can use it as follows:\n```scala\nval date = StringParser.parse[Date](\"2020-02-02\")\n```\n\nDefining a parser with support for custom formatting requires the implementation of `FormattedStringParser`:\n```scala\ngiven sdf: FormattedStringParser[Date, DateFormat] with\n  def apply(str: String): Date = Date.valueOf(str.strip)\n  def apply(str: String, fmt: DateFormat): Date =  new Date(fmt.parse(str.strip).getTime)\n```\nand can be used as follows:\n```scala\nval df = DateFormat.getDateInstance(DateFormat.SHORT, new Locale(\"pl\", \"PL\"))\nval date = StringParser.parse[Date](\"02.02.2020\", df)\n```\nPlease note that this sample implementation accepts partial string parsing,\ne.g. `\"02.02.2020xyz\"` will successfully parse to `2020-02-02`.\nThis is different from the built-in parsing behavior for `LocalDate`,\nwhere the entire string has to conform to the format.\n\nParsing implementations are expected to throw specific runtime exceptions when parsing fails.\nThis is converted to `ParseError` in `StringParser` object's `parse` method\nwhile keeping the original exception in the `cause` field.\n\nAlthough this design decision might be seen as questionable,\nas returning `Either` instead of throwing an exception could be a better choice,\nit is made deliberately - all available Java parsing methods throw an exception,\nso it is more convenient to use them directly while implementing `StringParser` traits,\nleaving all exception handling in a single place, i.e. 
the `StringParser.parse` method.\n\n#### Rendering text\n\nRendering is symmetrical with parsing.\n`StringRenderer` object provides methods for rendering strings with default or implicitly provided format:\n```scala\nval str: String = StringRenderer.render(123.45)\n```\n\nWhen a specific format has to be provided, an overloaded version of the above method is available:\n```scala\nval df = new DecimalFormat(\"#,###\")\nval str: String = StringRenderer.render(123.45, df)\n```\n\nThese functions require given `StringRenderer` or `FormattedStringRenderer` instance respectively.\nGiven instances for a few basic types are already available - see Scaladoc for `StringRenderer`.\nWhen additional renderers are required,\nthey may be easily provided by implementing `StringRenderer` or `FormattedStringRenderer` traits.\n\nLet's take again `java.sql.Date` as an example. Having implemented `StringRenderer[Date]`:\n```scala\ngiven sdf: StringRenderer[Date] = (d: Date) =\u003e if(d == null) \"\" else d.toString\n```\nwe can use it as follows:\n```scala\nval date = Date.valueOf(LocalDate.now)\nval str = StringRenderer.render(date)\n```\n\nDefining a renderer with support for custom formatting requires the implementation of `FormattedStringRenderer`:\n```scala\ngiven sdf: FormattedStringRenderer[Date, DateFormat] with\n  def apply(date: Date): String = date.toString\n  def apply(date: Date, fmt: DateFormat): String = fmt.format(date)\n```\nand can be used as follows:\n```scala\nval df = DateFormat.getDateInstance(DateFormat.SHORT, new Locale(\"pl\", \"PL\"))\nval date = Date.valueOf(LocalDate.now)\nval str = StringRenderer.render(date, df)\n```\n\n### Schema validation\n\nSuccessful `CSV` parsing means that the underlying source has the correct format (taking into account parser configuration).\nNonetheless, the obtained records may have any content - being a collection of strings they are very permissive.\nWe often require strict data content and format to be able to use it in 
accordance with our business logic.\nTherefore spata supports basic fields' format definition and validation.\n\n`CSV` schema can be defined using `schema.CSVSchema`:\n```scala\nval schema = CSVSchema()\n  .add[String](\"symbol\")\n  .add[LocalDateTime](\"time\")\n  .add[BigDecimal](\"price\")\n  .add[String](\"currency\")\n```\nSchema is basically specified by the names of expected `CSV` fields and their data types.\n\nWe do not need to include every field from the `CSV` source in the schema definition.\nIt is enough to do it only for those fields we are interested in.\n\nSchema is validated as part of a regular stream processing pipeline:\n```scala\nval schema = ???\nval stream: Stream[IO, Char] = ???\nval validatedStream = stream.through(CSVParser[IO].parse).through(schema.validate)\n```\nAs a result of the validation process, the ordinary `CSV` `Record` is converted to `ValidatedRecord[K,V]`,\nwhich is an alias for `Validated[InvalidRecord, TypedRecord[K,V]]`.\nThe parametric types `K` and `V` are the compile-time, tuple-based representations of the record keys and values.\nBecause they depend on the schema definition and are quite elaborate,\nwe are not able to manually provide it - we have to let the compiler infer it.\nThis is why the type signatures are omitted from some variable definitions in code excerpts in this chapter.\n\n[Validated](https://typelevel.org/cats/datatypes/validated.html) is a Cats data type for wrapping validation results.\nIt is similar to `Either`, with `Valid` corresponding to `Right` and `Invalid` to `Left`,\nbut more suitable for validation scenarios. 
Please refer to the Cats documentation for an in-depth introduction.\n\nThe compile-time nature of this process makes future record handling fully type-safe:\n```scala\nval validatedStream = ???\nvalidatedStream.map: validated =\u003e\n  validated.map: typedRecord =\u003e\n    val symbol: String = typedRecord(\"symbol\")\n    val price: BigDecimal = typedRecord(\"price\")\n    // ...\n```\nPlease note that, in contrast to the regular record, where the result is wrapped in `Decoded[T]`,\nwe always get the straight type out of a typed record.\nIf we try to access a non-existing field (not defined by schema) or assign it to a wrong value type,\nwe will get a compilation error:\n```scala\nval typedRecord = ???\nval price: BigDecimal = typedRecord(\"prce\") // does not compile\n```\n\nThe key (field name) used to access the record value is a [literal type](https://docs.scala-lang.org/sips/42.type.html).\nTo be accepted by a typed record, the key has to be a literal value or a variable having a singleton type:\n```scala\nval typedRecord = ???\nval narrow: \"symbol\" = \"symbol\" // singleton\nval wide = \"symbol\" // String\nval symbol1 = typedRecord(\"symbol\")  // OK\nval symbol2 = typedRecord(narrow)  // OK\nval symbol3 = typedRecord(wide)  // does not compile\n```\n\nTyped records, similarly to regular ones, support conversion to case classes:\n```scala\ncase class StockPrice(symbol: String, price: BigDecimal)\nval typedRecord = ???\nval stockPrice: StockPrice = typedRecord.to[StockPrice]\n```\nLike in the case of [regular records](#getting-actual-data),\nthe conversion is name-based and may cover only a subset of a record's fields.\nHowever, in contrast to a regular record, the result is not wrapped in `Decoded` anymore.\n\nA field declared in the schema has to be present in the source stream.\nMoreover, its values, by default, must not be empty.\nIf values are optional, they have to be clearly marked as such in the schema definition:\n```scala\nval schema = 
CSVSchema()\n  .add[String](\"symbol\")\n  .add[Option[LocalDateTime]](\"time\")\n```\nPlease note that this still requires the field (column) to be present; it only permits the field to contain empty values.\n\nWhile processing a validated stream, we have access to invalid data as well:\n```scala\nval validatedStream = ???\nvalidatedStream.map: validated =\u003e\n  validated.map: typedRecord =\u003e\n    val price: BigDecimal = typedRecord(\"price\")\n    // ...\n  .leftMap: invalid =\u003e\n    val price: Option[String] = invalid.record(\"price\")\n    // ...\n```\n(the above `map`/`leftMap` combination may be simplified to `bimap`).\n\nSchema validation requires string parsing, described in the previous chapter.\nSimilarly to the conversion to case classes, we are not able to directly pass a formatter to the validation,\nso a regular `StringParser` given instance with the correct format has to be provided for each parsed type.\nAll remarks described in [Text parsing and rendering](#text-parsing-and-rendering) apply to the validation process.\n\nType verification, although probably the most important aspect of schema validation,\nis often not the only constraint on `CSV` required to successfully process the data.\nWe often have to check if the values match many other business rules.\nspata provides a means to declaratively verify basic constraints on the field level:\n```scala\nval schema = CSVSchema()\n  .add[String](\"symbol\", LengthValidator(3, 5))\n  .add[LocalDateTime](\"time\")\n  .add[BigDecimal](\"price\", MinValidator(BigDecimal(0.01)))\n  .add[String](\"currency\", LengthValidator(3))\n```\nRecords that do not pass the provided schema validators render the result invalid, as in the case of a wrong type.\n\nIt is possible to provide multiple validators for each field.\nThe validation process for a field is stopped on the first failing validator.\nThe order of running validators is not specified.\nNevertheless, the validation is run independently for each field 
defined by the schema.\nThe returned `InvalidRecord` contains error information from all incorrect fields.\n\nThe validators are defined in terms of typed (already correctly parsed) values.\nA number of typical ones are available as part of the `info.fingo.spata.schema.validator` package.\nAdditional ones may be provided by implementing the `schema.validator.Validator` trait.\n\nThe converter example presented in [Basic usage](#basic-usage) may be improved to take advantage of schema validation:\n```scala\nimport java.nio.file.Paths\nimport java.time.LocalDate\nimport scala.io.Codec\nimport cats.data.Validated.{Invalid, Valid}\nimport cats.effect.{IO, IOApp}\nimport fs2.Stream\nimport info.fingo.spata.{CSVParser, CSVRenderer, Record}\nimport info.fingo.spata.io.{Reader, Writer}\nimport info.fingo.spata.schema.CSVSchema\n\nobject Converter extends IOApp.Simple:\n\n  val converter: Stream[IO, Unit] =\n    given codec: Codec = Codec.UTF8\n    val schema = CSVSchema().add[LocalDate](\"date\").add[Double](\"temp\")\n    def fahrenheitToCelsius(f: Double): Double = (f - 32.0) * (5.0 / 9.0)\n\n    Reader[IO]\n      .read(Paths.get(\"testdata/fahrenheit.txt\"))\n      .through(CSVParser[IO].parse)\n      .through(schema.validate)\n      .map:\n        _.leftMap(_.toString).map: tr =\u003e\n          val date = tr(\"date\")\n          val temp = fahrenheitToCelsius(tr(\"temp\"))\n          Record.builder.add(\"date\", date).add(\"temp\", temp).get\n      .evalMap:\n        _ match\n          case Invalid(s) =\u003e IO.println(s) \u003e\u003e IO.none\n          case Valid(r) =\u003e IO(Some(r))\n      .unNone\n      .through(CSVRenderer[IO].render)\n      .through(Writer[IO].write(Paths.get(\"testdata/celsius.txt\")))\n\n  def run: IO[Unit] = converter.compile.drain\n```\n\n### Error handling\n\nThere are three types of errors that may arise while parsing `CSV`:\n\n*   Various I/O errors, including but not limited to `IOException`.\n    They are not directly related to parsing logic but `CSV` is typically read from an 
external, unreliable source.\n    They may be raised by `Reader` operations.\n\n*   Errors caused by malformed `CSV` structure reported as `StructureException`.\n    They may be raised by `CSVParser`'s methods.\n\n*   Errors caused by unexpected / incorrect data in record fields reported as one of `ContentError` subclasses.\n    They may result from interactions with `Record`.\n    Alternatively, when schema validation is in use,\n    this type of error results in `InvalidRecord` with `SchemaError`s (one per field) being yielded.\n    More precisely, `ContentError` is wrapped in `TypeError`,\n    while any custom validation problem is reported as `ValidationError`.\n\nThe first two error categories are unrecoverable and stop stream processing.\nFor the `StructureException` errors, we can precisely identify the place that caused the problem.\nSee Scaladoc for `CSVException` for further information about the error location.\n\nThe last category is reported on the record level and allows for different handling policies.\nPlease note, however, that if the error is not handled locally (e.g. 
using safe functions returning `Decoded`)\nand propagates through the stream, further processing of input data is stopped, like for the above error categories.\n\nAs for rendering, there are basically two types of errors possible:\n\n*   Errors caused by missing record keys, including records of different structures rendered together.\n    They are reported on record level with `HeaderError`.\n\n*   Similarly to parsing, various I/O errors, when using `Writer`.\n\nErrors are raised and should be handled by using the [FS2 error handling](https://fs2.io/#/guide?id=error-handling) mechanism.\nFS2 captures exceptions thrown or reported explicitly with `raiseError`\nand in both cases is able to handle them with `handleErrorWith`.\nTo fully support this, `CSVParser` and `CSVRenderer` require the `RaiseThrowable` type class instance for their effect type,\nwhich is covered with `cats.effect.Sync` type class for the parser.\n\nThe converter example presented in [Basic usage](#basic-usage) may be enriched with explicit error handling:\n```scala\nimport java.nio.file.Paths\nimport scala.io.Codec\nimport scala.util.Try\nimport cats.data.Validated.{Invalid, Valid}\nimport cats.effect.{ExitCode, IO, IOApp}\nimport fs2.Stream\nimport info.fingo.spata.{CSVParser, CSVRenderer, Record}\nimport info.fingo.spata.io.{Reader, Writer}\n\nobject Converter extends IOApp:\n\n  val converter: Stream[IO, ExitCode] =\n    def fahrenheitToCelsius(f: Double): Double = (f - 32.0) * (5.0 / 9.0)\n    given codec: Codec = Codec.UTF8\n    val src = Paths.get(\"testdata/fahrenheit.txt\")\n    val dst = Paths.get(\"testdata/celsius.txt\")\n\n    Reader[IO]\n      .read(src)\n      .through(CSVParser[IO].parse)\n      .filter(r =\u003e r(\"temp\").exists(!_.isBlank))\n      .map: r =\u003e\n        for\n          date \u003c- r.get[String](\"date\")\n          fTemp \u003c- r.get[Double](\"temp\")\n          cTemp = fahrenheitToCelsius(fTemp)\n        yield Record.builder.add(\"date\", 
date).add(\"temp\", cTemp).get\n      .rethrow\n      .through(CSVRenderer[IO].render)\n      .through(Writer[IO].write(dst))\n      .fold(ExitCode.Success)((z, _) =\u003e z)\n      .handleErrorWith: ex =\u003e\n        Try(dst.toFile.delete())\n        Stream.eval(IO(println(ex)) *\u003e IO(ExitCode.Error))\n\n  def run(args: List[String]): IO[ExitCode] = converter.compile.lastOrError\n```\n\nThe `rethrow` method in the above code raises an error for `Left`, converting `Either` to simple values.\n\nSometimes we would like to convert a stream to a collection.\nWe should wrap the result in `Either` in such situations to distinguish successful processing from an erroneous one.\nSee the first code snippet in [Basic usage](#basic-usage) for a sample.\n\n### Logging\n\nLogging is turned off by default in spata (no-op logger)\nand may be activated by defining a given instance of `util.Logger`,\npassing an SLF4J logger instance to it:\n```scala\nval slf4jLogger = LoggerFactory.getLogger(\"spata\")\ngiven spataLogger: Logger[IO] = new Logger[IO](slf4jLogger)\n```\nspata does not create per-class loggers but uses the provided one for all logging operations.\n\nAll logging operations are deferred in the stream effect and executed as part of effect evaluation,\ntogether with main effectful operations.\n\nThe logging is currently limited to only a few events per parsed `CSV` source (single `info` entry,\na couple of `debug` entries, and possibly an `error` entry). 
There are no log events generated per `CSV` record.\nNo stack trace is recorded for `error` events.\n\nThe `debug` level introduces additional operations on the stream and may slightly impact performance.\n\nNo parsed data is explicitly written to the log.\nThis can, however, occur when the `CSV` source is assumed to have a header row, but it does not.\nThe first record of data is then assumed to be the header and is logged at debug level.\nPlease do not use the debug level if data security is crucial.\n\nAlternatives\n------------\n\nFor those who need different characteristics in a `CSV` library, there are a few alternatives available for Scala:\n*   [Itto-CSV](https://github.com/gekomad/itto-csv) - `CSV` handling library based on FS2 and Cats with support for case class conversion. Supports Scala 2 and 3.\n*   [fs2 data](https://github.com/satabin/fs2-data) - collection of FS2 based parsers, including `CSV`. Part of [typelevel-toolkit](https://typelevel.org/toolkit/). Supports Scala 2 and 3.\n*   [kantan.csv](https://github.com/nrinaudo/kantan.csv) - well documented `CSV` parser/serializer with support for different parsing engines. Available for Scala 2.\n*   [scala-csv](https://github.com/tototoshi/scala-csv) - easy to use `CSV` reader/writer. Available for Scala 2 and 3.\n*   [cormorant](https://github.com/davenverse/cormorant) - functional `CSV` processor with support for FS2, http4s and case class conversion. Available for Scala 2.\n*   [scala-csv-parser](https://github.com/zamblauskas/scala-csv-parser) - `CSV` parser with support for conversion into case classes. Available for Scala 2.\n*   [TableParser](https://github.com/rchillyard/tableparser) - parser and renderer of tabular data in different formats, including `CSV`.\n*   [CSVSide](https://github.com/underscoreio/csvside) - functional `CSV` parser with case class conversion. 
Available for Scala 2.\n*   [csv3s](https://index.scala-lang.org/johnspade/csv3s) - `CSV` parser and renderer with case class conversion. Based on [ZIO Parser](https://github.com/zio/zio-parser). Supports Scala 3.\n*   [PureCSV](https://github.com/kontainers/purecsv) (previously [here](https://github.com/sentenza/PureCSV)) - easy to use `CSV` parser and renderer with case class conversion. Available for Scala 2.\n*   [ceesvee](https://github.com/guymers/ceesvee) - `CSV` parser with case class conversion, designed for use with streams (FS2, ZStream). Supports Scala 2 and 3.\n*   [Frugal Mechanic Flat File Reader](https://github.com/frugalmechanic/fm-flatfile) - flat file (including `CSV`) reader/writer. Supports Scala 2 and 3.\n\nCredits\n-------\n\n**spata** makes use of the following tools, languages, frameworks, libraries and data sets (in alphabetical order):\n*   [Cats](https://typelevel.org/cats/) licensed under [MIT](http://www.slf4j.org/license.html) /C\n*   [Cats Effect](https://typelevel.org/cats-effect/) licensed under [Apache-2.0](https://github.com/typelevel/cats-effect/blob/master/LICENSE.txt) /C\n*   [Codecov](https://codecov.io/) available under following [Terms of Use](https://codecov.io/terms) /D\n*   [FS2](https://fs2.io/) licensed under [MIT](https://github.com/functional-streams-for-scala/fs2/blob/master/LICENSE) /C\n*   [Git](https://git-scm.com/) licensed under [GPL-2.0](https://git-scm.com/about/free-and-open-source) /D\n*   [GitHub](https://github.com/) available under following [Terms of Service](https://help.github.com/en/github/site-policy/github-terms-of-service) /D\n*   [Gitter](https://gitter.im/) available under following [Terms of Use](https://about.gitlab.com/terms/) /D\n*   [IntelliJ IDEA CE](https://www.jetbrains.com/idea/) licensed under [Apache 2.0](https://www.jetbrains.com/idea/download/) /D\n*   [javadoc.io](https://www.javadoc.io/) licensed under [Apache-2.0](https://github.com/maxcellent/javadoc.io/blob/master/LICENSE) /D\n*  
 [Mars weather data](https://github.com/the-pudding/data/tree/master/mars-weather) made publicly available by [NASA](https://pds.nasa.gov/) and [CAB](https://cab.inta-csic.es/rems/en) /T\n*   [Metals](https://scalameta.org/metals/) licensed under [Apache-2.0](https://github.com/scalameta/metals/blob/main/LICENSE) /D\n*   [OpenJDK](https://adoptopenjdk.net/) licensed under [GPL-2.0 with CE](https://openjdk.java.net/legal/gplv2+ce.html) /C\n*   [sbt](https://www.scala-sbt.org/) licensed under [BSD-2-Clause](https://www.lightbend.com/legal/licenses) /D\n*   [sbt-api-mappings](https://github.com/ThoughtWorksInc/sbt-api-mappings) licensed under [Apache-2.0](https://github.com/ThoughtWorksInc/sbt-api-mappings/blob/3.0.x/LICENSE) /D\n*   [sbt-dynver](https://github.com/dwijnand/sbt-dynver) licensed under [Apache-2.0](https://github.com/dwijnand/sbt-dynver/blob/master/LICENSE) /D\n*   [sbt-header](https://github.com/sbt/sbt-header) licensed under [Apache-2.0](https://github.com/sbt/sbt-header/blob/master/LICENSE) /D\n*   [sbt-license-report](https://github.com/sbt/sbt-license-report) licensed under [Apache-2.0](https://github.com/sbt/sbt-header/blob/master/LICENSE) /D\n*   [sbt-pgp](https://github.com/sbt/sbt-pgp) licensed under [BSD-3-Clause](https://github.com/sbt/sbt-pgp/blob/master/LICENSE) /D\n*   [sbt-scoverage](https://github.com/scoverage/sbt-scoverage) licensed under [Apache-2.0](https://github.com/scoverage/sbt-scoverage#license) /D\n*   [sbt-sonatype](https://github.com/xerial/sbt-sonatype) licensed under [Apache-2.0](https://github.com/xerial/sbt-sonatype/blob/master/LICENSE.txt) /D\n*   [Scala](https://www.scala-lang.org/download/) licensed under [Apache-2.0](https://www.scala-lang.org/license/) /C\n*   [Scalafix](https://github.com/scalacenter/scalafix) licensed under [BSD-3-Clause](https://github.com/scalacenter/scalafix/blob/master/LICENSE.md) /D\n*   [Scalafmt](https://scalameta.org/scalafmt/docs/installation.html#sbt) licensed under 
[Apache-2.0](https://github.com/scalameta/scalafmt/blob/master/LICENCE.md) /D\n*   [ScalaMeter](https://scalameter.github.io/) licensed under [BSD-3-Clause](https://scalameter.github.io/home/license/) /T\n*   [ScalaTest](http://www.scalatest.org/) licensed under [Apache-2.0](http://www.scalatest.org/about) /T\n*   [shapeless](https://github.com/milessabin/shapeless) licensed under [Apache-2.0](https://github.com/milessabin/shapeless/blob/master/LICENSE) /S\n*   [SLF4J](http://www.slf4j.org/) licensed under [MIT](http://www.slf4j.org/license.html) /C\n*   [sonatype OSSRH](https://central.sonatype.org/) available under following [Terms of Service](https://central.sonatype.org/pages/central-repository-producer-terms.html) /D\n*   [Travis CI](https://travis-ci.org/) available under following [Terms of Service](https://docs.travis-ci.com/legal/terms-of-service/) /D\n\n**/C** means compile/runtime dependency,\n**/T** means test dependency,\n**/S** means source code derivative and\n**/D** means development tool.\nOnly direct dependencies are presented in the above list.\n\nRun `sbt dumpLicenseReport` to get a complete list of compile and runtime dependencies, including transitive ones.\n**spata** build verifies whether all its dependencies are permissive so it can be deployed confidently in most usage scenarios.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffingo%2Fspata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffingo%2Fspata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffingo%2Fspata/lists"}