{"id":16647419,"url":"https://github.com/gyrdym/ml_dataframe","last_synced_at":"2025-03-21T16:31:00.724Z","repository":{"id":35077356,"uuid":"198516882","full_name":"gyrdym/ml_dataframe","owner":"gyrdym","description":"A way to store and manipulate data","archived":false,"fork":false,"pushed_at":"2022-08-07T21:22:39.000Z","size":245,"stargazers_count":18,"open_issues_count":5,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-18T03:02:37.354Z","etag":null,"topics":["data-science","dataframe","datascience","dataset","toy-dataset","toy-datasets"],"latest_commit_sha":null,"homepage":"","language":"Dart","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gyrdym.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-23T22:21:25.000Z","updated_at":"2024-10-05T01:03:51.000Z","dependencies_parsed_at":"2022-07-24T17:47:15.271Z","dependency_job_id":null,"html_url":"https://github.com/gyrdym/ml_dataframe","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyrdym%2Fml_dataframe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyrdym%2Fml_dataframe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyrdym%2Fml_dataframe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyrdym%2Fml_dataframe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gyrdym","download_url":"https://codeload.github.com/gyrdym/ml_dataframe/tar.gz/refs/heads/master","host":{"
name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244829421,"owners_count":20517303,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","dataframe","datascience","dataset","toy-dataset","toy-datasets"],"created_at":"2024-10-12T08:44:42.360Z","updated_at":"2025-03-21T16:31:00.302Z","avatar_url":"https://github.com/gyrdym.png","language":"Dart","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://github.com/gyrdym/ml_dataframe/workflows/CI%20pipeline/badge.svg)](https://github.com/gyrdym/ml_dataframe/actions?query=branch%3Amaster+)\n[![Coverage Status](https://coveralls.io/repos/github/gyrdym/ml_dataframe/badge.svg?branch=master)](https://coveralls.io/github/gyrdym/ml_dataframe?branch=master)\n[![pub package](https://img.shields.io/pub/v/ml_dataframe.svg)](https://pub.dartlang.org/packages/ml_dataframe)\n[![Gitter Chat](https://badges.gitter.im/gyrdym/gyrdym.svg)](https://gitter.im/gyrdym/)\n\n# ml_dataframe\nA way to store and manipulate data\n\nThe library exposes in-memory storage for dynamically typed data. 
The storage is represented by the [DataFrame](https://github.com/gyrdym/ml_dataframe/blob/master/lib/src/data_frame/data_frame.dart) class.\n\n## Table of contents\n\n- [Usage example](#usage-example)\n- [DataFrame API](#dataframe-api-with-examples)\n    - [Get the header](#get-the-header-of-the-data)\n    - [Get the rows](#get-the-rows-of-the-data)\n    - [Get the series](#get-the-series-collection-columns-of-the-data)\n    - [Get the shape](#get-the-shape-of-the-data)\n    - [Add a series](#add-a-series)\n    - [Drop a series by a name](#drop-a-series-by-a-series-name)\n    - [Drop a series by an index](#drop-a-series-by-a-series-index)\n    - [Sample a dataframe from rows](#sample-a-new-dataframe-from-rows-of-an-existing-dataframe)\n    - [Sample a dataframe from series indices](#sample-a-new-dataframe-from-series-indices-of-an-existing-dataframe)\n    - [Sample a dataframe from series names](#sample-a-new-dataframe-from-series-names-of-an-existing-dataframe)\n    - [Save a dataframe](#save-a-dataframe-to-a-json-file)\n    - [Shuffle rows of a dataframe](#shuffle-rows-in-a-dataframe)\n    - [Get a JSON representation](#get-a-json-serializable-representation)\n    - [Convert to Matrix](#convert-a-dataframe-to-a-matrix)\n    - [Get a series by name](#get-a-series-by-its-name)\n    - [Get a series by index](#get-a-series-by-its-index)\n    - [Map values](#map-values-of-a-dataframe)\n    - [Map values of a series](#map-values-of-a-specific-dataframe-series)\n- [Ways to create a dataframe](#ways-to-create-a-dataframe)\n    - [DataFrame constructor](#dataframe-constructor)\n    - [Create a dataframe from a CSV file](#fromcsv-function)\n    - [Restore a dataframe from JSON](#restore-a-dataframe-previously-persisted-as-a-json-file----fromjson-function)\n- [Prefilled dataframes](#dataframes-with-prefilled-data)\n    - [Iris dataset](#iris-dataset---function-getirisdataframe)\n    - [Pima Indians diabetes 
dataset](#pima-indians-diabetes-dataset---function-getpimaindiansdiabetesdataframe)\n    - [Red wine quality dataset](#red-wine-quality-dataset---function-getwinequalitydataframe)\n    - [Boston housing dataset](#boston-housing-dataset---function-gethousingdataframe)\n- [Contacts](#contacts)\n\n## Usage example:\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = [\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ];\n    \n  final dataframe = DataFrame(data);\n    \n  print(dataframe);\n  // DataFrame (5 x 6)\n  //  Id   SepalLengthCm   SepalWidthCm   PetalLengthCm   PetalWidthCm           Species\n  //   1             5.1            3.5             1.4            0.2       Iris-setosa\n  //   2             4.9            3.0             1.4            0.2       Iris-setosa\n  //  89             5.6            3.0             4.1            1.3   Iris-versicolor\n  //  90             5.5            2.5             4.0            1.3   Iris-versicolor\n  //  91             5.5            2.6             4.4            1.2   Iris-versicolor\n}\n```\n\n## `DataFrame` API with examples:\n\n### Get the header of the data\n\nBy default, the very first row is considered a header, unless one specifies a custom header or requests an autogenerated one. 
See the [`DataFrame` constructor](#dataframe-constructor) section for details.\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final header = dataframe.header;\n\n  print(header);\n  // ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 'Species']\n}\n```\n\n### Get the rows of the data\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final rows = dataframe.rows;\n\n  print(rows);\n  // [\n  //   [1, 5.1, 3.5, 1.4, 0.2, 'Iris-setosa'],\n  //   [2, 4.9, 3.0, 1.4, 0.2, 'Iris-setosa'],\n  //   [89, 5.6, 3.0, 4.1, 1.3, 'Iris-versicolor'],\n  //   [90, 5.5, 2.5, 4.0, 1.3, 'Iris-versicolor'],\n  //   [91, 5.5, 2.6, 4.4, 1.2, 'Iris-versicolor'],\n  // ],\n}\n``` \n\n### Get the series 
collection (columns) of the data\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final series = dataframe.series;\n    \n  print(series);\n  // [\n  //   'Id': [1, 2, 89, 90, 91],\n  //   'SepalLengthCm': [5.1, 4.9, 5.6, 5.5, 5.5],\n  //   'SepalWidthCm': [3.5, 3.0, 3.0, 2.5, 2.6],\n  //   'PetalLengthCm': [1.4, 1.4, 4.1, 4.0, 4.4],\n  //   'PetalWidthCm': [0.2, 0.2, 1.3, 1.3, 1.2],\n  //   'Species': ['Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor'],\n  // ],\n}\n``` \n\n### Get the shape of the data\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final shape = dataframe.shape;\n\n  print(shape);\n  // [5, 6] - 5 rows, 6 
columns\n}\n```\n\n### Add a series\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final firstSeries = Series('super_series', [1, 2, 3, 4, 5, 6]);\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n\n  final modifiedDataframe = dataframe.addSeries([firstSeries]); // The method doesn't mutate the original dataframe\n\n  print(modifiedDataframe.series.first);\n  // 'super_series': [1, 2, 3, 4, 5, 6]\n}\n```\n\n### Drop a series by a series name\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n\n  print(dataframe.shape);\n  // [5, 6] - 5 rows, 6 columns \n\n  final modifiedDataframe = dataframe.dropSeries(names: ['Id']); // The method doesn't mutate the original dataframe\n\n  print(modifiedDataframe.shape);\n  // [5, 5] - 
one column fewer after the series was dropped\n}\n```\n\n### Drop a series by a series index\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  print(dataframe.shape);\n  // [5, 6] - 5 rows, 6 columns \n\n  final modifiedDataframe = dataframe.dropSeries(indices: [0]); // The method doesn't mutate the original dataframe\n\n  print(modifiedDataframe.shape);\n  // [5, 5] - one column fewer after the series was dropped\n}\n```\n\n### Sample a new dataframe from rows of an existing dataframe\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final sampled = dataframe.sampleFromRows([0, 4]); // zero-based row indices: the first and the last row\n\n  print(sampled);\n  // 
DataFrame (2 x 6)\n  //  Id   SepalLengthCm   SepalWidthCm   PetalLengthCm   PetalWidthCm           Species\n  //   1             5.1            3.5             1.4            0.2       Iris-setosa\n  //  91             5.5            2.6             4.4            1.2   Iris-versicolor\n} \n````\n\n### Sample a new dataframe from series indices of an existing dataframe\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final sampled = dataframe.sampleFromSeries(indices: [0, 1]);\n\n  print(sampled);\n  // DataFrame (5 x 2)\n  //  Id   SepalLengthCm\n  //   1             5.1\n  //   2             4.9\n  //  89             5.6\n  //  90             5.5\n  //  91             5.5\n}\n````\n\n### Sample a new dataframe from series names of an existing dataframe\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,          
  1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final sampled = dataframe.sampleFromSeries(names: ['Id', 'SepalLengthCm']);\n\n  print(sampled);\n  // DataFrame (5 x 2)\n  //  Id   SepalLengthCm\n  //   1             5.1\n  //   2             4.9\n  //  89             5.6\n  //  90             5.5\n  //  91             5.5\n}\n````\n\n### Save a dataframe to a JSON file\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() async {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  \n  await dataframe.saveAsJson('path/to/json/file.json');\n}\n````\n\n### Shuffle rows in a dataframe\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  \n  
print(dataframe);\n  // DataFrame (5 x 6)\n  //  Id   SepalLengthCm   SepalWidthCm   PetalLengthCm   PetalWidthCm           Species\n  //   1             5.1            3.5             1.4            0.2       Iris-setosa\n  //   2             4.9            3.0             1.4            0.2       Iris-setosa\n  //  89             5.6            3.0             4.1            1.3   Iris-versicolor\n  //  90             5.5            2.5             4.0            1.3   Iris-versicolor\n  //  91             5.5            2.6             4.4            1.2   Iris-versicolor\n\n  final shuffled = dataframe.shuffle(); // `shuffle`, like the other methods, returns a new dataframe and doesn't mutate the source dataframe\n\n  print(shuffled);\n  // DataFrame (5 x 6)\n  //  Id   SepalLengthCm   SepalWidthCm   PetalLengthCm   PetalWidthCm           Species\n  //  89             5.6            3.0             4.1            1.3   Iris-versicolor\n  //   1             5.1            3.5             1.4            0.2       Iris-setosa\n  //  91             5.5            2.6             4.4            1.2   Iris-versicolor\n  //   2             4.9            3.0             1.4            0.2       Iris-setosa\n  //  90             5.5            2.5             4.0            1.3   Iris-versicolor\n}\n```\n\nOne can use the `seed` parameter to get the same row order regardless of the number of `shuffle` calls:\n\n```dart\ndataframe.shuffle(seed: 10);\n``` \n\n### Get a JSON-serializable representation\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             
4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final json = dataframe.toJson(); // json contains a serializable map\n}\n```\n\n### Convert a dataframe to a matrix:\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm'],\n    [   1,             5.1,            3.5,             1.4,            0.2],\n    [   2,             4.9,            3.0,             1.4,            0.2],\n    [  89,             5.6,            3.0,             4.1,            1.3],\n    [  90,             5.5,            2.5,             4.0,            1.3],\n    [  91,             5.5,            2.6,             4.4,            1.2],\n  ]);\n  \n  final matrix = dataframe.toMatrix();\n  \n  print(matrix); // the numbers are stored internally as Float32, so the output contains some round-off errors\n  // Matrix 5 x 5:\n  // (1.0, 5.099999904632568, 3.5, 1.399999976158142, 0.20000000298023224)\n  // (2.0, 4.900000095367432, 3.0, 1.399999976158142, 0.20000000298023224)\n  // (89.0, 5.599999904632568, 3.0, 4.099999904632568, 1.2999999523162842)\n  // (90.0, 5.5, 2.5, 4.0, 1.2999999523162842)\n  // (91.0, 5.5, 2.5999999046325684, 4.400000095367432, 1.2000000476837158)\n}\n```\n\nThe method throws an error if the dataframe contains values that cannot be converted to a number.\n\n### Get a series by its index\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             
1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final series = dataframe[0];\n\n  print(series);\n  // Id: [1, 2, 89, 90, 91]\n}\n```\n\n### Get a series by its name\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final dataframe = DataFrame([\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ]);\n  final series = dataframe['Id'];\n\n  print(series);\n  // Id: [1, 2, 89, 90, 91]\n}\n```\n\n### Map values of a dataframe\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = DataFrame([\n    ['col_1', 'col_2', 'col_3'],\n    [      2,      20,     200],\n    [      3,      30,     300],\n    [      4,      40,     400],\n  ]);\n  // the first generic type is the type of the source value, the second generic type is the type of the mapped value\n  final modifiedData = data.map\u003cnum, num\u003e((value) =\u003e value * 2);\n    \n  print(modifiedData);\n  // DataFrame (3 x 3)\n  // col_1 col_2 col_3\n  //     4    40   400\n  //     6    60   600\n  //     8    80   800\n}\n```\n\n### Map values of a specific dataframe series\n\n```dart\nimport 
'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = DataFrame([\n    ['col_1', 'col_2', 'col_3'],\n    [      2,      20,     200],\n    [      3,      30,     300],\n    [      4,      40,     400],\n  ]);\n  // the first generic type is the type of the source value, the second generic type is the type of the mapped value\n  final modifiedData = data.mapSeries\u003cnum, num\u003e((value) =\u003e value * 2, name: 'col_2');\n    \n  print(modifiedData);\n  // DataFrame (3 x 3)\n  // col_1 col_2 col_3\n  //     2    40   200\n  //     3    60   300\n  //     4    80   400\n}\n```\n\n## Ways to create a dataframe\n\n### `DataFrame` constructor\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = [\n    ['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',         'Species'],\n    [   1,             5.1,            3.5,             1.4,            0.2,     'Iris-setosa'],\n    [   2,             4.9,            3.0,             1.4,            0.2,     'Iris-setosa'],\n    [  89,             5.6,            3.0,             4.1,            1.3, 'Iris-versicolor'],\n    [  90,             5.5,            2.5,             4.0,            1.3, 'Iris-versicolor'],\n    [  91,             5.5,            2.6,             4.4,            1.2, 'Iris-versicolor'],\n  ];\n\n  final dataframe = DataFrame(data);\n}\n```\n\nBy default, the very first row is considered a header. 
If the data does not have a header, one can use an autogenerated \nheader by passing `headerExists: false` to the constructor:\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = [\n    [1, 5.1, 3.5, 1.4, 0.2, 'Iris-setosa'],\n    [2, 4.9, 3.0, 1.4, 0.2, 'Iris-setosa'],\n    [89, 5.6, 3.0, 4.1, 1.3, 'Iris-versicolor'],\n    [90, 5.5, 2.5, 4.0, 1.3, 'Iris-versicolor'],\n    [91, 5.5, 2.6, 4.4, 1.2, 'Iris-versicolor'],\n  ];\n\n  final dataframe = DataFrame(data, headerExists: false);\n\n  print(dataframe.header);\n}\n```\n\nIt outputs `['col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6']`. `col_` is the default prefix for the autogenerated \ncolumns.\n\nAlso, if there is no header row in the data, one can use a predefined header:\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = [\n    [1, 5.1, 3.5, 1.4, 0.2, 'Iris-setosa'],\n    [2, 4.9, 3.0, 1.4, 0.2, 'Iris-setosa'],\n    [89, 5.6, 3.0, 4.1, 1.3, 'Iris-versicolor'],\n    [90, 5.5, 2.5, 4.0, 1.3, 'Iris-versicolor'],\n    [91, 5.5, 2.6, 4.4, 1.2, 'Iris-versicolor'],\n  ];\n\n  final dataframe = DataFrame(data, header: ['feature_1', 'feature_2', 'feature_3', 'feature_4', 'feature_5', 'feature_6']);\n}\n```\n\n### `fromCsv` function\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() async {\n  final data = await fromCsv('path/to/csv/file.csv');\n}\n```\n\nIf the `csv` file does not have a header row, one needs to pass the `headerExists: false` flag:\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() async {\n  final data = await fromCsv('path/to/csv/file.csv', headerExists: false);\n}\n```\n\n### Restore a dataframe previously persisted as a json file  - `fromJson` function\n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() async {\n  final data = await fromJson('path/to/json/file.json');\n}\n```\n\nThis function works in conjunction with the DataFrame 
`saveAsJson` method.\n\n## Dataframes with prefilled data\n\nIn order to test data processing algorithms, one can use \"toy\" datasets. The library exposes several of them:\n\n### Iris dataset - function `getIrisDataFrame`\n\nOne can create a dataframe filled with [Iris](https://www.kaggle.com/datasets/uciml/iris) data: \n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = getIrisDataFrame();\n\n  print(data);\n  // DataFrame (150 x 6)\n  // Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species\n  // ...\n}\n```\n\n### Pima Indians diabetes dataset - function `getPimaIndiansDiabetesDataFrame`\n\nOne can create a dataframe filled with [Pima Indians diabetes](https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database) data: \n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = getPimaIndiansDiabetesDataFrame();\n\n  print(data);\n  // DataFrame (768 x 9)\n  // Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome\n  // ...\n}\n```\n\n### Red wine quality dataset - function `getWineQualityDataframe`\n\nOne can create a dataframe filled with [Red wine quality](https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009) data: \n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = getWineQualityDataframe();\n\n  print(data);\n  // DataFrame (1599 x 12)\n  // fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality\n  // ...\n}\n```\n\n### Boston housing dataset - function `getHousingDataframe`\n\nOne can create a dataframe filled with [Boston housing](http://lib.stat.cmu.edu/datasets/boston) data: \n\n```dart\nimport 'package:ml_dataframe/ml_dataframe.dart';\n\nvoid main() {\n  final data = getHousingDataframe();\n\n  print(data);\n  // DataFrame (506 x 14)\n  //    CRIM     ZN   INDUS   
CHAS     NOX      RM   ...   MEDV\n  // 0.00632   18.0    2.31      0   0.538   6.575   ...   24.0\n  // 0.02731    0.0    7.07      0   0.469   6.421   ...   21.6\n  // 0.02729    0.0    7.07      0   0.469   7.185   ...   34.7\n  // 0.03237    0.0    2.18      0   0.458   6.998   ...   33.4\n  // 0.06905    0.0    2.18      0   0.458   7.147   ...   36.2\n  //     ...    ...     ...    ...     ...     ...   ...    ...\n  // 0.06263    0.0   11.93      0   0.573   6.593   ...   22.4\n  // 0.04527    0.0   11.93      0   0.573    6.12   ...   20.6\n  // 0.06076    0.0   11.93      0   0.573   6.976   ...   23.9\n  // 0.10959    0.0   11.93      0   0.573   6.794   ...   22.0\n  // 0.04741    0.0   11.93      0   0.573    6.03   ...   11.9\n}\n```\n\n## Contacts\nIf you have questions, feel free to text me on\n - [Twitter](https://twitter.com/ilgyrd) \n - [Facebook](https://www.facebook.com/ilya.gyrdymov)\n - [Linkedin](https://www.linkedin.com/in/gyrdym/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyrdym%2Fml_dataframe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgyrdym%2Fml_dataframe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyrdym%2Fml_dataframe/lists"}