{"id":18450544,"url":"https://github.com/uwdata/flechette","last_synced_at":"2025-04-04T16:09:53.458Z","repository":{"id":252929912,"uuid":"840850872","full_name":"uwdata/flechette","owner":"uwdata","description":"Fast, lightweight access to Apache Arrow data.","archived":false,"fork":false,"pushed_at":"2025-03-18T18:47:26.000Z","size":11024,"stargazers_count":83,"open_issues_count":0,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-21T14:09:00.471Z","etag":null,"topics":["arrow","data","interchange"],"latest_commit_sha":null,"homepage":"https://idl.uw.edu/flechette/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uwdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-10T22:06:57.000Z","updated_at":"2025-03-18T20:46:53.000Z","dependencies_parsed_at":"2025-03-21T14:18:30.319Z","dependency_job_id":null,"html_url":"https://github.com/uwdata/flechette","commit_stats":null,"previous_names":["uwdata/flechette"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uwdata%2Fflechette","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uwdata%2Fflechette/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uwdata%2Fflechette/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uwdata%2Fflechette/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uwdata","download_url":"https://codeload.github.com/uwdata/flechette/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246049630,"owners_count":20715511,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arrow","data","interchange"],"created_at":"2024-11-06T07:25:34.807Z","updated_at":"2025-03-28T15:04:35.695Z","avatar_url":"https://github.com/uwdata.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Flechette \u003ca href=\"https://idl.uw.edu/flechette\"\u003e\u003cimg align=\"right\" src=\"https://raw.githubusercontent.com/uwdata/flechette/main/docs/assets/logo.svg\" height=\"38\"\u003e\u003c/img\u003e\u003c/a\u003e\n\n**Flechette** is a JavaScript library for reading and writing the [Apache Arrow](https://arrow.apache.org/) columnar in-memory data format. It provides a faster, lighter, zero-dependency alternative to the [Arrow JS reference implementation](https://github.com/apache/arrow/tree/main/js).\n\nFlechette performs fast extraction and encoding of data columns in the Arrow binary IPC format, supporting ingestion of Arrow data from sources such as [DuckDB](https://duckdb.org/) and Arrow use in JavaScript data analysis tools like [Arquero](https://github.com/uwdata/arquero), [Mosaic](https://github.com/uwdata/mosaic), [Observable Plot](https://observablehq.com/plot/), and [Vega-Lite](https://vega.github.io/vega-lite/).\n\nFor documentation, see the [**API Reference**](https://idl.uw.edu/flechette/api). For code, see the [**Flechette GitHub repo**](https://github.com/uwdata/flechette).\n\n## Why Flechette?\n\nIn the process of developing multiple data analysis packages that consume Arrow data (including Arquero, Mosaic, and Vega), we've had to develop workarounds for the performance and correctness of the Arrow JavaScript reference implementation. Instead of workarounds, Flechette addresses these issues head-on.\n\n* _Speed_. Flechette provides better performance. Performance tests show 1.3-1.6x faster value iteration, 2-7x faster array extraction, 7-11x faster row object extraction, and 1.5-3.5x faster building of Arrow columns.\n\n* _Size_. Flechette is smaller: ~43k minified (~14k gzip'd) versus 163k minified (~43k gzip'd) for Arrow JS. Flechette's encoders and decoders also tree-shake cleanly, so you only pay for what you need in custom bundles.\n\n* _Coverage_. Flechette supports data types unsupported by the reference implementation, including decimal-to-number conversion, month/day/nanosecond time intervals (as used by DuckDB), run-end encoded data, binary views, and list views.\n\n* _Flexibility_. Flechette includes options to control data value conversion, such as numerical timestamps vs. Date objects for temporal data, and numbers vs. bigint values for 64-bit integer data.\n\n* _Simplicity_. Our goal is to provide a smaller, simpler code base in the hope that it will make it easier for ourselves and others to improve the library. If you'd like to see support for additional Arrow features, please [file an issue](https://github.com/uwdata/flechette/issues) or [open a pull request](https://github.com/uwdata/flechette/pulls).\n\nThat said, no tool is without limitations or trade-offs. Flechette assumes simpler inputs (byte buffers, no promises or streams), has less strict TypeScript typings, and may have a slightly slower initial parse (as it decodes dictionary data upfront for faster downstream access).\n\n## What's with the name?\n\nThe project name stems from the French word [fléchette](https://en.wikipedia.org/wiki/Flechette), which means \"little arrow\" or \"dart\". 🎯\n\n## Examples\n\n### Load and Access Arrow Data\n\n```js\nimport { tableFromIPC } from '@uwdata/flechette';\n\nconst url = 'https://cdn.jsdelivr.net/npm/vega-datasets@2/data/flights-200k.arrow';\nconst ipc = await fetch(url).then(r =\u003e r.arrayBuffer());\nconst table = tableFromIPC(ipc);\n\n// print table size: (231083 x 3)\nconsole.log(`${table.numRows} x ${table.numCols}`);\n\n// inspect schema for column names, data types, etc.\n// [\n//   { name: \"delay\", type: { typeId: 2, bitWidth: 16, signed: true }, ...},\n//   { name: \"distance\", type: { typeId: 2, bitWidth: 16, signed: true }, ...},\n//   { name: \"time\", type: { typeId: 3, precision: 1 }, ...}\n// ]\n// typeId: 2 === Type.Int, typeId: 3 === Type.Float\nconsole.log(JSON.stringify(table.schema.fields, 0, 2));\n\n// convert a single Arrow column to a value array\n// when possible, zero-copy access to binary data is used\nconst delay = table.getChild('delay').toArray();\n\n// data columns are iterable\nconst time = [...table.getChild('time')];\n\n// data columns provide random access\nconst time0 = table.getChild('time').at(0);\n\n// extract all columns into a { name: array, ... } object\n// { delay: Int16Array, distance: Int16Array, time: Float32Array }\nconst columns = table.toColumns();\n\n// convert Arrow data to an array of standard JS objects\n// [ { delay: 14, distance: 405, time: 0.01666666753590107 }, ... ]\nconst objects = table.toArray();\n\n// create a new table with a selected subset of columns\n// use this first to limit toColumns or toArray to fewer columns\nconst subtable = table.select(['delay', 'time']);\n```\n\n### Build and Encode Arrow Data\n\n```js\nimport {\n  bool, dictionary, float32, int32, tableFromArrays, tableToIPC, utf8\n} from '@uwdata/flechette';\n\n// data defined using standard JS types\n// both arrays and typed arrays work well\nconst arrays = {\n  ints: [1, 2, null, 4, 5],\n  floats: [1.1, 2.2, 3.3, 4.4, 5.5],\n  bools: [true, true, null, false, true],\n  strings: ['a', 'b', 'c', 'b', 'a']\n};\n\n// create table with automatically inferred types\nconst tableInfer = tableFromArrays(arrays);\n\n// encode table to bytes in Arrow IPC stream format\nconst ipcInfer = tableToIPC(tableInfer);\n\n// create table using explicit types\nconst tableTyped = tableFromArrays(arrays, {\n  types: {\n    ints: int32(),\n    floats: float32(),\n    bools: bool(),\n    strings: dictionary(utf8())\n  }\n});\n\n// encode table to bytes in Arrow IPC file format\nconst ipcTyped = tableToIPC(tableTyped, { format: 'file' });\n```\n\n### Customize Data Extraction\n\nData extraction can be customized using options provided to table generation methods. By default, temporal data is returned as numeric timestamps, 64-bit integers are coerced to numbers, map-typed data is returned as an array of [key, value] pairs, and struct/row objects are returned as vanilla JS objects with extracted property values. These defaults can be changed via conversion options that push (or remove) transformations to the underlying data batches.\n\n```js\nconst table = tableFromIPC(ipc, {\n  useDate: true,          // map dates and timestamps to Date objects\n  useDecimalInt: true,    // use BigInt for decimals, do not coerce to number\n  useBigInt: true,        // use BigInt for 64-bit ints, do not coerce to number\n  useMap: true,           // create Map objects for [key, value] pair lists\n  useProxy: true          // use zero-copy proxies for struct and table row objects\n});\n```\n\nThe same extraction options can be passed to `tableFromArrays`. For more, see the [**API Reference**](https://idl.uw.edu/flechette/api).\n\n## Build Instructions\n\nTo build and develop Flechette locally:\n\n- Clone https://github.com/uwdata/flechette.\n- Run `npm i` to install dependencies.\n- Run `npm test` to run test cases, `npm run perf` to run performance benchmarks, and `npm run build` to build output files.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuwdata%2Fflechette","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuwdata%2Fflechette","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuwdata%2Fflechette/lists"}