{"id":16940062,"url":"https://github.com/spencermountain/efrt","last_synced_at":"2025-04-05T15:10:20.354Z","repository":{"id":56061192,"uuid":"83158681","full_name":"spencermountain/efrt","owner":"spencermountain","description":"neato compression for key-value data","archived":false,"fork":false,"pushed_at":"2024-09-27T14:41:17.000Z","size":498,"stargazers_count":103,"open_issues_count":3,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-27T08:08:21.492Z","etag":null,"topics":["compression","trie"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spencermountain.png","metadata":{"files":{"readme":"README.md","changelog":"changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-25T19:59:59.000Z","updated_at":"2025-03-13T04:47:17.000Z","dependencies_parsed_at":"2024-10-27T00:58:09.084Z","dependency_job_id":null,"html_url":"https://github.com/spencermountain/efrt","commit_stats":{"total_commits":152,"total_committers":2,"mean_commits":76.0,"dds":0.09868421052631582,"last_synced_commit":"4d0f485946813ca3003e30880177b169abc448ec"},"previous_names":["nlp-compromise/efrt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spencermountain%2Fefrt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spencermountain%2Fefrt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spencermountain%2Fefrt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/spencermountain%2Fefrt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spencermountain","download_url":"https://codeload.github.com/spencermountain/efrt/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247353749,"owners_count":20925329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","trie"],"created_at":"2024-10-13T21:06:11.684Z","updated_at":"2025-04-05T15:10:20.324Z","avatar_url":"https://github.com/spencermountain.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://cloud.githubusercontent.com/assets/399657/23590290/ede73772-01aa-11e7-8915-181ef21027bc.png\" /\u003e\n  \u003cdiv\u003ecompression of key-value data\u003c/div\u003e\n  \u003ca href=\"https://npmjs.org/package/efrt\"\u003e\n    \u003cimg src=\"https://img.shields.io/npm/v/efrt.svg?style=flat-square\" /\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://unpkg.com/efrt/builds/efrt.min.js\"\u003e\n     \u003cimg src=\"https://badge-size.herokuapp.com/spencermountain/efrt/master/builds/efrt.min.js\" /\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://nodejs.org/api/documentation.html#documentation_stability_index\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/stability-stable-green.svg?style=flat-square\" /\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003ccode\u003enpm install efrt\u003c/code\u003e\n\u003c/div\u003e\n\nif your data looks 
like this:\n\n```js\nvar data = {\n  bedfordshire: 'England',\n  aberdeenshire: 'Scotland',\n  buckinghamshire: 'England',\n  argyllshire: 'Scotland',\n  bambridgeshire: 'England',\n  cheshire: 'England',\n  ayrshire: 'Scotland',\n  banffshire: 'Scotland'\n}\n```\n\nyou can compress it like this:\n\n```js\nimport { pack } from 'efrt'\nvar str = pack(data)\n//'England:b0che1;ambridge0edford0uckingham0;shire|Scotland:a0banff1;berdeen0rgyll0yr0;shire'\n```\n\nthen \\_very!\\_ quickly flip it back into:\n\n```js\nimport { unpack } from 'efrt'\nvar obj = unpack(str)\nobj['bedfordshire'] //'England'\n```\n\n\u003ch1 align=\"center\"\u003eYep,\u003c/h1\u003e\n\n**efrt** packs category-type data into a _[very compressed prefix trie](https://en.wikipedia.org/wiki/Trie)_ format, so that redundancies in the data are shared, and nothing is repeated.\n\nBy doing this clever-stuff ahead-of-time, **efrt** lets you ship _much more_ data to the client-side, without hassle or overhead.\n\nThe whole library is **8kb**, the unpack half is barely **2kb**.\n\nit is based on:\n\n- 😍 [tamper](https://nytimes.github.io/tamper/) by the [NYTimes](https://github.com/NYTimes/)\n- 💝 [lookups](https://github.com/mckoss/lookups) by [Mike Koss](https://github.com/mckoss),\n- 💓 [bits.js](http://stevehanov.ca/blog/index.php?id=120) by [Steve Hanov](https://twitter.com/smhanov)\n\n\u003ca href=\"https://monolithpl.github.io/trie-compiler/\"\u003eBenchmarks!\u003c/a\u003e\n\n\u003ch3 align=\"center\"\u003e\n  \u003ca href=\"https://rawgit.com/nlp-compromise/efrt/master/demo/index.html\"\u003eDemo!\u003c/a\u003e\n\u003c/h3\u003e\n\n\u003ch5 align=\"left\"\u003e\nBasically,\n\u003c/h5\u003e\n\n- get a js object into very compact form\n- reduce filesize/bandwidth a bunch\n- ensure the unpacking time is negligible\n- keep word-lookups on critical-path\n\n```js\nimport { pack, unpack } from 'efrt' // const {pack, unpack} = require('efrt')\n\nvar foods = {\n  strawberry: 'fruit',\n  blueberry: 'fruit',\n  
blackberry: 'fruit',\n  tomato: ['fruit', 'vegetable'],\n  cucumber: 'vegetable',\n  pepper: 'vegetable'\n}\nvar str = pack(foods)\n//'{\"fruit\":\"bl0straw1tomato;ack0ue0;berry\",\"vegetable\":\"cucumb0pepp0tomato;er\"}'\n\nvar obj = unpack(str)\nconsole.log(obj.tomato)\n//['fruit', 'vegetable']\n```\n\n---\n\n\u003ch5 align=\"left\"\u003e\nor, an Array:\n\u003c/h5\u003e\n\nif you pass it an array of strings, it just creates an object with `true` values:\n\n```js\nconst data = [\n  'january',\n  'february',\n  'march',\n  'april',\n  'may',\n  'june',\n  'july',\n  'august',\n  'september',\n  'october',\n  'november',\n  'december'\n]\nconst packd = pack(data)\n// true¦a6dec4febr3j1ma0nov4octo5sept4;rch,y;an1u0;ly,ne;uary;em0;ber;pril,ugust\nconst sameArray = Object.keys(unpack(packd))\n// same thing !\n```\n\n## Reserved characters\n\nthe keys of the object are normalized. Spaces/unicode are good, but numbers, case-sensitivity, and _some punctuation_ (semicolon, comma, exclamation-mark) are not (yet) supported.\n\n```js\nconst specialChars = new RegExp('[0-9A-Z,;!:|¦]')\n```\n\n_efrt_ is built-for, and used heavily in [compromise](https://github.com/nlp-compromise/compromise), to expand the amount of data it can ship onto the client-side.\nIf you find another use for efrt, please [drop us a line](mailto:spencermountain@gmail.com)🎈\n\n## Performance\n\n_efrt_ is tuned to be very quick to unzip. Lookups are O(1). 
Packing-up the data is the slowest part, which is usually fine:\n\n```js\nvar compressed = pack(skateboarders) //1k words (on a macbook)\nvar trie = unpack(compressed)\n// unpacking-step: 5.1ms\n\ntrie.hasOwnProperty('tony hawk')\n// cached-lookup: 0.02ms\n```\n\n## Size\n\n`efrt` will pack filesize down as much as possible, depending upon the redundancy of the prefixes/suffixes in the words, and the size of the list.\n\n- list of countries - `1.5k -\u003e 0.8k` _(46% compressed)_\n- all adverbs in wordnet - `58k -\u003e 24k` _(58% compressed)_\n- all adjectives in wordnet - `265k -\u003e 99k` _(62% compressed)_\n- all nouns in wordnet - `1,775k -\u003e 692k` _(61% compressed)_\n\nbut there are some things to consider:\n\n- bigger files compress further (see [🎈 birthday problem](https://en.wikipedia.org/wiki/Birthday_problem))\n- using efrt will reduce gains from gzip compression, which most webservers quietly use\n- english is more suffix-redundant than prefix-redundant, so non-english words may benefit from other styles\n\nAssuming your data has a low _category-to-data ratio_, you will hit break-even at about 250 keys. 
If your data is in the thousands, you can be very confident about saving your users some considerable bandwidth.\n\n## Use\n\n**IE9+**\n\n```html\n\u003cscript src=\"https://unpkg.com/efrt@latest/builds/efrt.min.cjs\"\u003e\u003c/script\u003e\n\u003cscript\u003e\n  var smaller = efrt.pack(['larry', 'curly', 'moe'])\n  var trie = efrt.unpack(smaller)\n  console.log(trie['moe'])\n\u003c/script\u003e\n```\n\nif you're doing the second step in the client, you can load just the CJS unpack-half of the library (~3k):\n\n```js\nconst unpack = require('efrt/unpack') // node/cjs\n```\n\n```html\n\u003cscript src=\"https://unpkg.com/efrt@latest/builds/efrt-unpack.min.cjs\"\u003e\u003c/script\u003e\n\u003cscript\u003e\n  var trie = unpack(compressedStuff)\n  trie.hasOwnProperty('miles davis')\n\u003c/script\u003e\n```\n\nThanks to [John Resig](https://johnresig.com/) for his fun [trie-compression post](https://johnresig.com/blog/javascript-trie-performance-analysis/) on his blog, and [Wiktor Jakubczyc](https://github.com/monolithpl) for his performance analysis work.\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspencermountain%2Fefrt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspencermountain%2Fefrt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspencermountain%2Fefrt/lists"}