{"id":19732068,"url":"https://github.com/ffalt/xlsx-extract","last_synced_at":"2025-09-11T23:08:02.520Z","repository":{"id":12193412,"uuid":"14797860","full_name":"ffalt/xlsx-extract","owner":"ffalt","description":"nodejs lib for extracting data from XLSX files","archived":false,"fork":false,"pushed_at":"2023-03-17T12:44:15.000Z","size":2527,"stargazers_count":44,"open_issues_count":2,"forks_count":17,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-11-08T07:34:58.208Z","etag":null,"topics":["extracting-data","node-module","xlsx"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ffalt.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2013-11-29T09:55:43.000Z","updated_at":"2024-10-18T06:21:18.000Z","dependencies_parsed_at":"2024-06-18T19:53:20.380Z","dependency_job_id":"4b261aec-8451-47f5-a284-07cdbb4df212","html_url":"https://github.com/ffalt/xlsx-extract","commit_stats":{"total_commits":389,"total_committers":10,"mean_commits":38.9,"dds":0.3419023136246787,"last_synced_commit":"c4affa2c3af82f89190ff8e15bc7228a3f9a7737"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffalt%2Fxlsx-extract","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffalt%2Fxlsx-extract/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffalt%2Fxlsx-extract/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/ffalt%2Fxlsx-extract/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ffalt","download_url":"https://codeload.github.com/ffalt/xlsx-extract/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224194630,"owners_count":17271495,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extracting-data","node-module","xlsx"],"created_at":"2024-11-12T00:24:37.034Z","updated_at":"2024-11-12T00:24:37.637Z","avatar_url":"https://github.com/ffalt.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# xlsx-extract\n\nextracts data from XLSX files with low memory footprint\n\n\nxlsx-files can be pretty big, so nodejs \u0026 full featured xlsx-modules can reach memory limits or just use more than is needed for that task. 
(--max-old-space-size \u0026 --stack_size can't help you all the time either)\n\nhence these magnificent features:\n\n- files are parsed with sax parser `sax` or `node-expat`\n- get rows/cells each by callback or write them to a .tsv or .json file\n\n\n[![NPM](https://nodei.co/npm/xlsx-extract.png?downloads=true\u0026downloadRank=true\u0026stars=true)](https://nodei.co/npm/xlsx-extract/)\n\n![test](https://github.com/ffalt/xlsx-extract/workflows/test/badge.svg)\n[![license](https://img.shields.io/npm/l/xlsx-extract.svg)](http://opensource.org/licenses/MIT) \n[![known vulnerabilities](https://snyk.io/test/github/ffalt/xlsx-extract/badge.svg)](https://snyk.io/test/github/ffalt/xlsx-extract) \n[![certification](https://api.codacy.com/project/badge/Grade/7bd868b2fb1c4f38ad9ef2ffb698c314)](https://www.codacy.com/app/ffalt/xlsx-extract) \n[![total downloads](https://badgen.net/npm/dt/xlsx-extract)](https://badgen.net/npm/dt/xlsx-extract)\n\n\n## Install\n\n```\nnpm install xlsx-extract\n```\n\nThe XML files of the format are parsed with [sax-js](https://github.com/isaacs/sax-js) by default. \n\nIf you want to use the faster [node-expat](https://github.com/astro/node-expat) parser please install it manually and use the {parser:\"expat\"} option. (Needs native compiling on the destination system)\n```\nnpm install node-expat\n```\n\n\n## Options\n\n```\n\ninterface IXLSXExtractOptions {\n\t// sheet selection (provide one of the following)\n\tsheet_name?: string; // select by sheet name\n\tsheet_nr?: string; // default \"1\" - select by number of the sheet starting on 1\n\tsheet_id?: string; // select by sheet id, e.g. \"1\"\n\tsheet_rid?: string; // select by internal sheet rid, e.g. 
\"rId1\"\n\tsheet_all?: boolean; // default false - select all sheets\n\t// sax parser selection\n\tparser?: string; // default \"sax\" - 'sax'|'expat'\n\t// row selection\n\tignore_header?: number; // default 0 - the number of header lines to ignore\n\tinclude_empty_rows?: boolean; // default false - include empty rows in the middle/at start\n\t// how to output sheet, rows and cells\n\tformat?: string; // default array - convert to 'array'||'json'||'tsv'||'obj'\n\t// tsv output options\n\ttsv_float_comma?: boolean; // default false - use \",\" as decimal point for floats\n\ttsv_delimiter?: string; // default '\\t' - use specified character as field delimiter\n\ttsv_endofline?: string; // default depending on your operating system (node os.EOL) e.g. '\\n'\n\t// cell value formats\n\traw_values?: boolean;  // default false - do not apply cell formats (get values as string as in xlsx)\n\tround_floats?: boolean; // default true - round float values as the cell format defines (values will be reported as parsed floats otherwise)\n\tdate1904?: boolean;   // default false - use date 1904 conversion\n\tignore_timezone?: boolean; // default false - ignore timezone in date parsing\n\tconvert_values?: { // apply cell number formats or not (values will be reported as strings otherwise)\n\t\tints?: boolean;  // rounds to int if number format is for int\n\t\tfloats?: boolean;  // rounds floats according to float number format\n\t\tdates?: boolean;  // converts xlsx date to js date\n\t\tbools?: boolean; // converts xlsx bool to js boolean\n\t};\n\t// xlsx structure options\n\tworkfolder?: string; // default 'xl' - the workbook subfolder in zip structure\n}\n\n\n\n```\n\n## Convenience API\n\n```javascript\n\n\tvar XLSX = require('xlsx-extract').XLSX;\n\n\t//dump arrays\n\tnew XLSX().extract('path/to/file.xlsx', {sheet_id:1}) // or sheet_name or sheet_nr\n\t\t.on('sheet', function (sheet) {\n\t\t\tconsole.log('sheet',sheet);  //sheet is array [sheetname, sheetid, 
sheetnr]\n\t\t})\n\t\t.on('row', function (row) {\n\t\t\tconsole.log('row', row);  //row is an array of values or []\n\t\t})\n\t\t.on('cell', function (cell) {\n\t\t\tconsole.log('cell', cell); //cell is a value or null\n\t\t})\n\t\t.on('error', function (err) {\n\t\t\tconsole.error('error', err);\n\t\t})\n\t\t.on('end', function (err) {\n\t\t\tconsole.log('eof');\n\t\t});\n\n\t//dump by row in tsv-format\n\tnew XLSX().extract('path/to/file.xlsx', {sheet_id:1, format:'tsv'}) // or sheet_name or sheet_nr\n\t\t.on('sheet', function (sheet) {\n\t\t\tconsole.log('sheet', sheet);  //sheet is tsv sheetname sheetnr\n\t\t})\n\t\t.on('row', function (row) {\n\t\t\tconsole.log(row); //row is a tsv line\n\t\t})\n\t\t.on('cell', function (cell) {\n\t\t\tconsole.log(cell); //cell is a tsv value\n\t\t})\n\t\t.on('error', function (err) {\n\t\t\tconsole.error(err);\n\t\t})\n\t\t.on('end', function (err) {\n\t\t\tconsole.log('eof');\n\t\t});\n\n\t//convert to tsv-file (sheet info is not written to file)\n\tnew XLSX().convert('path/to/file.xlsx', 'path/to/destfile.tsv')\n\t\t.on('error', function (err) {\n\t\t\tconsole.error(err);\n\t\t})\n\t\t.on('end', function () {\n\t\t\tconsole.log('written');\n\t\t});\n\n\t//convert to json-file (sheet info is not written to file)\n\tnew XLSX().convert('path/to/file.xlsx', 'path/to/destfile.json')\n\t\t.on('error', function (err) {\n\t\t\tconsole.error(err);\n\t\t})\n\t\t.on('end', function () {\n\t\t\tconsole.log('written');\n\t\t});\n\n\n\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fffalt%2Fxlsx-extract","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fffalt%2Fxlsx-extract","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fffalt%2Fxlsx-extract/lists"}