{"id":25806002,"url":"https://github.com/alipsa/spreadsheets","last_synced_at":"2026-05-15T09:09:51.483Z","repository":{"id":39609397,"uuid":"265791587","full_name":"Alipsa/spreadsheets","owner":"Alipsa","description":"Handling (read/write) spreadsheets in Renjin R","archived":false,"fork":false,"pushed_at":"2022-04-12T21:08:40.000Z","size":366,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-05-01T11:27:03.643Z","etag":null,"topics":["jvm","ods","openoffice-calc","r-package","renjin","xlsx"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Alipsa.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":["Alipsa"]}},"created_at":"2020-05-21T08:09:28.000Z","updated_at":"2022-06-24T16:07:22.000Z","dependencies_parsed_at":"2022-09-20T06:13:12.165Z","dependency_job_id":null,"html_url":"https://github.com/Alipsa/spreadsheets","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alipsa%2Fspreadsheets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alipsa%2Fspreadsheets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alipsa%2Fspreadsheets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alipsa%2Fspreadsheets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Alipsa","download_url":"https://codeload.github.com/Alipsa/spreadsheets/tar.gz/refs/head
s/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241052529,"owners_count":19901043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jvm","ods","openoffice-calc","r-package","renjin","xlsx"],"created_at":"2025-02-27T19:52:36.212Z","updated_at":"2026-05-15T09:09:48.746Z","avatar_url":"https://github.com/Alipsa.png","language":"Java","funding_links":["https://github.com/sponsors/Alipsa"],"categories":[],"sub_categories":[],"readme":"# Spreadsheets - Handling spreadsheets in Renjin R\n\nThis package gives you the ability to work with (read, write) spreadsheets.\n\nIt supports reading and writing of Excel and Open Office/LibreOffice spreadsheet files.\n\nTo use it, add the following dependency to your pom.xml:\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ese.alipsa\u003c/groupId\u003e\n  \u003cartifactId\u003espreadsheets\u003c/artifactId\u003e\n  \u003cversion\u003e1.3.4\u003c/version\u003e\n\u003c/dependency\u003e\n```\n(Note that versions 1.3.4 and later require Java 11.) The module name is se.alipsa.spreadsheets.\n\n...and use it in your Renjin R code after loading it with:\n```r\nlibrary(\"se.alipsa:spreadsheets\")\n```\n\n## Usage\n* All indexes start with 1 (as is common practice in R), e.g. sheetNumber 1 refers to the \nfirst sheet in the spreadsheet and column number 1 is the first (A) column etc.\n\nThe file extension is used to determine whether it is an Excel (xls/xlsx) or Calc (ods) file. 
\n\n### findRowNumber: Find a row in a column\nTo find the first row where the cell value matches the cellContent parameter:  \n\n```r\nrowNum \u003c- findRowNumber(filePath = \"df.xlsx\", sheet = 1, column = 1, \"Iris\")\n```\n\nYou can also reference the sheet by name:\n\n```r\nrowNum \u003c- findRowNumber(filePath = \"df.ods\", sheet = \"theSheetName\", column = 1, \"Iris\")\n```\n\nor use only names:\n\n```r\nrowNum \u003c- findRowNumber(filePath = \"df.xlsx\", sheet = \"theSheetName\", column = \"A\", \"Iris\")\n```\n\n### findColumnNumber: Find a column in a row\nTo find the first column where the cell value matches the cellContent parameter:  \n\n```r\ncolNum \u003c- findColumnNumber(filePath = \"df.xlsx\", sheet = 1, row = 2, \"carb\")\n```\n\nYou can also reference the sheet by name:\n\n```r\ncolNum \u003c- findColumnNumber(\"df.xlsx\", \"project-dashboard\", 2, \"carb\")\n```\n\nThe return value of findColumnNumber is an Integer with the matching column index\nor -1 if no such cell was found.\n\n### as.columnIndex and as.columnName: Get the index number for the corresponding column name and vice versa\nSometimes it is more convenient to refer to a column by its name, e.g. A for the first column, B for the second.\nTo convert an index to a name you can do:\n```r\nprint(as.columnName(14))\n# [1] \"N\"\n```\n\nBut sometimes you want the other way around:\n\n```r\nprint(as.columnIndex(\"AF\"))\n# [1] 32\n```\n\n### importSpreadsheet: import an Excel or Open Office spreadsheet\nReads the content of the spreadsheet and returns a data.frame\n```r\nexcelDf \u003c- importSpreadsheet(\n    filePath = \"df.xlsx\",\n    sheet = 1,\n    startRow = 2,\n    endRow = 34,\n    startColumn = 1,\n    endColumn = 11,\n    firstRowAsColumnNames = TRUE\n  )\n```\nThe parameters are as follows:\n* filePath: The path to the spreadsheet file to import. It must be a path to a file that is physically accessible. 
A remote URL will not work.\n* sheet: The sheet index (index starting with 1) for the sheet to import. Can alternatively be the name of the sheet. Default: 1 \n* startRow: The row to start reading from. Default: 1\n* endRow: The last row to read from\n* startColumn: The column index (or name e.g. \"A\") to start reading from. Default: 1\n* endColumn: The last column index (or name) to read from.\n* firstRowAsColumnNames: If true then use the values of the first row as column names for the data.frame\n\n_Return value_ A data.frame of Character vectors (strings).\n\nSince the resulting dataframe will return all values as character strings (except missing values which will be NA), \nyou will likely need to massage the data after the import to get what you want. For example:\n```r\nexcelDf$mpg \u003c- as.numeric(sub(\",\", \".\", excelDf$mpg))\n```\nIn the example above, the regional setting of the Excel sheet used comma as the decimal separator, so we replace the commas with \ndots so that we can then convert the values to numerics.\n\nDates are converted to strings in the format yyyy-MM-dd HH:mm:ss.SSS which is the default format for POSIXct and POSIXlt so you can do:\n```r\nlibrary(\"se.alipsa:spreadsheets\")\ntimeMeasuresDf \u003c- importSpreadsheet(\n  filePath = \"E:\\\\some\\\\path\\\\data\\\\timeMeasures.ods\",\n  sheet = 1,\n  startRow = 1,\n  endRow = 7,\n  startColumn = \"A\",\n  endColumn = \"F\",\n  firstRowAsColumnNames = TRUE\n)\n# change the startDate column to Dates: \ntimeMeasuresDf$startDate \u003c- as.Date(as.POSIXlt(timeMeasuresDf$startDate))\n```\n\n### importSpreadsheets: import several Excel or Open Office spreadsheets at once\nReads the content of the spreadsheets and returns a named list of data.frames\n```r\nsheets \u003c- importSpreadsheets(\n  filePath=paste0(getwd(), \"/mySpreadsheet.ods\"),\n  sheets = c('mtcars', 'iris', 'PlantGrowth'),\n  importAreas = list(\n    'mtcars' = c(1, 33, 1, 11),\n    'iris' = c(2, 152, 1, 5),\n    'PlantGrowth' = c(3, 32, 2, 
3)\n  ),\n  firstRowAsColumnNames = list(\n    'mtcars' = TRUE,\n    'iris' = TRUE,\n    'PlantGrowth' = FALSE\n  )\n)\n  \nirisDf \u003c- sheets$iris \n```\nThe parameters are as follows:\n* _filePath_ the full path or relative path to the Excel file\n* _sheetNames_ a vector of sheet names e.g. `c('sheet1', 'sheet2')`\n* _importAreas_ a named list of numeric vectors containing start row, end row, start column, end column e.g.\n  `list('sheet1' = c(1, 33, 1, 11), 'sheet2' = c(2, 152, 1, 5))`\n* _firstRowAsColumnNames_ a named list of logical values for whether the first row should be used as \n  column names for the dataframe in the sheet or not.\n  E.g. `list('sheet1' = TRUE, 'sheet2' = FALSE)`\n\n_Return value_ a named list of data.frames (ListVectors) corresponding to the imported sheets\n\nSee importSpreadsheet for notes about value conversion.\n\n### exportSpreadsheet: export an Excel or Open Office spreadsheet\n\nTo export to a new spreadsheet use\n```r\nexportSpreadsheet(filePath, df)\n```\nWhere filePath is the path to the new spreadsheet and df is the data.frame to export. If the file already exists, no action\nwill be taken.\n\n\nThe \"upsert\" (create new if it does not exist, update if it exists) version is:\n\n```r\nexportSpreadsheet(filePath, df, sheet)\n```\nWhere df is the data.frame to export, filePath is the path to the new or existing spreadsheet, \nand sheet is the name of the sheet to create or update. \n\nThe function returns TRUE if successful or FALSE if not. 
\n\n### exportSpreadsheets: export multiple data.frames to an Excel or Open Office spreadsheet\nJust like above, when you have several dataframes that you want to export in one go you can\ndo it like this:\n```r\nexportSpreadsheets(\n  filePath = paste0(getwd(), \"/dfExport.ods\"), \n  dfList = list(mtcars, iris, PlantGrowth), \n  sheetNames = c(\"cars\", \"flowers\", \"plants\")\n)\n```\nThe number of sheet names must match the number of data frames in the list.\n\n\nThere are more functions in the API than are described above; see [SpreadsheetTests.R](https://github.com/Alipsa/spreadsheets/blob/master/src/test/R/SpreadsheetTests.R) for more examples.\n\n## Background / motivation\nWhy not just use one of the existing packages such as xlsx, XLConnect, or gdata? \nSometimes I had problems with loading these packages, or some functions did not work (none of them fully passes \nthe tests on Renjin CRAN).\nAlso, I missed some search functionality to make imports more dynamic in my R code as well as the ability to handle \nthe OpenOffice format (readOds is not available in Renjin yet).\nAs the gcc-bridge (which compiles C code to JVM bytecode) gets better, the first kind of problem will disappear,\nbut I needed something \"now\". This is a \"Renjin native\" package which attempts to address some of those issues.\n\n## Dependencies / third-party libraries used\n\n1. Renjin (https://www.renjin.org/, https://github.com/bedatadriven/renjin).\nThis is a Renjin package (extension), so it requires Renjin to use. \nI have tested with version 3.5-beta76 but there is no particular Renjin version required, \nanything from version 0.9 and later should work.\n\n2. POI (https://poi.apache.org/)\nUsed to read and write Excel files. Built and tested with POI version 5.\n\n3. 
SODS (https://github.com/miachm/SODS)\nUsed to read and write Open Document Spreadsheets (Open Office / Libre Office Calc files).\nBuilt and tested with SODS version 1.4.\n\n\n## Version history\n\n### 1.3.5\n- Add Automatic-Module-Name\n\n### 1.3.4, Apr 12, 2022\n- Add support for import of multiple sheets at once\n- Upgrade to Java 11\n- Upgrade Apache POI dependencies\n\n### 1.3.3, Feb 6, 2022\n- Make ods import behave similarly to Excel when importing percentages (i.e. import them as decimals, e.g. 0.54 instead of 54%)\n- Improve test: check that column headers are imported correctly\n- Upgrade POI and slf4j\n\n### 1.3.2, Jan 30, 2022\n- Change data.frame creation of row.names to be future-proof by replacing the RowNamesVector with a ConvertingStringVector\n- Update POI, logging and some plugin versions\n\n### 1.3.1, Aug 22, 2021\n- Upgrade dependencies (notably SODS which in version 1.4 has a greatly reduced footprint)\n\n### 1.3, Feb 21, 2021\n- Close workbook properly when calling getSheetNames()\n- Upgrade SODS and POI versions\n\n### 1.2, Aug 02, 2020\n- Changed from primitives to Object wrappers (int -\u003e Integer etc.) so that we can correctly return\nNULL for missing values (which will be NA in the data.frame).\n- Allow export to update existing file.\n\n### 1.1, May 31, 2020\n- API change: modified the API so that we always start with filePath to make it more consistent.\n- Renamed the columnIndex function to as.columnIndex, and likewise columnName to as.columnName.\n- Add support for exporting multiple data.frames\n- Enhanced documentation\n\n### 1.0 Initial release, May 27, 2020","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falipsa%2Fspreadsheets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falipsa%2Fspreadsheets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falipsa%2Fspreadsheets/lists"}