Read, Write, Format Excel 2007 and Excel 97/2000/XP/2003 Files

Provides R functions to read, write, and format Excel 2007 and Excel 97/2000/XP/2003 files.


An R package to read, write, format Excel 2007 and Excel 97/2000/XP/2003 files

The package provides R functions to read, write, and format Excel files. It depends on Java, which makes it available on most operating systems.

Stable version from CRAN

install.packages('xlsx')

Or development version from GitHub

devtools::install_github('dragua/xlsx')

Quick start

To read the first sheet from a spreadsheet into a data.frame

read.xlsx2('C:/temp/file.xlsx', 1)

To write a data.frame to a spreadsheet

write.xlsx2(iris, file='C:/temp/iris.xlsx')
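
The two calls above can be combined into a round trip. The following is a minimal sketch, assuming the package and a working Java installation are available; the temporary path and the sheet name are placeholders chosen for this example, and the argument names follow the package help pages.

```r
library(xlsx)

# Temporary path used only for this illustration.
path <- tempfile(fileext = ".xlsx")

# Write without row names so the sheet holds exactly the five iris columns.
write.xlsx2(iris, file = path, sheetName = "iris", row.names = FALSE)

# read.xlsx2 defaults to colClasses = "character", so the column types
# are stated explicitly here.
back <- read.xlsx2(path, sheetIndex = 1,
                   colClasses = c(rep("numeric", 4), "character"))
str(back)
```

Without the colClasses argument the measurements would presumably come back as text, so stating the classes is the safer habit with read.xlsx2.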

Issues/Mailing list

To report a bug, use the project's Issues page.

Questions should be asked on the dedicated mailing list

News

NEWS for package xlsx

Changes in version 0.6.1 (released 2018-06-10)

o Export S3 methods for CellBlock, CellStyle

o Do not depend on (just import) rJava and xlsxjars

o Minor fixes to pass CRAN again.

o Add Cole Arendt as an author

Changes in version 0.6.0 (released 2015-11-29)

o Added support to read and write password protected xlsx files. Note that this is for xlsx format ONLY (no binary xls format). Thanks to Heather Turner for code contribution. Note that on Linux LibreOffice does not open password protected workbooks as of now. See issue 49.

o Fixed a bug in read.xlsx where an empty row triggered an error. Reported by maxnax, see issue 57.
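
As a rough illustration of the password support mentioned above: this sketch assumes the password argument of write.xlsx and read.xlsx as documented in the package help pages. The file path and the password string are placeholders, and whether encryption works may depend on the local Java setup.

```r
library(xlsx)

path <- tempfile(fileext = ".xlsx")  # xlsx format only, not xls

# "s3cret" is a placeholder password chosen for this example.
write.xlsx(head(mtcars), file = path, password = "s3cret")

# The same password is supplied when reading the file back.
cars <- read.xlsx(path, sheetIndex = 1, password = "s3cret")
```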

Changes in version 0.5.7 (released 2014-08-01)

o Fixed issue 41. There was a bug in the '+' method for CellStyle: combining a Font with a Fill did not work properly.

o Added a new function called setRowHeight (wish of Sven Neulinger). See issue 43.

o Function write.xlsx did not check properly for POSIXct column classes, which resulted in improperly formatted spreadsheets. See issue 45.

o Added a new function addAutoFilter to set a filter on a sheet.
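
The two new functions in this release might be used together as follows. This is a sketch only; the sheet name, row height, and cell range are values picked for illustration.

```r
library(xlsx)

wb <- createWorkbook(type = "xlsx")
sheet <- createSheet(wb, sheetName = "iris")
addDataFrame(iris, sheet, row.names = FALSE)

# Make the header row taller; the height is given in points.
header <- getRows(sheet, rowIndex = 1)
setRowHeight(header, inPoints = 30)

# Put filter drop-downs on the five header cells.
addAutoFilter(sheet, "A1:E1")

path <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, path)
```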

Changes in version 0.5.6 (released 2014-04-01)

o Functions read.xlsx, read.xlsx2 crashed when trying to read empty sheets. Now they return NULL if the sheet exists but is empty. See Issue 28.

o Fixed a corner case in addDataFrame when the data.frame has zero columns.

o Allow colors to be specified as hex strings (e.g. "#FF0000" for red). Previously, only R colors were supported. This change applies only to the xlsx format. Old style binary xls format still only supports a limited number of colors (POI limitation).

o Change the .onLoad functions from both xlsxjars and xlsx package to allow for other users to initialize the Java VM themselves (useful for other packages that depend on xlsxjars). This is not a user visible change. See Issue 36.
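
A short sketch of the hex color support described above, assuming an xlsx workbook (the change does not apply to xls). The colors, sheet name, and cell text are arbitrary choices for this example.

```r
library(xlsx)

wb <- createWorkbook(type = "xlsx")  # hex colors apply to xlsx only
sheet <- createSheet(wb, sheetName = "colors")

# A style with a red fill and bold white text, both given as hex strings.
style <- CellStyle(wb) +
  Fill(foregroundColor = "#FF0000", backgroundColor = "#FF0000") +
  Font(wb, color = "#FFFFFF", isBold = TRUE)

# Create cell A1, set its value, and apply the style.
rows <- createRow(sheet, rowIndex = 1)
cell <- createCell(rows, colIndex = 1)[[1, 1]]
setCellValue(cell, "alert")
setCellStyle(cell, style)

path <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, path)
```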

Changes in version 0.5.5 (released 2013-12-01)

o Allow CB.setMatrixData to accept character matrices too. It only worked with numeric matrices in earlier versions. See Issue 19.

o Fixed an issue with read.xlsx2 when you wanted columns of type character, but the cell was of numeric type and the value was an exact integer (say 12). In R the result was imported as "12.0" instead of "12". See Issue 25.

o DateTime and Date formatting when using write.xlsx and addDataFrame was hardcoded to the Java formats "m/d/yyyy h:mm:ss" and "m/d/yyyy" respectively. These have been added as package options to allow for other formats when using these functions. See Issue 26.

o For write.xlsx(2) when append==TRUE, check if file exists and load it only if it exists, otherwise create the workbook. As suggested by Richard Cotton. See Issue 27.
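
The date format option and the append behaviour described above can be sketched together. The option name xlsx.date.format is taken from the package help for write.xlsx; the format string, sheet names, and data are assumptions made for this example.

```r
library(xlsx)

df <- data.frame(day = as.Date("2013-12-01") + 0:2, value = 1:3)
path <- tempfile(fileext = ".xlsx")

# Override the default "m/d/yyyy" date format for this session.
old <- options(xlsx.date.format = "yyyy-mm-dd")

# The file does not exist yet; with append = TRUE the workbook is created.
write.xlsx(df, file = path, sheetName = "first", row.names = FALSE,
           append = TRUE)
# The file exists now, so this call loads it and adds a second sheet.
write.xlsx(df, file = path, sheetName = "second", row.names = FALSE,
           append = TRUE)

options(old)  # restore the previous option values

sheets <- names(getSheets(loadWorkbook(path)))
```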

Changes in version 0.5.3 (released 2013-07-11)

o Fix a bug introduced in 0.5.1 version in read.xlsx. The rowIndex argument was not respected anymore.

o Cleaned up the code in readColumns responsible for guessing the colClasses.

Changes in version 0.5.1 (released 2013-03-31)

o Fix an annoyance with read.xlsx2, so it doesn't return column names of the form c..............

o Add startRow and endRow arguments to read.xlsx. These are convenience arguments for better control, and they bring the function signature closer to that of read.xlsx2.

o A NullPointerException was triggered for read.xlsx due to incorrect indexing of the header columns in certain circumstances. See Issue 9.

o Fix an error with readColumns that was affecting read.xlsx2. By default readColumns was using the max number of rows in the sheet to determine how many rows to read. If the max number of rows in sheet was greater than the existing number of rows in the column you were reading, you were getting a NullPointerException. This may happen for malformed spreadsheets (created by some other proprietary software for example).

Changes in version 0.5.0 (released 2012-09-23)

o New functionality: Introduce the concept of CellBlock to allow faster writes/updates of cell blocks. Writing a spreadsheet is now +50% faster for medium sized spreadsheets compared to version 0.4.2. This functionality also lets you style a block of cells much more easily than before. Thanks to Alexey Stukalov for his significant contribution on the Java side and for the original impetus.

o Absolute paths are no longer needed for saveWorkbook and friends. Reported by Mike Cheetham.

o Functions write.xlsx, write.xlsx2 now allow you to create xls documents too. Previous versions only allowed xlsx documents.

o Bug fix: read.xlsx2 doesn't read an extra empty column anymore. Also colClasses were not mapped correctly to the corresponding colIndex.

o Deprecated: getMatrixValues. Use the readColumns function instead.
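
The CellBlock workflow introduced in this release could look roughly like this. It is a sketch modelled on the package help for CellBlock; the matrix, the block position, and the styles are made up for illustration.

```r
library(xlsx)

wb <- createWorkbook()
sheet <- createSheet(wb, sheetName = "CellBlock")

x <- matrix(c(1.5, -2, 3, -4.25, 5, 6), nrow = 2)

# A 2 x 3 block whose top-left corner is cell B2.
cb <- CellBlock(sheet, startRow = 2, startColumn = 2,
                noRows = nrow(x), noColumns = ncol(x))

# Write the whole matrix in one call, with a number format applied.
cs <- CellStyle(wb) + DataFormat("#,##0.00")
CB.setMatrixData(cb, x, startRow = 1, startColumn = 1, cellStyle = cs)

# Highlight the negative values; indices are relative to the block.
neg <- which(x < 0, arr.ind = TRUE)
CB.setFill(cb, Fill(foregroundColor = "red", backgroundColor = "red"),
           neg[, 1], neg[, 2])

path <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, path)
```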

Changes in version 0.4.2 (released 2012-02-08)

o New function readRows() similar to readColumns() for reading across the columns.

o Removed a stray browser() in function .guess_cell_type

Changes in version 0.4.1 (released 2012-01-22)

o Added a test for file existence in loadWorkbook. If file does not exist, the error message from read.xlsx, read.xlsx2 and loadWorkbook is now more informative. Also added path expansion in file names. (Suggested by Dirk Eddelbuettel.)

o Fixed bug in addDataFrame to allow it to add a df to existing sheets, not only to new sheets.

Changes in version 0.4.0 (released 2012-01-15)

o BACKWARDS INCOMPATIBLE CHANGES! A complete rewrite of functionality to deal with cell styles. Although I want to minimize API breaking changes, I believe that these changes are for the better as the old function createCellStyle had a whopping 11 arguments and was still incomplete. The new functionality defines an S3 object CellStyle on which you can "add" DataFormat, Font, Fill, Border, Alignment, Protection. On the note of backward compatibility, I don't want to promise anything as the package is still young. As more people start using the package, the API will freeze and if breaking changes are contemplated, a clear deprecation path will be provided.

o A Google project has been created http://code.google.com/p/rexcel/ to expose the development branch of the package and manage the social interaction. Please report all issues through this venue. Also a Google group http://groups.google.com/group/R-package-xlsx has been created for announcements, etc. Do register if you are interested in the development of the package. It may give me more impetus to resolve an issue if I know many people are using this package.

o New function addHyperlink to add hyperlinks (urls, emails) to a cell.

o New function addDataFrame to add a data.frame to an existing sheet. It allows the user to style the header, rownames, or individual columns. This is now used internally in the write.xlsx2 function.

o New function getColumns to read a rectangular shape of cells into an R data.frame. This is now used internally in the read.xlsx2 function.

o Rename the default in read.xlsx, createSheet and removeSheet to sheetName="Sheet1" from "Sheet 1". This makes it consistent with Excel 2007 names of an empty workbook.

o Added ... arguments to read.xlsx2 function to mirror read.xlsx.

o Thanks to Neal Richardson and James Ward for submitting some code and suggestions.
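
To illustrate the CellStyle, addDataFrame, and addHyperlink items above, here is a hedged sketch based on the package help pages. The styles, the data, the row chosen for the link, and the link text are illustrative assumptions.

```r
library(xlsx)

wb <- createWorkbook()
sheet <- createSheet(wb, sheetName = "Sheet1")

# Build a header style by "adding" components to a CellStyle.
header_style <- CellStyle(wb) +
  Font(wb, isBold = TRUE) +
  Border(position = "BOTTOM") +
  Alignment(horizontal = "ALIGN_CENTER")

# colStyle is a list named by column number within the data.frame.
num_style <- CellStyle(wb) + DataFormat("0.0")
addDataFrame(head(iris), sheet, row.names = FALSE,
             colnamesStyle = header_style,
             colStyle = list(`1` = num_style, `2` = num_style))

# Add a URL hyperlink to a cell below the table.
rows <- createRow(sheet, rowIndex = 9)
cell <- createCell(rows, colIndex = 1)[[1, 1]]
setCellValue(cell, "Apache POI")
addHyperlink(cell, "http://poi.apache.org/")

path <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, path)
```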

Changes in version 0.3.0 (released 2011-03-03)

o Effort has been made to make all functions of this package agnostic to the Excel version. You can now read, write and format files in Excel versions 97/2000/XP/2003 (not 95!) with file extension xls, in addition to Excel 2007 with file extension xlsx. Please report issues you encounter. Note that colors are limited for xls workbooks (see ?CellStyle).

o Read strings in a different encoding. Thanks to Wincent Huang for code contribution. Function ?getCellValue now has an encoding argument.

o Add support for Excel ranges. Thanks to Wolfgang Abele for contributing preliminary code. See ?Range.

o New function read.xlsx2 for reading spreadsheets. By moving the looping into java one gets a speed bump of one order of magnitude or better over read.xlsx.

o Documentation fixes.

o Move to version 3.7 for POI jars. See http://poi.apache.org/.

Changes in version 0.2.4 (released 2010-10-20)

o New function write.xlsx2 in which the writing is done on the java side. Speed improvements of one order of magnitude over write.xlsx on moderately large data.frames (100,000 elements).

Changes in version 0.2.3 (released 2010-08-26)

o Fix the hAlign and vAlign arguments in createCellStyle. The internal method call was lacking a cast to jshort. Reported by Douglas Rivers.

o Fix getCellValue when you have formulas with String values (it assumed Numeric values and errored out for other types). Strings and Booleans are now supported; other cell types still raise an error, as no sound solution is known. Reported by ravi(?) [email protected].

Changes in version 0.2.2 (released 2010-07-14)

o Added a colIndex argument to read.xlsx to facilitate reading only specific columns.

o setCellValue now tests for NA, and if value is NA, it will fill the cell with #N/A.

o Fixed bug in getCellValue. It now returns NA for all the error codes in the cell. It used to return a numeric code which was confusing to the R user. Reported by Ralf Tautenhahn.
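
A small sketch of the colIndex argument described above; the data and the selected columns are arbitrary choices for this example.

```r
library(xlsx)

path <- tempfile(fileext = ".xlsx")
write.xlsx(head(iris, 10), file = path, row.names = FALSE)

# Read only the first and fifth columns of the sheet.
part <- read.xlsx(path, sheetIndex = 1, colIndex = c(1, 5))
```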

Changes in version 0.2.1 (released 2010-05-15)

o Fixed bug with write.xlsx. It did not write column names even if col.names=TRUE. Reported by Ralf Tautenhahn.

o Added an ... arg to read.xlsx that is passed to the data.frame constructor, for example to control the stringsAsFactors option.

Changes in version 0.2.0 (released 2010-05-01)

o Switched to POI 3.6. This resulted in significant memory improvements, but reading or writing large xlsx files can still run into memory issues.

o Added addPicture function for embedding pictures into xlsx files.

o Added removeRow function for conveniently removing existing rows from the spreadsheet.

o Added/Fixed comments for cells. See ?Comment

o Fixed bug in read.xlsx for the case when the file contains only one column, a corner case where drop=TRUE wreaked havoc (issue reported by Hans Petersen).

o Fixed bug in createRow. If rowIndex did not start at 1, it created spurious NULL entries.

Changes in version 0.1.3 (released 2010-03-15)

o Added indCol argument to getCells in case you want to get only a subset of columns.

o Added function getMatrixValues to extract blocks of data from the sheet.

o Improved and expanded the unit tests.

o On Mac, you cannot set colors directly using createCellStyle. You can still do it manually; please see the javadocs.

Changes in version 0.1.2 (released 2010-01-02)

o Fixed getRows, getCells so it does not error out for empty rows/cells. Modified read.xlsx too.

o Added append argument to write.xlsx to be able to export to multiple worksheets of a file. (Suggestion by [email protected].)
