Tools for Splitting, Applying and Combining Data

A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.


plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to:

  • fit the same model to each patient subset of a data frame
  • quickly calculate summary statistics for each group
  • perform group-wise transformations like scaling or standardising
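As a minimal sketch of the last two use cases, the example below uses plyr's ddply() with the summarise and transform helpers. The data frame df and its columns (group, value) are made up for illustration and do not come from the package documentation.

```r
library(plyr)

# Hypothetical toy data: two groups with three measurements each
df <- data.frame(
  group = rep(c("a", "b"), each = 3),
  value = c(1, 2, 3, 10, 20, 30)
)

# Split by group, compute summary statistics for each piece,
# and combine the results into one data frame (one row per group)
stats <- ddply(df, "group", summarise,
               avg = mean(value),
               n = length(value))

# Group-wise transformation: centre each value on its own group mean
# (one row per original observation, with a new column added)
centred <- ddply(df, "group", transform,
                 centred = value - mean(value))
```

Here summarise collapses each group to a single row, while transform keeps every row and adds a column, so the same split-apply-combine call covers both patterns.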

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

  • totally consistent names, arguments and outputs
  • convenient parallelisation through the foreach package
  • input from and output to data.frames, matrices and lists
  • progress bars to keep track of long running operations
  • built-in error recovery, and informative error messages
  • labels that are maintained across all transformations
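A short sketch of how some of these features look in practice. In plyr's naming scheme the first letter of a function gives the input type and the second the output type (a = array, d = data frame, l = list), so llply() maps a list to a list and ldply() maps a list to a data frame. The input list x and the helper safe_log are hypothetical names chosen for this example.

```r
library(plyr)

# Hypothetical input: a named list of numeric vectors
x <- list(a = 1:3, b = 4:6)

# list in, list out; names are carried over to the result
means_list <- llply(x, mean)

# list in, data frame out; the list names become a label column (.id)
means_df <- ldply(x, function(v) data.frame(mean = mean(v)))

# Progress bar for long-running operations (prints to the console)
slow <- llply(1:5, function(i) i^2, .progress = "text")

# Error recovery: failwith() wraps a function so that it returns a
# default value instead of stopping when the function throws an error
safe_log <- failwith(NA, log, quiet = TRUE)
logs <- llply(list(1, "not a number"), safe_log)
```

Parallel execution is requested with the .parallel = TRUE argument, which needs a foreach backend to be registered first; it is left out here to keep the sketch self-contained.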

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.

A detailed introduction to plyr has been published in JSS: "The Split-Apply-Combine Strategy for Data Analysis", http://www.jstatsoft.org/v40/i01/. You can find out more at http://had.co.nz/plyr/, or track development at http://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr.



install.packages("plyr")

1.8.4 by Hadley Wickham, a year ago


http://had.co.nz/plyr, https://github.com/hadley/plyr


Report a bug at https://github.com/hadley/plyr/issues


Browse source code at https://github.com/cran/plyr


Authors: Hadley Wickham [aut, cre]




License: MIT + file LICENSE


Imports Rcpp

Suggests abind, testthat, tcltk, foreach, doParallel, itertools, iterators, covr

Linking to Rcpp


Imported by ACDm, AFM, ALA4R, APSIM, ARTool, AntWeb, AppliedPredictiveModeling, BEACH, BEQI2, BNPMIXcluster, BTSPAS, BacArena, BatchExperiments, BayesFM, Biograph, CDM, CONDOP, CTM, ChemoSpec, CopulaDTA, CrossScreening, Deducer, DescribeDisplay, Epi, FAOSTAT, FRK, FSA, FuzzyR, GDELTtools, GERGM, GFD, GGally, GSIF, GUIgems, GrpString, HLMdiag, HTSSIP, HighDimOut, HiveR, HydeNet, IGM.MEA, IMFData, IRATER, ITGM, Information, IntClust, InterfaceqPCR, InterpretMSSpectrum, IsingSampler, JWileymisc, Kernelheaping, LakeMetabolizer, LendingClub, MAGNAMWAR, MANOVA.RM, MCPAN, MFPCA, MODIStsp, MRMR, MVN, MatrixLDA, MetaComp, MetaboQC, MiRAnorm, Momocs, MplusAutomation, NFP, OpasnetUtils, OpenRepGrid, OutbreakTools, PAFit, PKNCA, PTXQC, PhenotypeSimulator, PhylogeneticEM, Plasmidprofiler, PopGenReport, PredPsych, RAM, RCriteo, REndo, RForcecom, RGA, RNeXML, RNewsflow, RSA, RSPS, RSentiment, RSiteCatalyst, RSocrata, RbioRXN, RchivalTag, RcmdrPlugin.KMggplot2, RefManageR, RevEcoR, RmarineHeatWaves, SEERaBomb, SNSequate, SensMixed, SensusR, SeqFeatR, SimDesign, Smisc, SocialMediaLab, Stack, StatRank, TAM, TDPanalysis, TR8, TSMining, TripleR, UpSetR, WCE, WRS2, Wats, WikiSocio, XML2R, aLFQ, abbyyR, acc, adapr, anchoredDistr, anomalyDetection, antaresRead, aop, aqp, asVPC, aslib, aws.alexa, bayesPop, bayesboot, bdvis, bea.R, benchmark, betalink, bib2df, bigml, blkbox, bold, bpa, brainGraph, breakfast, broom, bulletr, burnr, caret, caretEnsemble, classifly, classify, clhs, clickstream, climwin, clusterfly, clusternomics, clustrd, coefplot, comf, confidence, contoureR, countyfloods, cowplot, crtests, ctsem, d3Network, d3Tree, dataone, dcmr, ddpcr, deBInfer, demi, discreteRV, dotwhisker, dplR, drLumi, drake, dropR, dtwSat, dtwclust, dynr, dynsurv, ecoengine, edpclient, eiCompare, elhmc, emdbook, enigma, episensr, erp.easy, esaddle, europepmc, evoper, exifr, expandFunctions, exprso, extracat, eyelinker, ez, ezsim, fSRM, fbRads, fdq, fecR, finch, flippant, forestinventory, freqweights, 
funcy, gProfileR, gcbd, gems, gemtc, geospt, gfcanalysis, ggQC, ggedit, ggenealogy, ggiraph, ggiraphExtra, gglogo, ggloop, ggmap, ggparallel, ggplot2, ggpmisc, ggraph, ggspatial, ggstance, ggtern, gmediation, granovaGG, gridsampler, groupdata2, gsDesign, harvestr, heemod, hyfo, inctools, inegiR, intsvy, jocre, kamila, kimisc, kobe, kutils, lazysql, learningr, lfda, lfl, lfstat, lifelogr, likert, linear.tools, llama, lllcrc, lmerTest, lmeresampler, loopr, lpbrim, lsbclust, lsmeans, machQA, mafs, marcher, matchMulti, medicalrisk, meifly, meltt, metScanR, metafolio, metagen, metaviz, meteo, mixOmics, mizer, mlVAR, mplot, mpoly, msltrend, mtconnectR, mudata, multilevelPSA, mvdalab, nasadata, nat, nat.nblast, neotoma, net.security, networkTomography, networkreporting, nhanesA, nima, npIntFactRep, npsm, nullabor, oai, okmesonet, openair, opentraj, optiRum, optiSel, pROC, paco, paleobioDB, paleofire, parboost, patPRO, pathological, pcrsim, pdp, pems.utils, peptider, phenopix, photobiology, pirate, pitchRx, planar, platetools, plotKML, plotROC, plotluck, plusser, pogit, pointRes, poliscidata, powerbydesign, pqantimalarials, predictmeans, prettymapr, primerTree, processcontrol, productplots, profr, proteomics, prozor, psytabs, ptycho, pxweb, qgraph, rLakeAnalyzer, rLiDAR, rSPACE, rWBclimate, rYoutheria, radmixture, rapportools, raptr, rbhl, rbison, rchess, rcompanion, rcrossref, rcure, readbulk, repijson, repmis, reshape, reshape2, rfigshare, ridigbio, rinat, rnpn, rnrfa, roadoi, robustvarComp, rodham, rollply, rosm, rplos, rprime, rscopus, rsdmx, rslp, rsnps, rsunlight, rtematres, rtide, rusda, rwty, s2dverification, satellite, scales, scanstatistics, segmag, semPlot, sequoia, sharpshootR, simPop, simTool, simr, sirt, skm, snht, soilDB, solarius, solr, solrium, spant, spduration, spef, spiders, splithalf, spongecake, ss3sim, statcheck, stationaRy, stormwindmodel, strvalidator, superheat, surveydata, synthpop, tRophicPosition, taRifx, taxize, toaster, translateSPSS2R, 
treeclim, treecm, treeman, tuber, tweet2r, unmarked, uptimeRobot, useful, userfriendlyscience, ustyc, uwIntroStats, vardpoor, vdmR, vetools, virustotal, vortexR, wTO, weatherData, worldmet, wppExplorer, ykmeans, zebu, zoocat, zoon.

Depended on by BSGS, CPMCGLM, ClustGeo, DataLoader, DoTC, EBMAforecast, EIAdata, EurosarcBayes, Fgmutils, HRM, JAGUAR, MScombine, NAPPA, ONETr, PMA, PdPDB, RDSTK, RGBM, RSAGA, RStorm, Rmisc, SSrat, StagedChoiceSplineMix, StratifiedBalancing, abctools, acs, bcpa, bmk, boottol, coprimary, cshapes, dexter, evolqg, fishmove, gpmap, iNOTE, imager, ivlewbel, oec, plotSEMM, plotprotein, pxR, rcbalance, rcbsubset, rcqp, remix, reshapeGUI, rtip, sclero, selfea, sinaplot, smatr, surveybootstrap, timeordered, unitedR, weightTAPSPACK, wordmatch, worms, zTree.

Suggested by ARPobservation, HistData, Kmisc, LSAmitR, MGLM, NNTbiomarker, NPC, NlsyLinks, ParamHelpers, ProjectTemplate, QFASA, TH.data, TimeProjection, TropFishR, abd, afex, bayesGDS, cda, cpca, darch, data.table, dendextend, dostats, dtree, eechidna, eeptools, flacco, gcookbook, ggswissmaps, ggthemes, heuristica, hillmakeR, homeR, hydrostats, ifaTools, installr, jsonlite, knitrBootstrap, latex2exp, lulcc, mcglm, milr, mvnfast, mvtboost, nLTT, patternator, pedometrics, plumbr, pomp, psd, ptstem, rangemodelR, rattle, rbefdata, rsvd, scdhlm, shadow, soil.spec, sparseMVN, textreg, traits, trapezoid, vcdExtra, vkR, wingui.

