The R Analytic Tool To Learn Easily (Rattle) provides a collection of utility functions for the data scientist. A Gnome (RGtk2) based graphical interface is included to provide a simple and intuitive introduction to R for data science, allowing a user to quickly load data from a CSV file (or via ODBC), transform and explore the data, build and evaluate models, and export models as PMML (Predictive Model Markup Language) or as scores. A key aspect of the GUI is that all R commands are logged and commented through the Log tab. The log can be saved as a standalone R script file, serving as an aid for the user to learn R or as code to copy and paste directly into R itself.
rattle 5.2.0 2018-08-12 15:17:12 [email protected]
Remove dependency on RGtk2 and check it dynamically. Rattle has more functionality than just the GUI yet we force installation of RGtk2 which is problematic on some platforms.
Return the datasets to rattle package. Has caused too much confusion as a separate package.
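The dynamic check described for 5.2.0 can be sketched roughly as follows. This is a minimal illustration using base R's requireNamespace(); the helper name pkg_available() is hypothetical and is not claimed to be Rattle's own code.

```r
# Hypothetical helper: report whether an optional package can be loaded,
# without making it a hard dependency of the calling package.
pkg_available <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Only start GUI-specific code when RGtk2 is actually installed.
if (pkg_available("RGtk2")) {
  message("RGtk2 found: the GUI can be started.")
} else {
  message("RGtk2 not found: non-GUI functions remain usable.")
}
```

With this pattern the package can be installed on platforms where RGtk2 is hard to obtain, and only the GUI entry point needs the check.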
rattle 5.1.6 2018-08-12 15:17:12 [email protected]
Bug fix for new rpart.plot with roundint= handled automatically.
Reduce width of bars in ggVarImp() plot.
rattle 5.1.5 2018-07-01 17:31:22 [email protected]
Remove deprecated connect-r logo. Reported by Bob Muenchen.
Correct and update Help menus. Reported by Bob Muenchen.
Remove Report button until updated to newer functionality. Reported by Bob Muenchen.
rattle 5.1.4 2018-05-22 07:05:18 [email protected]
rattle 5.1.3 2017-10-29 21:25:08 [email protected]
rattle 5.1.1 2017-09-08 16:08:03 [email protected]
rattle 5.1.0 2017-09-04 08:20:34 [email protected]
rattle 5.0.19 2017-07-10 15:14:34 [email protected]
rattle 5.0.18 2017-06-27 06:54:48 [email protected]
rattle 5.0.17 2017-06-24 10:58:15 [email protected]
Move the dataset from the rattle package to a separate rattle.data package in line with CRAN guidelines to have a separate package for slower changing datasets of considerable size. This will also allow the option to provide further datasets for rattle as part of that package.
Use weather.csv as the sample for both R and Microsoft R as weatherAUS.csv is too large to include in a CRAN package.
Ensure strings are treated as categoricals on loading the data with Microsoft R so as to conform to read.csv() and to be consistent with the non-Microsoft R version of Rattle.
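Treating strings as categoricals after loading can be sketched in base R as below. The data frame and column names are made up for illustration, and this is an assumption about the general approach rather than the code Rattle generates.

```r
# Toy data frame with a character column, as produced by a reader that
# keeps strings as strings.
ds <- data.frame(id = 1:3,
                 wind = c("N", "SE", "N"),
                 stringsAsFactors = FALSE)

# Convert every character column to a factor (categoric).
is_chr <- vapply(ds, is.character, logical(1))
ds[is_chr] <- lapply(ds[is_chr], factor)

str(ds)
```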
rattle 5.0.16 2017-06-17 08:18:26 [email protected]
rattle 5.0.15 2017-06-17 07:48:30 [email protected]
rattle 5.0.14 2017-06-12 13:12:00 [email protected]
rattle 5.0.13 2017-06-05 16:51:18 [email protected]
rattle 5.0.12 2017-05-30 13:52:51 [email protected]
Bug fix call to errorMatrix() where counts= is not count=.
Bug fix to evaluate where respcmd for random forest has disappeared when incorporating MRS updates.
rattle 5.0.11 2017-05-26 17:43:01 [email protected]
rattle 5.0.10 2017-04-30 13:57:16 [email protected]
rattle 5.0.9 2017-04-14 13:17:33 [email protected]
rattle 5.0.8 [email protected]
rattle 5.0.7 2017-03-05 18:13:22 [email protected]
rattle 5.0.6 2017-02-25 09:51:55 Graham Williams
rattle 5.0.5 [email protected] 2017-02-15 07:22:43 Graham Williams
Update weatherAUS dataset.
Bug fix for sample XDF dataset - if smaller than crv$xdf_preview then load the whole dataset into memory.
ggVarImp now has n= option for the top n variables. Also supports xgb.Booster models from xgboost.
rattle 5.0.4 [email protected] 2017-02-04 15:15:34 Graham Williams
Bug fix ggVarImp to work for randomForest() when importance=FALSE.
Add log= option to ggVarImp() for a log scale.
Add pc (percentages) and digits to errorMatrix().
rattle 5.0.3 [email protected] 2017-02-02 15:18:28 Graham Williams
rattle 5.0.2 [email protected] 2016-10-02 15:06:52
Implement generic ggVarImp() to plot variable importance for different models.
Implement errorMatrix() as a replacement for generating code to do this pcme() during a rattle run.
Update the weather AUS dataset from the Australian Bureau of Meteorology.
Add a subtitle to riskchart().
rattle 5.0.1 [email protected]
Begin exposing :: prefix in the log tab. It's educational and self documenting.
Support Explore -> Distribution -> Group By to include the numeric target variable (usually only categorics are listed) if it has 10 or fewer levels. Suggested by Eugene Dubossarsky.
Additional XDF support: rxDForest.
rattle 5.0.0 [email protected]
rattle 4.2.0 [email protected] 2016-07-22 06:19:15
* Include dplyr as an Import. * Add support for Eugene Dubossarsky's ggraptr. * Clean up and perfect executeModelRF and Log code.
rattle 4.1.8 [email protected] 2016-06-24 20:36:51
* Add transparency to ggpairs plot. Reported by Eugene Dubossarsky.
rattle 4.1.7 [email protected] 2016-06-21 21:02:13
* Bug fix for Benfords when the target is numeric. An empty Group By will use the target variable to stratify. Reported by Eugene Dubossarsky. * Spelling fixes provided by George Wilson.
rattle 4.1.6 [email protected] 2016-06-21 20:34:49
* Bug fix for new version of GGally - to get target colours. Reported by Eugene Dubossarsky.
rattle 4.1.3 [email protected] 2016-05-12 10:22:01
* Update copyright to 2016. * Add stringr dependency. * Fix missing comment character in log tab.
rattle 4.1.3 [email protected] 2016-03-13 15:07:07
* Add type= to fancyRpartPlot(). Requested by Michelle Gosse.
rattle 4.1.2 [email protected] 2016-03-13 06:24:21
* Bug fix for missing GUI code for export_filechooserdialog. Reported by Bill Burns.
rattle 4.1.1 [email protected] 2016-01-26 19:50:07
* Bug fix for a single input variable in the dataset when scoring. Reported by Szabo Szilard.
rattle 4.1.0 [email protected] 2016-01-26 11:12:01
* Bug fix calculation of confusion matrices when either actual or predicted values have missing values. Reported by Roger Bohm. * Make the rescale.by.group transform more robust by ensuring the by argument is a factor, converting as needed. Reported by Tony Nolan. * Bug fix for plots when there is no target in the dataset. Reported by Albert Lee. * Bug fix in calculation of the overall error rate in the confusion matrix. Show overall error as percentage not proportion. Reported by Eugene Dubossarsky. * Remove grid from ggpairs plot and fine tune for presentation.
rattle 4.0.0 [email protected] 2015-09-21 06:00:49
* Migrate hosting of the package to Bitbucket: https://bitbucket.org/kayontoga/rattle. * Use Connect-R logo as the icon for the button.
rattle 3.5.11 [email protected] 2015-09-16 19:22:02
* Add button to toolbar to open a Connect-R page for feature requests. * Bug fix confusion matrix Error calculation and average error calculation. Reported by Eugene Dubossarsky. * Only default to TIME* variable as target if Survival model is chosen.
rattle 3.5.10 [email protected] 2015-09-16 19:22:02
* Explore tab's Distribution option now allows the user to choose how to group the data for plotting, with the Target as the default but a choice of any Categoric variable available, or none. * Bug fix when scoring a clustering with no identifier nor target. Reported by Abhishek Sharma.
rattle 3.5.9 [email protected] 2015-09-16 05:53:00
* Incorporate pairs plots into Distributions option of the Explore tab. Contributed by Jose A Magaña.
rattle 3.5.8 [email protected] 2015-08-28 10:21:59
* Migrate histogram plots to using pipes and generally clean up the code. * Introduce appendLibLog to handle namespaces in the Log tab. Namespace prefix is removed and replaced by a library() call as a user would normally do. * Migrate Box Plots to using pipes and place multiple box plots or histograms onto a single grid.
rattle 3.5.7 [email protected] 2015-08-21 19:17:56
* Move to using clusplot from cluster rather than plotcluster from fpc to obtain ellipses to show the clusters.
rattle 3.5.6 [email protected] 2015-08-20 21:30:29
* Gracefully handle no network connection in rattleInfo().
rattle 3.5.5 [email protected] 2015-08-17 19:29:41
* Bug fix for traditional graphics and ROCR suite of plots under evaluate tab - need to use namespace to get correct version of plot().
rattle 3.5.4 [email protected] 2015-07-26 12:07:02
* Add palettes= to allow limited changing of colours in fancyRpartPlot(). * Bug fix for fancyRpartPlot() where rule conditions were being replaced with coloured blocks.
rattle 3.5.3 [email protected]
* Add a test to riskchart() to check if there are more than two classes.
rattle 3.5.2 [email protected]
* Extend Error Matrix calculations in Evaluate to support multinomial targets as well as binomial targets.
rattle 3.5.1 [email protected]
* Bug fix in calculation of overall and average class errors. Thanks to Eugene Dubossarsky.
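The overall and average class errors discussed in these entries can be illustrated with a small base R sketch. It assumes rows hold the actual classes and columns the predicted classes, and it drops observations where either value is missing; the data are made up, and this is not Rattle's errorMatrix() implementation.

```r
# Toy actual and predicted classes; one prediction is missing.
actual    <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
predicted <- factor(c("a", "a", "b", "b", "b", "c", "a", NA),
                    levels = levels(actual))

# Drop observations where either value is missing.
ok <- !is.na(actual) & !is.na(predicted)

# Rows are actual classes, columns are predicted classes.
em <- table(Actual = actual[ok], Predicted = predicted[ok])

class_error   <- 1 - diag(em) / rowSums(em)    # error within each actual class
overall_error <- 1 - sum(diag(em)) / sum(em)   # misclassified / all observations
average_error <- mean(class_error)             # unweighted mean of class errors

# Report as percentages rather than proportions.
round(100 * c(overall = overall_error, average = average_error), 1)
```

The two figures differ whenever the classes are unbalanced: the overall error weights each observation equally, while the average class error weights each class equally.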
rattle 3.5.0 [email protected]
* Replace xlsx::read.xlsx() with readxl::read_excel() to remove reliance on Java which has always been problematic in terms of Windows users having trouble installing Java. Thanks to Ed Stoker for testing. (3.4.3) * When iterating over kmeans clusters now plot from 1 cluster rather than 3. Thanks to Eugene Dubossarsky. (3.4.4) * Updates to normVarNames() due to Hadley's changes to stringr. Also capture other characters to map. * Add title.size argument to riskchart(). Also support horizontal legend. Fix the text glob for the Lift label. * Revert to using only exported functions from pkgDepTools. (3.4.1) * Fix some tooltip and textview typos suggested by Kees Schippers. (3.4.2) * move from weightedKmeans to wskm. * Numerous updates to support new CRAN checks, particularly related to use of name spaces and requiring to make rattle depend on RGtk2. * weatherAUS dataset is updated. * Update rattleInfo() to be more efficient by doing dependency graph myself.
rattle 3.4.0 [email protected] 2014-12-29 19:11:59 +11:00
* Revert traditional ROC eval plot to overlay all models on the one plot. Eugene Dubossarsky * Bug fix to fancyRpartPlot() from John Vorwald when model$frame$yval all negative. * Replace comma in normVarNames(). * Remove latticist - no longer available on CRAN.
rattle 3.3.0 [email protected] 2014-09-09 18:25:21 +1100
* Migrate to using namespace for external functions.
rattle 3.2.0 [email protected] 2014-09-04 06:14:03 +1100
* Execute button when clicked from the Log Tab will execute all of the code in the Log tab. (Suggested by Scott MacLean, 24 July 2014.) * Add the average error rate to the evaluations, as proposed on http://www.connect-r.com/. * Numerous ggplot2 updates and bug fixes. * MS-Windows support for xlsx files bug fixed. Allow sub= option in fancyRpartPlot.
* Numerous updates of plots to use ggplot2 rather than base graphics: ROC curves, riskchart, box plots, histogram plots, pairs plot, Benfords. Advanced Graphics is now the default, reverting to traditional graphics where needed. The migration to ggplot2 is ongoing. * Added new Benfords functionality. * Added a rescale option to kmeans. * New psfchart() for evaluation. * New function normVarNames() to normalise variable names to a standard preferred style. * Evaluate -> Error Matrix has been updated to report averaged class error and to report class errors. * Evaluate -> PrvOb plot bug fix for non-missing data. * INSTALL: Remove old INSTALL file - visit rattle.togaware.com for installation instructions. * plotNetwork() has been removed - not used by Rattle and generally of limited use. See onepager.togaware.com for the code. * No longer report repository revision number in version or about. * Miscellaneous bug fixes and stability improvements. * weatherAUS dataset is up-to-date.
-- Graham Williams [email protected] 2014-07-18 14:32:07 +1100
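As an illustration of what normalising variable names to one style can look like, here is a small base R sketch. The function norm_names() and its rules (lower case, underscores between words) are assumptions made for this example; Rattle's normVarNames() may apply different rules.

```r
# Hypothetical helper, not Rattle's normVarNames(): lower-case names with
# underscores between words.
norm_names <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase words
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # punctuation and spaces become _
  x <- gsub("^_+|_+$", "", x)                   # trim stray underscores
  tolower(x)
}

norm_names(c("MinTemp", "Wind.Gust Dir", "rain_today"))
```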
rattle (2.6.26) unstable; urgency=low
Replace .path.package with path.package as requested by Ripley. The hidden version will disappear soon and the new version has been available since 2.13.0.
Update boost help to note that it is available only for binary classification.
Default stemming for text mining of a corpus is now active if the Snowball package is available.
For Advanced Graphics introduce a dendrogram plot using ggplot2.
Various text mining improvements. Bug fix in checking if data needs reloading. Support checking if corpus needs reloading. Add extra cursor and status bar messages. For corpus, set default folder to be getwd(). Check for mismatch between number of docs in corpus and the number of targets in .targets.csv. For the Corpus file dialog, do not offer folder creation.
Remove macosx special rattle.ui. The ubuntu specific text no longer appears in the saved ui file.
Internally: Move rattleGUI to crv from crs. The crs is saved as the state, and this was confusing the GUI on a project restore. Had to ensure we restored rattleGUI with the current rattleGUI - this fixes loadProject bug. Also, in Load project, filter on .RData not .Rdata.
Add newdata= to call to predict, in line with the standard approach by party (reference Torston). Remove the OOB= for predict for cforest. With a new dataset OOB makes no sense. It was in there because newdata= was not being used and positionally having issues.
Update fancy rpart plot to reduce colour intensity for printing and a nicer tree structure. Add all class probs to fancy tree.
Define paste0 if it is not defined. It was introduced in 2.15.0 but is too early to assume the world is with us.
Replace siatclust with weightedKmeans.
Bug fix in OOB plot when impute is off - need to omit missing values. Update message regarding random forest and na.omit() removing all rows, noting the option to use na.roughfix().
Fix bug identified by Brian Feeny 121209 - score a RF test dataset without a target variable tries to add one in all NA but fails if it is the last variable.
Experimentally add Deducer's data.viewer to View data. Ensure we ask the user, when using Plot Builder, whether it is okay to create a dataset in their workspace. Hopefully this keeps us in line, if not strictly in compliance, with CRAN policy.
Remove SVG support - RSvgDevice is no longer available.
-- Graham Williams [email protected] Sat, 16 Mar 2013 13:27:05 +1100
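The paste0 fallback mentioned in this entry can be sketched as below. This is one plausible way to write such a guard, not necessarily the definition Rattle shipped.

```r
# Define paste0() only when the running R does not already provide it
# (it was added to base R in 2.15.0).
if (!exists("paste0", mode = "function")) {
  paste0 <- function(..., collapse = NULL) {
    paste(..., sep = "", collapse = collapse)
  }
}

paste0("rattle", "_", 2)
```

On any current R the guard is a no-op and the base function is used.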
rattle (2.6.25) unstable; urgency=low
Review all of the code and remove two instances of using copyrighted code without attribution. One was a copy of print.rpart, where rattle added a translation wrapper to the text message. Another was code copied from the Internet from David Hand - use the Hmeasure package now. Note in drawTreeNode() reference to the original author and lack of copyright. Note author in [email protected] ggcorplot is now available from Deducer. Remove it from Rattle. Replace Hand measure with HMeasure from hmeasure. Add Mark Vere Culp as aux author. Remove commented out code. Remove lss and cranSearch - not really part of Rattle.
Update to new style [email protected]
-- Graham Williams [email protected] Sat, 23 Jan 2013 13:12:43 +1100
rattle (2.6.24) unstable; urgency=low
Bug fix for box plot using ggplot2.
Finish the implementation of riskchart using ggplot2 to mimic the old version of risk charts.
Remove copied code from print.rpart, known as rattle.print.rpart, and originally used without proper credit to Brian Ripley, but no longer required. Use his original version from rpart itself, though this loses the translations.
Migrate to a cleaner structure for managing the source package locally at togaware.
Bug fix fancy rpart plot to handle regression as suggested by Yana Kane-Esrig.
For arules, add option to specify minimum length.
Update to new version of RGtk2Extras' dfedit, without a pretty_print option. Also able to assign result into crs$dataset now.
Remove two instances of global variable assignments. Temporarily remove PlotBuilder and scoring of manually entered datasets.
-- Graham Williams [email protected] Tue, 11 Dec 2012 06:45:50 +1100
rattle (2.6.21) unstable; urgency=low
Retain depend on R > 2.12.1.
Ensure rattle.togaware.com repo is maintained.
Better detect arules error message for duplicate items in a basket.
Update ggplot2 calls to conform to 0.92. Also turn advanced graphics on by default. Implement risk charts using ggplot2.
Start introducing suppressPackageStartupMessages to avoid excessive messages in the console.
Do AUC only for binomial targets.
-- Graham Williams [email protected] Mon, 10 Sep 2012 19:27:42 +1000
rattle (2.6.20) unstable; urgency=low
Because of use of globalVariables Rattle now depends on R >= 2.15.1. However, check this conditionally to retain backward compatibility for now. Reported by Uwe Ligges.
For show arules, eval in global environment else it does not show the rules. Reported by Tania Churchill.
-- Graham Williams [email protected] Mon, 23 Jul 2012 02:27:18 +1000
rattle (2.6.19) unstable; urgency=low
Depend on weightedKmeans rather than siatclust.
Bug fix: correlation plots stopped working.
Bug fix: ggcorplot use of size_scale started failing. Perhaps because of new version of ggplot2.
Bug fix: notice when a restored project does not have a filename set.
Fix some logic errors in rf.
Add 0,0 point to evaluateRisk.
Make risk, recall, precision as default names in risk chart.
Add new riskchart function using ggplot2.
Allow additional arguments to fancyRpartPlot passed through to prp.
Update copyright to 2012.
Allow y for yes in installing initial RGtk2.
List global variables to avoid check messages.
-- Graham Williams [email protected] Wed, 04 Jul 2012 22:15:27 +1000
rattle (2.6.18) unstable; urgency=low
Ensure require uses quietly rather than quiet.
Clean up randomForest textview output.
Update pmml to 4.0. Fix various format issues and other updates from Tridi of Zementis.
Update setupDataset but also note that it is moving into a separate package, container.
Get odfweave stuff working again.
Update fancyRPartPlot - being used in SIAT software. Can now handle any number of classes.
Updates to the pmml rsf code.
Bug fix for evaluation of conditional trees and random forests.
Further pmml export of randomForest updates.
Add PlotBuilder as interactive explore option.
Export pmml for glm models.
Enhance ggplot2 plotting of boxplot.
-- Graham Williams [email protected] Sun, 22 Apr 2012 21:47:00 +1000
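For reference, the full argument name in base R's require() is quietly. A minimal sketch of loading a package without startup chatter, using the stats package only because it is always installed:

```r
# 'quietly' is the documented argument name of require().
ok <- suppressPackageStartupMessages(
  require("stats", quietly = TRUE, character.only = TRUE)
)
ok
```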
rattle (2.6.17) unstable; urgency=low
Add a log10 transform to the GUI, R10 prefix, add tooltip, handle it in pmml, create new rattle_macosx.ui. Suggested by Christophe Klopp.
Bug fix usage of believeNRows - it was being ignored from the GUI, but is now acted upon. Reported by Andrew Elliott.
Add ggplot2 box plots to Advanced Graphics option.
Remove the timestamp messages.
Update pmml to handle randomForest and rattle to export to pmml.
Bug fix in naming the dataset when it is edited.
Bug fix for ggcorplot when less than 6 vars - need to map var names into a c() call.
-- Graham Williams [email protected] Sun, 19 Feb 2012 21:49:45 +1100
rattle (2.6.16) unstable; urgency=low
rattleInfo() now also notes if rattle itself needs upgrading.
Bug fix in show association rules. It now works again.
Forgot to include rescale.by.group() in NAMESPACE.
CITATION to the book rather than the article. That is a more definitive resource, though not freely available.
-- Graham Williams [email protected] Sat, 24 Dec 2011 15:35:21 +1100
rattle (2.6.15) unstable; urgency=low
-- Graham Williams [email protected] Sat, 03 Dec 2011 22:49:18 +1100
rattle (2.6.14) unstable; urgency=low
Add OOB ROC button to Forest option of Model tab as suggested by Akbar Waljee.
Bug fix for loading R Dataset data frame named dataset. Bug reported by George Dontas.
Use roc.plot() from evaluation. Suggested by Akbar Waljee.
Ensure oob roc plot handles numeric targets.
-- Graham Williams [email protected] Wed, 16 Nov 2011 06:01:17 +1100
rattle (2.6.13) unstable; urgency=low
-- Graham Williams [email protected] Tue, 25 Oct 2011 21:34:13 +1100
rattle (2.6.12) unstable; urgency=low
Ensure the data partitions that are specified are appropriate. Also allow some flexibility in specifying: 70 or 70/30 or 70/15/15. For the first two the training is 70% and testing is 30%. For the third, validation is 15% and testing is 15%.
Update text mining support for latest version of tm.
rattleInfo() was incorrectly counting the number of packages listed.
-- Graham Williams [email protected] Sun, 23 Oct 2011 06:00:16 +1100
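The partition specification described in this entry (70, 70/30 or 70/15/15) could be interpreted along the following lines. The function parse_partition() is hypothetical and only mirrors the rules stated above; it is not Rattle's own parser.

```r
# Hypothetical parser for a partition specification such as
# "70", "70/30" or "70/15/15" (train/validate/test percentages).
parse_partition <- function(spec) {
  p <- as.numeric(strsplit(spec, "/", fixed = TRUE)[[1]])
  if (anyNA(p) || any(p < 0)) stop("partition must be non-negative numbers")
  if (length(p) == 1) p <- c(p, 100 - p)      # "70"    means 70/30
  if (length(p) == 2) p <- c(p[1], 0, p[2])   # "70/30" means train/test only
  if (length(p) != 3 || sum(p) != 100) stop("partition must sum to 100")
  setNames(p, c("train", "validate", "test"))
}

parse_partition("70")
parse_partition("70/15/15")
```

Rejecting specifications that do not sum to 100 corresponds to the "ensure the partitions are appropriate" part of the entry.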
rattle (2.6.11) unstable; urgency=low
Use listAdaVarsUsed in Rattle.
Use fancyRpartPlot in Rattle.
Note rattle.ui requires gtk > 2.16, not > 2.20. Otherwise fails to start on Mac OS/X.
-- Graham Williams [email protected] Wed, 05 Oct 2011 19:12:28 +1100
rattle (2.6.10) unstable; urgency=low
Add listAdaUsedVars support function.
Workaround CairoDevice issue on Windows by defaulting to not using it, as in the Settings menu.
Add common name and crv constant for ewkm.
fancyRpartPlot has optional main title as empty string.
biclust now reports a biclust built rather than reporting a kmeans built.
Add weights plots for ewkm from siatclust.
-- Graham Williams [email protected] Sun, 11 Sep 2011 17:08:18 +1000
rattle (2.6.9) unstable; urgency=low
AdaBoost now also reports which variables are used in the collection of trees built, and the number of trees in which a variable appears.
Add setupDataset and whichNumeric to support encapsulation of data mining objects.
Add a fancyRpartPlot so my fancy rpart tree is available outside of the rattle GUI.
Correct the textview information relating to confusion matrices.
Add doRiskChart to simplify using the risk charts.
-- Graham Williams [email protected] Sun, 04 Sep 2011 21:03:32 +1000
rattle (2.6.8) unstable; urgency=low
Ensure ggplot2 loaded before plot ctree.
Handle probability predictions for ctree and cforest in evaluation.
-- Graham Williams [email protected] Tue, 26 Jul 2011 22:03:47 +1000
rattle (2.6.7) unstable; urgency=low
Add support for the entropy weighted k-means subspace clustering algorithm from the ewkm package.
Ensure rattle can load with only the base package installed (so install.packages is prefixed with utils:::).
Migrate from using installed.packages() since it can be very slow on MS/Windows.
Add an experimental dataset option to the command line call to rattle.
Allow a bygroup to be used for any numeric transform.
Add a plot for association rules.
Display a ggplot2 scatterplot if advanced plots is enabled.
rattle:::executeExplorePlot made more friendly for calling from outside of Rattle.
Tidy up the rattleInfo manual page.
Master Makefile should respond with help if no target specified.
-- Graham Williams [email protected] Mon, 18 Jul 2011 06:53:47 +1000
rattle (2.6.6) unstable; urgency=low
Settings/Tooltips should be shown as TRUE.
Add Settings/GGPlot2 to enable enhanced graphics (generally using ggplot2) where they have been implemented.
Implement a ggplot2 pairs plot (scatterplot) as the plot to use when ggplot2 is enabled and under Explore/Distributions no variables are chosen to be displayed. Uses ggcorplot from Deducer.
Implement use of rpart.plot's prp() when ggplot2 is enabled.
-- Graham Williams [email protected] Sat, 09 Apr 2011 22:16:29 +1000
rattle (2.6.5) unstable; urgency=low
Add rattleReport() - report on current state of rattle modelling.
Restore the ByGroup option for now until it can be coded for the about transforms.
Deal with UTF-8 encoding of Japanese filenames in data and evaluate, using iconv.
Be sure to include http:// in web links, though on MS/Windows still not working: Could Not Show Link... No application is registered as handling this file
On loading a dataset, convert any character variables to be factors. Rattle does not handle character variables, so the translation seems appropriate.
Association rules status bar was referring to decision trees. Fixed. (Pointed out by Xiaobo Gu)
Fix an introduced bug in handling of categorics in numeric transforms.
Fix a bug where imputation for a categoric with class "ordered" and "factor" was treating it as a numeric (because "ordered" is not "factor").
Some Help menu items under Test were not loading the required package and thus were not displaying the help.
Only do crosstabs when we have categoric variables.
-- Graham Williams [email protected] Sun, 13 Mar 2011 16:46:20 +1100
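The ordered-factor bug fixed in this entry comes from how R reports classes. A short base R sketch of the pitfall and of more robust tests:

```r
x <- factor(c("low", "high", "low"), levels = c("low", "high"), ordered = TRUE)

class(x)                  # two classes: "ordered" and "factor"
class(x)[1] == "factor"   # FALSE: comparing the first class misses it
is.factor(x)              # TRUE: the robust test
inherits(x, "factor")     # TRUE as well
```

Code that decides between categoric and numeric handling is therefore safer using is.factor() or inherits() than comparing class() to a single string.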
rattle (2.6.4) unstable; urgency=low
Confusion matrices transposed to conform to what most people expect: Actual is on left and Predicted is on top. Retain the name as Error Matrix in Rattle for now.
Use different pch for a dotchart.
Include the install.packages(rattleInfo()) trick in the output of rattleInfo().
-- Graham Williams [email protected] Sat, 19 Feb 2011 06:26:09 +1100
rattle (2.6.3) unstable; urgency=low
weather.arff Date field should have 'date' data type.
The rug plot of histograms is no longer coloured. For large datasets, there is much overplotting and so it can in fact be quite misleading.
Box plots now use varwidth=TRUE to indicate the distribution of the target variable.
Bug fix: exportHClustTab should not have a file argument.
-- Graham Williams [email protected] Sun, 13 Feb 2011 21:42:11 +1100
rattle (2.6.2) unstable; urgency=low
Rename rattle.info() to rattleInfo(), modelled on sessionInfo() naming. Include available CRAN version of rattle in the output.
Ensure connection is closed on pmmltoc export from Rattle.
questionDialog needs to not use RGtk2 if RGtk2 is not installed!
Emphasise that Rattle is free in loading the rattle package.
exportKmeansTab does not require the file argument.
-- Graham Williams [email protected] Wed, 02 Feb 2011 05:46:28 +1100
rattle (2.6.1) unstable; urgency=low
When exporting a regression model, be sure to use proper slash (i.e., not the Windows slosh) for log tab record of the command.
Add rattle.ui to the google code repository.
Remove as many literals as possible from the Log tab - so that crs$dataset[crs$sample, c(2:10,14,16:20)] becomes crs$dataset[crs$sample, c(crs$input, crs$target)], for example. Similarly for the set.seed and other data storing variables.
Other Log tab cleanup.
Fix bug that caused failure on reading an .xls data file.
rattle.info() now returns the list of packages that need updating.
In exporting a model as C code, if we are Japanese on Windows then note that the encoding is shift-jis rather than utf-8 for some reason.
Improve infrastructure for the generation of C code from PMML.
-- Graham Williams [email protected] Thu, 13 Jan 2011 21:50:53 +1100
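The Log tab change in this entry, replacing literal column positions with stored variables, can be illustrated with a toy stand-in for the crs state. The dataset (iris) and the use of column names in crs$input and crs$target are assumptions for this sketch only.

```r
# Toy stand-in for Rattle's 'crs' state, built from a base R dataset.
crs <- new.env()
crs$dataset <- iris
set.seed(42)
crs$sample <- sample(nrow(crs$dataset), round(0.7 * nrow(crs$dataset)))
crs$input  <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
crs$target <- "Species"

# Literal column positions: hard to read and tied to one column order.
by_position <- crs$dataset[crs$sample, c(1:4, 5)]

# Stored variable names: self-describing and reusable in a saved script.
by_name <- crs$dataset[crs$sample, c(crs$input, crs$target)]

identical(by_position, by_name)
```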
rattle (2.6.0) unstable; urgency=low
Keep track of project names and use as default name to save a project to. Suggested by David Cochrane.
Add strip.white to the default for reading CSV files. Suggested by Robert Muenchen.
Bug fix on resetEvaluateTab - Data row was being reset to sensitive because model was being toggled.
Disconnect Rattle versions from google code revision numbers since the revision numbers change with each change to the Wiki.
Indicator Variables will Ignore the first of the new indicator variables. Suggested by Robert Muenchen.
Include the Target name in listing of a decision tree as a rule set.
On adding to the log when saving a plot make sure cairoDevice is loaded and the file name path separators are appropriate. Reported by Shane Butler 11 Dec 2010.
Ensure filename string is UTF-8 when exporting a file, to handle Japanese filenames.
For nnet, choose a seed so weather generates a non-trivial model.
Refer to remapping as recoding in line with commonly used terminology.
Default back to showing text on icon for buttons. Seems okay in the new version of Gtk.
-- Graham Williams [email protected] Sat, 11 Dec 2010 13:39:55 +1100
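Ignoring the first indicator variable, as noted in this entry, avoids a redundant column: with k levels, k - 1 indicators carry all the information. A base R sketch using model.matrix(), which is an assumption about how one might reproduce the idea and not Rattle's transform code:

```r
wind <- factor(c("N", "S", "E", "N"))

# One 0/1 column per level ...
all_ind <- model.matrix(~ wind - 1)

# ... then ignore the first, since it is implied when the rest are all 0.
ind <- all_ind[, -1, drop = FALSE]
colnames(ind)
```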
rattle (2.5.47) unstable; urgency=low
Add a useGtkBuilder argument to rattle(). If NULL, then heuristically determine, otherwise go with the specified choice, if possible.
Remove RGtk2, colorspace, and pmml as dependencies. Now dynamically check and offer to install. This also helps reduce chance of the XML/RGtk2 zlib1.dll bug, and also ensure RGtk2 loads before XML to avoid that bug.
-- Graham Williams [email protected] Mon, 15 Nov 2010 21:50:15 +1100
rattle (2.5.46) unstable; urgency=low
Bug fix for fixTranslations.
Save weights information in PMML.
Cleanup SVM command generator.
-- Graham Williams [email protected] Thu, 11 Nov 2010 19:08:36 +1100
rattle (2.5.45) unstable; urgency=low
Check for GtkBuilder handling of the 'requires' tag, and if not handled then don't use GtkBuilder.
Bump pmml version through 1.2.25 to 1.2.26.
Change default nolan groups for a singularity to 50 rather than 99.
PMML bug fix when glm and using weights.
Move all variable initialisation from .onLoad to .onAttach. This will ensure .RData saved (and therefore old) versions of the variables will not overwrite the proper versions in a newer release of Rattle.
-- Graham Williams [email protected] Sat, 09 Oct 2010 08:16:15 +1100
rattle (2.5.44) unstable; urgency=low
Add an include.libpath to rattle.info() to provide information about where the packages are installed.
Check for failed startup of the rattle GUI using GtkBuilder (because the installed Gtk library does not recognise 'requires') and suggest a workaround.
Conditionally turn toolbar Text (in addition to just Icons) on.
For loading spreadsheets, make sure RODBC is available and loaded.
Ensure 'ordered categoric' are treated as categoric for Explore, Distribution.
-- Graham Williams [email protected] Tue, 05 Oct 2010 18:08:20 +1100
rattle (2.5.43) unstable; urgency=low
Ensure gtkBuilder is setting the correct translation domain for the interface.
Add global option for not showing timestamps: crv$show.timestamp.
Add optional arg to newProject to not ask about overwriting a project. Default is as previously - to ask.
-- Graham Williams [email protected] Wed, 22 Sep 2010 05:37:53 +1000
rattle (2.5.42) unstable; urgency=low
Update rattle.info() to recursively identify all dependencies, report their version number and any updates available from CRAN and generate command to update packages that have updates available. See ?rattle.info for the options.
Fix bug causing R Dataset option of the Evaluate window to always revert to the first named dataset.
Fix bug in transforms where weights were not being handled in refreshing of the Data tab.
Fix a bug in box plots when trying to label outliers when there aren't any.
-- Graham Williams [email protected] Sun, 19 Sep 2010 05:01:51 +1000
rattle (2.5.41) unstable; urgency=low
Use GtkBuilder for Export dialog.
Test use of glade vs GtkBuilder on multiple platforms.
Rename rattle.info to rattle.version.
Add weight column to data tab.
Support weights for nnet, multinom, survival.
Add weights information to PMML as a PMML Extension.
Ensure GtkFrame is available as a data type whilst waiting for updated RGtk2.
Bug fix to packageIsAvailable not returning any result.
Replace destroy with withdraw for plot window as the former has started crashing R.
Improve Log formatting for various model build commands.
Be sure to include the car package for Anova for multinom models.
Release pmml 1.2.24: Bug fix glm binomial regression - note as classification model.
-- Graham Williams [email protected] Wed, 15 Sep 2010 14:56:09 +1000
rattle (2.5.40) unstable; urgency=low
-- Graham Williams [email protected] Sun, 22 Aug 2010 12:02:00 +1000
rattle (2.5.39) unstable; urgency=low
-- Graham Williams [email protected] Sat, 21 Aug 2010 07:47:43 +1000
rattle (2.5.38) unstable; urgency=low
Ensure pmml.ksvm will at least run - though resulting PMML not validated.
Bump pmml version to 1.2.23
-- Graham Williams [email protected] Fri, 06 Aug 2010 05:56:11 +1000
rattle (2.5.37) unstable; urgency=low
The Predictive tab has gone back to being Model. Not sure which is best.
cranSearch defaults to r-project rather than unimelb.
Migrate from RGtk2DfEdit to its replacement, RGtk2Extras.
Revert cairoDevice to being a Suggests rather than Depends.
Remove redundant CITATION from root of package, as the real one is in inst.
-- Graham Williams [email protected] Sat, 31 Jul 2010 14:34:50 +1000
rattle (2.5.36) unstable; urgency=low
Add Bill Venables' searchCRAN example code.
Improve error message when we find duplicate variable names in a loaded file, which might result when there is no header line.
Add help item for Projects.
On Evaluate with supplied file, use the hdr specified on the Data tab.
-- Graham Williams [email protected] Mon, 12 Jul 2010 06:43:06 +1000
rattle (2.5.35) unstable; urgency=low
Add utility lss function to list object sizes.
Add options text entry for SVM to easily allow other options.
Better formatting of the Log tab.
Use a set.seed for SVM to ensure same model each time.
Add option to random forest to impute missing values rather than simply ignoring the observations.
On Evaluate with supplied file, use the sep specified on the Data tab, thus allowing TXT files.
On loading a new dataset for evaluation be sure to add in any missing columns, and unify the levels.
Improve binning documentation.
Make RGtk2, cairoDevice, colorspace all dependencies so we can get rattle started and then rattle will prompt to install other packages that are missing when it needs them.
-- Graham Williams [email protected] Thu, 01 Jul 2010 15:34:50 +1000
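The 2.5.35 change that adds missing columns and unifies levels when loading an evaluation dataset can be illustrated with a minimal base R sketch. The helper name align_to_training is hypothetical and is not rattle's internal code; it also assumes that "unify" means adopting the training dataset's factor levels, so values unseen in training become NA.

```r
# Hypothetical helper (not rattle's internal code): make a dataset loaded
# for evaluation conform to the training dataset's columns and levels.
align_to_training <- function(train, newdata) {
  # Add any column present in training but absent from the new data,
  # filled with NA (for example a target column that was not supplied).
  for (v in setdiff(names(train), names(newdata))) {
    newdata[[v]] <- NA
  }
  # Assumption: categoric columns take the training levels, so that a
  # model built on the training data sees the same factor coding.
  for (v in names(train)) {
    if (is.factor(train[[v]])) {
      newdata[[v]] <- factor(as.character(newdata[[v]]),
                             levels = levels(train[[v]]))
    }
  }
  # Return the columns in the same order as the training data.
  newdata[names(train)]
}

train <- data.frame(x = 1:3,
                    g = factor(c("a", "b", "c")),
                    y = factor(c("no", "yes", "no")))
new <- data.frame(g = c("b", "a"), x = c(10, 20), stringsAsFactors = FALSE)
aligned <- align_to_training(train, new)
str(aligned)
```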
rattle (2.5.34) unstable; urgency=low
When a package is missing, there is now the option to install it right then, and it continues as normal after it gets installed.
Change Suggests to Depends so all used packages get loaded on loading rattle, in an attempt to make it easier to install Rattle. Then the r-cran-rattle package on Debian/Ubuntu will have all required dependencies and a normal install.packages will get all dependencies also, rather than having to use dependencies=c('Depends', 'Suggests'). Penalty is it takes 20 seconds to do 'library(rattle)' on a server and 90 seconds on a netbook - so revert back to not doing this.
Ensure the new train/validate/test scenario is saved across projects.
-- Graham Williams [email protected] Wed, 09 Jun 2010 07:04:08 +1000
rattle (2.5.33) unstable; urgency=low
Bug fix rf.cmd.
Improve scoring functionality: The dataset can have NA's for target, and these can now get scored by rf on Evaluate tab. Loading a CSV file to be scored no longer needs to have the target column included (previously it needed to be there and have non-NA values). Thanks to Chris Snijders.
-- Graham Williams [email protected] Mon, 31 May 2010 06:22:54 +1000
rattle (2.5.32) unstable; urgency=low
Remove dependency on car - not actually being used at the moment.
For random forest, allow sample size text entry as a single integer or a list, as per randomForest.
Use na.omit with cforest, as is done with randomForest.
For randomForest turn subsampling with replacement off since it is more likely to produce biased importance measures, as explained in the cforest papers.
Fix bug with multiple "contact support" lines in error popups.
When showing the randomForest importance values, sort on the accuracy measure rather than the Gini measure, since the Gini is biased in favour of categoric variables with many categories.
ada boost seed should be 42, like all other seeds.
Tidy up some ada output.
Bug fix - save project for rf failing (looking for rf_sampsize_entry).
Remove text from toolbar by default.
Change order of Forest/Boost buttons on Model tab.
Add tooltips for all toolbar buttons.
-- Graham Williams [email protected] Fri, 28 May 2010 15:47:15 +1000
rattle (2.5.31) stable; urgency=low
Add rattle.info() to list information for debugging purposes.
Bump pmml to 1.2.22
Fixes from [email protected]: Extension in Header should be first element. Coefficients in regression models should not be NA (as will be for singularities), but replace with, and so no impact of change.
Ensure Survival defaults are reset appropriately.
-- Graham Williams [email protected] Wed, 19 May 2010 09:50:39 +1000
rattle (2.5.30) stable; urgency=low
On MS/Windows with Japanese, read.csv needs encoding option set with file rather than with read.csv (for UTF-8) but seems okay under other scenarios.
On MS/Windows with Japanese (UTF-8) the encoding of the variables selected for transforming needs to be UTF-8 for much of the process, but "unknown" when using Rtxt and sprintf (when substituting the variable names) to ensure resulting message is correctly matched for encodings.
-- Graham Williams [email protected] Wed, 19 May 2010 09:47:12 +1000
rattle (2.5.29) stable; urgency=low
Add the translation file.
Fix an Encoding/sprintf issue for Japanese on MS/Windows.
Allow crv$NOTEBOOK.MODEL.NAME to be overridden by other packages (RStat).
When dispatch fails be sure to include the Tab label on which it fails.
Ensure HClust Options are re-enabled on loading a project.
-- Graham Williams [email protected] Sat, 24 Apr 2010 07:32:02 +1000
rattle (2.5.28) stable; urgency=low
Minor format changes for glm and rf model output.
Capture additional survival model error and suggest a solution.
Remove spurious additional plot for Survival Residual plot.
Update log tab labels to be more generic.
Update tooltips to be generic and add survival tooltips.
-- Graham Williams [email protected] Thu, 22 Apr 2010 06:21:58 +1000
rattle (2.5.27) unstable; urgency=low
Further translation fixes. In particular, use Encoding(...getText()) <- "UTF-8" to ensure strings from the GUI are UTF-8, and not unknown.
Ensure training dataset rather than sample dataset nomenclature is now used.
Ensure execute button can only be clicked once while it is processing.
Survival plot buttons need to be made sensitive as appropriate.
For Japanese on MS/Windows do not use monospace font since this ends up vertically centering periods and commas (and all other characters). Need a fixed width font that does not do this, but for now we put up with variable width font.
Revert to using only English for all hidden tab labels.
Improved identification of current plot number.
Bug fix multiple vars selected for asnumeric and ascategoric transforms.
-- Graham Williams [email protected] Thu, 22 Apr 2010 06:17:20 +1000
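The encoding fix noted for 2.5.27 can be sketched in base R without a GUI. The string below is a stand-in for text returned by a widget's getText() call (an assumption made for illustration); it is built from raw UTF-8 bytes so that R initially marks its encoding as unknown.

```r
# Stand-in for a GUI string: the UTF-8 bytes of one katakana character
# (U+30C7). rawToChar() leaves the encoding mark as "unknown".
gui_text <- rawToChar(as.raw(c(0xe3, 0x83, 0x87)))
before <- Encoding(gui_text)

# Declare the bytes to be UTF-8. This changes only the encoding mark,
# not the bytes themselves.
Encoding(gui_text) <- "UTF-8"
after <- Encoding(gui_text)

print(c(before = before, after = after))
```

Marking the encoding does not convert anything, so it is only appropriate when the bytes really are UTF-8; converting between encodings would need iconv() or enc2utf8() instead.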
rattle (2.5.26) unstable; urgency=low
Add Cross Tab option to Explore tab to generate cross tabulations of each categoric variable by the target variable. (Luke Lake)
Bug fix - improve how we obtain the plot number from the title, particularly in the context of translations.
Further translation markup.
Clean up the use of dfedit.
Minor improvement to spacing in Log tab.
-- Graham Williams [email protected] Tue, 30 Mar 2010 21:37:29 +1100
rattle (2.5.25) unstable; urgency=low
Start using the RGtk2DfEdit for the View and Edit buttons of the Data tab, and the Enter/Score option of the Evaluate tab. RGtk2DfEdit provides a spreadsheet-like interface to the data. Various data editing options are available. Also press = to run an arbitrary R command on selected data (e.g. select two columns of data and issue the plot command).
Add further markup of text for translations.
Support specification of the character used for decimal points (to suit some European usage).
Fix bug in exporting XML - replace & with &amp;
Survival plots - split survival chart plot from residuals plots, and plot all residuals.
Fix logic behind what is greyed out in the Test tab.
-- Graham Williams [email protected] Mon, 29 Mar 2010 19:37:25 +1100
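The XML export fix in 2.5.25 concerns escaping the ampersand. A minimal base R sketch of that kind of escaping follows; escape_xml is a hypothetical name, not the function used by rattle or pmml, and it also escapes angle brackets as an illustrative extra.

```r
# Hypothetical helper: escape characters that are special in XML text.
# The ampersand must be handled first, otherwise the "&" introduced by
# the later replacements would be escaped a second time.
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

escaped <- escape_xml("Sales & Marketing <2010>")
print(escaped)
```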
rattle (2.5.24) stable; urgency=low
Revamp the help text, and put into the Rtxt translation framework.
Fix the height of the data name widget (the library option was growing the width for some reason).
For Evaluate, add Full and Enter as dataset options. Enter will pop up an editor with the final row from the dataset, allowing you to add rows or modify the supplied row. We supply the row so that we have an example to work from. Full uses the whole original dataset.
-- Graham Williams [email protected] Sat, 06 Mar 2010 14:17:12 +1100
rattle (2.5.23) stable; urgency=low
Catch "arules" error in converting data to transactions when baskets contain repeated items.
When data tab is executed, and so crs$rpart is reset to NULL, always remove the Draw/Rules button from the Tree option of the Predictive tab.
Add code to fix translations that are not being loaded when using RGtk2 on MS/Windows. All is okay on GNU/Linux, but RGtk2 seems not to get the right locale for loaded Glade file. The fix is to traverse the GUI and change all labels, on starting up Rattle. RGtk2 authors tried to fix but it remains an issue.
Ensure rpart is reset on resetting rattle.
Rework handling of tab pages because a Japanese translation on MS/Windows is having issues with the following call (nb=notebook) nb$getTabLabelText(nb$getNthPage(nb$getCurrentPage())) returning what looks like Shift-JIS encoding of the string rather than UTF-8, and hence not string matching the expected tab label.
Fix spelling errors on help menu and ensure help for all topics is covered.
For nnet, use MaxNWts=10000 (default is 1000) to allow larger nets by default, and capture the error message when this is exceeded and better explain what to do.
Ensure we don't export an empty dataset when choosing export on the data tab.
Capture arules error message when there are repeated items in one basket, and explain this more clearly.
For rpart use information as the default split rather than Gini - makes little if any difference.
Allow showHelpPlus to have an extra/alternative question that is displayed.
All random seeds should be 42.
Reset kmeans tab on loading a project.
Add a dozen more weather stations to the weatherAUS dataset.
Improve the logic for the display of the Report radio buttons on the Evaluate tab.
Spelling correction to a number of tooltips.
-- Graham Williams [email protected] Wed, 03 Mar 2010 06:50:58 +1100
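To see why the 2.5.23 entry raises MaxNWts for nnet, it helps to count the weights of a single-hidden-layer network. The sketch below is plain arithmetic in base R and does not call nnet; the function name nnet_weight_count is hypothetical, and the count assumes no skip-layer connections.

```r
# Hypothetical helper: number of weights in a single-hidden-layer network
# with p inputs, `size` hidden units and k outputs, no skip layer.
# Each hidden unit has p input weights plus a bias; each output unit has
# `size` weights plus a bias.
nnet_weight_count <- function(p, size, k = 1) {
  (p + 1) * size + (size + 1) * k
}

n_weights <- nnet_weight_count(p = 50, size = 20, k = 1)
print(n_weights)

# Compare against the nnet default limit (1000) and the limit Rattle
# passes as MaxNWts (10000), both as stated in the changelog entry.
exceeds_default <- n_weights > 1000
fits_rattle     <- n_weights <= 10000
print(c(exceeds_default = exceeds_default, fits_rattle = fits_rattle))
```

With 50 inputs and 20 hidden units the network already needs more than 1000 weights, so the default limit would stop the build while the raised limit would not.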
rattle (2.5.22) stable; urgency=low
Default window height is 650, but not forced so that the window nicely fills a netbook screen if maximised.
Bump R dependency to 2.8.0 in line with update of the CITATION file.
-- Graham Williams [email protected] Sat, 13 Feb 2010 09:48:00 +1100
rattle (2.5.21) stable; urgency=low
Re-enable gettext on MS/Windows, even though RGtk2 2.12.18 has not fixed the bindtextdomain problem with glade files and package supplied translations.
Change the tree plot to use "< =>" and ">= <" to clearly identify which branch the "=" results go. Could not figure out how to get expression to use a "ge" symbol.
Improve formatting of the PvO plots.
Use the pairs.panels function from the psych package for the default scatterplot on the Explore tab.
Add INSTALL file.
-- Graham Williams [email protected] Sun, 07 Feb 2010 15:03:22 +1100
rattle (2.5.20) stable; urgency=low
Restore missing weather.csv file.
Add to Google code: weather.R ChangeLog NEWS ToDo upload_uwe.sh upload_cran.sh.
-- Graham Williams [email protected] Sun, 31 Jan 2010 11:07:55 +1100
rattle (2.5.19) unstable; urgency=low
Ensure the right labels (Time/Risk rather than Class/Prob) displayed in filechooser when exporting a survival model.
Model tab renamed as Predictive.
Ensure boxplots have same "by ..." in the main title.
Update the weather dataset and include many more weather stations in the weatherAUS dataset.
Rtxt does no translations when running on MS/Windows (for now).
-- Graham Williams [email protected] Sat, 30 Jan 2010 09:28:18 +1100