Create, visualize, and test fast-and-frugal decision trees (FFTs). FFTs are very simple decision trees for binary classification problems. FFTs can be preferable to more complex algorithms because they are easy to communicate, require very little information, and are robust against overfitting.
FFTrees is an R package to create and visualize fast-and-frugal decision trees (FFTs) like the one below that predicts heart disease.
Additional information about FFTs, and the FFTrees package can be found at Phillips, Neth, Woike & Gaissmaier, 2017. For seminal papers on FFTs, consult Martignon, Katsikopoulos & Woike, 2008 and Martignon, Vitouch, Takezawa & Forster, 2003
# Install FFTrees from CRANinstall.packages("FFTrees") # Load packagelibrary(FFTrees) # Create an FFTrees object from the heartdisease dataheart.fft <- FFTrees(formula = diagnosis ~., data = heart.train, data.test = heart.test) # Plot the best tree applied to the test dataplot(heart.fft, data = "test", main = "Heart Disease", decision.labels = c("Healthy", "Disease"))
1.4.0
cost.cues
and cost.outcomes
are now specified as named lists to avoid confusion.goal
and goal.chase
.1.3.6
1.3.5
1.3.4
Added class probability predictions with predict.FFTrees(type = "prob")
Updated print.FFTrees()
to display FFT #1 'in words' (from the inwords(x)
function)
1.3.3
Added show.X
arguments to plot.FFTrees()
that allow you to selectively turn on or turn off elements when plotting an FFTrees
object.
Added label.tree
, label.performance
arguments to plot.FFTrees()
that allow you to specify plot (sub) labels.
Bug fixes
1.3.0
Many additional vignettes (e.g.; Accuracy Statistics and Heart Disease Tutorial) and updates to existing vignettes.
Added cost.outcomes
and cost.cues
to allow the user to specify specify the cost of outcomes and cues. Also added a new cost
statistic throughout outputs.
Added inwords()
, a function that converts an FFTrees object to words.
Added my.tree
argument to FFTrees()
that allows the user to specify an FFT verbally. E.g., my.tree = 'If age > 30, predict True. If sex = {m}, predict False. Otherwise, predict True'
.
Added positive predictive value ppv
, negative predictive value npv
and balanced predictive value bpv
as primary accuracy statistics throughout.
Added support for two FFT construction algorithms from Martignon et al. (2008): "zigzag"
and "max"
. The algorithms are contained in the file heuristic_algorithm.R
and can be implemented in FFTrees()
as arguments to algorithm
.
1.2.3
Added sens.w
argument to allow differential weighting of sensitivities and specificities when selecting and applying trees.
Fixed bug in calculating importance weightings from FFForest()
outputs.
1.2.0
Changed wording of statistics throughout package. hr
(hit rate) and far
(false alarm rate) are now sens
for sensitivity, and spec
for specificity (1 - false alarm rate)
The rank.method
argument is now depricated. Use algorithm
instead.
Added stats
argument to plot.FFTrees()
. When stats = FALSE
, only the tree will be plotted without reference to any statistical output.
Grouped all competitive algorithm results (regression, cart, random forests, support vector machines) to the new x.fft$comp
slot rather than a separate first level list for each algorithm. Also replaced separate algorithm wrappers with one general comp.pred()
wrapper function.
Added FFForest()
, a function for creating forests of ffts, and plot.FFForest()
, for visualizing forests of ffts. This function is very much still in development.
Added random forests and support vector machines for comparison in FFTrees()
using the randomForest
and e1071
packages.
Changed logistic regression algorithm from the default glm()
version to glmnet()
for a regularized version.
predict.FFTrees()
now returns a vector of predictions for a specific tree rather than creating an entirely new FFTrees object.
You can now plot cue accuracies within the plot.FFTrees()
function by including the plot.FFTrees(what = 'cues')
argument. This replaces the former showcues()
function.
Many cosmetic changes to plot.FFTrees()
(e.g.; gray levels, more distinct classification balls). You can also control whether the results from competing algorithms are displayed or not with the comp
argument.
Bug-fixes
1.1.7
Trees can now use the same cue multiple times within a tree. To do this, set rank.method = "c"
and repeat.cues = TRUE
.
Bug-fixes
FFTrees()
now supports a single predictor (e.g.; formula = diagnosis ~ age
) which previously did not work.1.1.6
Streamlined code to improve cohesion between functions. This may cause issues with FFTrees objects created with earlier versions of the package. They will need to be re-created.
Updated, clearer print.FFTrees()
method to see important info about an FFTrees object in matrix format.
Training and testing statistics are now always in seperate objects (e.g.; data$train
, data$test
) to avoid confusion.
Bug-fixes
predict.FFTrees()
now works much better by passing a new dataset (data.test
) as a test dataset for an existing FFTrees object.1.1.5
mar
and layout
are now reset after running plot.FFTrees()
1.1.4
Bug-fixes
which.tree
argument in plot.FFTrees()
to tree
to conform to blog posts.predict.FFTrees()
now works better with tibble
inputs.Changed the fft
label to FFTrees
throughout the package to avoid confusion with fast fourier transform. Thus, the main tree building function is now FFTrees()
and the new tree object class is FFTrees