1) data cleaning, including variable scaling, identification and removal of missing values and unbalanced variables, and strategies for improving variable balance;
2) modeling based on random forests and gradient boosted models, including feature selection, model training, cross-validation and external testing.
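The data cleaning step in 1) can be illustrated with a minimal sketch. This is not the package's own code: it is a standard-library Python illustration, and the function names (`missing_fraction`, `is_unbalanced`, `standardize`, `clean`), the use of `None` for missing values, and the thresholds `max_missing=0.5` and `max_share=0.95` are all assumptions chosen for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def missing_fraction(values):
    """Share of entries that are None (treated here as missing)."""
    return sum(v is None for v in values) / len(values)


def is_unbalanced(values, max_share=0.95):
    """Flag a variable whose most common observed value dominates."""
    observed = [v for v in values if v is not None]
    if not observed:
        return True
    top_count = Counter(observed).most_common(1)[0][1]
    return top_count / len(observed) > max_share


def standardize(values):
    """Scale to zero mean and unit variance, keeping None in place."""
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), pstdev(observed)
    if sigma == 0:
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sigma for v in values]


def clean(table, max_missing=0.5, max_share=0.95):
    """Drop sparse or unbalanced columns, then scale the remaining ones."""
    kept = {}
    for name, values in table.items():
        if missing_fraction(values) > max_missing:
            continue  # too many missing values (hypothetical threshold)
        if is_unbalanced(values, max_share):
            continue  # one value dominates (hypothetical threshold)
        kept[name] = standardize(values)
    return kept


# Toy data: one usable column, one sparse column, one constant column.
table = {
    "age": [20, 30, 40, 50, None],
    "mostly_missing": [None, None, None, 1.0, None],
    "near_constant": [1, 1, 1, 1, 1],
}
cleaned = clean(table)
print(sorted(cleaned))
```

In this toy table only `age` survives: `mostly_missing` exceeds the assumed missing-value threshold and `near_constant` is flagged as unbalanced. Suitable thresholds depend on the data set.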
For more information, please see H2O.ai (Oct. 2016). R Interface for H2O, R package version 220.127.116.11. <https://github.com/h2oai/h2o-3>; Zhang W (2016).