DataRobot Predictive Modeling API

For working with the DataRobot predictive modeling platform's API <https://app.datarobot.com>.


News

New features:

  • The premium feature DataRobot Prime has been added. You can now approximate a model on the leaderboard and download executable code for it. Talk to your account representative if the feature is not available on your account. The related new functions are GetPrimeEligibility, RequestApproximation, ListPrimeModels, GetPrimeModel, GetRulesets, RequestPrimeModel, GetPrimeModelFromJobId, CreatePrimeCode, GetPrimeFileFromJobid, ListPrimeFiles, GetPrimeFile, and DownloadPrimeCode.
  • A utility function, WaitForJobToComplete, has been added. It will block until the specified job finishes, or raise an error if it does not finish within a specified timeout.
  • Functions SetupProjectFromMySQL, SetupProjectFromOracle, SetupProjectFromPostgreSQL and SetupProjectFromHDFS have been added. They allow users to create DataRobot projects from MySQL, Oracle, PostgreSQL and HDFS data sources.
  • Functions RequestTransferrableModel, DownloadTransferrableModel, UploadTransferrableModel, GetTransferrableModel, ListTransferrableModels, UpdateTransferrableModel and DeleteTransferrableModel have been added. They allow users to download models from the modeling server and transfer them to a dedicated prediction server. (These functions are only useful to users with an on-premise environment.)
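
The blocking behavior described for WaitForJobToComplete (wait until the job finishes, or raise an error after a timeout) can be pictured with a minimal sketch in plain R. PollUntilDone, isDone and fakeJobDone are hypothetical names used only for illustration; this is not the package's implementation and makes no calls to the DataRobot API.

```r
# Hypothetical helper for illustration; not a function from the datarobot package.
# isDone: a zero-argument function that returns TRUE once the job has finished.
PollUntilDone <- function(isDone, maxWait = 60, interval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    if (isTRUE(isDone())) {
      return(invisible(TRUE))
    }
    if (Sys.time() >= deadline) {
      stop(sprintf("Job did not finish within %s seconds", maxWait), call. = FALSE)
    }
    Sys.sleep(interval)
  }
}

# Stand-in for a real status check: reports completion on the third call.
calls <- 0
fakeJobDone <- function() {
  calls <<- calls + 1
  calls >= 3
}

PollUntilDone(fakeJobDone, maxWait = 5, interval = 0.01)
```

A status check that never succeeds makes the helper stop with an error once maxWait seconds have passed, which mirrors the timeout behavior described above.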

Enhancements:

  • An optional maxWait parameter has been added to GetModelFromJobId and GetFeatureImpactForJobId, to allow users to specify an amount of time to wait for the job to complete other than the default 60 seconds.
  • Projects can now be run in quickrun mode (which skips some autopilot stages and longer-running models) by passing "quick" as the mode parameter, in the same way "auto" and "manual" modes can be specified.
  • The client will now check the API version offered by the server specified in configuration, and issue a warning if the client version is newer than the server version. The DataRobot server is always backwards compatible with old clients, but new clients may have functionality that is not implemented on older server versions. This issue mainly affects users with on-premise deployments of DataRobot.
  • SetupProject and UploadPredictionDataset now accept a URL as the dataSource parameter.
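
The version check described above amounts to comparing two version strings and warning when the client is ahead of the server. The sketch below shows one way to express that in base R; WarnIfClientNewer is a hypothetical helper and the version strings are made-up examples, so it should not be read as the client's actual code.

```r
# Hypothetical helper; not part of the datarobot package.
WarnIfClientNewer <- function(clientVersion, serverVersion) {
  # utils::compareVersion returns 1 when the first version is the later one.
  if (utils::compareVersion(clientVersion, serverVersion) > 0) {
    warning(
      sprintf(
        "Client API version %s is newer than server API version %s; some functions may be unavailable.",
        clientVersion, serverVersion
      ),
      call. = FALSE
    )
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# Made-up versions: an older client against a newer server raises no warning.
sameOrOlder <- WarnIfClientNewer("2.3", "2.4")
```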

Bugfixes:

  • If a model job errors, GetModelFromJobId will now immediately raise an exception, rather than waiting for the timeout.
  • The maxWait parameter on UploadPredictionDataset will now be correctly applied.

API Changes:

Deprecated and Defunct:

  • The quickrun parameter on SetTarget is deprecated (and will be removed in 3.0). Pass "quick" as the mode parameter instead.
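
A deprecation like this one is commonly handled by accepting the old argument for a while, warning about it, and translating it to the new one. The sketch below assumes that pattern; ResolveMode is a hypothetical helper, not the code inside SetTarget.

```r
# Hypothetical helper showing one way to honor a deprecated argument.
ResolveMode <- function(mode = NULL, quickrun = NULL) {
  if (!is.null(quickrun)) {
    warning(
      "The quickrun parameter is deprecated; pass mode = \"quick\" instead.",
      call. = FALSE
    )
    # Translate the old flag only when the caller did not choose a mode.
    if (isTRUE(quickrun) && is.null(mode)) {
      mode <- "quick"
    }
  }
  mode
}

# New-style call: no warning is raised.
newStyle <- ResolveMode(mode = "quick")
```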

Documentation Changes:

Enhancements:

  • When project creation using SetupProject times out, the error message now includes a URL to use with the new ProjectFromAsyncUrl function to resume waiting for the project creation.
  • GetFeatureInfo now supports retrieving features by feature name. (For backwards compatibility, feature IDs are still supported until 3.0.)
  • The package no longer requires a particular version of the methods package. (This dependency was too strict and required some users to unnecessarily upgrade R.)
  • The projectName argument of SetupProject no longer defaults to the string 'None'. (The new default is not to send a name, which results in the name 'Untitled Project'.)
  • The maxWait argument for SetupProject now controls the timeout for the initial POST request and has a larger default value. The reason for this is that for large project creation file uploads, the server may take a longer-than-normal amount of time to respond, and waiting longer than the default timeout may be necessary.
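
The projectName change above (send no name at all rather than the string 'None') can be illustrated by dropping NULL entries from a request body before it is sent. BuildProjectBody and its field names are assumptions made for this sketch; the real request format is not given here.

```r
# Hypothetical request-body builder; the field names are illustrative only.
BuildProjectBody <- function(dataSource, projectName = NULL) {
  body <- list(file = dataSource, projectName = projectName)
  # Drop NULL entries so an unset name is omitted instead of being sent as "None".
  Filter(Negate(is.null), body)
}

unnamedBody <- BuildProjectBody("data.csv")
namedBody <- BuildProjectBody("data.csv", projectName = "My Project")
```

With this approach the server only sees a projectName field when the caller supplied one, which is what lets it fall back to its own default name.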

Deprecated and Defunct:

  • The ability to use GetFeatureInfo with feature IDs is deprecated (and will be removed in 3.0). Use feature names instead.
  • GetRecommendedBlueprints is replaced by ListBlueprints and deprecated (and will be removed in 3.0).
  • RequestPredictions is deprecated and replaced by RequestPredictionsForDataset. RequestPredictionsForDataset will be renamed to RequestPredictions in 3.0.
  • DeletePendingJobs is removed; use DeleteModelJob instead.
  • GetFeatures is removed; use ListModelFeatures instead.
  • GetPendingJobs is removed; use GetModelJobs instead.
  • StartAutopilot is removed; use SetTarget instead.
  • The url parameter is removed from ConnectToDataRobot.
  • The jobStatus parameter is removed from GetModelJobs.
  • The saveFile and csvExtension parameters are removed from RequestPredictions.
  • The saveFile and csvExtension parameters are removed from SetupProject.
  • "semi" mode option (functions SetTarget, StartNewAutoPilot) is deprecated (and will be removed in 3.0).

New features:

  • The API now supports the new Feature Impact feature. Use RequestFeatureImpact to start a job to compute Feature Impact, and GetFeatureImpactForModel or GetFeatureImpactForJobId to retrieve the completed Feature Impact results.
  • The new functions CreateDerivedFeatureAsCategorical, CreateDerivedFeatureAsText, CreateDerivedFeatureAsNumeric can be used to create derived features as type transforms of existing features.
  • The API now supports uploading (UploadPredictionDataset), listing (ListPredictionDatasets), and deleting (DeletePredictionDataset) datasets for prediction as well as requesting predictions (RequestPredictionsForDataset) against such datasets.

Bugfixes

  • as.data.frame fixed for empty listOfBlueprints, listOfFeaturelists, listOfModels
  • The documentation for SetTarget incorrectly referred to the 'semiauto' (rather than 'semi') autopilot setting. This is fixed.
  • GetPredictions previously used a maxWait of 60, regardless of what maxWait the user specified. This is fixed.

Bugfixes

  • GetModelJobFromId was broken by v2.2.32 and is now fixed.
  • CreateFeaturelist was broken by v2.2.32 and is now fixed.

API Changes

  • Package renamed to datarobot.

New features:

  • ListJobs and DeleteJob functions added. ListJobs lists the jobs in the project queue (of any type). DeleteJob can be used to cancel one of these jobs.
  • ListFeatureInfo (for all features) and GetFeatureInfo (for one feature) have been added for retrieving feature details.

Enhancements:

  • In line with new functionality in version 2.2 of the DataRobot API, CreateUserPartition now allows holdoutLevel to be NULL (which results in not sending the holdout level, in line with backend API changes to allow user partitions to be created without a holdout level).
  • Slices using [ from objects of type listOfBlueprints, listOfFeaturelists, and listOfModels will now retain the appropriate type.
  • Several functions (e.g. ConnectToDataRobot, DeleteModel, PauseQueue, etc.) used to return TRUE as their only possible return value. Now they return nothing instead.
  • GetValidMetrics no longer special-cases the situation in which the project is not yet ready to report the valid metrics for a potential target. In this case, an error will now be returned from the server.
  • Error messages from the server now include additional detail.
  • To improve error messages, in several places error messages no longer reference the top-level function the user called.
  • The SetTarget function will now properly block execution until the server indicates the project has finished initializing and is ready to build models

Deprecated and Defunct:

  • GetFeatures has been deprecated and renamed to ListModelFeatures (for more clarity and consistency in naming, and to avoid confusion with the new GetFeatureInfo and ListFeatureInfo).
  • Support for authenticating via username/password has been removed. Use an API token instead.
  • Removed broken UpdateDefaultPartition. To use one of the default partition methods with updated settings, please use CreateRandomPartition or CreateStratifiedPartition.

Enhancements

  • Use of the WaitForAutopilot function will no longer trigger deprecation warnings

Bugfixes

  • Due to a dependency on the methods package (which is loaded by default in interactive sessions but not when running Rscript), RequestPredictions did not work when invoked with Rscript. This is fixed. The methods package is now in 'depends' instead of 'imports' to prevent this problem from ever occurring again.

Deprecated & Defunct

  • Removed broken UpdateDefaultPartition. Please use the other partition-creating functions.

Bugfixes

  • Due to a dependency on the methods package (which is loaded by default in interactive sessions but not when running Rscript), some functions did not work when invoked with Rscript. This is fixed.
  • SetupProject and GetPredictions now check for and display errors in project creation. (Previously they would keep waiting and time out if there were errors.)
  • Previously errors would sometimes appear missing a space between two words. This is fixed.

Bugfixes

  • Fixed a problem that caused an error when getting predictions if the installed version of the httr package was 1.0 or older.

Enhancements:

  • HTTP requests now include User-Agent headers for logging purposes, e.g. "DataRobotRClient/2.0.25 (Darwin 14.5.0 x86_64)".
  • We now provide a more informative error message after receiving HTML from the server when we expected JSON.
  • We avoid httr encoding warning messages by specifying UTF-8.
  • It is now possible to not specify the desired jobStatus in GetPendingJobs (by passing NULL for the jobStatus argument, which is now the default).
  • GetPredictions now checks whether a prediction job has errored or been canceled and will error right away in that case (instead of waiting until the timeout)
  • When specifying the data source as a dataframe (in RequestPredictions or SetupProject), the class may now be a subclass of dataframe (it need not be equal to dataframe).
  • Previously GetModelJobs returned a dataframe when there are jobs but an empty list when there are none. Now it consistently returns a dataframe (with zero rows if there are no jobs) either way.
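
The data source change above (accepting subclasses of dataframe) comes down to the difference between comparing the class vector for equality and testing inheritance. The sketch below shows that difference in base R; taggedFrame is a made-up subclass used only for illustration.

```r
# A data frame carrying an extra, made-up subclass.
tagged <- structure(
  data.frame(x = 1:3),
  class = c("taggedFrame", "data.frame")
)

# Strict equality on the class vector rejects the subclass.
strictCheck <- identical(class(tagged), "data.frame")

# inherits() accepts any object whose class vector includes "data.frame".
subclassCheck <- inherits(tagged, "data.frame")
```

Here strictCheck is FALSE and subclassCheck is TRUE, so a check based on inherits() admits objects that a strict comparison would turn away.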

New features:

  • ConnectToDataRobot can now read from a YAML config file.
  • On package startup, we look for a config file in the default location, so the user does not need to call ConnectToDataRobot explicitly
  • WaitForAutopilot function added. This function periodically checks whether Autopilot is finished and returns only after it is.
  • SetupProject and RequestPredictions now default to using a tempfile instead of placing the file to be uploaded into the current working directory.
  • New function StartNewAutopilot can be used to restart autopilot on a specific featurelist if it was previously running on a different one.
  • New function SetTarget provides the functionality that StartAutopilot used to be responsible for. StartAutopilot is now deprecated, and SetTarget should be used instead. This function can now take a featurelistId argument, specifying which featurelist to use.

Bugfixes:

  • GetPendingJobs (now deprecated in favor of GetModelJobs) was broken and is now fixed.
  • GetValidMetrics was broken and is now fixed.
  • GetProjectList no longer errors when there are no projects. It now returns an object whose structure matches the returned object when there are projects.

Deprecated and Defunct:

  • The arguments controlling where the tempfile goes (in SetupProject and RequestPredictions) are now deprecated
  • DeletePendingJob is deprecated (use DeleteModelJob instead)
  • GetPendingJob is deprecated (use GetModelJob instead)
  • jobStatus argument to GetModelJob/GetPendingJob is deprecated (use status instead)
  • StartAutopilot is deprecated (use SetTarget instead).

API Changes Summary:

  • Support for experimental date partitioning has been removed from the DataRobot API, so it is being removed from the client immediately: the CreateDatePartition function has been removed.

Enhancements:

  • Codebase cleaned of many lint violations.

New Features:

  • DeletePredictJob, GetPredictJobs, GetPredictions and RequestPredictions have been added to control the prediction functionality introduced in the v2.0 feature set of the API.
  • "quickrun" parameter added to StartAutopilot. This boolean enables use of the quickrun autopilot feature of DataRobot.

Bugfixes: None

Deprecated and Defunct: None

API Changes: None

  • Fixes the maxWait parameter that was unsuccessfully introduced in 0.2.23.
  • A maxWait parameter has been added to SetupProject to allow for datasets that take a very long time to initialize on the DataRobot server.
  • Documentation structure changed to use Roxygen2


install.packages("datarobot")

2.4.0 by David Chudzicki, 4 months ago


Browse source code at https://github.com/cran/datarobot


Authors: Ron Pearson [aut], Zachary Deane-Mayer [aut], David Chudzicki [aut], Dallin Akagi [aut]




Task views: Web Technologies and Services


MIT + file LICENSE license


Imports httr, jsonlite, yaml

Depends on methods

Suggests knitr, MASS, testthat, beanplot, mlbench, car, rmarkdown, insuranceData, doBy, lintr, stubthat

