Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, model solution based on eigenvalues if possible or using numerical solvers and a choice of the optimisation methods made available by the 'FME' package. If a C compiler (on windows: 'Rtools') is installed, differential equation models are solved using compiled C functions. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.
The R package mkin provides calculation routines for the analysis of chemical degradation data, including multicompartment kinetics as needed for modelling the formation and decline of transformation products, or if several compartments are involved.
You can install the latest released version from CRAN from within R:
In the regulatory evaluation of chemical substances like plant protection products (pesticides), biocides and other chemicals, degradation data play an important role. For the evaluation of pesticide degradation experiments, detailed guidance and helpful tools have been developed as detailed in 'Credits and historical remarks' below.
mkinmod, including equilibrium reactions and using the single first-order reversible binding (SFORB) model, which will automatically create two latent state variables for the observed variable.
mkinpredictis performed either using the analytical solution for the case of parent only degradation, an eigenvalue based solution if only simple first-order (SFO) or SFORB kinetics are used in the model, or using a numeric solver from the
deSolvepackage (default is
compiled_models. The autogeneration of C code was inspired by the
ccSolvepackage. Thanks to Karline Soetaert for her work on that.
transform_odeparmsso their estimators can more reasonably be expected to follow a normal distribution. This has the side effect that no constraints are needed in the optimisation. Thanks to René Lehmann for the nice cooperation on this, especially the isometric logration transformation that is now used for the formation fractions.
mkinfitobject is in fact a full report that should give enough information to be able to approximately reproduce the fit with other tools.
reweight.method = "obs"to your call to
mkinfitand a separate variance componenent for each of the observed variables will be optimised in a second stage after the primary optimisation algorithm has converged.
reweight.method = "tc".
There is a graphical user interface that I consider useful for real work. Please refer to its documentation page for installation instructions and a manual.
It could not have been written without me being introduced to regulatory fate
modelling of pesticides by Adrian Gurney during my time at Harlan Laboratories
Ltd (formerly RCC Ltd).
mkin greatly profits from and largely follows
the work done by the
FOCUS Degradation Kinetics Workgroup,
as detailed in their guidance document from 2006, slightly updated in 2011 and
Also, it was inspired by the first version of KinGUI developed by BayerCropScience, which is based on the MatLab runtime environment.
In 2011, Bayer Crop Science started to distribute an R based successor to KinGUI named
KinGUII whose R code is based on
mkin, but which added, amongst other
refinements, a closed source graphical user interface (GUI), iteratively
reweighted least squares (IRLS) optimisation of the variance for each of the
observed variables, and Markov Chain Monte Carlo (MCMC) simulation
functionality, similar to what is available e.g. in the
Somewhat in parallel, Syngenta has sponsored the development of an
KinGUII based GUI application called CAKE, which also adds IRLS and MCMC, is
more limited in the model formulation, but puts more weight on usability.
CAKE is available for download from the CAKE
website, where you can also
find a zip archive of the R scripts derived from
mkin, published under the GPL
Finally, there is KineticEval, which contains a further development of the scripts used for KinGUII, so the different tools will hopefully be able to learn from each other in the future as well.
Contributions are welcome! Your mkin fork is just a mouse click away... The master branch on github should always be in good shape, I implement new features in separate branches now. If you prefer subversion, project members for the r-forge project are welcome as well. Generally, the source code of the latest CRAN version should be available there. You can also browse the source code at cgit.jrwb.de/mkin.
Add the function 'logLik.mkinfit' which makes it possible to calculate an AIC for mkinfit objects
Add the function 'AIC.mmkin' to makeqit easy to compare columns of mmkin objects
'add_err': Respect the argument giving the number of replicates in the synthetic dataset
'max_twa_parent': Support maximum time weighted average concentration calculations for the hockey stick (HS) model
'mkinpredict': Make the function generic and create a method for mkinfit objects
'mkinfit': Improve the correctness of the fitted two component error model by fitting the mean absolute deviance at each observation against the observed values, weighting with the current two-component error model
'tests/testthat/test_irls.R': Test if the components of the error model used to generate the data can be reproduced with moderate accuracy
Add the function 'CAKE_export' to facilitate cross-checking of results
Implement the logistic model (only tested for parent fits)
'nafta': Add evaluations according to the NAFTA guidance
Make the two-component error model stop in cases where it is inadequate to avoid nls crashes on windows
Move two vignettes to a location where they will not be built on CRAN (to avoid more NOTES from long execution times)
Exclude more example code from testing on CRAN to avoid NOTES from long execution times
'mkinfit': Improve fitting the error model for reweight.method = 'tc'. Add 'manual' to possible arguments for 'weight'
Test that FOCUS_2006_C can be evaluated with DFOP and reweight.method = 'tc'
'sigma_twocomp': Rename 'sigma_rl' to 'sigma_twocomp' as the Rocke and Lorenzato model assumes lognormal distribution for large y. Correct references to the Rocke and Lorenzato model accordingly.
'mkinfit': Use 1.1 as starting value for N parameter of IORE models to obtain convergence in more difficult cases. Show parameter names when 'trace_parms' is 'TRUE'.
Skip some tests on CRAN and winbuilder to avoid timeouts
'test_data_from_UBA_2014': Added this list of datasets containing experimental data used in the expertise from 2014
'mkinfit': Added the iterative reweighting method 'tc' using the two-component error model from Rocke and Lorenzato. NA values in the data are not returned any more.
'mkinfit': Work around a bug in the current FME version that prevented the convergence message to be returned in the case of non-convergence.
'summary.mkinfit': Improved output regarding weighting method. No predictions are returned for NA values in the model (see above).
'summary.mkinfit': Show versions of mkin and R used for fitting (not the ones used for the summary) if the fit was generated with mkin >= 0.9.47.1
vignettes/mkin.Rmd: URLs were updated
synthetic_data_for_UBA: Add the code used to generate the data in the interest of reproducibility
Converted the vignette FOCUS_Z from tex/pdf to markdown/html
DESCRIPTION: Add ORCID
plot.mkinfit: Fix scaling of residual plots for the case of separate plots for each observed variable
plot.mkinfit: Use all data points of the fitted curve for y axis scaling for the case of separate plots for each observed variable
test_FOMC_ill-defined.Ras it is too platform dependent
max_twa_parent to avoid conflict with
twa from my
Update URLs in documentation
Limit test code to one core to pass on windows
rbenchmark as the former is not supported on all platforms
twafunction, calculating maximum time weighted average concentrations for the parent (SFO, FOMC and DFOP).
plot.mmkin: If the plotting device is
tikz, LaTeX markup is being used for the chi2 error in the graphs.
pkgdown, the successor of
staticdocs for generating static HTML documentation. Include example output and graphs also for
plot.mkinfit: Plotting does not fail any more when the compiled model is not available, e.g. because it was removed from the temporary directory. In this case, the uncompiled model is now used for plotting
test_FOMC_ill-definedfailed on several architectures, so the test is now skipped
The title was changed to
Kinetic evaluations of chemical degradation data
plot.mkinfit: Add the possibility to show fits (and residual plots if requested) separately for the observed variables
plot.mkinfit: Add the possibility to show the chi2 error levels in the plot, similar to the way they are shown in
plot_sep: Add this function as a convenience wrapper for plotting observed variables of mkinfit objects separately, with chi2 error values and residual plots.
Vignettes: The main vignette
mkin was converted to R markdown and updated. The other vignettes were also updated to show current improved functionality.
add_err was added to the package, making it easy to generate simulated data using an error model based on the normal distribution
Remove an outdated reference to the inline package in the
mkinfit: Do not error out in cases where the fit converges, but the Jacobian for the untransformed model cost can not be estimated. Give a warning instead and return NA for the t-test results.
summary.mkinfit: Give a warning message when the covariance matrix can not be obtained.
A test has been added to containing a corresponding edge case to check that the warnings are correctly issued and the fit does not terminate.
plot.mmkin: Round the chi2 error value to three significant digits, instead of two decimal digits.
mkinfit: Return the
err values used on weighted fits as a column named
err. Also include these inverse weights when the column
value in the observed data is used, which is returned as
observed in the data component of the mkinfit object.
endpoints: When the name of a substance degrading to a metabolite (e.g. a parent compound) used in the model formulation ended in the letter
f, some rate parameters could be listed as formation fractions with mixed up names. These would also appear in the summary.
mkinfit: Check for all observed variables when checking if the user tried to fix formation fractions when fitting them using ilr transformation.
plot.mmkin: Set the plot margins correctly, also in the case of a single fit to be plotted, so the main title is placed in a reasonable way.
plot.mkinfit: Correct default values for
lty_obs for the case that
obs_vars is specified.
mkinfit, for fitting only the decline from the maximum observed value for models with a single observed variable
Add plots to
Give an explanatory error message when mkinmod fails due to a missing definition of a target variable
print.mkinmod(): Improve formatting when printing mkinmod model definitions
Add an R6 class
mkinds representing datasets with a printing method
Add a printing method for mkinmod objects
Make it possible to specify arbitrary strings as names for the compounds in
mkinmod, and show them in the plot
Use an index.r file to group help topics in static documentation
print.summary.mkinfit(): Avoid an error that occurred when printing summaries generated with mkin versions before 0.9-36
endpoints(): For DFOP and SFORB models, where
optimize()is used, make use of the fact that the DT50 must be between DT50_k1 and DT50_k2 (DFOP) or DT50_b1 and DT50_b2 (SFORB), as
optimize()sometimes did not find the minimum. Likewise for finding DT90 values. Also fit on the log scale to make the function more efficient.
R/*.R: Import (from) stats, graphics and methods packages, and qualify some function calls for non-base packages installed with R to avoid NOTES made by R CMD check --as-cran with upcoming R versions.
mmkin(): This function takes a character vector of model shorthand names, or alternatively a list of mkinmod models, as well as a list of dataset as main arguments. It returns a matrix of mkinfit objects, with a row for each model and a column for each dataset. A subsetting method with single brackets is available. Fitting the models in parallel using the
parallel package is supported.
plot.mmkin(): Plots single-row or single-column
mmkin objects including residual plots.
mkinparplot(): Fix the x axis scaling for rate constants and formation fractions that got confused by the introduction of the t-values of transformed parameters.
vignettes/compiled_models.html: Show the performance improvement factor actually obtained when building the vignette, as well as mkin version, some system info and the CPU model used for building the vignette.
vignettes/*: Clean up vignette generation and include table of contents in HTML vignettes.
mkinmod(): When generating the C code for the derivatives, only declare the time variable when it is needed and remove the '-W-no-unused-variable' compiler flag as the C compiler used in the CRAN checks on Solaris does not know it.
summary.mkinfit(): A one-sided t-test for significant difference of untransformed parameters from zero is now always shown, based on the assumption of normal distribution for estimators of all untransformed parameters. Use with caution, as this assumption is unrealistic e.g. for rate constants in these nonlinear kinetic models.
If a compiler (gcc) is installed, use a version of the differential equation model compiled from C code, which is a huge performance boost for models where only the deSolve method works.
mkinmod(): Create a list component $cf (of class CFuncList) in the list returned by mkinmod, if a version can be compiled from autogenerated C code (see above).
mkinfit(): Set the default
deSolve when a compiled version of the model is present, except when an analytical solution is possible.
mkinparplot(): Avoid warnings that occurred when not all confidence intervals were available in the summary of the fit
print.summary.mkinfit(): Fix printing the summary for the case that the number of iterations is not available
NAMESPACE: export S3 methods plot.mkinfit, summary.mkinfit and print.summary.mkinfit to satisfy R CMD check on R-devel
mkinparplot(): Avoid warning in R CMD check about undeclared global variable
mkinfit(): Report successful termination when quiet = FALSE. This is helpful for more difficult problems fitted with reweight.method = obs, as no progress is often indicated during the reweighting.
A first test using results established in the expertise written for the German Federal Environmental Agency (UBA) was added.
Add synthetic datasets generated for expertise written for the German Federal Environmental Agency UBA
Add tests based on these datasets
Add the convenience function
mkinsub() for creating the lists used in
Add the possibility to fit indeterminate order rate equation (IORE) models using an analytical solution (parent only) or a numeric solution. Paths from IORE compounds to metabolites are supported when using formation fractions (use_of_ff = 'max'). Note that the numerical solution (method.ode = 'deSolve') of the IORE differential equations sometimes fails due to numerical problems.
Switch to using the Port algorithm (using a model/trust region approach) per default. While needing more iterations than the Levenberg-Marquardt algorithm previously used per default, it is less sensitive to starting parameters.
The formatting of differential equations in the summary was further improved
Always include 0 on y axis when plotting during the fit
The initial value (state.ini) for the observed variable with the highest observed residue is set to 100 in case it has no time zero observation and
state.ini = "auto"
A basic unit test for
mkinerrmin() was written
mkinfit(): The internally fitted parameter for
g was named
g_ilr even when
mkinfit(): The initial value (state.ini) for the parent compound was not set when the parent was not the (only) variable with the highest value in the observed data.
mkinerrmin(): When checking for degrees of freedom for metabolites, check if their time zero value is fixed instead of checking if the observed value is zero. This ensures correct calculation of degrees of freedom also in cases where the metabolite residue at time zero is greater zero.
plot.mkinfit(): Avoid a warning message about only using the first component of ylim that occurred when ylim was specified explicitly
The formatting of differential equations in the summary was improved by wrapping overly long lines
The FOCUS_Z vignette was rebuilt with the above improvement and using a width of 70 to avoid output outside of the grey area
print.summary.mkinfit(): Avoid a warning that occurred when gmkin showed summaries ofinitial fits without iterations
mkinfit(): Avoid a warning that occurred when summarising a fit that was performed with maxitmodFit = 0 as done in gmkin for configuring new fits.
The number of degrees of freedom is difficult to define in the case of ilr transformation of formation fractions. Now for each source compartment the number of ilr parameters (=number of optimised parameters) is divided by the number of pathways to metabolites (=number of affected data series) which leads to fractional degrees of freedom in some cases.
The default for the initial value for the first state value is now taken from the mean of the observations at time zero, if available.
The kinetic model can alternatively be specified with a shorthand name for parent only degradation models, e.g.
Optimisation method, number of model evaluations and time elapsed during optimisation are given in the summary of mkinfit objects.
The maximum number of iterations in the optimisation algorithm can be specified using the argument
maxit.modFit to the mkinfit function.
mkinfit gives a warning when the fit does not converge (does not apply to SANN method). This warning is included in the summary.
Avoid plotting an artifical 0 residual at time zero in
In the determination of the degrees of freedom in
mkinerrmin, formation fractions were accounted for multiple times in the case of parallel formation of metabolites. See the new feature described above for the solution.
mkinfit now also works for FOMC and HS models.
Initial values for formation fractions were not set in all cases.
No warning was given when the fit did not converge when a method other than the default Levenberg-Marquardt method
Marq was used.
Vignettes were rebuilt to reflect the changes in the summary method.
Pseudo was excluded because it needs user-defined parameter limits which are not supported.
Newton was excluded because of its different way to specify the maximum number of iterations and because it does not appear to provide additional benefits.
mkinerrmin()used by the summary function.
The original and the transformed parameters now have different names (e.g.
log_k_parent. They also differ in how many they are when we have formation fractions but no pathway to sink.
The order of some of the information blocks in
print.summary.mkinfit.R() has been ordered in a more logical way.
The vignette FOCUS_Z has been simplified to use formation fractions with turning off the sink, and slightly amended to use the new versions of DT50 values calculated since mkin 0.9-29.
All vignettes have been rebuilt so they reflect all changes.
The ChangeLog was renamed to NEWS.md and the entries were converted to markdown syntax compatible with the
tools::news() function built into R.
The test suite was overhauled. Tests of the DFOP and SFORB models with dataset FOCUS_2006_A were removed, as they were too much dependent on the optimisation algorithm and/or starting parameters, because the dataset is SFO (compare kinfit vignette).
Also, the Schaefer complex case can now be fitted using formation fractions, and with the 'Port' optimisation method we also fit A2 in the same way as published in the Piacenza paper.
Some more checks were introduced to
mkinfit(), leading to warnings or stopping execution if unsupported combinations of methods and parameters are requested.
R/mkinresplot.R: Make it possible to specify
R/geometric_mean.R, man/geometric_mean.Rd: Add geometric mean function
R/endpoints.R, man/endpoints.Rd: Calculate additional (pseudo)-DT50 values for FOMC, DFOP, HS and SFORB. Avoid calculation of formation fractions from rate constants when they are directly fitted
Do not backtransform confidence intervals for formation fractions if more than one compound is formed, as such parameters only define the pathways as a set
Add historical remarks and some background to the main package vignette
Correct 'isotropic' into 'isometric' for the ilr transformation
Fork the GUI into a separate package gmkin
DESCRIPTION, NAMESPACE, TODO: Adapt and add copyright information
Remove files belonging to the GUI
Possibility to fit without parameter transformations, using bounds as implemented in FME
Add McCall 2,4,5-T dataset
Enable selection of observed variables in plotting
Add possibility to show residual plot in
R/mkinparplot.R, man/mkinparplot.Rd: plot parameters with confidence intervals
Change vignette format from Sweave to knitr
Split examples vignette to FOCUS_L and FOCUS_Z
Remove warning about constant formation fractions in mkinmod as it was based on a misconception
Restrict the unit test with the Schaefer data to parent and primary metabolites as formation fraction and DT50 for A2 are higly correlated and passing the test is platform dependent. For example, the test fails in 1 out of 14 platforms on CRAN as of today.
Add Eurofins Regulatory AG copyright notices
Import FME and deSolve instead of depending on them to have clean startup
Add a starter function for the GUI:
Change the format of the workspace files of gmkin so they can be distributed and documented in the package
Add gmkin workspace datasets FOCUS_2006_gmkin and FOCUS_2006_Z_gmkin
Bugfix re-enabling the fixing of any combination of initial values for state variables
Default values for kinetic rate constants are not all 0.1 any more but are "salted" with a small increment to avoid numeric artefacts with the eigenvalue based solutions
Backtransform fixed ODE parameters for the summary
Get rid of the optimisation step in
mkinerrmin - this was unnecessary. Thanks to KinGUII for the inspiration - actually this is equation 6-2 in FOCUS kinetics p. 91 that I had overlooked originally
plot.mkinfit as it passed graphical arguments like main to the solver
Do not use
The first successful fits in the not so simple GUI
Fix iteratively reweighted least squares for the case of many metabolites
Unify naming of initial values of state variables
Unify naming in dataframes of optimised and fixed parameters in the summary
Show the weighting method for residuals in the summary
Correct the output of the data in the case of manual weighting
Implement IRLS assuming different variances for observed variables
Do not use 0 values at time zero for chi2 error level calculations. This is the way it is done in KinGUII and it makes sense. It does impact the chi2 error levels in the output. Generally they seem to be lower for metabolites now, presumably because the mean of the observed values is higher
For a detailed list of changes to the mkin source please consult the commit history on http://github.com/jranke/mkin