Found 62 packages in 0.01 seconds
Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl
Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. Among all: access to API of Google Maps, Central Statistical Office of Poland, MojePanstwo, Eurostat, WHO and other sources.
Concept Drift and Concept Shift Detection for Predictive Models
Concept drift refers to the change in the data distribution or
in the relationships between variables over time.
'drifter' calculates distances between variable distributions or
variable relations and identifies both types of drift.
Key functions are:
calculate_covariate_drift() checks distance between corresponding variables in two datasets,
calculate_residuals_drift() checks distance between residual distributions for two models,
calculate_model_drift() checks distance between partial dependency profiles for two models,
check_drift() executes all checks against drift.
'drifter' is a part of the 'DrWhy.AI' universe (Biecek 2018)
Flexible Tool for Bias Detection, Visualization, and Mitigation
Measure fairness metrics in one place for many models. Check how big is model's bias towards different races, sex, nationalities etc. Use measures such as Statistical Parity, Equal odds to detect the discrimination against unprivileged groups. Visualize the bias using heatmap, radar plot, biplot, bar chart (and more!). There are various pre-processing and post-processing bias mitigation algorithms implemented. Package also supports calculating fairness metrics for regression models. Find more details in (Wiśniewski, Biecek (2021))
Datasets and Functions Used in the Book 'Przewodnik po Pakiecie R'
Data sets and functions used in the polish book "Przewodnik po pakiecie R" (The Hitchhiker's Guide to the R). See more at < http://biecek.pl/R>. Among others you will find here data about housing prices, cancer patients, running times and many others.
Ceteris Paribus Profiles
Ceteris Paribus Profiles (What-If Plots) are designed to present model responses around selected points in a feature space. For example around a single prediction for an interesting observation. Plots are designed to work in a model-agnostic fashion, they are working for any predictive Machine Learning model and allow for model comparisons. Ceteris Paribus Plots supplement the Break Down Plots from 'breakDown' package.
Tools for Storing, Restoring and Searching for R Objects
Data exploration and modelling is a process in which a lot of data artifacts are produced. Artifacts like: subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work with the more artifacts are produced and the harder it is to manage these artifacts. Archivist helps to store and manage artifacts created in R. Archivist allows you to store selected artifacts as a binary files together with their metadata and relations. Archivist allows to share artifacts with others, either through shared folder or github. Archivist allows to look for already created artifacts by using it's class, name, date of the creation or other properties. Makes it easy to restore such artifacts. Archivist allows to check if new artifact is the exact copy that was produced some time ago. That might be useful either for testing or caching.
The Proton Game
'The Proton Game' is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. You have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basics knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. The knowledge of dplyr is not required but may be very helpful. This game is linked with the ,,Pietraszko's Cave'' story available at http://biecek.pl/BetaBit/Warsaw. It's a part of Beta and Bit series. You will find more about the Beta and Bit series at http://biecek.pl/BetaBit.
Mini Games from Adventures of Beta and Bit
Three games: proton, frequon and regression. Each one is a console-based data-crunching game for younger and older data scientists. Act as a data-hacker and find Slawomir Pietraszko's credentials to the Proton server. In proton you have to solve four data-based puzzles to find the login and password. There are many ways to solve these puzzles. You may use loops, data filtering, ordering, aggregation or other tools. Only basics knowledge of R is required to play the game, yet the more functions you know, the more approaches you can try. In frequon you will help to perform statistical cryptanalytic attack on a corpus of ciphered messages. This time seven sub-tasks are pushing the bar much higher. Do you accept the challenge? In regression you will test your modeling skills in a series of eight sub-tasks. Try only if ANOVA is your close friend. It's a part of Beta and Bit project. You will find more about the Beta and Bit project at < https://github.com/BetaAndBit/Charts>.
LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles
Local explanations of machine learning models describe, how features contributed to a single prediction.
This package implements an explanation method based on LIME
(Local Interpretable Model-agnostic Explanations,
see Tulio Ribeiro, Singh, Guestrin (2016)
A Set of Datasets Used in My Classes or in the Book 'Modele Liniowe i Mieszane w R, Wraz z Przykladami w Analizie Danych'
A set of datasets and functions used in the book 'Modele liniowe i mieszane w R, wraz z przykladami w analizie danych'. Datasets either come from real studies or are created to be as similar as possible to real studies.