caret - Classification and Regression Training
Miscellaneous functions for training and plotting classification and regression models.
Last updated
19.66 score 1.7k stars 334 dependents 91k scripts 154k downloads
recipes - Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Last updated
18.21 score 618 stars 425 dependents 9.5k scripts 167k downloads
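The estimate-then-apply idea in the recipes description can be sketched in a few lines. This is a language-agnostic illustration in plain Python, not the recipes API: the function names `estimate_center_scale` and `apply_center_scale` and the sample values are hypothetical, chosen only to show parameters being learned from an initial data set and reused on another.

```python
from statistics import mean, stdev


def estimate_center_scale(train_values):
    """Estimate centering and scaling parameters from the initial data only."""
    return {"mean": mean(train_values), "sd": stdev(train_values)}


def apply_center_scale(params, values):
    """Apply previously estimated parameters to any data set."""
    return [(v - params["mean"]) / params["sd"] for v in values]


# Hypothetical data: an initial (training) set and a later data set.
train = [2.0, 4.0, 6.0, 8.0]
new_data = [5.0, 10.0]

params = estimate_center_scale(train)                  # estimated once
train_processed = apply_center_scale(params, train)
new_processed = apply_center_scale(params, new_data)   # reuses the training mean and sd
```

The point of the split is that the new data never influences the estimated mean or standard deviation, which keeps information from leaking between data sets.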
tidymodels - Easily Install and Load the 'Tidymodels' Packages
The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
Last updated
17.01 score 815 stars 15 dependents 78k scripts 114k downloads
parsnip - A Common API to Modeling and Analysis Functions
A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g., 'R', 'Spark', 'Stan', 'H2O').
Last updated
16.56 score 654 stars 85 dependents 3.7k scripts 44k downloads
tune - Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, preprocessing methods, and post-processing steps.
Last updated
14.76 score 338 stars 51 dependents 1.4k scripts 34k downloads
Cubist - Rule- And Instance-Based Regression Modeling
Regression modeling using rules with added instance-based corrections.
Last updated
13.04 score 45 stars 18 dependents 3.0k scripts 20k downloads
probably - Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
Last updated
12.35 score 121 stars 1 dependents 19k scripts 7.4k downloads
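As a rough illustration of an equivocal zone, the sketch below turns a class probability into a hard prediction but declines to decide when the probability is too close to the cutoff. It is plain Python rather than the 'probably' API; the function name and the `threshold` and `buffer` defaults are assumptions made for this example.

```python
def classify_with_equivocal_zone(prob_event, threshold=0.5, buffer=0.1):
    """Return 'event', 'non-event', or 'equivocal' for one predicted probability.

    Probabilities within `buffer` of `threshold` are treated as too uncertain
    to call. Both values are illustrative defaults.
    """
    if abs(prob_event - threshold) <= buffer:
        return "equivocal"
    return "event" if prob_event > threshold else "non-event"


# Hypothetical predicted probabilities of the event class.
probs = [0.95, 0.55, 0.42, 0.10]
labels = [classify_with_equivocal_zone(p) for p in probs]
```

Here the two middle probabilities fall inside the zone from 0.4 to 0.6 and are reported as equivocal, while the outer two receive hard class predictions.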
butcher - Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Last updated
12.30 score 138 stars 17 dependents 194 scripts 12k downloads
stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Last updated
12.13 score 302 stars 2 dependents 1.0k scripts 5.3k downloads
C50 - C5.0 Decision Trees and Rule-Based Models
C5.0 decision trees and rule-based models for pattern recognition that extend the work of Quinlan (1993, ISBN:1-55860-238-0).
Last updated
11.89 score 53 stars 11 dependents 1.6k scripts 9.3k downloads
modeldata - Data Sets Useful for Modeling Examples
Data sets used for demonstrating or testing model-related packages are contained in this package.
Last updated
11.22 score 24 stars 17 dependents 3.0k scripts 38k downloads
finetune - Additional Functions for Model Tuning
The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <doi:10.48550/arXiv.1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
Last updated
9.69 score 65 stars 2 dependents 1.1k scripts 4.8k downloads
tailor - Iterative Steps for Postprocessing Model Predictions
Postprocessors refine the predictions output by machine learning models to improve predictive performance or better satisfy distributional limitations. This package introduces 'tailor' objects, which compose iterative adjustments to model predictions. A number of pre-written adjustments are provided with the package, such as calibration. See Lichtenstein, Fischhoff, and Phillips (1977) <doi:10.1007/978-94-010-1276-8_19>. Other methods and utilities to compose new adjustments are also included. Tailors are tightly integrated with the 'tidymodels' framework.
Last updated
9.30 score 16 stars 51 dependents 37 scripts 29k downloads
AmesHousing - The Ames Iowa Housing Data
Raw and processed versions of the data from De Cock (2011) <http://ww2.amstat.org/publications/jse> are included in the package.
Last updated
7.87 score 15 stars 2 dependents 667 scripts 6.2k downloads
modeldb - Fits Models Inside the Database
Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.
Last updated
database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization
7.60 score 79 stars 63 scripts 636 downloads
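To make "fitting inside the database" concrete, the sketch below fits a one-predictor linear regression from sums computed by a single SQL aggregate query, so only five numbers leave the database. It uses Python's built-in `sqlite3` module and a hypothetical table `obs`; it illustrates the general idea and is not taken from modeldb's implementation.

```python
import sqlite3

# Hypothetical in-memory table with one predictor x and one outcome y.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (x REAL, y REAL)")
conn.executemany("INSERT INTO obs VALUES (?, ?)", [(1, 3), (2, 5), (3, 7), (4, 9)])

# The database computes the sufficient statistics; no raw rows are fetched.
n, sx, sy, sxx, sxy = conn.execute(
    "SELECT COUNT(*), SUM(x), SUM(y), SUM(x * x), SUM(x * y) FROM obs"
).fetchone()
conn.close()

# Closed-form least squares estimates for simple linear regression.
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n
```

Because the heavy work is a single aggregate query, the same approach carries over to tables far too large to pull into memory.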
baguette - Efficient Model Functions for Bagging
Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the model object's size and improve speed.
Last updated
7.21 score 28 stars 972 scripts 1.3k downloads
AppliedPredictiveModeling - Functions and Data Sets for 'Applied Predictive Modeling'
A few functions and several data sets for the Springer book 'Applied Predictive Modeling'.
Last updated
7.13 score 35 stars 1.5k scripts 5.2k downloads
desirability2 - Desirability Functions for Multiparameter Optimization
In-line functions for multivariate optimization via desirability functions (Derringer and Suich, 1980, <doi:10.1080/00224065.1980.11980968>) with easy use within 'dplyr' pipelines.
Last updated
6.81 score 14 stars 2 dependents 62 scripts 600 downloads
usemodels - Boilerplate Code for 'Tidymodels' Analyses
Code snippets to fit models using the tidymodels framework can be easily created for a given data set.
Last updated
6.81 score 87 stars 186 scripts 392 downloads
sfd - Space-Filling Design Library
A collection of pre-optimized space-filling designs, for up to ten parameters, is contained here. Functions are provided to access designs described by Husslage et al. (2011) <doi:10.1007/s11081-010-9129-8> and Wang and Fang (2005) <doi:10.1142/9789812701190_0040>. The design types included are Audze-Eglais, MaxiMin, and uniform.
Last updated
6.48 score 70 dependents 6 scripts 29k downloads
plsmod - Model Wrappers for Projection Methods
Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al. (2017) <doi:10.1371/journal.pcbi.1005752>).
Last updated
mixomics
5.99 score 14 stars 99 scripts 492 downloads
shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Last updated
shiny
5.95 score 50 stars 51 scripts 282 downloads
sparseLDA - Sparse Discriminant Analysis
Performs sparse linear discriminant analysis for Gaussian and mixture-of-Gaussians models.
Last updated
5.92 score 7 stars 2 dependents 40 scripts 5.0k downloads
tabpfn - Prior-Data Fitted Network Foundational Model for Tabular Data
Provides a consistent API for classification and regression models based on the 'TabPFN' model of Hollmann et al. (2025), "Accurate predictions on small data with a tabular foundation model," Nature, 637(8045) <doi:10.1038/s41586-024-08328-6>. The calculations are served via 'Python' to train and predict the model.
Last updated
5.88 score 33 stars 19 scripts 721 downloads
desirability - Function Optimization and Ranking via Desirability Functions
S3 classes for multivariate optimization using the desirability function by Derringer and Suich (1980).
Last updated
5.34 score 3 stars 1 dependents 49 scripts 764 downloads
sparsediscrim - Sparse and Regularized Discriminant Analysis
A collection of sparse and regularized discriminant analysis methods intended for small-sample, high-dimensional data sets. The package features the High-Dimensional Regularized Discriminant Analysis classifier from Ramey et al. (2017) <arXiv:1602.01182>. Other classifiers include those from Dudoit et al. (2002) <doi:10.1198/016214502753479248>, Pang et al. (2009) <doi:10.1111/j.1541-0420.2009.01200.x>, and Tong et al. (2012) <doi:10.1093/bioinformatics/btr690>.
Last updated
5.32 score 3 stars 1 dependents 115 scripts 4.0k downloads
important - Supervised Feature Selection
Interfaces for choosing important predictors in supervised regression, classification, and censored regression models. Permuted importance scores (Biecek and Burzykowski (2021) <doi:10.1201/9780429027192>) can be computed for 'tidymodels' model fits.
Last updated
5.09 score 19 stars 26 scripts 199 downloads
beans - Data on Dried Beans
These data contain morphological image measurements for dried beans from Koklu and Ozkan (2020) <doi:10.1016/j.compag.2020.105507>.
Last updated
3.65 score 1 stars 90 scripts 321 downloads
QSARdata - Quantitative Structure Activity Relationship (QSAR) Data Sets
Molecular descriptors and outcomes for several public domain data sets.
Last updated
3.48 score 75 scripts 4.0k downloads

