caret - Classification and Regression Training
Misc functions for training and plotting classification and regression models.
19.04 score 1.7k stars 333 dependents 79k scripts 173k downloads
recipes - Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
18.30 score 614 stars 425 dependents 9.5k scripts 178k downloads
tidymodels - Easily Install and Load the 'Tidymodels' Packages
The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
16.98 score 815 stars 15 dependents 74k scripts 133k downloads
parsnip - A Common API to Modeling and Analysis Functions
A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).
16.57 score 652 stars 83 dependents 3.6k scripts 48k downloads
tune - Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, preprocessing methods, and post-processing steps.
14.73 score 334 stars 50 dependents 1.2k scripts 38k downloads
Cubist - Rule- And Instance-Based Regression Modeling
Regression modeling using rules with added instance-based corrections.
13.15 score 45 stars 19 dependents 3.0k scripts 25k downloads
stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
12.40 score 302 stars 2 dependents 1.1k scripts 5.8k downloads
probably - Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
12.35 score 121 stars 1 dependents 19k scripts 7.4k downloads
butcher - Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
12.30 score 138 stars 17 dependents 216 scripts 11k downloads
C50 - C5.0 Decision Trees and Rule-Based Models
C5.0 decision trees and rule-based models for pattern recognition that extend the work of Quinlan (1993, ISBN:1-55860-238-0).
12.13 score 53 stars 11 dependents 1.7k scripts 15k downloads
modeldata - Data Sets Useful for Modeling Examples
Data sets used for demonstrating or testing model-related packages are contained in this package.
11.06 score 24 stars 17 dependents 2.1k scripts 38k downloads
finetune - Additional Functions for Model Tuning
The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <doi:10.48550/arXiv.1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
9.82 score 64 stars 2 dependents 1.0k scripts 5.6k downloads
tailor - Iterative Steps for Postprocessing Model Predictions
Postprocessors refine predictions outputted from machine learning models to improve predictive performance or better satisfy distributional limitations. This package introduces 'tailor' objects, which compose iterative adjustments to model predictions. A number of pre-written adjustments are provided with the package, such as calibration. See Lichtenstein, Fischhoff, and Phillips (1977) <doi:10.1007/978-94-010-1276-8_19>. Other methods and utilities to compose new adjustments are also included. Tailors are tightly integrated with the 'tidymodels' framework.
9.36 score 16 stars 51 dependents 33 scripts 31k downloads
AmesHousing - The Ames Iowa Housing Data
Raw and processed versions of the data from De Cock (2011) <http://ww2.amstat.org/publications/jse> are included in the package.
7.82 score 15 stars 2 dependents 644 scripts 5.7k downloads
modeldb - Fits Models Inside the Database
Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.
database, dbplyr, dplyr, ggplot2, modeling, rlang, sql, tidyeval, visualization
7.59 score 78 stars 63 scripts 582 downloads
AppliedPredictiveModeling - Functions and Data Sets for 'Applied Predictive Modeling'
A few functions and several data sets for the Springer book 'Applied Predictive Modeling'.
7.15 score 35 stars 1.3k scripts 6.1k downloads
baguette - Efficient Model Functions for Bagging
Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the size of the model objects and improve speed.
7.00 score 28 stars 802 scripts 848 downloads
usemodels - Boilerplate Code for 'Tidymodels' Analyses
Code snippets to fit models using the tidymodels framework can be easily created for a given data set.
6.78 score 87 stars 174 scripts 302 downloads
desirability2 - Desirability Functions for Multiparameter Optimization
In-line functions for multivariate optimization via desirability functions (Derringer and Suich, 1980, <doi:10.1080/00224065.1980.11980968>) with easy use within 'dplyr' pipelines.
6.73 score 14 stars 2 dependents 51 scripts 614 downloads
sfd - Space-Filling Design Library
A collection of pre-optimized space-filling designs, for up to ten parameters, is contained here. Functions are provided to access designs described by Husslage et al (2011) <doi:10.1007/s11081-010-9129-8> and Wang and Fang (2005) <doi:10.1142/9789812701190_0040>. The design types included are Audze-Eglais, MaxiMin, and uniform.
6.48 score 70 dependents 6 scripts 29k downloads
plsmod - Model Wrappers for Projection Methods
Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al (2017) <doi:10.1371/journal.pcbi.1005752>).
mixomics
5.95 score 14 stars 90 scripts 446 downloads
shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
shiny
5.92 score 50 stars 48 scripts 284 downloads
tabpfn - Prior-Data Fitted Network Foundational Model for Tabular Data
Provides a consistent API for classification and regression models based on the 'TabPFN' model of Hollmann et al. (2025), "Accurate predictions on small data with a tabular foundation model," Nature, 637(8045) <doi:10.1038/s41586-024-08328-6>. The calculations are served via 'Python' to train and predict the model.
5.88 score 33 stars 19 scripts 721 downloads
sparseLDA - Sparse Discriminant Analysis
Performs sparse linear discriminant analysis for Gaussians and mixture of Gaussian models.
5.82 score 7 stars 2 dependents 40 scripts 4.0k downloads
desirability - Function Optimization and Ranking via Desirability Functions
S3 classes for multivariate optimization using the desirability function by Derringer and Suich (1980).
5.26 score 3 stars 1 dependents 40 scripts 590 downloads
sparsediscrim - Sparse and Regularized Discriminant Analysis
A collection of sparse and regularized discriminant analysis methods intended for small-sample, high-dimensional data sets. The package features the High-Dimensional Regularized Discriminant Analysis classifier from Ramey et al. (2017) <arXiv:1602.01182>. Other classifiers include those from Dudoit et al. (2002) <doi:10.1198/016214502753479248>, Pang et al. (2009) <doi:10.1111/j.1541-0420.2009.01200.x>, and Tong et al. (2012) <doi:10.1093/bioinformatics/btr690>.
5.18 score 3 stars 1 dependents 101 scripts 3.4k downloads
important - Supervised Feature Selection
Interfaces for choosing important predictors in supervised regression, classification, and censored regression models. Permuted importance scores (Biecek and Burzykowski (2021) <doi:10.1201/9780429027192>) can be computed for 'tidymodels' model fits.
5.09 score 19 stars 26 scripts 199 downloads
beans - Data on Dried Beans
These data contain morphological image measurements for dried beans from Koklu and Ozkan (2020) <doi:10.1016/j.compag.2020.105507>.
3.59 score 1 stars 78 scripts 319 downloads
QSARdata - Quantitative Structure Activity Relationship (QSAR) Data Sets
Molecular descriptors and outcomes for several public domain data sets.
3.34 score 70 scripts 3.2k downloads

