ART combines machine learning, Bayesian inference, and Monte Carlo sampling to learn from a training data set, produce a probabilistic predictive model, and recommend inputs that optimize the desired response.
ART uses a linear ensemble model in which the coefficient (weight) of each model is a probability distribution. Models that perform well in 10-fold cross-validation have weights close to one and contribute strongly to the final prediction, whereas poorly performing models have weights close to zero and have little influence on it. The probability distribution associated with each weight (calculated through Bayesian inference and Monte Carlo sampling) is narrow if all models provide the same prediction for an input, and wide if the models differ in their predictions. The sum of the model predictions, each multiplied by its weight distribution, provides a natural way to quantify the uncertainty of the predictions: if a single model produces very good cross-validated predictions, the probability distribution for the predictions will be narrow (indicating low uncertainty); if several models produce equally good cross-validated predictions but these predictions are very different, the probability distribution will be wide (indicating high uncertainty). Recommendations are found by optimizing a surrogate function through parallel tempering or differential evolution (an evolutionary algorithm closely related to genetic algorithms).
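As a rough illustration of the weighting scheme described above, the following Python sketch samples ensemble weights with a plain random-walk Metropolis sampler. It is not ART's implementation: the flat priors, the Gaussian noise model, the synthetic data, and all names (`log_posterior`, `sample_posterior`, `predictive_spread`) are assumptions made for this example.

```python
import math
import random
import statistics


def log_posterior(weights, sigma, cv_preds, targets):
    """Unnormalized log posterior. Assumed for this sketch: flat priors on
    weights >= 0 and sigma > 0, and Gaussian observation noise."""
    if sigma <= 0.0 or any(w < 0.0 for w in weights):
        return -math.inf
    total = 0.0
    for preds, y in zip(cv_preds, targets):
        mean = sum(w * p for w, p in zip(weights, preds))
        total += -0.5 * ((y - mean) / sigma) ** 2 - math.log(sigma)
    return total


def sample_posterior(cv_preds, targets, n_samples=20000, burn_in=5000,
                     step=0.02, seed=0):
    """Random-walk Metropolis over (weights..., sigma); returns post-burn-in samples."""
    rng = random.Random(seed)
    n_models = len(cv_preds[0])
    state = [1.0 / n_models] * n_models + [1.0]  # last entry is sigma
    current = log_posterior(state[:-1], state[-1], cv_preds, targets)
    samples = []
    for i in range(burn_in + n_samples):
        proposal = [v + rng.gauss(0.0, step) for v in state]
        candidate = log_posterior(proposal[:-1], proposal[-1], cv_preds, targets)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            state, current = proposal, candidate
        if i >= burn_in:
            samples.append(state)
    return samples


def predictive_spread(samples, new_preds, seed=2):
    """Standard deviation of the ensemble prediction for one new input,
    including one noise draw per posterior sample."""
    rng = random.Random(seed)
    draws = [
        sum(w * p for w, p in zip(s[:-1], new_preds)) + rng.gauss(0.0, s[-1])
        for s in samples
    ]
    return statistics.stdev(draws)


# Synthetic stand-ins for cross-validated predictions (not real data):
# two accurate models and one uninformative model.
data_rng = random.Random(1)
targets = [0.5 * i for i in range(20)]
cv_preds = [
    [y + data_rng.gauss(0.0, 0.3), y + data_rng.gauss(0.0, 0.3),
     data_rng.uniform(0.0, 10.0)]
    for y in targets
]

samples = sample_posterior(cv_preds, targets)
mean_weights = [statistics.fmean(s[j] for s in samples) for j in range(3)]

# Same new input seen two ways: the accurate models agree, or they disagree.
spread_agree = predictive_spread(samples, [3.0, 3.0, 3.0])
spread_disagree = predictive_spread(samples, [1.0, 5.0, 3.0])
print(mean_weights, spread_agree, spread_disagree)
```

In this toy setting the uninformative model should receive a weight near zero, while the two accurate models share the remaining weight. Because their individual weights are poorly determined, the predictive spread should be larger for the input on which they disagree.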
The ensemble approach used in ART makes it easy to add new models. If any single external model performs better than the ensemble, adding this model to the ensemble ensures that the results will be the same or better. The initial set of models in the ensemble includes all models from scikit-learn and their combinations through the TPOT meta-learner, and we regularly add new models to the ensemble.
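The "same or better" argument can be sketched with point-estimate weights: an extended ensemble whose new weight is zero reproduces the original ensemble, so refitting from that starting point cannot increase the fitted error. The Python example below uses coordinate descent for non-negative least squares as a stand-in for ART's Bayesian weight inference. The function names and the data are hypothetical, and the guarantee shown applies only to the error being fitted, not to performance on new data.

```python
def ensemble_error(weights, columns, targets):
    """Sum of squared errors of the weighted ensemble (one column per model)."""
    return sum(
        (y - sum(w * col[i] for w, col in zip(weights, columns))) ** 2
        for i, y in enumerate(targets)
    )


def fit_nonnegative_weights(columns, targets, start=None, sweeps=200):
    """Coordinate descent for least squares with weights constrained to be >= 0.

    Each update minimizes the error exactly along one coordinate, so the
    error never increases from one update to the next."""
    weights = list(start) if start is not None else [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            norm = sum(p * p for p in col)
            if norm == 0.0:
                continue
            # Residual with model j's own contribution removed.
            partial = [
                y - sum(w * c[i]
                        for k, (w, c) in enumerate(zip(weights, columns))
                        if k != j)
                for i, y in enumerate(targets)
            ]
            weights[j] = max(0.0, sum(p * r for p, r in zip(col, partial)) / norm)
    return weights


# Hypothetical cross-validated predictions (illustrative numbers only).
targets = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [1.2, 1.9, 3.3, 3.8, 5.1]
model_b = [0.5, 2.5, 2.5, 4.5, 4.5]
external = [1.0, 2.1, 2.9, 4.0, 5.0]  # the newly added external model

base_weights = fit_nonnegative_weights([model_a, model_b], targets)
base_error = ensemble_error(base_weights, [model_a, model_b], targets)

# Warm start: the new model enters with weight zero, which reproduces
# the original ensemble exactly before any refitting takes place.
extended_columns = [model_a, model_b, external]
start = base_weights + [0.0]
start_error = ensemble_error(start, extended_columns, targets)
extended_weights = fit_nonnegative_weights(extended_columns, targets, start=start)
extended_error = ensemble_error(extended_weights, extended_columns, targets)

print(base_error, extended_error)
```

Starting the refit from the old weights is the design choice that makes the comparison direct: any change the optimizer makes from that point can only lower the fitted error or leave it unchanged.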