A number of statistical measures are used to assess the predictive ability of a model. These can be applied to the
calibration set (the samples used to build the model parameters), the cross-validation set (samples temporarily
excluded from model development but still ultimately involved in it), and the independent set (samples
that have no input into the development of the model).
Statistics based solely on the calibration set can give an inaccurate picture of the predictive ability of the model
for unknown samples, since it is possible to "overfit" the model to the calibration set, particularly if a large number of PLS factors are used.
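This effect can be sketched with a minimal, hypothetical example. The script below stands in for a PLS model with too many factors by fitting an overly flexible polynomial to simulated calibration data; the data, split sizes, and model are illustrative assumptions, not Celignis's actual method. The calibration-set error comes out misleadingly low, while the error on held-out samples reveals the true predictive ability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: a simple linear trend plus noise, standing in for a
# measured property across 30 samples (values are illustrative only).
x = rng.uniform(0, 1, 30)
y = 2.0 * x + rng.normal(scale=0.2, size=30)

# Random split: 20 calibration samples, 10 independent test samples.
idx = rng.permutation(30)
cal, test = idx[:20], idx[20:]

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# An overly flexible model (high-degree polynomial) fitted only to the
# calibration set, mimicking a PLS model with too many factors.
coef = np.polyfit(x[cal], y[cal], deg=12)

rmse_cal = rmse(y[cal], np.polyval(coef, x[cal]))
rmse_test = rmse(y[test], np.polyval(coef, x[test]))

# The calibration error is much smaller than the independent test error:
# the model has fitted the noise in the calibration samples.
print(f"calibration RMSE: {rmse_cal:.3f}")
print(f"test RMSE:        {rmse_test:.3f}")
```

The gap between the two errors is the signature of overfitting: only the statistics computed on samples kept out of model development reflect performance on unknowns.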
Cross-validation statistics give a better idea of the robustness of a model, but ideally independent validation (a test set) should be used. When presenting
regression statistics, Celignis uses the values for the test set unless otherwise stated.
The most important statistics, those used on this website, are described below: