Calibration in Machine Learning
Steps in calibration
- First, we train a base model on D_train and then score the held-out D_cv, obtaining a predicted score yi_hat for each point whose actual label yi we also know.
- Next, we build a table of the D_cv rows sorted by yi_hat, with the actual yi's alongside, and split this sorted table into bins of consecutive rows.
- For each bin we compute the average yi_hat and the average yi; the resulting (average yi_hat, average yi) pairs form D_calib.
- We plot D_calib with the average yi_hat on the x axis and the average yi on the y axis. Ideally this plot is a 45-degree line through the origin; in practice it rarely is. The x axis covers the predicted scores from D_cv, while the y axis shows the average of the actual yi's in each bin, which is an empirical estimate of P(yi = 1) for that score range, and that is exactly what we want to report for test data as well.
- The score a base model outputs directly is generally not a true probability, which is why we calibrate. Using D_cv, we fit a calibration model that smooths this lookup table: for any predicted score yi_hat it returns an estimate of P(yi = 1). At query time, the base model maps xq to yq_hat, and the calibration model maps yq_hat to P(yq = 1). This calibrated probability is more reliable than the raw score from the base model.
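The table-building steps above (sort D_cv by yi_hat, bin, average within each bin) can be sketched as follows. The bin count and the toy miscalibrated data are assumptions chosen purely for illustration:

```python
import numpy as np

def calibration_table(y_hat, y_true, n_bins=10):
    """Build D_calib: sort the D_cv predictions, split the sorted rows
    into bins, and average yi_hat and yi within each bin."""
    order = np.argsort(y_hat)
    y_hat_sorted = np.asarray(y_hat, dtype=float)[order]
    y_true_sorted = np.asarray(y_true, dtype=float)[order]
    pred_bins = np.array_split(y_hat_sorted, n_bins)
    true_bins = np.array_split(y_true_sorted, n_bins)
    avg_pred = np.array([b.mean() for b in pred_bins])  # x axis of the plot
    avg_true = np.array([b.mean() for b in true_bins])  # y axis of the plot
    return avg_pred, avg_true

# Toy D_cv: scores that are systematically miscalibrated relative to labels.
rng = np.random.default_rng(0)
y_hat = rng.uniform(0, 1, 200)
y = (rng.uniform(0, 1, 200) < y_hat ** 0.5).astype(float)
avg_pred, avg_true = calibration_table(y_hat, y, n_bins=5)
```

Plotting `avg_pred` against `avg_true` gives the reliability diagram described above; deviation from the diagonal shows where the base model's scores are over- or under-confident.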