In part one, I described one problem with overfitting the data is that estimates of the target variable in regions without any training data can be unstable, whether those regions require the model to interpolate or extrapolate. Accuracy is a problem, but more precisely, the problems in interpolation and extrapolation are not revealed using any accuracy metrics and only arise when new data points are encountered after the model is deployed. This month, a second problem with overfitting is the model interpretation. Predictive modeling algorithms find variables that associate or correlate with the target variable. When models are
This content is restricted to site members. If you are an existing user, please log in on the right (desktop) or below (mobile). If not, register today and gain free access to original content and industry news. See the details here.