(Part 7 (of 11) of the Top 10 Data Mining Mistakes, drawn from the Handbook of Statistical Analysis and Data Mining Applications) Outliers and leverage points can greatly affect summary results and obscure general trends. Yet one must not routinely dismiss them; they could be the result. The statistician John Aitchison recalled how a spike in radiation levels over the Antarctic was discarded for years as an assumed measurement error, when in fact it revealed a hole in the ozone layer, which proved to be a major finding. To the degree possible, visualize your data to