Bayes Error/Bayes Risk/Bayes Rate/Irreducible Error
Bayes error, also known as Bayes risk, Bayes rate, or irreducible error, is the lowest possible error rate that any classifier can achieve on a given task. It represents the inherent or irreducible error because it is the best performance achievable, even with a perfect model, given the distribution of the data. It is named after the Reverend Thomas Bayes, who contributed to the development of Bayesian statistics.

In classification problems, Bayes error is the error rate that would remain even with perfect knowledge of the underlying probability distribution of the data. In other words, it is the error rate inherent in the data itself, and no classifier can achieve an error rate lower than the Bayes error. Mathematically, the Bayes error is given by:

\text{Bayes error} = 1 - \mathbb{E}_{x}\left[ \max_{\hat{y}} P(\hat{y} \mid x) \right] \qquad \text{[3767a]}

where:
- \hat{y} is the predicted class label.
- y is the true class label.
- x is the input data.
- P(\hat{y} \mid x) is the conditional probability of predicting \hat{y} given the input x.

Figure 3767a shows the Bayes error as the region where points from different classes overlap. The two synthetic classes are created with multivariate normal distributions. The decision boundary is a straight line (for illustration purposes only), and the gray shaded region represents the area of Bayes error where points from the two classes overlap. Note that in a real-world scenario the true underlying distribution, and therefore the Bayes error, would not be known; this visualization is a conceptual illustration only.

Figure 3767a. Bayes error. (Code).

The general trend is that the test error decreases on the order of 1/\sqrt{m} as the training set size m grows, as shown in Figure 3767b, until it reaches some irreducible error, often referred to as the "Bayes error" or "irreducible error." However, learning algorithms may not always drive the test error to zero, even with an infinite amount of data. The most fundamental reason is the irreducible error itself.
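Equation [3767a] can be checked numerically when the class distributions are known. Below is a minimal sketch (not the book's linked figure code) that estimates the Bayes error by Monte Carlo for two overlapping Gaussian classes; the means, covariance, priors, and sample count are illustrative assumptions.

```python
# Monte Carlo estimate of Eq. [3767a] for two synthetic Gaussian classes.
# All distribution parameters below are illustrative assumptions.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Two overlapping classes with known distributions and equal priors.
mean0, mean1 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
cov = np.eye(2)
prior0 = prior1 = 0.5

# Draw samples from the mixture p(x) = prior0*p(x|0) + prior1*p(x|1).
n = 200_000
labels = rng.random(n) < prior1  # True -> class 1
x = np.where(labels[:, None],
             rng.multivariate_normal(mean1, cov, n),
             rng.multivariate_normal(mean0, cov, n))

# Posterior P(y|x) via Bayes' rule, then Eq. [3767a]:
# Bayes error = E_x[1 - max_y P(y|x)].
p0 = prior0 * multivariate_normal(mean0, cov).pdf(x)
p1 = prior1 * multivariate_normal(mean1, cov).pdf(x)
posterior1 = p1 / (p0 + p1)
bayes_error = np.mean(1.0 - np.maximum(posterior1, 1.0 - posterior1))
print(f"Estimated Bayes error: {bayes_error:.4f}")
# For these two unit-covariance Gaussians with equal priors, the exact
# value is Phi(-||mean1 - mean0||/2) ~= 0.0786, so the estimate should
# land close to that.
```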
The irreducible error represents the inherent noise or uncertainty in the data that no model can overcome; it sets a lower bound on the achievable test error. As the training set size m increases, the model has more information to learn from, and the variance typically decreases, leading to better generalization.
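The trend of Figure 3767b can be illustrated with a short, hypothetical sketch (not the book's figure code): a logistic regression is trained on increasingly large samples from two overlapping 1-D Gaussian classes, and its test error falls toward the known Bayes floor rather than to zero. The class distributions, model choice, and sample sizes are all assumptions made for illustration.

```python
# Test error vs. training set size m: the error approaches the Bayes
# floor, not zero. Data and model choices are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(m):
    """Draw m points from two overlapping 1-D Gaussian classes."""
    y = rng.integers(0, 2, m)
    x = rng.normal(loc=2.0 * y, scale=1.0)[:, None]
    return x, y

x_test, y_test = sample(50_000)
for m in [30, 100, 1_000, 10_000, 100_000]:
    x_train, y_train = sample(m)
    clf = LogisticRegression().fit(x_train, y_train)
    err = 1.0 - clf.score(x_test, y_test)
    print(f"m = {m:>7,d}  test error = {err:.4f}")
# For these classes (means 0 and 2, unit variance, equal priors) the
# Bayes floor is Phi(-1) ~= 0.1587; the test error flattens near it.
```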