Naive Bayes Algorithm

Naive Bayes is a classification algorithm for binary (two-class) and multi-class classification problems. The technique is easiest to understand when described using binary or categorical input values.

It is called naive Bayes or idiot Bayes because the calculation of the probabilities for each hypothesis is simplified to make it tractable. Rather than attempting to calculate the joint probability of the attribute values, P(d1, d2, d3|h), the attributes are assumed to be conditionally independent given the target value, so the joint probability is calculated as P(d1|h) * P(d2|h) and so on.
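As a minimal sketch of that simplification, the snippet below multiplies per-attribute probabilities to approximate P(d1, d2, d3|h). The function name and the three probability values are hypothetical and chosen only for illustration; they do not come from a real dataset.

```python
from math import prod


def naive_joint_likelihood(per_attribute_likelihoods):
    """Approximate P(d1, d2, ..., dn | h) as the product of the P(di | h).

    This is only valid under the naive assumption that the attributes
    are conditionally independent given the hypothesis h.
    """
    return prod(per_attribute_likelihoods)


# Hypothetical values for P(d1|h), P(d2|h), P(d3|h).
likelihoods_given_h = [0.5, 0.4, 0.2]
joint_given_h = naive_joint_likelihood(likelihoods_given_h)
print(joint_given_h)  # approximately 0.04
```

With many attributes the product can become very small, so practical implementations often sum log-probabilities instead; that variation is not shown here.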

Representation Used By Naive Bayes Models
A naive Bayes model is represented by probabilities. A list of probabilities is stored to file for a learned naive Bayes model. This includes:

  1. Class Probabilities: The probabilities of each class in the training dataset.
  2. Conditional Probabilities: The conditional probabilities of each input value given each class value.

Learning a naive Bayes model from your training data is fast. Training is fast because only the probability of each class and the probability of each input (x) value given each class need to be calculated. No coefficients need to be fitted by optimization procedures.
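The sketch below shows one way the class probabilities and conditional probabilities could be estimated for categorical inputs, using nothing but counting. The function name, the dictionary layout, and the toy dataset are all assumptions made for illustration. No smoothing is applied, so a value never seen with a class simply has no entry.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Estimate class and conditional probabilities by counting.

    rows:   list of tuples of categorical input values
    labels: list of class values, one per row
    """
    n = len(labels)
    class_counts = Counter(labels)
    # Class probabilities: fraction of training rows belonging to each class.
    class_probs = {c: count / n for c, count in class_counts.items()}

    # value_counts[(attribute_index, class)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1

    # Conditional probabilities: P(input value | class) for each attribute.
    cond_probs = {
        key: {v: cnt / class_counts[key[1]] for v, cnt in counter.items()}
        for key, counter in value_counts.items()
    }
    return class_probs, cond_probs


# Toy, made-up dataset: (weather, car) inputs with a go-out / stay-home class.
rows = [
    ("sunny", "working"),
    ("rainy", "broken"),
    ("sunny", "working"),
    ("sunny", "broken"),
]
labels = ["go-out", "stay-home", "go-out", "stay-home"]

class_probs, cond_probs = train_naive_bayes(rows, labels)
print(class_probs)
print(cond_probs[(0, "stay-home")])
```

Training here is a single pass over the data, which is why no optimization procedure is involved.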


I would like to nominate “Bayes’ rule” because it brought about a revolution in reasoning (inference!) and expert systems, and is fundamental to many machine learning algorithms. And it is more than 250 years old!

Johan Loeckx (Belgium)