I currently use Python with the pandas and NumPy libraries to build a basic naive Bayes classifier.
It includes:
- Selective Naive Bayes (SNB) to select the attributes that yield the best accuracy.
- 5-fold cross-validation by default; you can change the number of folds to another value such as 10 or 2.
- Laplace's estimate, applied to every possible attribute value.
- A Dirichlet prior, applied and tested after the SNB process finishes.
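
The features above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names (`fit_predict_nb`, `cv_accuracy`, `selective_nb`) and their parameters are hypothetical, and it assumes every attribute has already been discretized into a small set of values. `alpha` is a symmetric Dirichlet prior, and `alpha=1` corresponds to Laplace's estimate.

```python
import numpy as np
import pandas as pd


def fit_predict_nb(train, test, attrs, target, alpha=1.0):
    """Fit a categorical naive Bayes on `train` and predict labels for `test`."""
    classes = train[target].unique()
    # Start each class score with its log prior (one column per class).
    scores = pd.DataFrame(
        {c: np.log((train[target] == c).mean()) for c in classes},
        index=test.index,
    )
    for a in attrs:
        values = train[a].unique()
        for c in classes:
            counts = train.loc[train[target] == c, a].value_counts()
            denom = counts.sum() + alpha * len(values)
            # Smoothed estimate: (count + alpha) / (class total + alpha * #values).
            probs = (counts.reindex(values, fill_value=0) + alpha) / denom
            # Values never seen in training fall back to the prior mass alone.
            likelihood = test[a].map(probs).fillna(alpha / denom)
            scores[c] += np.log(likelihood.to_numpy())
    return scores.idxmax(axis=1)


def cv_accuracy(df, attrs, target, k=5, alpha=1.0, seed=0):
    """Return the k-fold cross-validated accuracy for the given attributes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(df)), k)
    correct = 0
    for fold in folds:
        mask = np.zeros(len(df), dtype=bool)
        mask[fold] = True
        train, test = df[~mask], df[mask]
        pred = fit_predict_nb(train, test, attrs, target, alpha)
        correct += (pred.to_numpy() == test[target].to_numpy()).sum()
    return correct / len(df)


def selective_nb(df, target, k=5, alpha=1.0):
    """Greedy forward selection: add the attribute that most improves accuracy."""
    remaining = [c for c in df.columns if c != target]
    selected, best = [], 0.0
    while remaining:
        trials = [(cv_accuracy(df, selected + [a], target, k, alpha), a)
                  for a in remaining]
        acc, attr = max(trials, key=lambda t: t[0])
        if acc <= best:  # stop as soon as no attribute improves accuracy
            break
        selected.append(attr)
        remaining.remove(attr)
        best = acc
    return selected, best
```

Under these assumptions, the Dirichlet prior test would amount to calling `cv_accuracy` on the selected attributes with `alpha` set to 2, 3, and so on, and comparing the resulting accuracies.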
Warning:
1. Cannot handle missing values.
2. Can only process non-string attributes; this applies to every attribute.
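
One way to check a dataset against these two limitations before training is sketched below. The helper name `check_input` is hypothetical and is not part of this project; it only relies on standard pandas calls.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_input(df):
    """Return a list of problems; an empty list means the frame looks usable."""
    problems = []
    for col in df.columns:
        # Limitation 1: missing values are not supported.
        if df[col].isna().any():
            problems.append(f"{col}: contains missing values")
        # Limitation 2: string (non-numeric) attributes are not supported.
        if not is_numeric_dtype(df[col]):
            problems.append(f"{col}: non-numeric dtype {df[col].dtype}")
    return problems
```

Rows with missing values could then be dropped with `df.dropna()`, and string columns encoded as integer codes, before the data is passed to the classifier.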
For more information, see the Wikipedia article below:
https://en.wikipedia.org/wiki/Naive_Bayes_classifier
Below is example output using the Pima dataset from the UCI Machine Learning Repository:
- The discretization result
(Use the see_attrs_code_pair() function to see the original attribute names.)
- The SNB process with attribute and accuracy
- The Dirichlet prior test, with the prior ranging from 2 to 60 (a prior of 1 is already covered by the SNB run)
- The accuracy with different Dirichlet priors
- The accuracy as each attribute is added during the SNB process