Class ComplementNaiveBayes

java.lang.Object
weka.classifiers.Classifier
weka.classifiers.bayes.ComplementNaiveBayes
All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

public class ComplementNaiveBayes extends Classifier implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
Class for building and using a Complement class Naive Bayes classifier.

For more information, see:

Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger: Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623, 2003.

P.S.: TF, IDF and length normalization transforms, as described in the paper, can be performed through weka.filters.unsupervised.StringToWordVector.

BibTeX:

 @inproceedings{Rennie2003,
    author = {Jason D. Rennie and Lawrence Shih and Jaime Teevan and David R. Karger},
    booktitle = {ICML},
    pages = {616-623},
    publisher = {AAAI Press},
    title = {Tackling the Poor Assumptions of Naive Bayes Text Classifiers},
    year = {2003}
 }
 

Valid options are:

 -N
  Normalize the word weights for each class
 
 -S
  Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
 
Version:
$Revision: 5516 $
Author:
Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
  • Constructor Details

    • ComplementNaiveBayes

      public ComplementNaiveBayes()
  • Method Details

    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Classifier
      Returns:
      an enumeration of all the available options.
    • getOptions

      public String[] getOptions()
      Gets the current settings of the classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Classifier
      Returns:
      an array of strings suitable for passing to setOptions
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -N
        Normalize the word weights for each class
       
       -S
        Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Classifier
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
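As an illustration of the option format only, the sketch below shows how an array such as {"-N", "-S", "0.5"} could map onto the two documented settings. The class CnbOptionsSketch, its fields, and its parse method are hypothetical and are not part of Weka; the real setOptions may handle unknown or malformed options differently.

```java
/** Hypothetical stand-in for the two documented options; not Weka code. */
public class CnbOptionsSketch {
    /** Mirrors -N: normalize the word weights for each class. */
    public boolean normalizeWordWeights = false;
    /** Mirrors -S: smoothing value, documented default 1.0. */
    public double smoothingParameter = 1.0;

    /** Parses an option array in the documented "-N" / "-S <value>" form. */
    public static CnbOptionsSketch parse(String[] options) {
        CnbOptionsSketch settings = new CnbOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-N")) {
                settings.normalizeWordWeights = true;
            } else if (options[i].equals("-S")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("-S requires a value");
                }
                // The value follows the flag as a separate array element.
                settings.smoothingParameter = Double.parseDouble(options[++i]);
            } else {
                // Assumption: this sketch rejects anything it does not know.
                throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        CnbOptionsSketch settings = parse(new String[] {"-N", "-S", "0.5"});
        System.out.println(settings.normalizeWordWeights + " " + settings.smoothingParameter);
    }
}
```

With the real class, the same string array would be passed to setOptions, and getOptions would return an array in the same form.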
    • getNormalizeWordWeights

      public boolean getNormalizeWordWeights()
      Returns true if the word weights for each class are to be normalized
      Returns:
      true if the word weights are normalized
    • setNormalizeWordWeights

      public void setNormalizeWordWeights(boolean doNormalize)
      Sets whether the word weights for each class should be normalized
      Parameters:
      doNormalize - whether the word weights are to be normalized
    • normalizeWordWeightsTipText

      public String normalizeWordWeightsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getSmoothingParameter

      public double getSmoothingParameter()
      Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
      Returns:
      the smoothing value
    • setSmoothingParameter

      public void setSmoothingParameter(double val)
      Sets the smoothing value used to avoid zero WordGivenClass probabilities
      Parameters:
      val - the new smoothing value
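To make the role of the smoothing value and of weight normalization concrete, here is a minimal sketch of complement word weights as described in the Rennie et al. paper: the weight of word i for class c is the log of the smoothed frequency of word i in all classes other than c. The class CnbWeightsSketch, its method, and the choice of adding the smoothing value once per vocabulary word in the denominator are assumptions made for illustration; this is not Weka's source and may differ from it in detail.

```java
import java.util.Arrays;

/** Illustrative complement-weight computation; not Weka code. */
public class CnbWeightsSketch {
    /**
     * counts[c][i] is the total frequency of word i over training documents of class c.
     * Returns w[c][i] = log((complementCount + smoothing) / (complementTotal + smoothing * V)),
     * optionally divided by the sum of absolute weights of class c.
     */
    public static double[][] complementWeights(double[][] counts, double smoothing,
                                               boolean normalize) {
        int numClasses = counts.length;
        int vocab = counts[0].length;
        double[] wordTotals = new double[vocab];
        double[] classTotals = new double[numClasses];
        double grandTotal = 0.0;
        for (int c = 0; c < numClasses; c++) {
            for (int i = 0; i < vocab; i++) {
                wordTotals[i] += counts[c][i];
                classTotals[c] += counts[c][i];
                grandTotal += counts[c][i];
            }
        }
        double[][] w = new double[numClasses][vocab];
        for (int c = 0; c < numClasses; c++) {
            // Word occurrences in every class except c.
            double complementTotal = grandTotal - classTotals[c];
            double absSum = 0.0;
            for (int i = 0; i < vocab; i++) {
                double complementCount = wordTotals[i] - counts[c][i];
                // Smoothing keeps the argument of log strictly positive.
                w[c][i] = Math.log((complementCount + smoothing)
                        / (complementTotal + smoothing * vocab));
                absSum += Math.abs(w[c][i]);
            }
            if (normalize && absSum > 0.0) {
                for (int i = 0; i < vocab; i++) {
                    w[c][i] /= absSum;
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        double[][] counts = {{3.0, 1.0}, {1.0, 3.0}};
        System.out.println(Arrays.deepToString(complementWeights(counts, 1.0, false)));
        System.out.println(Arrays.deepToString(complementWeights(counts, 1.0, true)));
    }
}
```

With a smoothing value of zero, a word that never occurs outside class c would produce log(0); a positive value avoids that case.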
    • smoothingParameterTipText

      public String smoothingParameterTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • globalInfo

      public String globalInfo()
      Returns a string describing this classifier
      Returns:
      a description of the classifier suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Classifier
      Returns:
      the capabilities of this classifier
    • buildClassifier

      public void buildClassifier(Instances instances) throws Exception
      Generates the classifier.
      Specified by:
      buildClassifier in class Classifier
      Parameters:
      instances - set of instances serving as training data
      Throws:
      Exception - if the classifier has not been built successfully
    • classifyInstance

      public double classifyInstance(Instance instance) throws Exception
      Classifies a given instance.

      The classification rule is:
      argmin over classes c of the sum over all words i of (ti * Wci)
      where
      ti is the frequency of word i in the given instance, and
      Wci is the weight of word i in class c.

      For more information see section 4.4 of the paper mentioned above in the classifier's description.

      Overrides:
      classifyInstance in class Classifier
      Parameters:
      instance - the instance to classify
      Returns:
      the index of the class the instance is most likely to belong to.
      Throws:
      Exception - if the classifier has not been built yet.
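The classification rule documented for classifyInstance can be sketched in plain Java as follows. The class CnbRuleSketch and its classify method are hypothetical names used for illustration; the sketch takes a dense array of word frequencies and a precomputed weight matrix, whereas the real method works on an Instance and the model built by buildClassifier.

```java
/** Illustrative version of the documented classification rule; not Weka code. */
public class CnbRuleSketch {
    /**
     * Returns the class index c that minimizes the sum over words i of t[i] * w[c][i].
     * t[i] is the frequency of word i in the instance,
     * w[c][i] is the weight of word i in class c.
     */
    public static int classify(double[] t, double[][] w) {
        int best = -1;
        double bestSum = Double.POSITIVE_INFINITY;
        for (int c = 0; c < w.length; c++) {
            double sum = 0.0;
            for (int i = 0; i < t.length; i++) {
                sum += t[i] * w[c][i];
            }
            // Keep the class with the smallest weighted sum seen so far.
            if (sum < bestSum) {
                bestSum = sum;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] t = {2.0, 0.0, 1.0};
        double[][] w = {
            {-1.0, -2.0, -0.5},   // class 0: sum = -2.5
            {-0.5, -1.0, -3.0}    // class 1: sum = -4.0
        };
        System.out.println(classify(t, w)); // prints 1
    }
}
```

The minimum is taken because the weights describe how well a word fits the complement of each class, so the smallest sum marks the class whose complement matches the instance least.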
    • toString

      public String toString()
      Prints out the internal model built by the classifier. In this case it prints out the word weights calculated when building the classifier.
      Overrides:
      toString in class Object
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Classifier
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - the options