Package weka.core

Class Utils

java.lang.Object
weka.core.Utils
All Implemented Interfaces:
RevisionHandler

public final class Utils extends Object implements RevisionHandler
Class implementing some simple utility methods.
Version:
$Revision: 10570 $
Author:
Eibe Frank, Yong Wang, Len Trigg, Julien Prados
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static double
    log2
    The natural logarithm of 2.
    static double
    SMALL
    The small deviation allowed in double comparisons.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Utils()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static String
    arrayToString(Object array)
    Returns the given array in a string representation.
    static String
    backQuoteChars(String string)
    Converts carriage returns and new lines in a string into \r and \n.
    static void
    checkForRemainingOptions(String[] options)
    Checks if the given array contains any non-empty options.
    static String
    convertNewLines(String string)
    Converts carriage returns and new lines in a string into \r and \n.
    static File
    convertToRelativePath(File absolute)
    Converts a File's absolute path to a path relative to the user (i.e., start) directory.
    static final double
    correlation(double[] y1, double[] y2, int n)
    Returns the correlation coefficient of two double vectors.
    static String
    doubleToString(double value, int afterDecimalPoint)
    Rounds a double and converts it into String.
    static String
    doubleToString(double value, int width, int afterDecimalPoint)
    Rounds a double and converts it into a formatted decimal-justified String.
    static boolean
    eq(double a, double b)
    Tests if a is equal to b.
    static Object
    forName(Class classType, String className, String[] options)
    Creates a new instance of an object given its class name and (optional) arguments to pass to its setOptions method.
    static Class
    getArrayClass(Class c)
    Returns the basic class of an array class (handles multi-dimensional arrays).
    static int
    getArrayDimensions(Class array)
    Returns the dimensions of the given array.
    static int
    getArrayDimensions(Object array)
    Returns the dimensions of the given array.
    static boolean
    getFlag(char flag, String[] options)
    Checks if the given array contains the flag "-Char".
    static boolean
    getFlag(String flag, String[] options)
    Checks if the given array contains the flag "-String".
    static String
    getGlobalInfo(Object object, boolean addCapabilities)
    Utility method for grabbing the global info help (if it exists) from an arbitrary object.
    static String
    getOption(char flag, String[] options)
    Gets an option indicated by a flag "-Char" from the given array of strings.
    static String
    getOption(String flag, String[] options)
    Gets an option indicated by a flag "-String" from the given array of strings.
    static int
    getOptionPos(char flag, String[] options)
    Gets the index of an option or flag indicated by a flag "-Char" from the given array of strings.
    static int
    getOptionPos(String flag, String[] options)
    Gets the index of an option or flag indicated by a flag "-String" from the given array of strings.
    String
    getRevision()
    Returns the revision string.
    static boolean
    gr(double a, double b)
    Tests if a is greater than b.
    static boolean
    grOrEq(double a, double b)
    Tests if a is greater than or equal to b.
    static double
    info(int[] counts)
    Computes entropy for an array of integers.
    static String
    joinOptions(String[] optionArray)
    Joins all the options in an option array into a single string, as might be used on the command line.
    static double
    kthSmallestValue(double[] array, int k)
    Returns the kth-smallest value in the array.
    static int
    kthSmallestValue(int[] array, int k)
    Returns the kth-smallest value in the array.
    static String
    lineWrap(String input, int maxLineWidth)
    Implements simple line breaking.
    static double
    log2(double a)
    Returns the logarithm of a for base 2.
    static double[]
    logs2probs(double[] a)
    Converts an array containing the natural logarithms of probabilities stored in a vector back into probabilities.
    static void
    main(String[] ops)
    Main method for testing this class.
    static int
    maxIndex(double[] doubles)
    Returns index of maximum element in a given array of doubles.
    static int
    maxIndex(int[] ints)
    Returns index of maximum element in a given array of integers.
    static double
    mean(double[] vector)
    Computes the mean for an array of doubles.
    static int
    minIndex(double[] doubles)
    Returns index of minimum element in a given array of doubles.
    static int
    minIndex(int[] ints)
    Returns index of minimum element in a given array of integers.
    static void
    normalize(double[] doubles)
    Normalizes the doubles in the array by their sum.
    static void
    normalize(double[] doubles, double sum)
    Normalizes the doubles in the array using the given value.
    static String
    padLeft(String inString, int length)
    Pads a string to a specified length, inserting spaces on the left as required.
    static String
    padRight(String inString, int length)
    Pads a string to a specified length, inserting spaces on the right as required.
    static String[]
    partitionOptions(String[] options)
    Returns the secondary set of options (if any) contained in the supplied options array.
    static int
    probRound(double value, Random rand)
    Rounds a double to the next nearest integer value in a probabilistic fashion (e.g. 0.8 has a 20% chance of being rounded down to 0 and an 80% chance of being rounded up to 1).
    static double
    probToLogOdds(double prob)
    Returns the log-odds for a given probability.
    static String
    quote(String string)
    Quotes a string if it contains special characters.
    static Properties
    readProperties(String resourceName)
    Reads properties that inherit from three locations.
    static String
    removeSubstring(String inString, String substring)
    Removes all occurrences of a string from another string.
    static void
    replaceMissingWithMAX_VALUE(double[] array)
    Replaces all "missing values" in the given array of double values with MAX_VALUE.
    static String
    replaceSubstring(String inString, String subString, String replaceString)
    Replaces all occurrences of a substring within a string with a new string.
    static String
    revertNewLines(String string)
    Reverts \r and \n in a string into carriage returns and new lines.
    static int
    round(double value)
    Rounds a double to the next nearest integer value.
    static double
    roundDouble(double value, int afterDecimalPoint)
    Rounds a double to the given number of decimal places.
    static boolean
    sm(double a, double b)
    Tests if a is smaller than b.
    static boolean
    smOrEq(double a, double b)
    Tests if a is smaller than or equal to b.
    static int[]
    sort(double[] array)
    Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
    static int[]
    sort(int[] array)
    Sorts a given array of integers in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
    static int[]
    sortWithNoMissingValues(double[] array)
    Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
    static String[]
    splitOptions(String quotedOptionString)
    Split up a string containing options into an array of strings, one for each option.
    static int[]
    stableSort(double[] array)
    Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array.
    static double
    sum(double[] doubles)
    Computes the sum of the elements of an array of doubles.
    static int
    sum(int[] ints)
    Computes the sum of the elements of an array of integers.
    static String
    unbackQuoteChars(String string)
    The inverse operation of backQuoteChars().
    static String
    unquote(String string)
    Unquotes a previously quoted string (but only if necessary), i.e., it removes the single quotes around it.
    static double
    variance(double[] vector)
    Computes the variance for an array of doubles.
    static double
    xlogx(int c)
    Returns c*log2(c) for a given integer value c.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • log2

      public static double log2
      The natural logarithm of 2.
    • SMALL

      public static double SMALL
      The small deviation allowed in double comparisons.
  • Constructor Details

    • Utils

      public Utils()
  • Method Details

    • readProperties

      public static Properties readProperties(String resourceName) throws Exception
      Reads properties that inherit from three locations. Properties are first defined in the system resource location (i.e. in the CLASSPATH). These default properties must exist. Properties defined in the user's home directory (optional) override default settings. Properties defined in the current directory (optional) override all these settings.
      Parameters:
      resourceName - the location of the resource that should be loaded, e.g. "weka/core/Utils.props". (The use of hardcoded forward slashes here is OK - see jdk1.1/docs/guide/misc/resources.html) This routine will also look for the file (in this case) "Utils.props" in the user's home directory and the current directory.
      Returns:
      the Properties
      Throws:
      Exception - if no default properties are defined, or if an error occurs reading the properties files.
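      The three-level override described above can be illustrated with the defaults chaining built into java.util.Properties. This is a minimal sketch, not the Weka implementation: the class LayeredPropertiesSketch and its methods are hypothetical names, and the three layers are passed in as text instead of being read from the CLASSPATH, the home directory and the current directory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration class; not part of Weka.
final class LayeredPropertiesSketch {

    /** Parses one layer of properties text on top of the given defaults (may be null). */
    static Properties layer(Properties defaults, String text) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Builds the chain: classpath defaults (required), then the optional
     * home-directory layer, then the optional current-directory layer.
     * Later layers override earlier ones; missing keys fall through.
     */
    static Properties readLayered(String classpathText, String homeText, String currentDirText) {
        if (classpathText == null) {
            throw new IllegalArgumentException("Default properties must exist");
        }
        Properties result = layer(null, classpathText);
        if (homeText != null) {
            result = layer(result, homeText);
        }
        if (currentDirText != null) {
            result = layer(result, currentDirText);
        }
        return result;
    }

    public static void main(String[] args) {
        Properties p = readLayered("a=1\nb=2\nc=3\n", "b=20\n", "c=300\n");
        System.out.println(p.getProperty("a") + " " + p.getProperty("b") + " " + p.getProperty("c"));
    }
}
```

      Lookups go through getProperty, which consults the defaults chain whenever a key is absent from the topmost layer.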
    • correlation

      public static final double correlation(double[] y1, double[] y2, int n)
      Returns the correlation coefficient of two double vectors.
      Parameters:
      y1 - double vector 1
      y2 - double vector 2
      n - the length of two double vectors
      Returns:
      the correlation coefficient
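      For orientation, a correlation coefficient over the first n entries of two vectors can be computed as below. This is a sketch assuming the Pearson coefficient is meant; CorrelationSketch is a hypothetical class, and returning 0.0 when either vector has zero variance is a choice made for this example, not documented Weka behaviour.

```java
// Hypothetical illustration class; not part of Weka.
final class CorrelationSketch {

    /** Pearson correlation of the first n entries of y1 and y2. */
    static double correlation(double[] y1, double[] y2, int n) {
        double mean1 = 0.0, mean2 = 0.0;
        for (int i = 0; i < n; i++) {
            mean1 += y1[i];
            mean2 += y2[i];
        }
        mean1 /= n;
        mean2 /= n;

        double s11 = 0.0, s22 = 0.0, s12 = 0.0;
        for (int i = 0; i < n; i++) {
            double d1 = y1[i] - mean1;
            double d2 = y2[i] - mean2;
            s11 += d1 * d1;
            s22 += d2 * d2;
            s12 += d1 * d2;
        }
        if (s11 * s22 == 0.0) {
            return 0.0; // undefined case; 0.0 is an assumption of this sketch
        }
        return s12 / Math.sqrt(s11 * s22);
    }

    public static void main(String[] args) {
        System.out.println(correlation(new double[] {1, 2, 3}, new double[] {2, 4, 6}, 3));
    }
}
```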
    • removeSubstring

      public static String removeSubstring(String inString, String substring)
      Removes all occurrences of a string from another string.
      Parameters:
      inString - the string to remove substrings from.
      substring - the substring to remove.
      Returns:
      the input string with occurrences of substring removed.
    • replaceSubstring

      public static String replaceSubstring(String inString, String subString, String replaceString)
      Replaces all occurrences of a substring within a string with a new string.
      Parameters:
      inString - the string to replace substrings in.
      subString - the substring to replace.
      replaceString - the replacement substring
      Returns:
      the input string with occurrences of substring replaced.
    • padLeft

      public static String padLeft(String inString, int length)
      Pads a string to a specified length, inserting spaces on the left as required. If the string is too long, characters are removed (from the right).
      Parameters:
      inString - the input string
      length - the desired length of the output string
      Returns:
      the output string
    • padRight

      public static String padRight(String inString, int length)
      Pads a string to a specified length, inserting spaces on the right as required. If the string is too long, characters are removed (from the right).
      Parameters:
      inString - the input string
      length - the desired length of the output string
      Returns:
      the output string
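      The padding and truncation rules of padLeft and padRight can be sketched as follows. PadSketch is a hypothetical class written for illustration; it follows the description above (spaces added on the named side, overlong input cut from the right) and is not the Weka source.

```java
// Hypothetical illustration class; not part of Weka.
final class PadSketch {

    /** Returns a string of the given number of spaces. */
    private static String spaces(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    /** Pads on the left; input longer than length loses characters from the right. */
    static String padLeft(String inString, int length) {
        if (inString.length() >= length) {
            return inString.substring(0, length);
        }
        return spaces(length - inString.length()) + inString;
    }

    /** Pads on the right; input longer than length loses characters from the right. */
    static String padRight(String inString, int length) {
        if (inString.length() >= length) {
            return inString.substring(0, length);
        }
        return inString + spaces(length - inString.length());
    }

    public static void main(String[] args) {
        System.out.println("[" + padLeft("ab", 4) + "][" + padRight("ab", 4) + "]");
    }
}
```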
    • doubleToString

      public static String doubleToString(double value, int afterDecimalPoint)
      Rounds a double and converts it into String.
      Parameters:
      value - the double value
      afterDecimalPoint - the (maximum) number of digits permitted after the decimal point
      Returns:
      the double as a formatted string
    • doubleToString

      public static String doubleToString(double value, int width, int afterDecimalPoint)
      Rounds a double and converts it into a formatted decimal-justified String. Trailing 0's are replaced with spaces.
      Parameters:
      value - the double value
      width - the width of the string
      afterDecimalPoint - the number of digits after the decimal point
      Returns:
      the double as a formatted string
    • getArrayClass

      public static Class getArrayClass(Class c)
      Returns the basic class of an array class (handles multi-dimensional arrays).
      Parameters:
      c - the array to inspect
      Returns:
      the class of the innermost elements
    • getArrayDimensions

      public static int getArrayDimensions(Class array)
      Returns the dimensions of the given array. Even though the parameter is of type "Class" one can hand over the classes of primitive arrays, e.g. int[].class or double[][].class.
      Parameters:
      array - the array to determine the dimensions for
      Returns:
      the dimensions of the array
    • getArrayDimensions

      public static int getArrayDimensions(Object array)
      Returns the dimensions of the given array. Even though the parameter is of type "Object" one can hand over primitive arrays, e.g. int[3] or double[2][4].
      Parameters:
      array - the array to determine the dimensions for
      Returns:
      the dimensions of the array
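      Both the innermost element class and the number of dimensions can be derived with standard reflection (Class.isArray and Class.getComponentType). The sketch below shows one way to do it; ArrayInfoSketch is a hypothetical class and the code is not taken from Weka.

```java
// Hypothetical illustration class; not part of Weka.
final class ArrayInfoSketch {

    /** Strips array levels until the innermost element class is reached. */
    static Class<?> getArrayClass(Class<?> c) {
        while (c.isArray()) {
            c = c.getComponentType();
        }
        return c;
    }

    /** Counts how many array levels the class has (0 for non-array classes). */
    static int getArrayDimensions(Class<?> array) {
        int dims = 0;
        while (array.isArray()) {
            dims++;
            array = array.getComponentType();
        }
        return dims;
    }

    /** Convenience overload that inspects the runtime class of the object. */
    static int getArrayDimensions(Object array) {
        return getArrayDimensions(array.getClass());
    }

    public static void main(String[] args) {
        System.out.println(getArrayDimensions(new double[2][4]) + " " + getArrayClass(double[][].class));
    }
}
```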
    • arrayToString

      public static String arrayToString(Object array)
      Returns the given array in a string representation. Even though the parameter is of type "Object" one can hand over primitive arrays, e.g. int[3] or double[2][4].
      Parameters:
      array - the array to return in a string representation
      Returns:
      the array as string
    • eq

      public static boolean eq(double a, double b)
      Tests if a is equal to b.
      Parameters:
      a - a double
      b - a double
    • checkForRemainingOptions

      public static void checkForRemainingOptions(String[] options) throws Exception
      Checks if the given array contains any non-empty options.
      Parameters:
      options - an array of strings
      Throws:
      Exception - if there are any non-empty options
    • getFlag

      public static boolean getFlag(char flag, String[] options) throws Exception
      Checks if the given array contains the flag "-Char". Stops searching at the first marker "--". If the flag is found, it is replaced with the empty string.
      Parameters:
      flag - the character indicating the flag.
      options - the array of strings containing all the options.
      Returns:
      true if the flag was found
      Throws:
      Exception - if an illegal option was found
    • getFlag

      public static boolean getFlag(String flag, String[] options) throws Exception
      Checks if the given array contains the flag "-String". Stops searching at the first marker "--". If the flag is found, it is replaced with the empty string.
      Parameters:
      flag - the String indicating the flag.
      options - the array of strings containing all the options.
      Returns:
      true if the flag was found
      Throws:
      Exception - if an illegal option was found
    • getOption

      public static String getOption(char flag, String[] options) throws Exception
      Gets an option indicated by a flag "-Char" from the given array of strings. Stops searching at the first marker "--". Replaces flag and option with empty strings.
      Parameters:
      flag - the character indicating the option.
      options - the array of strings containing all the options.
      Returns:
      the indicated option or an empty string
      Throws:
      Exception - if the option indicated by the flag can't be found
    • getOption

      public static String getOption(String flag, String[] options) throws Exception
      Gets an option indicated by a flag "-String" from the given array of strings. Stops searching at the first marker "--". Replaces flag and option with empty strings.
      Parameters:
      flag - the String indicating the option.
      options - the array of strings containing all the options.
      Returns:
      the indicated option or an empty string
      Throws:
      Exception - if the option indicated by the flag can't be found
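      The shared conventions of getFlag, getOption and getOptionPos (match "-flag", stop at the first "--", blank out consumed entries) can be sketched as below. OptionSketch is a hypothetical class; it ignores details the text does not specify, such as null entries, and throwing IllegalArgumentException for a flag without a value is an assumption of this example.

```java
// Hypothetical illustration class; not part of Weka.
final class OptionSketch {

    /** Index of "-flag" before the first "--" marker, or -1 if absent. */
    static int getOptionPos(String flag, String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                return -1; // everything after the marker belongs to someone else
            }
            if (("-" + flag).equals(options[i])) {
                return i;
            }
        }
        return -1;
    }

    /** True if the flag is present; the entry is replaced with "". */
    static boolean getFlag(String flag, String[] options) {
        int pos = getOptionPos(flag, options);
        if (pos < 0) {
            return false;
        }
        options[pos] = "";
        return true;
    }

    /** Value following the flag, or "" if the flag is absent; both entries are blanked. */
    static String getOption(String flag, String[] options) {
        int pos = getOptionPos(flag, options);
        if (pos < 0) {
            return "";
        }
        if (pos + 1 >= options.length) {
            throw new IllegalArgumentException("No value given for -" + flag);
        }
        String value = options[pos + 1];
        options[pos] = "";
        options[pos + 1] = "";
        return value;
    }

    public static void main(String[] args) {
        String[] opts = {"-W", "some.Classifier", "-D", "--", "-K"};
        System.out.println(getOption("W", opts) + " " + getFlag("D", opts) + " " + getFlag("K", opts));
    }
}
```

      Because consumed entries are blanked in place, a later call to checkForRemainingOptions-style validation only sees what no one claimed.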
    • getOptionPos

      public static int getOptionPos(char flag, String[] options)
      Gets the index of an option or flag indicated by a flag "-Char" from the given array of strings. Stops searching at the first marker "--".
      Parameters:
      flag - the character indicating the option.
      options - the array of strings containing all the options.
      Returns:
      the position if found, or -1 otherwise
    • getOptionPos

      public static int getOptionPos(String flag, String[] options)
      Gets the index of an option or flag indicated by a flag "-String" from the given array of strings. Stops searching at the first marker "--".
      Parameters:
      flag - the String indicating the option.
      options - the array of strings containing all the options.
      Returns:
      the position if found, or -1 otherwise
    • quote

      public static String quote(String string)
      Quotes a string if it contains special characters. The following rules are applied: A character is backquoted if it is one of " ' % \ \n \r \t. A string is enclosed within single quotes if a character has been backquoted using the previous rule, or if it contains { } or is exactly equal to one of the strings , ? space or "" (empty string). A quoted question mark distinguishes it from the missing value, which is represented as an unquoted question mark in ARFF files.
      Parameters:
      string - the string to be quoted
      Returns:
      the string (possibly quoted)
    • unquote

      public static String unquote(String string)
      Unquotes a previously quoted string (but only if necessary), i.e., it removes the single quotes around it. Inverse to quote(String).
      Parameters:
      string - the string to process
      Returns:
      the unquoted string
    • backQuoteChars

      public static String backQuoteChars(String string)
      Converts carriage returns and new lines in a string into \r and \n. Backquotes the following characters: ` " \ \t and %
      Parameters:
      string - the string
      Returns:
      the converted string
    • convertNewLines

      public static String convertNewLines(String string)
      Converts carriage returns and new lines in a string into \r and \n.
      Parameters:
      string - the string
      Returns:
      the converted string
    • revertNewLines

      public static String revertNewLines(String string)
      Reverts \r and \n in a string into carriage returns and new lines.
      Parameters:
      string - the string
      Returns:
      the converted string
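      The pair convertNewLines and revertNewLines can be pictured as two plain substitutions between control characters and their two-character escape sequences. NewLineSketch below is a hypothetical class illustrating that reading; it is not the Weka code.

```java
// Hypothetical illustration class; not part of Weka.
final class NewLineSketch {

    /** Replaces CR and LF characters with the two-character sequences \r and \n. */
    static String convertNewLines(String string) {
        return string.replace("\r", "\\r").replace("\n", "\\n");
    }

    /** Replaces the two-character sequences \r and \n with CR and LF characters. */
    static String revertNewLines(String string) {
        return string.replace("\\r", "\r").replace("\\n", "\n");
    }

    public static void main(String[] args) {
        System.out.println(convertNewLines("first\nsecond\r"));
    }
}
```

      With substitutions this simple, the round trip is only exact for input that does not already contain a literal backslash followed by r or n.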
    • partitionOptions

      public static String[] partitionOptions(String[] options)
      Returns the secondary set of options (if any) contained in the supplied options array. The secondary set is defined to be any options after the first "--". These options are removed from the original options array.
      Parameters:
      options - the input array of options
      Returns:
      the array of secondary options
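      The split at the first "--" can be sketched as follows. PartitionSketch is a hypothetical class; blanking the "--" marker itself along with the secondary options, and using empty strings for the removed entries, are assumptions of this example.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Weka.
final class PartitionSketch {

    /**
     * Returns everything after the first "--" and blanks the marker and
     * those entries in the original array. Returns an empty array if
     * there is no marker.
     */
    static String[] partitionOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                String[] secondary = Arrays.copyOfRange(options, i + 1, options.length);
                Arrays.fill(options, i, options.length, "");
                return secondary;
            }
        }
        return new String[0];
    }

    public static void main(String[] args) {
        String[] opts = {"-A", "--", "-B", "1"};
        System.out.println(Arrays.toString(partitionOptions(opts)) + " " + Arrays.toString(opts));
    }
}
```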
    • unbackQuoteChars

      public static String unbackQuoteChars(String string)
      The inverse operation of backQuoteChars(). Converts back-quoted carriage returns and new lines in a string to the corresponding character ('\r' and '\n'). Also "un"-back-quotes the following characters: ` " \ \t and %
      Parameters:
      string - the string
      Returns:
      the converted string
    • splitOptions

      public static String[] splitOptions(String quotedOptionString) throws Exception
      Split up a string containing options into an array of strings, one for each option.
      Parameters:
      quotedOptionString - the string containing the options
      Returns:
      the array of options
      Throws:
      Exception - in case of an unterminated string, unknown character or a parse error
    • joinOptions

      public static String joinOptions(String[] optionArray)
      Joins all the options in an option array into a single string, as might be used on the command line.
      Parameters:
      optionArray - the array of options
      Returns:
      the string containing all options.
    • forName

      public static Object forName(Class classType, String className, String[] options) throws Exception
      Creates a new instance of an object given its class name and (optional) arguments to pass to its setOptions method. If the object implements OptionHandler and the options parameter is non-null, the object will have its options set. Example use:

       String classifierName = Utils.getOption('W', options);
       Classifier c = (Classifier)Utils.forName(Classifier.class,
                                                classifierName,
                                                options);
       setClassifier(c);
       
      Parameters:
      classType - the class that the instantiated object should be assignable to -- an exception is thrown if this is not the case
      className - the fully qualified class name of the object
      options - an array of options suitable for passing to setOptions. May be null. Any options accepted by the object will be removed from the array.
      Returns:
      the newly created object, ready for use.
      Throws:
      Exception - if the class name is invalid, or if the class is not assignable to the desired class type, or the options supplied are not acceptable to the object
    • info

      public static double info(int[] counts)
      Computes entropy for an array of integers.
      Parameters:
      counts - array of counts
      Returns:
      - a log2 a - b log2 b - c log2 c + (a+b+c) log2 (a+b+c) when given array [a b c]
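      The return formula equals N times the Shannon entropy in bits, where N = a+b+c, since -sum(c_i log2(c_i/N)) = -sum(c_i log2 c_i) + N log2 N. A direct transcription, using a zero-safe c*log2(c) helper, might look like the sketch below; EntropySketch is a hypothetical class and not the Weka source.

```java
// Hypothetical illustration class; not part of Weka.
final class EntropySketch {

    /** Base-2 logarithm via the natural logarithm. */
    static double log2(double a) {
        return Math.log(a) / Math.log(2.0);
    }

    /** c * log2(c), defined as 0 for c == 0 to avoid 0 * -Infinity. */
    static double xlogx(int c) {
        return c == 0 ? 0.0 : c * log2(c);
    }

    /** -sum(c_i log2 c_i) + N log2 N, where N is the sum of the counts. */
    static double info(int[] counts) {
        int total = 0;
        double x = 0.0;
        for (int c : counts) {
            x -= xlogx(c);
            total += c;
        }
        return x + xlogx(total);
    }

    public static void main(String[] args) {
        System.out.println(info(new int[] {2, 2}));
    }
}
```

      Dividing the result by N gives the per-instance entropy; for two equal counts that is 1 bit.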
    • smOrEq

      public static boolean smOrEq(double a, double b)
      Tests if a is smaller than or equal to b.
      Parameters:
      a - a double
      b - a double
    • grOrEq

      public static boolean grOrEq(double a, double b)
      Tests if a is greater than or equal to b.
      Parameters:
      a - a double
      b - a double
    • sm

      public static boolean sm(double a, double b)
      Tests if a is smaller than b.
      Parameters:
      a - a double
      b - a double
    • gr

      public static boolean gr(double a, double b)
      Tests if a is greater than b.
      Parameters:
      a - a double
      b - a double
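      The comparison methods eq, smOrEq, grOrEq, sm and gr are tied to the SMALL field ("the small deviation allowed in double comparisons"). One plausible reading is sketched below. ToleranceSketch is a hypothetical class, and both the value 1e-6 for SMALL and the exact inequalities are assumptions, not taken from this page.

```java
// Hypothetical illustration class; not part of Weka.
final class ToleranceSketch {

    /** Assumed tolerance; the page does not state the value of Utils.SMALL. */
    static final double SMALL = 1e-6;

    /** Equal if identical or closer than SMALL. */
    static boolean eq(double a, double b) {
        return a == b || Math.abs(a - b) < SMALL;
    }

    /** Strictly smaller only if the gap exceeds SMALL. */
    static boolean sm(double a, double b) {
        return b - a > SMALL;
    }

    /** Strictly greater only if the gap exceeds SMALL. */
    static boolean gr(double a, double b) {
        return a - b > SMALL;
    }

    /** Smaller, or equal within the tolerance. */
    static boolean smOrEq(double a, double b) {
        return a <= b || a - b < SMALL;
    }

    /** Greater, or equal within the tolerance. */
    static boolean grOrEq(double a, double b) {
        return a >= b || b - a < SMALL;
    }

    public static void main(String[] args) {
        System.out.println(eq(0.1 + 0.2, 0.3) + " " + (0.1 + 0.2 == 0.3));
    }
}
```

      Under this reading, values within SMALL of each other are never reported as strictly ordered, so sm and gr stay consistent with eq.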
    • kthSmallestValue

      public static int kthSmallestValue(int[] array, int k)
      Returns the kth-smallest value in the array.
      Parameters:
      array - the array of integers
      k - the value of k
      Returns:
      the kth-smallest value
    • kthSmallestValue

      public static double kthSmallestValue(double[] array, int k)
      Returns the kth-smallest value in the array.
      Parameters:
      array - the array of double
      k - the value of k
      Returns:
      the kth-smallest value
    • log2

      public static double log2(double a)
      Returns the logarithm of a for base 2.
      Parameters:
      a - a double
      Returns:
      the logarithm for base 2
    • maxIndex

      public static int maxIndex(double[] doubles)
      Returns index of maximum element in a given array of doubles. First maximum is returned.
      Parameters:
      doubles - the array of doubles
      Returns:
      the index of the maximum element
    • maxIndex

      public static int maxIndex(int[] ints)
      Returns index of maximum element in a given array of integers. First maximum is returned.
      Parameters:
      ints - the array of integers
      Returns:
      the index of the maximum element
    • mean

      public static double mean(double[] vector)
      Computes the mean for an array of doubles.
      Parameters:
      vector - the array
      Returns:
      the mean
    • minIndex

      public static int minIndex(int[] ints)
      Returns index of minimum element in a given array of integers. First minimum is returned.
      Parameters:
      ints - the array of integers
      Returns:
      the index of the minimum element
    • minIndex

      public static int minIndex(double[] doubles)
      Returns index of minimum element in a given array of doubles. First minimum is returned.
      Parameters:
      doubles - the array of doubles
      Returns:
      the index of the minimum element
    • normalize

      public static void normalize(double[] doubles)
      Normalizes the doubles in the array by their sum.
      Parameters:
      doubles - the array of double
      Throws:
      IllegalArgumentException - if sum is Zero or NaN
    • normalize

      public static void normalize(double[] doubles, double sum)
      Normalizes the doubles in the array using the given value.
      Parameters:
      doubles - the array of double
      sum - the value by which the doubles are to be normalized
      Throws:
      IllegalArgumentException - if sum is zero or NaN
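      Both normalize overloads amount to an in-place division guarded against a zero or NaN divisor. The sketch below illustrates that contract; NormalizeSketch is a hypothetical class and the exception messages are invented for this example.

```java
// Hypothetical illustration class; not part of Weka.
final class NormalizeSketch {

    /** Divides every element by sum, in place. */
    static void normalize(double[] doubles, double sum) {
        if (Double.isNaN(sum)) {
            throw new IllegalArgumentException("Cannot normalize: sum is NaN");
        }
        if (sum == 0.0) {
            throw new IllegalArgumentException("Cannot normalize: sum is zero");
        }
        for (int i = 0; i < doubles.length; i++) {
            doubles[i] /= sum;
        }
    }

    /** Divides every element by the sum of all elements, in place. */
    static void normalize(double[] doubles) {
        double sum = 0.0;
        for (double d : doubles) {
            sum += d;
        }
        normalize(doubles, sum);
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 3.0};
        normalize(weights);
        System.out.println(weights[0] + " " + weights[1]);
    }
}
```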
    • logs2probs

      public static double[] logs2probs(double[] a)
      Converts an array containing the natural logarithms of probabilities stored in a vector back into probabilities. The probabilities are assumed to sum to one.
      Parameters:
      a - an array holding the natural logarithms of the probabilities
      Returns:
      the converted array
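      Because the result is assumed to sum to one, the logarithms only need to be correct up to a common additive constant. A numerically careful way to exploit this is to subtract the maximum before exponentiating and then renormalize, as sketched below. LogProbSketch is a hypothetical class; whether Weka uses exactly this scheme is not stated on this page.

```java
// Hypothetical illustration class; not part of Weka.
final class LogProbSketch {

    /** Converts natural-log probabilities (known up to a constant) to probabilities. */
    static double[] logs2probs(double[] a) {
        // Shifting by the maximum keeps exp() from underflowing to zero everywhere.
        double max = a[0];
        for (double v : a) {
            max = Math.max(max, v);
        }
        double[] result = new double[a.length];
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            result[i] = Math.exp(a[i] - max);
            sum += result[i];
        }
        // Renormalize so the probabilities sum to one.
        for (int i = 0; i < result.length; i++) {
            result[i] /= sum;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] p = logs2probs(new double[] {Math.log(0.25) - 1000, Math.log(0.75) - 1000});
        System.out.println(p[0] + " " + p[1]);
    }
}
```

      Without the shift, Math.exp of values near -1000 would underflow to 0 and the division would produce NaN.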
    • probToLogOdds

      public static double probToLogOdds(double prob)
      Returns the log-odds for a given probability.
      Parameters:
      prob - the probability
      Returns:
      the log-odds after the probability has been mapped to [Utils.SMALL, 1-Utils.SMALL]
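      Mapping the probability into [SMALL, 1-SMALL] keeps the log-odds finite at 0 and 1. The sketch below uses a linear rescaling for that mapping. LogOddsSketch is a hypothetical class, and the value of SMALL (1e-6), the linear form of the mapping (as opposed to clamping) and the range check are all assumptions of this example.

```java
// Hypothetical illustration class; not part of Weka.
final class LogOddsSketch {

    /** Assumed tolerance; the page does not state the value of Utils.SMALL. */
    static final double SMALL = 1e-6;

    /** log(p / (1 - p)) after rescaling prob from [0, 1] to [SMALL, 1 - SMALL]. */
    static double probToLogOdds(double prob) {
        if (prob < 0.0 || prob > 1.0 || Double.isNaN(prob)) {
            throw new IllegalArgumentException("Probability must be in [0,1]: " + prob);
        }
        double p = SMALL + (1.0 - 2.0 * SMALL) * prob;
        return Math.log(p / (1.0 - p));
    }

    public static void main(String[] args) {
        System.out.println(probToLogOdds(0.0) + " " + probToLogOdds(0.5) + " " + probToLogOdds(1.0));
    }
}
```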
    • round

      public static int round(double value)
      Rounds a double to the next nearest integer value. The JDK version of it doesn't work properly.
      Parameters:
      value - the double value
      Returns:
      the resulting integer value
    • probRound

      public static int probRound(double value, Random rand)
      Rounds a double to the next nearest integer value in a probabilistic fashion (e.g. 0.8 has a 20% chance of being rounded down to 0 and an 80% chance of being rounded up to 1). In the limit, the average of the rounded numbers generated by this procedure should converge to the original double.
      Parameters:
      value - the double value
      rand - the random number generator
      Returns:
      the resulting integer value
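      Probabilistic rounding can be sketched by rounding up with probability equal to the fractional part. ProbRoundSketch is a hypothetical class; handling negative values through Math.floor is a choice made for this example and may differ from Weka.

```java
import java.util.Random;

// Hypothetical illustration class; not part of Weka.
final class ProbRoundSketch {

    /** Rounds up with probability equal to the fractional part, down otherwise. */
    static int probRound(double value, Random rand) {
        double lower = Math.floor(value);
        double fraction = value - lower; // in [0, 1)
        return (int) lower + (rand.nextDouble() < fraction ? 1 : 0);
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        long total = 0;
        int trials = 100000;
        for (int i = 0; i < trials; i++) {
            total += probRound(0.8, rand);
        }
        // The average should be close to 0.8.
        System.out.println((double) total / trials);
    }
}
```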
    • replaceMissingWithMAX_VALUE

      public static void replaceMissingWithMAX_VALUE(double[] array)
      Replaces all "missing values" in the given array of double values with MAX_VALUE.
      Parameters:
      array - the array to be modified.
    • roundDouble

      public static double roundDouble(double value, int afterDecimalPoint)
      Rounds a double to the given number of decimal places.
      Parameters:
      value - the double value
      afterDecimalPoint - the number of digits after the decimal point
      Returns:
      the double rounded to the given precision
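      Rounding to a number of decimal places is commonly done by scaling, rounding to the nearest integer and scaling back. The sketch below shows that approach; RoundDoubleSketch is a hypothetical class and the technique is an assumption, not confirmed by this page.

```java
// Hypothetical illustration class; not part of Weka.
final class RoundDoubleSketch {

    /** Scales by 10^afterDecimalPoint, rounds to the nearest long, scales back. */
    static double roundDouble(double value, int afterDecimalPoint) {
        double mask = Math.pow(10.0, afterDecimalPoint);
        return Math.round(value * mask) / mask;
    }

    public static void main(String[] args) {
        System.out.println(roundDouble(3.14159, 2));
    }
}
```

      The result is still a binary double, so it is the closest representable value to the decimal rounding, not an exact decimal.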
    • sort

      public static int[] sort(int[] array)
      Sorts a given array of integers in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. The sort is stable. (Equal elements remain in their original order.)
      Parameters:
      array - this array is not changed by the method!
      Returns:
      an array of integers with the positions in the sorted array.
    • sort

      public static int[] sort(double[] array)
      Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. NOTE THESE CHANGES: the sort is no longer stable and it doesn't use safe floating-point comparisons anymore. Occurrences of Double.NaN are treated as Double.MAX_VALUE.
      Parameters:
      array - this array is not changed by the method!
      Returns:
      an array of integers with the positions in the sorted array.
    • sortWithNoMissingValues

      public static int[] sortWithNoMissingValues(double[] array)
      Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. Missing values in the given array are replaced by Double.MAX_VALUE, so the array is modified in that case!
      Parameters:
      array - the array to be sorted, which is modified if it has missing values
      Returns:
      an array of integers with the positions in the sorted array.
    • stableSort

      public static int[] stableSort(double[] array)
      Sorts a given array of doubles in ascending order and returns an array of integers with the positions of the elements of the original array in the sorted array. The sort is stable. (Equal elements remain in their original order.) Occurrences of Double.NaN are treated as Double.MAX_VALUE.
      Parameters:
      array - this array is not changed by the method!
      Returns:
      an array of integers with the positions in the sorted array.
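      The sort methods return an index array and leave the input untouched. Assuming that entry i of the result is the original index of the i-th smallest element, a stable version can be sketched with Arrays.sort on boxed indices, which is guaranteed stable for object arrays. IndexSortSketch is a hypothetical class and not the Weka implementation.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration class; not part of Weka.
final class IndexSortSketch {

    /** Sort key: NaN is treated as Double.MAX_VALUE, as the description states. */
    private static double key(double v) {
        return Double.isNaN(v) ? Double.MAX_VALUE : v;
    }

    /** Returns original indices ordered by ascending value; ties keep input order. */
    static int[] stableSort(double[] array) {
        Integer[] boxed = new Integer[array.length];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = i;
        }
        // Arrays.sort on an object array is a stable merge sort.
        Arrays.sort(boxed, Comparator.comparingDouble((Integer i) -> key(array[i])));
        int[] index = new int[boxed.length];
        for (int i = 0; i < index.length; i++) {
            index[i] = boxed[i];
        }
        return index;
    }

    public static void main(String[] args) {
        double[] values = {3.0, 1.0, Double.NaN, 1.0};
        System.out.println(Arrays.toString(stableSort(values)));
    }
}
```

      Reading the data in sorted order is then a matter of iterating over the index array: values[index[0]], values[index[1]], and so on.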
    • variance

      public static double variance(double[] vector)
      Computes the variance for an array of doubles.
      Parameters:
      vector - the array
      Returns:
      the variance
    • sum

      public static double sum(double[] doubles)
      Computes the sum of the elements of an array of doubles.
      Parameters:
      doubles - the array of double
      Returns:
      the sum of the elements
    • sum

      public static int sum(int[] ints)
      Computes the sum of the elements of an array of integers.
      Parameters:
      ints - the array of integers
      Returns:
      the sum of the elements
    • xlogx

      public static double xlogx(int c)
      Returns c*log2(c) for a given integer value c.
      Parameters:
      c - an integer value
      Returns:
      c*log2(c) (but is careful to return 0 if c is 0)
    • convertToRelativePath

      public static File convertToRelativePath(File absolute) throws Exception
      Converts a File's absolute path to a path relative to the user (i.e., start) directory. Includes an additional workaround for Cygwin, which doesn't like upper case drive letters.
      Parameters:
      absolute - the File to convert to relative path
      Returns:
      a File with a path that is relative to the user's directory
      Throws:
      Exception - if the path cannot be constructed
    • getGlobalInfo

      public static String getGlobalInfo(Object object, boolean addCapabilities)
      Utility method for grabbing the global info help (if it exists) from an arbitrary object. Can also append capabilities information if the object is a CapabilitiesHandler.
      Parameters:
      object - the object to grab global info from
      addCapabilities - true if capabilities information is to be added to the result
      Returns:
      the global help info or null if global info does not exist
    • lineWrap

      public static String lineWrap(String input, int maxLineWidth)
      Implements simple line breaking. Reformats the given string by introducing line breaks so that, ideally, no line exceeds the given number of characters. Line breaks are assumed to be indicated by newline characters. Existing line breaks are left in the input text.
      Parameters:
      input - the string to line wrap
      maxLineWidth - the maximum permitted number of characters in a line
      Returns:
      the processed string
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] ops)
      Main method for testing this class.
      Parameters:
      ops - some dummy options