Class ScriptDecoder

java.lang.Object
org.htmlparser.scanners.ScriptDecoder

public class ScriptDecoder extends Object
Decode script. Script obfuscated by the Microsoft Windows Script Encoder is converted to plaintext. This code is loosely based on example code provided by MrBrownstone, with changes by Joe Steele; see scrdec14.c.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static int
    LAST_STATE
    The state to enter when decrypting is complete.
    protected static int[]
    mDigits
    The base 64 decoding table.
    protected static byte[]
    mEncodingIndex
    Table of lookup choice.
    protected static char[]
    mEscaped
    The escaped characters corresponding to each escape sequence.
    protected static char[]
    mEscapes
    Escape sequence characters.
    protected static char[]
    mLeader
    The leader.
    protected static char[][]
    mLookupTable
    Two dimensional lookup table.
    protected static char[]
    mPrefix
    The prefix.
    protected static char[]
    mTrailer
    The trailer.
    protected static final int
    STATE_CHECKSUM
    State when reading the checksum.
    protected static final int
    STATE_DECODE
    State while decoding.
    static final int
    STATE_DONE
    Termination state.
    protected static final int
    STATE_ESCAPE
    State when reading an escape sequence.
    protected static final int
    STATE_FINAL
    State while exiting.
    static final int
    STATE_INITIAL
    State on entry.
    protected static final int
    STATE_LENGTH
    State while reading the encoded length.
    protected static final int
    STATE_PREFIX
    State when reading up to decoded text.
  • Constructor Summary

    Constructors
    Constructor
    Description
    ScriptDecoder()
  • Method Summary

    Modifier and Type
    Method
    Description
    static String
    Decode(Page page, Cursor cursor)
    Decode script encoded by the Microsoft obfuscator.
    protected static long
    decodeBase64(char[] p)
    Extract the base 64 encoded number.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • STATE_DONE

      public static final int STATE_DONE
      Termination state.
    • STATE_INITIAL

      public static final int STATE_INITIAL
      State on entry.
    • STATE_LENGTH

      protected static final int STATE_LENGTH
      State while reading the encoded length.
    • STATE_PREFIX

      protected static final int STATE_PREFIX
      State when reading up to decoded text.
    • STATE_DECODE

      protected static final int STATE_DECODE
      State while decoding.
    • STATE_ESCAPE

      protected static final int STATE_ESCAPE
      State when reading an escape sequence.
    • STATE_CHECKSUM

      protected static final int STATE_CHECKSUM
      State when reading the checksum.
    • STATE_FINAL

      protected static final int STATE_FINAL
      State while exiting.
    • LAST_STATE

      public static int LAST_STATE
      The state to enter when decrypting is complete. If this is STATE_DONE, the decryption will return with any characters following the encoded text still unconsumed. Otherwise, if this is STATE_INITIAL, the input will be exhausted and all following characters will be contained in the return value of the Decode() method.
    • mEncodingIndex

      protected static byte[] mEncodingIndex
      Table of lookup choice. The decoding cycles between three flavours determined by this sequence of 64 choices, corresponding to the first dimension of the lookup table.
    • mLookupTable

      protected static char[][] mLookupTable
      Two dimensional lookup table. The decoding uses this table to determine the plaintext for characters that aren't escaped (see mEscaped).
    • mDigits

      protected static int[] mDigits
      The base 64 decoding table. This array determines the value of decoded base 64 elements.
    • mLeader

      protected static char[] mLeader
      The leader. The prefix to the encoded script is #@~^nnnnnn== where the n are the length digits in base64.
    • mPrefix

      protected static char[] mPrefix
      The prefix. The prefix separates the encoded text from the length.
    • mTrailer

      protected static char[] mTrailer
      The trailer. The suffix to the encoded script is nnnnnn==^#~@ where the n are the checksum digits in base64. These characters are the part after the checksum.
    • mEscapes

      protected static char[] mEscapes
      Escape sequence characters.
    • mEscaped

      protected static char[] mEscaped
      The escaped characters corresponding to each escape sequence.
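
The framing described under mLeader, mPrefix and mTrailer (#@~^nnnnnn== before the encoded text, nnnnnn==^#~@ after it) can be illustrated with a minimal sketch. EncodedScriptFrame is a hypothetical helper and is not part of org.htmlparser; the literal marker strings are assumptions read off the descriptions above, not the actual contents of the arrays. It only splits the input into length digits, encoded body and checksum digits; it does not decode anything.

```java
import java.util.Optional;

/** Hypothetical helper (not part of org.htmlparser): splits an encoded block into its parts. */
final class EncodedScriptFrame {
    // Assumed marker values, inferred from the field descriptions.
    static final String LEADER = "#@~^";     // precedes the six length digits
    static final String PREFIX = "==";       // separates the length from the encoded text
    static final String TRAILER = "==^#~@";  // follows the six checksum digits
    static final int DIGITS = 6;

    final String lengthDigits;
    final String body;
    final String checksumDigits;

    private EncodedScriptFrame(String lengthDigits, String body, String checksumDigits) {
        this.lengthDigits = lengthDigits;
        this.body = body;
        this.checksumDigits = checksumDigits;
    }

    /** Returns the three parts of the first encoded block, or empty if the framing is absent. */
    static Optional<EncodedScriptFrame> parse(String text) {
        int start = text.indexOf(LEADER);
        if (start < 0) {
            return Optional.empty();
        }
        int lengthStart = start + LEADER.length();
        // startsWith returns false (rather than throwing) if the offset is past the end.
        if (!text.startsWith(PREFIX, lengthStart + DIGITS)) {
            return Optional.empty();
        }
        int bodyStart = lengthStart + DIGITS + PREFIX.length();
        // Leave room for the six checksum digits in front of the trailer.
        int trailerStart = text.indexOf(TRAILER, bodyStart + DIGITS);
        if (trailerStart < 0) {
            return Optional.empty();
        }
        int checksumStart = trailerStart - DIGITS;
        return Optional.of(new EncodedScriptFrame(
                text.substring(lengthStart, lengthStart + DIGITS),
                text.substring(bodyStart, checksumStart),
                text.substring(checksumStart, trailerStart)));
    }
}
```

In this sketch, text before the leader and after the trailer is ignored, which loosely mirrors the idea that characters following the encoded text are handled separately (see LAST_STATE).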
  • Constructor Details

    • ScriptDecoder

      public ScriptDecoder()
  • Method Details

    • decodeBase64

      protected static long decodeBase64(char[] p)
      Extract the base 64 encoded number. This handles only a very limited subset of base 64 encoding: six characters are expected, and they are translated into a single long value. For a more complete base 64 codec see, for example, the base64 package of iHarder.net.
      Parameters:
      p - Six base 64 encoded digits.
      Returns:
      The value of the decoded number.
    • Decode

      public static String Decode(Page page, Cursor cursor) throws ParserException
      Decode script encoded by the Microsoft obfuscator.
      Parameters:
      page - The source for encoded text.
      cursor - The position at which to start decoding. This is advanced to the end of the encoded text.
      Returns:
      The plaintext.
      Throws:
      ParserException - If an error is discovered while decoding.
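
The six-digit number that decodeBase64 extracts can be approximated with the standard library, as a rough sketch only. SixDigitBase64Sketch is a hypothetical class, not the library's implementation. It assumes the standard base 64 alphabet, that the six digits plus the "==" padding encode four bytes, and that those bytes form a little-endian unsigned 32-bit value (the convention used by scrdec-style decoders). None of these assumptions is confirmed by this page.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

/** Hypothetical sketch (not the htmlparser implementation) of decoding six base 64 digits. */
final class SixDigitBase64Sketch {
    /**
     * Decodes six base 64 digits into a number.
     * Assumption: the digits plus "==" padding encode four bytes,
     * read here as a little-endian unsigned 32-bit integer.
     */
    static long decodeSixDigits(char[] digits) {
        if (digits.length != 6) {
            throw new IllegalArgumentException("expected six base 64 digits");
        }
        // Six digits carry 36 bits; with padding they decode to exactly four bytes.
        byte[] raw = Base64.getDecoder().decode(new String(digits) + "==");
        // Mask to keep the value unsigned when widening to long.
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }
}
```

Under these assumptions, the digits EgAAAA decode to the bytes 0x12 0x00 0x00 0x00, that is, the value 18.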