org.WeaselReader.PalmIO
Class ZtxtDB

java.lang.Object
  extended by java.io.RandomAccessFile
      extended by org.WeaselReader.PalmIO.PalmDB
          extended by org.WeaselReader.PalmIO.ZtxtDB
All Implemented Interfaces:
java.io.Closeable, java.io.DataInput, java.io.DataOutput

public class ZtxtDB
extends PalmDB

This class supports reading a Weasel Reader zTXT database file. The underlying Palm OS PDB structure is handled by PalmDB, while this class handles the data specific to the zTXT format. The ZtxtDB class represents zTXT format version 1.44, which is the final zTXT format used by Weasel Reader for Palm OS. Any future zTXT format will not be contained within a Palm OS database.

Limitations:


Format:

A zTXT database is an e-book format that contains a 32-byte header in record 0, a series of zLib-compressed data records which hold the document text, an optional record with a bookmark list, and an optional record with an annotation index, which is followed by one record for each annotation.

A more detailed description and explanation of the zTXT format can be found on Weasel Reader's zTXT format reference page.

The zTXT header (version 1.44) is defined by the following C structure:
typedef struct zTXT_record0Type {
    UInt16 version;
    UInt16 numRecords;
    UInt32 size;
    UInt16 recordSize;
    UInt16 numBookmarks;
    UInt16 bookmarkRecord;
    UInt16 numAnnotations;
    UInt16 annotationRecord;
    UInt8 flags;
    UInt8 reserved;
    UInt32 crc32;
    UInt8 padding[0x20 - 24];
} zTXT_record0;
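
As an illustration of the layout above, the following is a minimal sketch of decoding the 32-byte header from a byte array. The class name ZtxtHeaderSketch and its field names are hypothetical and are not part of this package. The sketch assumes the multi-byte fields are stored big-endian, as is usual for Palm OS databases, and it widens every unsigned C field to a larger Java type.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class ZtxtHeaderSketch {
    final int version;          // UInt16
    final int numRecords;       // UInt16
    final long size;            // UInt32
    final int recordSize;       // UInt16
    final int numBookmarks;     // UInt16
    final int bookmarkRecord;   // UInt16
    final int numAnnotations;   // UInt16
    final int annotationRecord; // UInt16
    final int flags;            // UInt8
    final long crc32;           // UInt32

    private ZtxtHeaderSketch(ByteBuffer buf) {
        // Fields are read in the order declared in the C structure.
        version = buf.getShort() & 0xFFFF;
        numRecords = buf.getShort() & 0xFFFF;
        size = buf.getInt() & 0xFFFFFFFFL;
        recordSize = buf.getShort() & 0xFFFF;
        numBookmarks = buf.getShort() & 0xFFFF;
        bookmarkRecord = buf.getShort() & 0xFFFF;
        numAnnotations = buf.getShort() & 0xFFFF;
        annotationRecord = buf.getShort() & 0xFFFF;
        flags = buf.get() & 0xFF;
        buf.get(); // skip the reserved byte
        crc32 = buf.getInt() & 0xFFFFFFFFL;
        // The remaining 8 bytes are padding and are ignored.
    }

    /** Decode record 0; assumes big-endian byte order. */
    static ZtxtHeaderSketch parse(byte[] record0) {
        if (record0.length < 0x20) {
            throw new IllegalArgumentException("zTXT header must be 32 bytes");
        }
        return new ZtxtHeaderSketch(ByteBuffer.wrap(record0).order(ByteOrder.BIG_ENDIAN));
    }
}
```

The masks (& 0xFFFF, & 0xFFFFFFFFL) are needed because Java has no unsigned short or int types; without them a value such as a CRC32 with the high bit set would appear negative.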

Definition of bits in the flags byte of the zTXT header:
typedef enum {
    ZTXT_RANDOMACCESS = 0x01,
    ZTXT_NONUNIFORM = 0x02
} zTXT_flag;

Version:
$Id$
Author:
John Gruenenfelder

Nested Class Summary
 class ZtxtDB.Annotations
          A collection of annotations.
 
Field Summary
private  int annotationRecordIndex
          The index of the record containing the annotation index.
private  ZtxtDB.Annotations annotations
          The annotations for this zTXT document.
private  int bookmarkRecordIndex
          The index of the record containing the bookmark list/index.
private  Bookmarks bookmarks
          The bookmarks for this zTXT document.
private  long crc32
          A CRC32 value computed over all of the uncompressed text data.
private  long dataSize
          The total size of the text data when uncompressed.
private  java.util.zip.Inflater decompressor
          The decompression information for this zTXT database.
static int MAX_ANNOTATION_LENGTH
          The maximum length of a zTXT annotation.
static int MAX_TITLE_LENGTH
          The maximum length of a zTXT bookmark or annotation title.
private  int numAnnotations
          The number of annotations present in this zTXT document.
private  int numBookmarks
          The number of bookmarks present in this zTXT document.
private  int numDataRecords
          The number of text data records in this zTXT file.
private  int recordSize
          The size of an uncompressed data record.
static java.lang.String WEASEL_CREATOR_ID
          Weasel Reader database creator ID.
static java.lang.String WEASEL_TYPE_ID
          Weasel Reader database type ID.
static short ZTXT_NONUNIFORM
          If this flag is set, data records in this zTXT are not guaranteed to decompress to exactly recordSize bytes; individual records may differ slightly in size.
static short ZTXT_RANDOMACCESS
          If this flag is set, this zTXT supports random access of the compressed text records.
static int ZTXT_VERSION
          The highest zTXT format version recognized by this class.
private  short zTXTFlags
          Flags to indicate features of this zTXT document.
private  int zTXTVersion
          The zTXT format version for this document.
 
Fields inherited from class org.WeaselReader.PalmIO.PalmDB
DB_FLAG_BACKUP, DB_FLAG_DIRTY_APPINFO, DB_FLAG_NEWER_OKAY, DB_FLAG_NO_COPY, DB_FLAG_READ_ONLY, DB_FLAG_RESET_ON_INSTALL, DB_HEADER_LENGTH, DB_NAME_LENGTH, PALM_CTIME_OFFSET, REC_FLAG_BUSY, REC_FLAG_DELETE, REC_FLAG_DIRTY, REC_FLAG_SECRET
 
Constructor Summary
ZtxtDB(java.io.File inputFile)
          Create a new ZtxtDB and load the specified zTXT document.
 
Method Summary
 long computeCRC32()
          Compute a CRC32 value over the text data records in the database.
 void endDecompression()
          End the decompression of data and clean up any data used by the Inflater object.
protected  void finalize()
          Close any open resources before garbage collection.
 int getAnnotationRecordIndex()
           
 ZtxtDB.Annotations getAnnotations()
           
 int getBookmarkRecordIndex()
           
 Bookmarks getBookmarks()
           
 long getCRC32()
           
 long getDataSize()
           
 int getNumAnnotations()
           
 int getNumBookmarks()
           
 int getNumDataRecords()
           
 int getRecordSize()
           
 short getzTXTFlags()
           
 int getzTXTVersion()
           
 java.lang.String getzTXTVersionString()
          Convert the two version bytes into a readable string.
 void initializeDecompression()
          Initialize the decompression stream.
static void main(java.lang.String[] args)
          Prints values from the zTXT header and validates the CRC32 within the header.
private  void parseOffsetsAndTitles(byte[] data, int numEntries, int[] offsetArray, java.lang.String[] titleArray)
          Parse the offset/title data in a byte array that is common to both bookmark and annotation indices.
private  void readAnnotations()
          Load the annotation index and annotation text blocks into memory.
private  void readBookmarks()
          Load the bookmark array into memory for quicker access.
 java.lang.String readTextRecord(int index)
          Read the specified text data record and decompress it.
private  void readzTXTHeader()
          Read the zTXT header data from record zero.
 java.lang.String toString()
          Show something semi-useful for this zTXT when the object is printed.
 boolean validateCRC32()
          Validate the CRC32 stored in the zTXT header against the CRC32 computed over the text data records in the database.
 
Methods inherited from class org.WeaselReader.PalmIO.PalmDB
getApplicationInfoIDPtr, getCreationTime, getDbCreatorID, getDbName, getDbTypeID, getFlags, getLastBackupTime, getModificationNumber, getModificationTime, getNextRecordListIDPtr, getNumRecords, getRecordFlags, getRecordIDs, getRecordOffsets, getSortInfoIDPtr, getUniqueIDSeed, getVersion, readRecord, readUInt32, readUniqueID
 
Methods inherited from class java.io.RandomAccessFile
close, getChannel, getFD, getFilePointer, length, read, read, read, readBoolean, readByte, readChar, readDouble, readFloat, readFully, readFully, readInt, readLine, readLong, readShort, readUnsignedByte, readUnsignedShort, readUTF, seek, setLength, skipBytes, write, write, write, writeBoolean, writeByte, writeBytes, writeChar, writeChars, writeDouble, writeFloat, writeInt, writeLong, writeShort, writeUTF
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

WEASEL_CREATOR_ID

public static final java.lang.String WEASEL_CREATOR_ID
Weasel Reader database creator ID. This is a four byte character literal. Use PalmDB.stringToID to get the numerical value.

See Also:
Constant Field Values

WEASEL_TYPE_ID

public static final java.lang.String WEASEL_TYPE_ID
Weasel Reader database type ID. This is a four byte character literal. Use PalmDB.stringToID to get the numerical value.

See Also:
Constant Field Values

MAX_TITLE_LENGTH

public static final int MAX_TITLE_LENGTH
The maximum length of a zTXT bookmark or annotation title. A title of the maximum length will not be NUL terminated in the database.

See Also:
Constant Field Values

MAX_ANNOTATION_LENGTH

public static final int MAX_ANNOTATION_LENGTH
The maximum length of a zTXT annotation. An annotation of the maximum length will not be NUL terminated in the database.

See Also:
Constant Field Values

ZTXT_VERSION

public static final int ZTXT_VERSION
The highest zTXT format version recognized by this class.

See Also:
Constant Field Values

ZTXT_RANDOMACCESS

public static final short ZTXT_RANDOMACCESS
If this flag is set, this zTXT supports random access of the compressed text records.

See Also:
Constant Field Values

ZTXT_NONUNIFORM

public static final short ZTXT_NONUNIFORM
If this flag is set, data records in this zTXT are not guaranteed to decompress to exactly recordSize bytes; individual records may differ slightly in size.

See Also:
Constant Field Values

zTXTVersion

private int zTXTVersion
The zTXT format version for this document. Consists of two unsigned bytes. A value of 0x012C would be interpreted as version 1.44.
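
The two-byte interpretation described above can be sketched as follows. The class and method names are hypothetical, and whether the real getzTXTVersionString pads the minor number is not stated here, so this sketch simply prints both bytes as plain decimals.

```java
// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class ZtxtVersionSketch {
    /** Split a 16-bit version into its major (high) and minor (low) bytes. */
    static String versionString(int version) {
        int major = (version >> 8) & 0xFF; // high byte, e.g. 0x01 -> 1
        int minor = version & 0xFF;        // low byte, e.g. 0x2C -> 44
        return major + "." + minor;
    }
}
```

For the value 0x012C this yields the string 1.44, matching the interpretation given above.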


numDataRecords

private int numDataRecords
The number of text data records in this zTXT file. Does not include record 0 or any bookmark or annotation records.


dataSize

private long dataSize
The total size of the text data when uncompressed.


recordSize

private int recordSize
The size of an uncompressed data record. All data records will decompress to this size except for the last which may be smaller.
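
Because every data record except the last decompresses to the same size, a byte offset into the uncompressed text can be mapped to a record by simple integer arithmetic. The sketch below illustrates this under the assumption that the document is uniform (ZTXT_NONUNIFORM is not set); the class and method names are hypothetical.

```java
// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class RecordLocatorSketch {
    /** Number of data records needed to hold dataSize bytes (rounds up). */
    static int recordCount(long dataSize, int recordSize) {
        return (int) ((dataSize + recordSize - 1) / recordSize);
    }

    /** Zero-based index of the data record containing the given text offset. */
    static int recordIndex(long textOffset, int recordSize) {
        return (int) (textOffset / recordSize);
    }

    /** Position of the given text offset within its decompressed record. */
    static int offsetInRecord(long textOffset, int recordSize) {
        return (int) (textOffset % recordSize);
    }
}
```

With a record size of 8192, for example, text offset 20000 falls in data record 2 at position 3616. The record index here counts text data records from zero and does not include record 0 of the database.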


numBookmarks

private int numBookmarks
The number of bookmarks present in this zTXT document.


bookmarkRecordIndex

private int bookmarkRecordIndex
The index of the record containing the bookmark list/index. If there are no bookmarks in this zTXT, this record will not exist and this value must be zero.


numAnnotations

private int numAnnotations
The number of annotations present in this zTXT document.


annotationRecordIndex

private int annotationRecordIndex
The index of the record containing the annotation index. If there are no annotations in this zTXT, this record will not exist and this value must be zero.


zTXTFlags

private short zTXTFlags
Flags to indicate features of this zTXT document. There are currently only two defined: ZTXT_RANDOMACCESS and ZTXT_NONUNIFORM. Older versions of Weasel on Palm OS supported non-random access zTXTs, but this is no longer the case and all supported zTXTs must have ZTXT_RANDOMACCESS set. There are also no known non-uniform zTXTs, so this feature is not currently supported by either this class or Weasel Reader.
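
The support policy described above amounts to a pair of bit tests. The following sketch shows one way to express it; the flag values are copied from the zTXT_flag enum, while the class name and the isSupported method are hypothetical and do not exist in ZtxtDB.

```java
// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class ZtxtFlagsSketch {
    static final short ZTXT_RANDOMACCESS = 0x01;
    static final short ZTXT_NONUNIFORM = 0x02;

    /** A document is readable only if it is random access and uniform. */
    static boolean isSupported(short flags) {
        boolean randomAccess = (flags & ZTXT_RANDOMACCESS) != 0;
        boolean nonUniform = (flags & ZTXT_NONUNIFORM) != 0;
        return randomAccess && !nonUniform;
    }
}
```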


crc32

private long crc32
A CRC32 value computed over all of the uncompressed text data. The CRC32 function used to generate this value is that provided by zLib.


bookmarks

private Bookmarks bookmarks
The bookmarks for this zTXT document.


annotations

private ZtxtDB.Annotations annotations
The annotations for this zTXT document.


decompressor

private java.util.zip.Inflater decompressor
The decompression information for this zTXT database. This data must be primed by calling initializeDecompression and may be ended by calling endDecompression.

Constructor Detail

ZtxtDB

public ZtxtDB(java.io.File inputFile)
       throws java.io.IOException,
              java.util.zip.DataFormatException
Create a new ZtxtDB and load the specified zTXT document.

Parameters:
inputFile - a zTXT document database to read from disk.
Throws:
java.io.IOException - if an I/O error occurs while reading the PDB header or the zTXT header.
java.util.zip.DataFormatException - if the input file is not a zTXT database (its typeID is not 'zTXT')
Method Detail

getRecordSize

public int getRecordSize()
Returns:
the size of the data records in this zTXT document.

getzTXTFlags

public short getzTXTFlags()
Returns:
the zTXT format option flags.

getzTXTVersion

public int getzTXTVersion()
Returns:
the zTXT format version for this document.

getzTXTVersionString

public java.lang.String getzTXTVersionString()
Convert the two version bytes into a readable string.

Returns:
the zTXT version as a readable String.

getNumDataRecords

public int getNumDataRecords()
Returns:
the number of text data records.

getDataSize

public long getDataSize()
Returns:
the total size of the uncompressed text data.

getNumBookmarks

public int getNumBookmarks()
Returns:
the number of bookmarks present.

getBookmarkRecordIndex

public int getBookmarkRecordIndex()
Returns:
the index of the bookmark list record, if present.

getNumAnnotations

public int getNumAnnotations()
Returns:
the number of annotations present.

getAnnotationRecordIndex

public int getAnnotationRecordIndex()
Returns:
the index of the annotation index record, if present.

getCRC32

public long getCRC32()
Returns:
the computed CRC32 value for this zTXT's data.

getBookmarks

public Bookmarks getBookmarks()
Returns:
the bookmark collection for this zTXT. If this zTXT has no bookmarks, returns null.

getAnnotations

public ZtxtDB.Annotations getAnnotations()
Returns:
the annotation collection for this zTXT. If this zTXT has no annotations, returns null.

validateCRC32

public boolean validateCRC32()
Validate the CRC32 stored in the zTXT header against the CRC32 computed over the text data records in the database. zTXT databases use the CRC32 algorithm from zLib which is available in java.util.zip.CRC32.

Returns:
true if the stored CRC32 matches that computed from the zTXT's text data records or false if the CRC32 does not match or if the stored CRC32 is zero.
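
The check described above can be sketched with java.util.zip.CRC32 as follows. This is an illustration only: the class and method names are hypothetical, and the records are passed in as already-decompressed byte arrays rather than being read from the database.

```java
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class ZtxtCrcSketch {
    /** Run one CRC32 across all uncompressed text records, in order. */
    static long crcOf(List<byte[]> textRecords) {
        CRC32 crc = new CRC32();
        for (byte[] record : textRecords) {
            crc.update(record, 0, record.length);
        }
        return crc.getValue();
    }

    /** A stored value of zero is treated as "no CRC32 present" and fails. */
    static boolean matches(long storedCrc, List<byte[]> textRecords) {
        if (storedCrc == 0) {
            return false;
        }
        return storedCrc == crcOf(textRecords);
    }
}
```

Because the CRC32 object accumulates state across update calls, the result depends only on the concatenated text and not on how it was divided into records.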

computeCRC32

public long computeCRC32()
                  throws java.io.IOException
Compute a CRC32 value over the text data records in the database. zTXT databases use the CRC32 algorithm from zLib which is available in java.util.zip.CRC32.

Returns:
the computed CRC32 value for this zTXT database.
Throws:
java.io.IOException - if an I/O error occurred while computing the CRC32 value.

initializeDecompression

public void initializeDecompression()
                             throws java.lang.ArrayIndexOutOfBoundsException,
                                    java.io.IOException,
                                    java.util.zip.DataFormatException
Initialize the decompression stream. For random access to work, the first text record must be decompressed first. After that, any other text record may be decompressed in any order. This is necessary because the first compressed record contains data important to the zLib format. The decompression stream will remain open until this object is discarded or until the endDecompression method is called.

Throws:
java.io.IOException - if an I/O error occurs while reading the first record.
java.lang.ArrayIndexOutOfBoundsException - if the first text record does not exist.
java.util.zip.DataFormatException - if the zLib formatted data in the input text record is invalid.
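
The priming requirement can be demonstrated in isolation with java.util.zip. The sketch below is not this class's implementation. It assumes that each record was written with a full flush of the compressor, which is what lets a record be decoded without the records before it, and the class and method names are hypothetical. A single Inflater consumes the first record (which carries the zLib stream header) and can then decode the remaining records in any order.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class RandomAccessInflateSketch {
    /** Compress one chunk and end it with a full flush (assumed record boundary). */
    static byte[] deflateChunk(Deflater deflater, byte[] chunk) {
        deflater.setInput(chunk);
        byte[] buf = new byte[chunk.length * 2 + 64]; // ample for small chunks
        int n = deflater.deflate(buf, 0, buf.length, Deflater.FULL_FLUSH);
        return Arrays.copyOf(buf, n);
    }

    /** Feed one record to an already primed (or fresh, for record 0) Inflater. */
    static byte[] inflateRecord(Inflater inflater, byte[] record) throws DataFormatException {
        inflater.setInput(record);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        // Drain the whole record so the Inflater stops on a block boundary.
        while (!inflater.needsInput() && !inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && inflater.needsDictionary()) {
                throw new DataFormatException("preset dictionary not supported");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

In this sketch, decoding record 0 first plays the role of initializeDecompression, and calling Inflater.end() afterwards corresponds to endDecompression.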

endDecompression

public void endDecompression()
End the decompression of data and clean up any data used by the Inflater object.


readTextRecord

public java.lang.String readTextRecord(int index)
                                throws java.lang.ArrayIndexOutOfBoundsException,
                                       java.io.IOException,
                                       java.util.zip.DataFormatException
Read the specified text data record and decompress it.

Parameters:
index - the index of the text data record to be read, counting from zero.
Returns:
a String containing the decompressed text.
Throws:
java.io.IOException - if an I/O error occurs while reading the requested record.
java.lang.ArrayIndexOutOfBoundsException - if the requested record index does not exist.
java.util.zip.DataFormatException - if the Inflater is not initialized or if the zLib formatted data in the input text record is invalid.

toString

public java.lang.String toString()
Show something semi-useful for this zTXT when the object is printed.

Overrides:
toString in class PalmDB
Returns:
a String representation of this zTXT.

finalize

protected void finalize()
                 throws java.lang.Throwable
Close any open resources before garbage collection. Specifically, clean up the decompression data.

Overrides:
finalize in class java.lang.Object
Throws:
java.lang.Throwable

readzTXTHeader

private void readzTXTHeader()
                     throws java.io.IOException
Read the zTXT header data from record zero.

Throws:
java.io.IOException - if an I/O error occurs while reading from the database.

readBookmarks

private void readBookmarks()
                    throws java.lang.ArrayIndexOutOfBoundsException,
                           java.io.IOException
Load the bookmark array into memory for quicker access.

Throws:
java.io.IOException - if an I/O error occurs reading the bookmark record.
java.lang.ArrayIndexOutOfBoundsException - if the bookmark record is listed as existing but cannot be found.

readAnnotations

private void readAnnotations()
                      throws java.lang.ArrayIndexOutOfBoundsException,
                             java.io.IOException
Load the annotation index and annotation text blocks into memory.

Throws:
java.io.IOException - if an I/O error occurs reading the annotation index record.
java.lang.ArrayIndexOutOfBoundsException - if the annotation index record is listed as existing but cannot be found.

parseOffsetsAndTitles

private void parseOffsetsAndTitles(byte[] data,
                                   int numEntries,
                                   int[] offsetArray,
                                   java.lang.String[] titleArray)
                            throws java.io.IOException
Parse the offset/title data in a byte array that is common to both bookmark and annotation indices. The offsetArray and titleArray arrays must be allocated before this method is called and they should have a length equal to the number of bookmarks/annotations present in the data array to be parsed.

Parameters:
data - the array of data bytes to be parsed.
numEntries - the number of offsets/titles to be parsed from the data array.
offsetArray - an empty allocated array where byte offset values will be stored.
titleArray - an empty allocated array where titles will be stored.
Throws:
java.io.IOException - if the input data array is too small, which most likely indicates an I/O error when the array was read.
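
The parsing step can be sketched as below. The on-disk entry layout is not given on this page, so the sketch makes several assumptions that should be checked against the zTXT format reference: each entry is a 4-byte big-endian offset followed by a fixed-width title field, titles shorter than the field are NUL terminated, and titles are decoded as ISO-8859-1. The titleWidth parameter and the class name are hypothetical; the real method takes no such parameter.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of org.WeaselReader.PalmIO.
final class OffsetTitleSketch {
    static void parse(byte[] data, int numEntries, int titleWidth,
                      int[] offsetArray, String[] titleArray) throws IOException {
        int entrySize = 4 + titleWidth; // assumed layout: offset, then title field
        if (data.length < numEntries * entrySize) {
            throw new IOException("index record too small for " + numEntries + " entries");
        }
        ByteBuffer buf = ByteBuffer.wrap(data); // ByteBuffer is big-endian by default
        for (int i = 0; i < numEntries; i++) {
            offsetArray[i] = buf.getInt();
            int start = buf.position();
            // A full-width title has no NUL terminator, so stop at the field end.
            int len = 0;
            while (len < titleWidth && data[start + len] != 0) {
                len++;
            }
            titleArray[i] = new String(data, start, len, StandardCharsets.ISO_8859_1);
            buf.position(start + titleWidth); // skip to the next entry
        }
    }
}
```

As with the real method, the caller allocates both output arrays before the call, each with at least numEntries elements.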

main

public static void main(java.lang.String[] args)
Prints values from the zTXT header and validates the CRC32 within the header. Can also print the bookmark and annotation indices as well as the annotation text.

Parameters:
args - the first argument is the input filename, the second argument is an optional boolean (0/1) toggling whether to print the bookmark list, the third argument is an optional boolean (0/1) toggling whether to print the annotation list, the fourth argument is an optional annotation number whose full text will be printed, and the fifth argument is the number of a text record to decompress and print.