org.WeaselReader.PalmIO
Class PalmDocDB

java.lang.Object
  extended by java.io.RandomAccessFile
      extended by org.WeaselReader.PalmIO.PalmDB
          extended by org.WeaselReader.PalmIO.PalmDocDB
All Implemented Interfaces:
java.io.Closeable, java.io.DataInput, java.io.DataOutput

public class PalmDocDB
extends PalmDB

This class supports reading PalmDoc (also known as AportisDoc) formatted Palm databases. A PalmDoc database file is similar to a zTXT file, but it uses a simple proprietary compression algorithm that is very quick and compresses poorly. The format also supports bookmarks, each stored in a separate record following the text records. PalmDoc databases do not support annotations.

There are two versions of PalmDoc files: one with uncompressed text records (version 1) and one with compressed text records (version 2). Unfortunately, because of a number of buggy PalmDoc conversion programs, many broken PalmDoc files exist in the wild. The most common breakage is compressed records that do not decompress to the proper record size (normally 4096 bytes). Handling this requires building a table of true record sizes so that such a PalmDoc can be navigated accurately. The other form of breakage is an incorrect record count in the PalmDoc header.

The PalmDoc header in record 0 is described by the following C language structure:
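typedef struct palmdoc_record0Type
{
    UInt16 wVersion;            /* 1 = uncompressed text, 2 = compressed text */
    UInt16 spare;
    UInt32 munged_dwStoryLen;   /* length of the uncompressed text data */
    UInt16 munged_wNumRecs;     /* number of text records */
    UInt16 wRecSize;            /* uncompressed record size, normally 4096 */
    UInt32 dwSpare2;
} palmdoc_record0;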
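
As an illustration of the layout above, the following sketch decodes the 16 header bytes from a raw copy of record 0. It assumes the fields are stored big-endian, as multi-byte values in Palm databases are, and that the caller already has the record bytes in memory. The class name PalmDocHeaderSketch and its field names are hypothetical and are not part of PalmDocDB.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical holder for the six fields of the PalmDoc record 0 header. */
final class PalmDocHeaderSketch {
    static final int HEADER_LENGTH = 16; // 2 + 2 + 4 + 2 + 2 + 4 bytes

    final int version;      // wVersion
    final int spare;        // spare
    final long storyLength; // munged_dwStoryLen
    final int numRecords;   // munged_wNumRecs
    final int recordSize;   // wRecSize
    final long spare2;      // dwSpare2

    private PalmDocHeaderSketch(ByteBuffer buf) {
        // Mask each value so that UInt16/UInt32 fields are not sign-extended.
        version = buf.getShort() & 0xFFFF;
        spare = buf.getShort() & 0xFFFF;
        storyLength = buf.getInt() & 0xFFFFFFFFL;
        numRecords = buf.getShort() & 0xFFFF;
        recordSize = buf.getShort() & 0xFFFF;
        spare2 = buf.getInt() & 0xFFFFFFFFL;
    }

    /** Parse the header from the first 16 bytes of record 0. */
    static PalmDocHeaderSketch parse(byte[] record0) {
        if (record0.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("record 0 is shorter than 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(record0).order(ByteOrder.BIG_ENDIAN);
        return new PalmDocHeaderSketch(buf);
    }
}
```

Because broken converters can write a wrong story length or record count, a reader would probably treat storyLength and numRecords as hints and cross-check them against the actual record list.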

The information and C code for the PalmDoc format come from Weasel Reader, which in turn took the information from another Palm OS book reader, CSpotRun. CSpotRun is copyright Bill Clagett (bill@32768.com) and is also released under the GNU GPL. The source for CSpotRun was at one time available at Bill Clagett's page.

Version:
$Id$
Author:
John Gruenenfelder

Field Summary
private  int bookmarkRecordIndex
          The first bookmark record, if any.
private  Bookmarks bookmarks
          The bookmarks for this PalmDoc document.
private  long dataSize
          Length of uncompressed text data in a PalmDoc database.
static int MAX_TITLE_LENGTH
          Maximum length for a PalmDoc bookmark title.
private  int numBookmarks
          The number of bookmarks in this PalmDoc file.
private  int numDataRecords
          The number of text records in a PalmDoc database.
static int PALMDOC_COMPRESSED
          PalmDoc version flag for compressed data.
static java.lang.String PALMDOC_CREATOR_ID
          PalmDoc database creator ID.
static java.lang.String PALMDOC_TYPE_ID
          PalmDoc database type ID.
static int PALMDOC_UNCOMPRESSED
          PalmDoc version flag for uncompressed data.
private  int palmDocVersion
          PalmDoc format version.
private  int[] recordLengths
          Lengths of each text record in the database.
private  int recordSize
          The record size for an uncompressed PalmDoc text record.
 
Fields inherited from class org.WeaselReader.PalmIO.PalmDB
DB_FLAG_BACKUP, DB_FLAG_DIRTY_APPINFO, DB_FLAG_NEWER_OKAY, DB_FLAG_NO_COPY, DB_FLAG_READ_ONLY, DB_FLAG_RESET_ON_INSTALL, DB_HEADER_LENGTH, DB_NAME_LENGTH, PALM_CTIME_OFFSET, REC_FLAG_BUSY, REC_FLAG_DELETE, REC_FLAG_DIRTY, REC_FLAG_SECRET
 
Constructor Summary
PalmDocDB(java.io.File pdbFile)
          Create a new PalmDocDB and load the specified PalmDoc file.
 
Method Summary
private  int calculateBufferLength(byte[] data)
          Calculate the decompressed length of the given buffer.
private  void calculateRecordLengths()
          Calculate the lengths of each text record.
private  byte[] decompressBuffer(byte[] data, int outputSize)
          Decompress the given buffer using the LZ77-based PalmDoc compression algorithm.
 int getBookmarkRecordIndex()
           
 Bookmarks getBookmarks()
           
 long getDataSize()
           
 int getNumBookmarks()
           
 int getNumDataRecords()
           
 int getPalmDocVersion()
           
 int[] getRecordLengths()
           
 int getRecordSize()
           
static void main(java.lang.String[] args)
          Prints values from the PalmDoc header as well as any bookmark list and the text from a specified record number.
private  void readBookmarks()
          Load the bookmark index into memory for quicker access.
private  void readPalmDocHeader()
          Read the PalmDoc header data from record zero.
 java.lang.String readTextRecord(int index)
          Read the specified text record and decompress if necessary.
 
Methods inherited from class org.WeaselReader.PalmIO.PalmDB
getApplicationInfoIDPtr, getCreationTime, getDbCreatorID, getDbName, getDbTypeID, getFlags, getLastBackupTime, getModificationNumber, getModificationTime, getNextRecordListIDPtr, getNumRecords, getRecordFlags, getRecordIDs, getRecordOffsets, getSortInfoIDPtr, getUniqueIDSeed, getVersion, readRecord, readUInt32, readUniqueID, toString
 
Methods inherited from class java.io.RandomAccessFile
close, getChannel, getFD, getFilePointer, length, read, read, read, readBoolean, readByte, readChar, readDouble, readFloat, readFully, readFully, readInt, readLine, readLong, readShort, readUnsignedByte, readUnsignedShort, readUTF, seek, setLength, skipBytes, write, write, write, writeBoolean, writeByte, writeBytes, writeChar, writeChars, writeDouble, writeFloat, writeInt, writeLong, writeShort, writeUTF
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PALMDOC_CREATOR_ID

public static final java.lang.String PALMDOC_CREATOR_ID
PalmDoc database creator ID. This is a four byte character literal. Use PalmDB.stringToID to get the numerical value.

See Also:
Constant Field Values

PALMDOC_TYPE_ID

public static final java.lang.String PALMDOC_TYPE_ID
PalmDoc database type ID. This is a four byte character literal. Use PalmDB.stringToID to get the numerical value.

See Also:
Constant Field Values
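
The creator and type IDs are four-character codes that Palm OS treats as 32-bit numbers. The signature of PalmDB.stringToID is not shown on this page, so the stand-alone helper below is a hypothetical illustration of the usual packing (first character in the most significant byte), not the actual method.

```java
/** Hypothetical helper showing how a four-character ID maps to a number. */
final class FourCharIdSketch {
    /** Pack a four-character ID into an unsigned 32-bit value held in a long. */
    static long toId(String id) {
        if (id.length() != 4) {
            throw new IllegalArgumentException("ID must be exactly four characters");
        }
        long value = 0;
        for (int i = 0; i < 4; i++) {
            char c = id.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException("ID characters must fit in one byte");
            }
            // Shift earlier characters up and append this one as the low byte.
            value = (value << 8) | c;
        }
        return value;
    }
}
```

With this packing, the type ID 'TEXt' corresponds to the value 0x54455874.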

PALMDOC_UNCOMPRESSED

public static final int PALMDOC_UNCOMPRESSED
PalmDoc version flag for uncompressed data.

See Also:
Constant Field Values

PALMDOC_COMPRESSED

public static final int PALMDOC_COMPRESSED
PalmDoc version flag for compressed data.

See Also:
Constant Field Values

MAX_TITLE_LENGTH

public static final int MAX_TITLE_LENGTH
Maximum length for a PalmDoc bookmark title. A title of the maximum length will not be NUL terminated in the database.

See Also:
Constant Field Values
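
Because a title of the maximum length has no NUL terminator, a reader cannot simply scan for a zero byte. The sketch below shows one way to decode such a fixed-width field. The class name, the maxTitleLength parameter, and the choice of ISO-8859-1 as the character set are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical decoder for a fixed-width, optionally NUL-terminated title. */
final class BookmarkTitleSketch {
    /**
     * Decode a title that ends at the first NUL byte or, if there is none,
     * at maxTitleLength bytes.
     */
    static String decodeTitle(byte[] field, int maxTitleLength) {
        int limit = Math.min(field.length, maxTitleLength);
        int end = 0;
        while (end < limit && field[end] != 0) {
            end++;
        }
        // Assumption: one byte per character; the real encoding may differ.
        return new String(field, 0, end, StandardCharsets.ISO_8859_1);
    }
}
```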

palmDocVersion

private int palmDocVersion
PalmDoc format version. Version 1 contains uncompressed text records and version 2 contains compressed text records.


dataSize

private long dataSize
Length of uncompressed text data in a PalmDoc database. Due to bugs in PalmDoc converters, this number may not always be correct.


numDataRecords

private int numDataRecords
The number of text records in a PalmDoc database. Due to broken converters, this value also seems to have a habit of being wrong.


recordSize

private int recordSize
The record size for an uncompressed PalmDoc text record. All records are supposed to decompress to this length, but there are many PalmDoc files in the wild where this is not true. This can occur when a PalmDoc file is edited on a device. To avoid recompressing all the data that follows the edited record, only that record is recompressed, resulting in a record with an odd size.


numBookmarks

private int numBookmarks
The number of bookmarks in this PalmDoc file. This is not stored in the header and must be computed. The number of bookmarks is given by:
totalRecords - numDataRecords - 1


bookmarkRecordIndex

private int bookmarkRecordIndex
The first bookmark record, if any. PalmDoc databases do not have a bookmark index in one place. Instead, this record and each record after it contain a single bookmark, up to numBookmarks in total. If there are no bookmarks in this PalmDoc, this value will be zero.
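
The arithmetic for numBookmarks and bookmarkRecordIndex can be sketched as follows. The sketch assumes record 0 is the header, records 1 through numDataRecords hold text, and bookmarks follow immediately after. Clamping a negative count to zero is an added guard against a wrong header record count, not something the class is documented to do. All names are hypothetical.

```java
/** Hypothetical helper for locating bookmark records in a PalmDoc database. */
final class BookmarkLayoutSketch {
    /** totalRecords - numDataRecords - 1, never less than zero. */
    static int numBookmarks(int totalRecords, int numDataRecords) {
        return Math.max(0, totalRecords - numDataRecords - 1);
    }

    /** Index of the first bookmark record, or zero if there are none. */
    static int firstBookmarkIndex(int totalRecords, int numDataRecords) {
        // Assumption: bookmarks start right after the last text record.
        return numBookmarks(totalRecords, numDataRecords) > 0 ? numDataRecords + 1 : 0;
    }
}
```

For a database with 8 records in total and 5 text records, this gives 2 bookmarks starting at record 6.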


bookmarks

private Bookmarks bookmarks
The bookmarks for this PalmDoc document.


recordLengths

private int[] recordLengths
Lengths of each text record in the database. Because of various flaws and situations, PalmDoc databases can come to have non-uniform record lengths. In order to properly position the reader within the document, it is necessary to calculate the record lengths ahead of time.
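
One way a caller might use such a table is to translate an absolute position in the uncompressed text into a record index and an offset within that record. The sketch below is a hypothetical illustration using a plain linear scan. It counts text records from zero, matching the index convention of readTextRecord.

```java
/** Hypothetical lookup from an absolute text position to a record position. */
final class RecordLocatorSketch {
    /**
     * Returns a two-element array: the zero-based text record index and the
     * offset of the position within that record.
     */
    static int[] locate(int[] recordLengths, long position) {
        if (position < 0) {
            throw new IllegalArgumentException("position must not be negative");
        }
        long start = 0; // absolute offset at which record i begins
        for (int i = 0; i < recordLengths.length; i++) {
            if (position < start + recordLengths[i]) {
                return new int[] {i, (int) (position - start)};
            }
            start += recordLengths[i];
        }
        throw new IndexOutOfBoundsException("position is past the end of the text");
    }
}
```

With uniform records the same answer would come from a division by the record size, but that shortcut fails as soon as one record has an odd length, which is why the table is needed.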

Constructor Detail

PalmDocDB

public PalmDocDB(java.io.File pdbFile)
          throws java.io.IOException,
                 java.util.zip.DataFormatException
Create a new PalmDocDB and load the specified PalmDoc file.

Parameters:
pdbFile - a PalmDoc document database to be read from disk.
Throws:
java.io.IOException - if an I/O error occurs while reading the PDB header or the PalmDoc header.
java.util.zip.DataFormatException - if the input file is not a PalmDoc database (its typeID is not 'TEXt').
Method Detail

getPalmDocVersion

public int getPalmDocVersion()
Returns:
the PalmDoc format version. Possible values are either PALMDOC_UNCOMPRESSED or PALMDOC_COMPRESSED.

getDataSize

public long getDataSize()
Returns:
the total size of the uncompressed text data records.

getNumDataRecords

public int getNumDataRecords()
Returns:
the number of text data records.

getRecordSize

public int getRecordSize()
Returns:
the size of an uncompressed text data record.

getNumBookmarks

public int getNumBookmarks()
Returns:
the number of bookmarks in this PalmDoc database.

getBookmarkRecordIndex

public int getBookmarkRecordIndex()
Returns:
the index of the first record containing a bookmark, if any.

getBookmarks

public Bookmarks getBookmarks()
Returns:
the bookmark collection for this PalmDoc. If this PalmDoc has no bookmarks, returns null.

getRecordLengths

public int[] getRecordLengths()
Returns:
the record lengths array containing the uncompressed lengths of each text record.

readTextRecord

public java.lang.String readTextRecord(int index)
                                throws java.lang.ArrayIndexOutOfBoundsException,
                                       java.io.IOException
Read the specified text record and decompress if necessary.

Parameters:
index - the index of the text data record to be read, counting from zero.
Returns:
a String containing the (decompressed) text.
Throws:
java.io.IOException - if an I/O error occurs while reading the input record.
java.lang.ArrayIndexOutOfBoundsException - if the requested record does not actually exist.

readPalmDocHeader

private void readPalmDocHeader()
                        throws java.io.IOException
Read the PalmDoc header data from record zero.

Throws:
java.io.IOException - if an I/O error occurs while reading from the database.

readBookmarks

private void readBookmarks()
                    throws java.lang.ArrayIndexOutOfBoundsException,
                           java.io.IOException
Load the bookmark index into memory for quicker access.

Throws:
java.io.IOException - if an I/O error occurs reading a bookmark record.
java.lang.ArrayIndexOutOfBoundsException - if a bookmark record should exist but cannot be found.

calculateRecordLengths

private void calculateRecordLengths()
                             throws java.lang.ArrayIndexOutOfBoundsException,
                                    java.io.IOException
Calculate the length of each text record. Some PalmDoc files have text records that do not all decompress to a uniform size. In order to accurately mark a position within the text, it is necessary to know the size of each record up to that point. The resulting array allows that calculation to be performed quickly.

Throws:
java.io.IOException - if an I/O error occurs while reading input records.
java.lang.ArrayIndexOutOfBoundsException - if a record which should exist is requested but does not actually exist.

decompressBuffer

private byte[] decompressBuffer(byte[] data,
                                int outputSize)
Decompress the given buffer using the LZ77-based PalmDoc compression algorithm.

Parameters:
data - a block of PalmDoc data to decompress.
outputSize - the length of the data array when decompressed.
Returns:
a byte array containing the uncompressed data.
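
For orientation, here is a sketch of the PalmDoc decompression scheme as it is commonly documented. It is not the source of this private method, and the class name is hypothetical. It assumes the usual byte classes: 0x01 to 0x08 introduce a run of literal bytes, 0x80 to 0xBF start a two-byte back-reference with an 11-bit distance and a 3-bit length (plus 3), 0xC0 to 0xFF encode a space followed by a character, and all other bytes are literals. Returning a trimmed array when the record decompresses short is a choice made for this sketch.

```java
import java.util.Arrays;

/** Hypothetical stand-alone PalmDoc decompressor, for illustration only. */
final class PalmDocDecompressSketch {
    static byte[] decompress(byte[] data, int outputSize) {
        byte[] out = new byte[outputSize];
        int in = 0;
        int pos = 0;
        while (in < data.length && pos < outputSize) {
            int c = data[in++] & 0xFF;
            if (c >= 0x01 && c <= 0x08) {
                // The next c bytes are copied through unchanged.
                for (int i = 0; i < c && in < data.length && pos < outputSize; i++) {
                    out[pos++] = data[in++];
                }
            } else if (c < 0x80) {
                out[pos++] = (byte) c; // plain literal (0x00 or 0x09..0x7F)
            } else if (c >= 0xC0) {
                out[pos++] = ' '; // space plus the character c with the top bit flipped
                if (pos < outputSize) {
                    out[pos++] = (byte) (c ^ 0x80);
                }
            } else {
                if (in >= data.length) {
                    break; // truncated back-reference
                }
                int pair = ((c << 8) | (data[in++] & 0xFF)) & 0x3FFF;
                int distance = pair >> 3;
                int length = (pair & 0x07) + 3;
                if (distance == 0 || distance > pos) {
                    break; // malformed reference; stop rather than read garbage
                }
                // Copy byte by byte because source and destination may overlap.
                for (int i = 0; i < length && pos < outputSize; i++) {
                    out[pos] = out[pos - distance];
                    pos++;
                }
            }
        }
        return Arrays.copyOf(out, pos);
    }
}
```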

calculateBufferLength

private int calculateBufferLength(byte[] data)
Calculate the decompressed length of the given buffer. This is the same as decompressing the buffer, but without saving the data anywhere. If the text of this PalmDoc is not compressed, there is no computation involved and the record length is the same as the buffer length.

Parameters:
data - a block of PalmDoc data to calculate the length of.
Returns:
the uncompressed length of the given data buffer.
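
The length-only pass described above might look like the following sketch, which walks the compressed bytes and adds up what each token would produce without writing any output. It assumes the commonly documented PalmDoc byte classes (literal runs for 0x01 to 0x08, two-byte back-references for 0x80 to 0xBF, space pairs for 0xC0 to 0xFF). The class name and the compressed flag are hypothetical. Unlike a full decompressor, it does not validate back-reference distances.

```java
/** Hypothetical length calculator for a PalmDoc text record. */
final class PalmDocLengthSketch {
    static int decompressedLength(byte[] data, boolean compressed) {
        if (!compressed) {
            return data.length; // uncompressed records need no computation
        }
        int in = 0;
        int length = 0;
        while (in < data.length) {
            int c = data[in++] & 0xFF;
            if (c >= 0x01 && c <= 0x08) {
                // A literal run: skip the run bytes and count them.
                int run = Math.min(c, data.length - in);
                in += run;
                length += run;
            } else if (c < 0x80) {
                length += 1; // single literal
            } else if (c >= 0xC0) {
                length += 2; // space plus one character
            } else {
                if (in >= data.length) {
                    break; // truncated back-reference
                }
                // The low three bits of the second byte hold the copy length minus 3.
                length += (data[in++] & 0x07) + 3;
            }
        }
        return length;
    }
}
```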

main

public static void main(java.lang.String[] args)
Prints values from the PalmDoc header as well as any bookmark list and the text from a specified record number.

Parameters:
args - the first argument is the input filename.