NOTE: This page is also available in Spanish.

1. What is makeztxt?

makeztxt is a simple command-line program that takes a plain ASCII text file and compresses it into a zTXT database. makeztxt will remove newline characters at the end of lines that contain text so that the paragraphs flow better on the Palm screen. makeztxt supports the use of regular expressions to automatically generate a list of bookmarks for you. Lastly, makeztxt can also break an existing zTXT file into its components (text, bookmarks, annotations) and store them in separate files for you.

Please note that as a command-line program, makeztxt is intended for more advanced users. There are several very good conversion programs available that have easy-to-use graphical interfaces. If you are not experienced with the UNIX/DOS command-line environment, you may wish to use one of those instead. You can find links to all the conversion programs on the sidebar.

2. Features

3. Using makeztxt: Automatically Generating Bookmarks

Running 'makeztxt --help' will print out the list of command line options and what their functions are.

The best feature of makeztxt is its ability to use regular expressions to search the input text for bookmark spots. This is done with the command line options -l and -r.

-l will list all the bookmarks that are generated.

-r takes a regex as an argument to generate one or more bookmarks. You can have as many -r options as you want.

A full listing of all of the options to makeztxt can be found in Section 5.

You can also put a list of regular expressions, one per line, in a file called ".makeztxtrc". This file goes in your home directory, or in the current directory (if you have no home directory). A sample .makeztxtrc is included with the distribution. You can also explicitly specify which file to read the regular expressions from by using the -R option.

makeztxt can add a list of pre-generated bookmarks given in a file with the -m option. Care should be taken to make sure that the bookmark offsets you specify are valid in the converted text since makeztxt will, by default, reformat the input text to better flow on a Palm screen (removing many line feeds).
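As a rough illustration of the bookmark file format described in Section 5 (an integer offset, one or more spaces or tabs, then a title of at most 20 characters), here is a small Python sketch. The helper names format_bookmark and parse_bookmark are hypothetical and are not part of makeztxt. The sketch only builds and reads lines in that format, and it cannot check that an offset is valid in the reformatted text.

```python
MAX_TITLE = 20  # title limit stated in the -m/--markfile description


def format_bookmark(offset, title):
    """Build one bookmark-file line: integer offset, a space, then the title."""
    if offset < 0:
        raise ValueError("offset must be a non-negative integer")
    if len(title) > MAX_TITLE:
        raise ValueError("title is longer than 20 characters")
    return f"{offset} {title}"


def parse_bookmark(line):
    """Split a bookmark-file line into (offset, title).

    Assumption: a title longer than the limit is simply cut at 20
    characters. How makeztxt itself treats long titles is not documented here.
    """
    offset_text, title = line.strip().split(None, 1)
    return int(offset_text), title[:MAX_TITLE]


print(format_bookmark(23955, "Chapter VII"))
print(parse_bookmark("23955 \t Chapter VII\n"))
```

A file written one line per bookmark in this way is what you would then hand to the -m option.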

For annotations, makeztxt can also add pre-generated annotations given in a file with the -A option. See Section 5 for information on how this file must be formatted.
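If you generate annotation files from a script, a helper along these lines may save some trial and error. It is only a sketch of the Title/Offset/Annotation layout described in Section 5. The function name format_annotation is hypothetical, and the extra check for a "Title:" line inside the annotation text is a precaution based on the rule that such a line starts a new annotation.

```python
MAX_TITLE = 20          # title limit from the -A/--annofile description
MAX_ANNOTATION = 4096   # annotation size limit from the same description


def format_annotation(title, offset, text):
    """Return one annotation entry in the Title/Offset/Annotation layout."""
    if len(title) > MAX_TITLE:
        raise ValueError("title is longer than 20 characters")
    if offset < 0:
        raise ValueError("offset must be a non-negative integer")
    if len(text) > MAX_ANNOTATION:
        raise ValueError("annotation is longer than 4096 characters")
    # A line beginning with "Title:" would be read as the start of the
    # next annotation, so refuse it here rather than produce a broken file.
    if any(line.startswith("Title:") for line in text.splitlines()):
        raise ValueError("annotation text must not contain a 'Title:' line")
    return f"Title: {title}\nOffset: {offset}\nAnnotation: {text}\n"


print(format_annotation("My Annotation", 12345,
                        "This is the text of my annotation."), end="")
```

Remember that the offset refers to a position in the reformatted text, which this sketch has no way to verify.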

In addition, you can use a two-part regular expression of the form (regexp1)(regexp2). It will match on the entire line, but only the text matched by regexp2 will be displayed as the bookmark title. For example:

makeztxt -l -r "(Subject:)(.*)" file.txt

If file.txt contains a number of emails or news articles, this will generate one bookmark per article, titled with the subject of the article but without the word "Subject:". (The quotes keep the shell from interpreting the parentheses.)
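To see what the two-part form does, here is a small Python sketch of the idea. It is not how makeztxt is implemented: it uses Python's re module rather than GNU or POSIX regex, the function name two_part_bookmarks is hypothetical, and the offsets are positions in the sample text as given, with no reformatting applied.

```python
import re


def two_part_bookmarks(text, pattern):
    """Return (offset, title) pairs for lines matching a two-group regex.

    The whole expression decides whether a line matches; only the second
    group is kept as the bookmark title.
    """
    regex = re.compile(pattern)
    bookmarks = []
    offset = 0  # character position of the start of the current line
    for line in text.splitlines(keepends=True):
        match = regex.search(line)
        if match:
            bookmarks.append((offset + match.start(), match.group(2).strip()))
        offset += len(line)
    return bookmarks


sample = (
    "From: a@example.org\n"
    "Subject: Hello world\n"
    "\n"
    "Body text\n"
    "Subject: Second post\n"
)
print(two_part_bookmarks(sample, r"(Subject:)(.*)"))
```

Each title comes out without the leading "Subject:", which is the behaviour described above.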

The following examples show the name of the work, the command line used, and the first eight bookmarks generated by the command line:

Shakespeare's "King Henry V"

>makeztxt -l -t "King Henry V" -r "DRAMATIS PERSONAE" \
      -r "ACT [A-Z]+" -r "SCENE [A-Z]+" 2ws2310.txt
Generated bookmarks
Offset          Title
-----------     --------------------
12097           DRAMATIS PERSONAE
14841           ACT FIRST
14853           SCENE I
19241           SCENE II
33233           ACT II
35118           SCENE I
40805           SCENE II
49553           SCENE III

RL Stevenson's "Treasure Island"

>makeztxt -l -t "Treasure Island" -r "PART [A-Z]+" \
      -r "          [0-9]+" treas10.txt
Generated bookmarks
Offset          Title
-----------     --------------------
12005           PART ONE
12422           PART TWO
12836           PART THREE
13087           PART FOUR
13685           PART FIVE
14102           PART SIX
14656           PART ONE
14723           1

Charles Darwin's "On the Origin of Species"

>makeztxt -l -t "On the Origin of Species" \
      -r "Introduction\." -r "Chapter [IVX]+" otoos10.txt
Generated bookmarks
Offset          Title
-----------     --------------------
19482           Introduction.
29724           Chapter I
99693           Chapter II
129257          Chapter III
165118          Chapter IV
259640          Chapter V
332498          Chapter VI
399182          Chapter VII

4. Using makeztxt (deconstructing)

Running 'makeztxt -d --help' will print out command-line usage for dissecting zTXT files. This mode is much simpler than creating a zTXT database, so it should be much easier to use. Simply give makeztxt a zTXT PDB file (filename.pdb) and it will output the uncompressed text data into another file (filename.txt). The exact output filename can be specified with the -o option.

makeztxt can also extract the bookmark list and the annotations from the zTXT file and output them. To output a bookmark list, give an output filename with the -m option. Similarly, to output a file with the zTXT's annotations, give an output filename with the -A option.
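If you want to drive this mode from a script, the command line can be assembled as in the following Python sketch. It uses only the -d, -o, -m and -A options described in this document. The function name deconstruct_command and the file names are hypothetical, and placing the options before the input file simply follows the examples in Section 3.

```python
def deconstruct_command(pdb_file, text_out=None, bookmarks_out=None,
                        annotations_out=None):
    """Build the argument list for deconstructing a zTXT database."""
    args = ["makeztxt", "-d"]
    if text_out:
        args += ["-o", text_out]          # where the extracted text goes
    if bookmarks_out:
        args += ["-m", bookmarks_out]     # bookmarks are skipped without -m
    if annotations_out:
        args += ["-A", annotations_out]   # annotations are skipped without -A
    args.append(pdb_file)
    return args


print(" ".join(deconstruct_command("book.pdb",
                                   bookmarks_out="book.marks",
                                   annotations_out="book.annos")))
```

The resulting list can be passed to subprocess.run() on a machine where makeztxt is installed.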

That's all there is to it.

5. List of command line options

makeztxt has a host of options to allow finer control over the text layout and compression modes.

5.1. Options used when creating a zTXT database

Option Description
-A/--annofile Give makeztxt a file containing annotations that will be added into the generated zTXT database. This file must follow a particular format to be understood by makeztxt. Each annotation is of the format:
  1. An annotation begins with a title line:
    Title: My Annotation
    where the text after the colon is the annotation's title with a maximum of 20 characters.
  2. The next line is the location in the text of the annotation anchor:
    Offset: 12345
    where the offset value is an absolute character position in the *reformatted* text file.
  3. The actual annotation text:
    Annotation: This is the text of my annotation.
    The annotation text will continue after a *single* "Annotation:" line until one of the following conditions is met: a) the file ends, b) another annotation is started with a "Title:" line, or c) the annotation reaches the maximum size of 4096 characters.
-a/--adjust Control the method of text formatting. Valid types are 0, 1, or 2. Method 0 will compute the average line length through the entire file and strip newline characters from any line longer than the average. Method 1 will strip the newline from any line with text in it. Method 2 will leave the text unchanged. The default is 0.
-b/--length If adjust method 0 is used, the value given with this option is the length a line must be to have its newline stripped. Using this option will override the value calculated by makeztxt.
-h/--help Display command line options and usage information.
-l/--list Display a list of all bookmarks generated by makeztxt or specified by the user. This is useful if you want to make sure your regular expressions are generating correct bookmarks.
-L/--launchable Sets the "launchable" attribute in the generated zTXT database. The Launcher apps on a Palm device can use this attribute to display all zTXT documents in the main program listing, allowing you to launch Weasel and open a specific document by tapping on the document directly. Default is OFF.
-m/--markfile Give makeztxt a file containing a pre-generated list of bookmarks to add to the generated zTXT database. The bookmark file has a very simple format. Each line begins with an integer offset for the bookmark anchor. Following that are one or more spaces/tabs. Last comes the bookmark title, which occupies the remainder of the line up to a maximum of 20 characters. A line might look like:
23955 Chapter VII
-n/--nobackup Instructs makeztxt to not set the backup attribute in the generated zTXT database. This attribute, if set, will cause the database to be backed up during the next HotSync operation. Default is to set this attribute.
-o/--output Explicitly give the output filename which makeztxt should use. If this filename is not given, makeztxt will generate an output filename by removing the extension of the input file and replacing it with "pdb". If makeztxt is reading input from standard input this option is mandatory.
-R/--regexfile makeztxt will attempt to read a default set of regular expressions from the file .makeztxtrc in the user's home directory or from /etc/makeztxt.conf if that fails. This option can be used to tell makeztxt which file to read the list of regular expressions from. Useful for users on systems with no home directories.
-r/--regex Supply makeztxt with a regular expression for bookmark generation. The argument must be a valid regex. This option can be given multiple times on the command line, each one adding a new regex.
-t/--title Specify the title of the generated zTXT database. The database title is stored within the database and is the name which will appear under Palm OS. The title is limited to 32 characters. If makeztxt is reading input from standard input this option is mandatory.
-V/--version Cause makeztxt to print out version information and exit.
-z/--compression Set the method of compression to be used. makeztxt supports two methods of compression. Method 1 allows for random access within a zTXT document and is the standard method. Method 2 gives 10-15% higher compression but requires that the entire document be decompressed before it can be read by the user. Default is method 1.
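The three -a/--adjust methods, and the way -b/--length overrides the computed average, can be pictured with the following Python sketch. It is an interpretation of the descriptions above and not the actual makeztxt code. In particular, it assumes that a stripped newline is replaced by a single space, that only lines containing text count toward the average, and that a -b value is compared the same way as the average. The function name adjust is hypothetical.

```python
def adjust(text, method=0, length=None):
    """Sketch of the newline handling for adjust methods 0, 1 and 2."""
    if method == 2:
        return text  # method 2 leaves the text unchanged
    lines = text.split("\n")
    if method == 0 and length is None:
        # Assumption: average over lines that contain text.
        sizes = [len(line) for line in lines if line.strip()]
        length = sum(sizes) / len(sizes) if sizes else 0
    pieces = []
    for index, line in enumerate(lines):
        pieces.append(line)
        if index == len(lines) - 1:
            break  # no newline follows the final piece
        has_text = bool(line.strip())
        if method == 1:
            join = has_text                        # any line with text
        else:
            join = has_text and len(line) > length  # longer than threshold
        # Assumption: a stripped newline becomes one space.
        pieces.append(" " if join else "\n")
    return "".join(pieces)


sample = (
    "This is a long line of text\n"
    "that continues the paragraph\n"
    "short\n"
    "\n"
    "Next para\n"
)
print(adjust(sample))
```

With the sample above, the two long lines are joined while the short final line of the paragraph keeps its newline, so the paragraph break survives. Passing a large length, as -b would, leaves every line untouched.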

5.2. Options used when deconstructing a zTXT database

Option Description
-d/--deconstruct This option tells makeztxt that you wish to deconstruct a zTXT database. It is required for this mode of operation.
-A/--annofile Specify the filename into which makeztxt will store any annotations extracted from the input zTXT file. If this option is not given, annotations will not be extracted.
-h/--help Display command line options and usage information.
-m/--markfile Specify the filename into which makeztxt will store any bookmarks extracted from the input zTXT file. If this option is not given, bookmarks will not be extracted.
-o/--output Specify the output file in which makeztxt will store the extracted text data. If this option is not given, makeztxt will generate a default filename by removing the extension from the input file name and replacing it with "txt". If makeztxt is reading input from standard input this option is mandatory.
-V/--version Cause makeztxt to print out version information and exit.
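The default output names described for -o in both modes amount to swapping the file extension. Here is a minimal Python sketch of that rule. The function name default_output_name is hypothetical, and how makeztxt treats an input name with no extension is not stated here, so appending the new extension in that case is an assumption.

```python
import os


def default_output_name(input_name, deconstruct=False):
    """Replace the input file's extension with "pdb", or "txt" with -d."""
    base, _old_extension = os.path.splitext(input_name)
    return base + (".txt" if deconstruct else ".pdb")


print(default_output_name("treas10.txt"))                 # creating
print(default_output_name("treas10.pdb", deconstruct=True))  # deconstructing
```

When input comes from standard input there is no name to work from, which is why -o is mandatory in that case.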

6. Compiling makeztxt (for great profit!)

makeztxt uses zlib v1.1.3 (http://www.info-zip.org/pub/infozip/zlib). You will need to have zlib compiled for your HOST machine. All Linux distributions, as well as most other Unices, come with zlib, though you may be lacking the zlib header files.

You should look in the Makefile to make sure the program names and paths are okay.

If you are running on Sun hardware, uncomment the PACK line in the Makefile; makeztxt will not work without it. If you are getting mysterious crashes, you might want to try this switch as well. However, if you are on an x86 system, you should not enable that flag.

If your system does not have GNU regex (Solaris, Cygwin, others) then uncomment the USEPOSIX line to cause makeztxt to use POSIX regex.

If you are compiling on a Windows system, or any system which makes a distinction between text and binary files, you'll need to uncomment the HAVEBINARYFLAG line in order to get valid output from makeztxt.

Lastly, you can uncomment STATICLIBS to statically link against zlib. This can be beneficial on Cygwin systems to cut down on the number of DLLs that need to be distributed.

Now run:

"make"

You should now have makeztxt.

If you're messing with the source, then maybe you want to help. If you have any problems, feel free to email me at johng@as.arizona.edu. Please use the latest code from the SVN repository if possible. It can be found at:

http://sf.net/projects/gutenpalm

If you would like to submit a bug report or a feature request, please make use of the facilities on Weasel's SourceForge project page. This allows for much easier management of bug and feature request tracking. It also ensures that your report is not forgotten about. The project page is at:

http://sf.net/projects/gutenpalm