 *******************************************************************
                             FAX TOOLKIT

         All Material Copyright (C) Andrew Margolis 1994-1995

         All Intellectual and other property rights reserved.  

     No part of this document or any accompanying software may be
 reproduced in any form or distributed by any means or stored in any
      database without the prior written consent of the author.

 *******************************************************************

This document fairly loosely describes the contents of the Fax
Toolkit in the CODE directory on this disk.  

As well as the source code, the executables QENCODE.EXE, ENCODE.EXE,
FAX.EXE, and DECODE.EXE are provided for readers who might want to
use the code but have no compiler.

PART 1 : Contents of the \CODE directory

The directory contains the source code for four stand-alone programs,
each in its separate subdirectory (together with the appropriate
precompiled executable DOS program).  The TIFF subroutines are common
to all.

IMPORTANT : Readers seeking to understand the source code should take
particular notice of the make files in each subdirectory which
document the code dependencies.

1.  Quick 1-D Encoding Program in \CODE\QENCODE

This program is a quick fax encoder which only generates 1-D normal
resolution files.  Any size file is supported.  Since it uses a half-
width font and doubles up on run lengths, it is at least 15% faster
than the previous version, and can be up to 900% faster on text with
lots of white space.  This is because it only encodes those portions
of lines containing text and doesn't try to generate a bit map of
each  line.  However, this technique cannot generate 2-D or T.6
files.

 qencode.c is the code for the exerciser program
 qencode.h is the header file common to all modules
 qaxify.c actually does the digitizing and encoding
 qaxisubs.c contains a number of coding subroutines
 qshiftin.c handles the writing of bits rather than bytes to disk
 font8x16.dat is the half size font data
 quff.dat is the font data contains the MH encoding tables
 qencode.mak is the make file for the software

2.  Encoding Program in \CODE\ENCODE

This program turns an ordinary text file into a fax file stored in
TIFF format.  Any size file is supported.  The default is for normal
resolution 1-D T.4 format, but 2-D T.4 and T.6 formats are also
supported, as is fine resolution (though we have no font for this and
it simply generates a squashed page.  It consists of the following
modules :

 encode.c is the code for the exerciser program
 encode.h is the header file common to all modules
 faxify.c actually does the digitizing and passes lines to the encoder
 faxisubs.c contains a number of subroutines that together comprise the encoder
 shiftin.c handles the writing of bits rather than bytes to disk
 font.dat is the font data
 huff.dat contains the MH and MR encoding tables
 encode.mak is the make file for the software

3.  Decoding Program in \CODE\DECODE

This program takes the tiff image from a fax file and outputs it to
screen or printer.  All formats are supported (T.4 1-D, T.4 2-D and
T6) in both normal and fine resolutions.  Screens supported include
CGA, EGA and VGA IBM screens, and printers include Hewlett-Packard
Lasers and Inkjets at 100 x 100 or 300 x 300 resolution, Canon CaPSL
printers such as laserbeams, both 9-pin and 24-pin Epson compatible
dot-matrix and inkjet printers, and also 48-nozzle Canon Bubblejets.
An option to dump a fax to uncompressed TIFF graphics format is
included.

 decode.c is the code for the exerciser program
 decode.h is the header file common to all modules
 unfaxify.c is the decoder, which redigitizes run-lengths from the image
 doline.c puts the run-lengths together in a line and works out what to do with them
 output.c outputs lines to whatever device is selected
 scales.c handles the scaling of the fax to the resolution of the device
 output.dat contains printer control codes
 unhuff.dat contains MH and MR decoding trees
 decode.mak is the make file for the software

4.  Faxing Program in \CODE\FAX

This is the fax software itself.  It will handle fax sending and
receiving on any type of fax modem.   It accepts any sort of image
transmission (1-D 2-D or T.6) and sends whatever sort of image it
finds in a TIFF file.  A diagnostic display of the fax modem dialogue
with progress reports is optionally displayed on the screen.

 fax.c is the code for the exerciser program
 fax.h is the header file common to all modules
 faxmodem.c checks the modem and puts it in fax mode
 faxin.c is the fax receiver dispatcher.
 faxin1.c receives faxes on class 1 modems
 faxin2.c receives faxes on class 2 modems
 faxin20.c receives faxes on class 2.0 modems
 faxout.c is the fax transmit dispatcher.
 faxout1.c transmits faxes on class 1 modems
 faxout2.c transmits faxes on class 2 modems
 faxout20.c transmits faxes on class 2.0 modems
 faxclas1.c contains class 1 routines to send and receive frames and pages
 faxecmrx.c contains class 1 routine for receiving error corrected faxes
 faxecmtx.c contains class 1 routines for transmitting error corrected faxes
 faxutils.c contains various utility routines for timing and string handling
 async.asm contains the low level asynchronous port ISR
 backward.dat contains a lookup table for reversing bits in a byte
 frames.h contains class 1 frame definitions
 fax.mak is the make file for the software

5.  TIFF Routines in \CODE\TIFF

These three files, included for TIFF support  are common to all the
earlier modules, so they are listed in a separate section here rather
than being duplicated three times.  Note that while our version of
TIFF could be regarded as a little eccentric, it is quite legal.

 tiff.h is the tiff header common to all the programs
 tifwrite.c is the tiff file creator and IFD writer
 tifread.c is the tiff file reader (not used in the encoder)

PART 2 : Languages and Software Tools

1. Operating environment

The exercisers for the modules were all targeted at the IBM PC
architecture and the DOS environment, for the simple reason that it
is accessible, clean, and relatively uncomplicated by the real-time
multitasking issues that more complex and capable operating systems
bring with them.  This enables us to concentrate on the fax aspects
of the code.  The code should port fairly well to other operating
environments.

2. ANSI C

The code is written (mostly) in standard ANSI C for maximum
portability. There wasn't much agonising over the choice of
language.  C is the only high- level language which is both widely
available and which also contains the necessary facilities for
manipulating bits within bytes.  Since fax imaging and communications
are essentially bit-oriented rather than byte- oriented processes, it
is simply not possibly to fulfil the requirements efficiently without
the ability to access and shift bits within bytes.  C++ was
considered, but ruled out for a number of reasons :

 i.  It is not as standardised as ANSI C and is not as universally
 available.  One of the main reasons why people may need to gain
 access to source code for fax operations is to implement them on
 non-standard non- PC platforms.  It is these very platforms that are
 the ones most likely to have no reliable C++ compiler available.
 Consequently, writing in C++ would be likely to make the code
 inaccessible to many people who might want to use it.

 ii.  One aim of a good C++ program is to hide the nature of any
 underlying processes.  A lot of the effort involved in constructing
 appropriate class hierarchies is dedicated to this end.  However,
 the aim of the code is to do the exact opposite and lay all
 underlying fax operations out as openly as possible.  It is perhaps
 easier to do this in a language which is not continually encouraging
 the programmer to conceal what is going on.

 iii.  A related but possibly stronger reason is that it doesn't seem
 to me to be possible to build an appropriate class hierarchy unless
 one already understands reasonably fully all the processes that go up
 to make up the complete program.  To try to teach the basics of fax
 programming using a properly structured C++ program would therefore
 require that a reader already possessed the very skills that you are
 trying to teach.  If they lacked those skills then the structure of
 the program would most likely get in the way of acquiring them.

 iv.  The final reason is that I'm a better C programmer than a C++
 programmer.

3. Assembler

While the algorithms used in the code can be coded in C for
portability, there are some low-level PC dependent input and output
routines where this is not an issue.  All such low-level code which
is written in assembler, for efficiency and because I am a believer
in using the tools most appropriate for the job at hand.

The majority of the assembler code is in the ASYNC.ASM module, which
is actually written entirely in assembler.  It contains an interrupt
service routine (ISR) for serial port access on an IBM PC together
with low-level routines to send and receive characters, get the
status of the interrupt send and receive buffers, and to initialize,
deinitialise and change the speed on the serial port.

Fax i/o is a fast real-time operation.  Writing in assembler means
that the code can be optimized for speed and also guaranteed to be
self- sufficient : for in an interrupt handler, you mustn't call any
external routines like operating system services or language
libraries which might not be re-entrant.  The usual disadvantages of
assembler like lack of portability, needing to know a little about
the way hardware works and having to write all the utility code
yourself are actually requirements for the task.

The other module that contains some assembler is the OUTPUT.C module,
which uses one assembler routine to access the video ram on the PC in
order to display a bitmap as fast as possible, and another to put the
DOS printer device into raw mode so that binary data for graphics can
be printed.  Again, if you have to bash to metal and write machine
dependent code then you may as well get out the assembler and do it
properly.

4. Tools

The main programming tools used for compiling and debugging were by
Borland, since Microsoft apparently no longer seem interested in
producing programming tools targeted at the DOS environment.

The code was compiled without errors both with Turbo C 2.0, which
generates 8086 compatible code, and also with Borland C++ version
4.0, which doesn't.  All warnings were turned on for both version. As
a check, the fax code was also compiled using Clarion's Topspeed C
version 3.10, also with no errors. In all cases, Borland's TASM was
used for the assembler routines.

Users of other compilers should have no problems with compiling the
code, though some minor differences are always to be expected.  The
syntax of the make file will almost certainly differ, but I have
included these anyway as they indicate the relationships and
dependencies between the various modules.

5. The nature of the source code

The code is essentially a one-person project.  While it has been
developing over a couple of years, source distribution was not so
something that I originally envisaged.  Large parts of it haven't
developed according to any of the standard models laid down by the
various software engineering luminaries, and this sometimes shows.
When wrestling with impenetrably worded standards recommendations and
imprecise specifications, it's always a temptation to resort to the
hack and the quick fix to try and get things working.  Like most
temptations, this is one that I succumb to more regularly than I like
to admit. The temptation to hack applies even more to programming fax
modems.

Though you can't condemn a modem by saying that it fails to behave as
described in its documentation when it doesn't have any, it is
extremely frustrating to have no way of telling whether the failure
of a program to perform to expectation is because there is a bug in
the program or because there is a fault in the hardware.  Even if
code works on some modems and not on others, it isn't possible to
know for certain that the modems that failed were ones which had
problems.  It is always possible that differently code would have
worked under all circumstances.

Despite these problems and with all its imperfections, the source
code included here is better written than some commonly available
source code libraries, and the programs themselves perform at least
as well at practical faxing as many commercially programs.  Even so,
the code is not guaranteed to be bug-free, and in the form I provide
it, it has not undergone the stricter and more rigorous development
and testing procedure that would make it suitable for
mission-critical applications.

6.  Key Code Fragments 

The full code in the subdirectories is documented reasonably well. 
The comments are mostly grouped at the start of each function, with
descriptions of the purpose and working of the code.  This makes it
easier to keep the comments up to date when the code changed.

What we do here (as in the printed text, but in rather more detail) is
to extract the key algorithms from the code provided and list those.
Most of the time, this involves providing code fragments that don't
appear anywhere on the disk in the exact form that they do here.
Anyone who has written software knows that in most programs, only a
few lines of the code actually do the important work.  The rest of
the code is concerned with support functions, checking for errors,
handling special conditions and so on.

My excuse for putting only the interesting bits in this text is that
it aims to help people with fax modem programming rather than
software engineering.  Of course, the real work involved in writing
bug-free programs that work properly lies precisely in getting the
support functions working properly and all the modules integrated
properly.  Algorithms that don't work are probably the easiest things
to spot, and even bad programs usually function most of the time.
Their deficiency lies in their unpredictability and their
unreliability, and where there are bugs in the code, they are usually
in the corners of the program rather than the centre.

7.  Hardware requirements

The code was written and tested on IBM PC compatible systems with
standard serial ports.  However, it should port fairly easily to any
hardware with a decent ANSI C compiler.  A small number of routines
written are written in Intel assembler, and would need to be
rewritten for other processors and architectures, but there are no
surprises here.  Nobody expects low-level communications routines to
be portable between different platforms.

While commercial PC fax software typically advertises itself as
requiring something like 2 megabytes of memory with a hard disk being
mandatory, the programs presented here will run on 256K floppy-disk
based systems quite easily.

There aren't any significant demands on the hardware in terms of
memory, as we handle images in terms of scan lines rather than as
complete bitmaps.  This makes the routines reasonably suitable for
use in circumstances where memory may well be at a premium, such as :

 i.   background fax applications
 ii.  small handheld or palmtop portables
 iii. providing integral fax functions to existing software

The disk space available should be sufficient to hold a fax image.
Typically, a single A4 sheet of paper takes between 20K and 40K as a
one- dimensional fax.  A disk of 256K (such as a RAM disk) is usually
adequate for faxes of at least 6 pages.  The average fax is around
2.1 pages long, including a cover page.

The main constraint on the speed of any system is that it must allow
system serial ports to handle bit-streamed input and output at speeds
of 19200 bps.  Additionally, software controlling class 1 modems must
be guaranteed access to the processor at least once every 50 ms
during fax negotiations.  Software using class 2 or 2.0 modems can be
much more relaxed about this, and can usually get away with ignoring
data from the modem for at least 500 ms, and sometimes longer.

8. Other machine portability issues

There are also two other non-ANSI C routines in the code.  The first
is the PC specific routine kbhit() which returns true if a key has
been pressed.  This is used for the dumb terminal mode in the
FAXMODEM.C, where the code allows manual configuration and
customization of a modem, and also to enable escape from waiting for
the phone to ring in all the FAXIN*.C modules.  The second is the
print routine, which handles printer output by writing to the stdprn
device.  Though widely implemented, this is not an ANSI supported
device.  There is no portable method in ANSI C for writing to
printers.

9. Modem portability

The code has been successfully tested with a range of fax modems as
follows :

Ŀ
 Manufacturer         Chipset       EIA classes supported 
Ĵ
 Sonix Volante                      Class 1               
 USR Sportster 14400                Class 1               
 Solwise V9624A       Cirrus Logic  Class 1               
 Hayes Optima 144                   Class 1               
 Dynalink 9624        Exar          Class 2               
 Racal MXF 9632       Sierra        Class 2               
 Dynalink 144E        Rockwell      Class 1 and Class 2   
 Supra V32bis         Rockwell      Class 1 and Class 2   
 Pace Microlin fx     Rockwell      Class 1 and Class 2   
 Cardinal 14.4        AT&T          Class 1 and Class 2   
 USR Courier                        Class 1 and Class 2.0 


The fax machine used for testing was NEFAX 570 with V.17, T.6 and ECM
support.  Not all functions performed properly on all modems, but
this was generally because of identifiable limitations of the devices
themselves.  All the modems tested were able to send and receive both
normal and fine resolution 1-D faxes with no problems.

PART 3 : The quick encoder

Our first program is a quick encoder, and all the relevant files can
be found in the subdirectory \CODE\QENCODE.  This program takes an
ordinary text file and turns it into a one-dimensional normal-
resolution T.4 fax image.

This is a two step process :

 A. Turning the text file into a bitmap (Digitizing)
 B. Compressing the bitmap using the T.4 specification (Encoding)

We'll look at what's involved in each of these steps in turn

A. Digitizing

1.  The size of the font

The key to digitizing text is to have a font of the correct size. The
dimensions of the fax are the critical figures here.  We assume that
the text file is designed to fit on a standard A4 page, and that it
can be printed out using standard printer defaults of 10 characters
per inch and horizontally, and 6 lines per inch vertically.

Since the official A4 size is 8.27" x 11.69", this means our text
file can (in theory) be formatted as fax pages containing 70 lines of
82 characters each.

We use these figures as the basis of our character digitization.  The
required height of the font is quite easy to work out.  The vertical
resolution (for a normal fax) is specified as 3.85 lines/mm. Since an
A4 is 297 mm long, it consists of 1143 scan lines, and as we have 70
text lines, each line of text consists of 16 scan lines.

The width of the font is only slightly more difficult to arrive at.
The T.4 specification for a fax is that it has 1728 dots on a 215 mm
line, Since an A4 page is 210 mm wide, it consists of 1688 dots, and
as we have 82 characters per line, each character is 20 dots wide.

The size of a fax font should therefore be 20 x 16.  There is some
official support for this fax character size, as the T.4
recommendation states that a 20 x 16 font should be use in the
optional mixed mode (E.8/T.4).

The dimensions we have just arrived at make no allowances for the
fact that T.4 states that only the middle 196.6 mm are guaranteed
reproducible.  This leaves us with the middle 1580 dots, with two
margins of 74 dots each.  As each character is 20 dots wide then we
can have only 79 characters on a line rather than 82.  And since the
guaranteed reproducible length of an A4 fax is only 281.46 mm, we can
only rely on 67 lines per page.  In practise this is less of a
restriction, as while all fax machines have a fixed width, most can
accept images of an indefinite length.

2.  Generating a font

The skill needed to designing a font is something I lack; I suspect
this is true of most people.  Good fonts are typically copyrighted,
and property rights are jealously guarded. However, an easily
accessible font with no restrictions on its use is the standard VGA
font found on virtually all PCs, which is 8 x 16 dots, with no
intercharacter gaps (in other words, a couple of characters like MM
side by side will run into each other).  On PCs a ninth blank bit is
usually added for the gap, but it is much more convenient for our
purposes to add an extra dot on each side of a character.  This makes
the font 10x16, and doubling this up to a 20 x 16 font on the fly is
a trivial problem.  (It would be nice to add a modification at some
point to make the line and block characters from 176 to 223 join up,
but as they aren't standard ASCII it's something of a luxury.)

The following fragment of assembler code shows how to obtain the
location of the VGA font on a standard PC.  It can either be used
directly, or as in the fragment, dumped to a 4K memory array for
easier accessibility.

    push    ds
    push    es
    push    si
    mov     ax,1130h        ; function 11h subfunction 30h
    mov     bh,6            ; bh=6 for 8x16 font pointer
    int     10h             ; returned in es:bp
    push    es
    push    ds
    pop     es
    pop     ds
    mov     si,bp
    mov     di,offset font  ; store the array at the address font
    mov     cx,256*16/2     ; size (in words)
    rep     movsw           ; copy the font array
    pop     si
    pop     es
    pop     ds

Alternatively, an 8 x 16 font data is found in font8x16.dat. This
font has been specially modified for fax.  The shapes of many of the
letters have been altered for better transmission and reproduction.
Even so, the font isn't marvellous; it is comparable to the "Ugly"
font used by the Ghostscript Postscript interpreter.

3.  Digitizing the text

The great advantage of an 8 x 16 font for digitizing is its
convenient size.  Finding the dot pattern for any character is simply
a question of using the value of a character as one index into a
two-dimensional font array, with the font row being the other index.

The following code fragment reads a line of text from a file to
text_line, and stores a digitized version of the text at bit_image.
It consists of two loops.  The outer loop processes the entire line
once for each row of the font, with the row being the control
variable.

Meanwhile, the inner loop step through each character in the line,
with the character position being the control variable. The whole
purpose of the loop is simply to store the value in the
two-dimensional 256 x 16 font table indexed by the character and the
row in bit_image. Non- printable characters are ignored in this
fragment (though the code in the subdirectory does expands tabs).

 fgets (text_line,79,infile) ;
 for (row=0 ; row<16 ; row++)
 {
    for (i=0 ; i<strlen(text_line) ; i++) if (text_line[i]>31)
    bit_image[image_offset++]=fonts[text_line[i]][row] ;
    bit_image[image_offset++]=EOLFLAG ;
 }

Each of the 16 scan lines we generate ends with a unique EOLFLAG
sentinel, which can be any configuration of bits not occurring in the
fax data (the code in the subdirectory uses 0x01).  The use of a
unique end of line code to end scan lines, coupled with the use of a
byte-wide font, is what gives this encoder its speed.

This is the core of the code that you'll find in QAXIFY.C, though the
version there has a few extra tweaks to ignore pad out tabs (in
columns of 8) and to ignore extraneous control codes.

B. Compressing

1.  Encoding the bitmap (i) Identifying run lengths

There are three steps in encoding the image.  The first step is to go
through each of the scan lines byte by byte (or more accurately, octet
by octet) identifying the run lengths of black and white bits (which
are always going to be 1 and 0 respectively if a screen-based font
was used for the digitization step).  While a lookup table which
returned run lengths for each byte could be designed, the need to
cope with varying numbers of runs octets as different as 00000000 and
10101010 would make the enterprise quite complicated, and it isn't
something that I've looked into in detail.

The most understandable way of identifying the run lengths is to pick
each bit out of the image individually.  No matter what language is
being used, getting at the individual bits is less that
straightforward on any CPU designed to handle information in bytes,
and generally requires some familiarity with Boolean algebra.
Knowledge of the logical (bitwise) consequences of arithmetic
operations is also useful.

Undoubtedly, the quickest method of counting runs is to write a
routine in assembler to rotate each octet left 8 times; the carry
status will reflect the colour of each bit.  Regrettably, no high
level language in common use gives access to the carry flag, and so
alternative methods have to be sought.

Pseudo-carry flags can be used; a straightforward method would be to
cast each octet into a 16-bit integer and then double it.  If the
result was less than 256 then the leftmost bit was white, otherwise
it was black.

Alternatively, casting an octet as a signed character enables the
leftmost bit to be checked by looking at the sign (using
twos-complement arithmetic).  A negative value indicates a black bit
and a positive value indicates a white bit.

Another strategy possible with a language like C is to place the byte
in a union of a char and 8 bit fields, which enables us to look at
each bit in turn. (Note that we can't do this in a loop as bit fields
can't be indexed, and that the ordering of bit field is not something
that is necessarily fixed).

However, the traditional method, which is feasible using almost any
language, is to get at the bits in a byte by means of a logical AND.

While shifting the octet and ANDing with a constant is one option,
the most elegant method would be to define char c=0x80, and use it to
examine each bit from an octet in turn by ANDing it with c and then
shifting c right after each iteration.  This means that the value we
AND our octet with can double up as the control variable in a loop
via for (c=0x80 ; c>0 ; c>>=1). Coding this iteratively would of
course execute slightly faster, but we use a loop for convenience.

Like most repetitive tasks, it is fairly easy to code if the
underlying data structures have been simplified.  In the case of
digitizing text, the use of a font which was 8 bits wide made life
simple.  In the case of identifying runs in a bilevel image, the
simplification comes the way the colour variable is defined.  In our
code, we use the value 0x00 for white and 0xff for black.

For each iteration of the loop, we perform a logical AND of our
control variable with the octet we are looking at and compare the
result with a logical AND of the control variable and the colour.  If
the two are the same, then the run length is increased by one; if the
two are different, then we are at the start of a new run.

At the start of each new run, we check the current colour; if it was
white then we are about to start a black run but if it was black then
we are about to start a white run.  Our two subroutines nextwhite and
nextblack could just as easily have been called lastblack and
lastwhite.

Before dissecting each octet, we check that it isn't an EOLFLAG; if
it is, we end the scan line via our endaline routine.  We also begin
and end each octet with an extra white dot.  This serves to turn our
original 8 x 16 font into a 10 x 16 font.  Note that the fragment
below makes use of the C ?: ternary operator, mostly for cosmetic
reasons.  I find that replacing ?: with the equivalent if...else
makes the code look more complicated.

 {
     octet=bit
     for (i=0 ; i<image_offset ; I++) image[i] ;
     if (octet==EOLFLAG) endaline ;
     else
     {
         color ? nextwhite : run_length++ ;
         for (c=0x80 ; c ; c>>=1)
         {
             if ((color&c)==(octet&c)) run_length++ ;
             else color ? nextwhite : nextblack ;
         }
         color ? nextwhite : run_length++ ;
     }
 }

2.  Encoding the bitmap (ii) Coding run lengths

The second step in encoding the bit map is to turn each run length
into its T.4 Huffman code equivalent.  Before we do this, each run
length is doubled.  This completes the final step in the
transformation of our 8 x 16 screen font to a 20 x 16 fax font.
Adding an extra white dot on either side effectively made the font 10
x 16, and doubling a run-length based on a 10 x 16 font gives the
same result as coding with a 20 x 16 font.

We also keep track of the remaining number of dots on each line by
subtracting each run length from a counter which we initialize to 1728
at the start of each line.  As we'll see, this is needed in order to
handle the EOLFLAG case in the last fragment correctly.

The code in the fragment above, and also the code in the subdirectory,
has two separate routines for coding black and white runs.  It is
possible to have one changecolor routine which starts off with a
check to see what the current colour is.  Alternatively, the
changecolor routine could be defined as a function pointer to either
nextwhite or nextblack.

The two routines which code run lengths directly are complementary,
so we'll look in detail only at one of them.  The core of the
nextblack routine which ends each white run is the following code;
all it does is use the run length as an index to a table of
run-length codes, and pass the address of the correct entry to the
subroutine shiftin.  As its name implies, this shifts the
variable-length bit codes from the modified Huffman table into a
coherent sequence.

Note that we check for runs greater than 64 and call shiftin with the
makeup code if needed.  If we wanted to handle pages with scan lines
greater than the standard 1728 dots, we'd need make sure that our
table of makeup codes includes the T.4 additions which extend to 2560
dots.

 run_length *= 2 ;
 dots_left -= run_length ;
 if (run_length>63)
 {
     shiftin (&whitemakeup[run_length/64)-1]) ;
     run_length%=64 ;
 }
 shiftin (&whiterun[run_length]) ;
 color=BLACK ;
 run_length=1 ;

The nextwhite routine is fundamentally the same routine; it just looks
up the code in the black run length tables and flips color to WHITE.
Both routines return with run_length set to 1, as they are called
once a dot of a different colour has been located in bit_image.

3.  Ending a scan line

The endaline routine is called when we hit the EOLFLAG sentinel.  If
color is BLACK it first calls nextwhite to end the black run; then it
calls nextblack with run_length set to dots_left divided by 2 to fill
up the trailing white space on the line.  Finally, it calls shiftin
with the address of the T.4 EOL code word, sets color to white, and
dots_left to 1728.

The run_length isn't set to 0 at the start of each line; instead, it
is set reflect the fact that there is a left-hand margin on the
page.  A full line of 79 characters uses up only 1580 of the 1728
dots on a scan line, leaving space for (say) two margins of 74 dots
on each side.  If that is the way you'd like your margins to be
setup, then the endaline routine should set run_length to 37 (the run
will be doubled up to 74 by nextblack when it is encoded).

4.  Constructing an array for the Huffman codes Our technique of
using a run length as an index to an array of codes is basically
quite sound.  We know that no code is longer than 12 bits (the length
of an EOL) and that no code is shorter than 2 bits, so there is no
problem with representing the codes as 16-bit short integers.  But it
is not possible to construct an array whose elements consist simply
of integer values based on the bits in the table.  If you look again
at the black run lengths from 0 to 10, it is clear that
variable-length bit codes can't just be padded out with zeros and
treated as numbers.  The codes for a black run of 2 (11), a run of 4
(011), a run of 5 (0011) and a runs of 7 (00011) are quite different,
but all would be represented as integers with the value 3 if this
were to be done.

Ŀ
run  bit code   integer value 
Ĵ
0    0000110111 55            
1    010        02            
2    11         03            
3    10         02            
4    011        03            
5    0011       03            
6    0010       02            
7    00011      03            
8    000101     05            
9    000100     04            
10   0000100    04            
 .                            
 .                            
 .                            


Since the codes are of variable length some method of indicating the
number of bits in each code is essential. There are a number of different
way of doing this.  One method I've seen used is to insert a single 1 bit
at the start of each code, and build an array which starts as follows :

 short int blackrun [] =
 {
     0x437,  /* 10000110111 */
     0x00a,  /* 1010 */
     0x007,  /* 111 */
     0x006,  /* 110 */
     0x00b,  /* 1011 */
     0x013,  /* 10011 */
     0x012,  /* 10010 */
     0x023,  /* 100011 */
     0x045,  /* 1000101 */
     0x044,  /* 1000100 */
     0x084,  /* 10000100 */
         .
         .
         .

The shiftin routine then needs to isolate each bit in turn and
discard all bits up to and including the first 1; the remaining bits
in each code are the ones that count.  The main problem with this
approach is that it requires 16 bit shifts for all codes irrespective
of the number of bits that they might contain, and the shortest codes
will actually take the longest to identify.  Since the whole basis of
the Huffman coding technique is that the shortest codes are the ones
that occur most frequently, this method is extremely inefficient in
terms of speed.

The best solution is to include the length of each code in the array
as a separate item from its integer representation.  This means that
each element our array of T.4 codes will consists of the integer
representation of the bit code itself together with the number of
bits that the code contains.  This is handled quite simply using a
structure such as the following :

 struct code
 {
  const char count ;              /* number of bits in the code   */
  const unsigned short int bits ; /* bit code as a 16-bit integer */
 } ;

In order to avoid shifting the integer to discard the padding bits, it
also makes sense to left-justify the bits in each code before
representing it as an integer.  The array of T.4 codes for the black runs
constructed according to these conventions begins as follows

 struct code blackrun [] =
 {
     {10,0x0dc0},    /* 0000110111000000 */
     {3,0x4000},     /* 0100000000000000 */
     {2,0xc000},     /* 1100000000000000 */
     {2,0x8000},     /* 1000000000000000 */
     {3,0x6000},     /* 0110000000000000 */
     {4,0x3000},     /* 0011000000000000 */
     {4,0x2000},     /* 0010000000000000 */
     {5,0x1800},     /* 0001100000000000 */
     {6,0x1400},     /* 0001010000000000 */
     {6,0x1000},     /* 0001000000000000 */
     {7,0x0800},     /* 0000100000000000 */
                     /* carry on with the rest of the codes .... */

5.  Shifting in the Huffman codes

The last of the key techniques in the quick encoder is the method for
combining the various codes from the T.4 one-dimensional coding
table.

This is the shiftin routine used above, to which we pass the offset
of the code structure we are interested in.  The subroutine makes a
local copy of both the count of bits and of the padded code word to
be shifted.

We want to store the coded run-lengths consecutively in a character
array at fax.

The key to the workings of this routine is to keep track of the number
of unused bits in each octet being composed (at fax[offset]).  We do
this in the sample using a static variable called spare which is
initialized to 8 at the start of each new octet.  Every time spare is
decremented to 0, we store the octet we have just finished,
re-initialize to 8, and go on to the next octet.

We test the successive in the code word by ANDing with 0x8000, and
shifting the code word left between iterations.  If the result was 0,
we reset the least significant bit of our target to 0 by ANDing with
0xfe (11111110); otherwise we set the same bit to 1 by ORing with
0x01 (00000001).  Again, we shift the target octet left between
iterations; which means that the least significant bit in each octet
is promoted to the most significant bit once spare is decremented to
0.

 count = code->count ;
 codebits = code->bits ;
 static int spare ;
 for (;count;count--)
 {
     if (codebits&0x8000) fax[offset]|=1 ;
     else fax[offset]&=0xfe ;
     codebits<<=1 ;
     if (!(--spare))
     {
         ++offset ;
         spare=8 ;
     }
 else fax[offset]<<=1 ;
 }

We can check to see if the address passed to shiftin is the same as
the address of the EOL code word structure.  The code is slightly
different if we are shifting in an EOL code, as we can take the
opportunity of ending each EOL on a byte boundary. As well as being
preferred by some TIFF readers and class 2 fax modems, this also
enables each scan line to be written to disk as it is generated
(which would be after each call to endaline).  This means that the
memory overhead of the quick encoder is kept small.

We pad out any spare bits in the current octet at fax[offset] simply
by shifting left; then we reinitialize spare to 8, and place 0x0
followed by 0x01 at fax[offset].

 {
     if (spare!=8)
     {
         if (spare!=1) fax[offset]<<=(spare-1) ;
         offset++ ;
         spare=8 ;
     }
     fax[offset++]=0x0 ;
     fax[offset++]=0x1 ;
 }

PART 3 : The full encoder

Our next program is a full encoder, and all the relevant files can be
found in the directory \CODE\ENCODE.  This program takes an ordinary
text file and turns it into either a one-dimensionally coded T.4 fax
image, or a two-dimensionally coded T.4 fax image, or a T.6 fax
image; all these can be either in normal or in fine resolution.  This
means that any input text file can be converted into one of six
possible images.

The full encoder is a larger and slower program than the quick
encoder.

The size is to be expected; this program does a lot more than the
last one.  The speed difference does need some explaining though.
When the same one-dimensional normal resolution T.4 image is
generated with each program in turn, the full encoder is always going
to come in last.

In fact, the quick encoder is always at least 15% faster than the
full encoder, and can be up to 900% faster on text with lots of blank
lines or trailing white space at the ends of lines.  This is because
it only encodes those portions of lines containing text and doesn't
try to generate a bit map of each line.  If you refer back to the
commented fragments on the quick encoder, you'll see that as soon as
the digitizer hits the end of a line, it stuffs an end-of-line
sentinel in the bit map; and as soon as the encoder comes across an
end-of-line sentinel, it works out the remaining number of dots in
the line and simply goes off and encodes that as a white run before
adding an EOL code word.

Unfortunately, this technique cannot be used to generate two-
dimensionally coded images using either the T.4 or the T.6 methods.
This is because two-dimensional coding requires each scan line to be
compared with the previous one.  Therefore every line must be coded
in its entirety, and stopping at the last printable character is not
a viable option.  All the white space at the end of a line (as well
as all the margins) must be completely filled in the digitized
bitmap.

1.  The size of the font

The need for a two-dimensional coder to have access to a fully
digitized version of the previous scan line also means that the trick
used by the quick encoder, of using an 8 x 16 font, adding a extra
element on each side to get 10 x 16 and then doubling all the run
lengths to get 20 x 16, will not work either.  That technique makes
generating a full scan line impossible.

The font.dat file in the \CODE\ENCODE directory is a double width
version of the same font as we used in the quick encoder.  It is
possible to use the 8 x 16 font and stretch it on the fly, but it
isn't a trivial operation, and having a 16 x 16 font ready makes more
sense.  Stretching the font out to 20 x 16 is best done not by adding
bits to each side of every character as it is digitized, but by
stretching out every alternate character to occupy three octets
rather than two.

Our implementation of fine resolution uses exactly the same font as
does normal resolution, with the result that the text appears as
squashed up half-height.  Ideally, the availability of a proper fine
resolution font 32 rows high would enable digitization of a
fine-resolution image using a different value for the height of a
font.  In practise, there isn't a lot of point in doubling up the
height of the existing font, as the fax could take twice as long to
send with no extra benefit in terms of legibility.

2.  Digitizing the text

The following fragment shows how this is done.  Note that we start off
by initialising the entire bitmap for one text line to white space 0
bits; the magic numbers here are 1728 for the bits in each scan line,
16 for the font height (giving 16 scan lines per text line) and 8 for
the bits per byte.  Instead of using an end-of-line sentinel, we
digitize each scan line using a copy of the offset of start, so that
when we reach the end of the text line we can simply encode the next
scan line start from 1728/8 bytes further on.  This way, we ensure
each scan line is padded out with white space at the end.  Similarly,
we start off the line by positioning at the end of the left-hand
margin.

 for (i=0 ; i<(1728*16/8) ; i++) bit_image[i]=0 ;
 image_offset=0 ;
 fgets (text_line,79,infile) ;
 for (row=0 ; row<16 ; row++)
 {
     temp_offset=image_offset+(MARGIN/8) ;
     for (i=0 ; i<strlen(text_line) ; i++) if (text_line[i]>31)
     {
         if (i&1)
         {
             bit_image[temp_offset++]=fonts[text_line[i]][row*2] ;
             bit_image[temp_offset++]=fonts[text_line[i]][(row*2)+1] ;
         }
         else
         {
             c=fonts[text_line[i]][row*2] ;
             bit_image[temp_offset++]|=(c>>4) ;
             bit_image[temp_offset]|=(c<<4) ;
             c=fonts[text_line[i]][(row*2)+1] ;
             bit_image[temp_offset++]|=(c>>4) ;
             bit_image[temp_offset++]|=(c<<4) ;
         }
     }
     image_offset+=(1728/8)
 }

If you don't think well in Boolean terms, the most difficult part of
the above fragment is the code for expanding the font data for the
alternate characters into three octets.  Bear in mind that the data
space is initialized to white space, and that the explanation below
should be read in conjunction with the code.

First we make a copy of the first half of the font data.  Apart from
the speed saving in not having to index the font array four times, I
found that the code was illegible if this wasn't done.  Then we shift
the top four bits into the bottom four bits of the first of our three
data bytes.

Next, we shift the bottom four bits of the first half of the font
data into the second of our three data bytes, and then pick up the
next half of the font data.  The top four bits of this go into the
bottom four bits of the second data byte, with the bottom four bits
going into the top half of the third data byte.

The bottom half of the third data byte, like the top half of the
first, is left as white space.  Since both the previous character and
the next one are digitized as two bytes without any additional
padding, we are coding two character widths into five bytes; which
gives us our 20 bits for one character.

3.  Picking the right encoder

The next step is to pick the right encoder for each line.  If we are
doing T.6 coding, we always use the modified read encoder mrcoder.  If
we aren't doing two-dimensional T.4 coding, we always use the one-
dimensional modified Huffman encoder mhcoder.

If we are doing two-dimensional T.4 coding, we start by initialising
k the default value specified in the CCITT/ITU-T recommendation, and
decrement it after each line has been coded. When it reaches 0 we set
it to the default value again.  Whenever k is at the default, we code
the line one-dimensionally with mhcoder; for all other values of k we
code the line two-dimensionally using mrcoder.

 for (i=0 ; i<(1728*16/8) ; )
 {
     endaline ;
     thisline=&(bit_image[i]) ;
     if ((T6)||((T4_2D)&&(k!=default_k))) mrcoder ;
     else mhcoder ;
     i+=(1728/8) ;
     ifd->ImageLength.value++ ;
 }

For each iteration of the loop, we add 1728/8 to the offset in
bit_image and save the address as thisline; it may be needed by the
two-dimensional coding routine.  If we are doing either T.6 or T.4
two-dimensional coding, the endaline routine with which we begin
includes the following line :

 memcpy (lastline,thisline,MAXDOTS/8) ;

which copies the current scan line for use as the reference line.
Unless we are doing T.6 coding, endaline also places the EOL code
word in the fax.

It's also worth noting that while the quick encoder handled blank
scan lines at the top and bottom by inserting extra EOL sentinels in
the digitized bitmap, the full encoder needs a special routine for
this.

When using two-dimensional coding, the way a blank line is handled
depends on the contents of the previous line.

4.  One-dimensional encoding

One-dimensional coding is in fact simpler than the version included
with the quick encoder, as we don't need to worry about EOLFLAG
sentinels.

Once again, we use the value 0x00 for white and 0xff for black.  The
nextwhite and nextblack routines, with their associated tables of T.4
Huffman code words, are almost identical to the quick encoder
versions.

The main difference is that the run-lengths aren't doubled, and that
there is no need to keep track of the total number of bits in each
line, since we know that all 1728 will be present.

 for (i=0 ; i<1728/8 ; i++)
 {
     octet=thisline[i] ;
     for (c=0x80 ; c ; c>>=1)
     {
         if ((color&c)==(octet&c)) run_length++ ;
         else color ? nextwhite (T4) : nextblack (T4) ;
     }
 }
 if (run_length) color ? nextwhite (T4) : nextblack (T4) ;

The last line takes care of the final run length of each scan line.

5.  Two-dimensional encoding

Two-dimensional coding requires extra entries in the data tables, for
shifting in tag bits and the various two-dimensional code words.  The
HUFF.DAT file includes these additions after the same one-dimensional
code words as the quick encoder used.    The same method of including
the number of bits in the code word followed by the code itself as a
left- justified short integer is applied.

While the two-dimensional coding method certainly looks more
complicated than on-dimensional coding, it is actually a fairly
mechanical process, and follows the method outlines in the
CCITT/ITU-T flowchart 7/T.4.  The key to this is the accurate
identification of the key markers a0, a1, a2, b1 and b2.

The two-dimensional coder has access to both the current and the
previous scan line in the two arrays pointed at by thisline and
lastline, which we have already seen being set up.  These are of
course arrays of bits, while the markers a0, a1, a2, b1 and b2 are
all stored as integers.  The following fragment, which identifies the
colour of a particular bit in the current line identified by the
integer a0, is typical of the way that the individual bits in a line
can be accessed.

 i=(a0/8) ;
 this_bit=(0x80>>(a0%8)) ;
 if (thisline[i]&this_bit) color=BLACK ; else color=WHITE ;

The integer a0 is divided by 8. The result is the index for the byte
which contains the bit we want, and any remainder is used to shift
the mask 10000000 stored in thisbit to the right.  The Boolean
expression thisline[i]&this_bit is true if the bit is 1 (black) or
false if the bit is 0 (white).

Before the two-dimensional coder starts, the marker a0 is set to the
imaginary position just before the start of the line.  Both the colour
of this imaginary dot stored in color and the colour of the same bit
on the reference line stored in ref_color are set to be WHITE.  The
normal coding definitions of  WHITE as 0 and BLACK as 0xff apply.

The coder is basically a loop which relies for its functioning on the
correct setting for the marker a0; once this reaches 1728 the loop
terminates.  (For clarity, a number of checks to stop the various
markers running over the end of line have been omitted from this
fragment).

The first task of the coder is the identification of both the colour
of the a0  bit on the current line, and the colour of the
corresponding bit on the reference line.  Next, the value of the
first element of the opposite colour on the same line is saved as a1,
and the first changing element of the opposite colour on the
reference line is saved as b1.

You'll notice that this last step can require two attempts, as the
first changing element on the reference line isn't always of the
opposite colour to a0.  Finally, the next changing element after b1
on the reference line is saved as b2.

(You should read up on the T.4 specification for two-dimensional
coding if any of this is unclear.) Once the key values have been
identified, two-dimensional coding is simple.  If b2 is less than a1,
we shift in the code for pass mode, set a0 to the value of b2 and
start the loop again.

If the difference between b1 and a1 is no greater than 3, we shift in
the correct code for vertical mode.  This is taken from an array of
the possible codes, indexed by the expression b1-a1+3]. We then set
a0 to the value of a1 and start the loop again.

If neither pass mode nor vertical mode applies, we shift in the code
for horizontal mode.  We then identify the next changing element
after a1 on the current line as a2, and (using the colour of a0)
shift in the normal T4 code word for the run a0a1, followed by the
code word for the run a1a2 of the opposite colour.  After setting a0
to the value of a2 we start the loop again.

 a0=-1 ;
 color=WHITE ;
 ref_color=WHITE ;
 while (a0<1728)
 {
 if (a0!=-1)
     {
     i=(a0/8) ;
     this_bit=(0x80>>(a0%8)) ;
     if (thisline[i]&this_bit) color=BLACK ;
     else color=WHITE ;
     if (lastline[i]&this_bit) ref_color=BLACK ;
     else ref_color=WHITE ;
     }
 for (a1=a0+1;a1<1728;a1++)
     {
     i=(a1/8) ;
     this_bit=(0x80>>(a1%8)) ;
     if ((thisline[i]&this_bit)!=(color&this_bit)) break ;
     }
 for (b1=a0+1;b1<1728;b1++)
     {
     i=(b1/8) ;
     this_bit=(0x80>>(b1%8)) ;
     if ((lastline[i]&this_bit)!=(ref_color&this_bit)) break ;
     }
 if (ref_color!=color) for (b1++;b1<1728;b1++)
     {
     i=(b1/8) ;
     this_bit=(0x80>>(b1%8)) ;
     if ((lastline[i]&this_bit)!=(color&this_bit)) break ;
     }
 for (b2=b1+1;b2<1728;b2++)
     {
     i=(b2/8) ;
     this_bit=(0x80>>(b2%8)) ;
     if ((lastline[i]&this_bit)==(color&this_bit)) break ;
     }
 if (b2<a1)                              /* check for pass mode */
     {
     shiftin (&passmode,T4) ;
     a0=b2 ;
     continue ;
     }
 if (((b1-a1)<=3)&&((a1-b1)<=3)) /* check for vertical mode */
     {
     shiftin (&vertical_mode[b1-a1+3],T4) ;
     a0=a1 ;
     continue ;
     }
 for (a2=a1+1;a2<1728;a2++)          /* we must be in horizontal mode */
     {
     i=(a2/8) ;
     this_bit=(0x80>>(a2%8)) ;
     if ((thisline[i]&this_bit)==(color&this_bit)) break ;
     }
 shiftin (&horizontal_mode,T4) ;
 run_length=(a1-a0) ;
 if (a0==(-1)) run_length-- ;
 color ? nextwhite (T4) : nextblack (T4) ;
 run_length=(a2-a1) ;
 color ? nextwhite (T4) : nextblack (T4) ;
 a0=a2 ;
 }

6.  Shifting in the Huffman codes

The routine for shifting in the variable length bitcodes is basically
the same as that used by the quick encoder.  There are two
significant differences.

The first is that the routine to shiftin in and byte-align EOL codes
also takes care of decrementing the k parameter needed for T.4
two-dimensional encoding; the tag bit indicating the coding used for
the next line is also inserted at this point.

The second difference from the shiftin routine used by the quick
encoder is that the full encoder writes each octet to the output file
as it is completed instead of placing it in a coded scan line which
is written to disk once the line is complete.  The reason why the
full encoder has to do this is because it also performs T.6
two-dimensional coding, which doesn't use any end of line markers.
This means that it is not possible to byte-align T.6 scan lines, and
therefore it isn't practical to write out lines at a time.

The standard macro putc is used to write octets to the output file;
since this does its own disk buffering, the result of not writing
lines at a time doesn't have any serious speed impact on the code. It
does make a difference to the modularity of the software, as the
shiftin module needs access to the output file handle in order to be
able to write octets.

The version of shiftin used by the full encoder also needs to be able
to cater for the special case where the output file ends in the
middle of an octet.  The code in the subdirectory manages this by
calling shiftin with a null pointer, whereupon the octet currently
being composed is padded out with 0 bits and written to disk.  This
use of a null pointer to handle a special case is possibly a
deviation from what is generally recognised to be best programming
practise.

PART 4 : The TIFF modules

There are two TIFF-specific modules needed by the suite of fax
programs in all the other subdirectories.  You should review the Aldus
TIFF specification if you aren't sure what the TIFF file format is
and what IFD entries are, as the descriptions given here of the TIFF
portions of our fax suite assume a familiarity with concepts such as
IFDs.

The aim of these two modules is to provide all the necessary general-
purpose TIFF-related services that might be needed by other
functions.

We therefore attempt to isolate as much as possible of the
TIFF-specific code from the fax code proper.  The header and IFDs
that are included with images in a TIFF file are regarded as wrappers
whose sole purpose is to describe and isolate a fax image.  Some of
the paraphernalia that goes along with this doesn't concern us, but a
number of other elements of a TIFF most definitely are of interest.
The programming interface between these modules and the other parts
of our fax suite needs to be designed to allow us the luxury of
ignoring the tedious details involved in TIFF programming, together
with the freedom to alter those portions of the image description
when it suits us.

Designing an interface

There are two ways of approaching this task.  We could either design
a custom interface as a means of passing information between the TIFF
modules and the fax functions, or else we could allow the fax
functions access to the pre-existing TIFF image descriptors.

There is little doubt that according to conventional programming
wisdom, the former option is the better one.  Designing a special
interface, in theory at least, permits easy adaptation of the program
for other image formats, while prohibiting direct access to TIFF
field entries ought to result in fewer bugs.   On the other hand, it
must be remembered that our fax modules are based on the storage and
retrieval of raw fax images, and any theoretical advantage in
permitting use of other file format apart from TIFF is completely
negated by the fact that there aren't any other public file formats
that allow raw fax images to be included.

On the same basis, it could also be argued that assignment of the
actual ownership of an IFD to any one module is an arbitrary decision
in the first place.  The fact is that a TIFF IFD is a complex
structure which can't easily be separated into parts.  While
following and creating the linked list of IFDs in the file are the
responsibilities of reader and writer modules respectively,
validating the content of the entries in each IFD is a task that
could just as easily be assigned to the fax modules themselves.

Our two TIFF modules show how each of these different approaches work
out in practise.  While all the functions take a file pointer, the
TIFF reader is also passed a structure pointer to a custom IMAGE
descriptor of our own devising, which it has to fill in. In contrast,
the TIFF writer is called only with a file pointer, and returns a
pointer to an IFD structure instead.

1.  The TIFF reader

Because the TIFF reader creates its own image descriptor, it is
probably the more complex of the two modules in terms of code; but
since it doesn't require any knowledge of TIFF, it may well be
simpler conceptually.  The thinking behind the design is that a
single function should be provided which, when called with a file
pointer, will return all the information we need in order to fax an
image from the file. The information which we need in order to
accomplish this falls into four broad categories :

a)  Those items which describe the location of the image data.  There
are two of these ; the offset of the start of the image in the file,
and its length in bytes.

b)  Items which convey a logical description of the image and enable
us to set various options in the fax negotiation phase.  These
include variables such as the length of image, its resolution, and
the type of coding used in its construction.  If non-standard widths
are to be supported, then the width of the fax must also be included.

c)  We need to know whether a particular image is to be followed by
another one or not; this information is necessary for sending the
correct post-message command.

d)  Finally, we have those TIFF-specific items such as whether a file
contains duplicate images in different formats, and the ordering of
the bits within a byte.  These items may have no equivalents when
using less flexible graphics file format.

All this information is bundled up into an IMAGE structure, the
address of which is passed to the tiffread_ifd function together with
the file pointer.  The prototype for the function is

int tiffread_ifd (FILE *,struct IMAGE *) ;

and it returns 0 if the file could not be read.  For successful calls
all information relating to the image read are returned in the IMAGE
structure.  Note that detection of a non-faxable image is considered
to be a successful call; we just discard any such image and go on to
the next one.  This is perhaps not really in the spirit of TIFF; any
deviation from the restricted set of options we permit in a TIFF file
means that the file is considered non-faxable.

The structure definition is as follows :

 struct IMAGE {
    unsigned long int offset ;
    unsigned long int bytecount ;
    unsigned short int width ;
    unsigned short int length ;
    unsigned char options ;
    unsigned long int ifd ;
    unsigned short int page ;
    unsigned char bitorder ; } ;

The location of the image data is returned in the first two members.

Reading an image is simply a matter of doing an fseek to the correct
place in the file as passed in IMAGE.offset and reading the number of
characters passed in IMAGE.bytecount.

The logical description of the image is contained in the next three
members.   If we were happy with default values none of these would
be needed for sending a fax, as a normal resolution one-dimensional
A4 page is guaranteed to be acceptable to all fax machines.  The
width and length of the fax are really provided more for completeness
rather than out of necessity.  While few faxes correspond exactly to
A4 length (since they start from an A4 original and add a header)
this doesn't usually matter, since virtually all faxes accept pages
of unlimited length.  We also don't need to know the width unless we
want to make use of non-standard fax dimensions; 1728 dots can
otherwise be assumed.

In contrast, IMAGE.options is much more useful.  It contains the
CCITT/ITU options used in encoding the image, and is bit-mapped as
follows :

 8 7 6 5 4 3 2 1
 |           | |
 |           | --- set for 2-D coding
 |           ----- set for T.6 coding
 ----------------- set for fine resolution

The reason for this member of the structure being bit-mapped is that
the resolution and the coding are independent parameters, which
translate directly to T.30 bit fields.  (The unused bits allow for
expansion to include any unsupported options, such as superfine and
inch based resolutions, or uncompressed mode when coding.)

The presence of IMAGE.ifd as a structure member is used to hold the
offset of the next IFD in the TIFF file.  As the specification
states, the last image in the file has a null IFD offset; examining
IMAGE.ifd therefore enables us to work out the type of post-page
message to sent if an image is being read for transmission.

In the TIFF reader in the subdirectory, we've opted to make the use of
this structure member rather more complex.  As well as being read by
the caller to detect whether an image is the last in a file, it can
also be set by a caller to tell tiffread_ifd which image in a file
should be read next.

When a previously unknown file has just been opened, the caller must
set the IMAGE.ifdt to 0.  This is recognised by tiffread_ifd as a
special case.  When it is called with an IFD offset of 0, the
function verifies that there is a proper TIFF header in the file and
then goes on to read the first IFD.  The members of the IMAGE
structure are filled in, including IMAGE.ifd, which holds the offset
to the next IFD.

Subsequent calls to tiffread_ifd, for as long as IMAGE.ifd is
non-zero, will begin with an fseek to this offset to obtain the
location and description of the next image.  On reading the last IFD
in the file, the offset returned will be zero again; any subsequent
read will begin at the first image again.

The advantage of returning the proper offset to an IFD (rather than a
result code) is that where a TIFF file does contain multiple images,
building a table of IFDs enables quick access to any particular one
that may be required without having to search through the links in
the file first.  Though this isn't a feature that is exploited in the
code in the subdirectory, it isn't difficult to envisage applications
for which it is useful apart from disentangling multiple-format TIFF
images.  Fax- on-demand and selective polling operations spring to
mind immediately.

The disadvantage is that the IFD offset might be corrupted, or set to
a stupid value by mistake; but this applies equally to the normal
file and structure pointers used when calling the tiffread_ifd
function.

In any event, the caller is relieved of the responsibility for
checking that an image is suitable for faxing as the TIFF image
descriptor is verified within tiffread_ifd.  Apart from the
possibility of an invalid offset (for which the function returns with
a 0), if an invalid or unfaxable IFD is read, IMAGE.offset returns
set to 0.

This gives a caller to tiffread_ifd the following cases to handle :

i.  If ((image->offset==0) && (image->ifd==0)) then the current image
    is unsupported and there are no more in the file.

ii. If ((image->offset==0) && (image->ifd!=0)) then although the current
    image is unsupported, the file contains more images which may be
    readable, and the caller can continue to process the file.

iii.If ((image->offset!=0) && (image->ifd==0)) then the current image is
    supported but the file contains no more image after this.

iv. If ((image->offset!=0) && (image->ifd!=0)) then the current image
    is supported but the file contains no more images.

The remaining TIFF-specific members of the structure have no real
parallel in any other image format, but are quite straightforward.

IMAGE.page enables us to detect TIFF files which include identical
pages in different formats.  The fact that TIFF allows this is one
solution to the problems caused by the fact that making the optimum
fax connection is impossible without real-time encoding of a fax
image.  The code in the subdirectory for sending faxes doesn't attempt
to implement this feature though.

Finally, IMAGE.bitorder is included in the structure in order that we
can read TIFF class F files created by other applications. It simply
returns the ordering of bits within the bytes as expressed in the
TiffOrder IFD entry.

While the actual code in the TIFF reader is quite straightforward,
and therefore quite boring, there are a few points of interest which
are worth pointing out here.

The fact that Apple Macintosh systems and IBM PC clones can be
networked together means that it is quite possible for a fax server
to use a different byte ordering convention to a workstation.  The
first four bytes in the file need to be carefully examined to check
for such mismatches.

The reader maintains two variables; file_order is set for Intel byte
ordering if the first two bytes in the file are II, but is set for
Motorola byte ordering if the byte are MM.  If the next two bytes
contain the integer 42, then the cpu_order matches the byte order and
the file can be read with no problem.

If file_order and cpu_order don't match, then the ordering of the
bytes in all the values in the IFD needs to be reversed.

The following fragment both checks the first four bytes to ensure
that we have a valid TIFF file, and also enables a mismatch in the
byte ordering to be detected :

 fread (&tiff_id,1,2,infile) ;
 if (tiff_id==0x4949)
 {
     file_order='I' ;
     cpu_order='M' ;
 }
 else
 {
     if (tiff_id==0x4D4D)
     {
     file_order='M' ;
     cpu_order='I' ;
     }
     else return (0) ;
 }
 fread (&tiff_id,1,2,infile) ;
 if ((short int)tiff_id==0x2A)
 {
     cpu_order=file_order ;
 }
 else
 {
     reverse(&tiff_id,2) ;
     if (tiff_id!=0x2A) return (0) ;
 }

Once this fragment has executed, we have to reverse the order of the
bytes in any value read from the file if (file_order!=cpu_order).  We
can easily do this using the following function; it reverses the
order of any sequence of bytes in memory.  We call it with the
address of the value to be reversed, and the number of bytes used to
hold the value.  While in our code this number is either 2 for a
short or 4 for a long, the function is a general-purpose one capable
of reversing the order of any number of bytes.

 void reverse (void *pointer,int countdown)
 {
     int countup=0 ;
     unsigned char store ;
     unsigned char *address = (unsigned char *)pointer ;
     if (pointer==NULL) return ;
     if (countdown%2) return ;
     while (countup<countdown)
     {
         countdown-- ; store=address[countdown] ;
         address[countdown]=address[countup] ;
         address[countup]=store ;
         countup++ ;
     }
 }

The main body of the reader simply loop through the IFD checking its
validity and its suitability for faxing.  The image is rejected as
incorrectly formatted if the tags are in the wrong order, or if
essential tags are missing.  It is also rejected as unfaxable if the
image isn't a bilevel one (BitsPerSample and SamplesPerPixel must
both be 1), if the bit image doesn't interpret bits set to 1 as being
black (PhotometricInterpretation must be 0), if there are multiple
strips in the image, or if the width is not 1728.

The reader is, however, quite lax about the horizontal and vertical
resolution of the fax, and doesn't insist on any particular values.
If the aspect ratio is nearer to 1:1 than 1:2 then the fax resolution
is set to normal; otherwise it is set to fine.

2.  The TIFF writer.

This module is to be found in TIFWRITE.C, and contains two functions.

Both functions are called with a file pointer, and returns an integer
value, which is zero on failure and non-zero on success.  The caller
needs to know how to modify TIFF IFDs safely to use the writer
functions correctly.

The first function is tiffwrite_head, which is called with a pointer
to an open file before any images are written to it; in fact, the
file need not be empty, but if it isn't, any contents are liable to
be overwritten.

As its name implies it writes a TIFF header at the start of the file,
with a dummy long pointer of zero at the location reserved for the
first IFD, in bytes 4-7.

As well as returning non-zero on success, this function also returns
a structure pointer to a pre-initialized IFD, which is included with
each image that is written to the file. The structure definition is
found in TIFF.H, and exactly follows the standards laid down in the
Aldus TIFF specification.  The structure ends with four long integers
containing the values required by the Xresolution and Yresolution
TIFF fields, which are defined as rationals and therefore can't be
included in the IFD itself.

The IFD is initialized as follows :

 #define BYTE     1
 #define SHORT    3
 #define LONG     4
 #define RATIONAL 5
 static struct IFD ifd =
 {
     16,                 /* number of entries in the IFD */
     {254,LONG,1,2},     /* NewSubfileType is 010 binary */
     {256,LONG,1,1728},  /* ImageWidth 1728 */
     {257,SHORT,1,0},    /* ImageLength filled in later */
     {258,SHORT,1,1},    /* BitsPerSample is a default */
     {259,SHORT,1,3},    /* Compression is 3 */
     {262,SHORT,1,0},    /* PhotometricInterpretation is 0 */
     {266,SHORT,1,1},    /* FillOrder_tag is a default */
     {273,LONG,1,0},     /* StripOffsets (address filled in later) */
     {277,SHORT,1,1},    /* SamplesPerPixel is a default */
     {278,SHORT,1,0},    /* RowsPerStrip = ImageLength */
     {279,LONG,1,0},     /* StripByteCounts (size filled in later) */
     {282,RATIONAL,1,0}, /* Xresolution (pointer filled in later) */
     {283,RATIONAL,1,0}, /* Yresolution (pointer filled in later) */
     {292,LONG,1,0},     /* T4Options is a default */
     {296,SHORT,1,2},    /* ResolutionUnit is dpi */
     {297,SHORT,2,0},    /* PageNumber */
     0,                  /* nextifd (filled in later if needed) */
     {80300L,394L},      /* Xres fraction */
     {38500L,394L}       /* Yres fraction */
 } ;

The defaults are broadly those for TIFF Class F, and assume a single-
strip image 1728 dots wide, with one-dimensional coding and a
horizontal resolution of 8.03 dots per millimetre with a vertical
resolution of 3.85 dots per millimetre.  However, the resolution
units are given in inches, using a conversion factor of 3.94 mm/inch.
The StripOffsets value is filled in with an 8, which is the offset of
the next byte in the file after the header is written; it is at this
offset that any further data will be written to the file. The final
task of tiffwrite_head is to initialize a static pointer lastifd,
holding the address of the last IFD pointer, to 4.

On a non-zero return from tiffwrite_head, the calling program can
begin to write the first fax image sequentially from StripOffsets.

Once the image is complete, the second TIFF writer function
tiffwrite_ifd is called.  As described earlier, it is called like
tiffwrite_head with a file pointer, and returns zero on failure and
non-zero on a successful write.  While it does not take or return an
IFD structure pointer, it has access to the same IFD structure as the
first writer function passed to the calling program.  Its purpose is
to write this IFD to the file after the image.

The first step is to find out how many bytes are in the image.  This
is done by using the ftell function to obtain the current file
position, and subtracting the start of the image as stored in
StripOffsets.  The length of the image in bytes is stored in
StripByteCounts in the IFD structure.

The second step begins by writing a dummy byte if the current file
position (or StripByteCounts for that matter) is an odd number as all
TIFF IFDs must be written on even byte boundaries.  The resulting
file position is then written to the offset held in the static
pointer lastifd.

The third step is to update the rest of the IFD structure.  While the
caller is able to modify any of the members, typically only those
values relating to T.4 and T.30 options would be altered, such as the
compression scheme used and the image resolution.  The only structure
member that the caller must update is the number of scan lines in the
image, held in ImageLength.  The function can hardly be expected to
re- read and decode the whole image looking for scan lines to count.
Since the writer handles single strip images only, RowsPerStrip is
set the same value as ImageLength. The file position is also used as
the basis for updating those portions of the IFD structure that
require file offsets; notably the pointers to XResolution and
YResolution.

The fourth step is to write that IFD to the file.

fwrite (&ifd,1,sizeof(ifd),outfile) ;

The fifth and last step is to re-initialize StripOffsets to the
offset immediately after the IFD, in case another image is to be
written; and also to ensure that the offset of the IFD structure
member holding the offset of the next IFD in the file is stored at
lastifd for later updating if required.  Other IFD housekeeping such
as incrementing the Page can also be handled at this point if the
caller doesn't do this.

PART 5 : The decoder

The next program in our suite is fax image decoder.  There are twice
as many lines of code in the \CODE\DECODE directory as there are in
the ENCODE directory; the reason why the decoder is so much more
complex than an encoder is because it has to cater for the
complexities of different types of output devices, which are
controlled in different ways, fed with different types of data, and
have different resolutions.

It is easier to look at a decoder as alternating between two distinct
processes.  The first process is concerned with the decoding proper.
It takes fax data as input and produces scan lines composed of
run-lengths totalling 1728 dots as output.  The second process is
concerned with reproducing the image; it takes the scan lines and run
lengths as input and outputs a stream of instructions and graphics
data which enable a specific output device to print or display an
appropriate image.  In many respects, getting the run lengths out of
the coded data is the easier process to write.  The UNFAXIFY.C module
in the subdirectory has all the necessary routines, and UNHUFF.DAT
has the necessary data structures.

1.  The TIFF wrapper

We start by considering the different types of fax data a decoder
needs to deal with.  It has to be able to cater for images in either
normal or fine resolution.  An image might be composed of entirely of
one- dimensionally coded lines separated by EOL codes. Alternatively,
in the case of a T.6 image, it might be composed entirely of
two-dimensionally coded lines with no EOL codes at all.  The final
case is the T.4 two- dimensional image, which consists of a mixture
of one-dimensionally coded lines and two-dimensionally coded lines,
which are separated by EOL codes. Each of these lines starts with a
tag bit, indicating the type of coding to be used on each line.

It isn't possible to decode any image unless we know before we see
the data which of these options we used in the encoding.  This
information can't be deduced from the data itself, but must be
provided by some other methods.  This is where the TIFF file format
comes into its own.  When we standardize all images using the TIFF
format, an image doesn't come to us naked, but is always enclosed in
a wrapper.  Examining this wrapper tells us the significant facts we
need to know about the file.  The tiffread routine (which is
discussed in a separate section) looks at the wrapper for us.  All
our fax file decodings begin with a call to tiffread; this tells us
the encoding options used to create the image, together with its size
in bytes.

Since a TIFF file can contain multiple fax images, we also end each
image decoding with a call to tiffread; this tells us whether we have
to repeat the whole process for the next image in the file.

2.  The structure of an image decoder

The basic structure of an image decoder is a loop through every bit in
a file in order to isolate the code words used to put it together.
Since we are always dealing with computer architectures that are
oriented towards dealing with bytes rather than bits, this has to be
handled using two loops.  The outer loop uses the image size in bytes
as the control variable and reads octets from the file; while the
inner loop used the number of bits in a byte as the control variable
and isolates bits from each octet.

 for (; image_size ; image_size--)
 {
     octet= getc(faxfile) ;
     if (FillOrder==2) octet=backward[octet] ;
     for (i= 0 ; i<8 ; i++, octet<<= 1)
     {
                             .
                             .
 (here we decode, using the most significant bit of octet)
                             .
                             .
     }
 }

Our code uses TIFF FillOrder=1, or left-to-right bit ordering.  I
believe that when the bit ordering in each octet is the same as the
printed representation, it aids understanding and leads to fewer
bugs.

If the image was stored directly from a modem that produces bytes in
what the TR-29.2 committee refer to as direct bit ordering, it ought
to have been stored using TIFF FillOrder=2.  In this case, the bits
in the octet will be filled in from right to left rather than from
left to right.  The statement octet=backward[octet] corrects this
situation via a lookup table which is simply an array of the
characters from 0 to 255 with the bit patterns reversed.  You'll find
this table in the file \CODE\DECODE\UNHUFF.DAT.

3.  Setting up a decoding tree.

There is more than one way of decoding the bits that go to make up an
image.

Suppose we know that the code word we are looking for is a black run-
length.  The relevant table of codes for these run-lengths begins as
follows (for convenience, we ignore the code for a run-length of 0
and begin with 1).

 Ŀ
 1 010    
 2 11     
 3 10     
 4 011    
 5 0011   
 6 0010   
 7 00011  
 8 000101 
 9 000100 
 100000100
 110000101
 120000111
 

One method of identifying these codes in a bitstream, which is found
in some texts, is to start by constructing an array containing all
the possible codes stored as integers, together with the number of
bits they contain, and the run length that they correspond to.  As an
example, we show the start of such an array for the black run lengths
in the above table :

 struct codewords
 {
     short int count_bits ;
     short int code_word ;
     short int run_length ;
 } ;

 struct codewords blackruns [] =
 {
     {2,0x03,2},
     {2,0x02,3},
     {3,0x02,1},
     {3,0x03,4},
     {4,0x03,5},
     {4,0x02,6},
     {5,0x03,7},
     {6,0x05,8},
     {6,0x04,9},
     {7,0x04,10},
     {7,0x05,11},
     {7,0x07,12},
         .
         .
         .

The search for a code word begins by initialising both a test integer
and the number of bits in the code word to zero.

As each bit is extracted from the image, it is shifted into the test
integer from the right, and the count of bits in the code is
incremented.

The array is then searched for a match of both the number of bits in
the code and the test integer, and if no match is found then the
procedure is repeated.  Though intuitively obvious, this method is
actually very inefficient. Every successful search for a run-length
must always be preceded by unsuccessful searches for all shorter
run-lengths; and especially in the case of the longer code words, the
time taken to find a particular run- length is arbitrarily determined
by the way the codes happen to be ordered.

A much better way of decoding an image can be used if the code words
are arranged in a binary tree.  This method eliminates the needs for
multiple unsuccessful searches through a list and requires just one
test for each bit in the code.  It also guarantees that the time
taken to find the meaning of a particular code word depends only on
its length, and not on any arbitrary ordering of the code words in a
list.

The fact that a tree is one of the best ways of decoding T.4 one-
dimensional images is hardly surprising.  The principle on which the
T.4 Huffman codes are constructed is that the possible runs are first
sorted according to frequency, and the frequencies are used to build
a tree.  All this decoding method does is to reverse the process.

Our tree consists of nodes, each of which has two branches.  A branch
can lead either to another node (which has its own branches) , or
else to a leaf, which holds a run-length.  The first node in the tree
is called the root.

A representation of a binary tree for the black codes for runs of 1
to 12 looks as follows :

                                        0 <----- ROOT -----> 1
                                             /          \
                                          NODE          NODE
                                         /    \         /  \
                                     NODE      NODE    3    2
                                     /  \      /  \
                                NODE    NODE  1    4
                                /  \    /  \
                              /      \ 6    5
                            /          \
                        NODE            NODE
                        /   \           /  \
                      /       \       NODE  7
                    /           \     /  \
                  /               \  9    8
                /                   \
            NODE                    NODE
            /  \                   /    \
        NODE   xxx             NODE      NODE
        /  \                   /  \      /  \
    NODE   xxx               10    11   0    12
    /  \
 EOL  xxx

The method for traversing of the tree is quite straightforward.
Beginning at the root, we take successive bits from the bit-stream.
If the bit is a 0 we take the left-hand branch of the tree, and if
the bit is a 1, we take the right-hand branch.  If another node is
reached, we replicate the procedure with the next bit, continuing
until a leaf with a valid run- length is reached instead of another
branching node.  The sample tree illustrated has some apparently
rotten branches that don't lead either to nodes or to run-lengths,
but this is only because it is a partial tree that decodes only the
black runs from 1 to 12.  You can take any of the codes in the table
and trace them through the tree to see how the run length is
extracted.

The neat thing about this sort of data structure is that it doesn't
contain any of the code words for the run lengths as data items;
instead, the code words are implicit in the structure of the tree
itself.  The Huffman codes for the run-lengths revert to their
original nature as descriptions of the location of a run in the tree.

Turning the tree into a data structure isn't terribly difficult.  We
can quite easily construct a two-dimensional array of which reflects
the tree structure.  The following example shows how this can be
done; the array below corresponds exactly to the tree shown above.
Even the rotten branches (caused by the fact that the tree only
contains the black runs from 1 to 12) are present.

 int btree [][2] =
 {
         {2, 1},         /* 0 */
         {-3, -2},       /* 1 */
         {4, 3},         /* 2 */
         {-1, -4},       /* 3 */
         {6, 5},         /* 4 */
         {-6, -5},       /* 5 */
         {9, 7},         /* 6 */
         {8, -7},        /* 7 */
         {-9, -8},       /* 8 */
         {xxx, 10},      /* 9 */
         {12, 11},       /* 10 */
         {xxx, -12},     /* 11 */
         {-10, -11},     /* 12 */
                 .
                 .
                 .

The rows of the array consist of pairs of numbers, which correspond to
the two branches from each node of the tree. We begin decoding by taking
row 0 of the array, which is effectively the root of the tree. If the
first bit we look at is 0 then we take the left-hand branch and examine
the number at [0][0], while if the bit is a 1 we take the right-hand
branch and examine the number at [0][1].  A positive number means that we
have found the row corresponding to the next node in the tree, while a
negative number indicates that we have arrived at a run-length, which is
simply the negation of the number we have found.  If we reach another
node, we take the next bit and go through the same procedure, while if we
reach a run-length we can output it, and start again at the root with the
next bit.

4.  Using a decoding tree

The file UNHUFF.DAT in the subdirectory includes three trees.  One
tree contains white run lengths (wtree), a second tree contains black
run lengths (btree)and a third contains all the two-dimensional codes
for pass mode, vertical mode and horizontal mode (twotree).  A common
pointer tree is set to whichever tree happens to be one that needs to
be used.

The simplest decoding case to consider is a series of run-lengths
coded one-dimensionally (we assume that there are no errors).  We
start off (as with all fax lines) by setting colour to WHITE, the
tree pointer to address of the white codes, and setting the initial
code to 0.  Using the same basic double loop we specified earlier
(and assuming TIFF FillOrder=1 this time), the decoding proceeds as
follows :

 color= WHITE ;
 tree= &wtree ;
 code=0 ;
 for (; image_size ; image_size--)
 {
     octet= getc(faxfile) ;
     for (i= 0 ; i<8 ; i++, octet<<= 1)
     {
         code= (*tree)[code][(octet&0x80)>>7]  ; if (code<1)
         {
             code= (-code) ; output_run_length(code) ;
             if (code < 64)
             {
                 color= (~color) ;
                 tree = color ? &btree : &wtree ;
             }
             code= 0 ;
         }
     }
 }

The recursive nature of the redefinition of code using itself as one
of the indices into the array is typical of the way that binary trees
are traversed.  The rather awkward expression (octet&0x80)>>7
effectively casts the leftmost bit of an octet as an integer, and
enables it to act as the other index.

The line code= (*tree)[code][(octet&0x80)>>7] is repeated for each
bit until code returns with a negative number. We know that we have
hit a run length in this case, and negate the value to get the proper
run length.  The run is then output.

If the run was less than 64, then the code was a terminating one and
the both the colour and the tree being used has to be changed.  We
use the same definitions of WHITE as 0x00 and BLACK as 0xff as we
used when we were encoding; and changing the colour is therefore
handled using ~color just as in ENCODE and QENCODE.

While the ternary operator ?: is used to swap trees because of its
elegance, the alternative version using if (color) tree=(&btree) ;
else tree=(&btree) may be used instead.

If the run was 64 or greater, then the code was a makeup one, and the
colour remains the same.

In both cases, setting code=0 ensures that the decoding process begins
again at the root of whatever tree happens to be used next.

5.  Messy but important details

The decoding routine in UNFAXIFY.C has the tree-based decoding method
shown above at its core.  However, as it stands, the code fragment
can only deal with infinitely long error-free one-dimensional T.4
data.

In order to make the code function effectively as a purely one-
dimensional decoder, additional code must be added to handle the EOL
code and any padding that precedes it.  The start of the decode must
always be a search for an initial EOL.  Along with end-of-line
handling, routines must also be added to handle the most common error
conditions.  These include lines that are too long, and data that
doesn't decode properly.

In both cases, a resync flag can be set and used to ignore all
subsequent data until the next EOL code is detected.

The action to be taken when a decoding error is discovered is
something that is up to a programmer to decide.  Fax machines often
replace suspect lines with a duplicate of the previous good line;
alternatively, blank lines can be inserted, or else the line can
simply be ignored.  There is no way telling (short of a visual
inspection) whether a partial line contains good data or flawed data.
Generally speaking, the human eye is liable to make a better job of
working out what the missing bits in any image ought to be, and
leaving bad lines blank is often the best course; there is a lot of
redundancy even in normal resolution faxes.

Enhancing a one-dimensional decoder to handle two-dimensional data
isn't precisely trivial, but is quite straightforward in programming
terms.

The key to the task is to maintain ands inspect the various flags
which are needed in order to let that decoder knows what coding
scheme is being used and what state it has reached.

These flags can get pretty complex.  For example, one flag will be
used to hold the encoding scheme used to create the image.  The EOL
handling routine will need to be sensitive to this flag; in the case
of two- dimensional T.4 data the bit after an EOL is the tag bit
which identifies the type of coding to be used on the following
line.  The tag bit itself will be used to set a second flag, which
must be inspected if the first flag is set for two-dimensional data,
and which holds the coding scheme for any particular line.

(We ought to note in passing that the tag bit is the sole method of
determining the coding method; the k parameter used when encoding
only gives the maximum number of two-dimensional lines that can occur
in succession.  A two-dimensional fax that consists entirely of on-
dimensional lines with all EOL codes followed by a tag bit of 1 is
perfectly legal.) Returning to our flags,  a third flag is needed for
two-dimensionally coded lines which indicates the mode that is being
used, and in the case of horizontal mode, which of the two possible
runs is being assembled.

The point of all these flags is that while the application of our
basic T.4 decoding routine to one-dimensional images was quite
straightforward, extending it to handle the needs of two-dimensional
data means that the fragment has to be able to handle four distinct
cases rather than just one.  Apart from normal one-dimensional lines,
there are also one- dimensionally coded lines in two-dimensional
data; and within two- dimensionally coded lines, horizontal mode
involves two distinct runs of T.4 data.  All these cases utilise the
same T.4 decoding fragment, but the program flow in all these cases
is different.

In addition, the extensions to handle T.6 data, while not difficult
to add to an existing program, make reading one rather more of a
chore than it otherwise would be.  The real problem with presenting
code of this type is that the series of conditional statements which
this entails tends to make the programs pretty tedious to read.  The
UNFAXIFY.C program in the subdirectory is unfortunately of this type;
but careful reading of the code together with the header files should
show that while the code is complex, it is quite mechanical.

6.  Two-dimensional decoding

A two-dimensional decoding tree can easily be constructed from the
codes for pass mode, vertical mode and horizontal mode.  Instead of
requiring a special routine, all that is required to decode
two-dimensional data is to ensure that the root of the this tree is
selected using the statement tree=&twotree at the start of all
two-dimensional lines.

When the two-dimensional code for horizontal mode is found, the
decoder simply drops into the one-dimensional decoding routine for
two run- lengths, but pointing tree at the correct root for the
colour of the next run; it is just as easy to keep track of the
colour when decoding two- dimensionally coding as it is when decoding
one-dimensionally.

Whenever a two-dimensional image is being decoded, a copy of the last
scan line always needs to be kept, which is always initialized to an
all- white line at the start of each page.  If two arrays are set
aside for this purpose, each of which is 1728 bits long, a pair of
pointers to the current line and the reference line must be kept, and
swapped over each time an EOL is reached.  The reference line needs
to be used for working out what to do when the codes for pass mode
and vertical mode are encountered, while the array for the current
line needs to have each run- length mapped into it as is decoded.

The following fragment of code shows one way of handling this task. A
1728 bit (216 character) array at current_line has its bytes indexed
by byte_inx, and each byte has its bits indexed by bit_inx.  Both
these indices have to be declared as static variables, since they
need to retain their value between calls. At the start of each row,
current_line is initialized to be all-white, while the byte index
byte_inx is initialized to 0 and the bit index bit_inx is initialized
to 8.

The fact that the scan line defaults to all white means that for a
white run, all we need to do is adjust the indices for the value
run_length which is passed to the routine.  The byte index is
increased by the value of the run length divided by 8, while the
remainder is subtracted from the bit index (which is wrapped around
to 8 again if it becomes negative).  Try this out with a few numbers
to see how it works.

The black runs are slightly more complex.  We may have to OR in some
individual bits, but for long black runs we can simply place bytes
with the value 11111111 (0xff) in the array.

 if (color==WHITE)
 {
     bit_inx-=(run_length%8) ;
     byte_inx+=(run_length/8) ;
     if (bit_inx<0)
     {
         bit_inx+=8 ;
         byte_inx++ ;
     }
 }
 else for (;;)
 {
     while (bit_inx>0)
     {
         current_line[byte_inx]|=(1<<(bit_inx-1)) ;
         bit_inx-- ;
         if (!(--run_length)) break ;
     }
     if (run_length==0) break ;
     byte_inx++ ;
     for (; run_length>8 ; run_length-=8)
     current_line[byte_inx++]=0xff ;
     if (run_length==0) break ;
     bit_inx=8 ;
 }

Code similar to this is found in DOLINE.C.

It is generally more convenient to handle the mapping of each scan
line as part of the output_run_length function that is called as each
run length is assembled.

The process of working out the runs embedded in pass mode and
vertical mode uses the same code for identifying the positions of b1
and b2 as a two-dimensional encoder uses.  The two key preliminaries
are to have access the current position on the scan line being
decoded (which has already been done with the byte_inx variable) and
use that as the value of a0, and to keep a copy of the last line that
was decoded as the reference line ref_line, so that b1 and b2 can be
identified.  The following code works out the position of b1 starting
from a0 under these circumstances.

 for (b1= a0+1 ; b1<1728 ; b1++)
 {
     bit_inx= (0x80>>(b1%8)) ;
     if ((ref_line[b1/8]&bit_inx)!= (color&bit_inx)) break ;
 }

The code for computing the position of b2 starting from b1 is
similar.

Two-dimensional pass mode does not change the colour of the next run.
It simply codes a run length computed using b2-a0, and carries on
with the same colour.  In contrast, vertical mode codes a run length
computed using b1-a03 , and changes to the opposite colour.
Horizontal mode, which codes the next two run lengths
one-dimensionally, does ends up with the decoder using the same
colour as it began with, but only because of the two intermediate
colour changes.

As with encoding, decoding two-dimensional images is significantly
slower than decoding one-dimensional images.  Apart from the extra
chore of constructing a reference line for each scan line, the
decoding of two- dimensional images doesn't directly result in series
of run-lengths.  The runs-lengths always need to be worked out at one
level of indirection; the two-dimensional codes simply tell us how to
find out the next run, not what it is.  Though two-dimensional data
offers significant savings on storage space and transmission times,
it does require more computation and is therefore always going to
take a little longer than simpler coding methods.

7.  Outputting scan lines

The most complicated part of the DECODE program is the second of the
two processes into which we divided the program. This has the task of
reproducing the image on an output device. The program in the
subdirectory accomplishes this with three distinct modules.  DOLINE.C
takes care of the composition of the data to be output.  SCALES.C
ensures that the vertical and horizontal resolutions specified by T.4
are adjusted to take account of the different capabilities of various
output devices.  OUTPUT.C handles the physical output of the data.

Accurate reproduction of a fax image on an output device isn't simply
a question of sending out a scan line.  In the first place, the
normal resolution of an A4 fax page, at 1728 x 1143 dots or 200 x 100
dpi, isn't matched by any output device in common use.  Some devices
are capable of much greater resolution, while others cannot hope to
match the clarity of fax.  In the second place, while some devices
can output a single scan line at a time, others either use or are
descended from dot-matrix technology and will need to output a series
of scan lines as image slices, which are composed of multiple lines.

The following table shows the devices catered for in the code in the
subdirectory.  While the same basic routines are used for all the
output devices, the number of scan line n that are output at a time,
together with the X and Y resolution (in terms of dots per page) are
quite different.  The output routine is the one part of the code
which would have been better written in C++ rather than in C.

 Ŀ
 Output Device                         n      X      Y     
 Ĵ
 Uncompressed image                    1      1728   1143  
 CGA graphics screen                   1      640    200   
 EGA graphics screen                   1      640    350   
 VGA graphics screen                   1      640    480   
 Hewlett-Packard (PCL) 100 x 100       1      802    1069  
 Hewlett-Packard (PCL) 300 x 300       1      2406   3207  
 Epson 9-pin dot matrix 120 x 72       8      960    792   
 Epson 24-pin dot matrix 360 x 180     24     2880   1980  
 Canon Bubblejet 360 x 360             48     2880   3960  
 Canon CaPSL printer 300 x 300         1      2340   3390  
 

Once a selection of a specific output device is made, we simply use
the correct routine for each of these possible devices.  The OUTPUT.C
module contains code for initialising and de-initialising, for
outputting lines and for handling page breaks. Each of these four
routines consists of a switch statement. (Additionally, there are two
IBM-specific printer support routines).

The initialisation routine has to take care of any steps necessary to
place a device in graphics mode, and also includes the setting of
three program variables for lines, width and height as shown in the
above table. The width and height variables are used for scaling the
image, with width/1728 being the horizontal scaling ratio and
height/1143 being the vertical scaling ratio for normal resolution
images, while the number of line n is used to tell how many scan
lines are output at one time.

Note that the reason why the Canon and HP printers appear to have
slightly different resolutions even though both are capable of 300
dpi is because we have assumed a slightly different printable area on
an A4 page for each.  The Canon LPB4 has an effective print area of
7.8" x 11.3", while we use a generic figure for all Hewlett-Packard
printers (including Deskjets).  This is based on a default margin of
.5" at the top and bottom of the paper, with .25" on each side,
giving an effective A4 page size of 8.02" x 10.69".  More details of
the different output devices are discussed in more detail in the next
section of this document.

a)  Scaling

Unusually, we show the two scaling routines almost exactly as they
appear in the code; this is simply because they are short and compact
as well as important.

Horizontal scaling is performed as each run-length is generated. The
fundamentals of this are straightforward arithmetic, in that each run
is first multiplied by the X-resolution of the device stored in width
and is then divided by 1728, which is the width of a fax line.  If
floating- point arithmetic were used, the computation (which is
basically administering a scaling ratio) could be performed the other
way round; but doing the multiplication first, using long integer
variables, is quicker.

There are two complications, both of which are only really important
when scaling down.  The first is that in order to avoid inaccuracies
accumulating, we keep any remainder from each operation and add it
back in to the next one.  If this isn't done, the discarding of all
the remainders means that each scan line will be shortened by an
apparently random amount, and vertical alignment of features on
adjacent lines will be distorted by the runs that precede them.

The second complication is a particular fax requirement.  We assume
that as fax is primarily designed for printed text, preserving the
alternations of black runs and white runs are of more importance than
accuracy in reproducing any individual run-length.  To take one device
as an example, when scaling the 1728 dots of an image for
reproduction on an IBM PC VGA screen, which is only 640 dots wide,
any runs less than 3 dots long are not guaranteed to be reproduced.
If no remainder is available for adding in, the scaling ratio of
640/1728 as applied to a run of 2 results in a run of 0.

We remedy this situation by attempting to ensure that every run-length
in the fax will be reproduced by at least a run of one in the
reproduced image.  This is done by borrowing a dot in advance from
the next run- length if we have scaled a run down to zero, and
setting a flag to indicate what has happened; the borrowed dot is
repaid the next time we have a run greater than 1.  We never borrow
more than one dot at a time.

Both the remainder and the borrow flag can be initialized at the start
of each row by calling the scaler with a run-length of 0.

 int hscale (long int width, long int run)
 {
     static long int remainder ;
     static char borrow ;
     if (!run)
     {
         remainder=0 ;
         borrow=0 ;
         return(0) ;
     }
     run=(run*(long int)width);
     run=run+remainder ;
     remainder=run%1728 ;
     run=run/1728 ;
     if ((!run)&&(!borrow))
     {
         run++ ;
         borrow++ ;
     }
     if ((run>1)&&(borrow!=0))
     {
         run-- ;
         borrow-- ;
     }
 }
 return (int)(run) ;

While the code for vertical scaling is superficially similar in
appearance concept to that for horizontal scaling, it is quite
different in concept.  The simplicity and accuracy of calling a
function with one run-length and getting back the correct run-length
to use instead isn't possible to replicate in the vertical dimension.

Instead, we call a function each time we have completed decoding a
scan line, but before the line is actually output.  What we get back
is a value corresponding to the number of times the scan line should
be output.  If the resolution of the output device is less than that
of a fax, we have to miss some lines out to reproduce the page
accurately; if the resolution is greater than that of a fax, some
line will have to be output more than once.  Inevitably, the
inaccuracies of the vertical reproduction will be more noticeable
than that of the horizontal reproduction; if you think about the
problems involved if we had to decide whether or not to output each
individual dot in a run rather than decide how many dots to output,
you can get an idea of the reason why this is so.

The way the function works is to output the first line of each image
once for each time the height variable (indicating the number of
lines the device used to output A4) is divisible by the height of the
fax image.

For the second and subsequent lines, we add the remainder to the
height of the fax and do the same again.

For instance, assume we are outputting a fine resolution A4 fax (with
2286 scan lines) to a 300 dpi laser.  If we assuming an 11" page,
this gives is a device with a nominal height of 3300.  For the first
line, we simply divide 3300 by 2286.  The result is 1 with a
remainder of 1014, so the line is output once.  For the second line,
we add the remainder to height before we start, and so divide 4314 by
2286. The result is again 1, but with a remainder of 2028; the line
is again output once but the remainder has doubled.  For the third
line, adding this new remainder to height means we divide 5328 by
2286.  This time, the result is 2 with a remainder of 756, so the
line is output twice.  The sequence carries on with the fourth line
being output once, the fifth line twice, and so on.

Each time, the addition of the remainder makes any errors in the
vertical scaling self-correcting.

This method works whether or not we are scaling up or down.  Let's
assume we are outputting a normal resolution A4 fax (with 1143 scan
lines) to a 9 pin dot matrix printer, whose 72 dpi gives us a height
of only 792 lines on an A4 page.

For the first line, dividing 792 by 1143 gives us 0, so the first
line isn't output at all, but we have a remainder of 792.

When we reach the second line, we add in the remainder and this time
divide 1584 by 1143.  The result is 1 with a remainder of 441 , so
the second line is output.

The remainder of 441, when added to the height of 792 for the third
line, gives us 1233; so the third line is also output once, with a
remainder of 90 being carried forward.

Unfortunately, adding in the new remainder of 90 to height gives us
only 882, so the fourth line isn't output at all.

Continuing the process, we can see that the fifth and sixth lines are
both output.

The pattern is that approximately one line in three is missed out,
when the remainder drops below 351.  However, after line 13 is
missed, the remainder has reached 1062.  This is sufficient to allow
all three of the following lines to be output.  Once again, the
method shows it can make the best of the available resources.

Sample code for a vertical scaler is shown here; as well as the height
it is passed the resolution, which should be 1 for normal resolution
or 2 for fine resolution.  If res is 0, then this is taken as being a
special case and the remainder is initialized to zero.

 int vscale (long int height, char res)
 {
     static int remainder ;
     if (res==0) {remainder=0 ; return(0) ;}
     count=remainder+height ;
     remainder=count%(1143*res) ;
     count=count/(1143*res) ;
     return (int)(count) ;
 }

The method of borrowing dots to avoid losing data in short runs has
no obvious parallel when scaling vertically.  However, it is
essential that we avoid losing information when scaling down; we
don't want the detail on lines that aren't output to be completely
lost.  For instance, the lower-case letters c and e can easily be
confused if the scan line with the horizontal stroke in the e happens
to be missed out.  The same applies to the capital letters F and E.
The missing line cannot always be reconstructed from the context. For
instance, if a fax consisted of hexadecimal codes, confusing either
of these pairs would be quite a serious matter.  This brings us to
the second section of the scan line output routines, which is the
code for generating the data to be output.

b)  Generating data for output

One thing that all our output devices have in common is that they all
accept bit-mapped graphics output.

Devices that can output single scan lines at a time all accept
bit-mapped graphics in the form of a stream of bytes, which are read
from left to right as written.  The following table shows how the
first three bytes of such a bit-stream are used to encode the first
24 bits of each line.

 Ŀ
        Byte 1 Byte 2 Byte 3 
 Ĵ
 MSB    dot 1  dot  9 dot 17 
 bit 2  dot 2  dot 10 dot 18 
 bit 3  dot 3  dot 11 dot 19 
 bit 4  dot 4  dot 12 dot 20 
 bit 5  dot 5  dot 13 dot 21 
 bit 6  dot 6  dot 14 dot 22 
 bit 7  dot 7  dot 15 dot 23 
 LSB    dot 8  dot 16 dot 24 
 

The code for mapping out a run lengths for devices which work in this
way is fundamentally the same as the code for mapping the contents of
the reference line when doing two-dimensional coding; the address and
the size of the array and of the run-lengths are the only real
differences.

Most page-based laser printers, and most screens, accept this type of
single-line graphics data.  The only addition to the code we used
when mapping run lengths into a reference line is that we keep track
of the final black bit on each line; this enables us to save time
when reproducing images by not outputting trailing white space.

A major modification to this mechanism is necessary for devices that
have to output multiple lines of graphics at one time.  The original
of this type is the Epson 9-pin dot matrix printer, which outputs
data in slices of 8 lines.  (The ninth pin is unused in graphics
modes).  While a stream of bytes is again output, each byte is used
to encode the same bit for all 8 lines in the slice.  The printer
cannot print one line at a time, and has to print the whole slice;
the situation is analogous to parallel data transmission, as it isn't
possible to transmit a single bit over such a link.   As the
following table shows, the first byte maps to the first bit of all
the lines, the second byte maps to the second bit, and so on.

 Ŀ
        Byte 1         Byte 2         Byte 3       
 Ĵ
 MSB    line 1 dot 1   line 1 dot 2   line 1 dot 3 
 bit 2  line 2 dot 1   line 2 dot 2   line 2 dot 3 
 bit 3  line 3 dot 1   line 3 dot 2   line 3 dot 3 
 bit 4  line 4 dot 1   line 4 dot 2   line 4 dot 3 
 bit 5  line 5 dot 1   line 5 dot 2   line 5 dot 3 
 bit 6  line 6 dot 1   line 6 dot 2   line 6 dot 3 
 bit 7  line 7 dot 1   line 7 dot 2   line 7 dot 3 
 LSB    line 8 dot 1   line 8 dot 2   line 8 dot 3 
 

Clearly, the code use for mapping a run-length for a single-line
based output device is nor suitable for use when the destination
device outputs multiple lines in slices.  In a sense, a device such
as a laser printer, which takes single-line input, is analogous to a
serial transmission, as it effectively handles single bits strung
together in a sequence.  On the other hand, a printer that outputs in
slices is more like a parallel transmission, which manages whole
bytes at a time but can't transmit a single bit at all.

For all output, an array is dynamically created by multiplying the
number of lines in a slice by the number of dots across each page to
obtain the bit count, and dividing by 8 when allocating the memory in
bytes.

Obviously, a printer driver for something like a Canon Bubblejet,
which prints 48 lines in each slice, would require 48 times as much
memory as a printer of the same resolutions that accepts single
lines.  In fact, at 360 dpi the Canon Bubblejet has a resolution
better than many laser printers.  An 8.5" line at 360 dpi requires
3060 bits; 48 such lines require an array of 18360 bytes. If memory
is tight, and 18K isn't going to be available, then the slice would
have to be output in portions.

It is possible to develop a generic routine for outputting multiple
lines of data which works for all cases where the slice is exactly
divisible by 8.  It is therefore suitable for basic 9 pin dot matrix
printers, 24 pin near-letter quality printers and also 48 nozzle
inkjets.  The difference is that while a single byte is sufficient
when only 8 pins need to be controlled, three bytes per dot are
needed for 24 pin printers, and six bytes per dot are needed for 48
nozzle inkjets that work on the same principle.  The following table
shows how the first three bytes are used to encode a single bit from
every line of a 24-line slice.

 Ŀ
        Byte 1         Byte 2         Byte 3        
 Ĵ
 MSB    line 1 dot 1   line  9 dot 1  line 17 dot 1 
 bit 2  line 2 dot 1   line 10 dot 1  line 18 dot 1 
 bit 3  line 3 dot 1   line 11 dot 1  line 19 dot 1 
 bit 4  line 4 dot 1   line 12 dot 1  line 20 dot 1 
 bit 5  line 5 dot 1   line 13 dot 1  line 21 dot 1 
 bit 6  line 6 dot 1   line 14 dot 1  line 22 dot 1 
 bit 7  line 7 dot 1   line 15 dot 1  line 23 dot 1 
 LSB    line 8 dot 1   line 16 dot 1  line 24 dot 1 
 

A general-purpose bit-slice output driver needs to know the number of
scan lines in a slice.  Dividing this number by 8 gives the number of
bytes in a slice.  It also needs to keep a count of the number of
scan lines that are left in each slice; for the first scan line in
each slice this is always set to the total number of lines in the
slice, and it is decremented each time a scan line is completed. When
this count is decremented to 0, the slice can be output and the count
is reset to the maximum again.

Once the number of bytes in a slice is known, the technique for
skipping over white runs in a line is simply a matter of multiplying
the number of bytes in a slice by the run length, and adding that
number to the current index.  The reason why this is so is because
each slice skipped is equivalent to a bit skipped.

Mapping black runs in is more complicated.  First we finding out
which bit needs to be ORed into each slice; we do this by generating
a bit index bit_inx from the modulo division of number of the scan
lines left in the slice by 8 (which is the number of bits in a byte).
We then use bit_inx as a basis for shifting a single bit to generate
the correct byte for setting a specific bit via a logical OR.

The next step is to find out which byte in each slice needs to have
the bit set.  For this, we find out the number of bytes left in each
slice; this is simply a matter of dividing the number of lines left
in each slice by the number of bits in a byte, and rounding up to the
next boundary.  We generate a slice index slice_inx by subtracting
this from the total number of bytes in the slice.

We then loop for each bit in the run, and OR in the correct bit to
the byte in the output array indexed by byte_inx+slice_inx. Before
each repetition, we add the number of bytes in each slice slicebytes
to the byte_inx.

Note that the next fragment also keeps track of the final black bit
on each line; as described earlier, this helps us avoid printing
trailing white space on each line.

 if (color==WHITE)
 {
     byte_inx+=(slicebytes*runlength) ;
 }
 else
 {
     c_bit = 1 ;                        /* we are black */
     bit_inx = scanlines_left%8 ;
     if (bit_inx == 0) bit_inx = 8 ;
     if (bit_inx > 1) c_bit <<= (bit_inx-1) ;
     slice_inx=slicebytes-((scanlines_left+7)/8) ;
     for (;;)
     {
         if (last_black_byte<=i) last_black_byte=(i+1) ;
         faxline[byte_inx+slice_inx]|=c_bit ; byte_inx+=slicebytes ;
         if (!(--runlength)) return(1) ;
     }
 }

c)  Catering for lost lines and inserted lines

We return to the subject of entire scan lines that risk being lost
when scaling a fax image down to fit a lower resolution device.  The
way these are handled is quite straightforward.

We avoid losing any lines by holding the data in an array which is
initialized to white space at the start of each page, and
re-initialized only when the array is output.  If a call to vscale
returns 0 and the line isn't output, then the data is simply left in
the array.  If the output device is a printer which outputs slices
consisting of multiple lines, then the count of the number of lines
left in the slice is not decremented if vscale returns 0.

When the next scan line is decoded, the black runs that are output
are ORed into the array, while the white run simply adjusts the
index.  This means that if a line isn't output, the data it contains
remains to be combines with that from the next scan line.  Surplus
lines are never simply ignored; they are combined with valid lines
instead.

This technique assumes that black data is more significant than white
data.  This assumption is usually justified, and is widely made in
the fax business; it is well-known that white on a black background
reproduces less well when transmitted than black on a white
background.

When a fax is output to a higher resolution device, we have to cater
for inserted lines which result vscale returns a number greater than
1.  For single-line output devices, we just use the value returned by
vscale as the control variable for a loop which calls our output
routine.

For multi-line printers that output slices, we have to duplicate lines
by isolating individual bits and ORing them in the next bit
position.  This may be in the next byte of the slice.  As with all
bit-manipulation, this is a chore, but is fairly mechanical.  The
only real complication is that the count of the number of lines left
in the slice has to be decremented each time a line is duplicated;
and if the count is decremented to 0, the slice is printed.  While
the array holding the slice has to be re- initialized to white space,
the very last line in the slice may now have to be replicated as the
first line in the next slice.

As a sample which is typical of the sort of bit manipulation that you
can find in DOLINE.C in the subdirectory, we present the following
fragment, which shows exactly how this complex re-initialisation of a
multi-line slice can be performed.  We take the last byte of the
slice, and shift it left seven times.  This both moves the least
significant bit to the most significant position and also moves 0
bits in from the left.

We then replace it as the first byte of the same slice, with all the
remaining bytes being reset to white space.  Of course, if
number_of_line which was returned by vscale is 1, then we can
initialize the whole slice to 0, as the last line doesn't have to be
replicated.

 for (byte_inx=0 ; byte_inx<last_black_byte ; byte_inx+=slicebytes)
 {
     if (number_of_lines==1) faxline[byte_inx]=0 ;
     else faxline[byte_inx]=(faxline[byte_inx+(slicebytes-1)]<<7) ;
     for (slice_inx=1 ; slice_inx<slicebytes ; slice_inx++)
     faxline[byte_inx+slice_inx]=0 ;
 }

8.  Catering for different devices.

We conclude our presentation of decoding faxes with a few words on
the actual physical output of the data to a device.  There are three
types of device we cater for in the decode program in the
subdirectory.

a)  Dumping an uncompressed image to disk

The simplest of the devices that the code supports isn't really a
device at all; it is an option to dump the decoded scan line to disk
inside a wrapper for an uncompressed TIFF image.  It is primarily
provided for the convenient exporting of received fax images to other
programs.

Uncompressed TIFF is a widely accepted format for this purpose, while
the TIFF fax options are sometimes less useful than we might hope.

While the ability to generate an uncompressed TIFF file is useful, it
has its limits.  There are a number of graphics packages that
specialise in manipulating and converting images, and anyone who
needs to do any serious work with post-reception fax processing is
best advised to obtain a package which is designed to perform
whatever function is required.

Dumping a fax to an uncompressed TIFF image is simply a convenient
point to start from.  The routines provided for doing this aren't
particularly efficient and can be improved; for instance, they still
attempt to scale the image 1:1 before writing it to disk.  This is
probably acceptable in a program which is primarily designed to
output visible images and offers an option to generate images on disk
as an extra feature, but would be an unnecessary overhead if this
operation was being performed more intensively.

The initialisation function for writing an uncompressed TIFF asks for
the name and opens the file.  The deinitialisation function closes
the files.

The line output function counts and writes scan lines to disk, and
the end page function writes the IFD to the file after the data.

b)  Displaying an image on screen

The next simplest device is the option to display an image on an IBM
graphics screen.  Detecting the type of graphics screen is quite
straightforward. All three of 480 line VGA, 350 line EGA and 200 line
CGA are supported, with a screen width of 640 dots.  After the
appropriate scaling, the runs are simply written directly to the
screen RAM in byte- wide chunks of 8 adjacent bits at a time.

One notable feature of screen input is that all the bits in the fax
data are complemented before being written to the screen.  This is
because a totally blank screen is composed entirely of black space,
and any data written to it appears as white dots.  In contrast, a
sheet of paper, which is the more usual output medium, is composed
entirely of white space, and data printed on it appears as black
dots.  In TIFF terms, the two have opposite Photometric
Interpretations.

The initialisation function for writing data to a screen places the
video adapter in graphics mode.  The deinitialisation function
returns it to text mode.  The line output function writes lines of
data to the screen, and the end page function beeps at the user and
waits for a key press before clearing the screen and carrying on with
the next page.

c)  Outputting to printers

All printers are controlled by escape sequences.  The sequences for
the printers in our code can be found in the OUTPUT.DAT file in the
subdirectory.  The data is structured in the form of a count byte,
followed by the commands, and the printer function in OUTPUT.C is
responsible for sending the data to the stdprn device.  The reason
why a length byte is always needed (and why null-terminated strings
are no good for this purpose) is because many printers can contain
null bytes as part of a command; for example, most dot-matrix
printers include their own 16-bit length word with graphics data, and
this can easily contain a null.

The initialisation function for a printer contains whatever commands
are needed as preliminaries before setting the printer up in the
required graphics mode.  Generally speaking, the simpler the printer,
the fewer commands are needed to complete this step.  The most
complicated printer we have codes for are those using the Canon CaPSL
control language, which requires a set of seven separate commands.
Hewlett-Packard printers, which use the rather less rich PCL, require
only three commands.  The only commands needed by dot-matrix and
other multi-slice printers are ones to set the proper vertical
spacing between slices.

The initialisation sequence in the code includes an operating system
call to place the printer in raw mode (as opposed to cooked mode) in
order that characters such as ASCII EOF codes (0x1a) which occur as
part of the data may be streamed to the printer transparently.

The line output function is broadly similar for all printers.  The
exact escape sequence varies, but the basic pattern is that of an
escape sequence with an embedded count, followed by the image data.
The count may be of bytes (in single-line printers) or of slices (in
multi-line printers).  The details of whether anything needs to be
sent after the data also vary; typically, multi-line printers follow
a slice with a carriage return and line feed, while single line
printers either implicitly move one dot vertically, or need an
explicit escape sequence.

The process of actually generating the output data was where most of
the differences between single line and multi-line printers were
handled.

The ones that remain at this stage are, by comparison, trivial.

The deinitialisation and end of page functions are identical for
printers; a form feed is an invariable part of the procedure, and if
the initialisation sequence required placing the printer in a unique
graphics mode, as with Hewlett-Packard printers, then the form feed
is preceded by a return to normal ASCII operation.

PART 6 : FAX TRANSMITTING AND RECEIVING

A number of fax modules are included in the FAX subdirectory.  They
are all tied in to one sample exerciser program, appropriately called
FAX.EXE, which sends and receives faxes on any type of TIA/EIA fax
modem.  There are a large number of source files that go to make up
this exerciser; in fact, there are so many that under DOS they have
to be managed using a library, as the command line isn't long enough
to take a command to link all the object modules together.

The source files actually fall quite neatly into five groups.

i. The first group includes those files that are required for the
exerciser program.  The user interface and all the preliminary work
is handled by FAX.C, with FAXIN.C and FAXOUT.C calling the functions
appropriate to the type of modem being used.  The header file FAX.H
should also be included in this group, as it includes a highly
important session structure.

ii. The second group includes those lower-level routines which
underpin all the other functions.  The ASYNC.ASM module which
implements low-level port i/o is specific to IBM PC architecture, but
the low-level functions which it implements are needed for any fax
implementation. The FAXUTILS.C module builds on ASYNC, and includes
the basic string handling and timing functions needed to control a
fax modem, while FAXMODEM.C is a universal routine for verifying and
initialising TIA/EIA fax modems.  You should ideally have read a
little on the class 1 and 2 command sets (such as the SUPRA
documentation) in order to follow the code in this group.

iii. The third group consists of routines to control class 1 fax
modems.

FAXIN1.C is a fax receiver while FAXOUT1.C is a fax transmitter. Both
these modules depend in turn on FAXCLAS1.C, which implements the
basic class 1 routines +FRH, +FTH, +FRM and +FTM commands.  This
group also includes FAXECMRX.C and FAXECMTX.C, for receiving and
sending error corrected faxes, and FRAMES.H, which contains the
definitions and structures needed to handle T.30 negotiations.  You
should be familiar with the T.30 specification in order to follow the
code in this group.

iv. The fourth group consists of routines to control class 2 fax
modems.  It consists of two modules only;  FAXIN2.C is a fax receiver
while FAXOUT2.C is a fax transmitter.  This group requires
familiarity with the class 2 command set.

v. The fifth group consists of routines to control class 2.0 fax
modems, which are derived from those for class 2 modem.  Like the
last group, it consists of two modules only;  FAXIN2.C is a fax
receiver while FAXOUT2.C is a fax transmitter.  This group requires
some knowledge of the class 2.0 command set.  Regrettably no public
domain source of this information appears to be available.

1.  The exerciser - FAX.C and FAX.H

The code contained in first group is quite straightforward.  Its
primary purpose is to provide a common entry point for testing out
all the other modules.  It does this by setting up a session
structure, which is defined in FAX.H.  The main exerciser FAX.C
initialises this structure with the various options required for any
particular operation.  Both command-line operation and an interactive
interface with suitable prompts are provided for.  FAX.C includes a
call to a modemset function which checks and initialises the modem,
and provided this operation is satisfactory, it then calls either
faxin or faxout, depending on whether a fax is to be received or
sent.

All these three functions return an integer.  Any non-zero value
indicates an error has occurred, and is passed back the operating
system via a normal exit.  A suitable message is displayed before
this is done.

The purposes of many of the session structure members are self
explanatory. For instance, the type of fax modem is carried in fclass
and can be automatically detected by the modemset function.  It can
also be explicitly set up for a particular class, which is useful if
a modem has multiple TIA/EIA FCLASS capability. Another fairly
obvious member is port, which contains a setting for the
communications port which the modem is attached to.  While the
programs supplied assume an IBM PC architecture, this isn't really a
PC specific requirement, as any system with multiple serial ports
will need to have the modem address specified.

The speed used to initialize the port is optionally passed in speed;
the common default of 19200 bps is used if no speed is specified. Not
to be confused with this is brate, which holds the preferred speed at
which a fax should be sent or received.

There are a number of equally simple flags in the structure.  The
direction flag is set to S if a fax is being sent, and to R if a fax
is to be received.  The verbose flag contains 0 to suppress progress
messages from the software and both data and text messages from the
modem, or else contains 1 to display them. The ecm flag (which is
only valid with class 1 modems), indicates whether error correction
is to be used or not, and the twodee flag performs the same task for
two- dimensional coding.

Three more members of the structure contain text strings.  The local
ID used to compose the CSI and TSI frames is held in id, while the
name of either the file holding the fax being sent or the file which
will hold the fax being received is equally appropriately held in
filename.

Finally, phone holds the telephone number of the destination fax, or
in the case of reception, a single character indicating whether the
telephone is to be answered immediately or only when it rings.

This structure is also used as convenient means of passing common
pointers and parameters around the various functions in the suite.  A
well as the file pointer faxfile (which points at the file named by
filename), a memory pointer faxbuf and a TIFF image structure are
also members.  These pointers are initialized before any fax session
by the functions contained in FAXIN.C and FAXOUT.C.  The eventual
speed at which a connection is established is held in bps.  The final
member of the session structure is a timeout, which is mostly used to
control the various stages of the T.30 protocol when a class 1 modem
is being used, though it also enables recovery from modem errors on
any type of modem.

The session structure definition is as follows :

 struct session
 {
     char fclass ;
     char port ;
     unsigned int speed ;
     unsigned int brate ;
     char direction ;
     char verbose ;
     char ecm ;
     char twodee ;
     char id[21] ;
     char filename[65] ;
     char phone[33] ;
     FILE *faxfile ;
     char *faxbuf ;
     struct IMAGE image ;
     unsigned int bps ;
     unsigned int timeout ;
 } ;

2.  Low-level functions - ASYNC.ASM

There are a number of machine-specific functions implemented in
ASYNC.ASM for IBM PC architectures which would need to be
re-implemented for other environments.  Most of these are quite
straightforward to implement on operating systems that provide
reasonable services for handling serial ports.

The serial port is initialized using the init function and
deinitialised using deinit.  An additional function reinit is also
needed whenever the speed at the port needs to be changed.

Character input and output are handled through simple rxchar and
txchar functions; the latter needs to be able to handle output flow
control.

Additionally, serial port buffer status is returned by the rxstat and
txstat functions.

Two final functions lowerdtr and raisedtr are used in the code in the
subdirectory to hang up and reset the modem by dropping and then
raising the DTR signal at the serial port.

3.  PC specific routines

Implementation of these functions for the PC is mostly a matter of
quite straightforward UART programming.  The reason why this module
is coded in assembler rather than C is because of this specific
hardware bias.  There are a few points of interest which it is worth
going into in more detail.

i. The port initialisation function takes three parameters.  As well
as the port number and the bit rate (which would be required on any
system) init also requires an IRQ number.  This isn't always used, as
if the port passed is a number such as 1, 2, 3 or 4 it is assumed to
be a normal COM port.  In this case the port address is taken from
the table the BIOS maintains at location 0:400, and odd ports are
assumed to use IRQ 4 while even ports are allocated IRQ 3. The actual
value passed as the third parameter is ignored in this case.

However, if a port number greater than 255 is passed to the function,
then it is assumed to be a port address, and the third parameter is
assumed to be a real IRQ number.  The fact that only IRQ numbers from
2 to 7 are supported is in line with the prevailing PC industry
practise of implementing serial port and internal modems as 8 bit
cards.

ii.  Both transmit and receive character i/o is handled through
interrupt service routines.  Note that interrupt driven output is as
important for stopping transmitter underrun when sending faxes as
interrupt driven input is for stopping receiver overrun when
receiving faxes.  Techniques for writing interrupt service routines
for received characters are well documented in a number of places,
but corresponding routines for transmitted characters are less well
known and are also more complex.

All characters passed to txchar are placed in a transmit buffer, from
which they are automatically removed, and sent out whenever a
transmitter empty interrupt is generated by the UART.  If an
interrupt is generated when no characters are available to send
because the transmit buffer is empty, then nothing is sent ; the
transmitter need not be explicitly disabled, as the interrupt is
generated only once, when the transmit holding register becomes
empty.

If both the UART transmit holding register and our own transmit
buffer was empty when txchar is called, then the transmitter has to
be kick- started by forcing a character out of the port. Subsequent
character transmissions (until the transmit buffer empties again) are
automatically catered for on each interrupt.  Our routine also checks
to see if the transmitter is empty even when the transmit buffer has
data in it just in case we missed a transmitter interrupt. Some of
the UART emulators built in to multi-function cards are not as
reliable as they ought to be at handling multiple interrupts, and if
a transmitter interrupt is missed, there is no way that an interrupt
service routine can recover unless such a check is explicitly built
in.

iii.  The receive and transmit FIFO buffers are automatically enabled
and used if a 16550 UART is present.  The trigger levels for both are
set at 8 bytes.

Our own transmit and receive buffer sizes should be defined as an
integral power of 2, since this simplifies wraparound arithmetic.  We
use a 2K buffer, which is a fairly pragmatic figure suitable for most
software installation.

iv.  As the routines should be usable for both class 1 and class 2
modems, flow control is only implemented for output ; it is assumed
that the application software can process characters as fast as they
arrive.  Both hardware and software flow control are handled by the
interrupt service routine. Characters are only transmitted if CTS is
high and if the last character received was not an XOFF.  Conversely,
if CTS goes high or if an XON is received, the transmitter is
restarted if there is data available to be sent.

v.  It is worth noting that the algorithm used for programming the
UART speed is capable of setting almost any integer speed, not simply
the usual multiples of 2400 bps.  It is theoretically possible to
minimize modem flow control events by matching the serial port speed
more closely to the actual transmission speed of the modem ; for
example, 12000 bps asynchronous is a far better match for a normal
9600 bps synchronous fax transmission than the usual 19200 bps.

vi.  The C functions don't call txchar, but instead send characters
via a higher-level qtxchar() function.  This implements a check that
the character has actually been sent, with automatic retry and a
10-second timeout if it cannot be sent at all.

4.  Notes on recoding ASYNC for other architectures

The basic functions contained in ASYNC are fairly easily recoded for
other architectures and operating systems provided that flow control
events can be handled rapidly enough.  There are a number of other
points to watch out for.

i.  Multi-user and multi-tasking operating systems, together with
single- user systems running over networked ports, cannot be assumed
to guarantee receipt or delivery of octets within the timing
constraints needed to satisfy the requirements of the T.30 session
protocol.  For example, a change in the modulation scheme during a
transmission requires a delay of 75 20 milliseconds.  The optional
ECM mode requires even shorter programmed delays of 50 milliseconds
after RCP frames.  If these delays are either too short or too long
this will almost certainly lead to errors.

Even if we assume that the operating system in question is capable of
accurate timings of this order, it is often not possible to know what
the delay might be between calling txchar on and the character
actually being sent, or between the arrival of a character and its
availability for retrieval by rxchar.  For this reason, class 1
modems which directly implement the T.30 protocol might prove
unreliable in these environments.

Even where reliability is possible, it might be possible only by
hogging the processor, and this may be a price which slows down the
system to such an extent that users aren't willing to pay it.  Class
2 modems, which relieve an application of the need to adhere to the
T.30 timings, are the obvious choice of modem in these situations.

ii.  If our transmit buffer becomes full, qtxchar() waits for the
interrupt service routine to make room with a fairly primitive busy
waiting loop which times out after ten seconds.  Busy waiting loops
are in general a bad idea, and should be avoided on multi-user and
multi- tasking operating systems.  If an operating system is being
used which supports some method of suspending a process until there
is room in the transmit buffer, this should replace the busy waiting
loop at this point.

The same consideration applies to the loop which waits for characters
to come in when the receive buffer is empty, and the silence routine
(discussed in the next section), which is basically a busy waiting
loop for one clock tick.  As a general guideline, all the calls to
clock() which implement waiting loops can easily be recoded to
suspend a process for a tick or two instead of actively waiting.  The
sort of timeout loops we have in the software at the moment enable
this  to be done quite easily.  We simply amend

        t=(clock()+(timeout*CLK_TCK)) ;
        while (rxstat()==0) if (t<clock()) return (0) ;

by adding a simple else clause as follows to suspend for a number of
ticks.

        t=(clock()+(timeout*CLK_TCK)) ;
        while (rxstat()==0) if (t<clock()) return (0) ;
        else suspend (ticks)

iii. If operating services permit the transmission or reception of
data in blocks, it will always be more effective to make use of these
facilities.  Entire buffers can then be sent or filled with one
operation.  Usually, only trivial amendments to the code are required
to take advantage of these facilities.  An additional txflush
function to force the dispatch of any characters in the transmit
buffer even though it may not be full is generally needed also.

5.  Low-level functions - FAXUTILS.C

With one exception, the routines in this module are all required for
controlling a fax modem and dependent on the proper implementation of
the ANSI time function clock and the macro constant CLK_TCK which
reflects the granularity of the timer.

The function silence simply waits for one tick of silence at the
serial port as measured by CLK_TCK.  The point of this routine, which
loops until one complete clock tick passes with no characters coming
in and no characters going out, is to enable a modem to stabilise
between the issuing of AT commands, or after a change of speed, or
after a reset.

You must remember that fax modems are far from being all alike, and
they can differ quite dramatically in the way they respond to various
commands.

Its use in the program in the subdirectory is fairly pragmatic;
during testing of the software, part of my standard technique for
fixing problems when a modem failed to respond to a command was to
prefix the command with a call to silence.  When this delay, which is
between 55 and 110 milliseconds on a PC, had the desired effect, I
simply left the call in the code.  Feel free to remove any of these
calls if your modem works without them; feel free to rewrite silence
for your own hardware as required.

Note that most fax modems will not accept commands when they are
still issuing responses to the last command; and these responses
include new line characters.  Given that there is no guarantee of
uniformity between different fax modems regarding these responses, I
found it necessary to wait for silence before sending any commands
out of the serial port.  The problem here is that on an IBM PC, the
granularity of the system clock means that a delay smaller that 110
milliseconds is impossible to obtain from a straightforward call to
clock.  If a T.30 session is being controlled using a class 1 fax
modem, this is far too coarse. Where timings of 75  20 milliseconds
are required, a delay of up to 110 milliseconds is quite clearly
unacceptable.

The export function which sends commands out of the serial port needs
to wait for the smallest practical period of time.  In fact, the
version of the function in the subdirectory waits for a fairly
empirical 2 milliseconds.

The following code fragment is similar to that used in export to
manage this delay; it ought to work on almost any machine using
standard C calls no matter what the granularity of the clock might
be.  The function is called once at the start of the program to
calibrate the timer.  It first waits for the clock to tick, then
counts the number of calls to clock() in one second.  This number is
always to be measured in the thousands, and is stored as a long
integer.

All subsequent calls to export will first call clock however many
times are needed for a two character delay.  Note that the
calibration loop and the delay loop all involve one call, one
comparison and one increment; this makes the algorithm reasonably
accurate.  Like all timing delays of this magnitude, it isn't going
to be exact, but it should be well within any tolerance required.
Unless you want to run with interrupts off, exact timings are always
going to be impossible.

This technique will not work on operating systems with pre-emptive
multi- tasking, as the fax program isn't guaranteed access to the
processor at all times and neither the calibration nor subsequent
loops can be assumed to have occurred during the same time slice. For
class 2 modems this should make little difference; for class 1 modems
it does.  Once again, it must be stressed that an operating system
based on pre-emptive multitasking that does not offer facilities for
either accurate timings or changing the priorities of competing
processes cannot be considered suitable for the sort of real-time
communications programming that class 1 fax modems require.

In other words, if the method shown in the following fragment can't
be made to work reliably then your system is probably not going to be
satisfactory when used for real-time communications.

 /* the export() function begins like this */
 unsigned int x ;
 static unsigned long int calls=0L ;
 clock_t t ;
 if (calls==0L)
 {
     t=clock()+1 ;
     while (t>clock()) ;
     t=clock()+(CLK_TCK) ;
     while (t>clock()) calls++ ;
     calls/=500 ;
     calls++ ;
 }
 else for (x=0 ; x<calls ; x++) clock() ;
 /* now carry on with export() proper */

The import function is much simpler than export.  It basically is
intended to get lines.  It returns a string beginning with the first
printable character received and ends with the first non-printable
character received, or when the allocated string space is full.

The function is called with a pointer to a line buffer, the length of
the line buffer, and an integer timeout in seconds.  The timeout is
used to set a period of silence after which the call is assumed to
have failed, rather than a maximum time for receipt of the whole
string.  In other words, the timeout is reset after each character
received.

A successful call returns with the length of the null-terminated
string.

The string itself is returned in the line buffer in upper case.  An
unsuccessful call always returns 0.  This means that a timeout
occurred, but the buffer may nonetheless contain a partial line.

The matchs string matching function simply combines our own import
function with the standard ANSI C substring function strstr to wait
for any string of up to 64 characters.  The framestat function, which
is primarily for use with class 1 fax modems, is very similar.  It
returns successfully, with 1, on receipt of OK or CONNECT, but
unsuccessfully, with 0, on receipt of ERROR or NO CARRIER.  A special
return of -1 on timeout enables T.30 timeouts to be programmed when
using class 1 faxmodems; this is something that is handled by the
getframe function in the FAXCLAS1.C module.

The one function in this module that doesn't need a timer is invert,
which inverts the order of the bits in a byte. Usually, the preferred
method of implementing this function is via a lookup table contained
in BACKWARD.DAT, which simply consists of the inverted bit patters of
all 256 possible values of an 8 bit character, from 00000000 to
11111111.

The function is straightforward enough :

 unsigned char invert (unsigned char octet)
 {
     return (backward[octet]) ;
 }

Obviously, it makes sense to replace all occurrences of invert (char)
with char=backward[char] to avoid the code and timing overhead of
function call.  The reason why invert<c> is left as a function is
because alternative implementations to a lookup table may make more
sense under some circumstances.  In particular, the trade-off of
program size against speed may put a premium on space, while the fact
that octets are being transmitted at the relatively leisurely rate
(in terms of CPU speeds) of no more that 14400 bps means that time is
not necessarily going to be a problem.  The following code does the
same job as the lookup table in software :

 unsigned char invert (unsigned char octet)
 {
     unsigned char tetco, newbit, mask ;
     for (tetco=0,newbit=0x1,mask=0x80 ; mask ; newbit<<=1,mask>>=1)
     if (octet & mask) tetco |= newbit ; return (tetco) ;
 }

while the following embeddable 8086 assembler to reverse an 8-bit
character in ax is even quicker :

    mov     ah,al
    mov     cx,8
 reverse:
    shl     ah,1
    rcr     al,1
    loop    reverse

6.  Low-level routines - FAXMODEM.C

The two routines modemset and modemreset control the mode that the
modem is being used in.  Because our suite of programs is designed
for fax, the modemset function places the modem in fax mode using the
appropriate AT+FCLASS command, while the modemreset command places
the modem back in data mode.

If the attempt to place the modem on the appropriate +FCLASS is
successful, modemset returns with a 0.  Otherwise, it returns with an
error code indicating the nature of the problem.

The initialisation function has to be passed a pointer to the session
structure we outlined earlier, as it needs to know the port to use,
the speed to use, and the preferred fax modem class. If the fclass
member of the structure is 0, then our initialisation routine
attempts to autodetect the capabilities of the modem and set the
class appropriately.

Where a modem has multiple capabilities, we have to decide which one
is to be preferred.  The version of this function in the subdirectory
prefers class 1 to class 2 and class 2 to class 2.0.  This simply
reflects the fact that the programs were tested on single-user
single- tasking DOS machines, and there was therefore no reason to
give up the greater capabilities of the class 1 devices (notably
error-correction); and our reluctance to use class 2.0 reflects the
fact that (as of 1994) the command set must still be considered new
and potentially unstable. If the source code is being recompiled for
other environments or for specific modems, none of these
considerations would necessarily be relevant.

The deinitialisation function modemreset is much more
straightforward.

The command ATHE1+FCLASS=0 is sent to the modem, and once an OK
response is received, the port deinit function is called.

Dumb terminal and verbose mode The session structure contains one
member, verbose, that makes a significant different to the way that
the program runs.  The FAX.H header file contains two macro
definitions as follows :

 # define PRINTF if (!(defs->verbose)) ; else printf
 # define PUTS if (!(defs->verbose)) ; else puts

You'll find a scattering of strings output with PRINTF and PUTS
instead of the normal C functions printf and puts throughout the code
in the FAX subdirectory.  All such strings are suppressed unless the
verbose flag is non-zero. The verbose flag is a useful tool either
for people who just want to see exactly what is going on, or else for
situations where a problem exists.

Typical uses of these macros are where the export function includes a
PRINTF of all characters sent to the modem; and although the import
functions itself doesn't use either macro, both framestat and matchs
include PUTS for all lines received.  Additional progress report
messaged using PUTS are scattered throughout the code; hexadecimal
dumps of HDLC frames are also given when using a class 1 fax modem.

When the verbose flag is non-zero, the modem initialisation function
includes a dumb terminal routine.  The theory behind this is that
when the exerciser program is being used to test out a particular
modem, part of the procedure is almost always going to be working out
the correct AT initialisation string to use.  The dumb terminal
routine enables modem commands to be typed in at the keyboard; this
is done after the AT+FCLASS command to put the modem into fax mode. A
typical use of this feature is for setting various options for flow
control.  The dumb terminal routine can be ended by pressing ESC,
when the rest of the exerciser program carries on as normal.
Alternatively, pressing control-x abandons the whole program.

When the verbose flag is 0, there is no opportunity for using the
dumb terminal routine, and there are no progress messages or modem
dialogue displayed on screen.  Personally, I always prefer to see
what is going on rather than having a blank screen to stare at, but
this is largely a matter of taste and convenience.  A batched or
background fax operation has no need of any user messages.

The reason why the PRINTF and PUTS macros were defined in the way
they were rather than via a more usual #if debug type construction is
because of the real-time nature of the operations involved.  During
the testing of the software in the subdirectory, it became apparent
that having separately compiled debug and non-debug versions of the
same source code was less than satisfactory because the nature of the
timings involved meant that the two versions behaved in different
ways.  This is of course especially true of class 1 modems, as the
timings are much more critical.

It is inevitably that differences should exist, and the definition of
our macros is an attempt to minimize the differences.  Apart from the
fact that the decision to output debug messages is made at run time
rather that at compile time, the fact that even when verbose is 0 the
macros will force a printf or a puts with a null string makes the two
modes of operations as similar as possible.

For proper debugging, the output from PRINTF or PUTS should be either
piped to a disk file or else the macros should be rewritten to make
this logging explicit.  Timing information could also be included at
suitable points if the granularity of the system clock provides
useful information; this isn't usually true of IBM PC systems, where
the clock can provide useful information for logging the time and
duration of fax sessions, but is not accurate enough for debugging
class 1 T.30 session reports.

7.  Primitive class 1 functions - FAXCLAS1.C

The first of our class 1 specific modules is FAXCLAS1.C, which
contains four generic class 1 routines.  These correspond to the four
main class 1 fax commands.

i. sendframe

The function sendframe transmits a frame of HDLC data and is often
used with the AT+FTH command.  However, in many circumstances, a
class 1 modem will prompt for HDLC frame data with a simple CONNECT,
without having being specifically told to with this command. For
example, FTH=3 command is implied by all calls answered, and is also
implied where multiple frames follow in sequence.  The modem detects
this by checking the P/F bit (the fifth bit in the control octet) of
every frame sent out.  Other frames that are never preceded by +FTH
include the RCP used in error corrected mode.

This function also caters for resending frames where there has been
no response.  In this situation, the timeout is detected by the
application and the modem is sent a character (the CAN character
decimal 24 in this case) to abort the receive command.  When this
occurs, an explicit +FTH command is needed even if the original frame
did not require one.

Because of these considerations, the sendframe function actually
looks quite complex.  It is called with the usual session structure
and a pointer to a character array containing the frame data to be
sent; the first character in the array must always consist of a count
of characters to be sent.  The leading 0xff which is the standard fax
address octet is automatically added to the start of the frame by
this function and the 16-bit frame check sequence is added by the
modem at the end, so only the intermediate octets (beginning with the
control octet) need to be sent.

As no frame can take longer than 3 seconds, there is a theoretical
maximum size of 900 bits per frame at 300 bps.  In practise, all the
regular T.30 frames are much shorter, and we allow 36 octets for the
frame data. Before sending the data, it is copied to an internal
static array, which enables easy resending of any frame that fails to
elucidate a proper response. If sendframe is called with a null
pointer, then the last frame is resent instead.  We don't bother to
resend optional frames.

The conditions under which the sending of the +FTH and the wait for a
response are omitted are a)  if the frame was an RCP or b)  if the
frame is the second in an ID sequence (DCS or CSI or DIS) and it is
being sent for the first time.

This function returns a code via a call to framestat.  In other words,
a value of 1 indicates a response of either CONNECT or OK, while a
value of 0 means NO CARRIER or ERROR.  A value of -1 means that the
+FTH command timed out; given the half-duplex nature of the link,
this is a fatal error .

ii. getframe

The function getframe receives a frame of HDLC data and is used with
the AT+FRH command.  However, the +FRH is issued by the caller and
not by this function; it starts off with an attempt to detect the
primary CONNECT response to the +FRH.  Once this is received,
incoming data is placed in a character buffer (the pointer to which
is provided by the caller) until the <dle><etx> sequence is
detected.  The function then looks for the secondary OK or ERROR
response to find out whether or not the frame was a good one.

If the frame was bad, the last transmitted frame is resent.  This
also occurs if the frame was a good CRP command repeat frame.

As well as incoming data being returned to the caller, it is also
displayed in hexadecimal format if the verbose flag is set. Debugging
of T.30 HDLC data often an essential tool in finding out what is
causing problems with specific fax sessions.

Once again, this function looks quite complex, as it includes the
code for implementing the mandatory 3-second response timeout. If no
frame is received within 3 seconds, the last frame is resent (by
calling sendframe with a null pointer), and if no response is
received after three attempts, then a fatal error code is returned.

The function also included code to handle the special case of +FRH
being implied by the phone being answered each time a fax call is
made.

iii. sendpage

The function sendpage transmits phase C fax data and is used with the
AT+FTM command.  It is called with the usual session  structure, and
also with the minimum scan line time as arrived at during the T.30
negotiations.  The function assumes that the negotiations have
successfully informed the recipient of the type of data that is being
sent.

We start by issuing an AT+FTM=<mod> command, where <mod> is the
modulation corresponding to the speed negotiated, which is reported
in the bps member of the session structure.  We wait for a CONNECT
response via framestat before continuing.  Since the session
structure also contains within it the pointer to the file being
transmitted, together with the offset of the image within the file
and its length, all we need to do is to fseek to the correct place in
the file, read the required number of bytes, and send them via
txchar.

Note that if the image is one that uses TIFF FillOrder=1, then each
byte in the file needs to be inverted before sending.

The task of sending is complicated by the need to pad out each scan
line to the minimum time required.  The actual length required for
each line is easily worked out from the bit rate - in the code
supplied, the number of bytes is worked out from the bit rate using
the formula minscan=(((bps*100)/(1000/minscan))/8)+1 which looks
complicated until you realise that minscan is passed as a time in
milliseconds and bps is a bit rate in hundreds per second (so bps=96
is 9600 bps).  Once minscan is converted from a value in milliseconds
to the equivalent count in bytes, all we have to do is count the
bytes between successive EOL codes and add sufficient null bytes to
bring the value up to the computed minscan as required.

The difficult part is detecting the EOL codes.  If we could be sure
that the EOLs were byte-aligned then the task would be simplified, as
all we would need to do is search for the bit pattern xxxx0000
followed by 00000001.  Unfortunately, we can't be sure that EOLs are
byte-aligned.

This means that we have to count the number of successive 0 bits and
find the EOL codes ourselves.

The way we do this is to count the number of trailing 0s in each
octet sent out first  We do this (once each octet has been inverted
for transmission as necessary) using the algorithm

 for (trailing0s=8 ; octet ; octet>>=1,trailing0s--) ;

One limiting case is where octet==00000000, in which case the loop
immediately terminates with trailing0s=8.  The other limiting case is
where octet==10000000, in which case we have to complete 8 right
shifts before the loop terminates with trailing0s=0.

The next step is to count to number of leading 0s in the next octet to
be sent out.  Once again, we do this after any necessary inversion,
using this complementary algorithm

for (leading0s=8 ; octet ; octet<<=1,leading0s--) ;

We then add together trailing0s and leading0s; if they are over 10,
then we have found an EOL code and proceed to pad the line out as
necessary.

Once all the phase C data has been transmitted, we transmit the
necessary RTC sequence for the coding used to create the fax image
(found in the options member of the TIFF image structure included
within the session structure).  We end the phase C data with the
normal <dle><etx> sequence.

This is not, however, the end of sendpage, which valiantly carries on
and sends the appropriate phase D post-page command. Before this is
done, the AT+FTS=8 command delays for 80 milliseconds in accordance
with the T.30 requirement for a transmitter to delay after switching
modulation schemes without turning the line around.

The type of phase D command to be sent is decided on by inspecting
the offset of the IFD for the next image in the transmitted file.  If
the offset is non-zero then there is assumed to be another page, and
we send an MPS frame.  If the offset of the next IFD is zero, then
there are no more pages, and we send an EOP frame instead.  Note that
this method can't cope with TIFF files containing images in different
formats, which would require an EOM frame followed by a return to
phase B.  Implementing this feature is left as an exercise for the
student.

The sendpage function terminates at this point; all the caller has to
do is find out the post-message response using getframe.

iv. getpage

The final member of our primitive class 1 quartet is the function
getpage, which receives phase C fax data and is used with the AT+FRM
command.  It is the simplest of the four functions, and in essence
just needs to issue the AT+FRM=<mod> command as required by the bps
parameter set during the T.30 negotiations, and then swallow all
subsequent data, buffering it to disk, until a <dle><etx> is
detected.

In our implementation, getpage is padded out with some possibly
unnecessary frills.  We discard all incoming data until the first EOL
code; many fax systems start pages off with a succession of flag
sequences.  We also perform EOL detection using the same trailing0s
and leading0s algorithms described earlier. The main reason why this
is done is so that the number of lines in the image can be accurately
included in the TIFF IFD.  Arguably, it is something that doesn't
necessarily needed to be handled in real-time.  However, it imposes
no real strain on most CPUs and is therefore included in our
implementation.

Since we are detecting EOL codes, we also implement our own 6 second
timeout on the maximum length of a scan line; arguably, the class 1
modem ought to be able to detect overlong scan lines itself.

Once a <dle><etx> is detected, the function calls tiffwrite_ifd to
tidy up the TIFF file, and returns to the caller with 0 for success.
Once again, non-zero returns are used to indicate a variety of error
conditions.

8.  FRAMES.H and assumptions about compilers

This header file defines the FCFs for the frames specified by the
T.30 fax session protocol.  It isn't quite as simple as it looks.

The first complication is one that we are by now used to; the
definitions of the octets are inverted from the bit patterns given in
the T.30 protocol.  A frame such as a DIS has the bit pattern
00000001, but the corresponding line in the header file is # define
DIS 0x80 The most significant bits of each FCF are the X bits listed
in T.30, and are set according to which station received a DIS.
Because of the inversion of the bit order by the synchronous to
asynchronous conversion process, these appear as the least
significant bits of the second octet of each incoming frame.  This is
the reason why our code precedes the examination of these FCFs by
ANDing them with 0xfe.

The second complication is related to the first, and involves the way
that the T.30 bit fields are defined in FRAMES.H.  In our
implementation, the contents of the DIS/DTC/DCS frames are stored in
a bitfield structure which is in turn combined in a union with an
array of bits to allow for easy initialisation.  The problem is that
according Kernighan and Ritchie "(bit)fields are assigned
left-to-right on some machines and right to left on others,
reflecting the nature of different hardware.  This means that
although fields are quite useful for maintaining internally-defined
data structures, the question of which end comes first has to be
carefully considered when picking apart externally-defined data"
(section 6.7 of "The C Programming Language", Kernighan & Ritchie,
Prentice-Hall 1978).

In fact, there is no onus on any compiler to adhere to any specific
ordering of bit fields within bytes.  The C compilers which I have
been using all order bitfields from right to left, and I have assumed
that this hold for all C compilers anyone else might use.  In other
words, if a union is defined as follows

 union right_to_left
 {
    unsigned char byte ;
    struct
    {
        unsigned bit1: 1 ;
        unsigned bit2: 1 ;
        unsigned bit3: 1 ;
        unsigned bit4: 1 ;
        unsigned bit5: 1 ;
        unsigned bit6: 1 ;
        unsigned bit7: 1 ;
        unsigned bit8: 1 ;
    } bits ;
 } ;

then the Boolean expression byte&1 tests the value of the structure
member bits.bit1 while byte&0x80 accesses bits.bit8.

You should note one peculiar fact which may not be immediately
obvious.

When bitfields are ordered from right to left, this naturally and
conveniently inverts each byte for us.  The first bitfield defined in
the header really does correspond to the first bit in the FIF, even
though the asynchronous conversion process means that it is actually
the eighth bit received.  Though the conventional right-to-left bit
ordering actually reverses the order of the bits as read, this is
cancelled out by the fax modem, which does exactly the same thing and
reverse the bits the right way round again.

The right-to-left bitfield ordering convention is one that needs to
be verified on any specific C compiler for the class 1 code to work
properly.   In fact, it is one of a number of assumptions that the
code makes; for instance, we assume that a byte is 8 bits wide, that
a char occupies one byte, while a short int occupies two bytes and a
long int occupies four bytes.

Ideally, these assumptions ought to be verified before compiling any
of the code on a new target machine.  Most compiler users will be
aware if their machine has peculiar word lengths, or if their
compiler has unusual values for sizeof int.  However, very few
programmers could give a definite answer to the question as how their
compiler handles the ordering of any bitfields, and it is something
which doesn't often appear in reference manuals either.

The following code ought to compile well on any computer, and performs
a number of checks which, taken together, should identify which (if
any) of our assumptions are untrue.

 #include "stdio.h"
 void main ()
 {
     union of_types
     {
         unsigned short int s ;
         unsigned long int l ;
         unsigned char c[4] ;
         unsigned highbit: 1 ;
     } testdata ;
     int n ;
     char i ;

     printf ("Size of short int = %u\n",sizeof(short int)) ;
     printf ("Size of long int = %u\n",sizeof(long int)) ;
     printf ("Size of char = %u\n",sizeof(char)) ;
     printf ("Filling union of 4 bytes with 4-3-2-1\n") ;

     for (n=0, i=4 ; n<4 ; i--,n++ ) testdata.c[n]=i ;

     printf ("\tLong Integer in complete union is 0%lx hex\n",testdata.l) ;
     printf ("\tShort Integer in first two bytes is 0%x hex\n",testdata.s) ;
     printf ("\tChar in first byte is 0%x hex\n",testdata.c[0]) ;
     printf ("Therefore this machine is ") ;

     if ((testdata.l==0x01020304)&&(testdata.s==0x0304))
      printf ("little-endian Intel (II) type\n") ;
     else    if ((testdata.l==0x04030201)&&(testdata.s==0x0403))
      printf ("big-endian Motorola (MM) type\n") ;
     else
      printf ("of an unknown type\n") ;
     testdata.c[0]=0x80 ;

     if (testdata.highbit==1)
       printf ("Bitfields run left to right 1 -> 8 \n") ;
     else
       printf ("Bitfields run right to left 8 <- 1 \n") ;
 }

Typical output from this program is as follows on an IBM PC :

 Size of short int = 2
 Size of long int = 4
 Size of char = 1
 Filling union of 4 bytes with 4-3-2-1
 Long Integer in complete union is 01020304 hex
 Short Integer in first two bytes is 0304 hex
 Char in first byte is 04 hex
 Therefore this machine is little-endian Intel (II) type
 Bitfields run right to left 8 <- 1

Our code ought to run perfectly well on big-endian Motorola machines,
but if the sizes of char and int are different from the output above
then you have problems.

If the order of the bitfield runs from left to right, then the
definitions of the fields in FRAMES.H will be incorrect, and the
header file should be amended to reverse the order of the bitfields
in each octet.  Using the example union given above, the bitfields
should be reordered as follows :

 union left_to_right
 {
    unsigned char byte ;
    struct
    {
        unsigned bit8: 1 ;
        unsigned bit7: 1 ;
        unsigned bit6: 1 ;
        unsigned bit5: 1 ;
        unsigned bit4: 1 ;
        unsigned bit3: 1 ;
        unsigned bit2: 1 ;
        unsigned bit1: 1 ;
    } bits ;
 } ;

9.  Receiving a fax with a class 1 modem - FAXIN1.C

Receiving a fax with a class 1 modem is quite straightforward if you
approach it in the right way.  The code presented here is based on
that found in FAXIN1.C in the subdirectory.  Since we want the code to
work with any modem, we begin with the fairly straightforward command
AT+FRM=? which tells us its exact capabilities for receiving fax
data.  We need this to compose our DIS frame, which in turn tells the
caller the maximum speed that can be used to send us a fax.

Possibly the easiest part of the receive procedure is waiting for the
RING prompt from the modem and issuing an ATA (or just issuing an ATA
if the modem is being used to receive a fax once the telephone has
already been answered).  If the call is from another fax, you'll see
the CONNECT response, at which point the initial negotiating frames
should be sent out; usually these consist of the CSI identity frame
and the DIS capabilities frame.

The code for sending the CSI and DIS frames is typical of the way our
code sends frames, so we'll take this opportunity of showing how we
use sendframe to do this in some detail .  We pass the frame data in
a character array (appropriately enough called frame), and make use
of the definitions in FRAMES.H.  As well as the already-inverted bit
patterns for the FCFs of all the possible T.30 frames, this header
file also contains definitions of both possible control octets;
CTLNXT is used when a frame is followed by another, and CTLLST is
used when a final frame is sent.

Where our frames require FIFs to be included, these are best prepared
beforehand.   You will recall that our session structure includes a
20- character string member with our local ID.  Sending a CSI frame
is accomplished with the following code :

 frame[0]=22 ;
 frame[1]=CTLNXT ;
 frame[2]=CSI ;
 x=3 ;
 for (i=20 ; i>strlen(session->id) ; i--) frame[x++]=0x20 ;
 for (i=strlen(session->id) ; i ; ) frame[x++]=session->id[--i] ;
 if (sendframe (session,frame)!=1) return ;

We begin by placing an integer 22 in frame[0] as we are sending a
control octet, an FCF, and a FIF of 20 characters.  As the frame is
to be followed by a DIS, the control octet is set to CTLNXT. Placing
the definition from FRAMES.H for the CSI facsimile control field
follows.

Finally, we insert the ID sequence in the frame backwards.  The first
loop in the above fragment inserts spaces for padding in case the ID
string is shorter than 20 character, while the second loop inserts
the contents of the ID string.  Note that we use the prefix decrement
operator [--i] to index the character array, as we don't want to send
the final null byte of the ID string.

Once the frame data is complete, we pass the its address along with
the ubiquitous session structure to sendframe, which ought to return
a 1 for all successfully sent frames.  If the frame was a final one,
the modem will have responded OK, while for an intermediate frame the
response is another CONNECT.

Sending the DIS frame requires a little more preparation.  A fax
parameter structure faxparms is prepared which consists of the sort of
union of bytes and bitfields we have been discussing.  This structure is
carefully initialized according to our requirements in accordance with
the T.30 recommendations.  A basic 3-octet DIS is initialized and sent in
a frame as follows;

 union faxparms dis ;
 for (i=0 ; i<3 ; i++) dis.byte[i]=0
 dis.byte[1]=0x0c ;/* 9600 7200 V.29 and 4800 2400 V.27 ter */
 dis.bit.b09=0 ; /* no T4 transmit (polling) capability */
 dis.bit.b10=1 ; /* T4 receive capability */
 dis.bit.b15=1 ; /* we can receive fine resolution */
 dis.bit.b20=1 ; /* unlimited length */
 dis.bit.b21=1 ; /* minimum scan lime time 0 ms */
 dis.bit.b22=1 ; /* minimum scan lime time 0 ms */
 dis.bit.b23=1 ; /* minimum scan lime time 0 ms */
 dis.bit.b16=1 ; /* we can receive 2-D faxes */
 frame[0]=5 ;
 frame[1]=CTLLST ;
 frame[2]=DIS ;
 x=3 ;
 for (i=0 ; i<3 ; i++) frame[x++]=dis.byte[i] ;
 if (sendframe (defs,frame)!=1) return ;

The code in the subdirectory is rather more complex, as it first
interrogates the modem to find out what its speed capabilities are,
and also looks up the various T.30 options (fine resolution,
two-dimensional coding and ECM) in the session structure and used
those to initialize the DIS bitfields.

Once the CSI and DIS frames have been transmitted, the code drops
into one big continuous loop, which begins with issuing an AT+FRH
command to the modem and looking at the command in the frame that
comes in.

Remember that the getframe function takes care of resending the last
frame if there is a timeout.

The body of the faxin1 loop is a large switch statement, with each
case consisting of one of the possible frames that could come in. The
action we take for some of these frames is often up to us.  For
instance, a TSI frame with a transmitter ID could easily be ignored;
our code simply displays the ID on the screen, while a complex fax
management system would presumably write the caller ID to a log file.

The most apparently complex response to a frame is to a DCS.  The
code for this case is more than twice as long as the code for all the
other cases in the switch statement put together.  It is actually
quite straightforward though, and falls neatly into four parts as
follows :

i.  The DCS frame is examined, and the various options for
the resolution and coding scheme proposed for the incoming fax are
used to set up a possible TIFF IFD.

ii.  The proposed speed and modulation scheme is extracted from the
DCS, and we issue an AT+FRM command with the appropriate parameter
and await a CONNECT response (using framestat).

iii.  We then look for a successful training sequence ; out of the
1.5 seconds of 0 octets sent by the transmitter, we pragmatically
decide that if we receive .75 seconds worth, then the line is OK.

If we don't received .75 seconds of nulls, then we send an FTT failure
to train frame and wait for the next frame (which will presumably be
another DCS - note that if we don't receive anything the FTT will be
sent again by the getframe at the top of the main loop.

Note that if we broke out of the AT+FRM for any other reason than a
<dle><etx> arriving, we need to wait for the command to end before
issuing a suitable response frame. Those who say the class 1
specification is more standard than the class 2 one haven't  tried to
wrestle with this problem. One method of doing this is to the AT+FRS
command to wait for silence, but this is unreliable on many modems.
My preferred method is to force the +FRM to end by issuing a CAN
character.

If this proves unreliable on any particular modem a third method is
simply to wait for the <dle><etx> to come in.  A three-second timeout
in case of modem error is a good idea in such circumstances.

        if ((k!=etx)||(lastchar!=dle))
            {
            t=(clock()+(CLK_TCK*3)) ;
                {
                lastchar=k ;
                while (rxstat()==0) if (t<clock()) break ;
                k=rxchar() ;
                if ((k==etx)&&(lastchar==dle)) break ;
                }
            }

iv.  A successful train is followed by the sending of a CFR
confirmation frame, and a call to getpage to receive the first page
of the fax.  After the getpage call, we return to the start of the
loop for another getframe, which this time is for the phase D
post-page command.

By comparison with the DCS, the code for responding to other frames
is easier.  The reason why the faxin1 function is so straightforward
an implementation of the T.30 protocol is because receiving a fax is
essentially a passive process.

10.  Sending a fax with a class 1 modem - FAXOUT1.C

There are many similarities between the code for sending and
receiving faxes with class 1 modems.  This is only to be expected, as
the underlying functions in FAXCLAS1.C for sending and receiving
frames and data are identical.  The differences mostly lie in the
sequencing and the interpretation.  Like FAXIN1.C, FAXOUT1.C starts
off by interrogating the modem to find out what modulations and
speeds it can use for transmitting; this requires an AT+FTM=?
command.  We store the highest possible bit rate compatible with both
the modem and the preferred faxing speed (passed in the
session->brate) in session->bps.

The code for dialling via ATD and waiting for a CONNECT response is
exactly the same as we would use for making a straightforward data
call.

However, immediately the connection is made we have to be prepared to
receive the initial identification framed from the answering fax,
which required an immediate call to getframe.  This is a unique
situation.

Normally, we only call this function after we have issued an AT+FRH
command, and getframe starts by waiting for the primary response of
CONNECT, followed by the frame data,  and then end by receiving the
secondary response of OK for a good frame.

However, the AT+FRH command is implied by a successful dial, and we
have already received the CONNECT response.  In order to avoid
writing a special function to receive the first frame after dialling,
we identify the situation by initialising the character buffer used
to receive frame with the text string CALL JUST ANSWERED; when
getframe detects this, it skips the wait for the initial CONNECT
response and goes straight to receive frame data.

As with the code for receiving faxes, the rest of the code in
FAXOUT1.C is contained within a large continuous loop which starts by
receiving a frame, and then consists of switch (frame[2]) statement
with one case for each type of frame we might receive.

Cases such as the DIS frame handler look complicated because they need
to do a lot of analysis of the data.  For example, the protocol
requires that we match our own capabilities to that of the receiver.
This requires picking the relevant bits out of the frame, and
occasionally requires an additional switch statement for complex
fields, as in the following fragment.

 switch (dis.byte[1]&0x3c)
 {
     case 0x08 : receive_speed=4800 ; break ;
     case 0x0c : receive_speed=9600 ; break ;
     case 0x2c : receive_speed=14400 ; break ;
     default :   receive_speed=2400 ;
 if (receive_speed<session->bps) session->bps=receive_speed ;

All this is doing is isolating the bits carrying the receiver speed
and resetting our speed to match if we can send a fax faster than it
can be received.  This sort of code is typical of the way fax
negotiations are handled.  By contrast, the case statement that
actually sends the fax is the one handling the CFR frame which
confirms a successful train and is followed by phase C.  The
complexity is hidden in a simple call to sendpage.

One feature worth noting about the program flow is that the sequence
which sends the TSI and DCS and follows this up with a training check
is triggered not by the receipt of a DIS from the receiver, but by
receipt of final frame which isn't handled in any other way.  We test
the P/F bit in the control field which can be isolated via
frame[1]&0x10.  This enables us to retrain after FTT failures or
retrain requests.

11.  Receiving a fax using ECM - FAXECMRX.C

The modifications to FAXIN1.C to add error checking are quite
straightforward.

i.  We need to include a check that the fax modem can receive
high-speed HDLC frames, and if so at what speed.  This requires a
simple AT+FRH=?  test with an analysis of the result.

ii.  Provided that the modem is suitable,  we need to set bit 27 of
the DIS frame to let the transmitter know that we can handles ECM.

iii.  Finally, we check bit 27 of the DCS frame after a successful
train and call getecm (with the correct frame size) instead of
getpage if error checking has actually been negotiated.

The getecm function itself is a fairly straightforward implementation
of the error checking protocol.  The main body of the function
consists of a loop which send out the correct AT+FRH=<mod> command
and analyses the resulting frames.  We leave the modem to check the
integrity of the frame and report with either an OK or an ERROR.
Frames with errors are ignored, which leaves only two cases that need
considering.

The simpler is the RCP frame, which requires that we drop back to
negotiating speed to await the PPS frame.

The other possibility is an FCD frame.  There are up to 256 data
frames, with a possible total size of 64K bytes. The ECM data could
be buffered in memory, but the fragment presented here writes the
data in each frame to disk as it is received.  Modern disks are fast
enough to cope with this (and RAM disks can always be used if
preferred). The data in each frame isn't written sequentially, but at
a file offset calculated using the frame number (included in the FCD)
and the block number (which is initialized to 0 and incremented for
each block).  Finally, a bit map of the 256 possible frames in each
block is updated.  Note that since the last frame need not be full
size, the number of data bytes in each frame is variable, and is
worked out from subtracted the overhead in each frame from the total
number of bytes received.

 block_size=(bytes_received-6) ;
 frame_no=frame[3] ;
 image_offset=((block_size*frame_no)+(block_size*256L*block_no)) ; fseek
 (faxfile,image_offset,0) ;
 fwrite (frame+4,1,lastblock_size,faxfile) ;
 frame_map[frame_no/8]&=(~(0x01<<(frame_no%8))) ;

The bit map of all the frames is again checked when the PPS frame (of
any type) arrives, which contains the total number of data frames. If
the number of frames sent matches the number of frames marked as
good, then we send an MCF frame.

 for (i=0 ; i<frame[6] ; i++)
 if (frame_map[i/8]&(0x01<<(i%8))) break ;
 if (i==(frame[6]+1)) send_mcf ;

If one or more of the frames is marked as bad, then we send the
entire bit map as part of the PPR response, and await either a CTC or
an EOR response from the transmitter.

12.  Sending a fax using ECM - FAXECMTX.C

The modifications to FAXOUT1.C to add error checking complement those
made to the receiving module.

i.  We need to include a check that the fax modem can transmit
high-speed HDLC frames, and if so at what speed.  This requires a
simple AT+FTH=?  test with an analysis of the result.

ii.  Provided that the modem is suitable,  we match bit 27 of a
received DIS frame in bit 27 of our DCS frame.  If both are these are
set, then we are able to use ECM.  We can choose what block size to
use at our discretion.

iii.  Once the CFR frame confirms a successful train, we call sendecm
instead of sendpage.

The sendecm function itself is also rather similar in concept to that
for receiving an ECM frame.

We could start by sending a block as a series of sequential frames
and then do a selective send for those frames that were marked as bad
in the PPR response.  However, the fragment presented here uses the
same routine for both cases, and initialises the frame map to all bad
(all bits set to 1) in order to ensure that all frames are sent at
the first attempt.  We have a straightforward loop for each bad bit
in the map.  While the fseek at the start of the frame is strictly
speaking redundant on the first attempt, it won't do any harm.

The fragment initialises block_no to 0 and increments it for each
block, and also needs to know both the length of an image in advance
and the offset of the first byte in the image.  This is no real
problem for TIFF files, as it is all stored in the IFD.

 for (frame_no=0 ; frame_no<256 ; frame_no++)
 {
        if ((frame_map[frame_no/8]&(0x01<<(frame_no%8)))==0) continue ;
        image_offset=((block_size*frame_no)+(block_size*256*block_no)) ;
        fseek (faxfile,image_start+image_offset,0) ;
        bytes_left=(image_length-image_offset) ;
        if (bytes_left>=block_size) i=block_size ;
        else i=bytes_left ;
        bytesleft-=i ;
        fread (frame_buffer,1,i,defs->faxfile) ;

Once the frame data is read, sending it is quite straightforward.
After the last frame in a block has been read, the progress of the
transmission presents no real problems either, with the only decision
that has to made being the strategy to adopt if four consecutive PPR
responses are received.

13.  Receiving a fax with class 2/2.0 modems - FAXIN2.C and FAXIN20.C

The two functions faxin2 and faxin20 are almost identical.  The setup
procedure is rather more complex than it ought to be owing to the need
to allow for modems than include comma-delimited parameter
enumerations in brackets such as (0,1).  Apart from this, the fax
receive procedure is simple.  We answer the phone with ATA, issue an
AT+FDR, and then accept phase C data in exactly the same way as the
getpage function, with the action taken after the end of the page
determined by the FET:  report.

The FDCS: report before phase C which gives the negotiated parameter
is inspected for information on the resolution and compression, which
is included in the TIFF IFD when the image is written.  Apart from
the rather convoluted mechanism required for counting scan lines
(which can be omitted if the image will be read by a TIFF reader
which doesn't need this datum) there should be nothing terribly
unusual about either of these functions.  Though more documentation
can be found in the source code, there is no real substitute for a
good knowledge of the EIA-592 specifications.

14.  Sending a fax with class 2/2.0 modems - FAXOUT2.C and FAXOUT20.C

The two functions faxout2 and faxout20 are also almost the same. Once
the telephone is answered and the CONNECT message arrives, the phase
C data is simply send as a byte stream with <dle> shielding.  The
AT+FET command after each image reflects whether another page is due
to be sent.  Once again, the comments in the source together with
EIA-592 should enable anyone to follow the code.

The main complication in the sending process centres around the fact
that the image is prepared off-line and therefore may not match
either the capabilities of the modem or the capabilities of the
remote fax.  The same image structure we have used before serves to
handle these checks.
