
(1)  Short note on the history of the ITU fax standards

Originally, recommendation T.4 covered all aspects of fax
communications, but those aspects dealing with transmission of a fax
were hived off to recommendation T.30, leaving T.4 to deal primarily
with the generation and coding of fax images.  The first of a series
of downwardly compatible recommendations was accepted at the CCITT
plenary meeting held at Geneva in 1980, and was revised at Malaga in
1984, at Melbourne in 1988, and most recently by ITU-T Study Group 8
in 1993. The most important of the recommendations discussed in this
chapter have remained constant since 1984.  They predate the reform
of the CCITT and its transmogrification into the ITU-T, and are
therefore often referred to as CCITT standards, though I've opted for
referring to the originators of the standard as the ITU.  The
recommendation mandates the availability of modulation scheme
V.27 ter (4800 bps with a 2400 bps fallback).  It includes an option
to use V.29 modulation instead (9600 bps with a 4800 bps fallback).
While 9600 bps is the most common fax speed used today, it was not
seen as completely practical for dial-up operation in 1980, when
telephone technology was far less advanced than it is now, and V.29
speeds were only really considered suitable for leased lines.  The
ITU also specifies the power levels to be used in fax machines, which
is of little interest here.  This chapter is
only concerned with the details of how a fax is generated from an
original document.  

(2)  Short note on building pictures out of dots

The concept of pictures being built up from a series of dots on a
contrasting background (usually black on white) is familiar to
non-computer users from looking at either comic art or grainy
photographs in newspapers.  It has been exploited by various artists
from the early impressionists, who built up their pictures from blobs
of coloured paint, to modern artists such as Roy Lichtenstein, who
overtly used dots of colour to give a comic strip atmosphere to their
work.

(3)  Short note on pixels, dots and pels

In line with the tendency of English versions of international
standards to be written in a rather idiosyncratic form, the ITU
referred to a digitized bit not as a dot or pixel, but as a picture
element. Although the word pixel was already available as a
contraction of picture element, they decided to use the term pel
instead.

(4)  Short note on the dimensions of an A4 page

The A4 size 210 mm by 297 mm corresponds to a page 8.27" wide by
11.69" long.  You should note that as an A4 page is only 210 mm wide,
it spans only 1688 dots horizontally at the standard resolution of
1728 dots per 215 mm.  It is quite a common error to assume that
there ought to be 1728 dots across a page without looking carefully
at what the width actually is.  My first
ever brush with the ITU fax standards was when I was called in to
help out a computer fax developer whose software was about to
fail an approval test set by one of the European PTTs.  The problem
was that faxes sent by their system were full of reproduction errors
such as circles on the originals being stretched out to ellipses at
the destination, and horizontal scales on faxed maps becoming
inaccurate for estimating vertical distances. The cause turned out to
be that they had encoded all their A4 lines to the full 1728 pels
while keeping the proper 3.85 lines/mm vertical resolution.  The
resulting fax file was perfectly legal and was still sent, but the
margin of error on the resulting images turned out to be rather more
than was permissible.
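
The 1688 figure is easy to check with a little arithmetic.  The
fragment below is only an illustrative sketch and is not taken from
the programs in this book: the helper name pels_for_width is made up,
and it assumes the standard horizontal resolution of 1728 pels per
215 mm scan line, rounding to the nearest whole pel.

```c
#include <assert.h>

/* Standard T.4 horizontal resolution: 1728 pels across 215 mm. */
#define SCAN_LINE_PELS 1728
#define SCAN_LINE_MM   215.0

/* Hypothetical helper: nearest whole number of pels in width_mm. */
static int pels_for_width(double width_mm)
{
    assert(width_mm >= 0.0);
    return (int)(width_mm * SCAN_LINE_PELS / SCAN_LINE_MM + 0.5);
}
```

Calling pels_for_width(210.0) gives 1688 for an A4 page, while
pels_for_width(215.0) gives the full 1728.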

(5)  Short note on the use of a 20 x 16 font

There is some official support for this fax character size, as the
T.4 recommendation states that a 20 x 16 font should be used in the
optional mixed mode (E.8/T.4).  Note also that in working out the
ideal number of characters per line we need to allow for the fact
that the specification states that only the middle 196.6 mm of an A4
page are guaranteed reproducible. Once we do this, we are left with a
middle of 1580 dots, and two margins of 74 dots each.  As each
character is 20 dots wide we allow only 79 characters on a line
rather than 82. And since the guaranteed reproducible length of a fax
is only 281.46 mm, we can similarly rely on only 67 lines per page.
In practice this is less of a restriction, as while all fax machines
have a fixed width, most can accept images of an indefinite length.
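
The figures above can be reproduced with the following sketch.  It is
an illustration only, not code from the disk: all the function names
are made up, and it assumes a character cell 20 pels wide by 16 scan
lines high, a 1728 pel line covering 215 mm, and the normal vertical
resolution of 3.85 lines/mm.  The 16 line cell height is inferred
from the 67 lines per page figure rather than stated in the text.

```c
#include <assert.h>

#define SCAN_LINE_PELS 1728   /* pels per standard scan line       */
#define SCAN_LINE_MM   215.0  /* width covered by those pels       */
#define LINES_PER_MM   3.85   /* normal vertical resolution        */
#define CHAR_WIDTH     20     /* character cell width in pels      */
#define CHAR_HEIGHT    16     /* assumed cell height in scan lines */

/* Nearest whole number of pels in a horizontal distance. */
static int mm_to_pels(double mm)
{
    assert(mm >= 0.0);
    return (int)(mm * SCAN_LINE_PELS / SCAN_LINE_MM + 0.5);
}

/* Whole characters that fit in the guaranteed reproducible width. */
static int chars_per_line(double usable_mm)
{
    return mm_to_pels(usable_mm) / CHAR_WIDTH;
}

/* Pels left over on each side of a full 1728 pel line. */
static int margin_pels(double usable_mm)
{
    return (SCAN_LINE_PELS - mm_to_pels(usable_mm)) / 2;
}

/* Whole text lines that fit in the guaranteed reproducible length. */
static int text_lines_per_page(double usable_mm)
{
    assert(usable_mm >= 0.0);
    return (int)(usable_mm * LINES_PER_MM) / CHAR_HEIGHT;
}
```

With these assumptions chars_per_line(196.6) is 79,
margin_pels(196.6) is 74, and text_lines_per_page(281.46) is 67.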

(6)  Short note on the fonts used in the programs

It would be nice to add a modification at some point to make the line
and block characters from 176 to 223 join up, but as they aren't
standard ASCII it's something of a luxury. This is left as a
programming exercise for the reader.  Regrettably, even if this
is done, the font still isn't marvellous.  However, it is no worse
than the so-called "Ugly" font used by the Ghostscript PostScript
interpreter, and does reproduce quite well.  Although the
digitization code in the text ignores non-printable characters, the
code on the disk does expand tabs.

(7)  Short note on what to do if you don't understand Huffman codes

If you didn't follow the explanation in this chapter, then don't
worry; the rules can be followed automatically without needing to
understand the theory.  Regrettably, no matter what language is used,
the essential step in getting at the individual bits in a character
array is less than straightforward on any CPU designed to handle
information in 8-bit bytes, and generally requires some familiarity
with Boolean algebra.  Knowledge of the logical (bitwise)
consequences of arithmetic operations is also useful.
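
As a rough illustration of the kind of bit manipulation involved, the
sketch below reads and writes a single bit in a character array.  It
is not the code used in the chapter: the names get_bit and put_bit
are made up, and numbering the bits from the most significant bit of
each byte is an assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Return bit n of buf, where bit 0 is the top bit of buf[0]. */
static int get_bit(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    /* n >> 3 selects the byte, n & 7 the bit within that byte. */
    return (buf[n >> 3] >> (7 - (n & 7))) & 1;
}

/* Set bit n of buf to 1 if value is non-zero, otherwise clear it. */
static void put_bit(unsigned char *buf, size_t n, int value)
{
    unsigned char mask = (unsigned char)(0x80u >> (n & 7));

    assert(buf != NULL);
    if (value)
        buf[n >> 3] |= mask;                  /* OR switches the bit on   */
    else
        buf[n >> 3] &= (unsigned char)~mask;  /* AND with inverse clears  */
}
```

The shift by 3 is simply a division by 8, and the AND with 7 is the
remainder of that division, which is the sort of bitwise consequence
of arithmetic mentioned above.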

(8)  Short note on alternative ways of coding the fragments

It is possible to have one changecolor function which starts off with
a check to see what the current color is.  Alternatively, the
changecolor function could be defined as a function pointer to either
nextwhite or nextblack.  Note also that if we wanted to handle pages
with scan lines greater than the standard 1728 dots, we'd need to make
sure that our tables of make-up codes include the T.4 additions which
extend to 2560 dots.
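
The two alternatives might look something like the sketch below.  It
is only a guess at the shape of such code: the struct encoder type,
the function signatures, and the name changecolor_checked are all
invented for the example, and the bodies of nextwhite and nextblack
here are bare stand-ins that do nothing except record the new color.

```c
#include <assert.h>
#include <stddef.h>

enum color { WHITE, BLACK };

/* Hypothetical encoder state, invented for this example. */
struct encoder {
    enum color current;                     /* color of the current run */
    void (*changecolor)(struct encoder *);  /* alternative 2: a pointer */
};

static void nextblack(struct encoder *enc);

/* Stand-in: switch to a white run and arm the pointer for black. */
static void nextwhite(struct encoder *enc)
{
    assert(enc != NULL);
    enc->current = WHITE;
    enc->changecolor = nextblack;
}

/* Stand-in: switch to a black run and arm the pointer for white. */
static void nextblack(struct encoder *enc)
{
    assert(enc != NULL);
    enc->current = BLACK;
    enc->changecolor = nextwhite;
}

/* Alternative 1: a single function that checks the current color. */
static void changecolor_checked(struct encoder *enc)
{
    assert(enc != NULL);
    if (enc->current == WHITE)
        nextblack(enc);
    else
        nextwhite(enc);
}
```

After an initial call to nextwhite(&enc), each call to
enc.changecolor(&enc) flips the color without any test, at the cost
of an indirect call; changecolor_checked does the same job with an
explicit comparison.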

(9)  Short note quoting T.4 to summarize the chapter

The T.4 documentation gives us an unambiguous and reasonably
concise summary of the more important parts of the coding mechanism
that we have been discussing:

"A line of Data is composed of a series of variable length code words.
Each code word represents a run length of either all white or all
black. White runs and black runs alternate. A total of 1728 picture
elements represents one horizontal scan line of 215 mm length.

In order to ensure that the receiver maintains colour synchronization,
all Data lines will begin with a white run length code word. If the
actual scan line begins with a black run, a white run length of zero
will be sent. Black or white run lengths, up to a maximum length of
one scan line (1728 picture elements or pels) are defined by the code
words .... The code words are of two types : Terminating code words
and Make-up code words. Each run length is represented by either one
Terminating code word or one Make-up code word followed by a
Terminating code word.

Run lengths in the range of 0 to 63 pels are encoded with their
appropriate Terminating code word. Note that there is a different
list of code words for black and white run lengths. Run lengths in
the range of 64 to 1728 pels are encoded first by the Make-up code
word representing the run length which is equal to or shorter than
that required. This is then followed by the Terminating code word
representing the difference between the required run length and the
run length represented by the Make-up code.

End-of-line (EOL) : This code word follows each line of Data. It is a
unique code word that can never be found within a valid line of Data
: therefore, resynchronization after an error burst is possible. In
addition, this signal will occur prior to the first Data line of a
page .... the end of a document transmission is indicated by sending
six consecutive EOLs."

(Incidentally, anyone ordering the official version of T.4 from the
ITU directly should note that the table of make-up codes, which is
given in this chapter, is wrongly stated to be a table of terminating
codes in the 1994 printing of the standard.)
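
The rule quoted above for dividing a run between a make-up code word
and a terminating code word can be sketched as follows.  This is an
illustration rather than the chapter's own code: the struct run_codes
type and the split_run function are made-up names, and the sketch
only works out which run lengths to look up, leaving the actual code
word tables aside.

```c
#include <assert.h>

#define MAX_RUN 1728  /* longest run on a standard scan line */

/* Hypothetical result type: the two run lengths to look up. */
struct run_codes {
    int makeup;       /* multiple of 64, or 0 if no make-up code needed */
    int terminating;  /* always in the range 0 to 63                    */
};

/* Split a run of 0..1728 pels into make-up and terminating parts. */
static struct run_codes split_run(int run)
{
    struct run_codes rc;

    assert(run >= 0 && run <= MAX_RUN);
    rc.makeup = (run / 64) * 64;  /* largest multiple of 64 <= run */
    rc.terminating = run % 64;    /* the difference still to send  */
    return rc;
}
```

For example, a run of 100 pels splits into a make-up length of 64 and
a terminating length of 36, while a run of 63 or fewer needs only a
terminating code word.  A line that begins with a black run would
first send the white terminating code for a run of zero.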


