
Z, gz, ZIP: A bit of a compression eyechart


As I browsed online analyses of a compromised compression library, my mind drifted back to the early days of free UNIX and open source software, particularly the "birth" of GNU Z, the LZW algorithm, and the refreshing alternative "zip."

My first UNIX experience was on a Unisys AT&T System VR2, then later VR3, which among its other built-in software had a compression program aptly named "compress". It took files, typically text, and shrank them to a tenth of the usual storage space. You may still (or "again", depending on your timeline) find it inside a range of operating systems. Here is part of its man page on NetBSD:


     The compress utility exits 0 on success, and >0 if an error occurs.

     Welch, Terry A., "A Technique for High Performance Data Compression",
     IEEE Computer, 17:6, pp. 8-19, June, 1984.
     The compress command appeared in 4.3BSD.

NetBSD 10.0                    January 23, 2003                    NetBSD 10.0


Terry Welch is the "W" in the LZW algorithm and worked at Sperry (one of the companies that merged to form Unisys). Because of law-stuff you can research, distributing this handy utility became problematic, so the GNU open source developers created an alternative tool that (a) could uncompress ".Z" files, (b) ran faster, (c) compressed more, and (d) was copylefted. Named gzip instead of compress, it liberated us from law-stuff, mostly.
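
For readers who have never looked inside LZW, here is a minimal textbook sketch in Python. It is not the compress(1) implementation and does not read or write ".Z" files: a real ".Z" file packs the codes into a bit stream and adds a header, both of which are left out here. The function names are my own, chosen for illustration.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit one integer code per longest already-seen string."""
    # Start with a table holding every single-byte string.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the table knows the string.
            current = candidate
        else:
            # Emit the longest match, then learn the new, longer string.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same table on the fly and expand each code."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the string being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


if __name__ == "__main__":
    sample = b"TOBEORNOTTOBEORTOBEORNOT"
    codes = lzw_compress(sample)
    print(len(sample), "bytes in,", len(codes), "codes out")
    print(lzw_decompress(codes) == sample)
```

The decompressor never receives the table; it reconstructs it from the codes alone, which is the neat trick at the heart of the algorithm.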


     The gzip utility exits 0 on success, 1 on errors, and 2 if a warning
     gzip responds to the following signals:
             Report progress to standard error.
     bzip2(1), compress(1), xz(1), fts(3), zlib(3)
     The gzip program was originally written by Jean-loup Gailly, licensed
     under the GNU Public Licence.  Matthew R. Green wrote a simple front end
     for NetBSD 1.3 distribution media, based on the freely re-distributable
     zlib library.  It was enhanced to be mostly feature-compatible with the
     original GNU gzip program for NetBSD 2.0.
     This manual documents NetBSD gzip version 20170803.
     This implementation of gzip was written by Matthew R. Green


This man page also mentions "zlib", a handy base library for compression utilities. For reference, it lists these specifications:


     RFC 1950      ZLIB Compressed Data Format Specification.
     RFC 1951      DEFLATE Compressed Data Format Specification.
     RFC 1952      GZIP File Format Specification.
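
To make the difference between those three specifications concrete, here is a small sketch using Python's standard zlib module, which wraps the zlib library. The same DEFLATE compression is used each time; only the wrapper around it changes, selected through the wbits parameter. The payload and the helper name deflate are my own, for illustration only.

```python
import gzip
import zlib

payload = b"Z, gz, ZIP: a bit of a compression eyechart. " * 20


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data with DEFLATE; wbits selects the container format."""
    compressor = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return compressor.compress(data) + compressor.flush()


zlib_stream = deflate(payload, zlib.MAX_WBITS)       # RFC 1950: zlib wrapper
raw_deflate = deflate(payload, -zlib.MAX_WBITS)      # RFC 1951: bare DEFLATE
gzip_stream = deflate(payload, zlib.MAX_WBITS | 16)  # RFC 1952: gzip wrapper

for name, blob in [("zlib", zlib_stream), ("raw", raw_deflate), ("gzip", gzip_stream)]:
    print(f"{name:5} {len(blob):4} bytes, starts with {blob[:2].hex()}")

# Each stream decodes back to the same payload when read with the matching format.
assert zlib.decompress(zlib_stream) == payload
assert zlib.decompress(raw_deflate, -zlib.MAX_WBITS) == payload
assert gzip.decompress(gzip_stream) == payload
```

The gzip variant starts with the bytes 1f 8b, the same magic number found at the start of a ".gz" file.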


Once system administrators got wind of "gz" versus "Z", some of us went with the former; others, who had to report to stricter management, could not. I remember hacking away at a Prime system that we got to supplant the Unisys UNIX host, with a bit of BSD and a bit of AT&T. Usually the first step was getting the GCC compiler to work, as the stock K&R tools were not sufficient. Fixing compile-time errors by removing or editing code got me the hairy eyeball from a peer, but I got gzip built and on the plan.

Among the people who stood out (see, e.g., a writeup on RFC 1950) were Jean-loup Gailly and Mark Adler, the latter tagged as "original Zip author; UnZip decompression". One of the cool facts was the overlap of space exploration with algorithm development. Mark Adler had a day job at NASA "driving Rovers" and was hence well positioned to leverage that research with compression code to get pictures from Mars as fast as the speed of light would allow.

I'll also mention Greg Roelofs, who helped me when I attempted to fix something in the legacy UNIX X program xpaint. The result of that experience is still visible: when he reviewed the XPaint code base, he found a terrible design flaw that mangled color palettes when editing images. You may still see that warning today (as I do):


XPaint uses the native display format for storing image info while editing; 
the original image information is thrown away.  This means that, in general, 
color information is irretrievably lost when using any display depth less 
than 24 bits. 
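
As a rough illustration of why that loss is irreversible, here is a sketch that squeezes a 24-bit color into a 16-bit pixel and expands it again. The RGB565 layout (5 bits red, 6 green, 5 blue) is my assumption of a typical 16-bit display format; the warning does not say which formats XPaint actually used, and this is not XPaint code.

```python
def to_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit channels into 16 bits by dropping the low bits of each."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def from_rgb565(pixel: int) -> tuple[int, int, int]:
    """Expand back to 8 bits per channel; the dropped bits can only be guessed."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Replicate the high bits into the low bits, a common expansion trick.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


if __name__ == "__main__":
    original = (200, 100, 50)
    neighbor = (201, 101, 51)
    print("original :", original)
    print("restored :", from_rgb565(to_rgb565(*original)))
    # Two different 24-bit colors can land on the same 16-bit pixel.
    print("collapsed:", to_rgb565(*original) == to_rgb565(*neighbor))
```

Once the original 24-bit values are thrown away, no amount of cleverness on the way back can tell those two colors apart.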


So I like to say I didn't create this bug or squash it; I was just lucky to work with someone who spotted the design flaw when I asked about another issue.



 - - - -
|   Z    |
|   gz   |
|  ZIP   |
- - - - -