The gzip file (.gz) format consists of:
- a file header
- optional headers
  - extra fields
  - original file name
  - comment
  - header checksum
- a body, containing a DEFLATE-compressed payload
- a file footer
The gzip format stores multi-byte values in little-endian byte order.
|Date and time values||POSIX timestamp in UTC|
|Character string||ISO 8859-1 (LATIN-1)|
The file header is 10 bytes in size and contains:
|0||2||0x1f 0x8b||Signature (or identification bytes 1 and 2)|
|2||1|| Compression method |
|3||1|| Flags |
|4||4|| Last modification time |
Contains a POSIX timestamp.
|8||1|| Extra flags |
|9||1|| Operating system |
Value that indicates on which operating system the gzip file was created.
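As an illustration, the fixed header can be unpacked with Python's `struct` module. This is a minimal sketch, not a reference implementation: the helper name `parse_gzip_header` and the dictionary keys are hypothetical, and the positions of the compression method (offset 2), flags (offset 3) and extra flags (offset 8) are taken from RFC 1952.

```python
import gzip
import struct


def parse_gzip_header(data: bytes) -> dict:
    """Unpack the fixed 10-byte gzip file header (hypothetical helper)."""
    if len(data) < 10:
        raise ValueError("a gzip file header is 10 bytes in size")
    # "<" selects little-endian without padding:
    # 2s = signature, B = 1-byte value, I = 4-byte unsigned integer.
    signature, method, flags, mtime, extra_flags, os_id = struct.unpack(
        "<2sBBIBB", data[:10]
    )
    if signature != b"\x1f\x8b":
        raise ValueError("missing gzip signature 0x1f 0x8b")
    return {
        "method": method,            # 8 = deflate
        "flags": flags,              # FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT bits
        "mtime": mtime,              # POSIX timestamp
        "extra_flags": extra_flags,
        "os": os_id,
    }


# Build a small gzip stream in memory and inspect its header.
sample = gzip.compress(b"hello", mtime=0)
header = parse_gzip_header(sample)
print(header)
```

The operating system value in the output depends on the library that produced the stream, so it should not be relied on for anything beyond a hint.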
|0 - 7||Reserved|
|8||"deflate"||DEFLATE compressed data|
|0x01||FTEXT|| If set, the uncompressed data should be treated as text instead of binary data. |
This flag hints at end-of-line conversion for cross-platform text files but does not enforce it.
|0x02||FHCRC||The file contains a header checksum (CRC-16)|
|0x04||FEXTRA||The file contains extra fields|
|0x08||FNAME||The file contains an original file name string|
|0x10||FCOMMENT||The file contains a comment|
Note: The FHCRC bit was never set by versions of gzip up to 1.2.4, even though it was documented with a different meaning in gzip 1.2.4.
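The flag values listed above are bit masks, so several can be set at once. A minimal sketch of decoding them in Python follows; the names `FLAG_NAMES` and `decode_flags` are hypothetical and chosen only for this example.

```python
# Bit masks from the flags table.
FLAG_NAMES = {
    0x01: "FTEXT",
    0x02: "FHCRC",
    0x04: "FEXTRA",
    0x08: "FNAME",
    0x10: "FCOMMENT",
}


def decode_flags(flags: int) -> list:
    """Return the names of all flag bits set in the flags byte."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


# 0x0a = 0x02 | 0x08: header checksum and original file name present.
print(decode_flags(0x0A))
```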
If the compression method is 8, the following extra flags can be defined:
|0x02||compressor used maximum compression, slowest algorithm|
|0x04||compressor used fastest algorithm|
|0||FAT filesystem (MS-DOS, OS/2, NT/Win32)|
|2||VMS (or OpenVMS)|
|6||HPFS filesystem (OS/2, NT)|
|11||NTFS filesystem (NT)|
TODO: add description
The extra fields are variable in size and contain:
|0||2|| Extra field data size |
Value in bytes.
|2||...||Extra field data|
Original file name
This is the original name of the file being compressed, with any directory components removed, and, if the file being compressed is on a file system with case insensitive names, forced to lower case.
Contains an ISO 8859-1 (LATIN-1) string terminated by an end-of-string character (0x00).
The comment contains an ISO 8859-1 (LATIN-1) string terminated by an end-of-string character (0x00). Line breaks should be denoted by a single line feed character.
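The optional headers can be read by testing each flag in turn. The sketch below assumes the order defined in RFC 1952 (extra fields, original file name, comment, header checksum), which this text does not state explicitly, and that the optional headers start directly after the 10-byte file header. The function names and the hand-built `sample` bytes are hypothetical.

```python
import struct

FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def read_cstring(data: bytes, offset: int):
    """Read an ISO 8859-1 string up to the end-of-string character."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("latin-1"), end + 1


def parse_optional_fields(data: bytes, flags: int, offset: int = 10):
    """Return the optional header values and the offset where the body starts."""
    fields = {}
    if flags & FEXTRA:
        # 2-byte little-endian size followed by that many bytes of data.
        (size,) = struct.unpack_from("<H", data, offset)
        offset += 2
        fields["extra"] = data[offset:offset + size]
        offset += size
    if flags & FNAME:
        fields["name"], offset = read_cstring(data, offset)
    if flags & FCOMMENT:
        fields["comment"], offset = read_cstring(data, offset)
    if flags & FHCRC:
        (fields["header_crc16"],) = struct.unpack_from("<H", data, offset)
        offset += 2
    return fields, offset


# Hand-built header with FEXTRA | FNAME | FCOMMENT set (flags = 0x1c).
sample = (
    b"\x1f\x8b\x08\x1c" + b"\x00" * 4 + b"\x00\xff"  # 10-byte file header
    + b"\x04\x00" + b"AB\x00\x00"                     # extra field: size 4, then data
    + b"notes.txt\x00"                                # original file name
    + b"line one\nline two\x00"                       # comment with a line feed
)
fields, body_offset = parse_optional_fields(sample, flags=sample[3])
print(fields, body_offset)
```

Returning the end offset alongside the parsed values lets a caller continue straight into the DEFLATE-compressed body.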
The header checksum contains a CRC-16 that consists of the two least significant bytes of the CRC-32 of all bytes of the gzip header up to, but not including, the CRC-16.
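Because the CRC-16 is just the low 16 bits of a CRC-32, it can be computed with the standard `zlib.crc32` function. The following is a sketch under the assumption that the caller already knows the offset of the stored CRC-16; `header_crc16` and `verify_header_crc16` are hypothetical helper names.

```python
import struct
import zlib


def header_crc16(header_bytes: bytes) -> int:
    """Two least significant bytes of the CRC-32 of the header bytes."""
    return zlib.crc32(header_bytes) & 0xFFFF


def verify_header_crc16(data: bytes, crc_offset: int) -> bool:
    """Compare the stored CRC-16 with one computed over data[:crc_offset]."""
    (stored,) = struct.unpack_from("<H", data, crc_offset)
    return stored == header_crc16(data[:crc_offset])


# A 10-byte file header with only FHCRC (0x02) set, followed by its CRC-16.
header = b"\x1f\x8b\x08\x02" + b"\x00" * 4 + b"\x00\xff"
sample = header + struct.pack("<H", header_crc16(header))
print(verify_header_crc16(sample, crc_offset=10))
```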
The file footer is 8 bytes in size and contains:
|0||4|| Checksum |
Contains a CRC-32 of the uncompressed data.
|4||4|| Uncompressed data size |
Value in bytes, modulo 2^32.
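The footer can be checked against the decompressed payload. This sketch assumes a gzip file with a single compressed stream and relies on RFC 1952 for two details: the first four footer bytes hold a CRC-32 of the uncompressed data, and the size is stored modulo 2^32. The name `check_footer` is hypothetical.

```python
import gzip
import struct
import zlib


def check_footer(gz_bytes: bytes) -> bool:
    """Validate the 8-byte footer of a single-stream gzip file."""
    # Two little-endian 4-byte unsigned integers: CRC-32, then size.
    stored_crc32, stored_size = struct.unpack("<II", gz_bytes[-8:])
    payload = gzip.decompress(gz_bytes)
    return (
        stored_crc32 == zlib.crc32(payload)
        and stored_size == (len(payload) & 0xFFFFFFFF)
    )


sample = gzip.compress(b"hello world")
print(check_footer(sample))
```

Since the size field wraps at 4 GiB, it cannot be trusted as the true size of very large files.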