Difference between revisions of "Gzip"
Joachim Metz (Talk | contribs) |
Revision as of 07:01, 28 November 2013
File format
The gzip file (.gz) format consists of:
- a file header
- optional headers
  - extra fields
  - original file name
  - comment
  - header checksum
- a body, containing a DEFLATE-compressed payload
- an 8-byte footer, containing a CRC-32 checksum of the original uncompressed data and the length of that data (modulo 2^32).
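As a rough illustration of the footer layout, the following sketch reads the last 8 bytes of a single-member gzip stream and compares them with values computed from the uncompressed data. It assumes the two footer fields are stored little-endian, CRC-32 first, as specified in RFC 1952; the function name `check_footer` is hypothetical.

```python
import gzip
import struct
import zlib


def check_footer(gz_bytes, uncompressed):
    """Compare the 8-byte gzip footer with values computed from the data.

    Assumes a single-member gzip stream whose footer holds two
    little-endian 32-bit values: CRC-32, then uncompressed length.
    """
    if len(gz_bytes) < 8:
        raise ValueError("too short to contain a gzip footer")
    stored_crc32, stored_size = struct.unpack("<II", gz_bytes[-8:])
    # The stored length is only the low 32 bits of the real length.
    expected_size = len(uncompressed) % 2**32
    return stored_crc32 == zlib.crc32(uncompressed) and stored_size == expected_size


payload = b"example payload"
footer_ok = check_footer(gzip.compress(payload), payload)
```

Because the length is stored modulo 2^32, it cannot be relied on for inputs of 4 GiB or larger.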
File header
The file header is 10 bytes in size and contains:
Offset | Size | Value | Description |
---|---|---|---|
0 | 2 | 0x1f 0x8b | Signature (or identification byte 1 and 2) |
2 | 1 | | Compression method |
3 | 1 | | Flags |
4 | 4 | | Last modification time. Contains a POSIX timestamp. |
8 | 1 | | Extra flags |
9 | 1 | | Operating system. Value that indicates on which operating system the gzip file was created. |
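The table above can be read with a single `struct` format string. This is a minimal sketch, not a reference parser: the names `GzipHeader` and `parse_gzip_header` are hypothetical, and the little-endian byte order of the timestamp is taken from RFC 1952 rather than from the table.

```python
import struct
from collections import namedtuple

GzipHeader = namedtuple(
    "GzipHeader",
    "signature compression_method flags mtime extra_flags operating_system",
)


def parse_gzip_header(data):
    """Parse the fixed 10-byte gzip header at the start of `data`."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    # "<"  little-endian, no padding
    # "2s" 2-byte signature, "B" 1-byte field, "I" 4-byte unsigned timestamp
    header = GzipHeader(*struct.unpack("<2sBBIBB", data[:10]))
    if header.signature != b"\x1f\x8b":
        raise ValueError("signature mismatch, not a gzip file")
    return header


# Hand-built header: deflate, FNAME set, mtime 1, extra flags 2, OS 3 (Unix)
sample = b"\x1f\x8b\x08\x08\x01\x00\x00\x00\x02\x03"
header = parse_gzip_header(sample)
```

A timestamp of 0 conventionally means that no modification time is available.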
Compression method
Value | Identifier | Description |
---|---|---|
0 - 7 | Reserved | |
8 | "deflate" | DEFLATE compressed data (a raw DEFLATE stream, without a zlib header or trailer) |
Flags
Value | Identifier | Description |
---|---|---|
0x01 | FTEXT | If set, the uncompressed data should be treated as text instead of binary data. This flag hints that end-of-line conversion may be needed for cross-platform text files, but does not enforce it. |
0x02 | FHCRC | The file contains a header checksum (CRC16) |
0x04 | FEXTRA | The file contains extra fields |
0x08 | FNAME | The file contains an original file name string |
0x10 | FCOMMENT | The file contains a comment |
0x20 | Reserved | |
0x40 | Reserved | |
0x80 | Reserved |
Note: The FHCRC bit was never set by versions of gzip up to 1.2.4, even though it was documented with a different meaning in gzip 1.2.4.
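The flags byte is a bit mask, so several flags can be set at once. A small sketch of decoding it, using the identifiers from the table above (the helper names `decode_flags` and `reserved_bits` are hypothetical):

```python
# Flag bits and identifiers as listed in the flags table.
FLAG_NAMES = {
    0x01: "FTEXT",
    0x02: "FHCRC",
    0x04: "FEXTRA",
    0x08: "FNAME",
    0x10: "FCOMMENT",
}

RESERVED_MASK = 0x20 | 0x40 | 0x80


def decode_flags(value):
    """Return the identifiers of all known flag bits set in `value`."""
    return [name for bit, name in FLAG_NAMES.items() if value & bit]


def reserved_bits(value):
    """Return the reserved bits set in `value` (0 if none are set)."""
    return value & RESERVED_MASK


names = decode_flags(0x09)
```

A reader might treat a non-zero result from `reserved_bits` as a sign that the file uses a feature it does not understand.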
Extra flags
If compression method is 8 the following extra flags can be defined:
Value | Identifier | Description |
---|---|---|
0x02 | | Compressor used maximum compression, slowest algorithm |
0x04 | | Compressor used fastest algorithm |
Operating System
Value | Identifier | Description |
---|---|---|
0 | | FAT filesystem (MS-DOS, OS/2, NT/Win32) |
1 | | Amiga |
2 | | VMS (or OpenVMS) |
3 | | Unix |
4 | | VM/CMS |
5 | | Atari TOS |
6 | | HPFS filesystem (OS/2, NT) |
7 | | Macintosh |
8 | | Z-System |
9 | | CP/M |
10 | | TOPS-20 |
11 | | NTFS filesystem (NT) |
12 | | QDOS |
13 | | Acorn RISCOS |
255 | | Unknown |
Optional headers
Extra fields
TODO: add description
The extra field is variable in size and contains:
Offset | Size | Value | Description |
---|---|---|---|
0 | 2 | | Extra field data size. Value in bytes. |
2 | ... | | Extra field data |
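A minimal sketch of reading the extra field follows. It assumes the FEXTRA flag is set, that the field starts directly after the 10-byte file header, and that the size is stored little-endian as in RFC 1952; the function name `read_extra_field` is hypothetical.

```python
import struct


def read_extra_field(data, offset=10):
    """Return (extra_field_data, offset_of_next_header_part).

    `offset` is where the 2-byte size value starts; 10 assumes the
    extra field directly follows the fixed file header.
    """
    if offset + 2 > len(data):
        raise ValueError("truncated extra field size")
    (size,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    end = start + size
    if end > len(data):
        raise ValueError("truncated extra field data")
    return data[start:end], end


# 10 placeholder header bytes, then size 3, then the 3 data bytes
sample = bytes(10) + b"\x03\x00abc" + b"rest"
extra, next_offset = read_extra_field(sample)
```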
Original file name
This is the original name of the file being compressed, with any directory components removed, and, if the file being compressed is on a file system with case insensitive names, forced to lower case.
Contains an ISO 8859-1 (LATIN-1) string terminated by an end-of-string character (a zero byte).
Comment
Contains an ISO 8859-1 (LATIN-1) string terminated by an end-of-string character (a zero byte). Line breaks should be denoted by a single line feed character.
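Both the original file name and the comment can be read the same way. The sketch below assumes the end-of-string character is a zero byte and that the caller already knows where the string starts; the function name `read_latin1_string` is hypothetical.

```python
def read_latin1_string(data, offset):
    """Return (text, offset_after_terminator) for a zero-terminated string."""
    # bytes.index raises ValueError when no terminator is found
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("latin-1"), end + 1


name, next_offset = read_latin1_string(b"file.txt\x00more", 0)
```

When both FNAME and FCOMMENT are set, the comment would start at the offset returned for the file name.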
Header checksum
The header checksum (CRC16) consists of the two least significant bytes of the CRC32 computed over all bytes of the gzip header up to, but not including, the CRC16 itself.
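The calculation can be sketched with `zlib.crc32`. The function name `header_crc16` is hypothetical, and the caller is assumed to pass exactly the header bytes that precede the stored checksum (file header plus any extra field, file name and comment).

```python
import zlib


def header_crc16(header_bytes):
    """Return the two least significant bytes of the CRC32 of the header."""
    return zlib.crc32(header_bytes) & 0xFFFF


# Header with only the FHCRC flag (0x02) set and no other optional parts
sample_header = b"\x1f\x8b\x08\x02\x00\x00\x00\x00\x00\x03"
checksum = header_crc16(sample_header)
# Stored form, assuming little-endian byte order as in RFC 1952
stored = checksum.to_bytes(2, "little")
```

To verify a file, the computed value would be compared with the 2 bytes that follow the last optional header part.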