Difference between revisions of "Gzip"

From Forensics Wiki

Revision as of 01:51, 28 November 2013


File format

The gzip file (.gz) format consists of:

  • a file header
  • optional extra headers
    • original file name
    • header checksum
  • a body, containing a DEFLATE-compressed payload
  • an 8-byte footer, containing a CRC-32 checksum and the length of the original uncompressed data.
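The footer layout described above can be illustrated with a minimal Python sketch. It assumes a file containing a single gzip member and assumes that both footer fields are stored as 32-bit little-endian integers (per RFC 1952, not stated on this page). The function name `read_gzip_footer` is hypothetical.

```python
import gzip
import struct
import zlib


def read_gzip_footer(data: bytes) -> tuple:
    """Return (crc32, isize) taken from the last 8 bytes of a gzip member."""
    if len(data) < 18:  # 10-byte header plus 8-byte footer is the bare minimum
        raise ValueError("too short to be a gzip file")
    # Assumption: two unsigned 32-bit little-endian values, CRC-32 first.
    crc32, isize = struct.unpack("<II", data[-8:])
    return crc32, isize


# Example input generated with the standard library.
payload = b"forensics wiki example"
blob = gzip.compress(payload)
crc32, isize = read_gzip_footer(blob)

# The stored values should match what we compute from the original data.
print(crc32 == zlib.crc32(payload), isize == len(payload))
```

Because the length field is only 32 bits wide, it cannot represent the true size of uncompressed data of 4 GiB or more, so it is best treated as a hint rather than proof.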

File header

The file header is 10 bytes in size and contains:

Offset  Size  Value      Description
0       2     0x1f 0x8b  Signature (or identification byte 1 and 2)
2       1                Compression method
3       1                Flags
4       4                Last modification time. Contains a POSIX timestamp.
8       1                Extra flags
9       1                Operating system. Value that indicates on which operating system the gzip file was created.
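As a rough illustration of the 10-byte layout, the following Python sketch unpacks the fixed header with the `struct` module. The function name `parse_gzip_header` and the dictionary keys are hypothetical, and little-endian storage of the timestamp is an assumption taken from RFC 1952 rather than from this page.

```python
import gzip
import struct

GZIP_SIGNATURE = b"\x1f\x8b"


def parse_gzip_header(data: bytes) -> dict:
    """Decode the fixed 10-byte gzip header into a dictionary."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes for the gzip header")
    # "<" = little-endian without padding: 2s signature, B compression method,
    # B flags, I modification time, B extra flags, B operating system.
    signature, method, flags, mtime, extra_flags, os_id = struct.unpack(
        "<2sBBIBB", data[:10]
    )
    if signature != GZIP_SIGNATURE:
        raise ValueError("signature is not 0x1f 0x8b")
    return {
        "method": method,
        "flags": flags,
        "mtime": mtime,
        "extra_flags": extra_flags,
        "os": os_id,
    }


# Example input: a gzip stream with a chosen (illustrative) timestamp.
sample = gzip.compress(b"example", mtime=1385603460)
header = parse_gzip_header(sample)
print(header)
```

The optional headers, if any, start directly after these 10 bytes, so a fuller parser would continue from offset 10 based on the flags value.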

Compression method

Value  Identifier  Description
0 - 7              Reserved
8      "deflate"   DEFLATE compressed data (raw stream, without a zlib header or trailer)

Flags

Value  Identifier  Description
0x01   FTEXT       If set, the uncompressed data should be treated as text rather than binary data. This flag hints at end-of-line conversion for cross-platform text files but does not enforce it.
0x02   FHCRC       The file contains a header checksum (CRC16)
0x04   FEXTRA
0x08   FNAME       The file contains an original file name string
0x10   FCOMMENT
0x20               Reserved
0x40               Reserved
0x80               Reserved

Note: The FHCRC bit was never set by versions of gzip up to 1.2.4, even though it was documented with a different meaning in gzip 1.2.4.
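The flags byte is a bit field, so several flags can be set at once. A minimal sketch of decoding it in Python follows; the names `GZIP_FLAGS`, `RESERVED_MASK` and `decode_flags` are hypothetical, and the bit values are those from the table above.

```python
# Bit values and identifiers as listed in the flags table.
GZIP_FLAGS = {
    0x01: "FTEXT",
    0x02: "FHCRC",
    0x04: "FEXTRA",
    0x08: "FNAME",
    0x10: "FCOMMENT",
}
RESERVED_MASK = 0xE0  # 0x20 | 0x40 | 0x80


def decode_flags(flags: int) -> list:
    """Return the identifiers of all flag bits set in the flags byte."""
    names = [name for bit, name in GZIP_FLAGS.items() if flags & bit]
    if flags & RESERVED_MASK:
        # Reserved bits are reported rather than silently ignored.
        names.append("RESERVED")
    return names


print(decode_flags(0x08))  # ['FNAME']
print(decode_flags(0x0A))  # ['FHCRC', 'FNAME']
```

Reporting reserved bits separately can be useful in a forensic context, since a set reserved bit may point to a corrupted or non-standard file.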

Extra flags

If the compression method is 8, the following extra flags can be defined:

Value  Identifier  Description
0x02               compressor used maximum compression, slowest algorithm
0x04               compressor used fastest algorithm

Operating System

Value  Identifier  Description
0                  FAT filesystem (MS-DOS, OS/2, NT/Win32)
1                  Amiga
2                  VMS (or OpenVMS)
3                  Unix
4                  VM/CMS
5                  Atari TOS
6                  HPFS filesystem (OS/2, NT)
7                  Macintosh
8                  Z-System
9                  CP/M
10                 TOPS-20
11                 NTFS filesystem (NT)
12                 QDOS
13                 Acorn RISCOS
255                unknown
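For tooling, the table above can be expressed as a simple lookup. This is only a sketch: the names `GZIP_OS` and `describe_os` are hypothetical, and the handling of values missing from the table is an assumption, not something this page defines.

```python
# Operating system values and descriptions as listed in the table.
GZIP_OS = {
    0: "FAT filesystem (MS-DOS, OS/2, NT/Win32)",
    1: "Amiga",
    2: "VMS (or OpenVMS)",
    3: "Unix",
    4: "VM/CMS",
    5: "Atari TOS",
    6: "HPFS filesystem (OS/2, NT)",
    7: "Macintosh",
    8: "Z-System",
    9: "CP/M",
    10: "TOPS-20",
    11: "NTFS filesystem (NT)",
    12: "QDOS",
    13: "Acorn RISCOS",
    255: "unknown",
}


def describe_os(os_id: int) -> str:
    """Map the operating system byte (header offset 9) to its description."""
    # Assumption: values not in the table are labeled instead of rejected.
    return GZIP_OS.get(os_id, "unlisted value %d" % os_id)


print(describe_os(3))    # Unix
print(describe_os(200))  # unlisted value 200
```

The value reflects what the creating tool chose to write, so it is an indicator of the originating system rather than conclusive evidence.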

Optional headers

Header checksum

The header checksum (CRC16) consists of the two least significant bytes of the CRC32 computed over all bytes of the gzip header up to, but not including, the CRC16 itself.
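The calculation can be sketched in Python as follows. The example builds a small gzip member with only the FHCRC flag set, so the CRC16 directly follows the 10-byte file header. The little-endian storage of the CRC16 is an assumption taken from RFC 1952. The names `header_crc16` and `build_member_with_hcrc` are hypothetical.

```python
import struct
import zlib

FHCRC = 0x02


def header_crc16(header_bytes: bytes) -> int:
    """Low 16 bits of the CRC32 over the header bytes preceding the CRC16."""
    return zlib.crc32(header_bytes) & 0xFFFF


def build_member_with_hcrc(payload: bytes) -> bytes:
    """Build a gzip member whose only optional header is the CRC16."""
    # Fixed header: signature, method 8, FHCRC flag, mtime 0,
    # no extra flags, operating system 255 (unknown).
    header = struct.pack("<2sBBIBB", b"\x1f\x8b", 8, FHCRC, 0, 0, 255)
    crc16 = struct.pack("<H", header_crc16(header))
    # Raw DEFLATE body: negative wbits suppresses any zlib or gzip wrapper.
    compressor = zlib.compressobj(wbits=-15)
    body = compressor.compress(payload) + compressor.flush()
    # Footer: CRC-32 of the uncompressed data and its length (32-bit).
    footer = struct.pack("<II", zlib.crc32(payload), len(payload) & 0xFFFFFFFF)
    return header + crc16 + body + footer


member = build_member_with_hcrc(b"header checksum demo")
# wbits=31 asks zlib to expect a gzip wrapper when decompressing.
print(zlib.decompress(member, wbits=31))
```

When other optional headers such as the original file name are present, their bytes precede the CRC16 and would have to be included in the input to `header_crc16`.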

External Links