FAT, or File Allocation Table, is a file system designed to keep track of the allocation status of clusters on a hard drive. Developed in 1977 by Microsoft Corporation, FAT was originally intended to be a file system for the Microsoft Disk BASIC interpreter. FAT was quickly incorporated into an early version of Tim Paterson's QDOS, short for "Quick and Dirty Operating System". Microsoft later purchased the rights to QDOS and released it as PC-DOS (under IBM branding) and, later, as MS-DOS (under its own branding).
File Allocation Table Structure
The FAT file system is composed of several areas:
- Boot Record or Boot Sector
- File Allocation Tables (FAT1 and FAT2)
- Root Directory or Root Folder
- Data Area
- Wasted Sectors
When a computer is powered on, a POST (power-on self test) is performed, and control is then transferred to the MBR (Master Boot Record). The MBR is present no matter what file system is in use, and contains information about how the storage device is logically partitioned. When using a FAT file system, the MBR hands off control of the computer to the Boot Record, which is the first sector on the partition. The Boot Record, which occupies a reserved area on the partition, contains executable code, in addition to information such as an OEM identifier, number of FATs, media descriptor (type of storage device), and information about the operating system to be booted. Once the Boot Record code executes, control is handed off to the operating system installed on that partition.
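As a rough illustration of the information held in the Boot Record, the sketch below decodes a few of the fields mentioned above (OEM identifier, number of FATs, media descriptor) from a raw sector. The byte offsets follow the commonly documented FAT12/FAT16 layout and are an assumption here, as are the helper names `parse_boot_record` and `make_sample_boot_record` and the sample values, which are made up to resemble a 1.44 MB floppy disk.

```python
import struct

# Assumed layout: a 3-byte jump instruction, an 8-byte OEM identifier, then
# little-endian parameter fields starting at byte offset 11.
BOOT_RECORD_FORMAT = "<8sHBHBHHBH"


def parse_boot_record(sector: bytes) -> dict:
    """Decode a few descriptive fields from a FAT12/FAT16 boot record."""
    if len(sector) < 512:
        raise ValueError("expected at least one 512-byte sector")
    (oem_id, bytes_per_sector, sectors_per_cluster, reserved_sectors,
     num_fats, root_entries, total_sectors, media_descriptor,
     sectors_per_fat) = struct.unpack_from(BOOT_RECORD_FORMAT, sector, 3)
    return {
        "oem_id": oem_id.decode("ascii", errors="replace").rstrip(),
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved_sectors,
        "num_fats": num_fats,
        "root_entries": root_entries,
        "total_sectors": total_sectors,
        "media_descriptor": media_descriptor,
        "sectors_per_fat": sectors_per_fat,
    }


def make_sample_boot_record() -> bytes:
    """Build a made-up boot record with values typical of a 1.44 MB floppy."""
    sector = bytearray(512)
    struct.pack_into(BOOT_RECORD_FORMAT, sector, 3,
                     b"MSDOS5.0", 512, 1, 1, 2, 224, 2880, 0xF0, 9)
    return bytes(sector)


if __name__ == "__main__":
    for key, value in parse_boot_record(make_sample_boot_record()).items():
        print(f"{key}: {value}")
```

On a real volume the sector would be read from the start of the partition rather than built in memory, and FAT32 adds further fields that this sketch does not decode.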
The primary task of the FATs is to keep track of the allocation status of clusters, or logical groupings of sectors, on the disk drive. There are four different possible FAT entries: allocated (along with the address of the next cluster associated with the file), unallocated, end of file, and bad sector.
In order to provide redundancy in case of data corruption, two FATs, FAT1 and FAT2, are stored in the file system. FAT2 is a duplicate of FAT1.
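The four kinds of FAT entry described above can be illustrated by following a file's cluster chain through a table. This is a minimal sketch that models a FAT16 table as a plain Python list of integers; the marker values (0x0000 for unallocated, 0xFFF7 for bad, 0xFFF8 and above for end of file) are the usual FAT16 conventions and are assumed here, and `classify` and `cluster_chain` are hypothetical helper names.

```python
FREE = 0x0000      # unallocated (assumed FAT16 convention)
BAD = 0xFFF7       # bad sector marker (assumed FAT16 convention)
EOF_MIN = 0xFFF8   # values from here up to 0xFFFF mean end of file (assumed)


def classify(entry: int) -> str:
    """Return which of the four entry kinds a FAT16 value represents."""
    if entry == FREE:
        return "unallocated"
    if entry == BAD:
        return "bad sector"
    if entry >= EOF_MIN:
        return "end of file"
    # Simplification: every other value is treated as a next-cluster pointer.
    return "allocated"


def cluster_chain(fat: list, start: int) -> list:
    """Follow next-cluster pointers from `start` until end of file."""
    chain = []
    seen = set()
    cluster = start
    while True:
        if cluster in seen:
            raise ValueError("loop detected in cluster chain")
        if not 2 <= cluster < len(fat):
            raise ValueError(f"cluster {cluster} is outside the table")
        seen.add(cluster)
        chain.append(cluster)
        kind = classify(fat[cluster])
        if kind == "end of file":
            return chain
        if kind != "allocated":
            raise ValueError(f"chain reaches a {kind} entry at {cluster}")
        cluster = fat[cluster]


if __name__ == "__main__":
    # Made-up table: entries 0 and 1 are reserved; a file occupies 2 -> 3 -> 5.
    fat = [0xFFF8, 0xFFFF, 3, 5, FREE, 0xFFFF, BAD]
    print(cluster_chain(fat, 2))
```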
The Root Directory, sometimes referred to as the Root Folder, contains an entry for each file and directory stored at the top level of the file system. Each entry includes the file name, starting cluster number, and file size, and is updated whenever a file is created or subsequently modified. On FAT12 and FAT16 the Root Directory has a fixed size: 512 entries on a hard disk, while on a floppy disk the size depends on the disk format. In FAT32 it can be stored anywhere within the partition, whereas in earlier versions it is always located immediately after the FAT region.
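To make the contents of a directory entry concrete, the sketch below decodes the file name, starting cluster number, and file size from one entry. It assumes the commonly documented 32-byte short-name layout (8-byte name, 3-byte extension, attribute byte at offset 11, starting cluster at offset 26, size at offset 28); the function name `parse_dir_entry` and the sample entry are made up for illustration.

```python
import struct

# Assumed 32-byte layout: name (8), extension (3), attributes (1),
# 14 bytes skipped, starting cluster (2), file size (4), little-endian.
DIR_ENTRY_FORMAT = "<8s3sB14xHI"


def parse_dir_entry(raw: bytes) -> dict:
    """Decode the fields of one short-name directory entry."""
    if len(raw) != 32:
        raise ValueError("a directory entry is assumed to be 32 bytes")
    name, ext, attributes, start_cluster, size = struct.unpack(
        DIR_ENTRY_FORMAT, raw)
    base = name.decode("ascii", errors="replace").rstrip()
    suffix = ext.decode("ascii", errors="replace").rstrip()
    return {
        "name": f"{base}.{suffix}" if suffix else base,
        "attributes": attributes,
        "start_cluster": start_cluster,
        "size": size,
    }


if __name__ == "__main__":
    # Made-up entry for a 1,234-byte file that starts at cluster 2.
    raw = (b"README  " + b"TXT" + bytes([0x20]) + bytes(14)
           + struct.pack("<HI", 2, 1234))
    print(parse_dir_entry(raw))
```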
The Boot Record, FATs, and Root Directory are collectively referred to as the System Area. The remaining space on the logical drive is called the Data Area, which is where files are actually stored. It should be noted that when a file is deleted by the operating system, the data stored in the Data Area remains intact until it is overwritten.
In order for FAT to manage files with satisfactory efficiency, it groups sectors into larger blocks referred to as clusters. A cluster is the smallest unit of disk space that can be allocated to a file, which is why clusters are often called allocation units. Only the Data Area is divided into clusters; the rest of the partition is simply addressed as sectors. Cluster size is determined by the size of the disk volume, and every file must be allocated a whole number of clusters. Cluster sizing has a significant impact on performance and disk utilization. Larger cluster sizes result in more wasted space because files are less likely to fill up a whole number of clusters.
The size of one cluster is specified in the Boot Record and can range from a single sector (512 bytes) to 128 sectors (65,536 bytes). The sectors in a cluster are contiguous, so each cluster is a contiguous block of space on the disk. Note that a cluster can be allocated to only one file. Therefore, if a 1 KB file is placed within a 32 KB cluster, 31 KB of space is wasted.
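The arithmetic behind the 1 KB file in a 32 KB cluster can be sketched as follows. The function names are hypothetical, and the sketch assumes that an empty file is allocated no clusters at all.

```python
def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Whole number of clusters required to hold `file_size` bytes."""
    if file_size == 0:
        return 0  # assumption: an empty file owns no clusters
    return -(-file_size // cluster_size)  # ceiling division


def wasted_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to the file but not occupied by its data."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


if __name__ == "__main__":
    KB = 1024
    print(wasted_bytes(1 * KB, 32 * KB) // KB, "KB wasted with 32 KB clusters")
    print(wasted_bytes(1 * KB, 4 * KB) // KB, "KB wasted with 4 KB clusters")
```

With 4 KB clusters the same file wastes only 3 KB, which is why smaller clusters use space more efficiently for small files.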
Wasted Sectors result when the number of data sectors is not evenly divisible by the cluster size: the leftover sectors at the end of the Data Area cannot form a complete cluster and so are never used. Also, if the partition as declared in the partition table is larger than the size claimed in the Boot Record, the volume can be said to have wasted sectors. Space is also wasted by the unused bytes left at the end of a file's last cluster. Small files are the main cause of this waste, and because larger volumes use larger clusters, the bigger the hard drive, the more space is wasted.
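The first kind of wasted sector, the remainder left when the Data Area is divided into clusters, can be worked out from the sizes of the other areas. The sketch below is illustrative only: it assumes 32-byte directory entries and a fixed-size Root Directory (as on FAT12 and FAT16), the function name is hypothetical, and the volume figures in the usage example are made up.

```python
def data_area_layout(total_sectors: int, reserved_sectors: int,
                     num_fats: int, sectors_per_fat: int,
                     root_entries: int, sectors_per_cluster: int,
                     bytes_per_sector: int = 512) -> tuple:
    """Return (clusters in the Data Area, wasted sectors left over)."""
    # Assumption: each Root Directory entry occupies 32 bytes.
    root_dir_sectors = -(-(root_entries * 32) // bytes_per_sector)
    system_sectors = (reserved_sectors + num_fats * sectors_per_fat
                      + root_dir_sectors)
    data_sectors = total_sectors - system_sectors
    # Sectors that cannot form a complete cluster are wasted.
    return divmod(data_sectors, sectors_per_cluster)


if __name__ == "__main__":
    # Made-up volume: 20,000 sectors, 4 sectors per cluster.
    clusters, wasted = data_area_layout(20000, 1, 2, 20, 512, 4)
    print(f"{clusters} clusters, {wasted} wasted sectors")
```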
There are three variants of FAT in existence: FAT12, FAT16, and FAT32.
- FAT12 is the oldest type of FAT and uses 12-bit file allocation table entries.
- FAT12 can hold a maximum of 4,086 clusters (2^12 clusters minus a few values that are reserved for use in the FAT).
- It is used for floppy disks and hard drive partitions that are smaller than 16 MB.
- All 1.44 MB 3.5" floppy disks are formatted using FAT12.
- The cluster size used is between 0.5 KB and 4 KB.
- FAT16 is so called because all of its entries are 16 bits wide.
- FAT16 can address a maximum of 65,536 units (2^16).
- The actual capacity is 65,525 clusters, because some values are reserved.
- It is used for small and moderate-sized hard disk volumes.
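The limits quoted above for FAT12 and FAT16 follow from the entry width and the cluster size. As a rough check, the sketch below multiplies the usable cluster counts given in the text by a cluster size. The 4 KB figure for FAT12 comes from the list above, while the 32 KB cluster size used for FAT16 is an assumption chosen for illustration. The function names are hypothetical.

```python
def addressable_values(entry_bits: int) -> int:
    """Number of distinct values a FAT entry of the given width can hold."""
    return 2 ** entry_bits


def max_data_bytes(usable_clusters: int, cluster_bytes: int) -> int:
    """Largest Data Area reachable with this many clusters of this size."""
    return usable_clusters * cluster_bytes


if __name__ == "__main__":
    MIB = 1024 * 1024
    print(addressable_values(12), addressable_values(16))
    # FAT12: 4,086 usable clusters of 4 KB each, just under 16 MB.
    print(max_data_bytes(4086, 4 * 1024) / MIB)
    # FAT16: 65,525 usable clusters of an assumed 32 KB each, just under 2 GB.
    print(max_data_bytes(65525, 32 * 1024) / MIB)
```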
FAT32 is the enhanced version of the FAT system implemented beginning with Windows 95 OSR2, Windows 98, and Windows Me. Features include:
- Drives of up to 2 terabytes are supported (Windows 2000 can only format FAT32 volumes of up to 32 gigabytes).
- Because FAT32 uses smaller clusters (4 kilobytes each on drives of up to 8 gigabytes), it uses hard drive space more efficiently, by roughly 10 to 15 percent compared with large FAT or FAT16 drives.
- The limitations of FAT or FAT16 on the number of root folder entries have been eliminated. In FAT32, the root folder is an ordinary cluster chain and can be located anywhere on the drive.
- File allocation table mirroring can be disabled in FAT32. This allows a copy of the file allocation table other than the default to be active.
Comparison of FAT Versions
Table adapted from: http://en.wikipedia.org/wiki/File_Allocation_Table
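|Feature||FAT12||FAT16||FAT32|
|Full name||File Allocation Table (12-bit version)||File Allocation Table (16-bit version)||File Allocation Table (32-bit version)|
|Introduced||1977 (Microsoft Disk BASIC)||July 1988 (MS-DOS 4.0)||August 1996 (Windows 95 OSR2)|
|Partition identifier||0x01 (MBR)||0x04, 0x06, 0x0E (MBR)||0x0B, 0x0C (MBR)|
|File allocation||Linked list||Linked list||Linked list|
|Bad blocks||Linked list||Linked list||Linked list|
|Max file size||32 MiB||2 GiB||4 GiB|
|Max number of files||4,077||65,517||268,435,437|
|Max filename size||8.3, or 255 characters when using LFNs||8.3, or 255 characters when using LFNs||8.3, or 255 characters when using LFNs|
|Max volume size||16 MiB||2 GiB for all (4 GiB for some)||32 GiB for all OS (2 TiB for some)|
|Dates recorded||Creation, modified, access||Creation, modified, access||Creation, modified, access|
|Date range||January 1, 1980 - December 31, 2107||January 1, 1980 - December 31, 2107||January 1, 1980 - December 31, 2107|
|Unicode file names||System character set||System character set||System character set|
|Attributes||Read-only, hidden, system, volume label, subdirectory, archive||Read-only, hidden, system, volume label, subdirectory, archive||Read-only, hidden, system, volume label, subdirectory, archive|
|Transparent compression||Per-volume, Stacker, DoubleSpace, DriveSpace||Per-volume, Stacker, DoubleSpace, DriveSpace||No|
|Transparent encryption||Per-volume only with DR-DOS||Per-volume only with DR-DOS||No|
|Disk space economy||Average||Minimal on large volumes||Max|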
Applications of FAT
Due to its low cost, portability, and non-volatile nature, flash memory has quickly become the medium of choice for storing and transferring data in consumer electronic devices. The majority of flash memory storage is formatted using the FAT file system. FAT is also frequently used in electronic devices with miniature hard drives.
Examples of devices in which FAT is utilized include:
- USB thumb drives
- Digital cameras
- Digital camcorders
- Portable audio and video players
- Multifunction printers
- Electronic photo frames
- Electronic musical instruments
- Standard televisions
Deleted directory entries on a FAT file system can be recovered, as part of recovering deleted data, by looking for entries that begin with the byte 0xE5, which is displayed as a lowercase sigma (σ) in the original IBM PC character set. When a file or directory is deleted under a FAT file system, the first character of its name is changed to 0xE5. The remainder of the directory entry remains intact.
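A scan for deleted entries along these lines might look like the sketch below. It assumes the commonly documented 32-byte short-name entry layout, that a first byte of 0x00 marks the end of the used entries, and that an attribute byte of 0x0F marks a long file name fragment to be skipped. The function name and sample data are made up for illustration.

```python
import struct

ENTRY_SIZE = 32            # assumed size of one directory entry
DELETED_MARKER = 0xE5      # first byte of a deleted entry
END_OF_DIRECTORY = 0x00    # assumed: no used entries after this one
LFN_ATTRIBUTE = 0x0F       # assumed: marks a long file name fragment


def find_deleted_entries(directory: bytes) -> list:
    """Return name, starting cluster, and size of each deleted entry."""
    found = []
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        raw = directory[offset:offset + ENTRY_SIZE]
        if raw[0] == END_OF_DIRECTORY:
            break
        if raw[0] != DELETED_MARKER:
            continue
        name, ext, attributes, start_cluster, size = struct.unpack(
            "<8s3sB14xHI", raw)
        if attributes == LFN_ATTRIBUTE:
            continue
        # The first character was overwritten, so show "?" in its place.
        base = "?" + name[1:].decode("ascii", errors="replace").rstrip()
        suffix = ext.decode("ascii", errors="replace").rstrip()
        found.append({
            "name": f"{base}.{suffix}" if suffix else base,
            "start_cluster": start_cluster,
            "size": size,
        })
    return found


if __name__ == "__main__":
    tail = bytes([0x20]) + bytes(14)
    live = b"README  " + b"TXT" + tail + struct.pack("<HI", 2, 1234)
    dead = b"\xe5OTES   " + b"TXT" + tail + struct.pack("<HI", 7, 5000)
    print(find_deleted_entries(live + dead + bytes(ENTRY_SIZE)))
```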
The FAT entries for each cluster used by the file are also set to zero, marking the clusters as unallocated. Recovery tools look at the directory entry for the file, where the location of the starting cluster is still recorded. It is not deleted or modified. The tool goes straight to that cluster and tries to recover the file, using the file size to determine how many clusters, "X", to read. Some tools go to the starting cluster and simply recover the next "X" clusters. However, this approach is not ideal. A better tool collects the next "X" unallocated clusters, skipping any that are allocated to other files. Since files are often fragmented, this is a more reliable way to recover the file.
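The difference between the two recovery strategies can be sketched with a FAT modelled as a list of integers, where zero means unallocated. Both function names are hypothetical, and the table is made up: a deleted three-cluster file once occupied clusters 2, 3, and 6, while clusters 4 and 5 belong to a file that still exists.

```python
def recover_contiguous(start: int, count: int) -> list:
    """Naive strategy: take the next `count` clusters, allocated or not."""
    return list(range(start, start + count))


def recover_unallocated(fat: list, start: int, count: int) -> list:
    """Better strategy: take the next `count` unallocated clusters."""
    clusters = []
    cluster = start
    while len(clusters) < count and cluster < len(fat):
        if fat[cluster] == 0:  # zero marks an unallocated cluster
            clusters.append(cluster)
        cluster += 1
    return clusters


if __name__ == "__main__":
    # Entries 0 and 1 are reserved; clusters 4 -> 5 hold a live file.
    fat = [0xFFF8, 0xFFFF, 0, 0, 5, 0xFFFF, 0, 0]
    print(recover_contiguous(2, 3))        # includes cluster 4 by mistake
    print(recover_unallocated(fat, 2, 3))  # skips the live file's clusters
```

Neither strategy can tell which unallocated clusters belonged to which deleted file, so the second approach is still only a heuristic.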
An issue arises when two files occupying the same run of clusters are deleted. If a file's clusters are not in sequential order, the tool will still retrieve the next "X" unallocated clusters. However, because the file was fragmented, some of the clusters obtained will most likely belong to the other deleted file rather than to the file being recovered. When two deleted files are interleaved in the same run of clusters, it is highly unlikely that the file can be recovered.
File slack is the space between the end of the data written to a file and the end of the last cluster allocated to the file. There are two types of file slack: RAM slack and residual slack. RAM slack starts at the end of the file and runs to the end of that sector. Residual slack starts at the next sector and runs to the end of the cluster allocated to the file. File slack is helpful when analyzing a hard drive because old data that has not been overwritten by the new file is still intact. See http://www.pcguide.com/ref/hdd/file/partSizes-c.html for examples.
The diagram above demonstrates that the larger the cluster size used, the more disk space is wasted due to slack. This suggests that it is better to use smaller cluster sizes whenever possible.
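The split between RAM slack and residual slack can be computed directly from the file size. The sketch below is illustrative: the function name is hypothetical, and the default sector and cluster sizes (512-byte sectors, 8 sectors per cluster) are assumptions rather than values taken from the text.

```python
def file_slack(file_size: int, sector_size: int = 512,
               sectors_per_cluster: int = 8) -> tuple:
    """Return (RAM slack, residual slack) in bytes for a file's last cluster."""
    if file_size == 0:
        return (0, 0)  # assumption: an empty file owns no clusters
    cluster_size = sector_size * sectors_per_cluster
    # RAM slack: from the end of the file to the end of its last sector.
    ram_slack = -file_size % sector_size
    # Total slack: from the end of the file to the end of its last cluster.
    total_slack = -file_size % cluster_size
    # Residual slack: the whole sectors that remain after the RAM slack.
    return (ram_slack, total_slack - ram_slack)


if __name__ == "__main__":
    ram, residual = file_slack(1300)
    print(f"RAM slack: {ram} bytes, residual slack: {residual} bytes")
```

For a 1,300-byte file in a 4 KB cluster this gives 236 bytes of RAM slack and 2,560 bytes (five sectors) of residual slack.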