Difference between revisions of "Text File (TXT)"

From Forensics Wiki
(Expanded information which was suggesting ASCII was the only encoding possible.)

Revision as of 21:50, 23 May 2007

Text file formats usually have the .txt extension.

These files contain text along with some control data such as tabs and line feeds. [http://en.wikipedia.org/wiki/Text_file] The text may use printable characters, punctuation, spaces, and a limited number of control characters. Text files fall into several major types, distinguished by line ending and character encoding:

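  • DOS/Windows format ends each line with a Carriage Return (CR, char(13)) followed by a Line Feed (LF, char(10)), i.e. the byte sequence 0x0D 0x0A.
  • Unix format ends each line with a Line Feed (LF, char(10)) only. A lone Carriage Return (CR, char(13)) was the line ending of classic Mac OS, not of Unix.
  • Unicode text may begin with an optional Byte Order Mark (BOM) that identifies the Unicode encoding. In UTF-16 the BOM occupies the first two bytes (FE FF for big endian, FF FE for little endian); in UTF-8 it is the three bytes EF BB BF. It is mainly used to identify little endian or big endian byte order.
  • EBCDIC uses the New Line (NL) control character, byte 0x15 (decimal 21), for a new line. [http://en.wikipedia.org/wiki/EBCDIC]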
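
As an illustration of the distinctions above, the following is a minimal Python sketch for triaging an unknown text file by its leading bytes and its line endings. The function names `detect_bom` and `count_line_endings` are hypothetical, chosen for this example. The line-ending count works on raw bytes, so it assumes an ASCII-compatible encoding (ASCII, UTF-8, or a similar single-byte code page); it would miscount UTF-16 or EBCDIC data.

```python
import codecs

# BOM signatures, longest first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw bytes.

    Assumption: the data is in an ASCII-compatible encoding.
    """
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,                      # DOS/Windows
        "LF": data.count(b"\n") - crlf,    # Unix
        "CR": data.count(b"\r") - crlf,    # classic Mac OS
    }
```

A file with mixed counts (for example, mostly CRLF with a few bare LF) may have been edited on more than one platform, which can be worth noting during an examination.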

Text files are usually ASCII encoded, although other encodings are possible to allow various language scripts to be used. Other encodings include EBCDIC, used on older IBM mainframes. Text files can have the MIME type "text/plain", often with a parameter indicating the encoding (e.g. "text/plain;charset=UTF-8"). Any basic text reader can be used to view the contents of a simple text file; however, some (notably Notepad) have issues with certain less popular encodings.
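
The effect of the declared encoding can be sketched in Python as follows. The helper `decode_text` is a hypothetical name for this example; it reads the `charset` parameter from a MIME type string using the standard library's `email.message.Message` and falls back to ASCII when none is given. `cp037` is one of Python's EBCDIC code pages and is used here only as a representative EBCDIC variant, since real mainframe data may use a different one.

```python
from email.message import Message


def decode_text(data: bytes, mime_type: str = "text/plain") -> str:
    """Decode raw bytes using the charset named in a MIME type string.

    Assumption: a missing charset parameter means ASCII. Bytes that are
    invalid in the chosen encoding are replaced rather than raising.
    """
    msg = Message()
    msg["Content-Type"] = mime_type
    charset = msg.get_content_charset() or "ascii"
    return data.decode(charset, errors="replace")


# The same helper handles UTF-8 and EBCDIC input when told the charset.
utf8_text = decode_text(b"caf\xc3\xa9", "text/plain;charset=UTF-8")
ebcdic_text = decode_text(b"\xc8\xc5\xd3\xd3\xd6", "text/plain;charset=cp037")
```

Decoding the EBCDIC bytes as ASCII instead would yield only replacement characters, which is the kind of garbled display a viewer produces when it guesses the wrong encoding.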