<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://www.forensicswiki.org/w/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;feed=atom&amp;action=history</id>
		<title>Arabic PDFs - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;feed=atom&amp;action=history"/>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;action=history"/>
		<updated>2013-05-21T08:37:18Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.20.3</generator>

	<entry>
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9387&amp;oldid=prev</id>
		<title>Jessek: Minor cleanup, some editing</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9387&amp;oldid=prev"/>
				<updated>2009-03-12T13:42:49Z</updated>
		
		<summary type="html">&lt;p&gt;Minor cleanup, some editing&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
			&lt;tr style='vertical-align: top;'&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;← Older revision&lt;/td&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 13:42, 12 March 2009&lt;/td&gt;
			&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;This page discusses issues that arise when working with Adobe &lt;/del&gt;PDF &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;files &lt;/del&gt;that contain Arabic. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Documents in [[&lt;/ins&gt;PDF&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] format &lt;/ins&gt;that contain Arabic &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;text may present problems for examiners. This page presents the difference between a glyph and a character and how PDF documents can commingle them&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==Glyphs vs. Characters==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==Glyphs vs. Characters==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The term ''character'' describes an abstract concept of a letter. The term ''glyph'' describes how a character prints. A single character can have multiple glyphs (for example, glyphs with serifs and those without). A single glyph can have multiple characters &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;(for &lt;/del&gt;example, a lower case &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot;&lt;/del&gt;l&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot; &lt;/del&gt;and a capital &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot;&lt;/del&gt;I&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot; in Helvetica&lt;/del&gt;).&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The term ''character'' describes an abstract concept of a letter. The term ''glyph'' describes how a character prints. A single character can have multiple glyphs (for example, glyphs with serifs and those without). A single glyph can have multiple characters&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. For &lt;/ins&gt;example, &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;in the font Helvetica the same shape is used to represent &lt;/ins&gt;a lower case &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;'&lt;/ins&gt;l&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;' (ell) &lt;/ins&gt;and a capital &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;'&lt;/ins&gt;I&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;' (eye&lt;/ins&gt;).&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;PDFs can contain glyphs or &lt;/del&gt;characters.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Unicode is set up as a catalog of nominal &lt;/ins&gt;characters&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;, independent and irrespective of the (computer) typographical consequences. For example, the English/US Unicode page has an entry for the character 'lower case a', but does not define how that letter should be displayed&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Unicode is set up as a catalogue of nominal &lt;/del&gt;characters&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;, independent and irrespective of the (computer) typographical consequences&lt;/del&gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;PDF documents can contain both glyphs and &lt;/ins&gt;characters. Modern PDFs &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;emulate &lt;/ins&gt;the 19th century-style metal-based typesetting process. Ideally PDFs should encode characters, not glyphs&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;sup&amp;gt;citation needed&amp;lt;/sup&amp;gt;&lt;/ins&gt;. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;When &lt;/ins&gt;typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;Modern PDFs &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;essentially describe &lt;/del&gt;the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;result of a &lt;/del&gt;19th century-style metal-based typesetting process. Ideally PDFs should encode characters, not glyphs. &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;But when &lt;/del&gt;typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;By interpreting the Unicode standard as a look-up for glyph indexes, Unicode is abused as if it were a huge font book. This confuses multi-lingual encoding with computer typography. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;By interpreting the Unicode standard as a look-up for glyph indexes, Unicode is abused as if it were a huge font book. This confuses multi-lingual encoding with computer typography. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Jessek</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9386&amp;oldid=prev</id>
		<title>Simsong: /* Glyphs vs. Characters */</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9386&amp;oldid=prev"/>
				<updated>2009-03-09T23:17:00Z</updated>
		
		<summary type="html">&lt;p&gt;‎&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Glyphs vs. Characters&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
			&lt;tr style='vertical-align: top;'&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;← Older revision&lt;/td&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 23:17, 9 March 2009&lt;/td&gt;
			&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 2:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 2:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==Glyphs vs. Characters==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==Glyphs vs. Characters==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The term ''character'' describes an abstract concept of a letter. The term ''glyph'' describes how a character prints. A single character can have multiple glyphs (for example, glyphs with serifs and those without). A single glyph can have multiple characters (for example, a lower case &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;''&lt;/del&gt;l&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;'' &lt;/del&gt;and a capital &amp;quot;I&amp;quot; in Helvetica).&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The term ''character'' describes an abstract concept of a letter. The term ''glyph'' describes how a character prints. A single character can have multiple glyphs (for example, glyphs with serifs and those without). A single glyph can have multiple characters (for example, a lower case &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot;&lt;/ins&gt;l&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;quot; &lt;/ins&gt;and a capital &amp;quot;I&amp;quot; in Helvetica).&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;PDFs can contain glyphs or characters.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;PDFs can contain glyphs or characters.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Simsong</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9385&amp;oldid=prev</id>
		<title>Simsong at 12:38, 8 March 2009</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9385&amp;oldid=prev"/>
				<updated>2009-03-08T12:38:46Z</updated>
		
		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
			&lt;tr style='vertical-align: top;'&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;← Older revision&lt;/td&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 12:38, 8 March 2009&lt;/td&gt;
			&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting &lt;/del&gt;Arabic&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book&lt;/del&gt;. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;This page discusses issues that arise when working with Adobe PDF files that contain &lt;/ins&gt;Arabic. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;By interpreting the Unicode standard as a look-up for glyph indexes, Unicode is abused as if it were a huge font book. This confuses multi-lingual encoding with computer typography&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;. Unicode is set up as a catalogue of nominal characters, independent and irrespective of the (computer) typographical consequences&lt;/del&gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;==Glyphs vs. Characters==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;The term ''character'' describes an abstract concept of a letter. The term ''glyph'' describes how a character prints. A single character can have multiple glyphs (for example, glyphs with serifs and those without). A single glyph can have multiple characters (for example, a lower case ''l'' and a capital &amp;quot;I&amp;quot; in Helvetica).&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;PDFs can contain glyphs or characters.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Unicode is set up as a catalogue of nominal characters, independent and irrespective of the (computer) typographical consequences.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. Ideally PDFs should encode characters, not glyphs. But when typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book. &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;By interpreting the Unicode standard as a look-up for glyph indexes, Unicode is abused as if it were a huge font book. This confuses multi-lingual encoding with computer typography. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;An underlying cause for this error is the idea that there can be such a thing as a Character-Glyph model. However, in the real world there is no connection between abstract characters and the glyphs used to represent them. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;An underlying cause for this error is the idea that there can be such a thing as a Character-Glyph model. However, in the real world there is no connection between abstract characters and the glyphs used to represent them. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Simsong</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9384&amp;oldid=prev</id>
		<title>Simsong at 15:41, 7 March 2009</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9384&amp;oldid=prev"/>
				<updated>2009-03-07T15:41:28Z</updated>
		
		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
			&lt;tr style='vertical-align: top;'&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;← Older revision&lt;/td&gt;
			&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 15:41, 7 March 2009&lt;/td&gt;
			&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;By interpreting the Unicode standard as a look-up for glyph indexes, Unicode is abused as if it were a huge font book. This confuses multi-lingual encoding with computer typography. Unicode is set up as a catalogue of nominal characters, independent and irrespective of the (computer) typographical consequences&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;An underlying cause for this error is the idea that there can be such a thing as a Character-Glyph model. However, in the real world there is no connection between abstract characters and the glyphs used to represent them. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;An underlying cause for this error is the idea that there can be such a thing as a Character-Glyph model. However, in the real world there is no connection between abstract characters and the glyphs used to represent them. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Simsong</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9383&amp;oldid=prev</id>
		<title>Simsong: New page: Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting Arabic, Unicode is used as a glyph list, rather than a character list....</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/w/index.php?title=Arabic_PDFs&amp;diff=9383&amp;oldid=prev"/>
				<updated>2009-03-07T01:27:06Z</updated>
		
		<summary type="html">&lt;p&gt;New page: Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting Arabic, Unicode is used as a glyph list, rather than a character list....&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Modern PDFs essentially describe the result of a 19th century-style metal-based typesetting process. When typesetting Arabic, Unicode is used as a glyph list, rather than a character list. The glyphs are used as indexes into a huge font book.&lt;br /&gt;
&lt;br /&gt;
An underlying cause for this error is the idea that there can be such a thing as a Character-Glyph model. However, in the real world there is no connection between abstract characters and the glyphs used to represent them. &lt;br /&gt;
&lt;br /&gt;
Increasingly, font designers are discovering the enormous conceptual freedom one gets without any Character-Glyph constraint. But Adobe still uses the Unicode standard to extract the nominal character values from the font glyph numbers used to represent them. That is why more advanced Arabic fonts that do not use the Unicode Presentation Blocks produce gibberish when text is extracted from the PDF.&lt;br /&gt;
&lt;br /&gt;
Future versions of PDF are planned to embed Unicode as text in addition to the font information, which would resolve this issue.&lt;br /&gt;
&lt;br /&gt;
Part of the problem is that Unicode’s Arabic Presentation Blocks are officially deprecated by the Unicode Consortium. Their inclusion was, at the time (the late 1980s), a technical compromise to allow ISO 10646 to join Unicode. As such the compromise was incomplete, as only 400 of the 4000 originally requested Arabic ligatures were allowed to remain in the Unicode Standard. Ironically, all the printed examples in the Unicode standard were designed by Thomas Milo based on computer-generated synthesis of the underlying letter block fusions of traditional Arabic &amp;quot;Script Grammar&amp;quot;. This was done using DecoType’s famous ACE technology, which eventually became the working model for Microsoft’s TrueType Open, the precursor of today's OpenType. &lt;br /&gt;
 &lt;br /&gt;
Arabic Presentation Forms should never be encoded; doing so amounts to reverting to Font Pages, whose very proliferation caused the development of a more intelligent alternative: Unicode.&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
*http://www.river-valley.tv/conferences/arabic_typography_2008/&lt;br /&gt;
*http://www.river-valley.tv/conferences/non_latintypefacedesign/&lt;/div&gt;</summary>
		<author><name>Simsong</name></author>	</entry>

	</feed>