diff xml/libpst.in @ 35:b2f247463b83 stable-0-5-6

better decoding of 7c blocks
author carl
date Sun, 15 Jul 2007 14:25:34 -0700
parents 12cac756bc05
children 6fe121a971c9
--- a/xml/libpst.in	Thu Jul 12 14:59:13 2007 -0700
+++ b/xml/libpst.in	Sun Jul 15 14:25:34 2007 -0700
@@ -652,10 +652,10 @@
                 match the backPointer from the triple that pointed to this node.
             </para>
             <para>
-                Each item in this node is a triple of (ID, backPointer, offset)
+                Each item in this node is a triple of (ID1, backPointer, offset)
                 where the offset points to the next deeper node in the tree, the
                 backPointer value must match the backPointer in that deeper node,
-                and ID is the lowest ID value in the subtree.
+                and ID1 is the lowest ID1 value in the subtree.
             </para>
         </refsect1>
 
@@ -722,6 +722,12 @@
             </para>
             <para>
                 Each item in this node is a tuple of (ID1, offset, size, unknown).
+                The two low order bits of the ID1 value seem to be flags. I have
+                never seen a case with bit zero set. Bit one indicates that the
+                item is <emphasis>not</emphasis> encrypted. Note that references
+                to these ID1 values elsewhere may have the low order bit set (and
+                I don't know what that means), but when we do the search in this
+                tree we need to clear that bit so that we can find the correct item.
             </para>
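The flag handling described in the added paragraph can be sketched in Python as follows. This is a minimal illustration only: the helper names are hypothetical, and the meaning of the bits is taken from the observations in the text, not from any confirmed specification.

```python
def id1_search_key(id1):
    """Clear bit zero so that a referenced ID1 matches the value stored in the tree."""
    return id1 & ~0x1


def id1_is_unencrypted(id1):
    """Bit one set is taken to mean 'not encrypted', as observed in the text."""
    return bool(id1 & 0x2)
```

A lookup in the ID1 tree would then search for `id1_search_key(value)` instead of the raw referenced value.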
         </refsect1>
 
@@ -905,34 +911,42 @@
 
 0000  indexOffset     [2 bytes] 0x013c     in this case
 0002  signature       [2 bytes] 0xbcec     constant
-0004  offset          [2 bytes] 0x0020     in this case
+0004  b5offset        [4 bytes] 0x0020     index reference
 ]]></literallayout>
             <para>
-                Note the signature of 0xbcec. There are other descriptor block
-                formats with other signatures.
-                Note the indexOffset of 0x013c - starting at that position in the
-                descriptor block, we have an array of two byte integers. The first
-                integer (0x000b) is a (count-1) of the number of overlapping pairs
-                following the count. The first pair is (0, 0xc), the next pair is (0xc, 0x14)
-                and the last (12th) pair is (0x123, 0x13b). These pairs are (start,end+1)
-                offsets of items in this block. So we have count+2 integers following
-                the count value.
+                Note the signature of 0xbcec.  There are other descriptor block formats
+                with other signatures.  Note the indexOffset of 0x013c - starting at
+                that position in the descriptor block, we have an array of two byte
+                integers.  The first integer (0x000b) is a (count-1) of the number of
+                overlapping pairs following the count.  The first pair is (0, 0xc), the
+                next pair is (0xc, 0x14) and the last (12th) pair is (0x123, 0x13b).
+                These pairs are (start,end+1) offsets of items in this block.  So we
+                have count+2 integers following the count value.
             </para>
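As a rough sketch of the index table layout described above, the following Python reads the stored (count-1) value and returns the overlapping (start, end+1) pairs. The function name is hypothetical, and little-endian byte order is an assumption based on the hex dumps in this document (for example, bytes `7a 01` read as 0x017a).

```python
import struct


def read_index_pairs(block, index_offset):
    """Return the (start, end+1) pairs stored at index_offset in a descriptor block."""
    # The first two byte integer is (count - 1) of the overlapping pairs.
    (stored,) = struct.unpack_from("<H", block, index_offset)
    n_pairs = stored + 1
    # n_pairs overlapping pairs need n_pairs + 1 boundary values.
    bounds = struct.unpack_from("<%dH" % (n_pairs + 1), block, index_offset + 2)
    return list(zip(bounds, bounds[1:]))
```

The size of each item is the difference between the two values of its pair.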
             <para>
-                Note the offset of 0x0020, which needs to be right shifted by 4 bits
-                to become 0x0002, which is then a byte offset to be added to the above
-                indexOffset plus two (to skip the count), so it points to the (0xc, 0x14)
-                pair. Finally, we have the offset and size of the "b5" block located at offset 0xc
+                Note the b5offset of 0x0020, which is a type that I will call an index
+                reference.  Such index references have at least two different forms, and
+                may point to data either in this block, or in some other block.
+                External pointer references have the low order 4 bits all set, and are
+                ID2 values that can be used to fetch data.  This value of 0x0020 is an
+                internal pointer reference, which needs to be right shifted by 4 bits to
+                become 0x0002, which is then a byte offset to be added to the above
+                indexOffset plus two (to skip the count), so it points to the (0xc,
+                0x14) pair.
+            </para>
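The two forms of index reference described above might be decoded along these lines. This is a sketch under stated assumptions: the function name and the returned tuples are hypothetical, and little-endian byte order is inferred from the hex dumps in this document.

```python
import struct


def resolve_index_reference(block, index_offset, ref):
    """Decode an index reference found in a descriptor block.

    Returns ("id2", ref) for an external reference, or
    ("internal", offset, size) for an item inside this block.
    """
    if ref & 0xF == 0xF:
        # Low order 4 bits all set: an ID2 value that points elsewhere.
        return ("id2", ref)
    # Shift right by 4 bits, then skip the two byte count at index_offset.
    pos = index_offset + 2 + (ref >> 4)
    start, end = struct.unpack_from("<2H", block, pos)
    return ("internal", start, end - start)
```

With the example values from the text, a reference of 0x0020 selects the (0xc, 0x14) pair, giving an item at offset 0xc with a size of 8 bytes.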
+            <para>
+                Finally, we have the offset and size of the "b5" block located at offset 0xc
                 with a size of 8 bytes in this descriptor block. The "b5" block has the
                 following format:
             </para>
             <literallayout class="monospaced"><![CDATA[
 0000  signature       [2 bytes] 0x02b5     constant
 0002  unknown         [2 bytes] 0x0006     in this case
-0004  offset          [4 bytes] 0x0040     in this case
+0004  descoffset      [4 bytes] 0x0040     index reference
 ]]></literallayout>
             <para>
-                Note the "b5" offset of 0x0040, which needs to be right shifted by 4 bits
+                Note the descoffset of 0x0040, which again is an index reference. In this
+                case, it is an internal pointer reference, which needs to be right shifted by 4 bits
                 to become 0x0004, which is then a byte offset to be added to the above
                 indexOffset plus two (to skip the count), so it points to the (0x14, 0x7c)
                 pair. We now have the offset 0x14 of the descriptor array, composed of 8 byte
@@ -945,9 +959,9 @@
 ]]></literallayout>
             <para>
                 For some reference types (2, 3, 0xb) the value is used directly. Otherwise,
-                the value is generally a non-zero offset, to be right shifted by 4 bits and used to fetch
-                a pair from the index table to find the offset and size of the item in this
-                descriptor block. However, if (value AND 0xf) == 0xf, then the value is an ID2 index.
+                the value is an index reference, which is either an ID2 value, or an
+                offset, to be right shifted by 4 bits and used to fetch a pair from the
+                index table to find the offset and size of the item in this descriptor block.
             </para>
             <para>
                 The following reference types are known, but not all of these
@@ -1197,7 +1211,7 @@
         <refsect1 id='pst.file.desc2.5'>
             <title>Associated Descriptor Item 0x7cec</title>
             <para>
-                This style of descriptor block is similar to the BCEC format.
+                This style of descriptor block is similar to the 0xbcec format.
             </para>
             <literallayout class="monospaced"><![CDATA[
 0000  7a 01 ec 7c  40 00 00 00  00 00 00 00  b5 04 02 00
@@ -1228,7 +1242,7 @@
 
 0000  indexOffset     [2 bytes] 0x017a     in this case
 0002  signature       [2 bytes] 0x7cec     constant
-0004  offset          [2 bytes] 0x0040     in this case
+0004  7coffset        [4 bytes] 0x0040     index reference
 ]]></literallayout>
             <para>
                 Note the signature of 0x7cec. There are other descriptor block
@@ -1242,7 +1256,8 @@
                 the count value.
             </para>
             <para>
-                Note the offset of 0x0040, which needs to be right shifted by 4 bits
+                Note the 7coffset of 0x0040, which is an index reference. In this case,
+                it is an internal pointer reference, which needs to be right shifted by 4 bits
                 to become 0x0004, which is then a byte offset to be added to the above
                 indexOffset plus two (to skip the count), so it points to the (0x14, 0xea)
                 pair. We have the offset and size of the "7c" block located at offset 0x14
@@ -1256,15 +1271,15 @@
 0004  unknown         [2 bytes] 0x0060     in this case
 0006  unknown         [2 bytes] 0x0062     in this case
 0008  recordSize      [2 bytes] 0x0065     in this case
-000a  b5Offset        [2 bytes] 0x0020     in this case
-000c  unknown         [2 bytes] 0x0000     in this case
-000e  index2Offset    [2 bytes] 0x0080     in this case
+000a  b5Offset        [4 bytes] 0x0020     index reference
+000e  index2Offset    [4 bytes] 0x0080     index reference
 0010  unknown         [2 bytes] 0x0000     in this case
 0012  unknown         [2 bytes] 0x0000     in this case
 0014  unknown         [2 bytes] 0x0000     in this case
 ]]></literallayout>
             <para>
-                Note the b5Offset of 0x0020, which needs to be right shifted by 4 bits
+                Note the b5Offset of 0x0020, which is an index reference. In this case,
+                it is an internal pointer reference, which needs to be right shifted by 4 bits
                 to become 0x0002, which is then a byte offset to be added to the above
                 indexOffset plus two (to skip the count), so it points to the (0xc,
                 0x14) pair.  Finally, we have the offset and size of the "b5" block
@@ -1274,10 +1289,11 @@
             <literallayout class="monospaced"><![CDATA[
 0000  signature       [2 bytes] 0x04b5     constant
 0002  unknown         [2 bytes] 0x0002     in this case
-0004  offset          [4 bytes] 0x0060     in this case
+0004  descoffset      [4 bytes] 0x0060     index reference
 ]]></literallayout>
             <para>
-                Note the "b5" offset of 0x0060, which needs to be right shifted by 4
+                Note the descoffset of 0x0060, which again is an index reference. In this
+                case, it is an internal pointer reference, which needs to be right shifted by 4
                 bits to become 0x0006, which is then a byte offset to be added to the
                 above indexOffset plus two (to skip the count), so it points to the
                 (0xea, 0xf0) pair.  That gives us (0xf0 - 0xea)/6 = 1, so we have a
@@ -1285,7 +1301,8 @@
                 and unused here.
             </para>
             <para>
-                Note the index2Offset above of 0x0080, which needs to be right shifted
+                Note the index2Offset above of 0x0080, which again is an index reference. In this
+                case, it is an internal pointer reference, which needs to be right shifted
                 by 4 bits to become 0x0008, which is then a byte offset to be added to
                 the above indexOffset plus two (to skip the count), so it points to the
                 (0xf0, 0x155) pair.  This is an array of tables of four byte integers.
@@ -1302,17 +1319,37 @@
 0000  referenceType   [2 bytes]
 0002  itemType        [2 bytes]
 0004  ind2Offset      [2 bytes]
-0006  unknown         [2 bytes]
+0006  size            [1 byte]
+0007  unknown         [1 byte]
 ]]></literallayout>
             <para>
-                The ind2Offset is a byte offset into the current IND2 table of a four
-                byte integer value.  Once we fetch that, we have the same triple (item
-                type, reference type, value) as we find in the 0xbcec style descriptor
-                blocks.  These 8 byte descriptors are processed recordCount times, each
+                The ind2Offset is a byte offset into the current IND2 table of some value.
+                If that is a four byte integer value, then once we fetch that, we have
+                the same triple (item type, reference type, value) as we find in the
+                0xbcec style descriptor blocks. If not, then this value is used directly.
+                These 8 byte descriptors are processed recordCount times, each
                 time using the next IND2 table.  The item and reference types are as
                 described above for the 0xbcec format descriptor block.
             </para>
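The 8 byte descriptor layout above could be unpacked as in the following Python sketch. The names `Desc7c`, `parse_7c_descriptor`, and `fetch_ind2_bytes` are hypothetical, little-endian byte order is an assumption based on the hex dumps in this document, and the sketch deliberately returns raw bytes without deciding whether they form a four byte integer.

```python
import struct
from collections import namedtuple

# Field order follows the layout table: two byte reference type, item type
# and ind2Offset, then a one byte size and a one byte unknown value.
Desc7c = namedtuple("Desc7c", "reference_type item_type ind2_offset size unknown")


def parse_7c_descriptor(raw, offset=0):
    """Unpack one 8 byte descriptor from a 0x7cec style block."""
    return Desc7c(*struct.unpack_from("<HHHBB", raw, offset))


def fetch_ind2_bytes(ind2_table, desc):
    """Return the raw bytes this descriptor selects from one IND2 table."""
    return ind2_table[desc.ind2_offset:desc.ind2_offset + desc.size]
```

A caller would apply each descriptor to every IND2 table in turn, recordCount times in total, and interpret the returned bytes according to the reference type.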
         </refsect1>
 
+        <refsect1 id='pst.file.desc3.5'>
+            <title>Associated Descriptor Item 0x0002</title>
+            <para>
+                Very little is known about this style of descriptor block.
+                It seems to contain a list of ID1 values.
+            </para>
+            <literallayout class="monospaced"><![CDATA[
+0000  01 01 02 00  26 28 00 00  18 77 0c 00  b8 04 00 00
+
+0000  signature       [2 bytes] 0x0101     constant
+0002  count           [2 bytes] 0x0002     in this case
+0004  unknown         [4 bytes] 0x002826   in this case
+  repeating
+0008  id              [4 bytes] 0x0c7718   in this case
+000c  id              [4 bytes] 0x0004b8   in this case
+]]></literallayout>
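The layout above can be read with a few lines of Python. This is only a sketch: the function name is hypothetical, little-endian byte order is assumed from the hex dump, and the interpretation of the repeating values as ID1 values is the tentative one given in the text.

```python
import struct


def parse_0101_block(raw):
    """Return (unknown, ids) from a block that starts with the 0x0101 signature."""
    signature, count, unknown = struct.unpack_from("<HHI", raw, 0)
    if signature != 0x0101:
        raise ValueError("unexpected signature 0x%04x" % signature)
    # The repeating four byte id values start at offset 8.
    ids = struct.unpack_from("<%dI" % count, raw, 8)
    return unknown, list(ids)
```

Applied to the 16 bytes of the example dump, this yields the unknown value 0x2826 and the two ids 0x0c7718 and 0x04b8.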
+        </refsect1>
+
     </refentry>
 </reference>