diff xml/libpst.in @ 35:b2f247463b83 stable-0-5-6
better decoding of 7c blocks
author | carl |
---|---|
date | Sun, 15 Jul 2007 14:25:34 -0700 |
parents | 12cac756bc05 |
children | 6fe121a971c9 |
--- a/xml/libpst.in Thu Jul 12 14:59:13 2007 -0700
+++ b/xml/libpst.in Sun Jul 15 14:25:34 2007 -0700
@@ -652,10 +652,10 @@
  match the backPointer from the triple that pointed to this node.
  </para>
  <para>
- Each item in this node is a triple of (ID, backPointer, offset)
+ Each item in this node is a triple of (ID1, backPointer, offset)
  where the offset points to the next deeper node in the tree, the
  backPointer value must match the backPointer in that deeper node,
- and ID is the lowest ID value in the subtree.
+ and ID1 is the lowest ID1 value in the subtree.
  </para>
  </refsect1>
 
@@ -722,6 +722,12 @@
  </para>
  <para>
  Each item in this node is a tuple of (ID1, offset, size, unknown)
+ The two low order bits of the ID1 value seem to be flags. I have
+ never seen a case with bit zero set. Bit one indicates that the
+ item is <emphasis>not</emphasis> encrypted. Note that references
+ to these ID1 values elsewhere may have the low order bit set (and
+ I don't know what that means), but when we do the search in this
+ tree we need to clear that bit so that we can find the correct item.
  </para>
  </refsect1>
 
@@ -905,34 +911,42 @@
 
 0000 indexOffset [2 bytes] 0x013c in this case
 0002 signature [2 bytes] 0xbcec constant
-0004 offset [2 bytes] 0x0020 in this case
+0004 b5offset [4 bytes] 0x0020 index reference
 ]]></literallayout>
  <para>
- Note the signature of 0xbcec. There are other descriptor block
- formats with other signatures.
- Note the indexOffset of 0x013c - starting at that position in the
- descriptor block, we have an array of two byte integers. The first
- integer (0x000b) is a (count-1) of the number of overlapping pairs
- following the count. The first pair is (0, 0xc), the next pair is (0xc, 0x14)
- and the last (12th) pair is (0x123, 0x13b). These pairs are (start,end+1)
- offsets of items in this block. So we have count+2 integers following
- the count value.
+ Note the signature of 0xbcec. There are other descriptor block formats
+ with other signatures. Note the indexOffset of 0x013c - starting at
+ that position in the descriptor block, we have an array of two byte
+ integers. The first integer (0x000b) is a (count-1) of the number of
+ overlapping pairs following the count. The first pair is (0, 0xc), the
+ next pair is (0xc, 0x14) and the last (12th) pair is (0x123, 0x13b).
+ These pairs are (start,end+1) offsets of items in this block. So we
+ have count+2 integers following the count value.
  </para>
  <para>
- Note the offset of 0x0020, which needs to be right shifted by 4 bits
- to become 0x0002, which is then a byte offset to be added to the above
- indexOffset plus two (to skip the count), so it points to the (0xc, 0x14)
- pair. Finally, we have the offset and size of the "b5" block located at offset 0xc
+ Note the b5offset of 0x0020, which is a type that I will call an index
+ reference. Such index references have at least two different forms, and
+ may point to data either in this block, or in some other block.
+ External pointer references have the low order 4 bits all set, and are
+ ID2 values that can be used to fetch data. This value of 0x0020 is an
+ internal pointer reference, which needs to be right shifted by 4 bits to
+ become 0x0002, which is then a byte offset to be added to the above
+ indexOffset plus two (to skip the count), so it points to the (0xc,
+ 0x14) pair.
+ </para>
+ <para>
+ Finally, we have the offset and size of the "b5" block located at offset 0xc
  with a size of 8 bytes in this descriptor block. The "b5" block has
  the following format:
  </para>
  <literallayout class="monospaced"><![CDATA[
 0000 signature [2 bytes] 0x02b5 constant
 0002 unknown [2 bytes] 0x0006 in this case
-0004 offset [4 bytes] 0x0040 in this case
+0004 descoffset [4 bytes] 0x0040 index reference
 ]]></literallayout>
  <para>
- Note the "b5" offset of 0x0040, which needs to be right shifted by 4 bits
+ Note the descoffset of 0x0040, which again is an index reference. In this
+ case, it is an internal pointer reference, which needs to be right shifted by 4 bits
  to become 0x0004, which is then a byte offset to be added to the above
  indexOffset plus two (to skip the count), so it points to the (0x14, 0x7c)
  pair. We now have the offset 0x14 of the descriptor array, composed of 8 byte
@@ -945,9 +959,9 @@
 ]]></literallayout>
  <para>
  For some reference types (2, 3, 0xb) the value is used directly. Otherwise,
- the value is generally a non-zero offset, to be right shifted by 4 bits and used to fetch
- a pair from the index table to find the offset and size of the item in this
- descriptor block. However, if (value AND 0xf) == 0xf, then the value is an ID2 index.
+ the value is an index reference, which is either an ID2 value, or an
+ offset, to be right shifted by 4 bits and used to fetch a pair from the
+ index table to find the offset and size of the item in this descriptor block.
  </para>
  <para>
  The following reference types are known, but not all of these
@@ -1197,7 +1211,7 @@
  <refsect1 id='pst.file.desc2.5'>
  <title>Associated Descriptor Item 0x7cec</title>
  <para>
- This style of descriptor block is similar to the BCEC format.
+ This style of descriptor block is similar to the 0xbcec format.
  </para>
  <literallayout class="monospaced"><![CDATA[
 0000 7a 01 ec 7c 40 00 00 00 00 00 00 00 b5 04 02 00
@@ -1228,7 +1242,7 @@
 
 0000 indexOffset [2 bytes] 0x017a in this case
 0002 signature [2 bytes] 0x7cec constant
-0004 offset [2 bytes] 0x0040 in this case
+0004 7coffset [4 bytes] 0x0040 index reference
 ]]></literallayout>
  <para>
  Note the signature of 0x7cec. There are other descriptor block
@@ -1242,7 +1256,8 @@
  the count value.
  </para>
  <para>
- Note the offset of 0x0040, which needs to be right shifted by 4 bits
+ Note the 7coffset of 0x0040, which is an index reference. In this case,
+ it is an internal reference pointer, which needs to be right shifted by 4 bits
  to become 0x0004, which is then a byte offset to be added to the above
  indexOffset plus two (to skip the count), so it points to the (0x14, 0xea)
  pair. We have the offset and size of the "7c" block located at offset 0x14
@@ -1256,15 +1271,15 @@
 0004 unknown [2 bytes] 0x0060 in this case
 0006 unknown [2 bytes] 0x0062 in this case
 0008 recordSize [2 bytes] 0x0065 in this case
-000a b5Offset [2 bytes] 0x0020 in this case
-000c unknown [2 bytes] 0x0000 in this case
-000e index2Offset [2 bytes] 0x0080 in this case
+000a b5Offset [4 bytes] 0x0020 index reference
+000e index2Offset [4 bytes] 0x0080 index reference
 0010 unknown [2 bytes] 0x0000 in this case
 0012 unknown [2 bytes] 0x0000 in this case
 0014 unknown [2 bytes] 0x0000 in this case
 ]]></literallayout>
  <para>
- Note the b5Offset of 0x0020, which needs to be right shifted by 4 bits
+ Note the b5Offset of 0x0020, which is an index reference. In this case,
+ it is an internal reference pointer, which needs to be right shifted by 4 bits
  to become 0x0002, which is then a byte offset to be added to the above
  indexOffset plus two (to skip the count), so it points to the (0xc, 0x14)
  pair. Finally, we have the offset and size of the "b5" block
@@ -1274,10 +1289,11 @@
  <literallayout class="monospaced"><![CDATA[
 0000 signature [2 bytes] 0x04b5 constant
 0002 unknown [2 bytes] 0x0002 in this case
-0004 offset [4 bytes] 0x0060 in this case
+0004 descoffset [4 bytes] 0x0060 index reference
 ]]></literallayout>
  <para>
- Note the "b5" offset of 0x0060, which needs to be right shifted by 4
+ Note the descoffset of 0x0060, which again is an index reference. In this
+ case, it is an internal pointer reference, which needs to be right shifted by 4
  bits to become 0x0006, which is then a byte offset to be added to the
  above indexOffset plus two (to skip the count), so it points to the
  (0xea, 0xf0) pair. That gives us (0xf0 - 0xea)/6 = 1, so we have a
@@ -1285,7 +1301,8 @@
  and unused here.
  </para>
  <para>
- Note the index2Offset above of 0x0080, which needs to be right shifted
+ Note the index2Offset above of 0x0080, which again is an index reference. In this
+ case, it is an internal pointer reference, which needs to be right shifted
  by 4 bits to become 0x0008, which is then a byte offset to be added to
  the above indexOffset plus two (to skip the count), so it points to
  the (0xf0, 0x155) pair. This is an array of tables of four byte integers.
@@ -1302,17 +1319,37 @@
 0000 referenceType [2 bytes]
 0002 itemType [2 bytes]
 0004 ind2Offset [2 bytes]
-0006 unknown [2 bytes]
+0006 size [1 byte]
+0007 unknown [1 byte]
 ]]></literallayout>
  <para>
- The ind2Offset is a byte offset into the current IND2 table of a four
- byte integer value. Once we fetch that, we have the same triple (item
- type, reference type, value) as we find in the 0xbcec style descriptor
- blocks. These 8 byte descriptors are processed recordCount times, each
+ The ind2Offset is a byte offset into the current IND2 table of some value.
+ If that is a four byte integer value, then once we fetch that, we have
+ the same triple (item type, reference type, value) as we find in the
+ 0xbcec style descriptor blocks. If not, then this value is used directly.
+ These 8 byte descriptors are processed recordCount times, each
  time using the next IND2 table. The item and reference types are as
  described above for the 0xbcec format descriptor block.
  </para>
  </refsect1>
 
+ <refsect1 id='pst.file.desc3.5'>
+ <title>Associated Descriptor Item 0x0002</title>
+ <para>
+ This style of descriptor block is almost unknown here.
+ It seems to contain a list of ID1 values.
+ </para>
+ <literallayout class="monospaced"><![CDATA[
+0000 01 01 02 00 26 28 00 00 18 77 0c 00 b8 04 00 00
+
+0000 signature [2 bytes] 0x0101 constant
+0002 count [2 bytes] 0x0002 in this case
+0004 unknown [4 bytes] 0x002826 in this case
+ repeating
+0008 id [4 bytes] 0x0c7718 in this case
+000c id [4 bytes] 0x0004b8 in this case
+]]></literallayout>
+ </refsect1>
+
  </refentry>
  </reference>