annotate contrib/FILE-FORMAT.html @ 164:ab384fed78c5

Compensate for iconv conversion to utf-7 that produces strings that are not null terminated. Don't produce empty attachment files in separate mode.
author Carl Byington <carl@five-ten-sg.com>
date Mon, 16 Mar 2009 18:31:39 -0700
parents c508ee15dfca
children 5c0ce43c7532
rev   line source
16
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
1 <html>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
2 <head><title>File Format for Outlook PST files</title></head>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
3 <h2>Header</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
4 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
5 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
6 0x00 <a href="#header_sig" title="File Signature">xx xx xx xx</a> 00 00 00 00 00 00 00 00 00 00 00 00
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
7 ...
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
8 0xA0 00 00 00 00 00 00 00 00 <a href="#header_filesize" title="File Size">xx xx xx xx</a> 00 00 00 00
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
9 0xB0 00 00 00 00 00 00 00 00 00 00 00 00 <a href="#header_index_control" title="Controlling items index">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
10 0xC0 00 00 00 00 <a href="#header_index_items" title="Items index">xx xx xx xx</a> 00 00 00 00 00 00 00 00
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
11
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
12 </pre>
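<p>The header layout above can be read with a few fixed-offset lookups. This is only a minimal sketch: the offsets 0xA8, 0xBC and 0xC4 are counted off the diagram, the function name <code>read_header_fields</code> is hypothetical, and little-endian byte order is an assumption.</p>

```python
import struct

# Offsets read off the header diagram above.
FILE_SIZE_OFFSET = 0xA8      # "File Size"
INDEX_CONTROL_OFFSET = 0xBC  # "Controlling items index"
INDEX_ITEMS_OFFSET = 0xC4    # "Items index"


def read_header_fields(header: bytes) -> dict:
    """Pull the three 4-byte fields out of the raw header bytes."""
    if len(header) < INDEX_ITEMS_OFFSET + 4:
        raise ValueError("header too short")

    def u32(offset: int) -> int:
        # Assumption: values are little-endian unsigned 32-bit integers.
        return struct.unpack_from("<I", header, offset)[0]

    return {
        "file_size": u32(FILE_SIZE_OFFSET),
        "index_control": u32(INDEX_CONTROL_OFFSET),
        "index_items": u32(INDEX_ITEMS_OFFSET),
    }
```

<p>Reading the first 0xC8 bytes of the file is enough to feed this function.</p>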
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
13
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
14
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
15 <h2 id="header_sig">Signature in Header</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
16 <p>The first 4 bytes of the PST file <i>should</i> be 0x21 0x42 0x44 0x4E (ASCII "!BDN"), or 0x4E444221 when read as a little-endian 32-bit int. This is the only signature I have come across, and checking for it will probably work in nearly all situations.</p>
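<p>A signature check along these lines might look like the sketch below. The helper names are hypothetical, and the integer form assumes the four bytes are read in little-endian order, which is what makes them come out as 0x4E444221.</p>

```python
import struct

# The four signature bytes quoted above: 0x21 0x42 0x44 0x4E.
PST_SIGNATURE = bytes([0x21, 0x42, 0x44, 0x4E])


def has_pst_signature(header: bytes) -> bool:
    """Return True when the first four bytes match the signature."""
    return header[:4] == PST_SIGNATURE


def signature_as_int(header: bytes) -> int:
    """Read the first four bytes as a little-endian unsigned 32-bit int."""
    (value,) = struct.unpack_from("<I", header, 0)
    return value
```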
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
17
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
18 <h2 id="header_filesize">Size of current PST file</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
19 <p>This is the size of the file, stored in 4 bytes. If I understand correctly, Outlook would then appear to have a file limit of 2GB or 4GB. Actually, I am not sure that the file format as a whole could handle more than 1GB.</p>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
20
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
21 <h2 id="header_index_control">Pointer to Index of Controlling Items</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
22 <p>This is what is referred to as the second index, or Descriptive index. These records contain pointers to the <a href="#glossary_item">item</a> description and a table of extra ids. Each record also contains the ID2 of its parent.</p>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
23 <h3>Table pointing to further tables</h3>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
24 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
25 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
26 <a href="#glossary_id2" title="Starting ID2 value in following table">xx xx xx xx</a> 00 00 00 00 <a href="#glossary_offset" title="File Offset">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
27 </pre>
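<p>A branch entry like the one above could be decoded as follows. This is a sketch under two assumptions: the fields are little-endian, and the four zero bytes in the middle are padding that can be skipped. <code>BranchEntry</code> and <code>parse_branch_entry</code> are hypothetical names. The branch table of the ID index has the same 12-byte shape, so the same code would apply there.</p>

```python
import struct
from typing import NamedTuple

# 4-byte starting ID2, 4 skipped zero bytes, 4-byte file offset.
BRANCH_ENTRY = struct.Struct("<I4xI")


class BranchEntry(NamedTuple):
    start_id: int     # first ID2 (or ID) covered by the table pointed to
    file_offset: int  # where that table lives in the file


def parse_branch_entry(data: bytes, pos: int = 0) -> BranchEntry:
    """Decode one 12-byte branch entry starting at byte position pos."""
    return BranchEntry(*BRANCH_ENTRY.unpack_from(data, pos))
```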
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
28 <h3>Leaf node table (Actual Records)</h3>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
29 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
30 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
31 <a href="#glossary_id2" title="ID2 Value of record">xx xx xx xx</a> <a href="#glossary_desc" title="ID of Description Record">xx xx xx xx</a> <a href="#assoclist" title="ID of Association Table">xx xx xx xx</a> <a href="#glossary_id2" title="ID2 Parent Record">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
32 </pre>
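<p>The 16-byte leaf record above maps naturally onto four 32-bit values. A minimal sketch, assuming little-endian fields; the type and function names are made up for illustration:</p>

```python
import struct
from typing import NamedTuple

# Four consecutive 4-byte fields, as in the leaf node diagram above.
DESC_RECORD = struct.Struct("<4I")


class DescRecord(NamedTuple):
    id2: int         # ID2 value of this record
    desc_id: int     # ID of the Description Record
    assoc_id: int    # ID of the Association Table
    parent_id2: int  # ID2 of the parent record


def parse_desc_record(data: bytes, pos: int = 0) -> DescRecord:
    """Decode one 16-byte descriptive-index record at byte position pos."""
    return DescRecord(*DESC_RECORD.unpack_from(data, pos))
```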
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
33
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
34 <h2 id="header_index_items">Pointer to index of ID Offsets</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
35 <p>This is what is referred to as an ID. These basically just point to offsets in the file. They do not describe what they point to. Each ID2 record that needs data not stored within it will have an ID value that is a pointer to that data.</p>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
36 <h3>Table pointing to further tables</h3>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
37 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
38 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
39 <a href="#glossary_id" title="Starting ID value in following table">xx xx xx xx</a> 00 00 00 00 <a href="#glossary_offset" title="File Offset">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
40 </pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
41 <h3>Leaf node table (Actual Records)</h3>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
42 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
43 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
44 <a href="#glossary_id" title="ID Value of record">xx xx xx xx</a> <a href="#glossary_offset" title="File Offset">xx xx xx xx</a> <a href="#glossary_bl_size" title="Block Size">xx xx</a> 00 00
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
45 </pre>
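<p>An ID leaf record as drawn above is 12 bytes: an ID, a file offset, a 2-byte block size and two zero bytes. The sketch below assumes little-endian fields and treats the trailing zero bytes as padding; <code>IdRecord</code> and <code>parse_id_record</code> are hypothetical names.</p>

```python
import struct
from typing import NamedTuple

# 4-byte ID, 4-byte file offset, 2-byte block size, 2 skipped zero bytes.
ID_RECORD = struct.Struct("<IIH2x")


class IdRecord(NamedTuple):
    id_value: int     # ID value of the record
    file_offset: int  # where the block starts in the file
    block_size: int   # size of the block at that offset


def parse_id_record(data: bytes, pos: int = 0) -> IdRecord:
    """Decode one 12-byte ID-index record at byte position pos."""
    return IdRecord(*ID_RECORD.unpack_from(data, pos))
```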
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
46
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
47 <h2 id="assoclist">Association Table</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
48 <p>This is a simple record associating the <a href="#glossary_id">ID</a> records with the <a href="#glossary_id2">ID2</a> records. It is nearly always the case that an item record will refer to an ID2 value. This list, which should be pre-read, allows the ID2 values to be resolved to file offsets. We must keep a full list of ID values in memory so that we can look up the file offsets when required.</p>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
49 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
50 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
51 <b>02 00</b> <a href="#assoc_count" title="Number of items following">xx xx</a>[<a href="#glossary_id2" title="ID2 Value">xx xx xx xx</a> <a href="#glossary_id" title="ID Value">xx xx xx xx</a> 00 00 00 00]...
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
52 </pre>
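<p>Pre-reading the association table could be sketched as below, giving a dictionary from ID2 to ID. The assumptions here are mine: the count is a little-endian 16-bit value, each entry is 12 bytes with the last four zero bytes ignored, and <code>parse_assoc_table</code> is a hypothetical name.</p>

```python
import struct

ASSOC_SIGNATURE = b"\x02\x00"          # the fixed "02 00" at the start
ASSOC_ENTRY = struct.Struct("<II4x")   # ID2, ID, 4 skipped zero bytes


def parse_assoc_table(data: bytes) -> dict:
    """Return a mapping of ID2 value to ID value for one table."""
    if data[:2] != ASSOC_SIGNATURE:
        raise ValueError("not an association table")
    # Number of entries that follow the 4-byte table header.
    (count,) = struct.unpack_from("<H", data, 2)
    table = {}
    pos = 4
    for _ in range(count):
        id2, id_value = ASSOC_ENTRY.unpack_from(data, pos)
        table[id2] = id_value
        pos += ASSOC_ENTRY.size
    return table
```

<p>The ID found this way would then be looked up in the ID index to get the file offset.</p>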
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
53
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
54 <h2 id="glossary_desc">Description Record</h2>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
55 <p>This is a record that lists attributes of an item together with the attribute's data. It is the main place where data is stored. All data in a PST file is stored as items - from emails, to contacts, to the layout of a folder.</p>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
56
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
57 <h3>Block Header</h3>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
58 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
59 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
60 <a href="#desc_block_index" title="Offset to Block Index">xx xx</a> <b>EC BC</b> <a href="#desc_sec1" title="IndexPos of Section1">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
61 </pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
62 OR
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
63 <pre>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
64 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
65 <a href="#desc_block_index" title="Offset to Block Index">xx xx</a> <b>EC 7C</b> <a href="#desc_sec1" title="IndexPos of 7C position">xx xx xx xx</a>
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
66 </pre>
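<p>Telling the two block header variants apart comes down to the two signature bytes at offsets 2 and 3. A rough sketch, assuming little-endian fields; the function name and the returned tuple shape are made up for illustration:</p>

```python
import struct

# 2-byte offset to the block index, 2 signature bytes, 4-byte IndexPos.
BLOCK_HEADER = struct.Struct("<H2sI")

SIG_BC = b"\xEC\xBC"  # IndexPos refers to Section1
SIG_7C = b"\xEC\x7C"  # IndexPos refers to the 7C position


def parse_block_header(block: bytes) -> tuple:
    """Return (index_offset, signature, index_pos) for a description block."""
    index_offset, signature, index_pos = BLOCK_HEADER.unpack_from(block, 0)
    if signature not in (SIG_BC, SIG_7C):
        raise ValueError("unknown block signature: " + signature.hex())
    return index_offset, signature, index_pos
```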
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
67
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
68
c508ee15dfca switch to automake/autoconf
carl
parents:
diff changeset
69 </html>