changeset 35:b2f247463b83 stable-0-5-6
better decoding of 7c blocks
author | carl |
---|---|
date | Sun, 15 Jul 2007 14:25:34 -0700 |
parents | 07177825c91b |
children | 6fe121a971c9 |
files | ChangeLog Doxyfile Makefile.am Makefile.cvs NEWS configure.in regression/regression-tests.bash src/debug.c src/libpst.c src/libpst.h xml/libpst.in |
diffstat | 11 files changed, 1706 insertions(+), 380 deletions(-) |
--- a/ChangeLog Thu Jul 12 14:59:13 2007 -0700
+++ b/ChangeLog Sun Jul 15 14:25:34 2007 -0700
@@ -1,5 +1,11 @@
+LibPST 0.5.6 (2007-07-15)
+===============================
+ * Fix to allow very small pst files with only one node in the tree. We were mixing signed/unsigned types in comparisons.
+ * More progress decoding the basic structure 7c blocks. Many
+ four byte values may be ID2 indices with data outside the buffer.
+ * Start using doxygen to generate internal documentation.
 LibPST 0.5.5 (2007-07-10)
 ===============================
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/Doxyfile Sun Jul 15 14:25:34 2007 -0700 @@ -0,0 +1,1161 @@ +# Doxyfile 1.3.9.1 + +# This file describes the settings to be used by the documentation system +# doxygen (www.doxygen.org) for a project +# +# All text after a hash (#) is considered a comment and will be ignored +# The format is: +# TAG = value [value, ...] +# For lists items can also be appended using: +# TAG += value [value, ...] +# Values that contain spaces should be placed between quotes (" ") + +#--------------------------------------------------------------------------- +# Project related configuration options +#--------------------------------------------------------------------------- + +# The PROJECT_NAME tag is a single word (or a sequence of words surrounded +# by quotes) that should identify the project. + +PROJECT_NAME = 'LibPst' + +# The PROJECT_NUMBER tag can be used to enter a project or revision number. +# This could be handy for archiving the generated documentation or +# if some version control system is used. + +PROJECT_NUMBER = + +# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) +# base path where the generated documentation will be put. +# If a relative path is entered, it will be relative to the location +# where doxygen was started. If left blank the current directory will be used. + +OUTPUT_DIRECTORY = + +# If the CREATE_SUBDIRS tag is set to YES, then doxygen will create +# 4096 sub-directories (in 2 levels) under the output directory of each output +# format and will distribute the generated files over these directories. +# Enabling this option can be useful when feeding doxygen a huge amount of source +# files, where putting all generated files in the same directory would otherwise +# cause performance problems for the file system. + +CREATE_SUBDIRS = NO + +# The OUTPUT_LANGUAGE tag is used to specify the language in which all +# documentation generated by doxygen is written. 
Doxygen will use this +# information to generate all constant output in the proper language. +# The default language is English, other supported languages are: +# Brazilian, Catalan, Chinese, Chinese-Traditional, Croatian, Czech, Danish, +# Dutch, Finnish, French, German, Greek, Hungarian, Italian, Japanese, +# Japanese-en (Japanese with English messages), Korean, Korean-en, Norwegian, +# Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovene, Spanish, +# Swedish, and Ukrainian. + +OUTPUT_LANGUAGE = English + +# This tag can be used to specify the encoding used in the generated output. +# The encoding is not always determined by the language that is chosen, +# but also whether or not the output is meant for Windows or non-Windows users. +# In case there is a difference, setting the USE_WINDOWS_ENCODING tag to YES +# forces the Windows encoding (this is the default for the Windows binary), +# whereas setting the tag to NO uses a Unix-style encoding (the default for +# all platforms other than Windows). + +USE_WINDOWS_ENCODING = NO + +# If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will +# include brief member descriptions after the members that are listed in +# the file and class documentation (similar to JavaDoc). +# Set to NO to disable this. + +BRIEF_MEMBER_DESC = YES + +# If the REPEAT_BRIEF tag is set to YES (the default) Doxygen will prepend +# the brief description of a member or function before the detailed description. +# Note: if both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the +# brief descriptions will be completely suppressed. + +REPEAT_BRIEF = YES + +# This tag implements a quasi-intelligent brief description abbreviator +# that is used to form the text in various listings. Each string +# in this list, if found as the leading text of the brief description, will be +# stripped from the text and the result after processing the whole list, is used +# as the annotated text. 
Otherwise, the brief description is used as-is. If left +# blank, the following values are used ("$name" is automatically replaced with the +# name of the entity): "The $name class" "The $name widget" "The $name file" +# "is" "provides" "specifies" "contains" "represents" "a" "an" "the" + +ABBREVIATE_BRIEF = + +# If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then +# Doxygen will generate a detailed section even if there is only a brief +# description. + +ALWAYS_DETAILED_SEC = NO + +# If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all inherited +# members of a class in the documentation of that class as if those members were +# ordinary class members. Constructors, destructors and assignment operators of +# the base classes will not be shown. + +INLINE_INHERITED_MEMB = NO + +# If the FULL_PATH_NAMES tag is set to YES then Doxygen will prepend the full +# path before files name in the file list and in the header files. If set +# to NO the shortest path that makes the file name unique will be used. + +FULL_PATH_NAMES = YES + +# If the FULL_PATH_NAMES tag is set to YES then the STRIP_FROM_PATH tag +# can be used to strip a user-defined part of the path. Stripping is +# only done if one of the specified strings matches the left-hand part of +# the path. The tag can be used to show relative paths in the file list. +# If left blank the directory from which doxygen is run is used as the +# path to strip. + +STRIP_FROM_PATH = + +# The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of +# the path mentioned in the documentation of a class, which tells +# the reader which header file to include in order to use a class. +# If left blank only the name of the header file containing the class +# definition is used. Otherwise one should specify the include paths that +# are normally passed to the compiler using the -I flag. + +STRIP_FROM_INC_PATH = . 
+ +# If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter +# (but less readable) file names. This can be useful is your file systems +# doesn't support long names like on DOS, Mac, or CD-ROM. + +SHORT_NAMES = NO + +# If the JAVADOC_AUTOBRIEF tag is set to YES then Doxygen +# will interpret the first line (until the first dot) of a JavaDoc-style +# comment as the brief description. If set to NO, the JavaDoc +# comments will behave just like the Qt-style comments (thus requiring an +# explicit @brief command for a brief description. + +JAVADOC_AUTOBRIEF = YES + +# The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make Doxygen +# treat a multi-line C++ special comment block (i.e. a block of //! or /// +# comments) as a brief description. This used to be the default behaviour. +# The new default is to treat a multi-line C++ comment block as a detailed +# description. Set this tag to YES if you prefer the old behaviour instead. + +MULTILINE_CPP_IS_BRIEF = NO + +# If the DETAILS_AT_TOP tag is set to YES then Doxygen +# will output the detailed description near the top, like JavaDoc. +# If set to NO, the detailed description appears after the member +# documentation. + +DETAILS_AT_TOP = NO + +# If the INHERIT_DOCS tag is set to YES (the default) then an undocumented +# member inherits the documentation from any documented member that it +# re-implements. + +INHERIT_DOCS = YES + +# If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC +# tag is set to YES, then doxygen will reuse the documentation of the first +# member in the group (if any) for the other members of the group. By default +# all members of a group must be documented explicitly. + +DISTRIBUTE_GROUP_DOC = NO + +# The TAB_SIZE tag can be used to set the number of spaces in a tab. +# Doxygen uses this value to replace tabs by spaces in code fragments. 
+ +TAB_SIZE = 4 + +# This tag can be used to specify a number of aliases that acts +# as commands in the documentation. An alias has the form "name=value". +# For example adding "sideeffect=\par Side Effects:\n" will allow you to +# put the command \sideeffect (or @sideeffect) in the documentation, which +# will result in a user-defined paragraph with heading "Side Effects:". +# You can put \n's in the value part of an alias to insert newlines. + +ALIASES = + +# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources +# only. Doxygen will then generate output that is more tailored for C. +# For instance, some of the names that are used will be different. The list +# of all members will be omitted, etc. + +OPTIMIZE_OUTPUT_FOR_C = NO + +# Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java sources +# only. Doxygen will then generate output that is more tailored for Java. +# For instance, namespaces will be presented as packages, qualified scopes +# will look different, etc. + +OPTIMIZE_OUTPUT_JAVA = NO + +# Set the SUBGROUPING tag to YES (the default) to allow class member groups of +# the same type (for instance a group of public functions) to be put as a +# subgroup of that type (e.g. under the Public Functions section). Set it to +# NO to prevent subgrouping. Alternatively, this can be done per class using +# the \nosubgrouping command. + +SUBGROUPING = YES + +#--------------------------------------------------------------------------- +# Build related configuration options +#--------------------------------------------------------------------------- + +# If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in +# documentation are documented, even if no documentation was available. 
+# Private class members and static file members will be hidden unless +# the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES + +EXTRACT_ALL = YES + +# If the EXTRACT_PRIVATE tag is set to YES all private members of a class +# will be included in the documentation. + +EXTRACT_PRIVATE = YES + +# If the EXTRACT_STATIC tag is set to YES all static members of a file +# will be included in the documentation. + +EXTRACT_STATIC = YES + +# If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs) +# defined locally in source files will be included in the documentation. +# If set to NO only classes defined in header files are included. + +EXTRACT_LOCAL_CLASSES = YES + +# This flag is only useful for Objective-C code. When set to YES local +# methods, which are defined in the implementation section but not in +# the interface are included in the documentation. +# If set to NO (the default) only methods in the interface are included. + +EXTRACT_LOCAL_METHODS = NO + +# If the HIDE_UNDOC_MEMBERS tag is set to YES, Doxygen will hide all +# undocumented members of documented classes, files or namespaces. +# If set to NO (the default) these members will be included in the +# various overviews, but no documentation section is generated. +# This option has no effect if EXTRACT_ALL is enabled. + +HIDE_UNDOC_MEMBERS = NO + +# If the HIDE_UNDOC_CLASSES tag is set to YES, Doxygen will hide all +# undocumented classes that are normally visible in the class hierarchy. +# If set to NO (the default) these classes will be included in the various +# overviews. This option has no effect if EXTRACT_ALL is enabled. + +HIDE_UNDOC_CLASSES = NO + +# If the HIDE_FRIEND_COMPOUNDS tag is set to YES, Doxygen will hide all +# friend (class|struct|union) declarations. +# If set to NO (the default) these declarations will be included in the +# documentation. 
+ +HIDE_FRIEND_COMPOUNDS = NO + +# If the HIDE_IN_BODY_DOCS tag is set to YES, Doxygen will hide any +# documentation blocks found inside the body of a function. +# If set to NO (the default) these blocks will be appended to the +# function's detailed documentation block. + +HIDE_IN_BODY_DOCS = NO + +# The INTERNAL_DOCS tag determines if documentation +# that is typed after a \internal command is included. If the tag is set +# to NO (the default) then the documentation will be excluded. +# Set it to YES to include the internal documentation. + +INTERNAL_DOCS = YES + +# If the CASE_SENSE_NAMES tag is set to NO then Doxygen will only generate +# file names in lower-case letters. If set to YES upper-case letters are also +# allowed. This is useful if you have classes or files whose names only differ +# in case and if your file system supports case sensitive file names. Windows +# and Mac users are advised to set this option to NO. + +CASE_SENSE_NAMES = YES + +# If the HIDE_SCOPE_NAMES tag is set to NO (the default) then Doxygen +# will show members with their full class and namespace scopes in the +# documentation. If set to YES the scope will be hidden. + +HIDE_SCOPE_NAMES = NO + +# If the SHOW_INCLUDE_FILES tag is set to YES (the default) then Doxygen +# will put a list of the files that are included by a file in the documentation +# of that file. + +SHOW_INCLUDE_FILES = YES + +# If the INLINE_INFO tag is set to YES (the default) then a tag [inline] +# is inserted in the documentation for inline members. + +INLINE_INFO = YES + +# If the SORT_MEMBER_DOCS tag is set to YES (the default) then doxygen +# will sort the (detailed) documentation of file and class members +# alphabetically by member name. If set to NO the members will appear in +# declaration order. + +SORT_MEMBER_DOCS = YES + +# If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the +# brief documentation of file, namespace and class members alphabetically +# by member name. 
If set to NO (the default) the members will appear in +# declaration order. + +SORT_BRIEF_DOCS = NO + +# If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be +# sorted by fully-qualified names, including namespaces. If set to +# NO (the default), the class list will be sorted only by class name, +# not including the namespace part. +# Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES. +# Note: This option applies only to the class list, not to the +# alphabetical list. + +SORT_BY_SCOPE_NAME = NO + +# The GENERATE_TODOLIST tag can be used to enable (YES) or +# disable (NO) the todo list. This list is created by putting \todo +# commands in the documentation. + +GENERATE_TODOLIST = YES + +# The GENERATE_TESTLIST tag can be used to enable (YES) or +# disable (NO) the test list. This list is created by putting \test +# commands in the documentation. + +GENERATE_TESTLIST = YES + +# The GENERATE_BUGLIST tag can be used to enable (YES) or +# disable (NO) the bug list. This list is created by putting \bug +# commands in the documentation. + +GENERATE_BUGLIST = YES + +# The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or +# disable (NO) the deprecated list. This list is created by putting +# \deprecated commands in the documentation. + +GENERATE_DEPRECATEDLIST= YES + +# The ENABLED_SECTIONS tag can be used to enable conditional +# documentation sections, marked by \if sectionname ... \endif. + +ENABLED_SECTIONS = + +# The MAX_INITIALIZER_LINES tag determines the maximum number of lines +# the initial value of a variable or define consists of for it to appear in +# the documentation. If the initializer consists of more lines than specified +# here it will be hidden. Use a value of 0 to hide initializers completely. +# The appearance of the initializer of individual variables and defines in the +# documentation can be controlled using \showinitializer or \hideinitializer +# command in the documentation regardless of this setting. 
+ +MAX_INITIALIZER_LINES = 30 + +# Set the SHOW_USED_FILES tag to NO to disable the list of files generated +# at the bottom of the documentation of classes and structs. If set to YES the +# list will mention the files that were used to generate the documentation. + +SHOW_USED_FILES = YES + +# If the sources in your project are distributed over multiple directories +# then setting the SHOW_DIRECTORIES tag to YES will show the directory hierarchy +# in the documentation. + +SHOW_DIRECTORIES = YES + +#--------------------------------------------------------------------------- +# configuration options related to warning and progress messages +#--------------------------------------------------------------------------- + +# The QUIET tag can be used to turn on/off the messages that are generated +# by doxygen. Possible values are YES and NO. If left blank NO is used. + +QUIET = NO + +# The WARNINGS tag can be used to turn on/off the warning messages that are +# generated by doxygen. Possible values are YES and NO. If left blank +# NO is used. + +WARNINGS = YES + +# If WARN_IF_UNDOCUMENTED is set to YES, then doxygen will generate warnings +# for undocumented members. If EXTRACT_ALL is set to YES then this flag will +# automatically be disabled. + +WARN_IF_UNDOCUMENTED = YES + +# If WARN_IF_DOC_ERROR is set to YES, doxygen will generate warnings for +# potential errors in the documentation, such as not documenting some +# parameters in a documented function, or documenting parameters that +# don't exist or using markup commands wrongly. + +WARN_IF_DOC_ERROR = YES + +# The WARN_FORMAT tag determines the format of the warning messages that +# doxygen can produce. The string should contain the $file, $line, and $text +# tags, which will be replaced by the file and line number from which the +# warning originated and the warning text. 
+ +WARN_FORMAT = "$file:$line: $text" + +# The WARN_LOGFILE tag can be used to specify a file to which warning +# and error messages should be written. If left blank the output is written +# to stderr. + +WARN_LOGFILE = + +#--------------------------------------------------------------------------- +# configuration options related to the input files +#--------------------------------------------------------------------------- + +# The INPUT tag can be used to specify the files and/or directories that contain +# documented source files. You may enter file names like "myfile.cpp" or +# directories like "/usr/src/myproject". Separate the files or directories +# with spaces. + +INPUT = src + +# If the value of the INPUT tag contains directories, you can use the +# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp +# and *.h) to filter out the source-files in the directories. If left +# blank the following patterns are tested: +# *.c *.cc *.cxx *.cpp *.c++ *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh *.hxx *.hpp +# *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm + +FILE_PATTERNS = + +# The RECURSIVE tag can be used to turn specify whether or not subdirectories +# should be searched for input files as well. Possible values are YES and NO. +# If left blank NO is used. + +RECURSIVE = YES + +# The EXCLUDE tag can be used to specify files and/or directories that should +# excluded from the INPUT source files. This way you can easily exclude a +# subdirectory from a directory tree whose root is specified with the INPUT tag. + +EXCLUDE = + +# The EXCLUDE_SYMLINKS tag can be used select whether or not files or directories +# that are symbolic links (a Unix filesystem feature) are excluded from the input. + +EXCLUDE_SYMLINKS = NO + +# If the value of the INPUT tag contains directories, you can use the +# EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude +# certain files from those directories. 
+ +EXCLUDE_PATTERNS = + +# The EXAMPLE_PATH tag can be used to specify one or more files or +# directories that contain example code fragments that are included (see +# the \include command). + +EXAMPLE_PATH = + +# If the value of the EXAMPLE_PATH tag contains directories, you can use the +# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp +# and *.h) to filter out the source-files in the directories. If left +# blank all files are included. + +EXAMPLE_PATTERNS = + +# If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be +# searched for input files to be used with the \include or \dontinclude +# commands irrespective of the value of the RECURSIVE tag. +# Possible values are YES and NO. If left blank NO is used. + +EXAMPLE_RECURSIVE = NO + +# The IMAGE_PATH tag can be used to specify one or more files or +# directories that contain image that are included in the documentation (see +# the \image command). + +IMAGE_PATH = + +# The INPUT_FILTER tag can be used to specify a program that doxygen should +# invoke to filter for each input file. Doxygen will invoke the filter program +# by executing (via popen()) the command <filter> <input-file>, where <filter> +# is the value of the INPUT_FILTER tag, and <input-file> is the name of an +# input file. Doxygen will then use the output that the filter program writes +# to standard output. If FILTER_PATTERNS is specified, this tag will be +# ignored. + +INPUT_FILTER = + +# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern +# basis. Doxygen will compare the file name with each pattern and apply the +# filter if there is a match. The filters are a list of the form: +# pattern=filter (like *.cpp=my_cpp_filter). See INPUT_FILTER for further +# info on how filters are used. If FILTER_PATTERNS is empty, INPUT_FILTER +# is applied to all files. 
+ +FILTER_PATTERNS = + +# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using +# INPUT_FILTER) will be used to filter the input files when producing source +# files to browse (i.e. when SOURCE_BROWSER is set to YES). + +FILTER_SOURCE_FILES = NO + +#--------------------------------------------------------------------------- +# configuration options related to source browsing +#--------------------------------------------------------------------------- + +# If the SOURCE_BROWSER tag is set to YES then a list of source files will +# be generated. Documented entities will be cross-referenced with these sources. +# Note: To get rid of all source code in the generated output, make sure also +# VERBATIM_HEADERS is set to NO. + +SOURCE_BROWSER = YES + +# Setting the INLINE_SOURCES tag to YES will include the body +# of functions and classes directly in the documentation. + +INLINE_SOURCES = NO + +# Setting the STRIP_CODE_COMMENTS tag to YES (the default) will instruct +# doxygen to hide any special comment blocks from generated source code +# fragments. Normal C and C++ comments will always remain visible. + +STRIP_CODE_COMMENTS = YES + +# If the REFERENCED_BY_RELATION tag is set to YES (the default) +# then for each documented function all documented +# functions referencing it will be listed. + +REFERENCED_BY_RELATION = YES + +# If the REFERENCES_RELATION tag is set to YES (the default) +# then for each documented function all documented entities +# called/used by that function will be listed. + +REFERENCES_RELATION = YES + +# If the VERBATIM_HEADERS tag is set to YES (the default) then Doxygen +# will generate a verbatim copy of the header file for each class for +# which an include is specified. Set to NO to disable this. 
+ +VERBATIM_HEADERS = YES + +#--------------------------------------------------------------------------- +# configuration options related to the alphabetical class index +#--------------------------------------------------------------------------- + +# If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index +# of all compounds will be generated. Enable this if the project +# contains a lot of classes, structs, unions or interfaces. + +ALPHABETICAL_INDEX = YES + +# If the alphabetical index is enabled (see ALPHABETICAL_INDEX) then +# the COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns +# in which this list will be split (can be a number in the range [1..20]) + +COLS_IN_ALPHA_INDEX = 5 + +# In case all classes in a project start with a common prefix, all +# classes will be put under the same header in the alphabetical index. +# The IGNORE_PREFIX tag can be used to specify one or more prefixes that +# should be ignored while generating the index headers. + +IGNORE_PREFIX = + +#--------------------------------------------------------------------------- +# configuration options related to the HTML output +#--------------------------------------------------------------------------- + +# If the GENERATE_HTML tag is set to YES (the default) Doxygen will +# generate HTML output. + +GENERATE_HTML = YES + +# The HTML_OUTPUT tag is used to specify where the HTML docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `html' will be used as the default path. + +HTML_OUTPUT = html.internal + +# The HTML_FILE_EXTENSION tag can be used to specify the file extension for +# each generated HTML page (for example: .htm,.php,.asp). If it is left blank +# doxygen will generate files with .html extension. + +HTML_FILE_EXTENSION = .html + +# The HTML_HEADER tag can be used to specify a personal HTML header for +# each generated HTML page. 
If it is left blank doxygen will generate a +# standard header. + +HTML_HEADER = + +# The HTML_FOOTER tag can be used to specify a personal HTML footer for +# each generated HTML page. If it is left blank doxygen will generate a +# standard footer. + +HTML_FOOTER = + +# The HTML_STYLESHEET tag can be used to specify a user-defined cascading +# style sheet that is used by each HTML page. It can be used to +# fine-tune the look of the HTML output. If the tag is left blank doxygen +# will generate a default style sheet. Note that doxygen will try to copy +# the style sheet file to the HTML output directory, so don't put your own +# stylesheet in the HTML output directory as well, or it will be erased! + +HTML_STYLESHEET = + +# If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes, +# files or namespaces will be aligned in HTML using tables. If set to +# NO a bullet list will be used. + +HTML_ALIGN_MEMBERS = YES + +# If the GENERATE_HTMLHELP tag is set to YES, additional index files +# will be generated that can be used as input for tools like the +# Microsoft HTML help workshop to generate a compressed HTML help file (.chm) +# of the generated HTML documentation. + +GENERATE_HTMLHELP = NO + +# If the GENERATE_HTMLHELP tag is set to YES, the CHM_FILE tag can +# be used to specify the file name of the resulting .chm file. You +# can add a path in front of the file if the result should not be +# written to the html output directory. + +CHM_FILE = + +# If the GENERATE_HTMLHELP tag is set to YES, the HHC_LOCATION tag can +# be used to specify the location (absolute path including file name) of +# the HTML help compiler (hhc.exe). If non-empty doxygen will try to run +# the HTML help compiler on the generated index.hhp. + +HHC_LOCATION = + +# If the GENERATE_HTMLHELP tag is set to YES, the GENERATE_CHI flag +# controls if a separate .chi index file is generated (YES) or that +# it should be included in the master .chm file (NO). 
+ +GENERATE_CHI = NO + +# If the GENERATE_HTMLHELP tag is set to YES, the BINARY_TOC flag +# controls whether a binary table of contents is generated (YES) or a +# normal table of contents (NO) in the .chm file. + +BINARY_TOC = NO + +# The TOC_EXPAND flag can be set to YES to add extra items for group members +# to the contents of the HTML help documentation and to the tree view. + +TOC_EXPAND = NO + +# The DISABLE_INDEX tag can be used to turn on/off the condensed index at +# top of each HTML page. The value NO (the default) enables the index and +# the value YES disables it. + +DISABLE_INDEX = NO + +# This tag can be used to set the number of enum values (range [1..20]) +# that doxygen will group on one line in the generated HTML documentation. + +ENUM_VALUES_PER_LINE = 4 + +# If the GENERATE_TREEVIEW tag is set to YES, a side panel will be +# generated containing a tree-like index structure (just like the one that +# is generated for HTML Help). For this to work a browser that supports +# JavaScript, DHTML, CSS and frames is required (for instance Mozilla 1.0+, +# Netscape 6.0+, Internet explorer 5.0+, or Konqueror). Windows users are +# probably better off using the HTML help feature. + +GENERATE_TREEVIEW = YES + +# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be +# used to set the initial width (in pixels) of the frame in which the tree +# is shown. + +TREEVIEW_WIDTH = 250 + +#--------------------------------------------------------------------------- +# configuration options related to the LaTeX output +#--------------------------------------------------------------------------- + +# If the GENERATE_LATEX tag is set to YES (the default) Doxygen will +# generate Latex output. + +GENERATE_LATEX = NO + +# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `latex' will be used as the default path. 
+ +LATEX_OUTPUT = latex + +# The LATEX_CMD_NAME tag can be used to specify the LaTeX command name to be +# invoked. If left blank `latex' will be used as the default command name. + +LATEX_CMD_NAME = latex + +# The MAKEINDEX_CMD_NAME tag can be used to specify the command name to +# generate index for LaTeX. If left blank `makeindex' will be used as the +# default command name. + +MAKEINDEX_CMD_NAME = makeindex + +# If the COMPACT_LATEX tag is set to YES Doxygen generates more compact +# LaTeX documents. This may be useful for small projects and may help to +# save some trees in general. + +COMPACT_LATEX = NO + +# The PAPER_TYPE tag can be used to set the paper type that is used +# by the printer. Possible values are: a4, a4wide, letter, legal and +# executive. If left blank a4wide will be used. + +PAPER_TYPE = a4wide + +# The EXTRA_PACKAGES tag can be to specify one or more names of LaTeX +# packages that should be included in the LaTeX output. + +EXTRA_PACKAGES = + +# The LATEX_HEADER tag can be used to specify a personal LaTeX header for +# the generated latex document. The header should contain everything until +# the first chapter. If it is left blank doxygen will generate a +# standard header. Notice: only use this tag if you know what you are doing! + +LATEX_HEADER = + +# If the PDF_HYPERLINKS tag is set to YES, the LaTeX that is generated +# is prepared for conversion to pdf (using ps2pdf). The pdf file will +# contain links (just like the HTML output) instead of page references +# This makes the output suitable for online browsing using a pdf viewer. + +PDF_HYPERLINKS = NO + +# If the USE_PDFLATEX tag is set to YES, pdflatex will be used instead of +# plain latex in the generated Makefile. Set this option to YES to get a +# higher quality PDF documentation. + +USE_PDFLATEX = NO + +# If the LATEX_BATCHMODE tag is set to YES, doxygen will add the \\batchmode. +# command to the generated LaTeX files. 
This will instruct LaTeX to keep +# running if errors occur, instead of asking the user for help. +# This option is also used when generating formulas in HTML. + +LATEX_BATCHMODE = NO + +# If LATEX_HIDE_INDICES is set to YES then doxygen will not +# include the index chapters (such as File Index, Compound Index, etc.) +# in the output. + +LATEX_HIDE_INDICES = NO + +#--------------------------------------------------------------------------- +# configuration options related to the RTF output +#--------------------------------------------------------------------------- + +# If the GENERATE_RTF tag is set to YES Doxygen will generate RTF output +# The RTF output is optimized for Word 97 and may not look very pretty with +# other RTF readers or editors. + +GENERATE_RTF = NO + +# The RTF_OUTPUT tag is used to specify where the RTF docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `rtf' will be used as the default path. + +RTF_OUTPUT = rtf + +# If the COMPACT_RTF tag is set to YES Doxygen generates more compact +# RTF documents. This may be useful for small projects and may help to +# save some trees in general. + +COMPACT_RTF = NO + +# If the RTF_HYPERLINKS tag is set to YES, the RTF that is generated +# will contain hyperlink fields. The RTF file will +# contain links (just like the HTML output) instead of page references. +# This makes the output suitable for online browsing using WORD or other +# programs which support those fields. +# Note: wordpad (write) and others do not support links. + +RTF_HYPERLINKS = YES + +# Load stylesheet definitions from file. Syntax is similar to doxygen's +# config file, i.e. a series of assignments. You only have to provide +# replacements, missing definitions are set to their default value. + +RTF_STYLESHEET_FILE = + +# Set optional variables used in the generation of an rtf document. +# Syntax is similar to doxygen's config file. 
+ +RTF_EXTENSIONS_FILE = + +#--------------------------------------------------------------------------- +# configuration options related to the man page output +#--------------------------------------------------------------------------- + +# If the GENERATE_MAN tag is set to YES (the default) Doxygen will +# generate man pages + +GENERATE_MAN = NO + +# The MAN_OUTPUT tag is used to specify where the man pages will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `man' will be used as the default path. + +MAN_OUTPUT = man + +# The MAN_EXTENSION tag determines the extension that is added to +# the generated man pages (default is the subroutine's section .3) + +MAN_EXTENSION = .3 + +# If the MAN_LINKS tag is set to YES and Doxygen generates man output, +# then it will generate one additional man file for each entity +# documented in the real man page(s). These additional files +# only source the real man page, but without them the man command +# would be unable to find the correct page. The default is NO. + +MAN_LINKS = NO + +#--------------------------------------------------------------------------- +# configuration options related to the XML output +#--------------------------------------------------------------------------- + +# If the GENERATE_XML tag is set to YES Doxygen will +# generate an XML file that captures the structure of +# the code including all documentation. + +GENERATE_XML = NO + +# The XML_OUTPUT tag is used to specify where the XML pages will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `xml' will be used as the default path. + +XML_OUTPUT = xml + +# The XML_SCHEMA tag can be used to specify an XML schema, +# which can be used by a validating XML parser to check the +# syntax of the XML files. 
+ +XML_SCHEMA = + +# The XML_DTD tag can be used to specify an XML DTD, +# which can be used by a validating XML parser to check the +# syntax of the XML files. + +XML_DTD = + +# If the XML_PROGRAMLISTING tag is set to YES Doxygen will +# dump the program listings (including syntax highlighting +# and cross-referencing information) to the XML output. Note that +# enabling this will significantly increase the size of the XML output. + +XML_PROGRAMLISTING = YES + +#--------------------------------------------------------------------------- +# configuration options for the AutoGen Definitions output +#--------------------------------------------------------------------------- + +# If the GENERATE_AUTOGEN_DEF tag is set to YES Doxygen will +# generate an AutoGen Definitions (see autogen.sf.net) file +# that captures the structure of the code including all +# documentation. Note that this feature is still experimental +# and incomplete at the moment. + +GENERATE_AUTOGEN_DEF = NO + +#--------------------------------------------------------------------------- +# configuration options related to the Perl module output +#--------------------------------------------------------------------------- + +# If the GENERATE_PERLMOD tag is set to YES Doxygen will +# generate a Perl module file that captures the structure of +# the code including all documentation. Note that this +# feature is still experimental and incomplete at the +# moment. + +GENERATE_PERLMOD = NO + +# If the PERLMOD_LATEX tag is set to YES Doxygen will generate +# the necessary Makefile rules, Perl scripts and LaTeX code to be able +# to generate PDF and DVI output from the Perl module output. + +PERLMOD_LATEX = NO + +# If the PERLMOD_PRETTY tag is set to YES the Perl module output will be +# nicely formatted so it can be parsed by a human reader. This is useful +# if you want to understand what is going on. 
On the other hand, if this +# tag is set to NO the size of the Perl module output will be much smaller +# and Perl will parse it just the same. + +PERLMOD_PRETTY = YES + +# The names of the make variables in the generated doxyrules.make file +# are prefixed with the string contained in PERLMOD_MAKEVAR_PREFIX. +# This is useful so different doxyrules.make files included by the same +# Makefile don't overwrite each other's variables. + +PERLMOD_MAKEVAR_PREFIX = + +#--------------------------------------------------------------------------- +# Configuration options related to the preprocessor +#--------------------------------------------------------------------------- + +# If the ENABLE_PREPROCESSING tag is set to YES (the default) Doxygen will +# evaluate all C-preprocessor directives found in the sources and include +# files. + +ENABLE_PREPROCESSING = YES + +# If the MACRO_EXPANSION tag is set to YES Doxygen will expand all macro +# names in the source code. If set to NO (the default) only conditional +# compilation will be performed. Macro expansion can be done in a controlled +# way by setting EXPAND_ONLY_PREDEF to YES. + +MACRO_EXPANSION = YES + +# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES +# then the macro expansion is limited to the macros specified with the +# PREDEFINED and EXPAND_AS_PREDEFINED tags. + +EXPAND_ONLY_PREDEF = NO + +# If the SEARCH_INCLUDES tag is set to YES (the default) the includes files +# in the INCLUDE_PATH (see below) will be search if a #include is found. + +SEARCH_INCLUDES = YES + +# The INCLUDE_PATH tag can be used to specify one or more directories that +# contain include files that are not input files but should be processed by +# the preprocessor. + +INCLUDE_PATH = + +# You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard +# patterns (like *.h and *.hpp) to filter out the header-files in the +# directories. If left blank, the patterns specified with FILE_PATTERNS will +# be used. 
+ +INCLUDE_FILE_PATTERNS = + +# The PREDEFINED tag can be used to specify one or more macro names that +# are defined before the preprocessor is started (similar to the -D option of +# gcc). The argument of the tag is a list of macros of the form: name +# or name=definition (no spaces). If the definition and the = are +# omitted =1 is assumed. To prevent a macro definition from being +# undefined via #undef or recursively expanded use the := operator +# instead of the = operator. + +PREDEFINED = + +# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then +# this tag can be used to specify a list of macro names that should be expanded. +# The macro definition that is found in the sources will be used. +# Use the PREDEFINED tag if you want to use a different macro definition. + +EXPAND_AS_DEFINED = + +# If the SKIP_FUNCTION_MACROS tag is set to YES (the default) then +# doxygen's preprocessor will remove all function-like macros that are alone +# on a line, have an all uppercase name, and do not end with a semicolon. Such +# function macros are typically used for boiler-plate code, and will confuse the +# parser if not removed. + +SKIP_FUNCTION_MACROS = YES + +#--------------------------------------------------------------------------- +# Configuration::additions related to external references +#--------------------------------------------------------------------------- + +# The TAGFILES option can be used to specify one or more tagfiles. +# Optionally an initial location of the external documentation +# can be added for each tagfile. The format of a tag file without +# this location is as follows: +# TAGFILES = file1 file2 ... +# Adding location for the tag files is done as follows: +# TAGFILES = file1=loc1 "file2 = loc2" ... +# where "loc1" and "loc2" can be relative or absolute paths or +# URLs. If a location is present for each tag, the installdox tool +# does not have to be run to correct the links. 
+# Note that each tag file must have a unique name +# (where the name does NOT include the path) +# If a tag file is not located in the directory in which doxygen +# is run, you must also specify the path to the tagfile here. + +TAGFILES = + +# When a file name is specified after GENERATE_TAGFILE, doxygen will create +# a tag file that is based on the input files it reads. + +GENERATE_TAGFILE = + +# If the ALLEXTERNALS tag is set to YES all external classes will be listed +# in the class index. If set to NO only the inherited external classes +# will be listed. + +ALLEXTERNALS = NO + +# If the EXTERNAL_GROUPS tag is set to YES all external groups will be listed +# in the modules index. If set to NO, only the current project's groups will +# be listed. + +EXTERNAL_GROUPS = YES + +# The PERL_PATH should be the absolute path and name of the perl script +# interpreter (i.e. the result of `which perl'). + +PERL_PATH = /usr/bin/perl + +#--------------------------------------------------------------------------- +# Configuration options related to the dot tool +#--------------------------------------------------------------------------- + +# If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will +# generate a inheritance diagram (in HTML, RTF and LaTeX) for classes with base or +# super classes. Setting the tag to NO turns the diagrams off. Note that this +# option is superseded by the HAVE_DOT option below. This is only a fallback. It is +# recommended to install and use dot, since it yields more powerful graphs. + +CLASS_DIAGRAMS = YES + +# If set to YES, the inheritance and collaboration graphs will hide +# inheritance and usage relations if the target is undocumented +# or is not a class. + +HIDE_UNDOC_RELATIONS = YES + +# If you set the HAVE_DOT tag to YES then doxygen will assume the dot tool is +# available from the path. This tool is part of Graphviz, a graph visualization +# toolkit from AT&T and Lucent Bell Labs. 
The other options in this section +# have no effect if this option is set to NO (the default) + +HAVE_DOT = NO + +# If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen +# will generate a graph for each documented class showing the direct and +# indirect inheritance relations. Setting this tag to YES will force the +# the CLASS_DIAGRAMS tag to NO. + +CLASS_GRAPH = YES + +# If the COLLABORATION_GRAPH and HAVE_DOT tags are set to YES then doxygen +# will generate a graph for each documented class showing the direct and +# indirect implementation dependencies (inheritance, containment, and +# class references variables) of the class with other documented classes. + +COLLABORATION_GRAPH = YES + +# If the UML_LOOK tag is set to YES doxygen will generate inheritance and +# collaboration diagrams in a style similar to the OMG's Unified Modeling +# Language. + +UML_LOOK = NO + +# If set to YES, the inheritance and collaboration graphs will show the +# relations between templates and their instances. + +TEMPLATE_RELATIONS = NO + +# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDE_GRAPH, and HAVE_DOT +# tags are set to YES then doxygen will generate a graph for each documented +# file showing the direct and indirect include dependencies of the file with +# other documented files. + +INCLUDE_GRAPH = YES + +# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDED_BY_GRAPH, and +# HAVE_DOT tags are set to YES then doxygen will generate a graph for each +# documented header file showing the documented files that directly or +# indirectly include this file. + +INCLUDED_BY_GRAPH = YES + +# If the CALL_GRAPH and HAVE_DOT tags are set to YES then doxygen will +# generate a call dependency graph for every global function or class method. +# Note that enabling this option will significantly increase the time of a run. +# So in most cases it will be better to enable call graphs for selected +# functions only using the \callgraph command. 
+ +CALL_GRAPH = NO + +# If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen +# will graphical hierarchy of all classes instead of a textual one. + +GRAPHICAL_HIERARCHY = YES + +# The DOT_IMAGE_FORMAT tag can be used to set the image format of the images +# generated by dot. Possible values are png, jpg, or gif +# If left blank png will be used. + +DOT_IMAGE_FORMAT = png + +# The tag DOT_PATH can be used to specify the path where the dot tool can be +# found. If left blank, it is assumed the dot tool can be found on the path. + +DOT_PATH = + +# The DOTFILE_DIRS tag can be used to specify one or more directories that +# contain dot files that are included in the documentation (see the +# \dotfile command). + +DOTFILE_DIRS = + +# The MAX_DOT_GRAPH_WIDTH tag can be used to set the maximum allowed width +# (in pixels) of the graphs generated by dot. If a graph becomes larger than +# this value, doxygen will try to truncate the graph, so that it fits within +# the specified constraint. Beware that most browsers cannot cope with very +# large images. + +MAX_DOT_GRAPH_WIDTH = 1024 + +# The MAX_DOT_GRAPH_HEIGHT tag can be used to set the maximum allows height +# (in pixels) of the graphs generated by dot. If a graph becomes larger than +# this value, doxygen will try to truncate the graph, so that it fits within +# the specified constraint. Beware that most browsers cannot cope with very +# large images. + +MAX_DOT_GRAPH_HEIGHT = 1024 + +# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the +# graphs generated by dot. A depth value of 3 means that only nodes reachable +# from the root by following a path via at most 3 edges will be shown. Nodes that +# lay further from the root node will be omitted. Note that setting this option to +# 1 or 2 may greatly reduce the computation time needed for large code bases. 
Also +# note that a graph may be further truncated if the graph's image dimensions are +# not sufficient to fit the graph (see MAX_DOT_GRAPH_WIDTH and MAX_DOT_GRAPH_HEIGHT). +# If 0 is used for the depth value (the default), the graph is not depth-constrained. + +MAX_DOT_GRAPH_DEPTH = 0 + +# If the GENERATE_LEGEND tag is set to YES (the default) Doxygen will +# generate a legend page explaining the meaning of the various boxes and +# arrows in the dot generated graphs. + +GENERATE_LEGEND = YES + +# If the DOT_CLEANUP tag is set to YES (the default) Doxygen will +# remove the intermediate dot files that are used to generate +# the various graphs. + +DOT_CLEANUP = YES + +#--------------------------------------------------------------------------- +# Configuration::additions related to the search engine +#--------------------------------------------------------------------------- + +# The SEARCHENGINE tag specifies whether or not a search engine should be +# used. If set to NO the values of all tags below this one will be ignored. + +SEARCHENGINE = NO
--- a/Makefile.am Thu Jul 12 14:59:13 2007 -0700
+++ b/Makefile.am Sun Jul 15 14:25:34 2007 -0700
@@ -1,3 +1,3 @@
 SUBDIRS = src man html info
 CLEANFILES = xml/libpst xml/Makefile
-EXTRA_DIST = libpst.spec $(wildcard xml/M*) $(wildcard xml/h*) $(wildcard xml/lib*)
+EXTRA_DIST = libpst.html.tar.gz libpst.spec $(wildcard xml/M*) $(wildcard xml/h*) $(wildcard xml/lib*)
--- a/Makefile.cvs Thu Jul 12 14:59:13 2007 -0700
+++ b/Makefile.cvs Sun Jul 15 14:25:34 2007 -0700
@@ -1,8 +1,12 @@
 default: all
 
+HOST=$(shell hostname)
+
 all:
 	aclocal
 	autoheader
 	automake
 	autoconf
-
+	rm -rf html.internal
+	doxygen
+	tar cfz libpst.html.tar.gz html.internal
--- a/NEWS Thu Jul 12 14:59:13 2007 -0700
+++ b/NEWS Sun Jul 15 14:25:34 2007 -0700
@@ -1,5 +1,6 @@
 $Id$
 
+0.5.6 2007-07-15 handle small pst files, better decoding of 7c blocks
 0.5.5 2007-07-10 merge changes from Joe Nahmias version
 0.5.4 2006-02-25 add MH mode, generated filenames with no leading zeros
 0.5.3 2006-02-20 switch to gnu autoconf/automake
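The "handle small pst files" entry corresponds to the ChangeLog note about mixing signed and unsigned types in comparisons; the src/libpst.c part of this changeset changes several ID fields and range parameters from int32_t to u_int32_t. The sketch below is not libpst code. It only illustrates, with hypothetical helper names and assuming a 32-bit int, why the same bit pattern can compare differently depending on signedness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only (not part of libpst). */

/* Both operands signed: an ID with the top bit set looks negative. */
int signed_less(int32_t a, int32_t b) {
    return a < b;
}

/* Both operands unsigned: the same bit pattern is a large positive ID. */
int unsigned_less(uint32_t a, uint32_t b) {
    return a < b;
}

/* Mixed operands: the cast spells out the conversion that the compiler
   applies implicitly when a uint32_t is compared with an int32_t. */
int below_bound_mixed(uint32_t id, int32_t bound) {
    return id < (uint32_t)bound;
}

/* Small self-check documenting the behaviour described above. */
void demo_checks(void) {
    /* INT32_MIN and 0x80000000u share one bit pattern. */
    assert(signed_less(INT32_MIN, 1) == 1);
    assert(unsigned_less((uint32_t)INT32_MIN, 1u) == 0);
    /* A bound of -1 converts to 0xFFFFFFFF, so almost every ID is below it. */
    assert(below_bound_mixed(5u, -1) == 1);
}
```

Keeping both sides of a comparison in one unsigned type, as the changeset does for the start and end values, removes this ambiguity. A sentinel such as -1 then has to be replaced by a value that is representable in the unsigned type.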
--- a/configure.in Thu Jul 12 14:59:13 2007 -0700 +++ b/configure.in Sun Jul 15 14:25:34 2007 -0700 @@ -1,7 +1,7 @@ AC_INIT(configure.in) AM_CONFIG_HEADER(config.h) -AM_INIT_AUTOMAKE(libpst,0.5.5) +AM_INIT_AUTOMAKE(libpst,0.5.6) AC_PATH_PROGS(BASH, bash) AC_LANG_CPLUSPLUS
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/regression/regression-tests.bash Sun Jul 15 14:25:34 2007 -0700
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+for i in {1..6}; do
+    rm -rf output$i
+    mkdir output$i
+done
+
+# ../src/pst2ldif -b 'o=ams-cc.com, c=US' -c 'newPerson' ams.pst >ams.err
+# ../src/readpst -cv -o output1 ams.pst
+# ../src/readpst -cl -r -o output2 ams.pst
+# ../src/readpst -S -o output3 ams.pst
+# ../src/readpst -M -o output4 ams.pst
+# ../src/readpst -o output5 mbmg.archive.pst
+
+  ../src/readpst -o output1 -d dumper ams.pst
+  ../src/readpstlog dumper >dumperams.log
+
+  ../src/readpst -o output6 -d dumper /tmp/pam.pst
+  ../src/readpstlog dumper >dumperpam.log
+
+  rm -f dumper
--- a/src/debug.c Thu Jul 12 14:59:13 2007 -0700 +++ b/src/debug.c Sun Jul 15 14:25:34 2007 -0700 @@ -237,7 +237,6 @@ free(func_ptr->name); free(func_ptr); } - if (debug_fp) fclose(debug_fp); debug_fp = NULL; }
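The src/libpst.c diff that follows moves the parent lookup into a record_descriptor() helper. That helper keeps a "lost and found" list for descriptors whose parent has not been read yet and rechecks the list for each new item. As a reading aid, here is a minimal sketch of that idea only. The node and orphan types and the three function names are hypothetical simplifications, not the libpst structures, and the quick cache is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a descriptor tree node. */
typedef struct node {
    uint32_t id;
    struct node *parent;
    int no_child;
} node;

/* One lost-and-found entry: a node still waiting for its parent. */
typedef struct orphan {
    node *ptr;
    uint32_t parent_id;
    struct orphan *next;
} orphan;

static orphan *lostfound_head = NULL;

/* Park a node until a record with id == parent_id shows up.
   Returns 0 on success, -1 if allocation fails. */
int park_orphan(node *n, uint32_t parent_id) {
    assert(n != NULL);
    orphan *o = malloc(sizeof *o);
    if (!o) return -1;
    o->ptr = n;
    o->parent_id = parent_id;
    o->next = lostfound_head;
    lostfound_head = o;
    return 0;
}

/* Attach every parked node whose parent is p and unlink its list entry.
   Returns the number of nodes adopted. */
int adopt_orphans(node *p) {
    assert(p != NULL);
    int adopted = 0;
    orphan **link = &lostfound_head;
    while (*link) {
        orphan *o = *link;
        if (o->parent_id == p->id) {
            o->ptr->parent = p;
            p->no_child++;
            *link = o->next;   /* unlink without tracking a "previous" node */
            free(o);
            adopted++;
        } else {
            link = &o->next;
        }
    }
    return adopted;
}

/* Free whatever is still parked; returns the number of unclaimed nodes. */
int drain_lostfound(void) {
    int unclaimed = 0;
    while (lostfound_head) {
        orphan *next = lostfound_head->next;
        free(lostfound_head);
        lostfound_head = next;
        unclaimed++;
    }
    return unclaimed;
}
```

A caller would invoke adopt_orphans() once per newly read record and drain_lostfound() once at the end, reporting any nonzero result. The pointer-to-pointer walk is a design choice of this sketch; the diff itself tracks the previous entry in a separate shadow pointer.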
--- a/src/libpst.c Thu Jul 12 14:59:13 2007 -0700 +++ b/src/libpst.c Sun Jul 15 14:25:34 2007 -0700 @@ -44,7 +44,7 @@ #define PST_SIGNATURE 0x4E444221 struct _pst_table_ptr_struct{ - u_int32_t start; + int32_t start; int32_t u1; int32_t offset; }; @@ -55,12 +55,12 @@ } pst_block_header; typedef struct _pst_id2_assoc { - int32_t id2; - int32_t id; + u_int32_t id2; + u_int32_t id; int32_t table2; } pst_id2_assoc; -// this is an array of the un-encrypted values. the un-encrypyed value is in the position +// this is an array of the un-encrypted values. the un-encrypted value is in the position // of the encrypted value. ie the encrypted value 0x13 represents 0x02 // 0 1 2 3 4 5 6 7 // 8 9 a b c d e f @@ -192,9 +192,6 @@ pst_desc_ll* pst_getTopOfFolders(pst_file *pf, pst_item *root) { pst_desc_ll *ret; - // pst_item *i; - // char *a, *b; - // int x,z; DEBUG_ENT("pst_getTopOfFolders"); if (!root || !root->message_store) { DEBUG_INDEX(("There isn't a top of folder record here.\n")); @@ -521,7 +518,7 @@ if (index.id == 0) break; DEBUG_INDEX(("[%i]%i Item [id = %#x, offset = %#x, u1 = %#x, size = %i(%#x)]\n", depth, x, index.id, index.offset, index.u1, index.size, index.size)); - if (index.id & 0x02) DEBUG_INDEX(("two-bit set!!\n")); + // if (index.id & 0x02) DEBUG_INDEX(("two-bit set!!\n")); if ((index.id >= end_val) || (index.id < old)) { DEBUG_WARN(("This item isn't right. Must be corruption, or I got it wrong!\n")); if (buf) free(buf); @@ -591,23 +588,124 @@ } -int32_t _pst_build_desc_ptr (pst_file *pf, int32_t offset, int32_t depth, int32_t linku1, u_int32_t *high_id, int32_t start_val, int32_t end_val) { +/** this list node type is used for a quick cache + of the descriptor tree nodes (rooted at pf->d_head) + and for a "lost and found" list. + If the parent isn't found yet, put it on the lost and found + list and check it each time you read a new item. 
+*/ +struct cache_list_node { + pst_desc_ll *ptr; + /** only used for lost and found lists */ + u_int32_t parent; + struct cache_list_node *next; + struct cache_list_node *prev; +}; +struct cache_list_node *cache_head; +struct cache_list_node *cache_tail; +struct cache_list_node *lostfound_head; +int32_t cache_count; + + +/** + add the d_ptr descriptor into the global tree +*/ +void record_descriptor(pst_file *pf, pst_desc_ll *d_ptr, u_int32_t parent_id) { + struct cache_list_node *lostfound_ptr = NULL; + struct cache_list_node *cache_ptr = NULL; + pst_desc_ll *parent = NULL; + + if (parent_id == 0 || parent_id == d_ptr->id) { + // add top level node to the descriptor tree + if (parent_id == 0) { + DEBUG_INDEX(("No Parent\n")); + } else { + DEBUG_INDEX(("Record is its own parent. What is this world coming to?\n")); + } + if (pf->d_tail) pf->d_tail->next = d_ptr; + if (!pf->d_head) pf->d_head = d_ptr; + d_ptr->prev = pf->d_tail; + pf->d_tail = d_ptr; + } else { + DEBUG_INDEX(("Searching for parent\n")); + // check in the cache for the parent + cache_ptr = cache_head; + while (cache_ptr && (cache_ptr->ptr->id != parent_id)) { + cache_ptr = cache_ptr->next; + } + if (!cache_ptr && (parent = _pst_getDptr(pf, parent_id)) == NULL) { + // check in the lost/found list + lostfound_ptr = lostfound_head; + while (lostfound_ptr && (lostfound_ptr->ptr->id != parent_id)) { + lostfound_ptr = lostfound_ptr->next; + } + if (!lostfound_ptr) { + DEBUG_WARN(("ERROR -- cannot find parent with id %#x. 
Adding to lost/found\n", parent_id)); + lostfound_ptr = (struct cache_list_node*) xmalloc(sizeof(struct cache_list_node)); + lostfound_ptr->prev = NULL; + lostfound_ptr->next = lostfound_head; + lostfound_ptr->parent = parent_id; + lostfound_ptr->ptr = d_ptr; + lostfound_head = lostfound_ptr; + } else { + parent = lostfound_ptr->ptr; + DEBUG_INDEX(("Found parent (%#x) in Lost and Found\n", parent->id)); + } + } + + if (cache_ptr || parent) { + if (cache_ptr) + // parent is already in the cache + parent = cache_ptr->ptr; + else { + //add the parent to the cache + DEBUG_INDEX(("Cache addition\n")); + cache_ptr = (struct cache_list_node*) xmalloc(sizeof(struct cache_list_node)); + cache_ptr->prev = NULL; + cache_ptr->next = cache_head; + cache_ptr->ptr = parent; + cache_head = cache_ptr; + if (!cache_tail) cache_tail = cache_ptr; + cache_count++; + if (cache_count > 100) { + DEBUG_INDEX(("trimming quick cache\n")); + //remove one from the end + cache_ptr = cache_tail; + cache_tail = cache_ptr->prev; + free (cache_ptr); + cache_count--; + } + } + DEBUG_INDEX(("Found a parent\n")); + parent->no_child++; + d_ptr->parent = parent; + if (parent->child_tail) parent->child_tail->next = d_ptr; + if (!parent->child) parent->child = d_ptr; + d_ptr->prev = parent->child_tail; + parent->child_tail = d_ptr; + } + } +} + +int32_t _pst_build_desc_ptr (pst_file *pf, int32_t offset, int32_t depth, int32_t linku1, u_int32_t *high_id, u_int32_t start_val, u_int32_t end_val) { struct _pst_table_ptr_struct table, table2; pst_desc desc_rec; - pst_desc_ll *d_ptr=NULL, *d_par=NULL; - int32_t d_ptr_count = 0; + pst_desc_ll *d_ptr=NULL, *parent=NULL; int32_t x, item_count; - int32_t old = start_val; + u_int32_t old = start_val; char *buf = NULL, *bptr; + struct cache_list_node *cache_ptr = NULL; + struct cache_list_node *lostfound_ptr = NULL; + struct cache_list_node *lostfound_shd = NULL; + struct cache_list_node *lostfound_tmp = NULL; - struct _pst_d_ptr_ll { - pst_desc_ll * ptr; - int32_t 
parent; // used for lost and found lists - struct _pst_d_ptr_ll * next; - struct _pst_d_ptr_ll * prev; - } *d_ptr_head=NULL, *d_ptr_tail=NULL, *d_ptr_ptr=NULL, *lf_ptr=NULL, *lf_head=NULL, *lf_shd=NULL, *lf_tmp; - // lf_ptr and lf_head are used for the lost/found list. If the parent isn't found yet, put it on this - // list and check it each time you read a new item + if (depth == 0) { + // initialize the linked list and lost/found list. + cache_head = NULL; + cache_tail = NULL; + lostfound_head = NULL; + cache_count = 0; + } DEBUG_ENT("_pst_build_desc_ptr"); DEBUG_INDEX(("offset %x depth %i linku1 %x start %x end %x\n", offset, depth, linku1, start_val, end_val)); @@ -663,7 +761,7 @@ } old = desc_rec.d_id; if (x == 1) { // first entry - if (start_val != -1 && desc_rec.d_id != start_val) { + if (start_val && (desc_rec.d_id != start_val)) { DEBUG_WARN(("This item isn't right. Must be corruption, or I got it wrong!\n")); if (buf) free(buf); DEBUG_RET(); @@ -672,7 +770,7 @@ } // When duplicates found, just update the info.... 
perhaps this is correct functionality DEBUG_INDEX(("Searching for existing record\n")); - if (desc_rec.d_id <= *high_id && (d_ptr = _pst_getDptr(pf, desc_rec.d_id)) != NULL) { + if (desc_rec.d_id <= *high_id && (d_ptr = _pst_getDptr(pf, desc_rec.d_id))) { DEBUG_INDEX(("Updating Existing Values\n")); d_ptr->list_index = _pst_getID(pf, desc_rec.list_id); d_ptr->desc = _pst_getID(pf, desc_rec.desc_id); @@ -706,72 +804,7 @@ d_ptr->prev = NULL; d_ptr->next = NULL; d_ptr->parent = NULL; - - // ok, now place in correct place - DEBUG_INDEX(("Searching for parent\n")); - if (desc_rec.parent_id == 0) { - DEBUG_INDEX(("No Parent\n")); - if (pf->d_tail) pf->d_tail->next = d_ptr; - if (!pf->d_head) pf->d_head = d_ptr; - d_ptr->prev = pf->d_tail; - pf->d_tail = d_ptr; - } else { - // check in the quick list - d_ptr_ptr = d_ptr_head; - while (d_ptr_ptr && (d_ptr_ptr->ptr->id != desc_rec.parent_id)) { - d_ptr_ptr = d_ptr_ptr->next; - } - - if (!d_ptr_ptr && (d_par = _pst_getDptr(pf, desc_rec.parent_id)) == NULL) { - // check in the lost/found list - lf_ptr = lf_head; - while (lf_ptr && lf_ptr->ptr->id != desc_rec.parent_id) { - lf_ptr = lf_ptr->next; - } - if (!lf_ptr) { - DEBUG_WARN(("ERROR -- not found parent with id %#x. 
Adding to lost/found\n", desc_rec.parent_id)); - lf_ptr = (struct _pst_d_ptr_ll*) xmalloc(sizeof(struct _pst_d_ptr_ll)); - lf_ptr->prev = NULL; - lf_ptr->next = lf_head; - lf_ptr->parent = desc_rec.parent_id; - lf_ptr->ptr = d_ptr; - lf_head = lf_ptr; - } else { - d_par = lf_ptr->ptr; - DEBUG_INDEX(("Found parent (%#x) in Lost and Found\n", d_par->id)); - } - } - - if (d_ptr_ptr || d_par) { - if (d_ptr_ptr) - d_par = d_ptr_ptr->ptr; - else { - //add the d_par to the cache - DEBUG_INDEX(("Update - Cache addition\n")); - d_ptr_ptr = (struct _pst_d_ptr_ll*) xmalloc(sizeof(struct _pst_d_ptr_ll)); - d_ptr_ptr->prev = NULL; - d_ptr_ptr->next = d_ptr_head; - d_ptr_ptr->ptr = d_par; - d_ptr_head = d_ptr_ptr; - if (!d_ptr_tail) d_ptr_tail = d_ptr_ptr; - d_ptr_count++; - if (d_ptr_count > 100) { - //remove on from the end - d_ptr_ptr = d_ptr_tail; - d_ptr_tail = d_ptr_ptr->prev; - free (d_ptr_ptr); - d_ptr_count--; - } - } - DEBUG_INDEX(("Found a parent\n")); - d_par->no_child++; - d_ptr->parent = d_par; - if (d_par->child_tail) d_par->child_tail->next = d_ptr; - if (!d_par->child) d_par->child = d_ptr; - d_ptr->prev = d_par->child_tail; - d_par->child_tail = d_ptr; - } - } + record_descriptor(pf, d_ptr, desc_rec.parent_id); // add to the global tree } } else { if (*high_id < desc_rec.d_id) { @@ -789,99 +822,31 @@ d_ptr->child = NULL; d_ptr->child_tail = NULL; d_ptr->no_child = 0; + record_descriptor(pf, d_ptr, desc_rec.parent_id); // add to the global tree - DEBUG_INDEX(("Searching for parent\n")); - if (desc_rec.parent_id == 0 || desc_rec.parent_id == desc_rec.d_id) { - if (desc_rec.parent_id == 0) { - DEBUG_INDEX(("No Parent\n")); - } else { - DEBUG_INDEX(("Record is its own parent. 
What is this world coming to?\n")); - } - if (pf->d_tail) pf->d_tail->next = d_ptr; - if (!pf->d_head) pf->d_head = d_ptr; - d_ptr->prev = pf->d_tail; - pf->d_tail = d_ptr; - } else { - d_ptr_ptr = d_ptr_head; - while (d_ptr_ptr && (d_ptr_ptr->ptr->id != desc_rec.parent_id)) { - d_ptr_ptr = d_ptr_ptr->next; - } - if (!d_ptr_ptr && (d_par = _pst_getDptr(pf, desc_rec.parent_id)) == NULL) { - // check in the lost/found list - lf_ptr = lf_head; - while (lf_ptr && (lf_ptr->ptr->id != desc_rec.parent_id)) { - lf_ptr = lf_ptr->next; - } - if (!lf_ptr) { - DEBUG_WARN(("ERROR -- not found parent with id %#x. Adding to lost/found\n", desc_rec.parent_id)); - lf_ptr = (struct _pst_d_ptr_ll*) xmalloc(sizeof(struct _pst_d_ptr_ll)); - lf_ptr->prev = NULL; - lf_ptr->next = lf_head; - lf_ptr->parent = desc_rec.parent_id; - lf_ptr->ptr = d_ptr; - lf_head = lf_ptr; - } else { - d_par = lf_ptr->ptr; - DEBUG_INDEX(("Found parent (%#x) in Lost and Found\n", d_par->id)); - } - } - - if (d_ptr_ptr || d_par) { - if (d_ptr_ptr) - d_par = d_ptr_ptr->ptr; - else { - //add the d_par to the cache - DEBUG_INDEX(("Normal - Cache addition\n")); - d_ptr_ptr = (struct _pst_d_ptr_ll*) xmalloc(sizeof(struct _pst_d_ptr_ll)); - d_ptr_ptr->prev = NULL; - d_ptr_ptr->next = d_ptr_head; - d_ptr_ptr->ptr = d_par; - d_ptr_head = d_ptr_ptr; - if (!d_ptr_tail) d_ptr_tail = d_ptr_ptr; - d_ptr_count++; - if (d_ptr_count > 100) { - //remove one from the end - d_ptr_ptr = d_ptr_tail; - d_ptr_tail = d_ptr_ptr->prev; - free (d_ptr_ptr); - d_ptr_count--; - } - } - - DEBUG_INDEX(("Found a parent\n")); - d_par->no_child++; - d_ptr->parent = d_par; - if (d_par->child_tail) d_par->child_tail->next = d_ptr; - if (!d_par->child) d_par->child = d_ptr; - d_ptr->prev = d_par->child_tail; - d_par->child_tail = d_ptr; - } - } } // check here to see if d_ptr is the parent of any of the items in the lost / found list - lf_ptr = lf_head; - lf_shd = NULL; - while (lf_ptr) { - if (lf_ptr->parent == d_ptr->id) { - DEBUG_INDEX(("Found 
a child (%#x) of the current record. Joining to main structure.\n", lf_ptr->ptr->id)); - d_par = d_ptr; - d_ptr = lf_ptr->ptr; - d_par->no_child++; - d_ptr->parent = d_par; - if (d_par->child_tail) d_par->child_tail->next = d_ptr; - if (!d_par->child) d_par->child = d_ptr; - d_ptr->prev = d_par->child_tail; - d_par->child_tail = d_ptr; - if (!lf_shd) - lf_head = lf_ptr->next; - else - lf_shd->next = lf_ptr->next; - lf_tmp = lf_ptr->next; - free(lf_ptr); - lf_ptr = lf_tmp; + lostfound_ptr = lostfound_head; + lostfound_shd = NULL; + while (lostfound_ptr) { + if (lostfound_ptr->parent == d_ptr->id) { + DEBUG_INDEX(("Found a child (%#x) of the current record. Joining to main structure.\n", lostfound_ptr->ptr->id)); + parent = d_ptr; + d_ptr = lostfound_ptr->ptr; + parent->no_child++; + d_ptr->parent = parent; + if (parent->child_tail) parent->child_tail->next = d_ptr; + if (!parent->child) parent->child = d_ptr; + d_ptr->prev = parent->child_tail; + parent->child_tail = d_ptr; + if (!lostfound_shd) lostfound_head = lostfound_ptr->next; + else lostfound_shd->next = lostfound_ptr->next; + lostfound_tmp = lostfound_ptr->next; + free(lostfound_ptr); + lostfound_ptr = lostfound_tmp; } else { - lf_shd = lf_ptr; - lf_ptr = lf_ptr->next; + lostfound_shd = lostfound_ptr; + lostfound_ptr = lostfound_ptr->next; } } } @@ -930,14 +895,21 @@ _pst_build_desc_ptr(pf, table.offset, depth+1, table.u1, high_id, table.start, table2.start); } } - // ok, lets try freeing the d_ptr_head cache here - while (d_ptr_head) { - d_ptr_ptr = d_ptr_head->next; - free(d_ptr_head); - d_ptr_head = d_ptr_ptr; + if (depth == 0) { + // free the quick cache + while (cache_head) { + cache_ptr = cache_head->next; + free(cache_head); + cache_head = cache_ptr; + } + // free the lost and found + while (lostfound_head) { + lostfound_ptr = lostfound_head->next; + WARN(("unused lost/found item with parent %d))", lostfound_head->parent)); + free(lostfound_head); + lostfound_head = lostfound_ptr; + } } - // TODO - 
need to free lost and found list also!! - // TODO - and show error for any remaining lf items if (buf) free(buf); DEBUG_RET(); return 0; @@ -1076,14 +1048,49 @@ } +void freeall(unsigned char *buf, pst_block_offset_pointer *p1, + pst_block_offset_pointer *p2, + pst_block_offset_pointer *p3, + pst_block_offset_pointer *p4, + pst_block_offset_pointer *p5, + pst_block_offset_pointer *p6, + pst_block_offset_pointer *p7) { + if (buf) free(buf); + if (p1->needfree) free(p1->from); + if (p2->needfree) free(p2->from); + if (p3->needfree) free(p3->from); + if (p4->needfree) free(p4->from); + if (p5->needfree) free(p5->from); + if (p6->needfree) free(p6->from); + if (p7->needfree) free(p7->from); +} + + pst_num_array * _pst_parse_block(pst_file *pf, u_int32_t block_id, pst_index2_ll *i2_head) { unsigned char *buf = NULL; pst_num_array *na_ptr = NULL, *na_head = NULL; - pst_block_offset block_offset; - // pst_index_ll *rec = NULL; - u_int32_t size = 0, t_ptr = 0, fr_ptr = 0, to_ptr = 0, ind_ptr = 0, x = 0; - u_int32_t num_recs = 0, count_rec = 0, ind2_ptr = 0, ind2_end = 0, list_start = 0, num_list = 0, cur_list = 0; - int32_t block_type, rec_size; + pst_block_offset_pointer block_offset1; + pst_block_offset_pointer block_offset2; + pst_block_offset_pointer block_offset3; + pst_block_offset_pointer block_offset4; + pst_block_offset_pointer block_offset5; + pst_block_offset_pointer block_offset6; + pst_block_offset_pointer block_offset7; + u_int32_t size; + u_int32_t x; + u_int32_t num_recs; + u_int32_t count_rec; + u_int32_t num_list; + u_int32_t cur_list; + u_int32_t block_type; + u_int32_t rec_size; + u_int32_t ind_ptr; + unsigned char* list_start; + unsigned char* t_ptr; + unsigned char* fr_ptr; + unsigned char* to_ptr; + unsigned char* ind2_end; + unsigned char* ind2_ptr; size_t read_size=0; pst_x_attrib_ll *mapptr; @@ -1091,17 +1098,21 @@ u_int16_t type; u_int16_t ref_type; u_int32_t value; - } table_rec; //for type 1 ("BC") blocks + } table_rec; //for type 1 (0xBCEC) 
blocks struct { u_int16_t ref_type; u_int16_t type; u_int16_t ind2_off; - u_int16_t u1; - } table2_rec; //for type 2 ("7C") blocks + u_int8_t size; + u_int8_t slot; + } table2_rec; //for type 2 (0x7CEC) blocks + struct { + u_int32_t id; + } table3_rec; //for type 3 (0x0101) blocks struct { u_int16_t index_offset; u_int16_t type; - u_int16_t offset; + u_int32_t offset; } block_hdr; struct { unsigned char seven_c; @@ -1110,10 +1121,8 @@ u_int16_t u2; u_int16_t u3; u_int16_t rec_size; - u_int16_t b_five_offset; - u_int16_t u5; - u_int16_t ind2_offset; - u_int16_t u6; + u_int32_t b_five_offset; + u_int32_t ind2_offset; u_int16_t u7; u_int16_t u8; } seven_c_blk; @@ -1130,10 +1139,18 @@ return NULL; } + block_offset1.needfree = 0; + block_offset2.needfree = 0; + block_offset3.needfree = 0; + block_offset4.needfree = 0; + block_offset5.needfree = 0; + block_offset6.needfree = 0; + block_offset7.needfree = 0; + memcpy(&block_hdr, buf, sizeof(block_hdr)); LE16_CPU(block_hdr.index_offset); LE16_CPU(block_hdr.type); - LE16_CPU(block_hdr.offset); + LE32_CPU(block_hdr.offset); DEBUG_EMAIL(("block header (index_offset=%#hx, type=%#hx, offset=%#hx\n", block_hdr.index_offset, block_hdr.type, block_hdr.offset)); ind_ptr = block_hdr.index_offset; @@ -1141,26 +1158,34 @@ if (block_hdr.type == 0xBCEC) { //type 1 block_type = 1; - _pst_getBlockOffset(buf, read_size, ind_ptr, block_hdr.offset, &block_offset); - fr_ptr = block_offset.from; - - memcpy(&table_rec, &(buf[fr_ptr]), sizeof(table_rec)); + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, block_hdr.offset, &block_offset1)) { + DEBUG_WARN(("internal error (bc.b5 offset %#x) in reading block id %#x\n", block_hdr.offset, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; + } + memcpy(&table_rec, block_offset1.from, sizeof(table_rec)); LE16_CPU(table_rec.type); LE16_CPU(table_rec.ref_type); 
LE32_CPU(table_rec.value); - DEBUG_EMAIL(("table_rec (type=%#hx, ref_type=%#hx, value=%#x\n", table_rec.type, table_rec.ref_type, table_rec.value)); + DEBUG_EMAIL(("table_rec (type=%#hx, ref_type=%#hx, value=%#x)\n", table_rec.type, table_rec.ref_type, table_rec.value)); if (table_rec.type != 0x02B5) { WARN(("Unknown second block constant - %#X for id %#x\n", table_rec.type, block_id)); DEBUG_HEXDUMPC(buf, sizeof(table_rec), 0x10); - if (buf) free (buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return NULL; } - _pst_getBlockOffset(buf, read_size, ind_ptr, table_rec.value, &block_offset); - list_start = block_offset.from; - to_ptr = block_offset.to; + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, table_rec.value, &block_offset2)) { + DEBUG_WARN(("internal error (bc.b5.desc offset) in reading block id %#x\n", table_rec.value, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; + } + list_start = block_offset2.from; + to_ptr = block_offset2.to; num_list = (to_ptr - list_start)/sizeof(table_rec); num_recs = 1; // only going to be one object in these blocks rec_size = 0; // doesn't matter cause there is only one object @@ -1168,18 +1193,21 @@ else if (block_hdr.type == 0x7CEC) { //type 2 block_type = 2; - _pst_getBlockOffset(buf, read_size, ind_ptr, block_hdr.offset, &block_offset); - fr_ptr = block_offset.from; //now got pointer to "7C block" + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, block_hdr.offset, &block_offset3)) { + DEBUG_WARN(("internal error (7c.7c offset %#x) in reading block id %#x\n", block_hdr.offset, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; + } + fr_ptr = 
block_offset3.from; //now got pointer to "7C block" memset(&seven_c_blk, 0, sizeof(seven_c_blk)); - memcpy(&seven_c_blk, &(buf[fr_ptr]), sizeof(seven_c_blk)); + memcpy(&seven_c_blk, fr_ptr, sizeof(seven_c_blk)); LE16_CPU(seven_c_blk.u1); LE16_CPU(seven_c_blk.u2); LE16_CPU(seven_c_blk.u3); LE16_CPU(seven_c_blk.rec_size); - LE16_CPU(seven_c_blk.b_five_offset); - LE16_CPU(seven_c_blk.u5); - LE16_CPU(seven_c_blk.ind2_offset); - LE16_CPU(seven_c_blk.u6); + LE32_CPU(seven_c_blk.b_five_offset); + LE32_CPU(seven_c_blk.ind2_offset); LE16_CPU(seven_c_blk.u7); LE16_CPU(seven_c_blk.u8); @@ -1187,53 +1215,75 @@ if (seven_c_blk.seven_c != 0x7C) { // this would mean it isn't a 7C block! WARN(("Error. There isn't a 7C where I want to see 7C!\n")); - if (buf) free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return NULL; } rec_size = seven_c_blk.rec_size; num_list = seven_c_blk.item_count; - DEBUG_EMAIL(("b5 offset = %#x\n", seven_c_blk.b_five_offset)); - _pst_getBlockOffset(buf, read_size, ind_ptr, seven_c_blk.b_five_offset, &block_offset); - fr_ptr = block_offset.from; - memcpy(&table_rec, &(buf[fr_ptr]), sizeof(table_rec)); + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, seven_c_blk.b_five_offset, &block_offset4)) { + DEBUG_WARN(("internal error (7c.b5 offset %#x) in reading block id %#x\n", seven_c_blk.b_five_offset, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; + } + memcpy(&table_rec, block_offset4.from, sizeof(table_rec)); LE16_CPU(table_rec.type); LE16_CPU(table_rec.ref_type); LE32_CPU(table_rec.value); - DEBUG_EMAIL(("after convert %#x\n", table_rec.type)); if (table_rec.type != 0x04B5) { // different constant than a type 1 record WARN(("Unknown second block constant - %#X for id %#x\n", table_rec.type, block_id)); - if (buf) 
free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return NULL; } - if (table_rec.value == 0) { // this is for the 2nd index offset - DEBUG_INFO(("reference to second index block is zero. ERROR\n")); - if (buf) free(buf); + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, table_rec.value, &block_offset5)) { + DEBUG_WARN(("internal error (7c.5b.desc offset %#x) in reading block id %#x\n", table_rec.value, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; + } + num_recs = (block_offset5.to - block_offset5.from) / 6; // this will give the number of records in this block + + if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, seven_c_blk.ind2_offset, &block_offset6)) { + DEBUG_WARN(("internal error (7c.ind2 offset %#x) in reading block id %#x\n", seven_c_blk.ind2_offset, block_id)); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return NULL; } - - _pst_getBlockOffset(buf, read_size, ind_ptr, table_rec.value, &block_offset); - num_recs = (block_offset.to - block_offset.from) / 6; // this will give the number of records in this block - - _pst_getBlockOffset(buf, read_size, ind_ptr, seven_c_blk.ind2_offset, &block_offset); - ind2_ptr = block_offset.from; - ind2_end = block_offset.to; + ind2_ptr = block_offset6.from; + ind2_end = block_offset6.to; + } + else if (block_hdr.index_offset == 0x0101) { //type 2 + unsigned char *buf2 = NULL; + int n = block_hdr.type; // count + int m = sizeof(table3_rec); + int i; + block_type = 3; + for (i=0; i<n; i++) { + memcpy(&table3_rec, buf+8+i*m, m); + LE32_CPU(table3_rec.id); + _pst_ff_getIDblock_dec(pf, table3_rec.id, &buf2); + if (buf2) free(buf2); + buf2 = NULL; + } + freeall(buf, 
&block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); + DEBUG_RET(); + return NULL; } else { WARN(("ERROR: Unknown block constant - %#X for id %#x\n", block_hdr.type, block_id)); DEBUG_HEXDUMPC(buf, read_size,0x10); - if (buf) free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return NULL; } - DEBUG_EMAIL(("Mallocing number of items %i\n", num_recs)); - while (count_rec < num_recs) { + DEBUG_EMAIL(("Mallocing number of records %i\n", num_recs)); + for (count_rec=0; count_rec<num_recs; count_rec++) { na_ptr = (pst_num_array*) xmalloc(sizeof(pst_num_array)); memset(na_ptr, 0, sizeof(pst_num_array)); na_ptr->next = na_head; @@ -1246,45 +1296,51 @@ DEBUG_EMAIL(("going to read %i (%#x) items\n", na_ptr->count_item, na_ptr->count_item)); - fr_ptr = list_start; // init fr_ptr to the start of the list. - cur_list = 0; - while (cur_list < num_list) { //we will increase fr_ptr as we progress through index + fr_ptr = list_start; // initialize fr_ptr to the start of the list. 
+ for (cur_list=0; cur_list<num_list; cur_list++) { //we will increase fr_ptr as we progress through index + unsigned char* value_pointer = NULL; // needed for block type 2 with values larger than 4 bytes + int value_size = 0; if (block_type == 1) { - memcpy(&table_rec, &(buf[fr_ptr]), sizeof(table_rec)); + memcpy(&table_rec, fr_ptr, sizeof(table_rec)); LE16_CPU(table_rec.type); LE16_CPU(table_rec.ref_type); //LE32_CPU(table_rec.value); // done later, some may be order invariant fr_ptr += sizeof(table_rec); } else if (block_type == 2) { // we will copy the table2_rec values into a table_rec record so that we can keep the rest of the code - memcpy(&table2_rec, &(buf[fr_ptr]), sizeof(table2_rec)); + memcpy(&table2_rec, fr_ptr, sizeof(table2_rec)); LE16_CPU(table2_rec.ref_type); LE16_CPU(table2_rec.type); LE16_CPU(table2_rec.ind2_off); - LE16_CPU(table2_rec.u1); // table_rec and table2_rec are arranged differently, so assign the values across table_rec.type = table2_rec.type; table_rec.ref_type = table2_rec.ref_type; - if (ind2_ptr+table2_rec.ind2_off <= ind2_end) { - memcpy(&(table_rec.value), &(buf[ind2_ptr+table2_rec.ind2_off]), sizeof(table_rec.value)); + if ((ind2_end - ind2_ptr) <= (table2_rec.ind2_off + table2_rec.size)) { + int n = table2_rec.size; + int m = sizeof(table_rec.value); + table_rec.value = 0; + if (n <= m) { + memcpy(&table_rec.value, ind2_ptr + table2_rec.ind2_off, n); + } + else { + value_pointer = ind2_ptr + table2_rec.ind2_off; + value_size = n; + } //LE32_CPU(table_rec.value); // done later, some may be order invariant } else { - DEBUG_WARN (("trying to read more than blocks size. 
Size=%#x, Req.=%#x," - " Req Size=%#x\n", read_size, ind2_ptr+table2_rec.ind2_off, - sizeof(table_rec.value))); + DEBUG_WARN (("Trying to read outside buffer, buffer size %#x, offset %#x, data size %#x\n", + read_size, ind2_end-ind2_ptr+table2_rec.ind2_off, table2_rec.size)); } - fr_ptr += sizeof(table2_rec); } else { WARN(("Missing code for block_type %i\n", block_type)); - if (buf) free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); if (na_head) _pst_free_list(na_head); DEBUG_RET(); return NULL; } - cur_list++; // get ready to read next bit from list DEBUG_EMAIL(("reading block %i (type=%#x, ref_type=%#x, value=%#x)\n", x, table_rec.type, table_rec.ref_type, table_rec.value)); @@ -1329,84 +1385,68 @@ 0x1102 - Array of Binary data */ - if (table_rec.ref_type == 0x0002 || table_rec.ref_type == 0x0003 || table_rec.ref_type == 0x000b) { - //contains data - na_ptr->items[x]->data = xmalloc(sizeof(int32_t)); - memcpy(na_ptr->items[x]->data, &(table_rec.value), sizeof(int32_t)); + if (table_rec.ref_type == 0x0002 || + table_rec.ref_type == 0x0003 || + table_rec.ref_type == 0x000b) { + //contains 32 bits of data na_ptr->items[x]->size = sizeof(int32_t); na_ptr->items[x]->type = table_rec.ref_type; + na_ptr->items[x]->data = xmalloc(sizeof(int32_t)); + memcpy(na_ptr->items[x]->data, &(table_rec.value), sizeof(int32_t)); - } else if (table_rec.ref_type == 0x0005 || table_rec.ref_type == 0x000D - || table_rec.ref_type == 0x1003 || table_rec.ref_type == 0x0014 - || table_rec.ref_type == 0x001E || table_rec.ref_type == 0x0102 - || table_rec.ref_type == 0x0040 || table_rec.ref_type == 0x101E - || table_rec.ref_type == 0x0048 || table_rec.ref_type == 0x1102 - || table_rec.ref_type == 0x1014) { - //contains index_ref to data + } else if (table_rec.ref_type == 0x0005 || + table_rec.ref_type == 0x000d || + table_rec.ref_type == 0x0014 || + table_rec.ref_type == 0x001e || + table_rec.ref_type == 
0x0040 || + table_rec.ref_type == 0x0048 || + table_rec.ref_type == 0x0102 || + table_rec.ref_type == 0x1003 || + table_rec.ref_type == 0x1014 || + table_rec.ref_type == 0x101e || + table_rec.ref_type == 0x1102) { + //contains index reference to data LE32_CPU(table_rec.value); - if ((table_rec.value & 0x0000000F) == 0xF) { - // if value ends in 'F' then this should be an id2 value - DEBUG_EMAIL(("Found id2 [%#x] value. Will follow it\n", table_rec.value)); - if ((na_ptr->items[x]->size = _pst_ff_getID2block(pf, table_rec.value, i2_head, - &(na_ptr->items[x]->data)))==0) { - DEBUG_WARN(("not able to read the ID2 data. Setting to be read later. %#x\n", table_rec.value)); + if (value_pointer) { + // in a type 2 block, with a value that is more than 4 bytes + // directly stored in this block. + na_ptr->items[x]->size = value_size; + na_ptr->items[x]->type = table_rec.ref_type; + na_ptr->items[x]->data = xmalloc(value_size); + memcpy(na_ptr->items[x]->data, value_pointer, value_size); + } + else if (_pst_getBlockOffsetPointer(pf, i2_head, buf, read_size, ind_ptr, table_rec.value, &block_offset7)) { + if (table_rec.value) { + DEBUG_WARN(("failed to get block offset for table_rec.value of %#x\n", table_rec.value)); + } + na_ptr->count_item --; //we will be skipping a row + continue; + } + else { + value_size = block_offset7.to - block_offset7.from; + na_ptr->items[x]->size = value_size; + na_ptr->items[x]->type = table_rec.ref_type; + na_ptr->items[x]->data = xmalloc(value_size+1); + memcpy(na_ptr->items[x]->data, block_offset7.from, value_size); + na_ptr->items[x]->data[value_size] = '\0'; // it might be a string, null terminate it. + } + if (table_rec.ref_type == 0xd) { + // there is still more to do for the type of 0xD + type_d_rec = (struct _type_d_rec*) na_ptr->items[x]->data; + LE32_CPU(type_d_rec->id); + if ((na_ptr->items[x]->size = _pst_ff_getID2block(pf, type_d_rec->id, i2_head, &(na_ptr->items[x]->data)))==0){ + DEBUG_WARN(("not able to read the ID2 data. 
Setting to be read later. %#x\n", + type_d_rec->id)); + free(na_ptr->items[x]->data); na_ptr->items[x]->size = 0; na_ptr->items[x]->data = NULL; - na_ptr->items[x]->type = table_rec.value; - } - } else if (table_rec.value != 0) { - if ((table_rec.value >> 4)+ind_ptr > read_size) { - // check that we will not be outside the buffer we have read - DEBUG_WARN(("table_rec.value [%#x] is outside of block [%#x]\n", table_rec.value, read_size)); - na_ptr->count_item --; - continue; - } - if (_pst_getBlockOffset(buf, read_size, ind_ptr, table_rec.value, &block_offset)) { - DEBUG_WARN(("failed to get block offset for table_rec.value of %#x\n", table_rec.value)); - na_ptr->count_item --; //we will be skipping a row - continue; - } - t_ptr = block_offset.from; - if (t_ptr <= block_offset.to) { - na_ptr->items[x]->size = size = block_offset.to - t_ptr; - } else { - DEBUG_WARN(("I don't want to malloc less than zero sized block. from=%#x, to=%#x." - "Will change to 1 byte\n", block_offset.from, block_offset.to)); - na_ptr->items[x]->size = size = 0; // the malloc statement will add one to this + na_ptr->items[x]->type = type_d_rec->id; } - - // plus one for good luck (and strings) we will null terminate all reads - na_ptr->items[x]->data = (unsigned char*) xmalloc(size+1); - memcpy(na_ptr->items[x]->data, &(buf[t_ptr]), size); - na_ptr->items[x]->data[size] = '\0'; // null terminate buffer - - if (table_rec.ref_type == 0xd) { - // there is still more to do for the type of 0xD - type_d_rec = (struct _type_d_rec*) na_ptr->items[x]->data; - LE32_CPU(type_d_rec->id); - if ((na_ptr->items[x]->size = _pst_ff_getID2block(pf, type_d_rec->id, i2_head, &(na_ptr->items[x]->data)))==0){ - DEBUG_WARN(("not able to read the ID2 data. Setting to be read later. 
%#x\n", - type_d_rec->id)); - na_ptr->items[x]->size = 0; - na_ptr->items[x]->data = NULL; - na_ptr->items[x]->type = type_d_rec->id; - } - } - } else { - DEBUG_EMAIL(("Ignoring 0 value in offset\n")); - if (na_ptr->items[x]->data) free (na_ptr->items[x]->data); - na_ptr->items[x]->data = NULL; - free(na_ptr->items[x]); - na_ptr->count_item--; // remove this item from the destination list - continue; } - if (na_ptr->items[x]->type == 0) - //it can be used to convey information - // to later functions - na_ptr->items[x]->type = table_rec.ref_type; + if (na_ptr->items[x]->type == 0) na_ptr->items[x]->type = table_rec.ref_type; } else { WARN(("ERROR Unknown ref_type %#x\n", table_rec.ref_type)); - if (buf) free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); if (na_head) _pst_free_list(na_head); DEBUG_RET(); return NULL; @@ -1415,9 +1455,8 @@ } DEBUG_EMAIL(("increasing ind2_ptr by %i [%#x] bytes. Was %#x, Now %#x\n", rec_size, rec_size, ind2_ptr, ind2_ptr+rec_size)); ind2_ptr += rec_size; - count_rec++; } - if (buf) free(buf); + freeall(buf, &block_offset1, &block_offset2, &block_offset3, &block_offset4, &block_offset5, &block_offset6, &block_offset7); DEBUG_RET(); return na_head; } @@ -2998,6 +3037,7 @@ DEBUG_EMAIL(("%s\n", item->journal->type)); break; default: + DEBUG_EMAIL(("unknown type %#x\n", list->items[x]->id)); /* Reference Types 2 - 0x0002 - Signed 16bit value @@ -3431,11 +3471,50 @@ } +/** + * The offset might be zero, in which case we have no data, so return a pair of null pointers. + * Or, the offset might end in 0xf, so it is an id2 pointer, in which case we read the id2 block. + * Otherwise, the offset>>4 is an index into the table of offsets in the buffer. 
+*/ +int32_t _pst_getBlockOffsetPointer(pst_file *pf, pst_index2_ll *i2_head, unsigned char *buf, int32_t read_size, int32_t i_offset, int32_t offset, pst_block_offset_pointer *p) { + int32_t size; + pst_block_offset block_offset; + DEBUG_ENT("_pst_getBlockOffsetPointer"); + if (p->needfree) free(p->from); + p->from = NULL; + p->needfree = 0; + if (!offset) { + p->from = p->to = NULL; + } + else if ((offset & 0xf) == 0xf) { + DEBUG_WARN(("Found id2 %#x value. Will follow it\n", offset)); + size = _pst_ff_getID2block(pf, offset, i2_head, &(p->from)); + if (size) { + p->to = p->from + size; + p->needfree = 1; + } + else { + p->from = p->to = NULL; + } + } + else if (_pst_getBlockOffset(buf, read_size, i_offset, offset, &block_offset)) { + p->from = p->to = NULL; + } + else { + p->from = buf + block_offset.from; + p->to = buf + block_offset.to; + } + DEBUG_RET(); + return (p->from) ? 0 : 1; +} + + int32_t _pst_getBlockOffset(unsigned char *buf, int32_t read_size, int32_t i_offset, int32_t offset, pst_block_offset *p) { - int32_t of1 = offset>>4; + int32_t low = offset & 0xf; + int32_t of1 = offset >> 4; DEBUG_ENT("_pst_getBlockOffset"); - if (!p || !buf || (i_offset == 0) || (i_offset+2+of1+sizeof(*p) > read_size)) { - DEBUG_WARN(("p is NULL or buf is NULL or offset is 0 (%p, %p, %#x, %i, %i)\n", p, buf, offset, read_size, i_offset)); + if (!p || !buf || !i_offset || low || (i_offset+2+of1+sizeof(*p) > read_size)) { + DEBUG_WARN(("p is NULL or buf is NULL or offset is 0 or offset has low bits or beyond read size (%p, %p, %#x, %i, %i)\n", p, buf, offset, read_size, i_offset)); DEBUG_RET(); return -1; } @@ -3443,13 +3522,17 @@ memcpy(&(p->to), &(buf[(i_offset+2)+of1+sizeof(p->from)]), sizeof(p->to)); LE16_CPU(p->from); LE16_CPU(p->to); - DEBUG_WARN(("get block offset finds from=%i(%#x), to=%i(%#x)", p->from, p->from, p->to, p->to)); + DEBUG_WARN(("get block offset finds from=%i(%#x), to=%i(%#x)\n", p->from, p->from, p->to, p->to)); + if (p->from > p->to) { + 
DEBUG_WARN(("get block offset from > to")); + return -1; + } DEBUG_RET(); return 0; } -pst_index_ll * _pst_getID(pst_file* pf, u_int32_t id) { +pst_index_ll* _pst_getID(pst_file* pf, u_int32_t id) { pst_index_ll *ptr = NULL; DEBUG_ENT("_pst_getID"); if (id == 0) { @@ -3457,21 +3540,17 @@ return NULL; } - /* if (id & 0x3) { // if either of the last two bits on the id are set - DEBUG_INDEX(("ODD_INDEX (not even) is this a pointer to a table?\n")); - }*/ - // Dave: I don't think I should do this. next bit. I really think it doesn't work - // it isn't based on sound principles either. - // update: seems that the last two sig bits are flags. u tell me! - id &= 0xFFFFFFFE; // remove least sig. bit. seems that it might work if I do this + //if (id & 1) DEBUG_INDEX(("have odd id bit %#x\n", id)); + //if (id & 2) DEBUG_INDEX(("have two id bit %#x\n", id)); + id &= 0xFFFFFFFE; DEBUG_INDEX(("Trying to find %#x\n", id)); if (!ptr) ptr = pf->i_head; while (ptr && (ptr->id != id)) { ptr = ptr->next; } - if (ptr) {DEBUG_INDEX(("Found Value %#x\n", ptr->id));} - else {DEBUG_INDEX(("ERROR: Value not found\n")); } + if (ptr) {DEBUG_INDEX(("Found Value %#x\n", id)); } + else {DEBUG_INDEX(("ERROR: Value %#x not found\n", id)); } DEBUG_RET(); return ptr; } @@ -3496,7 +3575,15 @@ } -pst_desc_ll * _pst_getDptr(pst_file *pf, u_int32_t id) { +/** + * find the id in the descriptor tree rooted at pf->d_head + * + * @param pf global pst file pointer + * @param id the id we are looking for + * + * @return pointer to the pst_desc_ll node in the descriptor tree +*/ +pst_desc_ll* _pst_getDptr(pst_file *pf, u_int32_t id) { pst_desc_ll *ptr = pf->d_head; DEBUG_ENT("_pst_getDptr"); while (ptr && (ptr->id != id)) { @@ -3742,16 +3829,19 @@ size_t _pst_ff_getIDblock_dec(pst_file *pf, u_int32_t id, unsigned char **b) { size_t r; DEBUG_ENT("_pst_ff_getIDblock_dec"); + DEBUG_INDEX(("for id %#x\n", id)); r = _pst_ff_getIDblock(pf, id, b); - if (pf->encryption) _pst_decrypt(*b, r, pf->encryption); 
DEBUG_HEXDUMPC(*b, r, 16); + int noenc = (id & 2); // disable encryption + if ((pf->encryption) & !(noenc)) { + _pst_decrypt(*b, r, pf->encryption); + DEBUG_HEXDUMPC(*b, r, 16); + } DEBUG_RET(); return r; } -/** the get ID function for the default file format that I am working with - ie the one in the PST files */ size_t _pst_ff_getIDblock(pst_file *pf, u_int32_t id, unsigned char** b) { pst_index_ll *rec; size_t rsize = 0;//, re_size=0;
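The index reference rules that `_pst_getBlockOffsetPointer` and the patched `_pst_getBlockOffset` apply in the libpst.c hunks above can be sketched in isolation: zero means no data, a low nibble of 0xf means an ID2 value, any other nonzero low nibble is rejected, and otherwise `offset >> 4` is a byte offset into the pair table that starts two bytes past the index offset. This is a minimal illustration under those assumptions, not the library code. The names `ref_kind`, `classify_ref`, `pair_result`, `lookup_pair`, and `sample_index` are hypothetical, and the sample table is a toy built from the pairs quoted in the documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a four byte index reference. */
typedef enum { REF_NONE, REF_ID2, REF_INTERNAL, REF_INVALID } ref_kind;

static ref_kind classify_ref(uint32_t ref) {
    if (ref == 0) return REF_NONE;              /* no data at all */
    if ((ref & 0xf) == 0xf) return REF_ID2;     /* external: fetch via the ID2 list */
    if ((ref & 0xf) != 0) return REF_INVALID;   /* stray low bits are rejected */
    return REF_INTERNAL;                        /* ref >> 4 indexes the pair table */
}

/* Read a little-endian 16 bit value regardless of host byte order. */
static uint16_t le16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical result type: ok is 1 only when (from, to) is usable. */
typedef struct { int ok; uint16_t from; uint16_t to; } pair_result;

/* Resolve an internal reference to its (start, end+1) pair. */
static pair_result lookup_pair(const unsigned char *buf, size_t buf_size,
                               uint16_t index_offset, uint32_t ref) {
    pair_result r = {0, 0, 0};
    size_t pos;
    if (!buf || index_offset == 0) return r;
    if (classify_ref(ref) != REF_INTERNAL) return r;
    /* skip the two byte count, then step by the shifted reference */
    pos = (size_t)index_offset + 2 + (size_t)(ref >> 4);
    if (pos + 4 > buf_size) return r;           /* both integers must be in the buffer */
    r.from = le16(buf + pos);
    r.to = le16(buf + pos + 2);
    r.ok = (r.from <= r.to);                    /* same sanity check the patch adds */
    return r;
}

/* Toy buffer: four bytes of padding, then a count of 2 and the
   overlapping pairs (0, 0xc), (0xc, 0x14), (0x14, 0x7c). */
static const unsigned char sample_index[] = {
    0x00, 0x00, 0x00, 0x00,
    0x02, 0x00,
    0x00, 0x00, 0x0c, 0x00, 0x14, 0x00, 0x7c, 0x00
};
```

With an index offset of 4, `lookup_pair(sample_index, sizeof sample_index, 4, 0x20)` lands on the (0xc, 0x14) pair and a reference of 0x40 lands on (0x14, 0x7c), mirroring the walk-through in xml/libpst.in. A reference of 0x60 would read past the end of this toy table and is refused.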
--- a/src/libpst.h Thu Jul 12 14:59:13 2007 -0700 +++ b/src/libpst.h Sun Jul 15 14:25:34 2007 -0700 @@ -126,7 +126,7 @@ typedef struct _pst_entryid_struct { int32_t u1; char entryid[16]; - int32_t id; + u_int32_t id; } pst_entryid; typedef struct _pst_desc_struct { @@ -152,7 +152,7 @@ } pst_index_ll; typedef struct _pst_index2_tree { - int32_t id2; + u_int32_t id2; pst_index_ll *id; struct _pst_index2_tree * next; } pst_index2_ll; @@ -432,8 +432,14 @@ int16_t to; } pst_block_offset; +typedef struct _pst_block_offset_pointer { + unsigned char *from; + unsigned char *to; + int needfree; +} pst_block_offset_pointer; + struct _pst_num_item { - int32_t id; + u_int32_t id; unsigned char *data; int32_t type; size_t size; @@ -467,7 +473,7 @@ int32_t pst_load_extended_attributes(pst_file *pf); int32_t _pst_build_id_ptr(pst_file *pf, int32_t offset, int32_t depth, int32_t linku1, u_int32_t start_val, u_int32_t end_val); -int32_t _pst_build_desc_ptr (pst_file *pf, int32_t offset, int32_t depth, int32_t linku1, u_int32_t *high_id, int32_t start_id, int32_t end_val); +int32_t _pst_build_desc_ptr (pst_file *pf, int32_t offset, int32_t depth, int32_t linku1, u_int32_t *high_id, u_int32_t start_id, u_int32_t end_val); pst_item* _pst_getItem(pst_file *pf, pst_desc_ll *d_ptr); void * _pst_parse_item (pst_file *pf, pst_desc_ll *d_ptr); pst_num_array * _pst_parse_block(pst_file *pf, u_int32_t block_id, pst_index2_ll *i2_head); @@ -478,6 +484,7 @@ int32_t _pst_free_id (pst_index_ll *head); int32_t _pst_free_desc (pst_desc_ll *head); int32_t _pst_free_xattrib(pst_x_attrib_ll *x); +int32_t _pst_getBlockOffsetPointer(pst_file *pf, pst_index2_ll *i2_head, unsigned char *buf, int32_t read_size, int32_t i_offset, int32_t offset, pst_block_offset_pointer *p); int32_t _pst_getBlockOffset(unsigned char *buf, int32_t read_size, int32_t i_offset, int32_t offset, pst_block_offset *p); pst_index2_ll * _pst_build_id2(pst_file *pf, pst_index_ll* list, pst_index2_ll* head_ptr); pst_index_ll * 
_pst_getID(pst_file* pf, u_int32_t id);
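The `needfree` flag in the new `pst_block_offset_pointer` records whether `from` points into the caller's block buffer (borrowed) or at heap memory returned by an ID2 fetch (owned), which is why `freeall` in libpst.c only frees entries that have the flag set. The ownership pattern can be sketched as follows. This is an illustration only: `block_span`, `span_release`, `span_borrow`, `span_own_copy`, `span_size`, and `span_demo` are hypothetical names, and a `malloc` copy stands in for the ID2 fetch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of pst_block_offset_pointer. */
typedef struct {
    unsigned char *from;
    unsigned char *to;
    int needfree;       /* nonzero only when from is heap memory we own */
} block_span;

/* Drop whatever the span holds; safe to call repeatedly. */
static void span_release(block_span *p) {
    if (p->needfree) free(p->from);
    p->from = p->to = NULL;
    p->needfree = 0;
}

/* Point the span into an existing buffer; nothing to free later. */
static void span_borrow(block_span *p, unsigned char *buf, size_t from, size_t to) {
    span_release(p);
    p->from = buf + from;
    p->to = buf + to;
}

/* Give the span its own heap copy (stand-in for an ID2 fetch). */
static int span_own_copy(block_span *p, const unsigned char *src, size_t size) {
    unsigned char *copy;
    span_release(p);
    if (size == 0) return -1;
    copy = malloc(size);
    if (!copy) return -1;
    memcpy(copy, src, size);
    p->from = copy;
    p->to = copy + size;
    p->needfree = 1;
    return 0;
}

static size_t span_size(const block_span *p) {
    return p->from ? (size_t)(p->to - p->from) : 0;
}

/* Usage: reuse one span for a borrowed and then an owned range.
   Returns 0 when every step behaved as expected. */
static int span_demo(void) {
    unsigned char buf[16] = {0};
    static const unsigned char external[3] = {1, 2, 3};
    block_span s = {NULL, NULL, 0};   /* must start with needfree cleared */
    size_t borrowed, owned;

    span_borrow(&s, buf, 4, 12);
    borrowed = span_size(&s);
    if (span_own_copy(&s, external, sizeof external) != 0) return -1;
    owned = span_size(&s);
    span_release(&s);
    return (borrowed == 8 && owned == 3 && s.from == NULL) ? 0 : 1;
}
```

Initializing `needfree` to zero before first use matters for the same reason the patch clears all seven `block_offsetN.needfree` fields up front: the release path would otherwise call `free` on an indeterminate pointer.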
--- a/xml/libpst.in Thu Jul 12 14:59:13 2007 -0700 +++ b/xml/libpst.in Sun Jul 15 14:25:34 2007 -0700 @@ -652,10 +652,10 @@ match the backPointer from the triple that pointed to this node. </para> <para> - Each item in this node is a triple of (ID, backPointer, offset) + Each item in this node is a triple of (ID1, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, - and ID is the lowest ID value in the subtree. + and ID1 is the lowest ID1 value in the subtree. </para> </refsect1> @@ -722,6 +722,12 @@ </para> <para> Each item in this node is a tuple of (ID1, offset, size, unknown) + The two low order bits of the ID1 value seem to be flags. I have + never seen a case with bit zero set. Bit one indicates that the + item is <emphasis>not</emphasis> encrypted. Note that references + to these ID1 values elsewhere may have the low order bit set (and + I don't know what that means), but when we do the search in this + tree we need to clear that bit so that we can find the correct item. </para> </refsect1> @@ -905,34 +911,42 @@ 0000 indexOffset [2 bytes] 0x013c in this case 0002 signature [2 bytes] 0xbcec constant -0004 offset [2 bytes] 0x0020 in this case +0004 b5offset [4 bytes] 0x0020 index reference ]]></literallayout> <para> - Note the signature of 0xbcec. There are other descriptor block - formats with other signatures. - Note the indexOffset of 0x013c - starting at that position in the - descriptor block, we have an array of two byte integers. The first - integer (0x000b) is a (count-1) of the number of overlapping pairs - following the count. The first pair is (0, 0xc), the next pair is (0xc, 0x14) - and the last (12th) pair is (0x123, 0x13b). These pairs are (start,end+1) - offsets of items in this block. So we have count+2 integers following - the count value. + Note the signature of 0xbcec. There are other descriptor block formats + with other signatures. 
Note the indexOffset of 0x013c - starting at + that position in the descriptor block, we have an array of two byte + integers. The first integer (0x000b) is a (count-1) of the number of + overlapping pairs following the count. The first pair is (0, 0xc), the + next pair is (0xc, 0x14) and the last (12th) pair is (0x123, 0x13b). + These pairs are (start,end+1) offsets of items in this block. So we + have count+2 integers following the count value. </para> <para> - Note the offset of 0x0020, which needs to be right shifted by 4 bits - to become 0x0002, which is then a byte offset to be added to the above - indexOffset plus two (to skip the count), so it points to the (0xc, 0x14) - pair. Finally, we have the offset and size of the "b5" block located at offset 0xc + Note the b5offset of 0x0020, which is a type that I will call an index + reference. Such index references have at least two different forms, and + may point to data either in this block, or in some other block. + External pointer references have the low order 4 bits all set, and are + ID2 values that can be used to fetch data. This value of 0x0020 is an + internal pointer reference, which needs to be right shifted by 4 bits to + become 0x0002, which is then a byte offset to be added to the above + indexOffset plus two (to skip the count), so it points to the (0xc, + 0x14) pair. + </para> + <para> + Finally, we have the offset and size of the "b5" block located at offset 0xc with a size of 8 bytes in this descriptor block. The "b5" block has the following format: </para> <literallayout class="monospaced"><![CDATA[ 0000 signature [2 bytes] 0x02b5 constant 0002 unknown [2 bytes] 0x0006 in this case -0004 offset [4 bytes] 0x0040 in this case +0004 descoffset [4 bytes] 0x0040 index reference ]]></literallayout> <para> - Note the "b5" offset of 0x0040, which needs to be right shifted by 4 bits + Note the descoffset of 0x0040, which again is an index reference. 
In this + case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0004, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0x14, 0x7c) pair. We now have the offset 0x14 of the descriptor array, composed of 8 byte @@ -945,9 +959,9 @@ ]]></literallayout> <para> For some reference types (2, 3, 0xb) the value is used directly. Otherwise, - the value is generally a non-zero offset, to be right shifted by 4 bits and used to fetch - a pair from the index table to find the offset and size of the item in this - descriptor block. However, if (value AND 0xf) == 0xf, then the value is an ID2 index. + the value is an index reference, which is either an ID2 value, or an + offset, to be right shifted by 4 bits and used to fetch a pair from the + index table to find the offset and size of the item in this descriptor block. </para> <para> The following reference types are known, but not all of these @@ -1197,7 +1211,7 @@ <refsect1 id='pst.file.desc2.5'> <title>Associated Descriptor Item 0x7cec</title> <para> - This style of descriptor block is similar to the BCEC format. + This style of descriptor block is similar to the 0xbcec format. </para> <literallayout class="monospaced"><![CDATA[ 0000 7a 01 ec 7c 40 00 00 00 00 00 00 00 b5 04 02 00 @@ -1228,7 +1242,7 @@ 0000 indexOffset [2 bytes] 0x017a in this case 0002 signature [2 bytes] 0x7cec constant -0004 offset [2 bytes] 0x0040 in this case +0004 7coffset [4 bytes] 0x0040 index reference ]]></literallayout> <para> Note the signature of 0x7cec. There are other descriptor block @@ -1242,7 +1256,8 @@ the count value. </para> <para> - Note the offset of 0x0040, which needs to be right shifted by 4 bits + Note the 7coffset of 0x0040, which is an index reference. 
In this case, + it is an internal reference pointer, which needs to be right shifted by 4 bits to become 0x0004, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0x14, 0xea) pair. We have the offset and size of the "7c" block located at offset 0x14 @@ -1256,15 +1271,15 @@ 0004 unknown [2 bytes] 0x0060 in this case 0006 unknown [2 bytes] 0x0062 in this case 0008 recordSize [2 bytes] 0x0065 in this case -000a b5Offset [2 bytes] 0x0020 in this case -000c unknown [2 bytes] 0x0000 in this case -000e index2Offset [2 bytes] 0x0080 in this case +000a b5Offset [4 bytes] 0x0020 index reference +000e index2Offset [4 bytes] 0x0080 index reference 0010 unknown [2 bytes] 0x0000 in this case 0012 unknown [2 bytes] 0x0000 in this case 0014 unknown [2 bytes] 0x0000 in this case ]]></literallayout> <para> - Note the b5Offset of 0x0020, which needs to be right shifted by 4 bits + Note the b5Offset of 0x0020, which is an index reference. In this case, + it is an internal reference pointer, which needs to be right shifted by 4 bits to become 0x0002, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xc, 0x14) pair. Finally, we have the offset and size of the "b5" block @@ -1274,10 +1289,11 @@ <literallayout class="monospaced"><![CDATA[ 0000 signature [2 bytes] 0x04b5 constant 0002 unknown [2 bytes] 0x0002 in this case -0004 offset [4 bytes] 0x0060 in this case +0004 descoffset [4 bytes] 0x0060 index reference ]]></literallayout> <para> - Note the "b5" offset of 0x0060, which needs to be right shifted by 4 + Note the descoffset of 0x0060, which again is an index reference. In this + case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0006, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xea, 0xf0) pair. 
That gives us (0xf0 - 0xea)/6 = 1, so we have a
@@ -1285,7 +1301,8 @@
  and unused here.
  </para>
  <para>
- Note the index2Offset above of 0x0080, which needs to be right shifted
+ Note the index2Offset above of 0x0080, which again is an index reference. In this
+ case, it is an internal pointer reference, which needs to be right shifted
  by 4 bits to become 0x0008, which is then a byte offset to be added to
  the above indexOffset plus two (to skip the count), so it points to the
  (0xf0, 0x155) pair. This is an array of tables of four byte integers.
@@ -1302,17 +1319,37 @@
 0000 referenceType [2 bytes]
 0002 itemType [2 bytes]
 0004 ind2Offset [2 bytes]
-0006 unknown [2 bytes]
+0006 size [1 byte]
+0007 unknown [1 byte]
 ]]></literallayout>
  <para>
- The ind2Offset is a byte offset into the current IND2 table of a four
- byte integer value. Once we fetch that, we have the same triple (item
- type, reference type, value) as we find in the 0xbcec style descriptor
- blocks. These 8 byte descriptors are processed recordCount times, each
+ The ind2Offset is a byte offset into the current IND2 table of some value.
+ If that is a four byte integer value, then once we fetch that, we have
+ the same triple (item type, reference type, value) as we find in the
+ 0xbcec style descriptor blocks. If not, then this value is used directly.
+ These 8 byte descriptors are processed recordCount times, each
  time using the next IND2 table. The item and reference types are as
  described above for the 0xbcec format descriptor block.
  </para>
  </refsect1>
+ <refsect1 id='pst.file.desc3.5'>
+ <title>Associated Descriptor Item 0x0002</title>
+ <para>
+ This style of descriptor block is almost unknown here.
+ It seems to contain a list of ID1 values. 
+ </para>
+ <literallayout class="monospaced"><![CDATA[
+0000 01 01 02 00 26 28 00 00 18 77 0c 00 b8 04 00 00
+
+0000 signature [2 bytes] 0x0101 constant
+0002 count [2 bytes] 0x0002 in this case
+0004 unknown [4 bytes] 0x002826 in this case
+ repeating
+0008 id [4 bytes] 0x0c7718 in this case
+000c id [4 bytes] 0x0004b8 in this case
+]]></literallayout>
+ </refsect1>
+
  </refentry>
  </reference>
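
The index reference arithmetic and the 0x0101 block layout described in the xml/libpst.in hunks above can be checked with a short Python sketch. This is not code from libpst. The function names resolve_internal_reference and parse_id_list_block are hypothetical, the index table entries are assumed to be 16-bit little-endian values, and the count field of the 0x0101 block is assumed to give the number of id values that follow. Only the internal (offset) case of an index reference is handled; the ID2 case is left out because the revised text does not say how to tell the two apart. The surrounding buffer is synthetic and only the pair values come from the text.

```python
import struct


def resolve_internal_reference(block, index_offset, value):
    """Return the (start, end) pair selected by an internal index reference."""
    # Right shift by 4 bits, then add indexOffset plus two (to skip the count).
    pair_at = index_offset + 2 + (value >> 4)
    # Assumption: each entry of the pair is a 16-bit little-endian integer.
    return struct.unpack_from("<HH", block, pair_at)


def parse_id_list_block(data):
    """Split a 0x0101 block into (count, unknown, ids)."""
    signature, count, unknown = struct.unpack_from("<HHI", data, 0)
    if signature != 0x0101:
        raise ValueError("not a 0x0101 block")
    # Assumption: count is the number of four byte id values starting at 0x0008.
    ids = list(struct.unpack_from("<%dI" % count, data, 8))
    return count, unknown, ids


# Synthetic 0x7cec-style buffer: only the index table is filled in.
INDEX_OFFSET = 0x017A
block = bytearray(INDEX_OFFSET + 14)
struct.pack_into(
    "<7H", block, INDEX_OFFSET,
    0,      # count (placeholder, not taken from the text)
    0,      # entry at shifted offset 0 (placeholder, not described above)
    0x0C, 0x14, 0xEA, 0xF0, 0x155,
)

for name, value in [("b5Offset", 0x0020), ("7coffset", 0x0040),
                    ("descoffset", 0x0060), ("index2Offset", 0x0080)]:
    start, end = resolve_internal_reference(block, INDEX_OFFSET, value)
    print("%-12s 0x%04x -> (0x%x, 0x%x)" % (name, value, start, end))

# The 16 byte dump from the 0x0002 section.
dump = bytes.fromhex("01010200262800001877 0c00 b8040000".replace(" ", ""))
print(parse_id_list_block(dump))
```

Consecutive table entries serve as the start and end of an item, which is why the descriptor count in the walkthrough comes out as (0xf0 - 0xea)/6 = 1.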