XML / PHP String cut off problem
Hi all,
I need some help with a problem.
I've developed an MS Project import / export component for our software using SAX XML. We take the project file and convert it to XML (average size: 500k), load it up and suck the info out of it to build a replica in our database (SQL Server, Oracle, MySQL).
The problem that I'm having is that at seemingly random points in the XML, strings are being cut off.
For example:
"27-08-2003 5:00:00 PM" becomes "08-2003 5:00:00 PM"
This happens predominantly with dates, although there are also some issues with incorrect boolean values and even missing tasks. All of these values are correct in the XML file, yet by the time they reach the function that reads the data between tags they have been corrupted in some way (cut off).
Has anyone else experienced this? Or, even better, does anyone know a way to solve these problems?
Any help would be greatly appreciated.
If this can't be resolved I'll have to consider dumping the entire XML component and using Access conversions of the project files with ODBC instead.
Code examples and snippets are below.
Thanks,
Cam
XML Snippet:
<Task>
    <TaskUID>19</TaskUID>
    <TaskID>19</TaskID>
    <TaskName>Hire Tradepersons</TaskName>
    <TaskIsSummary>False</TaskIsSummary>
    <TaskIsMilestone>False</TaskIsMilestone>
    <TaskOutlineNumber>1.4.2</TaskOutlineNumber>
    <TaskWBS />
    <TaskPriority>500</TaskPriority>
    <TaskPercentageComplete>10</TaskPercentageComplete>
    <TaskCreationDate>11-08-2003 4:54:00 PM</TaskCreationDate>
    <TaskStart>27-08-2003 8:00:00 AM</TaskStart>
    <TaskFinish>27-08-2003 5:00:00 PM</TaskFinish>
    <TaskBaseStart>27-08-2003 8:00:00 AM</TaskBaseStart>
    <TaskBaseFinish>27-08-2003 5:00:00 PM</TaskBaseFinish>
    <TaskActualStart>27-08-2003 8:00:00 AM</TaskActualStart>
    <TaskActualFinish />
    <TaskDuration>4800</TaskDuration>
    <TaskTotalSlack>68400</TaskTotalSlack>
    <TaskFreeSlack>0</TaskFreeSlack>
    <TaskConstraintType>0</TaskConstraintType>
    <TaskConstraintDate />
</Task>
Parser code:
<?php
# File to process
$file = "../files/".$id."/import.xml";

# Initialize parser
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, FALSE);

if (!($fp = fopen($file, "r"))) {
    die("Cannot locate XML data file: $file");
}

while ($data = fread($fp, 4096)) {
    # Replace & with and
    $data = str_replace("&", "and", "$data");
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser))
        );
    }
}

xml_parser_free($xml_parser);
?>
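One thing worth checking in the loop above: `str_replace("&", "and", ...)` also rewrites ampersands that belong to valid entities, so `&amp;` would reach the parser as `andamp;`. Because the replacement runs on each 4096-byte chunk, an entity that straddles a chunk boundary is handled inconsistently as well. A minimal sketch of a gentler approach, assuming the goal is only to tolerate bare ampersands, is below. The helper name `escapeBareAmpersands` is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: escape only ampersands that do not already start
// a named entity (&amp;), a decimal reference (&#38;) or a hex reference (&#x26;).
function escapeBareAmpersands($xml) {
    $pattern = '/&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/';
    return preg_replace($pattern, '&amp;', $xml);
}

// The valid entity is left alone; the bare ampersand in "R&D" is escaped.
$sample = '<TaskName>Plan &amp; Hire R&D staff</TaskName>';
$fixed  = escapeBareAmpersands($sample);
echo $fixed, "\n";
```

For a file of around 500k it may be simpler to read the whole document first (for example with `file_get_contents($file)`), run the helper once over the full string, and pass the result to a single `xml_parse()` call with the final flag set to true, so no entity can be split across chunks.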
Function to read data between tags (character_data_handler) snippet:
<?php
# Process data between tags
function characterData($parser, $data) {
    global $currentTag,$projectArray,$taskArray,$linkArray;
    switch ($currentTag) {
        case "TaskUID":
            unset($taskArray[2]);
            $taskArray[0] = $data;
            break;
    }
}
?>
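A likely explanation for the cut-off strings, offered as a guess rather than a confirmed diagnosis: PHP's SAX parser does not promise to deliver the text of one element in a single call to the character data handler. The text can arrive in several pieces, for instance when it straddles the boundary between two 4096-byte reads. A handler that assigns with `=` keeps only the last piece, which would turn "27-08-2003 5:00:00 PM" into "08-2003 5:00:00 PM". The sketch below appends the pieces to a buffer and stores the value only in the end-element handler. The names `$buffer` and `$values` are hypothetical and stand in for the original `$taskArray` logic.

```php
<?php
// Sketch: collect character data per element, commit it when the element closes.
// $buffer and $values are hypothetical names, not part of the original code.
$buffer = '';
$values = array();

function startElement($parser, $name, $attrs) {
    global $buffer;
    $buffer = '';                     // new element: start with an empty buffer
}

function characterData($parser, $data) {
    global $buffer;
    $buffer .= $data;                 // append, since the text may arrive in pieces
}

function endElement($parser, $name) {
    global $buffer, $values;
    $values[$name] = trim($buffer);   // the text is complete only at the end tag
    $buffer = '';
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

// Feed the document in two chunks that split the date, as a chunked read might.
xml_parse($parser, "<Task><TaskFinish>27-", false);
xml_parse($parser, "08-2003 5:00:00 PM</TaskFinish></Task>", true);

echo $values["TaskFinish"], "\n";     // prints "27-08-2003 5:00:00 PM"
```

The same idea should carry over to the `switch ($currentTag)` handler above: use `.=` inside `characterData`, and move the assignment into `$taskArray` to the point where the closing tag is seen.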
